While loop #32
Conversation
python/mxnet/ndarray/contrib.py
Outdated
2) optionally outputs for each step
The signature of ``func'' can be:
* def func(loop_vars): new_loop_vars
or * def func(loop_vars): (output, new_loop_vars)
new_loop_vars should have the same number of elements as loop_vars. The shape and data type of NDArrays in new_loop_vars should match the ones in loop_vars.
solved
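To make this contract concrete, here is a pure-Python sketch of the loop semantics (illustrative only, not the mxnet implementation; both cond and func take the list of loop variables, and the assert encodes the element-count requirement from the comment above):

```python
def while_loop(cond, func, loop_vars, max_iterations):
    """Pure-Python sketch of the while_loop semantics under review.

    func(loop_vars) returns (output, new_loop_vars); new_loop_vars must
    have the same number of elements as loop_vars.
    """
    outputs = []
    for _ in range(max_iterations):
        if not cond(loop_vars):
            break
        output, new_loop_vars = func(loop_vars)
        # the contract from the review comment above
        assert len(new_loop_vars) == len(loop_vars)
        outputs.append(output)
        loop_vars = new_loop_vars
    return outputs, loop_vars
```

For example, summing 0..4 with loop vars `(i, total)`: `while_loop(lambda lv: lv[0] < 5, lambda lv: (lv[0], (lv[0] + 1, lv[1] + lv[0])), (0, 0), 10)` returns `([0, 1, 2, 3, 4], (5, 10))`.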
python/mxnet/ndarray/contrib.py
Outdated
``loop_vars'' is a list of NDArrays that represent loop variables.
``max_iterations'' is a python or NDArray scalar that defines the maximal number of iterations.
Why should max_iterations be an NDArray scalar?
It could be a Python scalar, an NDArray scalar, or a numpy ndarray scalar; _to_python_scalar below will convert it to a Python scalar.
solved
    to the given type
    """
    if isinstance(inputs, ndarray.NDArray):
        inputs = inputs.asscalar()
Do you need to cast the type here as well?
Nope. inputs will be cast to the given type inside the try block, so that any cast error will be caught and reported.
solved
python/mxnet/ndarray/contrib.py
Outdated
TODO(Junru): review the documentation, it has been out-of-date.
Corner case #1: what if cond is always false? Do we return "outputs"?
"""
def _to_python_type(inputs, type, name):
I suppose this function is to cast to Python scalar? If so, please rename it to make it more explicit.
solved
python/mxnet/ndarray/contrib.py
Outdated
result = tuple(result)
if not isinstance(result, tuple):
    raise ValueError("Invalid return type of func: %s" % (type(result).__name__))
if len(result) == 2 and (isinstance(result[1], list) or isinstance(result[1], tuple) or len(loop_vars) == 1):
Why isn't the last one len(result[1])?
solved
src/operator/control_flow.cc
Outdated
// func_outputs[num_out_data: ] are new_loop_vars, need to allocate new memory
for (size_t i = params.num_out_data; i < outputs.size(); ++i) {
  size_t j = i - params.num_out_data;
  func_outputs[i] = NDArray(inputs[j].shape(), inputs[j].ctx(), true, inputs[j].dtype());
New NDArrays are created in every iteration. We need to avoid creating new NDArrays during inference.
Marked as TODO per our discussion.
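The idea behind this TODO can be sketched in plain Python with double buffering (illustrative names, no mxnet API): one scratch buffer per loop variable is allocated once, outside the loop, and reused on every iteration.

```python
def step_into(dst, src):
    # write the step's results in place instead of allocating a new buffer
    for k in range(len(dst)):
        dst[k] = src[k] * 2  # stand-in for the real step computation

def run_loop(init, n_steps):
    """Sketch of allocation-free iteration: preallocate a scratch buffer
    once, then swap it with the current state each step (double buffering),
    so no new buffers are created inside the loop."""
    cur = list(init)
    scratch = [0] * len(init)        # allocated once, outside the loop
    for _ in range(n_steps):
        step_into(scratch, cur)
        cur, scratch = scratch, cur  # swap, no allocation
    return cur
```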
const OpContext& ctx,
const std::vector<NDArray>& inputs,
const std::vector<OpReqType>& req,
const std::vector<NDArray>& outputs) {
Since while_loop always returns NDArrays whose size is max_iterations, we need a way to tell users where the valid data is in the output array.
state.n_iterations provides the actual number of iterations we have done. To communicate this to users, we could expose it as an NDArray scalar, or something else. Alternatively, users could use an NDArray scalar like i to indicate this.
Marked as skipped per our discussion.
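A plain-Python sketch of the padded-output behaviour discussed above (hypothetical names, not the mxnet API): the result is always max_iterations long, and the actual iteration count is returned alongside so the caller can slice off the valid prefix.

```python
def while_loop_padded(cond, func, loop_vars, max_iterations, pad=0):
    """Run the loop, pad outputs to a fixed length, and report how many
    entries are valid (the role played by state.n_iterations above)."""
    outputs, n_iterations = [], 0
    while n_iterations < max_iterations and cond(loop_vars):
        out, loop_vars = func(loop_vars)
        outputs.append(out)
        n_iterations += 1
    # fixed-size result: pad the tail past the last real iteration
    outputs += [pad] * (max_iterations - n_iterations)
    return outputs, loop_vars, n_iterations
```

The caller then recovers the valid data with `outputs[:n_iterations]`.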
src/operator/control_flow.cc
Outdated
// a var
igrads[i] = (step == 0)
  ? outputs[i]
  : NDArray(outputs[i].shape(), outputs[i].ctx(), true, outputs[i].dtype());
Same here: we can potentially avoid the memory allocation.
Marked as TODO as per our discussion
src/operator/control_flow.cc
Outdated
for (size_t i = params.num_out_data; i < outputs.size(); ++i) {
  size_t j = i - params.num_out_data;
  CHECK_EQ(func_inputs[j].shape(), func_outputs[i].shape());
  func_inputs[j] = func_outputs[i];
func_inputs has two kinds of inputs: some are loop vars, and the others are variables referenced in the Python closure. Do you keep the closure variables at the beginning of the vector? Do the output loop vars have the same order as the input loop vars?
Yes, please refer to my comments in _union_inputs
src/operator/control_flow.cc
Outdated
// copy done, call InferShape
g.attrs["shape"] = std::make_shared<dmlc::any>(std::move(shapes));
g = exec::InferShape(std::move(g));
if (g.GetAttr<size_t>("shape_num_unknown_nodes") != 0) {
I think you need to copy the inferred shapes back before returning. Shape inference takes multiple rounds; each round propagates shape info, so having unknown shapes doesn't mean that shape inference failed.
Please allow me to summarize:

Before shape inference, we should:
a) check consistency between loop_vars and new_loop_vars, and possibly assign shapes.

After shape inference in a subgraph, we should:
b) check consistency between loop_vars and new_loop_vars, and possibly assign shapes;
c) check consistency between subg_in (shapes of the subgraph's inputs) and in_shape (shapes of the while_loop's inputs), and possibly assign shapes.

I guess a) could be omitted when a correct control flow is provided, because inference is done in many rounds and c) will eventually cover a). But it is still necessary, because we need to report any mistake users make. This is not limited to shape inference; I should also do it in type inference, etc.

TODOs:
- shape inference
- type inference
- storage type inference
solved
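The consistency checks in a) through c) boil down to merging two shape records. A minimal sketch, assuming None marks an unknown dimension (mxnet itself typically uses 0 for unknown dims): unknowns are filled from the other side, and a genuine disagreement raises an error so user mistakes get reported.

```python
def sync_shapes(shape_a, shape_b, name):
    """Merge two shape entries dimension by dimension: fill unknowns
    (None) from the other side, raise if both are known but differ."""
    merged = []
    for x, y in zip(shape_a, shape_b):
        if x is None:
            merged.append(y)
        elif y is None or x == y:
            merged.append(x)
        else:
            raise ValueError("Inconsistent shape for %s: %s vs %s"
                             % (name, shape_a, shape_b))
    return merged
```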
sync to latest code
No description provided.