This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add tutorial Gotchas using NumPy #12007

Merged: 4 commits, Aug 29, 2018

@Ishitori (Contributor) commented Aug 2, 2018

Description

Adds the tutorial "Gotchas using NumPy", which explains blocking calls and how to minimize their impact.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Code is well-documented:
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Adds new tutorial

@Ishitori Ishitori requested a review from szha as a code owner August 2, 2018 19:41

One key difference is in the way calculations are executed. Every `NDArray` manipulation in Apache MXNet is done in asynchronous, non-blocking way. That means, that when we write code like `c = a * b`, where both `a` and `b` are `NDArrays`, the function got pushed to the [Execution Engine](https://mxnet.incubator.apache.org/architecture/overview.html#execution-engine), which starts the calculation. The function immediately returns back, and the user thread can continue execution, despite the fact that the calculation may not have been completed yet.

`Execution Engine` builds computation graph which may reorder or combine some calculations, but it honors dependency order: if there are other manipulation with `c` done later in the code, the `Execution Engine` will start doing them once the result of `c` is available. We don't need to write callbacks to start execution of subsequent code - the `Execution Engine` is going to do it for us.

Contributor

builds the


## NumPy operators vs. NDArray operators

Despite the fact that [NDArray API](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html) was specifically designed to be similar to `NumPy`, sometimes it is not easy to replace existing `NumPy` computations. The main reason is that not all operators, that are available in `NumPy`, are available in `NDArray API`. [This regularly updated page](https://github.com/apache/incubator-mxnet/issues/3199) contains the list of `NDArray API` operators in progress, where:

Member

The page linked doesn't seem to be updated in 2 years. Is that the only reference we can provide? We should probably at least remove the words 'regularly updated'

Member

I don't think you need to mention the Issue. You can redirect to the API page
http://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#the-ndarray-class

Contributor

Why isn't that info moved over to Confluence?

@nswamy (Member) left a comment

Thanks for this really useful document. Looks good,
tagging @aaronmarkham for reviewing readability


@nswamy nswamy added Feature Doc NDArray pr-awaiting-review PR is waiting for code review labels Aug 9, 2018
@szha szha removed their request for review August 9, 2018 19:05
@sandeep-krishnamurthy
Contributor

@aaronmarkham - ping

@aaronmarkham (Contributor) left a comment

Nice tutorial! Just a few suggestions.


Instead of using NumPy arrays Apache MXNet offers its own array implementation named [NDArray](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html). `NDArray API` was intentionally designed to be similar to `NumPy`, but there are differences.

One key difference is in the way calculations are executed. Every `NDArray` manipulation in Apache MXNet is done in asynchronous, non-blocking way. That means, that when we write code like `c = a * b`, where both `a` and `b` are `NDArrays`, the function got pushed to the [Execution Engine](https://mxnet.incubator.apache.org/architecture/overview.html#execution-engine), which starts the calculation. The function immediately returns back, and the user thread can continue execution, despite the fact that the calculation may not have been completed yet.

Contributor

nit: got --> was or is


To get the result of the computation we only need to access the resulting variable, and the flow of the code will be blocked until the computation results are assigned to the resulting variable. This behavior allows to increase code performance while still supporting imperative programming mode.
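
As a rough illustration of the behavior described above, here is a minimal sketch in plain Python. It is an analogy built on the standard library's `concurrent.futures`, not MXNet's Execution Engine or the `NDArray` API; the names `slow_multiply`, `add_one`, and `engine` are hypothetical and exist only for this example. It shows that dispatching work returns immediately, that a dependent step waits for its input, and that the caller blocks only when it asks for the result.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_multiply(a, b):
    """Hypothetical stand-in for an expensive element-wise operator."""
    time.sleep(0.2)  # simulate work performed by a backend
    return [x * y for x, y in zip(a, b)]


def add_one(pending):
    """Dependent step: waits for its input before computing."""
    return [x + 1 for x in pending.result()]


# Two workers, so the dependent task can wait on its input without
# starving the task that produces it.
engine = ThreadPoolExecutor(max_workers=2)

start = time.perf_counter()
c = engine.submit(slow_multiply, [1, 2, 3], [4, 5, 6])  # returns at once
d = engine.submit(add_one, c)  # scheduled now, but waits for c internally
dispatch_time = time.perf_counter() - start

result = d.result()  # the caller blocks here until both steps are done
total_time = time.perf_counter() - start
engine.shutdown()

print(f"dispatch took {dispatch_time:.3f}s, result ready after {total_time:.3f}s")
print(result)
```

The dispatch time should be far smaller than the total time, because the only blocking point is `d.result()`. In MXNet the analogous blocking points are calls that need the actual values, such as `asnumpy()` or `wait_to_read()` on an `NDArray`.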

Refer to [this tutorial](https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html), if you are new to Apache MXNet and would like to learn more how to manipulate NDArrays.

Contributor

I prefer to link with specific information to boost the performance of indexing and SEO. So... Refer to the intro tutorial to NDArray if you are new to MXNet...


<NDArray 10 @cpu(0)> <!--notebook-skip-line-->


### Search for an operator on [Github](https://github.com/apache/incubator-mxnet/pulls)

Contributor

What about using the label?
https://github.com/apache/incubator-mxnet/labels/Operator
Or add closed to it?

@sandeep-krishnamurthy
Contributor

@Ishitori - Can you please address the few changes suggested by @aaronmarkham? After that, this is all ready to go!

@sandeep-krishnamurthy sandeep-krishnamurthy added pr-awaiting-response PR is reviewed and waiting for contributor to respond and removed pr-awaiting-review PR is waiting for code review labels Aug 20, 2018
@Ishitori
Contributor Author

@sandeep-krishnamurthy @aaronmarkham @larroy @rahul003 thanks for the review. Updated the tutorial based on your comments

@sandeep-krishnamurthy sandeep-krishnamurthy merged commit 1f0d6ba into apache:master Aug 29, 2018
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request Sep 19, 2018
* Add tutorial Gotchas using NumPy

* Forcing build

* Code review fix

* Forcing build
Labels
Doc Feature request NDArray pr-awaiting-response PR is reviewed and waiting for contributor to respond
7 participants