This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

dynamic custom operator support #15921

Merged
merged 139 commits on Dec 6, 2019

Conversation

@samskalicky (Contributor) commented Aug 15, 2019

Description

Enhancements to dynamic library loading to support custom operators in libraries.

  • added the MXTensor/MXDtype structures and versioning to lib_api.h
  • added an NNVM-like operator registration capability for custom ops
  • operators are found in the library and re-registered in MXNet during library loading
  • operator shortcuts are re-registered from mx.nd.op to mx.nd and from mx.sym.op to mx.sym
  • created a new example library in "example/lib_ops" with a GEMM operator

Initially, this project was proposed on the CWiki; however, the design has evolved since the initial proposal. The current design is described below.

Design

The goal of this PR is to enable operators to be implemented in separate libraries and loaded at runtime.

The main constraint is to maintain a low-level, C-types-only boundary between MXNet and the library to simplify building and compiling external libraries.

Working backwards from the user experience, users register operators with easy-to-use function prototypes like:

int myForward(std::map<std::string, std::string> attrs,
              std::vector<MXTensor> inputs,
              std::vector<MXTensor> outputs);
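
For illustration, a minimal forward function matching that prototype could look like the sketch below; the MXTensor members used here (data, size()) are assumptions for the purpose of the example and may differ from the final lib_api.h field names.

// A minimal sketch of a user-defined forward function (element-wise copy).
// The MXTensor members used here (data, size()) are illustrative assumptions.
int myForward(std::map<std::string, std::string> attrs,
              std::vector<MXTensor> inputs,
              std::vector<MXTensor> outputs) {
  float* in  = static_cast<float*>(inputs[0].data);    // raw input buffer, assumed float32
  float* out = static_cast<float*>(outputs[0].data);   // raw output buffer
  for (int64_t i = 0; i < inputs[0].size(); i++)
    out[i] = in[i];                                    // stand-in computation
  return 1;                                            // non-zero signals success in this sketch
}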

Users' Forward (i.e. FCompute) functions are called by a helper function, _opCallFCompute, that converts the C types passed across the library boundary into the familiar STL types. This function is implemented in the lib_api.h header file that users compile into their library.

int _opCallFCompute(fcomp_t fcomp, const char* const* keys, const char* const* vals, int num,
                    const int64_t** inshapes, int* indims, void** indata, int* intypes, int num_in,
                    const int64_t** outshapes, int* outdims, void** outdata, int* outtypes, int num_out);
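
A condensed sketch of that conversion is shown below; the setTensor() helper is hypothetical and stands in for whatever constructor lib_api.h actually provides for wrapping raw pointers into MXTensor objects.

// Sketch of the conversion done inside _opCallFCompute (not the actual
// implementation); setTensor() is a hypothetical helper.
int _opCallFCompute(fcomp_t fcomp, const char* const* keys, const char* const* vals, int num,
                    const int64_t** inshapes, int* indims, void** indata, int* intypes, int num_in,
                    const int64_t** outshapes, int* outdims, void** outdata, int* outtypes, int num_out) {
  // rebuild the attribute map from the parallel key/value arrays
  std::map<std::string, std::string> attrs;
  for (int i = 0; i < num; i++)
    attrs[keys[i]] = vals[i];

  // wrap the raw data pointers, shapes, and dtypes into MXTensor objects
  std::vector<MXTensor> inputs(num_in), outputs(num_out);
  for (int i = 0; i < num_in; i++)
    inputs[i].setTensor(indata[i], (MXDtype)intypes[i], inshapes[i], indims[i]);
  for (int i = 0; i < num_out; i++)
    outputs[i].setTensor(outdata[i], (MXDtype)outtypes[i], outshapes[i], outdims[i]);

  // dispatch to the user's forward function
  return fcomp(attrs, inputs, outputs);
}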

In MXNet's C API, the _opCallFCompute function is looked up in the loaded library. A lambda function, fcomp_conv, is created for each operator loaded from the library to convert from MXNet types to C types; these C types are then passed to _opCallFCompute.

auto fcomp_conv = [=](const nnvm::NodeAttrs& attrs,
                      const OpContext& ctx,
                      const std::vector<TBlob>& inputs,
                      const std::vector<OpReqType>& req,
                      const std::vector<TBlob>& outputs) { ... };
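
A rough sketch of what such a lambda body (the "..." above) does: it flattens the NodeAttrs dictionary and the TBlob shapes, dtypes, and data pointers into plain C arrays, then calls the library's _opCallFCompute through a function pointer obtained at load time. Variable and helper names below are illustrative.

// Illustrative lambda body: lower MXNet types to C types and call into the
// library through callFComp, the _opCallFCompute pointer found at load time;
// fcomp_fp is the user's forward function pointer from the library's registry.
std::vector<const char*> attr_keys, attr_vals;
for (auto& kv : attrs.dict) {
  attr_keys.push_back(kv.first.c_str());
  attr_vals.push_back(kv.second.c_str());
}

std::vector<const int64_t*> in_shapes, out_shapes;
std::vector<int> in_dims, in_types, out_dims, out_types;
std::vector<void*> in_data, out_data;
for (const TBlob& in : inputs) {
  in_shapes.push_back(in.shape_.data());
  in_dims.push_back(in.shape_.ndim());
  in_types.push_back(in.type_flag_);
  in_data.push_back(in.dptr_);
}
for (const TBlob& out : outputs) {
  out_shapes.push_back(out.shape_.data());
  out_dims.push_back(out.shape_.ndim());
  out_types.push_back(out.type_flag_);
  out_data.push_back(out.dptr_);
}

CHECK(callFComp(fcomp_fp, attr_keys.data(), attr_vals.data(), attr_keys.size(),
                in_shapes.data(), in_dims.data(), in_data.data(), in_types.data(), inputs.size(),
                out_shapes.data(), out_dims.data(), out_data.data(), out_types.data(), outputs.size()))
    << "Error calling FCompute for custom operator";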

The same design is used for parseAttrs, inferShape, inferType, etc.

Finally, an operator is re-registered in MXNet with the lambda function like this:

nnvm::Op &regOp = dmlc::Registry<nnvm::Op>::Get()->__REGISTER_OR_GET__(name);
regOp.set_attr<FCompute>("FCompute<cpu>", fcomp_conv);
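
Following the same pattern, the remaining attributes can be wired up on the same re-registered op; the names of the converter lambdas below are illustrative.

// Sketch of registering the other converted callbacks on the same op
// (the *_conv lambda names are illustrative).
regOp.set_attr_parser(attr_parser_conv);                              // parseAttrs
regOp.set_num_inputs(num_inputs_conv);
regOp.set_num_outputs(num_outputs_conv);
regOp.set_attr<nnvm::FInferType>("FInferType", infer_type_conv);      // inferType
regOp.set_attr<mxnet::FInferShape>("FInferShape", infer_shape_conv);  // inferShape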

Once the C API call returns to Python in the load function in library.py, we regenerate the Python bindings and re-register the operator shortcuts under mx.nd and mx.sym.

After the load function returns to the user's Python code, they can then use their operators just like any other operator:

mx.nd.myCustomOp(A,B,C)

Current Features

  • custom CPU operators
  • stateless & stateful operators
  • custom subgraph operators
  • Memory resource request

Future/Next-steps

(to be done in a separate PR)

  • custom GPU operators
  • Random number generator resource request
  • sparse data types
  • migrate lambda functions in MXLoadLib in src/c_api/c_api.cc to classes defined elsewhere
  • Documentation: add the "library" Python package namespace to the docs: https://mxnet.apache.org/api/python/docs/api/ ?

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • updated path to be absolute in example/lib_api/test.py
  • updated the Makefile/CMakeLists.txt to build the new example in lib_ops instead of lib_api
  • moved the library import in python/mxnet/__init__.py to after the ndarray/symbol imports
  • added operator reregistration to python/mxnet/library.py
  • added operator discovery/registration in MXLoadLib in src/c_api/c_api.cc

Comments

@samskalicky (Contributor, Author)

@wkcn while this PR is not quite done yet, it would be great to get some early feedback since the design/implementation has changed since our initial discussion. Let me know what you think, thanks!

@rondogency (Contributor)

@wkcn The 1.6 code freeze is tomorrow, so are you ok with this one not going into the 1.6 release? None of us has time to maintain it until late November. After the code freeze we can merge it on Friday, and users can use the nightly build to access this feature.

@wkcn (Member) commented Oct 24, 2019

@rondogency No problem : )

@wkcn (Member) commented Nov 28, 2019

Hi @samskalicky and @rondogency, is this PR ready to merge once CI passes?

@samskalicky (Contributor, Author)

Hi @samskalicky and @rondogency, is this PR ready to merge once CI passes?

Yes! We're soooooo ready to merge :)

Thanks @zachgk for rerunning the unix_cpu job!

@wkcn (Member) commented Dec 6, 2019

I will merge this PR after the CI passes.
Thanks to all contributors!

@wkcn added the pr-awaiting-merge label (Review and CI is complete. Ready to Merge) on Dec 6, 2019
@wkcn merged commit ae472c2 into apache:master on Dec 6, 2019
@rondogency (Contributor)

@wkcn Big thanks to Jackie for the merging work!


def check_platform():
    return platform.machine() not in ['x86_64', 'AMD64']

@unittest.skipIf(check_platform(), "not all machine types supported")
@unittest.skipIf(is_cd_run(), "continuous delivery run - ignoring test")
def test_library_loading():
def test_custom_op():
A Member commented:

It makes a strong assumption that the test case will be run from the MXNet root folder; otherwise, libsample_lib.so will not be found.

$ cd tests/python/unittest/
$ nosetests -v test_extensions:test_custom_op
test_extensions.test_custom_op ... ERROR

======================================================================
ERROR: test_extensions.test_custom_op
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/lvtao/miniconda3/envs/mxnet/lib/python3.6/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/lvtao/Workspace/mxnet-official/tests/python/unittest/test_extensions.py", line 41, in test_custom_op
    raise MXNetError("library %s not found " % lib)
mxnet.base.MXNetError: library libsample_lib.so not found

----------------------------------------------------------------------
Ran 1 test in 0.005s

FAILED (errors=1)

A Contributor replied:

Gotcha, I will fix it in the next PR.

@samskalicky mentioned this pull request on Dec 13, 2019
leezu pushed a commit that referenced this pull request Apr 8, 2020
Add random number generator support for custom operator libraries.

Design: MXNet passes the initialized and seeded random states, located on CPU and GPU, to the custom library, so users can generate deterministic values from a given seed passed to MXNet. Basically this workflow:

mx.random.seed(128)
r1 = mx.nd.some_custom_random_op(data)
mx.random.seed(128)
r2 = mx.nd.some_custom_random_op(data)
assert (r1 == r2)

This PR does not let the custom library generate exactly the same sequence of random numbers as MXNet.

This is a continuation of the custom operator project #15921 and #17270
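
On the library side, a custom operator could then draw from the seeded CPU state handed over by MXNet. The sketch below is hypothetical: the get_cpu_rand_states() accessor name and the four-argument forward signature with an OpResource are assumptions, and the state is simply treated as a std::mt19937 generator.

#include <random>

// Hypothetical sketch of a custom forward function consuming the seeded CPU
// random state passed in from MXNet; get_cpu_rand_states() is an assumed
// accessor name and data<float>() is an assumed typed-data accessor.
MXReturnValue someCustomRandomOp(std::map<std::string, std::string> attrs,
                                 std::vector<MXTensor> inputs,
                                 std::vector<MXTensor> outputs,
                                 OpResource res) {
  std::mt19937* gen = static_cast<std::mt19937*>(res.get_cpu_rand_states());
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  float* out = outputs[0].data<float>();
  for (int64_t i = 0; i < outputs[0].size(); i++)
    out[i] = dist(*gen);  // same MXNet seed -> same values, as in the workflow above
  return MX_SUCCESS;
}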
samskalicky pushed a commit to samskalicky/incubator-mxnet that referenced this pull request Apr 15, 2020
pengzhao-intel pushed a commit that referenced this pull request Apr 16, 2020
…18069)

* Dynamic subgraph compile support (#17623)

This PR adds support for passing the NDArrays from the existing optimize_for API down to the reviewSubgraph function in an external library. It also adds a new API for HybridBlock called optimize_for that can partition the model without running a forward pass.

Feature changes

    Adds new API to HybridBlock optimize_for that partitions the model but does not call the cachedOp
    Modifies the subgraph library example to optionally require args to be provided
    Adds annotation on subgraph inputs for the name of the original param so that inputs can be mapped and passes annotations to input nodes of subgraphs
    Adds support for tensors in MKLDNN format, calls Reorder2Default

New tests

    Adds a new test to partition operators that directly consume params
    add a new model to test where ops to be partitioned have args/params

Bug Fixes

    fixes bug in passing ids vector by value instead of by reference
    fixes bug in passing copies of attributes instead of by reference
    fixes bug where _cached_graph was not updated after partitioning
    fixes memory leak where user-specified attributes on subgraph ops were not freed if subgraph was rejected
    fixes problem incorrectly indexing into shape/dtype maps when annotating the graph

Docs

    Updates the README doc with the latest changes described above

* Adding sparse support to MXTensor for custom operators (#17569)

* Added enum for sparse storage

* Add structure for Dense and Sparse

* redesign the data structure for MXSparse

* pull out aux data from sparse NDArray

* Added more sparse arguments to API interface

* Passed sparse from c_api to lib_api.h and set in MXTensor

* Fix indent

* fix segfault

* Fix NDArray to MXTensor errors

* Add a sample of sparse(CSR) transpose

* Make CSR transpose temporarily work by hardcoding

* Fixed sparse output size(Refined)

* Add tests for symbolic and stateful ops

* Added a sample for row sparse transpose

* Added real row sparse transpose

* Fix output size issue by adding lambda for CheckAndAlloc()

* Fix mixed storage formats error

* Added infer storage type function

* resolve comments

* Set inferSType as optional function

* Resolve comments

* Add error messages

* Resolve comments

* verify transpose ops results

* fix sanity check

* update MX_LIBRARY_VERSION to 5

* Custom Operator Random Number Generator Support (#17762)

Add random number generator support for custom operator libraries.

Design: MXNet passes the initialized and seeded random states, located on CPU and GPU, to the custom library, so users can generate deterministic values from a given seed passed to MXNet. Basically this workflow:

mx.random.seed(128)
r1 = mx.nd.some_custom_random_op(data)
mx.random.seed(128)
r2 = mx.nd.some_custom_random_op(data)
assert (r1 == r2)

This PR does not let the custom library generate exactly the same sequence of random numbers as MXNet.

This is a continuation of the custom operator project #15921 and #17270

Co-authored-by: guanxinq <58794120+guanxinq@users.noreply.github.com>
Co-authored-by: Ziyi Mu <ziyi.mu@columbia.edu>
pengzhao-intel pushed a commit that referenced this pull request Apr 16, 2020
Labels: Operator, pr-awaiting-merge (Review and CI is complete. Ready to Merge)
8 participants