
MXNet Extensions enhancements #17885

Merged: 35 commits into apache:master on Apr 21, 2020

Conversation


@samskalicky commented Mar 22, 2020

Description

Enhancements to MXNet Extensions (subgraph property, custom ops, custom graph passes). Addresses a few points from #17236. More description on the way!

Features

  • Enhanced supportedOps to allow graph coloring (specifying which subgraph each op should be partitioned into)
  • Enhanced partitioners to use a Selector class instead of supportedOps
  • Added a new example of a custom subgraph op using the selector class
  • Cleaned up lib_api.h and library loading in c_api.cc
  • Added an option to silence the library-loading output that prints the ops, partitioners, and passes found in a library
  • Added support for custom graph passes, plus docs/README
  • Added a new example of a custom pass library with 2 example passes
  • Added support for allocating new args/aux within a pass, or replacing existing args/aux
  • Added the new custom pass lib to Makefile/CMakeLists.txt
  • Added building of sparse custom op libraries to Makefile/CMakeLists.txt
  • Compile custom libs with the lowest supported standard (C++11), and build with C++17 for testing against MXNet

SupportedOps Enhancements for graph coloring

In the custom partitioner API, custom library writers could implement the supportedOps API to specify which ops to include in a subgraph by setting True/False for each node ID in the graph. This PR adds support for assigning each node to a specific subgraph by giving it an integer subgraph ID, or -1 to indicate that the node can go into any subgraph.
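
To make this concrete, here is a minimal sketch of a supportedOps implementation that colors the graph. It is illustrative only: the function name and the coloring logic are hypothetical, and the exact signature should be checked against lib_api.h and the lib_subgraph examples.

```c++
#include <string>
#include <unordered_map>
#include <vector>
#include "lib_api.h"

// Hypothetical sketch: instead of marking each node true/false, assign each
// node an integer subgraph ID, or -1 to let MXNet place it in any subgraph.
MXReturnValue mySupportedOps(const std::string& json,
                             std::vector<int>* ids,
                             const std::unordered_map<std::string, std::string>& options) {
  for (size_t i = 0; i < ids->size(); i++) {
    // For illustration only: color the first half of the nodes into
    // subgraph 0 and leave the rest unconstrained.
    (*ids)[i] = (i < ids->size() / 2) ? 0 : -1;
  }
  return MX_SUCCESS;
}
```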

Selector class for subgraph creation

In the custom partitioner API, custom library writers could implement the supportedOps API to specify which ops to include in a subgraph, but some custom partitioners may want more control. This PR adds support for implementing the equivalent of MXNet's internal SubgraphSelector class in a custom library. Custom library writers can choose to implement either the supportedOps API or the CustomOpSelector class for their partitioner.
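
As a sketch, a custom selector might look like the following. The Select/SelectInput/SelectOutput method names mirror MXNet's internal SubgraphSelector; the exact base-class interface and constructor are assumptions to be verified against lib_api.h.

```c++
#include <string>
#include "lib_api.h"

// Hypothetical sketch of a CustomOpSelector implementation.
class MySelector : public CustomOpSelector {
 public:
  explicit MySelector(const std::string& json) : graph_json(json) {}
  // Return true if the node with this ID should be included in a subgraph.
  virtual bool Select(int nodeID) { return true; }
  // Return true to grow the subgraph across the edge to this input node.
  virtual bool SelectInput(int nodeID, int input_nodeID) { return false; }
  // Return true to grow the subgraph across the edge to this output node.
  virtual bool SelectOutput(int nodeID, int output_nodeID) { return false; }
 private:
  std::string graph_json;  // the model graph, for inspecting node attributes
};
```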

Custom Graph Passes

This PR adds the ability to register a custom graph pass in a library. Working backwards from the custom library writer, the Pass API is implemented with the following interface:

MXReturnValue myPass(const std::string& in_graph, const std::string** out_graph,
                     const std::unordered_map<std::string, std::string>& options,
                     const std::unordered_map<std::string, MXTensor>& args,
                     const std::unordered_map<std::string, MXTensor>& aux,
                     const PassResource& res);

The model graph is passed in as a JSON string, and the pass returns a new JSON string for the modified graph. Options specified by the MXNet user at the Python level are passed in the options map. If the MXNet user provided args/aux at the Python level, they are passed to the pass through the args/aux arguments. The PassResource class res exposes two functions, alloc_arg and alloc_aux, that let the custom library writer allocate NDArrays for new or replacement args/aux within the custom pass (a sketch of a complete pass body follows the registration snippet below). Custom passes are registered in the library with this syntax:

REGISTER_PASS(myPassName)
.setBody(myPassFunc);
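
Putting the pieces together, a minimal identity pass under the signature above might look like the sketch below. The alloc_arg call (shown in a comment) and its argument order are assumptions; consult lib_api.h and the pass-library example for the authoritative API.

```c++
#include <string>
#include <unordered_map>
#include "lib_api.h"

// Sketch of a pass body: returns the graph unchanged and notes where a new
// arg would be allocated via the PassResource.
MXReturnValue myPass(const std::string& in_graph, const std::string** out_graph,
                     const std::unordered_map<std::string, std::string>& options,
                     const std::unordered_map<std::string, MXTensor>& args,
                     const std::unordered_map<std::string, MXTensor>& aux,
                     const PassResource& res) {
  // A real pass would parse and rewrite the JSON graph here. A new or
  // replacement weight could be allocated with something like:
  //   MXTensor* w = res.alloc_arg("my_new_weight", {3, 2}, MXContext::CPU(0), kFloat32);
  *out_graph = new std::string(in_graph);  // hand the (modified) JSON back to MXNet
  return MX_SUCCESS;
}

REGISTER_PASS(myIdentityPass)
.setBody(myPass);
```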

For each custom pass found in the library during loading, a lambda function is registered with MXNet's pass registry:

auto pass_lambda = [=] (nnvm::Graph&& g) {
    ...
    CHECK(callGraphPass(...));
    ...
    return out_graph;
};

nnvm::PassFunctionReg& pass = dmlc::Registry<nnvm::PassFunctionReg>::Get()->__REGISTER__(myPassName);
pass.set_body(pass_lambda);
pass.set_change_graph(true);

From the MXNet front end, users invoke custom passes through the same optimize_for API. The backend name passed to optimize_for is first looked up among the registered subgraph backends; if it is not found there, it is looked up among the registered graph passes:

if (mxnet::op::SubgraphBackendRegistry::Get()->backend_map_.count(backend_name) > 0) {
    // use subgraph backend
    ...
} else if (dmlc::Registry<nnvm::PassFunctionReg>::Find(backend_name) != nullptr) {
    // use graph pass
    ...
    g = ApplyPass(std::move(g), backend_name);
    ...
}

Compilation

lib_api.h was designed to be as generic as possible so that users can compile their custom library with any version of C++ (C++11 or higher) and GLIBC that fits the needs of their application. This does not have to match the version of C++ or GLIBC used to compile MXNet itself. To test this, we compile the example libraries with C++11 to check for compiler errors, and also compile and test with C++17 to match what MXNet uses.


@samskalicky (Contributor, Author):

@mxnet-bot run ci [sanity]

@mxnet-bot:

Jenkins CI successfully triggered : [sanity]

@samskalicky requested a review from leezu as a code owner, April 7, 2020 23:27

samskalicky commented Apr 7, 2020

@mxnet-bot run ci [sanity]

@mxnet-bot:

Jenkins CI successfully triggered : [sanity]

@samskalicky (Contributor, Author):

@mxnet-bot run ci [sanity]

@mxnet-bot:

Jenkins CI successfully triggered : [sanity]

Review thread on include/mxnet/lib_api.h (outdated, resolved)
if(USE_CUDA)
add_library(customop_gpu_lib SHARED ${CMAKE_CURRENT_SOURCE_DIR}/example/extensions/lib_custom_op/relu_lib.cu)
target_include_directories(customop_gpu_lib PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include/mxnet)
endif()
if(MSVC)
if(UNIX)
Contributor:

those things can be deleted

Contributor Author:

done

Contributor:

I mean you don't need to add -shared even for customop_gpu_lib, just change the lines inside MSVC block

std::vector<MXTensor> outputs,
OpResource res) {
MXReturnValue backward(const std::unordered_map<std::string, std::string>& attrs,
std::vector<MXTensor>* inputs,
Contributor:

don't forget to update this change to lib_custom_op README

Contributor Author:

done

@rondogency (Contributor):

also please update the lib_api version to 10

@samskalicky (Contributor, Author):

> also please update the lib_api version to 10

changed to version 7

@ptrendx (Member) left a comment:

LGTM

@rondogency (Contributor) left a comment:

LGTM. Thanks for the hard work writing custom graph pass support and making the MXNet extensions well organized! I've left a few minor suggestions; if you feel they're not worth a CI run, feel free to ignore them.


Custom Operator support was merged (#15921, #17270) and is not available in versions of MXNet prior to v1.7.0.
To access the feature now, please install MXNet by compiling from source using master or using the previously mentioned commits, downloading one of the nightly builds, or from a release of MXNet 1.7.0+.
For running the following example, it doesn’t matter if it is a CUDA, MKLDNN or plain MXNet build; the custom operator doesn’t interact with the execution of other native MXNet operators.
To run the following example, the build type of MXNet doesn’t matter since the custom operator doesn’t interact with the execution of other native MXNet operators.
Contributor:

it is better to add a prerequisite here for running the examples, like "This requires GCC > 5 or CUDA > 9 to run the examples"

* register custom ops for library authors
* register custom ops, partitioner, and passes
* for library authors
* See example/extension/lib_custom_op/README.md
Contributor:

maybe we can rephrase it to "APIs to write extension library, see ... for registering custom operators, ... for custom partitioners, ... for custom graph passes"

@@ -45,7 +49,7 @@
#endif

/* Make sure to update the version number everytime you make changes */
#define MX_LIBRARY_VERSION 6
#define MX_LIBRARY_VERSION 7
Contributor:

I still feel it is better to make it 10

also, it would be better to add to the version-check message at c_api.cc line 339 something like "please update lib_api.h to match the version supported by the MXNet backend"


ptrendx commented Apr 21, 2020

Synced offline with @samskalicky and @rondogency - they are ok with doing the last changes proposed by @rondogency in another PR targeting master specifically, so this PR is ready to merge.

@ptrendx merged commit e761f84 into apache:master, Apr 21, 2020
@@ -200,6 +208,9 @@ If the number of input and output tensors are fixed, you can use hard-coded numb
* **inferType**: This function takes three arguments. The 1st argument is the attributes (same as above). The 2nd argument is a list of input data types corresponding to the input tensors. The 3rd argument is the placeholder for output tensor data types you need to assign.
For example, if this operator has one input and one output, and data type doesn't change, then you can do `outtypes[0] = intypes[0]` to populate the data type.

* **inferSType**: This function takes three arguments. The 1st argument is the attributes (same as above). The 2nd argument is a list of input storage types corresponding to the input tensors. The 3rd argument is the placeholder for output storage types you need to assign.
For example, if this operator has one input and one output, and data type doesn't change, then you can do `outtypes[0] = intypes[0]` to populate the data type.
Member:

data type doesn’t change -> data storage type doesn’t change

Member:

a list of input storage types corresponding to the input tensors -> a list of input storage types corresponding to the input tensors (dense, row_sparse, or CSR). For details, see https://cwiki.apache.org/confluence/display/MXNET/A+Guide+to+Implementing+Sparse+Operators+in+MXNet+Backend

It would be good to include the link above in case people wonder why/if inferSType is needed.

Contributor Author:

We need a whole overview for Sparse. Maybe you can help us add another section to the README about that.


```python
import mxnet as mx
mx.library.load(‘libmypass_lib.so’)
```

Member:

‘libmypass_lib.so’ -> 'libmypass_lib.so'

Contributor Author:

what am i missing? looks the same to me...


leezu commented Apr 21, 2020

@ptrendx let's not merge commits named "[WIP]" to master. You can edit the name prior to merge


ptrendx commented Apr 21, 2020

Oops, you are right, sorry, I missed that.

@samskalicky (Contributor, Author):

> @ptrendx let's not merge commits named "[WIP]" to master. You can edit the name prior to merge

We didn't want to rerun the whole CI. Did we fix the problem where renaming the PR reruns CI, @leezu?

@samskalicky changed the title from "[WIP] MXNet Extensions enhancements" to "MXNet Extensions enhancements", Apr 21, 2020
samskalicky added a commit to samskalicky/incubator-mxnet that referenced this pull request Apr 21, 2020
* add debug prints to debug error in CI
* add debug prints to debug error in CI
* remove prints
* initial commit
* enabled calling create for selector
* connected selector to call external class
* added code to remove temp graph attrs
* fixed build issues
* changed shape inference to use different attr names
* fixed selector class
* cleaned up APIs
* fixed sanity
* updated build for extensions
* sanity fix
* refactored MXLoadLib into separate functions
* undo rebase
* finished merge
* enabled verbose in library loading
* fixed example
* added passing args/aux down to graph pass
* added creating new args/aux for graph passes
* fixed return args/aux
* fixed sanity
* whitespace
* fixed lint
* updated perl API, README, added pass_lib to cmake build flow
* fixed mistake with relu example lib
* fixed perl syntax
* addressed comments
* addressed more comments
* fixed compile issues

Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-148.us-west-2.compute.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-217.us-west-2.compute.internal>
This was referenced Apr 21, 2020
samskalicky added a commit to samskalicky/incubator-mxnet that referenced this pull request Apr 21, 2020

leezu commented Apr 21, 2020

> We didn't want to rerun the whole CI. Did we fix the problem where renaming the PR reruns CI, @leezu?

@samskalicky when merging the commit, GitHub allows editing the commit message, so the commit message can differ from the PR title. Indeed, we can't currently edit the PR title without retriggering the CI (cc @ChaiBapchya).

ptrendx pushed a commit that referenced this pull request Apr 22, 2020
TaoLv pushed a commit that referenced this pull request Apr 23, 2020
AntiZpvoh pushed a commit to AntiZpvoh/incubator-mxnet that referenced this pull request Jul 6, 2020