This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[Keras] Support MXNET backend for Keras. #4173

Closed
shivarajugowda opened this issue Dec 9, 2016 · 25 comments

@shivarajugowda
Contributor

I have been working on supporting MXNET as a backend for Keras (a popular neural-network Python library that currently supports TensorFlow and Theano). I am hopeful the endeavor is a win-win for both projects: Keras will benefit from MXNET's multi-device/multi-node support, and MXNET will get broader exposure. The task will also exercise, and most probably enhance, MXNET's API capabilities for a broader audience.

To this end, I have started the process and have been able to check off the low-hanging APIs; I would say about 25% of the work is done. The amount of change required in Keras is not huge: we just need to add one more file along the same lines as tensorflow_backend.py.
I think most of the work will be in figuring out how to map the functionality to MXNET APIs and implementing the missing pieces.
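As a rough illustration of what that backend file involves, here is a hypothetical sketch of a few backend functions. NumPy stands in for mx.nd so the example runs anywhere; the function names follow the Keras backend contract, and a real backend would delegate to MXNET calls instead:

```python
import numpy as np

# Hypothetical sketch of a Keras-style backend module; NumPy stands in
# for mx.nd here, purely to illustrate the mapping work involved.

def dot(x, y):
    # A real MXNET backend would call mx.nd.dot(x, y) instead.
    return np.dot(x, y)

def relu(x):
    # A real MXNET backend would use an element-wise maximum with zero.
    return np.maximum(x, 0.0)

def reduce_sum(x, axis=None, keepdims=False):
    # Keras's K.sum: reduce over the given axis, optionally keeping dims.
    return np.sum(x, axis=axis, keepdims=keepdims)
```

Each backend entry point is a thin wrapper like these; the effort is in covering the whole surface and matching edge-case semantics, not in any single function.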

The backend tests are a good way to go about implementing it and tracking our progress.

Here is my rough current status, as measured by those tests.

APIs converted:

  • test_linear_operations
  • test_shape_operations
  • test_elementwise_operations
  • test_nn_operations
  • test_value_manipulation
  • test_random_normal
  • test_random_uniform
  • test_random_binomial

APIs I am currently working on:

  • test_gradient
  • test_rnn
  • test_rnn_no_states
  • test_conv2d
  • test_conv3d
  • test_pool2d
  • test_pool3d

APIs I think might need changes/updates to MXNET:

  • test_function
  • test_ctc
  • test_ctc_decode_greedy
  • test_ctc_decode_beam_search
  • test_one_hot
  • test_sparse_dot
  • test_sparse_concat
  • test_map
  • test_foldl
  • test_foldr

Things that are currently missing but are nice to have:

  • A Python distutils distribution for MXNET.
  • Core functionality provided in NDArray; for some of the functionality I had to fall back to the Symbol module.

First things first: I want to know whether this is in line with the MXNET community's needs and something you agree is worth pursuing. If we agree, I can use this issue as a high-level task to track and update my progress, and also to request more info/features and help with implementing them.

@piiswrong
Contributor

piiswrong commented Dec 9, 2016

@shivarajugowda
Thanks for the wonderful effort!
Yes, this is something we want. In fact, we have been talking about doing this for a while but didn't have the manpower.

Could you post your code somewhere so we can track progress and other people interested in this can pitch in?

@piiswrong
Contributor

piiswrong commented Dec 9, 2016

Also, please work with the nnvm branch (instead of master), since it will be released soon.

With the nnvm branch, all functions are available in both Symbol and NDArray.

@shivarajugowda
Contributor Author

@piiswrong Good to know this would be of value. Yes, I will start checking in whatever I have, but all the changes so far are in a Keras branch; I will link to it here once I check in (by Monday). As and when I need changes to MXNET, I will create a separate issue and summarize it here. I also appreciate the pointer to the NNVM branch; I was using the master branch until now. I will check out the new symbol in the NNVM branch.

@ivenzor

ivenzor commented Dec 10, 2016

Good work!

@anjishnu
Contributor

+1, great work. Would love to see this happen.

@shivarajugowda shivarajugowda changed the title Support MXNET backend for Keras. [Keras] Support MXNET backend for Keras. Dec 11, 2016
@jspisak
Contributor

jspisak commented Dec 12, 2016

This is really great! Thanks @shivarajugowda for jumping in on this..

@shivarajugowda
Contributor Author

Here is where you can monitor the progress on the Keras end.

Branch : https://github.com/shivarajugowda/keras
Issue: keras-team/keras#1313

@shivarajugowda
Contributor Author

Adding support for element-wise !=, >, >=, <, <= comparison operators as part of this. #4182
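For reference, the behaviour Keras expects from these comparison ops is an element-wise 0/1 tensor of the same shape as the inputs. A minimal NumPy sketch of those semantics (illustrative only, not the actual backend code):

```python
import numpy as np

# Element-wise comparisons as a Keras backend defines them: each returns
# a tensor of the same shape, 1.0 where the comparison holds, else 0.0.

def not_equal(x, y):
    return (x != y).astype(np.float32)

def greater(x, y):
    return (x > y).astype(np.float32)

def less_equal(x, y):
    return (x <= y).astype(np.float32)
```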

@shivarajugowda
Contributor Author

For Keras, we need support for sparse matrices; I see a proposal in #1524, but I am not sure how far along it is. For the time being I am using dense matrices underneath.

@shivarajugowda
Contributor Author

Filed #4248 and #4249 to support the mean and standard-deviation operators.
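As a reference for what those operators should compute: Keras reduces over a given axis, and its std is the population (ddof=0) form. A small NumPy check of those semantics (illustrative only):

```python
import numpy as np

# Axis-wise mean and population standard deviation, the semantics a
# Keras backend's K.mean and K.std are expected to follow.
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 4.0, 4.0]])

row_mean = np.mean(x, axis=1)        # per-row mean
row_std = np.std(x, axis=1, ddof=0)  # population std; a constant row gives 0
```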

@fchollet

For Keras, we need support for Sparse Matrices

If you leave out sparse tensor support, very little functionality would be lost. As long as you raise appropriate, helpful exceptions in the backend, it would be fine.

Do you foresee any issues with K.rnn, K.gradients? These two would be the tricky ones.

@shivarajugowda
Contributor Author

shivarajugowda commented Dec 16, 2016

@fchollet Thanks for the input. I am figuring out loop and conditional constructs in the context of RNNs. Apart from MXNET, I am also looking at the TensorFlow and Theano code. I will have more time during the Christmas break and will keep posting updates here.
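To make the K.rnn discussion concrete: its core is a scan over the time axis that threads states through a step function. A simplified NumPy sketch of that contract (no masking or unrolling options, purely illustrative):

```python
import numpy as np

def rnn(step_fn, inputs, initial_states):
    """Scan step_fn over the time axis (axis 1) of inputs.

    inputs: array of shape (batch, time, features).
    step_fn(x_t, states) -> (output_t, new_states).
    Returns (last_output, stacked_outputs, final_states).
    """
    states = initial_states
    outputs = []
    for t in range(inputs.shape[1]):
        output, states = step_fn(inputs[:, t, :], states)
        outputs.append(output)
    return outputs[-1], np.stack(outputs, axis=1), states

# Example step: a running sum over time, carried in a single state.
def step(x_t, states):
    s = states[0] + x_t
    return s, [s]
```

With this step function, the last output equals the sum of the inputs over the time axis, which makes the state-threading easy to verify.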

@piiswrong
Contributor

piiswrong commented Dec 17, 2016

The gradient pass is in there, but it is not exposed through the C API. @tqchen @jermainewang Is it possible to expose it?

@tqchen
Member

tqchen commented Dec 19, 2016

There are essentially two possible approaches. Keras takes a purely declarative, symbolic approach to both network definition and parameter updates, because the existing frameworks Keras works with are declarative.

However, it does not have to be so in MXNet, where the parameter-update part is handled automatically with imperative code. That need not be incompatible with the Keras API; most of the Keras API is about network definition.

So I would suggest the following approach:

  • Make the network-configuration API fully compatible with Keras.
  • Wrap the update logic with the imperative Module API (it is not long anyway).

I know this may take a bit of additional effort, but it also benefits from the multi-GPU API available in MXNet's Module.

As a second approach, we could reuse the gradient pass in MXNet and take a purely declarative approach, which I expect will take a bit more effort and may not directly come with multi-GPU support.

@tqchen
Member

tqchen commented Dec 19, 2016

My comment does not block any of the existing issues; instead it breaks the goal into two parts (and two layers of compatibility):

    1. Compatibility of the network-graph-generation API (without requiring gradient and scan)
      • The model-fitting logic is swapped out for the Module API in MXNet, with API compatibility.
    2. Compatibility of the underlying execution code, gradient generation, and optimization
      • I feel this part could be more framework-dependent, especially when it comes to multi-GPU and multi-machine.

I am all for both directions, but breaking it into two parts will make the milestones easier. I am in favor of quickly achieving 1, so everything is functioning, and possibly tackling 2 later.

@shivarajugowda
Contributor Author

@tqchen I am probably not familiar enough with the terminology used in the MXNET context; I could follow the broader idea but not all of the details. @tqchen, @piiswrong, how about a WebEx/Hangout to go over this and to validate that I am headed in the right direction? Let me know, and I can set one up whenever you have some time.

@piiswrong
Contributor

@shivarajugowda Yes we should have a meeting. What time zone are you in?

@shivarajugowda
Contributor Author

@piiswrong I am in the Pacific time zone (CA, Bay Area). I am available any time tomorrow.

@piiswrong
Contributor

piiswrong commented Dec 20, 2016 via email

@shivarajugowda
Contributor Author

shivarajugowda commented Dec 23, 2016

I have integrated conv2d and pool2d.
MXNET is missing support for 3D convolution (#4301) and pooling with 3D kernels.

@shivarajugowda
Contributor Author

NDArray.onehot_encode() only supports 1-D indices; we need support for multiple dimensions.
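For reference, the general behaviour being requested is a one-hot that accepts indices of any rank and appends a trailing class axis. A NumPy sketch of that behaviour (illustrative, not MXNET code):

```python
import numpy as np

def one_hot(indices, num_classes):
    # Index an identity matrix with the (possibly multi-dimensional)
    # integer indices; the result gains a trailing axis of size num_classes.
    return np.eye(num_classes, dtype=np.float32)[indices]
```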

@shivarajugowda
Contributor Author

Update:
I have mapped a few more operators (2D convolution/pooling, map_fn, foldl, foldr, etc.), and I am pursuing @tqchen's and @piiswrong's suggestion of "compatibility of the network-graph-generation API (without requiring gradient and scan), with the model-fitting logic swapped out for the Module API in MXNet" for the simple example keras/examples/mnist_mlp.py. The operators are mapped for this example, and I am working on using the MXNET Module API underneath Keras's Model.fit().
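Of the functional ops mentioned above, foldl/foldr reduce over the leading axis and map_fn applies a function to each slice along it. A plain-Python/NumPy sketch of those semantics (not the MXNET implementation):

```python
import numpy as np

def foldl(fn, elems, initializer):
    # Left fold over the leading axis, like Keras's K.foldl.
    acc = initializer
    for e in elems:
        acc = fn(acc, e)
    return acc

def foldr(fn, elems, initializer):
    # Right fold: same, but walks the leading axis in reverse.
    acc = initializer
    for e in elems[::-1]:
        acc = fn(acc, e)
    return acc

def map_fn(fn, elems):
    # Apply fn to each slice along the leading axis and re-stack.
    return np.stack([fn(e) for e in elems], axis=0)
```

A non-commutative fn makes the left/right distinction visible: with fn(acc, e) = 2*acc + e, the two folds give different results over the same elements.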

@imranshaikmuma

imranshaikmuma commented Jul 20, 2017

Is this still open? What is the progress? Does Keras have an API for MXNET now?
I don't see it in the Keras documentation. I like the MXNET context feature! Please let me know if it is available in Keras through the MXNET backend.

@shivarajugowda
Contributor Author

This issue can be closed now; the dmlc folks have a fork of Keras working with MXNET: https://github.com/dmlc/keras
https://medium.com/@julsimon/apache-mxnet-support-in-keras-83de7dec46e5

@imranshaikmuma

I am getting the following error:
[screenshot attachment not preserved]
