Added SELU, ELU and Swish as activation functions for Gluon Interface #9111

anjishnu · 2017-12-17T12:15:04Z

Description

Added SELU and ELU activation functions, as recommended in this thread: #8422
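For reference, the two functions named in the description can be sketched in plain Python. This is a scalar illustration of the standard definitions, not the code in this PR: the names `elu`, `selu`, `SELU_ALPHA` and `SELU_SCALE` are chosen here for illustration, and the constants are the ones given in the SELU paper (Klambauer et al., 2017).

```python
import math

# Constants from the SELU paper, rounded to double precision.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def elu(x, alpha=1.0):
    """ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise."""
    # math.expm1 computes exp(x) - 1 accurately for small |x|.
    return x if x > 0 else alpha * math.expm1(x)


def selu(x):
    """SELU: ELU with a fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)


if __name__ == "__main__":
    for value in (-2.0, 0.0, 2.0):
        print(value, elu(value), selu(value))
```

Positive inputs pass through ELU unchanged, while large negative inputs saturate at `-alpha` (and at `-SELU_SCALE * SELU_ALPHA` for SELU).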

Checklist

Essentials

Passed code style checking (make lint)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
[x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

dongzhuoyao · 2017-12-17T12:52:42Z

well done!

- Use module API instead of the deprecated model API - Reduce the number of epochs to 10 (which is sufficient) - When using GPU, use GPU 0 and not GPU 1. Many machines will have only one GPU - Use CPU by default. Make it easy for the user to switch to GPU if needed.

kobenaxie · 2017-12-18T02:45:01Z

Why not use 'HybridBlock'?
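As background for this question: in Gluon, a `Block` implements `forward` and always runs imperatively, while a `HybridBlock` implements `hybrid_forward(self, F, x, ...)`, where `F` is either the NDArray or the Symbol namespace, so the same code can also be turned into a symbolic graph after `hybridize()`. The toy below is plain Python, not MXNet code. `EagerOps`, `SymbolicOps` and `swish_forward` are hypothetical names used only to illustrate the idea of writing the computation once against a backend handle `F`.

```python
import math


class EagerOps:
    """Stand-in for an imperative backend: computes values immediately."""

    @staticmethod
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    @staticmethod
    def mul(a, b):
        return a * b


class SymbolicOps:
    """Stand-in for a symbolic backend: records an expression string."""

    @staticmethod
    def sigmoid(x):
        return "sigmoid(%s)" % x

    @staticmethod
    def mul(a, b):
        return "(%s * %s)" % (a, b)


def swish_forward(F, x):
    # Written once against F, in the spirit of hybrid_forward(self, F, x):
    # the caller decides whether F evaluates eagerly or builds a graph.
    return F.mul(x, F.sigmoid(x))


if __name__ == "__main__":
    print(swish_forward(EagerOps, 0.0))      # 0.0
    print(swish_forward(SymbolicOps, "x"))   # (x * sigmoid(x))
```

Because an activation layer like this uses only operators that exist in both namespaces, inheriting from `HybridBlock` typically costs nothing and lets users hybridize networks that contain it.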

chinakook · 2017-12-18T09:01:40Z

How about the Swish (https://arxiv.org/abs/1710.05941) activation function?
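For context, Swish as defined in the linked paper is x · sigmoid(βx). A minimal scalar sketch in plain Python, assuming a fixed (non-trainable) `beta`; the function names are illustrative and not taken from this PR:

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x). beta is treated as a constant here."""
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(value, swish(value))
```

For large positive inputs Swish approaches the identity, and for large negative inputs it approaches zero, which is why it is often compared with ReLU.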

anjishnu · 2017-12-18T13:42:56Z

@kobenaxie I'm not too sure about Blocks vs. HybridBlocks - what would be the difference? If it's just changing the class I inherit from, I can do that right now.

@chinakook That's a great idea! - Once this is merged we can extend it with other activations.

@cjolivier01 I added the license, and it seems the test is now failing because of non-ASCII characters in the author's name in the comment section, even though I've specified UTF-8 encoding. Do I need to remove those as well?

* Update linear-regression.md Reduce number of epoch to 20, as validation accuracy doesn't improve any further. Added assertion to check for achieved accuracy in preparation for tutorial regression. * fixed location of assertion moved assertion to correct location

* [BugFix][CoreML Converter] Dense layers w/o bias. The code was initially assuming bias to be true for dense layer. This change fixes that. Also, added a unit test to verify the change. This is related to issue: #8628 * attr -> attrs.

* usability improvements support for py3 and mac Signed-off-by: Rahul <rahulhuilgol@gmail.com> * more py3 changes * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix xrange Signed-off-by: Rahul <rahulhuilgol@gmail.com> * import compatibility Signed-off-by: Rahul <rahulhuilgol@gmail.com> * int numpy deprecation Signed-off-by: Rahul <rahulhuilgol@gmail.com> * add note about future Signed-off-by: Rahul <rahulhuilgol@gmail.com> * cpu and windows messages Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change cpickle to pickle for both py2 and py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * remove six dependency Signed-off-by: Rahul <rahulhuilgol@gmail.com>

* Fixed documentation in mnist tutorial * dimensions of weight and bias were incorrectly reported and the formula for FC layer didn't mention that weight is transposed. * Added a brief description of broadcasting. * Added a link to sym.broadcast_to() call that explains broadcasting and also links to numpy broadcasting semantics. * Included a brief conceptual explanation of broadcasting * Fixed a grammar error in the description of MLPs.

anjishnu · 2017-12-25T10:57:42Z

This commit is becoming a bit of a mess.
I have only changed the following files:

tests/python/unittest/test_gluon_contrib.py
incubator-mxnet/python/mxnet/gluon/contrib/activations.py
incubator-mxnet/python/mxnet/gluon/contrib/__init__.py

I tried to merge with upstream/master, and all those commits have appeared and are polluting this PR. The Python 2 tests are also failing, for a reason that seems completely unrelated to my changes. Does anyone know what might be happening here?

mseeger · 2017-12-28T10:32:29Z

This is all a bit of a mess. Why have so many commits? This should be a single commit. Please squash your commits and use rebase instead of merge.

piiswrong · 2018-01-31T20:02:10Z

The original author seems to have disappeared.

Can one of the committers take over?
@yzhliu @szha

szha · 2018-02-01T04:52:15Z

working on it in the above PR.

anjishnu · 2018-02-01T10:40:30Z

Sorry guys, I got a bit busy with some deadlines. I need more practice with GitHub PRs and with keeping them clean and concise.

added SELU and ELU activation functions

c56fe91

Kumar and others added 2 commits December 17, 2017 23:00

added license

b261e0e

srochel and others added 23 commits December 18, 2017 10:16

added assertion to validate accuracy (#9107)

d23b91d

Added assertion to validate achieved accuracy.

Remove defunct demo (#9060)

5858d62

* Update for MXNet 1.0. PEP8 fixes to code and misc improvements. Tested under Python 2.7 and Python 3.6 on Sagemaker. * Remove defunct tutorial page * Remove defunct demo * Remove duplicate material

add shared storage in windows (#8967)

df25378

* add shared storage in windows * fix * lint * fix * fix * fix * fix process.h

Updating the Python readme (#9075)

fb254c4

* Updating the Python readme * Addressing PR comments for /python dir

FCN example updates (#9066)

b53a13e

* Example updates to make it work on Python3, free of lint issues, more clear and easy to use * Addressing PR comments for FCN-xs example updates

image-classification example fix python3 compatibility (#9053)

c6d4e7f

More details to the windows build process (#8519)

b9b4532

* More details to the windows build process * Removed the package instructions.

csr slice operator, gpu implementation (#8814)

9b0a5ca

* csr slice, gpu implementation * update comments * test already exists * common impl of csr slice on dim one * remove unnecessary stream->wait * add doc * trigger

Usability fixes for examples (#9091)

16bd961

* Usability improvements for some examples * some more modifications * formatting fixes * shape fixed * comments added * fix * fix * comments addressed * fix * num-gpus changed to 1

Fix example/reinforcement-learning/dqn (#9128)

d1d3e97

* Some changes to make dqn compatible with python 3 * Update README * union of dict_items does not work in python2.7. Change to list

fix multi worker dataloader deadlock (#9126)

500ed5f

Fix nadam (#9127)

9272d9c

add zero-grad for rounding ops (#9040)

ae70769

* add zero-grad for rounding ops * Update test_symbol.py

Install docs - default to install CUDA 9 and cuDNN 7 (#9147)

a38ab6b

* changed url references from dmlc to apache/incubator-mxnet * updated to cuda9 and cudnn7

warning autotune show once (#9132)

b59943e

* warning_autotune * fix

Usability improvement bi lstm sort (#8944)

c71a2f3

* Improve usability for the bilstm example * Remove argparse from infer_sort since it changes existing usage

Fix float16 min and max (#9149)

080d29c

* Add unittest for float16 min and max * Add mshadow fix

Add Wikitext for gluon/rnnlm (#9090)

5c3acff

* Add wikitext-2 data for rnnlm example in gluon. * Add Wikitext2 for rnnlm. * Add performance data in WikiText-2.

removed special characters

51459e1

Kumar and others added 15 commits December 25, 2017 01:12

added SELU and ELU activation functions

57c54ce

added license

693734a

removed special characters

203d963

changing function signature

510f258

removing whitespace

2aefb55

added swish - credits @kobenaxie

22d69e3

adding unnecessary arguments to make jenkins test pass

932d568

adding unnecessary arguments to make jenkins test pass

c945235

changing function signature

f5fa9f9

removing whitespace

added unit tests

e77b9a3

make tests pass

70733ce

make tests pass

acc2628

making tests pass

cfa08a3

trying to make things pass

867ed31

Merge branch 'master' of https://github.com/anjishnu/incubator-mxnet

6c44c79

anjishnu requested review from yzhliu and thirdwing as code owners December 25, 2017 01:15

anjishnu added 5 commits December 25, 2017 11:14

changing to relative import to align with rnn package

27c18c1

fixing linter errors

a812cca

added new activations

531a938

moved to relative imports

4cc0a7e

making linter pass

d5377f9

szha self-assigned this Jan 31, 2018

szha mentioned this pull request Feb 1, 2018

Gluon PReLU, ELU, SELU, Swish #9662

Merged

8 tasks

szha closed this Feb 1, 2018
