Added SELU, ELU and Swish as activation functions for Gluon Interface #9111
Conversation
Well done!
- Use module API instead of the deprecated model API - Reduce the number of epochs to 10 (which is sufficient) - When using GPU, use GPU 0 and not GPU 1. Many machines will have only one GPU - Use CPU by default. Make it easy for the user to switch to GPU if needed.
Why not use `HybridBlock`?
How about the Swish (https://arxiv.org/abs/1710.05941) activation function?
@kobenaxie I'm not too sure about Blocks vs. HybridBlocks. What would be the difference? If it's just a matter of changing the class I inherit from, I can do that right now.
@chinakook That's a great idea! Once this is merged, we can extend it with other activations.
@cjolivier01 I added the license, and the test now seems to be failing because of non-ASCII characters in the author's name in the comment section, even though I've specified UTF-8 encoding. Do I need to remove those as well?
Added assertion to validate achieved accuracy.
* Update for MXNet 1.0. PEP8 fixes to code and misc improvements. Tested under Python 2.7 and Python 3.6 on Sagemaker. * Remove defunct tutorial page * Remove defunct demo * Remove duplicate material
* add shared storage in windows * fix * lint * fix * fix * fix * fix process.h
* Updating the Python readme * Addressing PR comments for /python dir
* Example updates to make it work on Python3, free of lint issues, more clear and easy to use * Addressing PR comments for CFN-xs example updates
* Update linear-regression.md Reduce number of epoch to 20, as validation accuracy doesn't improve any further. Added assertion to check for achieved accuracy in preparation for tutorial regression. * fixed location of assertion moved assertion to correct location
* More details to the windows build process * Removed the package instructions.
* csr slice, gpu implementation * update comments * test already exists * common impl of csr slice on dim one * remove unnecessary stream->wait * add doc * trigger
* [BugFix][CoreML Converter] Dense layers w/o bias. The code was initially assuming bias to be true for dense layer. This change fixes that. Also, added a unit test to verify the change. This is related to issue: #8628 * attr -> attrs.
* usability improvements support for py3 and mac Signed-off-by: Rahul <rahulhuilgol@gmail.com> * more py3 changes * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix cpickle changes py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change float used to index to int Signed-off-by: Rahul <rahulhuilgol@gmail.com> * fix xrange Signed-off-by: Rahul <rahulhuilgol@gmail.com> * import compatibility Signed-off-by: Rahul <rahulhuilgol@gmail.com> * int numpy deprecation Signed-off-by: Rahul <rahulhuilgol@gmail.com> * add note about future Signed-off-by: Rahul <rahulhuilgol@gmail.com> * cpu and windows messages Signed-off-by: Rahul <rahulhuilgol@gmail.com> * change cpickle to pickle for both py2 and py3 Signed-off-by: Rahul <rahulhuilgol@gmail.com> * remove six dependency Signed-off-by: Rahul <rahulhuilgol@gmail.com>
* Usability improvements for some examples * some more modifications * formatting fixes * shape fixed * comments added * fix * fix * comments addressed * fix * num-gpus changed to 1
* Some changes to make dqn compatible with python 3 * Update README * union of dict_items does not work in python2.7. Change to list
* add zero-grad for rounding ops * Update test_symbol.py
* changed url references from dmlc to apache/incubator-mxnet * updated to cuda9 and cudnn7
* warning_autotune * fix
* Fixed documentation in mnist tutorial * dimensions of weight and bias were incorrectly reported and the formula for FC layer didn't mention that weight is transposed. * Added a brief description of broadcasting. * Added a link to sym.broadcast_to() call that explains broadcasting and also links to numpy broadcasting semantics. * Included a brief conceptual explanation of broadcasting * Fixed a grammar error in the description of MLPs.
* Improve usability for the bilstm example * Remove argparse from infer_sort since it changes existing usage
* Add unittest for float16 min and max * Add mshadow fix
* Add wikitext-2 data for rnnlm example in gluon. * Add Wikitext2 for rnnlm. * Add performance data in WikiText-2.
removing whitespace
This commit is becoming a bit of a mess.
I tried to merge with upstream/master, and all those commits have appeared and are polluting this PR. The Python 2 tests are also failing for a reason that seems completely unrelated to my changes. Does anyone know what might be happening here?
This is all a bit of a mess. Why have so many commits? This should be a single commit. Please squash your commits and use rebase instead of merge.
Working on it in the above PR.
Sorry guys, I got a bit busy with some deadlines. I need more practice with GitHub PRs and with keeping them clean and concise.
Description
Added SELU and ELU activation functions, as recommended in thread #8422.
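For readers unfamiliar with the three activations named in this PR's title, here is a minimal scalar sketch of their standard definitions in plain Python. It is not the Gluon code from this PR: the function names and default parameters are illustrative assumptions, and the SELU constants are the commonly quoted values from the self-normalizing networks paper rather than anything taken from this thread.

```python
import math

# Assumed constants: the commonly quoted SELU values (not taken from this PR).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def selu(x):
    """SELU: a scaled ELU with fixed alpha and scale."""
    return SELU_SCALE * elu(x, SELU_ALPHA)


def _sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x)."""
    return x * _sigmoid(beta * x)
```

All three pass positive inputs through almost unchanged (SELU scales them) and saturate or decay for large negative inputs. A Gluon version would express the same formulas with array operators inside a block's forward method.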
Checklist
Essentials
make lint
Comments