This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Dropout may mask values even when ratio=0.0 #9816

Closed
DickJC123 opened this issue Feb 17, 2018 · 9 comments

Comments

@DickJC123
Contributor

DickJC123 commented Feb 17, 2018

One reasonable expectation is that a dropout layer would pass no values when the dropout ratio=1.0 and pass all values when the dropout ratio=0.0. However, the current dropout test fails a ratio=0.0 test under some seeds because some values are masked. Adapting from the current nn/dropout-inl.h, we have essentially:

prob_keep = 1.0 - ratio;                    // probability of keeping a value
rand_pick = random_uniform_pick(0.0, 1.0);  // uniform draw; endpoint inclusion depends on the rng
mask_out = (rand_pick < prob_keep ? 1.0 : 0.0) * (1.0 / prob_keep);  // keep-and-rescale mask
dropout_out = input_data * mask_out;

It must be that rand_pick above can include the 1.0 endpoint, so mask_out becomes 0.0. A quick fix might change the '<' to '<=', but then some values might be passed when the dropout ratio = 1.0 if rand_pick can be 0.0. The differences in RNG properties between CPU and GPU should also be considered: cuDNN RNGs return values in (0.0, 1.0], while most others return values in [0.0, 1.0). This can be reproduced on any commit of the ci_test_randomness3 PR #9791 by editing the line above 'def test_dropout()' in test_operator.py to be only "@with_seed()", then executing (output shown):

MXNET_TEST_SEED=990952066 nosetests --verbose tests/python/unittest/test_operator.py:test_dropout
[INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=881809903 to reproduce.
[WARNING] *** test-level seed set: all "@with_seed()" tests run deterministically ***
test_operator.test_dropout ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=990952066 to reproduce.
FAIL
======================================================================
FAIL: test_operator.test_dropout
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/dcarter/mxnet_dev/dgx/mxnet/tests/python/unittest/common.py", line 152, in test_new
    orig_test(*args, **kwargs)
  File "/home/dcarter/mxnet_dev/dgx/mxnet/tests/python/unittest/test_operator.py", line 4596, in test_dropout
    check_dropout_ratio(0.0, shape)
  File "/home/dcarter/mxnet_dev/dgx/mxnet/tests/python/unittest/test_operator.py", line 4561, in check_dropout_ratio
    assert exe.outputs[0].asnumpy().min() == min_value
AssertionError: 
-------------------- >> begin captured logging << --------------------
common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=881809903 to reproduce.
common: WARNING: *** test-level seed set: all "@with_seed()" tests run deterministically ***
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=990952066 to reproduce.
--------------------- >> end captured logging << ---------------------
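
As a minimal standalone sketch (editor's illustration, not MXNet code; the uniform draw is hard-coded rather than generated), the program below exercises the same keep-and-rescale logic at the upper endpoint and shows a draw of exactly 1.0 masking a value even at ratio=0.0:

#include <cstdio>

// Same logic as the dropout-inl.h excerpt quoted in the issue body.
float dropout_mask(float rand_pick, float ratio) {
    float prob_keep = 1.0f - ratio;
    return (rand_pick < prob_keep ? 1.0f : 0.0f) * (1.0f / prob_keep);
}

int main() {
    // ratio = 0.0 means every value should be kept, but a draw of exactly
    // 1.0 fails the strict '<' test and the value is masked to 0.
    std::printf("ratio=0.0, rand=0.999 -> mask=%f\n", dropout_mask(0.999f, 0.0f));
    std::printf("ratio=0.0, rand=1.0   -> mask=%f\n", dropout_mask(1.0f, 0.0f));
    return 0;
}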
@piiswrong
Contributor

This needs to be fixed.
One possible fix is to use an identity pass-through when ratio=0.

@DickJC123
Contributor Author

I like making ratio=0 a special case with an identity pass-through. During the same fix, how about making ratio=1 a special case as well, outputting all 0's (the current behavior is to pass inf or nan, I think)?
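
A hedged sketch of the special-casing proposed here (illustrative only, not the actual MXNet implementation; dropout_forward and rng are placeholder names) bypasses the RNG entirely at the two degenerate ratios, so the output no longer depends on which endpoint the generator can produce:

#include <cstddef>
#include <vector>

// 'rng' stands in for whatever uniform generator the backend provides;
// its endpoint behaviour no longer matters for ratio = 0.0 or 1.0.
std::vector<float> dropout_forward(const std::vector<float>& in, float ratio,
                                   float (*rng)()) {
    std::vector<float> out(in.size());   // value-initialized to 0.0f
    if (ratio == 0.0f) {
        out = in;                        // identity pass-through
    } else if (ratio == 1.0f) {
        // drop everything: leave the zeros, never compute 1/0
    } else {
        float prob_keep = 1.0f - ratio;
        for (std::size_t i = 0; i < in.size(); ++i) {
            float mask = (rng() < prob_keep ? 1.0f : 0.0f) / prob_keep;
            out[i] = in[i] * mask;
        }
    }
    return out;
}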

@sxjscience
Member

There was an issue about this in mshadow: dmlc/mshadow#213. The CUDA RNG generates values in (0.0, 1.0].

@samskalicky
Contributor

@DickJC123 re-running the test with your scenario appears to be working for me (today). Are you still experiencing a problem or can we close this issue?

@piyushghai
Contributor

piyushghai commented Jul 31, 2018

@samskalicky Able to reproduce the issue with seeds 111913613, 211508467, 1329041279

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ubuntu/incubator-mxnet/tests/python/unittest/common.py", line 172, in test_new
    orig_test(*args, **kwargs)
  File "/home/ubuntu/incubator-mxnet/tests/python/unittest/test_operator.py", line 5610, in test_dropout
    check_dropout_ratio(0.0, shape)
  File "/home/ubuntu/incubator-mxnet/tests/python/unittest/test_operator.py", line 5554, in check_dropout_ratio
    assert exe.outputs[0].asnumpy().min() == min_value
AssertionError:

@piyushghai
Contributor

When run on CPU, the flaky error is reproducible, as mentioned in the comments above.
When run on GPU 100k times, the test does not fail.

MXNET_TEST_COUNT=100000  nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_dropout
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/util.py:453: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  inspect.getargspec(func)
[INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1058042572 to reproduce.
test_operator_gpu.test_dropout ... [DEBUG] 1 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1077219613 to reproduce.
[DEBUG] 2 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1986463962 to reproduce.
[DEBUG] 3 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1446820082 to reproduce.
[DEBUG] 4 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1300194532 to reproduce.
[DEBUG] 5 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=576630976 to reproduce.
[DEBUG] 6 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1613399106 to reproduce.
.
.
.

[DEBUG] 99996 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=92629950 to reproduce.
[DEBUG] 99997 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=104203650 to reproduce.
[DEBUG] 99998 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1609587233 to reproduce.
[DEBUG] 99999 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=194799359 to reproduce.
[DEBUG] 100000 of 100000: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=675994555 to reproduce.
ok

----------------------------------------------------------------------
Ran 1 test in 7130.900s

OK

When run on GPU with the seeds that fail on CPU (111913613, 211508467, 1329041279), the test still does not fail.

MXNET_TEST_SEED=1329041279 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_dropout
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/util.py:453: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
  inspect.getargspec(func)
[INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1997611840 to reproduce.
/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/common.py:244: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  logger.warn('*** test-level seed set: all "@with_seed()" tests run deterministically ***')
[WARNING] *** test-level seed set: all "@with_seed()" tests run deterministically ***
test_operator_gpu.test_dropout ... [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1329041279 to reproduce.
ok

----------------------------------------------------------------------
Ran 1 test in 2.103s

OK

@samskalicky
Contributor

samskalicky commented Aug 8, 2018

The bug can be worked around by adding a threshold_eq (<=) operator in mshadow_op.h and calling it instead of threshold in dropout-inl.h.

Tested and working with seed 111913613.
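
A standalone sketch of that workaround is shown below (editor's illustration: the functors follow the Map() pattern used in mshadow_op.h, but the mshadow macros and exact signatures are omitted); the only difference between the two is the comparison at the upper endpoint:

#include <cstdio>

struct threshold {       // existing behaviour: strict less-than
  template <typename DType>
  static DType Map(DType a, DType b) { return a < b ? DType(1) : DType(0); }
};

struct threshold_eq {    // workaround: also accept the upper endpoint
  template <typename DType>
  static DType Map(DType a, DType b) { return a <= b ? DType(1) : DType(0); }
};

int main() {
  // With prob_keep = 1.0 (ratio = 0.0) and a draw of exactly 1.0,
  // 'threshold' drops the value while 'threshold_eq' keeps it.
  std::printf("threshold   (1.0, 1.0) = %f\n", threshold::Map(1.0f, 1.0f));
  std::printf("threshold_eq(1.0, 1.0) = %f\n", threshold_eq::Map(1.0f, 1.0f));
  return 0;
}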

anirudh2290 pushed a commit that referenced this issue Aug 20, 2018
* added mshadow op for threshold_eq (threshold currently does <, this will do <=)

modified dropout operator to use threshold_eq instead of threshold; this will ensure equivalent behavior for the random numbers generated on CPU [0, 1) and GPU (0, 1]

removed fixed seed for test_dropout

* removed comment about flaky test
@piyushghai
Contributor

@sandeep-krishnamurthy The related fix for this issue has been merged; this issue should be good to close.

XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this issue Aug 29, 2018
…pache#12091)

* added mshadow op for threshold_eq (threshold currently does <, this will do <=)

modified dropout operator to use threshold_eq instead of threshold; this will ensure equivalent behavior for the random numbers generated on CPU [0, 1) and GPU (0, 1]

removed fixed seed for test_dropout

* removed comment about flaky test
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this issue Sep 19, 2018
…pache#12091)

* added mshadow op for threshold_eq (threshold currently does <, this will do <=)

modified dropout operator to use threshold_eq instead of threshold; this will ensure equivalent behavior for the random numbers generated on CPU [0, 1) and GPU (0, 1]

removed fixed seed for test_dropout

* removed comment about flaky test
@ptrendx
Member

ptrendx commented Jan 16, 2019

Unfortunately, the issue with the dropout operator is not fully solved: when using seed 579061237 on the GPU, the RNG produces 0 for one of the outputs in the ratio=1.0 case, which, because of the use of <=, ends up being inf instead of nan. I encountered the issue in CI for my unrelated PR (and I tested without the PR; it is reproducible): http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-gpu/detail/PR-13890/2/pipeline

@samskalicky
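
The boundary arithmetic behind this report can be sketched as follows (editor's illustration with hypothetical values, assuming IEEE-754 float semantics): with ratio=1.0, prob_keep is 0.0, so the 1/prob_keep rescale is infinite and neither comparison yields the expected all-zero output once the GPU RNG can return exactly 0.0.

#include <cstdio>

int main() {
    float ratio = 1.0f, rand_pick = 0.0f;
    float prob_keep = 1.0f - ratio;   // 0.0
    float scale = 1.0f / prob_keep;   // inf
    float mask_lt = (rand_pick <  prob_keep ? 1.0f : 0.0f) * scale;  // 0 * inf = nan
    float mask_le = (rand_pick <= prob_keep ? 1.0f : 0.0f) * scale;  // 1 * inf = inf
    std::printf("ratio=1.0, rand=0.0: '<' -> %f, '<=' -> %f\n", mask_lt, mask_le);
    return 0;
}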

ptrendx mentioned this issue Jan 16, 2019