TrainTest fails sometimes #222
The random seed is not fixed for the environment; that is one possible reason the train test fails sometimes.
And there are other reasons: perhaps we should set
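One way to pin the environment seed is to seed every RNG source before the environments are created. The helper below is an illustrative sketch, not ALF's actual API; the function name `set_global_seed` is an assumption.

```python
# Sketch (helper name is hypothetical, not ALF's API): pin every RNG
# source before environments are created so each run sees the same stream.
import random

import numpy as np


def set_global_seed(seed):
    random.seed(seed)      # Python's builtin RNG
    np.random.seed(seed)   # NumPy, used by many gym environments
    # In a TF-based trainer you would also call tf.random.set_seed(seed)
    # and give each parallel environment its own derived seed,
    # e.g. seed + env_index.


set_global_seed(0)
first = np.random.rand()
set_global_seed(0)
assert np.random.rand() == first  # reseeding reproduces the stream
```

Note that seeding alone is not enough when multiple environments share an RNG stream, as the comments below point out.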
Sometimes the SAC case will also fail.
I think for the unit tests, to avoid the stochasticity introduced by parallelism, we can set num_envs=1 and not use async off-policy training.
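To picture why parallel rollouts break determinism even with a fixed seed: when several workers consume a shared RNG stream, CPU scheduling decides which worker gets which value. This is illustrative Python, not ALF code:

```python
# Illustration (not ALF code): with a shared, seeded RNG, the SET of
# drawn values is fixed, but the worker -> value assignment depends on
# thread scheduling, so per-worker trajectories differ run to run.
import random
import threading


def draw(rng, results, lock, worker_id):
    # Each worker takes the next value from the shared stream; which
    # worker runs first is up to the OS scheduler.
    with lock:
        results[worker_id] = rng.random()


shared_rng = random.Random(0)
ref_rng = random.Random(0)
expected = [ref_rng.random() for _ in range(4)]  # the fixed stream

results = {}
lock = threading.Lock()
threads = [threading.Thread(target=draw, args=(shared_rng, results, lock, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Same multiset of values every run...
assert sorted(results.values()) == sorted(expected)
# ...but results[i] is not reproducible across runs.
```

With num_envs=1 there is a single consumer, so the draw order is fixed and this source of nondeterminism disappears.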
OK, I thought the reason we changed unittest to tf.unittest was the determinism it provides. Then maybe the test threshold should be less strict next time.
So if everything has fixed random seeds, then the only stochasticity is from CPU scheduling for parallelism, right? What about using eager mode for the unit tests?
Yes, the only stochasticity is from CPU scheduling for parallelism; it affects the generation of random numbers. I have tried using eager mode for the train test with only one thread, but it still does not produce deterministic results (I still do not know why).
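One plausible contributor, beyond RNG consumption order: floating-point addition is not associative, so any reduction whose accumulation order varies (e.g. inside a parallelized op) can change results in the last bits even when every seed is fixed. A minimal demonstration:

```python
# Floating-point addition is not associative: the same three numbers
# summed in a different grouping give different results. A parallel
# reduction whose accumulation order depends on scheduling therefore
# need not be bit-reproducible, even with all seeds fixed.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)   # False: the two groupings differ in the last bit
```

Such last-bit differences can compound over thousands of gradient steps into visibly different episode returns.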
Hmm, interesting. @emailweixu Do you have any insight into this?
Using eager mode will make the test much longer.
I think @witwolf tried setting the seeds of environments deterministically. Even so, the results are nondeterministic.
```
======================================================================
FAIL: test_ppo_cart_pole (bin.train_test.TrainTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/ALF/alf/bin/train_test.py", line 100, in test_ppo_cart_pole
    self._test_train('ppo_cart_pole.gin', _test_func)
  File "/ALF/alf/bin/train_test.py", line 126, in _test_train
    assert_func(episode_returns, episode_lengths)
  File "/ALF/alf/bin/train_test.py", line 97, in _test_func
    self.assertGreater(np.mean(returns[-2:]), 198)
AssertionError: 197.8499984741211 not greater than 198
```
@witwolf It seems the determinism isn't working as expected? There are some other cases that fail sometimes, and I have to manually restart the testing job each time.
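Following the earlier suggestion to loosen the threshold: the failing assertion averages only the last two episodes against 198, so a near-miss like 197.85 flakes. A hypothetical, less brittle version (not the actual ALF test; the helper name, window, and margin are assumptions) averages more episodes and leaves a small margin:

```python
# Hypothetical tweak (not the actual ALF test): average a wider window
# of episode returns and use a threshold slightly below the 198 target,
# so a single noisy episode cannot fail the whole job.
import numpy as np


def assert_solved(returns, threshold=195.0, window=5):
    mean_return = np.mean(returns[-window:])
    assert mean_return > threshold, (
        f"{mean_return} not greater than {threshold}")


# Passes even though one late episode dips below the old 198 bar.
assert_solved([180.0, 190.0, 196.0, 199.0, 200.0, 197.8])
```

The trade-off is sensitivity: a looser threshold can mask a real regression, so the margin should stay small.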