TrainTest fails sometimes #222
The random seed is not fixed for the environment; that is one possible reason the train test fails sometimes.
And there are other reasons: perhaps we should set
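One way to pin the environment seed is to seed every RNG source before the environments are created. The helper below is an illustrative sketch, not ALF's actual API; the function name `set_global_seed` is an assumption.

```python
# Sketch (helper name is hypothetical, not ALF's API): pin every RNG
# source before environments are created so each run sees the same stream.
import random

import numpy as np


def set_global_seed(seed):
    random.seed(seed)      # Python's builtin RNG
    np.random.seed(seed)   # NumPy, used by many gym environments
    # In a TF-based trainer you would also call tf.random.set_seed(seed)
    # and give each parallel environment its own derived seed,
    # e.g. seed + env_index.


set_global_seed(0)
first = np.random.rand()
set_global_seed(0)
assert np.random.rand() == first  # reseeding reproduces the stream
```

Note that seeding alone is not enough when multiple environments share an RNG stream, as the comments below point out.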
Sometimes the SAC case will also fail.
I think for the unit tests, to avoid the stochasticity introduced by parallelism, we can set num_envs=1 and not use async off-policy training.
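To picture why parallel rollouts break determinism even with a fixed seed: when several workers consume a shared RNG stream, CPU scheduling decides which worker gets which value. This is illustrative Python, not ALF code:

```python
# Illustration (not ALF code): with a shared, seeded RNG, the SET of
# drawn values is fixed, but the worker -> value assignment depends on
# thread scheduling, so per-worker trajectories differ run to run.
import random
import threading


def draw(rng, results, lock, worker_id):
    # Each worker takes the next value from the shared stream; which
    # worker runs first is up to the OS scheduler.
    with lock:
        results[worker_id] = rng.random()


shared_rng = random.Random(0)
ref_rng = random.Random(0)
expected = [ref_rng.random() for _ in range(4)]  # the fixed stream

results = {}
lock = threading.Lock()
threads = [threading.Thread(target=draw, args=(shared_rng, results, lock, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Same multiset of values every run...
assert sorted(results.values()) == sorted(expected)
# ...but results[i] is not reproducible across runs.
```

With num_envs=1 there is a single consumer, so the draw order is fixed and this source of nondeterminism disappears.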
OK, I thought the reason we changed unittest to tf.unittest was the determinism it provides. Then maybe the test threshold should be less strict next time.
So if everything has fixed random seeds, then the only stochasticity is from CPU scheduling for parallelism, right? What about using eager mode for the unit tests?
Yes, the only stochasticity is from CPU scheduling for parallelism; it affects the generation of random numbers. I have tried using eager mode for the train test with only one thread, but it still does not produce deterministic results (I still do not know why).
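One plausible contributor, beyond RNG consumption order: floating-point addition is not associative, so any reduction whose accumulation order varies (e.g. inside a parallelized op) can change results in the last bits even when every seed is fixed. A minimal demonstration:

```python
# Floating-point addition is not associative: the same three numbers
# summed in a different grouping give different results. A parallel
# reduction whose accumulation order depends on scheduling therefore
# need not be bit-reproducible, even with all seeds fixed.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)   # False: the two groupings differ in the last bit
```

Such last-bit differences can compound over thousands of gradient steps into visibly different episode returns.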
Hmm, interesting. @emailweixu Do you have any insight into this?
Using eager mode will make the test much longer.
I think @witwolf tried setting the seeds of environments deterministically. Even so, the results are nondeterministic.
```
======================================================================
FAIL: test_ppo_cart_pole (bin.train_test.TrainTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/ALF/alf/bin/train_test.py", line 100, in test_ppo_cart_pole
    self._test_train('ppo_cart_pole.gin', _test_func)
  File "/ALF/alf/bin/train_test.py", line 126, in _test_train
    assert_func(episode_returns, episode_lengths)
  File "/ALF/alf/bin/train_test.py", line 97, in _test_func
    self.assertGreater(np.mean(returns[-2:]), 198)
AssertionError: 197.8499984741211 not greater than 198
```
@witwolf It seems the determinism isn't working as expected? There are some other cases that fail sometimes, and I have to manually restart the testing job each time.
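Following the earlier suggestion to loosen the threshold: the failing assertion averages only the last two episodes against 198, so a near-miss like 197.85 flakes. A hypothetical, less brittle version (not the actual ALF test; the helper name, window, and margin are assumptions) averages more episodes and leaves a small margin:

```python
# Hypothetical tweak (not the actual ALF test): average a wider window
# of episode returns and use a threshold slightly below the 198 target,
# so a single noisy episode cannot fail the whole job.
import numpy as np


def assert_solved(returns, threshold=195.0, window=5):
    mean_return = np.mean(returns[-window:])
    assert mean_return > threshold, (
        f"{mean_return} not greater than {threshold}")


# Passes even though one late episode dips below the old 198 bar.
assert_solved([180.0, 190.0, 196.0, 199.0, 200.0, 197.8])
```

The trade-off is sensitivity: a looser threshold can mask a real regression, so the margin should stay small.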