
TrainTest fails sometimes #222

Closed · hnyu opened this issue Oct 14, 2019 · 11 comments · Fixed by #224
hnyu (Collaborator) commented Oct 14, 2019

```
======================================================================
FAIL: test_ppo_cart_pole (bin.train_test.TrainTest)

Traceback (most recent call last):
  File "/ALF/alf/bin/train_test.py", line 100, in test_ppo_cart_pole
    self._test_train('ppo_cart_pole.gin', _test_func)
  File "/ALF/alf/bin/train_test.py", line 126, in _test_train
    assert_func(episode_returns, episode_lengths)
  File "/ALF/alf/bin/train_test.py", line 97, in _test_func
    self.assertGreater(np.mean(returns[-2:]), 198)
AssertionError: 197.8499984741211 not greater than 198
```

@witwolf It seems the determinism isn't working as expected? Some other cases also fail sometimes, and I have to manually restart the testing job each time.

witwolf (Contributor) commented Oct 15, 2019

The random seed is not fixed for the environment; that is one possible reason the train test fails sometimes.
Does any other case fail besides TrainTest?

witwolf (Contributor) commented Oct 15, 2019

And there are other reasons:

  1. multiple threads are used for parallelism between independent operations in TF
  2. the order in which ops run is uncertain (when they have no dependencies)

Perhaps we should set `tf.config.threading.set_inter_op_parallelism_threads(1)` for unittests.
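A minimal sketch of what that could look like in a test module; placing it at import time (before any TF op runs, which TF requires for this setting) is an assumption about where it would go in ALF's tests:

```python
# Sketch: limit TF to one inter-op thread for unittests, as suggested above.
# Must run before any TF op executes, e.g. at module import time.
import tensorflow as tf

tf.config.threading.set_inter_op_parallelism_threads(1)
# Optionally also serialize work inside a single op:
tf.config.threading.set_intra_op_parallelism_threads(1)
```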

hnyu (Collaborator, Author) commented Oct 15, 2019

> The random seed is not fixed for the environment; that is one possible reason the train test fails sometimes.
> Does any other case fail besides TrainTest?

Sometimes the SAC case will also fail.

hnyu (Collaborator, Author) commented Oct 15, 2019

> And there are other reasons:
>
>   1. multiple threads are used for parallelism between independent operations in TF
>   2. the order in which ops run is uncertain (when they have no dependencies)
>
> Perhaps we should set `tf.config.threading.set_inter_op_parallelism_threads(1)` for unittests.

I think for unittests, to avoid the stochasticity introduced by parallelism, we can set `num_envs=1` and not use async off-policy training.
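A rough sketch of how a test might force a single synchronous environment via a gin binding; the binding name below is hypothetical and would need to be replaced with ALF's actual configurable:

```python
# Sketch only: the binding name is a placeholder, not ALF's actual parameter.
import gin

gin.parse_config([
    # Hypothetical binding forcing a single (synchronous) environment in tests.
    'create_environment.num_parallel_environments = 1',
])
```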

witwolf (Contributor) commented Oct 16, 2019

It's hard to make the training produce deterministic results. I did some experiments with fixed seeds for TF and the environments, and with inter_op_parallelism_threads set to 1; the training can still produce different results across runs.

Personally, I think it's OK if a unittest fails occasionally.
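Roughly, the kind of setup tried in that experiment (the exact seeding points in ALF are an assumption, not the actual patch):

```python
# Sketch: fix the obvious seeds and use single-threaded inter-op scheduling.
import random

import numpy as np
import tensorflow as tf

SEED = 0
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)  # graph-level TF seed
tf.config.threading.set_inter_op_parallelism_threads(1)

# A Gym-style environment also needs its own seed, e.g.:
# env.seed(SEED)  # `env` is a hypothetical environment handle
```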

hnyu (Collaborator, Author) commented Oct 16, 2019

> It's hard to make the training produce deterministic results. I did some experiments with fixed seeds for TF and the environments, and with inter_op_parallelism_threads set to 1; the training can still produce different results across runs.
>
> Personally, I think it's OK if a unittest fails occasionally.

OK, I thought the reason we changed from unittest to tf.test was the determinism it provides. Then maybe the test threshold should be less strict next time.
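For example, a hypothetical relaxed check; the window size and threshold are illustrative choices, not the values from the actual fix in #224:

```python
import numpy as np


def assert_good_returns(test_case, returns, last_n=5, threshold=195.0):
    """Hypothetical relaxed check: average over a wider window and use a
    slightly lower bar than the current `np.mean(returns[-2:]) > 198`."""
    test_case.assertGreater(np.mean(returns[-last_n:]), threshold)
```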

hnyu mentioned this issue Oct 16, 2019
hnyu (Collaborator, Author) commented Oct 16, 2019

So if everything has fixed random seeds, then the only stochasticity is from CPU scheduling for parallelism, right? What about using eager mode for unittests?
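For reference, forcing eager execution of `tf.function`-decorated code in tests could look like this (API name as of TF 2.0; later releases rename it to `tf.config.run_functions_eagerly`):

```python
# Sketch: run all tf.function-decorated code eagerly, e.g. in a test's setUp().
import tensorflow as tf

tf.config.experimental_run_functions_eagerly(True)
```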

witwolf (Contributor) commented Oct 16, 2019

> So if everything has fixed random seeds, then the only stochasticity is from CPU scheduling for parallelism, right? What about using eager mode for unittests?

Yes, the only stochasticity is from CPU scheduling for parallelism; it affects the generation of random numbers. I have tried running the train test in eager mode with only one thread, but it still does not produce deterministic results (I still don't know why).

hnyu (Collaborator, Author) commented Oct 16, 2019

> > So if everything has fixed random seeds, then the only stochasticity is from CPU scheduling for parallelism, right? What about using eager mode for unittests?
>
> Yes, the only stochasticity is from CPU scheduling for parallelism; it affects the generation of random numbers. I have tried running the train test in eager mode with only one thread, but it still does not produce deterministic results (I still don't know why).

Hmm.. Interesting. @emailweixu Do you have any insight into this?

emailweixu (Contributor) commented

Using eager mode will make the test much longer.
Perhaps the game itself has some randomness inside?

hnyu (Collaborator, Author) commented Oct 16, 2019

> Using eager mode will make the test much longer.
> Perhaps the game itself has some randomness inside?

I think @witwolf tried setting the seeds of environments deterministically. Even so, the results are nondeterministic.

hnyu closed this as completed in #224 on Oct 16, 2019