Same seed different results #34

Open
GPaolo opened this issue Nov 14, 2019 · 5 comments

GPaolo commented Nov 14, 2019

Hi,
Even when I set the seed, I get different results across runs.

I am attaching a small script to reproduce the issue:

import gym
import pybullet
import pybulletgym

env = gym.make('AntMuJoCoEnv-v0')

env.seed(7)

obs = env.reset()
act = env.action_space.sample()
print(act)

obs = env.reset()
act = env.action_space.sample()
print(act)

The results I obtain are:

[ 0.36626342  0.64521587 -0.06823247  0.3184726   0.49362287  0.06656755
  0.8346416   0.71423185]
[-0.88181156 -0.4371754   0.45050308  0.7611305   0.27231112 -0.43260667
 -0.3026008  -0.3178519 ]

And they change on every run.

I think this is a bug, because the result should always be the same given the same seed.
Or am I doing something wrong?

benelot (Owner) commented Nov 15, 2019

Hello! Thanks for mentioning this! Can you check whether you get the same issue with the pybullet envs as well? Since resetting the state is handled directly by pybullet through its loading and saving mechanism, I have no influence on whether different runs execute deterministically.

GPaolo (Author) commented Nov 15, 2019

Just tested. To get consistent results across runs, the seed also has to be set for the action space and the observation space:

env.seed(7)
env.action_space.seed(7)
env.observation_space.seed(7)

To get the same results across resets, the seed has to be set again before every reset. This script:

import gym
import pybullet
import pybulletgym

env = gym.make('AntPyBulletEnv-v0')

env.seed(7)
env.action_space.seed(7)
env.observation_space.seed(7)

obs = env.reset()
act = env.action_space.sample()
print(act)

env.seed(7)
env.action_space.seed(7)
env.observation_space.seed(7)

obs = env.reset()
act = env.action_space.sample()
print(act)

returns:

[-0.44954607  0.83736265 -0.20760961  0.75181586 -0.01520521  0.25760308
  0.06112269 -0.45786014]
[-0.44954607  0.83736265 -0.20760961  0.75181586 -0.01520521  0.25760308
  0.06112269 -0.45786014]

This happens with every environment I tested.
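
The re-seed-then-reset pattern from the script above can be folded into one helper. This is only a minimal sketch: `seeded_reset` is a hypothetical name that is not part of Gym or pybulletgym, and it assumes the older Gym API used in this thread, where the env and both spaces each expose a `.seed()` method (later Gym releases changed the seeding API).

```python
def seeded_reset(env, seed):
    """Re-seed the env and both of its spaces, then reset.

    Hypothetical helper. It relies only on the calls used in the
    scripts above: env.seed(), env.action_space.seed(),
    env.observation_space.seed() and env.reset().
    """
    env.seed(seed)
    env.action_space.seed(seed)
    env.observation_space.seed(seed)
    return env.reset()
```

With this, `obs = seeded_reset(env, 7)` followed by `env.action_space.sample()` should give the same action after every reset, as in the output above.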

benelot (Owner) commented Nov 15, 2019

If you reset the seed in my envs every time, does that help too? If so, we can fix this so that the seed is stored across resets.

GPaolo (Author) commented Nov 21, 2019

Yes, I tried setting the seed before every reset with different environments from the repo, and it seems to work consistently.

I also think that the .seed() method should set not only the env seed but also the action_space and observation_space seeds, at least for consistency: in other Gym environments, .seed() was the only function I had to call to set the seed.
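
Until something like that is built into the envs, the proposed behaviour can be approximated from the outside. The sketch below is not pybulletgym code: `SeedEverything` is a hypothetical proxy class, written without importing Gym, that assumes the wrapped env follows the older Gym API with `.seed()` on the env and on both spaces.

```python
class SeedEverything:
    """Hypothetical proxy whose seed() also seeds both spaces."""

    def __init__(self, env):
        self.env = env

    def seed(self, seed=None):
        # Seed the env itself, then propagate the same seed to the
        # action and observation spaces, as suggested above.
        result = self.env.seed(seed)
        self.env.action_space.seed(seed)
        self.env.observation_space.seed(seed)
        return result

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward everything
        # else (reset, step, action_space, ...) to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)
```

Usage would look like `env = SeedEverything(gym.make('AntPyBulletEnv-v0'))` followed by a single `env.seed(7)`. In a real project, subclassing `gym.Wrapper` would probably be the more conventional route.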

benelot (Owner) commented Nov 21, 2019 via email
