The following code checks whether two instances of the PyBullet Ant environment produce the same observations when both environments are given the same seed and are driven by actions sampled from action spaces that were also given the same seed. The check fails intermittently on Python 3.5 and consistently on Python 3.6. For the life of me, I can't figure out what is causing the drift between the two environment instances.
System Specs:
Ubuntu 18.04
python 3.6 (also verified with python 3.5)
import sys

import gym
import pybulletgym  # registers the PyBullet environments with gym
import numpy as np

if __name__ == "__main__":
    # Two instances of the same environment, seeded identically.
    env1 = gym.make("AntPyBulletEnv-v0")
    env1.seed(0)
    env1.action_space.seed(0)
    env2 = gym.make("AntPyBulletEnv-v0")
    env2.seed(0)
    env2.action_space.seed(0)

    obs1 = env1.reset()
    obs2 = env2.reset()
    for i in range(100):
        # Observations must match before every step.
        if not np.array_equal(obs1, obs2):
            for e1, e2 in zip(obs1, obs2):
                if e1 != e2:
                    print(e1, e2)
            sys.exit("failed on obs")

        # Sampled actions must match as well.
        action1 = env1.action_space.sample()
        action2 = env2.action_space.sample()
        if not np.array_equal(action1, action2):
            print(action1, action2)
            for a1, a2 in zip(action1, action2):
                if a1 != a2:
                    print(a1, a2)
            sys.exit("failed on action")

        print("env 1")
        obs1, reward, done1, info = env1.step(action1)
        print(action1, obs1)
        print("env 2")
        obs2, reward, done2, info = env2.step(action2)
        print(action2, obs2)

        if done1:
            assert done2
            # Compare the final observations before resetting both environments.
            if not np.array_equal(obs1, obs2):
                for e1, e2 in zip(obs1, obs2):
                    if e1 != e2:
                        print(e1, e2)
                sys.exit("failed on obs")
            obs1 = env1.reset()
            obs2 = env2.reset()
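One way to narrow this down might be to take action sampling out of the picture and feed both instances the exact same pre-generated action array, so that any mismatch can only come from the environments themselves. The sketch below is not part of the original report: `first_divergence`, `CounterEnv`, and `drifting_factory` are hypothetical names, and `CounterEnv` is only a stand-in so the helper can run without PyBullet. It assumes the old gym API used above, where `step` returns four values.

```python
import numpy as np


def first_divergence(make_env, actions):
    """Step two fresh environments with identical actions.

    Returns None if every observation matches, -1 if the initial reset
    observations already differ, otherwise the index of the first step
    at which the two environments disagree.
    """
    env1, env2 = make_env(), make_env()
    obs1, obs2 = env1.reset(), env2.reset()
    if not np.array_equal(obs1, obs2):
        return -1
    for step, action in enumerate(actions):
        obs1, _, done1, _ = env1.step(action)
        obs2, _, done2, _ = env2.step(action)
        if done1 != done2 or not np.array_equal(obs1, obs2):
            return step
        if done1:
            # Both episodes ended together; compare the fresh resets too.
            obs1, obs2 = env1.reset(), env2.reset()
            if not np.array_equal(obs1, obs2):
                return step
    return None


class CounterEnv:
    """Hypothetical stand-in environment used only to exercise the helper."""

    def __init__(self, noise=0.0):
        self.noise = noise
        self.state = np.zeros(2)

    def reset(self):
        self.state = np.zeros(2)
        return self.state.copy()

    def step(self, action):
        # Deterministic update; a nonzero `noise` simulates drift.
        self.state = self.state + action + self.noise
        return self.state.copy(), 0.0, False, {}


def drifting_factory():
    """Factory whose second instance is slightly perturbed (hypothetical)."""
    noises = iter([0.0, 1e-6])
    return lambda: CounterEnv(noise=next(noises))


if __name__ == "__main__":
    actions = np.random.RandomState(0).uniform(-1, 1, size=(5, 2))
    print(first_divergence(CounterEnv, actions))          # identical envs
    print(first_divergence(drifting_factory(), actions))  # perturbed second env
```

For the Ant environment, `make_env` would presumably be a function that calls `gym.make("AntPyBulletEnv-v0")` and `env.seed(0)` before returning the instance, and `actions` would have eight columns to match the action vectors printed in the output.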
## Output
...
env 1
[-0.19167283 -0.24867578 0.57644254 -0.6455737 0.5354068 0.95332575
-0.48177752 0.2555853 ] [-0.22058617 -0.07561598 0.997137 0.10426999 -0.00999914 0.11529232
0.162765 -0.11414797 -1.0032024 0.16360427 0.50044924 0.0312518
-0.58839995 0.32609588 -0.03842217 -0.13290787 0.2856137 0.26151842
1.1269124 0.05937529 -0.35873523 0.02834896 -0.61619514 0.8499997
1. 0. 0. 0. ]
env 2
[-0.19167283 -0.24867578 0.57644254 -0.6455737 0.5354068 0.95332575
-0.48177752 0.2555853 ] [-0.22060278 -0.07551313 0.9971448 0.10177492 -0.01480584 0.11399316
0.1625488 -0.11415483 -1.0028749 0.17036478 0.49990714 0.03061081
-0.58719516 0.33724806 -0.03812427 -0.13370141 0.28572914 0.2649531
1.1260623 0.05638258 -0.3578396 0.03535499 -0.6153823 0.8513867
1. 0. 0. 0. ]
-0.22058617 -0.22060278
-0.07561598 -0.07551313
0.997137 0.9971448
0.10426999 0.101774916
-0.009999137 -0.014805844
0.11529232 0.11399316
0.162765 0.1625488
-0.11414797 -0.11415483
-1.0032024 -1.0028749
0.16360427 0.17036478
0.50044924 0.49990714
0.031251803 0.030610807
-0.58839995 -0.58719516
0.32609588 0.33724806
-0.038422175 -0.03812427
-0.13290787 -0.13370141
0.2856137 0.28572914
0.26151842 0.2649531
1.1269124 1.1260623
0.059375294 0.056382578
-0.35873523 -0.3578396
0.028348956 0.035354994
-0.61619514 -0.6153823
0.8499997 0.8513867
failed on obs
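To get a feel for the size of the drift, the mismatching pairs printed above can be summarized numerically. The snippet below is only a sketch: `summarize_drift` is a hypothetical helper, and the two arrays are copied from the printed pairs, with the four trailing entries that matched (1, 0, 0, 0) appended.

```python
import numpy as np

# Observation entries copied from the printed pairs above
# (env 1 first, env 2 second); the last four entries were identical.
env1_obs = np.array([
    -0.22058617, -0.07561598, 0.997137, 0.10426999, -0.009999137, 0.11529232,
    0.162765, -0.11414797, -1.0032024, 0.16360427, 0.50044924, 0.031251803,
    -0.58839995, 0.32609588, -0.038422175, -0.13290787, 0.2856137, 0.26151842,
    1.1269124, 0.059375294, -0.35873523, 0.028348956, -0.61619514, 0.8499997,
    1.0, 0.0, 0.0, 0.0,
])
env2_obs = np.array([
    -0.22060278, -0.07551313, 0.9971448, 0.101774916, -0.014805844, 0.11399316,
    0.1625488, -0.11415483, -1.0028749, 0.17036478, 0.49990714, 0.030610807,
    -0.58719516, 0.33724806, -0.03812427, -0.13370141, 0.28572914, 0.2649531,
    1.1260623, 0.056382578, -0.3578396, 0.035354994, -0.6153823, 0.8513867,
    1.0, 0.0, 0.0, 0.0,
])


def summarize_drift(a, b):
    """Count differing entries and locate the largest absolute difference."""
    diff = np.abs(np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64))
    worst = int(np.argmax(diff))
    return {
        "n_differing": int(np.count_nonzero(diff)),
        "worst_index": worst,
        "max_abs_diff": float(diff[worst]),
    }


summary = summarize_drift(env1_obs, env2_obs)
print(summary)
```

On these numbers, 24 of the 28 entries differ, and the largest gap is roughly 0.011 at index 13, which is far more than float rounding noise would explain for a single step.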