-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save/Resume support #42
Conversation
As far as I know, the reset mechanism in pybullet is based on saving and loading the state again. I would suggest to look into this as I believe it is not disk-based. |
Yes, I tried first to work with the same mechanism used in |
Hi, thanks for your great work for saving and restoring state in pybullet-gym. It helped me a lot. import gym
import pybulletgym
if __name__ == '__main__':
env = gym.make("HalfCheetahMuJoCoEnv-v0")
env.seed(0)
env.action_space.seed(0)
obs, done = env.reset(), False
for i in range(200):
nx_obs, rew, done, _ = env.step(env.action_space.sample())
env.save_env_state('state.bullet')
act = env.action_space.sample()
state1 = env.step(act)
env.restore_env_state('state.bullet')
state2 = env.step(act)
print(state1)
print(state2) # the output of the test
(array([-4.1603717e-01, 1.4024903e-01, 3.2494479e-01, 3.7975675e-01,
5.4050684e-01, 8.0007428e-01, -1.2108239e+00, -1.9657683e+00,
-4.3360400e-01, -3.7912399e-01, 5.7443666e-01, -3.6629651e+00,
2.3992314e+00, 1.3094640e+01, -4.5016697e-03, -4.8106198e+00,
1.6316326e+01], dtype=float32), -0.6035058938981799, False, {})
(array([-4.1603717e-01, 1.4024903e-01, 3.2494479e-01, 3.7975675e-01,
5.4050684e-01, 8.0007428e-01, -1.2108239e+00, -1.9657683e+00,
-4.3360400e-01, -3.7912399e-01, 5.7443666e-01, -3.6629651e+00,
2.3992314e+00, 1.3094640e+01, -4.5016697e-03, -4.8106198e+00,
1.6316326e+01], dtype=float32), -0.22421693801879883, False, {}) As expected, state1 and state2 should exactly be the same, but the two Specifically, taking the env HalfCheetahMuJoCoEnv-v0 as an example. In HalfCheetah, After restoring the saved state, the state of the emulator did change. But the Therefore, I think we should save the The specific implementation is: def save_env_state(self, file_path):
self.prev_pos = self.robot.pos_after
self._p.saveBullet(file_path)
def restore_env_state(self, file_path):
self.robot.pos_after = self.prev_pos
self._p.restoreState(fileName=file_path)
try:
os.remove(file_path)
except FileNotFoundError:
pass # After modification, the output of the test
(array([-0.406764 , 0.13484241, 0.36529362, 0.31630808, 0.58991945,
0.8009333 , -1.1393532 , -2.4010131 , -0.4189203 , -0.22211011,
0.41537938, -3.5352209 , 1.9927833 , 14.179333 , -0.05656279,
-3.5729892 , 14.309292 ], dtype=float32), -0.5812075076813373, False, {})
(array([-0.406764 , 0.13484241, 0.36529362, 0.31630808, 0.58991945,
0.8009333 , -1.1393532 , -2.4010131 , -0.4189203 , -0.22211011,
0.41537938, -3.5352209 , 1.9927833 , 14.179333 , -0.05656279,
-3.5729892 , 14.309292 ], dtype=float32), -0.5812075076813373, False, {}) the state1 and state2 are exactly the same. |
Started work on #40.
Because saving and loading using
pybullet.saveState()
is not working across python processes (which don't share memory), I decided to use thepybullet.saveBullet()
API which saves the state of the environment on the disk. This could potentially be expensive, so I added an example in which (on Linux at least) we can save the states to/run/shm/
which is a memory mapped file in fedora and ubuntu based distributions.I think ideally
pybullet
would return the serialized state so we can pass it around among python processes, but I looked at theC++
backend and it doesn't look trivial to implement.Compatibility with OpenAI/gym:
The contributors over at OpenAI/gym haven't decided on an API for saving and resuming state: openai/gym#402.
ALE implements it like this:
The recommended way for MuJoCo is:
Or, if the env is entirely written in python you can just
deepcopy
it. Other envs, such as those based onBox2d
simply don't have support for this.Eventually we will need to get close to the API they'll decide on.