
Save/Resume support #42

Closed
wants to merge 4 commits into from

Conversation

@floringogianu floringogianu commented Jan 28, 2020

Started work on #40.

Because saving and loading with pybullet.saveState() does not work across Python processes (which don't share memory), I decided to use the pybullet.saveBullet() API, which writes the state of the environment to disk. This could potentially be expensive, so I added an example in which (on Linux at least) we can save the states to /run/shm/, a RAM-backed tmpfs mount on Fedora- and Ubuntu-based distributions.

state_path = "/run/shm/state.bullet"  # or somewhere on the disk

env.save_env_state(state_path)
...
env = gym.make(...)
env.reset()  # this is necessary before restoring
env.restore_env_state(state_path)
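To avoid hard-coding the tmpfs path, the snapshot location could be picked at runtime. A minimal sketch (the helper name and the fallback behavior are my own, not part of pybullet-gym):

```python
import os
import tempfile

def bullet_state_path(filename="state.bullet"):
    """Pick a RAM-backed directory for the .bullet snapshot when available.

    /run/shm (or /dev/shm) is a tmpfs mount on most Linux distributions,
    so writes never touch the disk; fall back to the regular temp
    directory elsewhere.
    """
    for shm_dir in ("/run/shm", "/dev/shm"):
        if os.path.isdir(shm_dir) and os.access(shm_dir, os.W_OK):
            return os.path.join(shm_dir, filename)
    return os.path.join(tempfile.gettempdir(), filename)
```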

I think ideally pybullet would return the serialized state so we can pass it around among python processes, but I looked at the C++ backend and it doesn't look trivial to implement.

Compatibility with OpenAI/gym:

The contributors over at OpenAI/gym haven't decided on an API for saving and resuming state: openai/gym#402.

ALE implements it like this:

snapshot = env.ale.cloneState()
...
env.ale.restoreState(snapshot)

The recommended way for MuJoCo is:

saved_state = env.sim.get_state()
...
env.sim.set_state(saved_state)

Or, if the env is entirely written in Python, you can just deepcopy it. Other envs, such as those based on Box2D, simply don't support this.
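For the pure-Python case, deepcopy really is the whole story. A toy illustration (ToyEnv is a stand-in I made up, not a gym environment):

```python
import copy

class ToyEnv:
    """Stand-in for an environment implemented entirely in Python."""
    def __init__(self):
        self.t = 0

    def step(self):
        self.t += 1
        return self.t

env = ToyEnv()
env.step()                      # t == 1
snapshot = copy.deepcopy(env)   # full state capture, no disk involved
env.step()                      # t == 2
env = snapshot                  # roll back to the snapshot
assert env.step() == 2          # replays the same transition
```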

Eventually we will need to get close to the API they'll decide on.

@benelot (Owner) commented Jan 28, 2020

As far as I know, the reset mechanism in pybullet is based on saving the state and loading it again. I would suggest looking into this, as I believe it is not disk-based.

@floringogianu (Author) replied:
Yes, I first tried to work with the same mechanism used in env.reset(), as mentioned above, but it does not work between processes. If you have any pointers on how pybullet.saveState() can be made to work across Python processes, beyond how it is used in env.reset() and in this example here, let me know.

@mantle2048 commented Apr 21, 2021

Hi, thanks for your great work on saving and restoring state in pybullet-gym; it helped me a lot.
However, the following test shows there is a problem in the implementation.

import gym
import pybulletgym


if __name__ == '__main__':
    env = gym.make("HalfCheetahMuJoCoEnv-v0")
    env.seed(0)
    env.action_space.seed(0)
    obs, done = env.reset(), False
    for i in range(200):
        nx_obs, rew, done, _ = env.step(env.action_space.sample())

    env.save_env_state('state.bullet')
    act = env.action_space.sample()
    state1 = env.step(act)

    env.restore_env_state('state.bullet')
    state2 = env.step(act)
    print(state1)
    print(state2)
#  the output of the test
(array([-4.1603717e-01,  1.4024903e-01,  3.2494479e-01,  3.7975675e-01,
        5.4050684e-01,  8.0007428e-01, -1.2108239e+00, -1.9657683e+00,
       -4.3360400e-01, -3.7912399e-01,  5.7443666e-01, -3.6629651e+00,
        2.3992314e+00,  1.3094640e+01, -4.5016697e-03, -4.8106198e+00,
        1.6316326e+01], dtype=float32), -0.6035058938981799, False, {})
(array([-4.1603717e-01,  1.4024903e-01,  3.2494479e-01,  3.7975675e-01,
        5.4050684e-01,  8.0007428e-01, -1.2108239e+00, -1.9657683e+00,
       -4.3360400e-01, -3.7912399e-01,  5.7443666e-01, -3.6629651e+00,
        2.3992314e+00,  1.3094640e+01, -4.5016697e-03, -4.8106198e+00,
        1.6316326e+01], dtype=float32), -0.22421693801879883, False, {})

As expected, state1 and state2 should be exactly the same, but the two env.step(act) calls produce different rewards. After debugging, I found that pybullet.saveBullet() and pybullet.saveState() only save the state of the simulator, while the reward is computed in Python code.

Specifically, take the env HalfCheetahMuJoCoEnv-v0 as an example: self.robot represents the HalfCheetah robot in the env.

In HalfCheetah, self.pos_after records the current position of the robot, and calc_potential both updates self.pos_after and computes the potential, which is part of the reward.

After restoring the saved state, the simulator's state did change, but self.pos_after didn't, so the potential reward in the first env.step() after restoring will be zero.

Therefore, I think we should save pos_after of the current state in save_env_state, and then set pos_after back to the correct value in restore_env_state.

The specific implementation is:

def save_env_state(self, file_path):
    self.prev_pos = self.robot.pos_after
    self._p.saveBullet(file_path)

def restore_env_state(self, file_path):
    self.robot.pos_after = self.prev_pos
    self._p.restoreState(fileName=file_path)
    try:
        os.remove(file_path)  # requires `import os`
    except FileNotFoundError:
        pass
#  After modification, the output of the test
(array([-0.406764  ,  0.13484241,  0.36529362,  0.31630808,  0.58991945,
        0.8009333 , -1.1393532 , -2.4010131 , -0.4189203 , -0.22211011,
        0.41537938, -3.5352209 ,  1.9927833 , 14.179333  , -0.05656279,
       -3.5729892 , 14.309292  ], dtype=float32), -0.5812075076813373, False, {})
(array([-0.406764  ,  0.13484241,  0.36529362,  0.31630808,  0.58991945,
        0.8009333 , -1.1393532 , -2.4010131 , -0.4189203 , -0.22211011,
        0.41537938, -3.5352209 ,  1.9927833 , 14.179333  , -0.05656279,
       -3.5729892 , 14.309292  ], dtype=float32), -0.5812075076813373, False, {})

Now state1 and state2 are exactly the same.
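Building on this, one generic way to keep any Python-side bookkeeping in sync with the bullet snapshot is to write a pickle sidecar next to the .bullet file. This is only a sketch of a possible convention (save_full_state, restore_full_state, the attrs list, and the .pkl suffix are my own names, not pybullet-gym API):

```python
import pickle

def save_full_state(env, file_path, attrs=("pos_after",)):
    """Save the bullet snapshot plus the listed Python-side robot
    attributes in a pickle sidecar next to it."""
    env._p.saveBullet(file_path)
    side = {a: getattr(env.robot, a) for a in attrs if hasattr(env.robot, a)}
    with open(file_path + ".pkl", "wb") as f:
        pickle.dump(side, f)

def restore_full_state(env, file_path):
    """Restore the simulator state, then put the saved Python-side
    attributes back on the robot."""
    env._p.restoreState(fileName=file_path)
    with open(file_path + ".pkl", "rb") as f:
        for name, value in pickle.load(f).items():
            setattr(env.robot, name, value)
```

This keeps the per-env fix above working while making it easy to cover other envs that track extra Python-side state.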

@floringogianu floringogianu closed this by deleting the head repository Oct 30, 2023