
pybullet gym copy-safe #2536

Closed
matthieu637 opened this issue Dec 11, 2019 · 5 comments

Comments

@matthieu637

Hello,
I'm trying to sample several next observations from the current state. To do so, I usually use copy.deepcopy(a_gym_environment).
However, this doesn't work with pybullet; I observed that it is not copy-safe.

Example where the 2 prints should return the same output:

import pybullet_envs
import gym
import copy
import numpy as np

ac=[ 0, 0, 0, 0, 0, 0]
np.random.seed(0)

e1=gym.make("HalfCheetahBulletEnv-v0")
e1.seed(0)
e1.reset()

e2=copy.deepcopy(e1)
#e2=gym.make("HalfCheetahBulletEnv-v0")
print(e1.step(ac))
e2.seed(0)
e2.reset()
print(e2.step(ac))

It displays two different outputs, which means that e1 and e2 interfere with each other.

I guess it's because of the client-server architecture of pybullet. I tried to implement a __deepcopy__ method inside BulletClient, but it isn't working so far.

Any hints?

@erwincoumans
Member

erwincoumans commented Dec 11, 2019

PyBullet is a C plugin, so deepcopy cannot be used.

  1. Each copy would need its own unique copy of the simulation. Manually copy the state from one env to the other: use saveState to write to disk and restoreState to read it back.

  2. If you reuse PyBullet instances (which deepcopy effectively does, since it cannot copy the full C state), then manually use pybullet.saveState and restoreState.

Both require serious work, I suspect.
It seems easier to create multiple separate envs and copy the state of one env to the other(s).
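
For what it's worth, the reuse-one-instance idea in point 2 can be illustrated without pybullet at all. Below is a pure-Python toy sketch (FakeClient, FakeEnv, save_state, and restore_state are made-up names standing in for the BulletClient and pybullet.saveState/restoreState) showing why a "copy" that shares the C backend interferes with the original, and how snapshot/restore still makes rollouts reproducible:

```python
class FakeClient:
    """Toy stand-in for pybullet's C-side simulation state (hypothetical)."""
    def __init__(self):
        self.t = 0          # pretend this counter is the full physics state

    def step(self):
        self.t += 1
        return self.t

class FakeEnv:
    """Wraps a client the way a gym env wraps a BulletClient."""
    def __init__(self, client):
        self._p = client

    def step(self):
        return self._p.step()

    def save_state(self):
        # analogue of pybullet.saveState(): snapshot the backend state
        return self._p.t

    def restore_state(self, snapshot):
        # analogue of pybullet.restoreState()
        self._p.t = snapshot

# deepcopy cannot duplicate the C-side state, so a "copied" env ends up
# talking to the same backend as the original:
shared = FakeClient()
e1, e2 = FakeEnv(shared), FakeEnv(shared)
out1 = e1.step()   # advances the shared backend
out2 = e2.step()   # advances it again -- the two envs interfere

# save/restore makes a rollout reproducible on the single shared backend:
snapshot = e1.save_state()
out3 = e1.step()
e1.restore_state(snapshot)
out4 = e1.step()   # the same transition, replayed
print(out1, out2, out3, out4)
```

With a real BulletClient the snapshot is the simulation state rather than an integer, but the control flow of the workaround is the same.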

@floringogianu

@matthieu637 I just started working on support for this here: benelot/pybullet-gym#42, take a look for a discussion of how other envs are doing it.

Here's an example using the envs in pybullet directly; sorry if it's a little messy. It seems to be doing the right thing, but I'm not sure that's really the case. Maybe @erwincoumans could confirm it?

import time
import numpy as np
import multiprocessing as mp

from pybullet_envs.gym_locomotion_envs import *
from pybullet_envs.gym_manipulator_envs import *
from pybullet_envs.bullet.racecarZEDGymEnv import RacecarZEDGymEnv
np.set_printoptions(precision=4, suppress=True)


ENVS = {
    "Ant": AntBulletEnv,
    "Reacher": ReacherBulletEnv,
    "Pusher": PusherBulletEnv
}


def mc_rollout(env_name, state_path, crt_step):
    np.set_printoptions(precision=4, suppress=True)

    env = ENVS[env_name](render=False)

    obs = env.reset()
    env._p.restoreState(fileName=state_path)

    print(f"Loaded state from {state_path}.")

    Gt, step, done, first_action = 0, 0, False, None
    while not done:
        action = env.action_space.sample()
        obs, reward, done, _ = env.step(action)

        Gt += reward
        step += 1

        if step % 100 == 0:
            print(f"Rollout: did {step} steps.")

        if step == 1:
            print(f"\nState #{step} in rollout:\n", obs, "\n")
            first_action = action

        if step == (1000 - crt_step):
            break

    print(f"\nRollout done after {step} steps, return={Gt:3.2f}.")
    return Gt, step, first_action


def main():
    pool = mp.Pool(processes=1)
    
    env_name = "Reacher"
    env = ENVS[env_name](render=False)
    print(f"\nStarting {env_name} environment.\n")

    obs, done, roll_act = env.reset(), False, None
    Gt, step = 0, 0
    while not done:
        if step % 100 == 0:
            print(f"Main: did {step} steps.")

        action = env.action_space.sample() if roll_act is None else roll_act
        obs, reward, done, _ = env.step(action)

        if step == 500:

            state_path = "/run/shm/state.bullet"
            env._p.saveBullet(state_path)

            # start a monte-carlo rollout
            task = pool.starmap_async(mc_rollout, [(env_name, state_path, step)])
            # and wait for it to finish
            Gmc, Hmc, roll_act = task.get()[0]
            print(f"Rollout returned after steps={Hmc}, return={Gmc:3.2f}.\n")
        
        if step == 501:
            print(f"State #{step} in main:\n", obs, "\n")

        if step == 1000:
            break

        Gt += reward
        step += 1

    print(f"Done after {step} steps, return={Gt:3.2f}.")


if __name__ == "__main__":
    main()

@erwincoumans
Member

Thanks. Benelot's environments are not the same as pybullet_envs, the ones that ship with pybullet (in case you run pip3 install pybullet). So it would still be good to have those improvements in this repository.

Note that bullet_client is now in pybullet_utils, so the envs you are working on in Benelot's repo need to make this change (i.e., not using from pybullet_envs.bullet import bullet_client).

Note that stable baselines uses the environments in this repo. You can also train them in colab:
https://colab.sandbox.google.com/drive/15JSROMJbeiqxcUwifPR2NYeeFBKmyIlX#scrollTo=E2eWDjPZsQc5

See https://github.com/hill-a/stable-baselines and https://github.com/araffin/rl-baselines-zoo

@floringogianu

Thank you, I'll look at the resources you pointed to.

Just to make sure: the example I gave in the code snippet above uses pybullet_envs, the ones in this repo. All I am doing is using env._p (the pybullet BulletClient) for saving and reloading.

@erwincoumans
Member

@floringogianu Thanks, at a glance it looks good.
Closing this, since I don't expect further contributions in this area.
Thanks!
