(Example of) support for multi-valued Box actions? #44

Closed
sof opened this issue Jul 21, 2017 · 7 comments

@sof

sof commented Jul 21, 2017

When I try to run the TRPO agent on BipedalWalker as follows, I run into an IndexError:

foo$ PYTHONPATH=. python examples/openai_gym.py BipedalWalker-v2 -D -a TRPOAgent -c examples/configs/trpo_agent.json -n examples/configs/trpo_network.json
....
File "/../tensorforce/tensorforce/environments/openai_gym.py", line 67, in execute
 state, reward, terminal, _ = self.gym.step(action)
File "/usr/local/lib/python2.7/dist-packages/gym/core.py", line 99, in step
 return self._step(action)
File "/usr/local/lib/python2.7/dist-packages/gym/wrappers/time_limit.py", line 36, in _step
 observation, reward, done, info = self.env.step(action)
File "/usr/local/lib/python2.7/dist-packages/gym/core.py", line 99, in step
 return self._step(action)
File "/usr/local/lib/python2.7/dist-packages/gym/envs/box2d/bipedal_walker.py", line 372, in _step
 self.joints[1].motorSpeed     = float(SPEED_KNEE    * np.sign(action[1]))
IndexError: list index out of range

Looking at OpenAIGym.actions, it doesn't seem to unravel that environment's Box(4) action space as wanted. Am I just failing to configure the agent as required, or are such action spaces not handled right now?
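
For illustration, here is a minimal sketch of the failure mode the traceback suggests. It assumes the environment received a list holding a single value, which would match the "list index out of range" message at action[1]. The function below is a hypothetical stand-in that only mirrors the indexing pattern; it is not Gym's code.

```python
def step_like_bipedal_walker(action):
    """Hypothetical stand-in for an environment that reads four joint commands."""
    # Reads action[0] .. action[3], like the indexing seen in the traceback.
    return [float(action[i]) for i in range(4)]


# A single value wrapped in a list: action[0] works, action[1] does not.
try:
    step_like_bipedal_walker([0.5])
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# A full four-element action, as a Box(4) space expects, goes through.
ok = step_like_bipedal_walker([0.5, -0.2, 0.1, 0.0])

print(error_message)
print(ok)
```

If this reading is right, the agent would need to emit one value per action dimension rather than one value per step.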

@michaelschaarschmidt
Contributor

Hi,

This seems more like an issue with the interface to Gym. We will get on it. Thanks for bringing it up!

@AlexKuhnle
Member

Hey, so after trying to run this, I first realized that there is a dependency on Box2D, which needs to be installed. After installing it, though, it still does not work for me because of an exception from within Box2D:

  File ".../gym/envs/box2d/__init__.py", line 1, in <module>
    from gym.envs.box2d.lunar_lander import LunarLander
  File ".../gym/envs/box2d/lunar_lander.py", line 4, in <module>
    import Box2D
  File ".../Box2D/__init__.py", line 20, in <module>
    from .Box2D import *
  File ".../Box2D/Box2D.py", line 435, in <module>
    _Box2D.RAND_LIMIT_swigconstant(_Box2D)
AttributeError: module '_Box2D' has no attribute 'RAND_LIMIT_swigconstant'

So unfortunately I'm not even getting as far as you did.

@AlexKuhnle
Member

Nevertheless, you're right, we are not properly translating the Gym action interface in this case (and probably others), so thanks for pointing this out. In fact, our current setup requires that actions are all single-value, i.e. 0-dimensional. I realize now that this might not always be the most convenient way, so we will change this to allow action shapes. We should be able to fix this over the weekend, I'm pretty sure.
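
As a rough illustration of the distinction between a 0-dimensional action and a shaped one, here is a small sketch. The `action_spec` helper and its `size` key are hypothetical names used only for this example and are not part of TensorForce.

```python
def action_spec(shape=()):
    """Hypothetical helper: describe a continuous action of the given shape."""
    shape = tuple(shape)
    size = 1
    for dim in shape:
        size *= dim  # number of scalar values the agent must emit per step
    return dict(continuous=True, shape=shape, size=size)


single = action_spec()       # 0-dimensional: one float per step
walker = action_spec((4,))   # a Box(4)-style space: four floats per step

print(single)
print(walker)
```

With an empty shape the agent emits a single value per step, while a shape of (4,) asks for four values per step.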

@sof
Author

sof commented Jul 21, 2017

Great, thanks for looking into this right away. Evaluating TRPO+GAE with multiple continuous actions is particularly interesting.

(openai/gym#100 covers the state of Box2D; I had to compile pybox2d from sources to get something working.)

@AlexKuhnle
Member

Update: It turns out that a proper integration of action arrays requires quite a few adaptations across various classes. Unfortunately, these changes aren't finished yet, but I think they will be within the next 2-3 days. It will then be possible to define an action shape, e.g. dict(continuous=False, shape=(2, 3), num_actions=5), and the Gym environment should hopefully work.
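
To make the example spec concrete, here is a sketch that reads it as a 2x3 array of discrete choices, each drawn from 5 options. The `sample_action` helper is hypothetical and only illustrates that reading; the bounds used for the continuous case are an assumption as well.

```python
import random


def sample_action(spec, rng):
    """Hypothetical helper: draw one random action matching a spec dict."""
    def build(dims):
        if not dims:
            if spec["continuous"]:
                # Assumed bounds; the spec above does not define any.
                return rng.uniform(-1.0, 1.0)
            return rng.randrange(spec["num_actions"])
        # One nested list per leading dimension.
        return [build(dims[1:]) for _ in range(dims[0])]

    return build(tuple(spec["shape"]))


spec = dict(continuous=False, shape=(2, 3), num_actions=5)
action = sample_action(spec, random.Random(0))
print(action)  # a 2x3 nested list of integers in [0, 5)
```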

@AlexKuhnle
Member

This should work now. I haven't tried it on the BipedalWalker environment (because of the Box2D problems), but let me know if it does not work. I will close this issue for now, assuming that it does. Feel free to reopen it if it still does not work.

@sof
Author

sof commented Jul 26, 2017

Can confirm that BipedalWalker-v2's Box(4) actions are now handled just fine; thanks.
