Updated Observation Space for SawyerReachPushPickPlaceEnv #89
Conversation
Fix for issue Farama-Foundation#39. A detailed explanation of the logic behind these changes is found here: Farama-Foundation#39 (comment)
Out of an abundance of caution, you should first run garage SAC on this environment (random_init=True) and ensure that it can still train Reach and Push (PickPlace is flaky, I think).
Is it possible to write a test that verifies that your fix works, and that would break the previous committed code? It's hard to merge fixes like this without something like that.
I looked at the gym implementation here, and it looks like they set their action space based on an actuator's control range:
If I understand correctly, @ryanjulian and others were discussing the use of arm + workspace constraints (#39) to analytically determine the observation space (as opposed to empirically). As such, could we make use of something* like that?

*If you search this page for `ctrlrange`, you'll find that many objects include it as a property. I suppose we may not use that one in particular, but there's got to be something in the XML that can help us parameterize this.
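As a sketch of the idea, parameterizing bounds from the model XML could look something like this. The XML fragment and actuator names below are hypothetical, not taken from the actual Sawyer model files:

```python
import xml.etree.ElementTree as ET

# Hypothetical MuJoCo-style XML fragment; the real sawyer .xml files differ.
MODEL_XML = """
<mujoco>
  <actuator>
    <motor name="arm_joint_1" ctrlrange="-1.0 1.0"/>
    <motor name="arm_joint_2" ctrlrange="-0.5 0.5"/>
  </actuator>
</mujoco>
"""

def actuator_ctrl_ranges(xml_string):
    """Map each actuator name to its (low, high) control range."""
    root = ET.fromstring(xml_string)
    ranges = {}
    for motor in root.iter("motor"):
        low, high = (float(v) for v in motor.get("ctrlrange").split())
        ranges[motor.get("name")] = (low, high)
    return ranges

ranges = actuator_ctrl_ranges(MODEL_XML)
```

The same pattern would work for any other XML attribute (geom `pos`/`size`, joint limits, etc.) if `ctrlrange` turns out not to be the right one.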
If you look at
Combining the pos and size, we get the X & Y obj values that @adibellathur found here. The same goes for Z obj values, which can be found in
This seems to confirm his numbers for
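The arithmetic being described, sketched with made-up `pos`/`size` values (the real numbers live in the env's .xml files): MuJoCo's `size` attribute gives half-extents, so a box geom centered at `pos` spans `pos - size` to `pos + size` along each axis.

```python
# Hypothetical geom parameters: a box centered at `pos` with half-extents `size`.
# These values are illustrative, not the actual table geometry.
pos = [0.0, 0.6, 0.02]    # x, y, z center of the workspace box
size = [0.4, 0.2, 0.02]   # half-widths along x, y, z

# Object position bounds follow directly: center +/- half-extent per axis.
low = [p - s for p, s in zip(pos, size)]
high = [p + s for p, s in zip(pos, size)]
```

With these toy numbers, the object's X range is -0.4..0.4, Y is 0.4..0.8, and Z is 0.0..0.04; the same combination of `pos` and `size` from the real XML would yield the X/Y/Z obj bounds discussed above.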
Hello all, @avnishn the test for this change was in the previous PR, but due to confusion about getting the observation space for the active env in an ML1 env, that push was not approved. I will make a change and update that PR, hopefully allowing for a test that does not change MultiTask Env's handling of the observation space. Once that is done I can write a test script here. @haydenshively Currently I'm using the .xml files to get a lot of the observation space bounds. I'm just using experimentation to verify they are correct (both visually and through error checking). It was also easier to explain issues with experimentation. Also, as far as I am aware, the action_space and observation_space are two separate entities. The action space gives the range of possible actions the model can take (for example, sawyer_reach_push_pick_and_place uses it to set the mocap values), while the observation space gives the range of positions the objects can be in (e.g. robot arm and puck xyz values). Maybe there is something we can find in the gym implementation though, so I will take a look there. Thanks!
Yeah, action_space and observation_space are definitely separate; I was just trying to find examples of using XML to parameterize things. But if you've already been looking at it, that's great!
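A minimal sketch of the distinction (all bounds and shapes below are made-up, not metaworld's actual values):

```python
# Action space: range of commands the policy may output each step
# (e.g. mocap xyz deltas plus a gripper effort). Bounds are hypothetical.
action_low  = [-1.0, -1.0, -1.0, -1.0]
action_high = [ 1.0,  1.0,  1.0,  1.0]

# Observation space: range of positions the env may report
# (e.g. hand xyz followed by object xyz). Bounds are hypothetical.
obs_low  = [-0.5, 0.40, 0.05, -0.4, 0.4, 0.00]
obs_high = [ 0.5, 1.00, 0.30,  0.4, 0.8, 0.04]

# The two are separate entities: different lengths, different meanings.
assert len(action_low) != len(obs_low)
```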
@ryanjulian Sounds good. @avnishn is helping me run it as my personal computer runs out of memory after 30 epochs. How long should we run SAC for? |
You should run it until the environment converges. I think that can take up to 5M steps. |
https://tensorboard.dev/experiment/Zi4NkYjNQWqUPLljyXGoIQ/ @ryanjulian, @adibellathur's results look good to me, I'm inclined to merge this PR. |
Can you send an updated tensorboard.dev with a before/after comparison? |
@ryanjulian @adibellathur here is a tensorboard with old and new runs: https://tensorboard.dev/experiment/dD6gjzVYTniVw4GbTwrsxA/#scalars
@ryanjulian the results are in on the tensorboard in the above comment. |
Fix for issue #39 for a single environment.
A detailed explanation of the logic behind these changes is found here: #39 (comment)