Normalized gym env #125

Closed
wants to merge 7 commits into from

Collaborator

@zhanpenghe zhanpenghe commented Jun 6, 2018

This is basically a rewrite of normalized_env that conforms to the gym.Env interface.

See issue #64

else:
    raise NotImplementedError


Collaborator

@jonashen jonashen Jun 6, 2018

Maybe add bounds(), flatten_n_gym_space(), unflatten_gym_space(), and unflatten_n_gym_space()?

Collaborator

Or we can move these functions into a new helper class. I think that will be beneficial when porting the other rllab.Envs.

Owner

I agree these would be useful in something like rllab.env.utils
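
A minimal sketch of what such shared helpers could look like, using only NumPy. The function names and signatures below are hypothetical and are not taken from the PR or from rllab; the one-hot encoding for Discrete values is an assumption about what "flatten" should mean here.

```python
import numpy as np


def flatten_discrete(x, n):
    """One-hot encode integer x drawn from a Discrete space of size n."""
    onehot = np.zeros(n)
    onehot[x] = 1.0
    return onehot


def unflatten_discrete(flat):
    """Recover the integer from its one-hot encoding."""
    return int(np.argmax(flat))


def flatten_box(x):
    """Flatten a Box sample of any shape into a 1-D vector."""
    return np.asarray(x).flatten()


def unflatten_box(flat, shape):
    """Restore a flattened Box sample to the shape of the space."""
    return np.asarray(flat).reshape(shape)


def flatten_n_box(xs):
    """Flatten a batch of Box samples into shape (batch, dim)."""
    xs = np.asarray(xs)
    return xs.reshape((xs.shape[0], -1))
```

Keeping these as free functions in a utils module would let each ported env reuse them without inheriting from a common base class.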

raise NotImplementedError


class NormalizedGymEnv(gym.Env, Serializable):
Collaborator

Consider inheriting from Wrapper because this will take care of the render, close, seed, etc. functions.

def step(self, action):
    if isinstance(self._env.action_space, gym.spaces.Box):
        # rescale the action
        lb, ub = self._env.action_space.bounds
Collaborator

I think the bounds can sometimes be (-np.inf, np.inf), so you shouldn't normalize in that case.
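
To illustrate the concern: with infinite bounds, the rescaling expression `lb + (action + 1.) * 0.5 * (ub - lb)` evaluates to NaN, because it adds `-inf` and `inf`. One possible guard, sketched below with NumPy only, rescales the dimensions whose bounds are finite and passes the others through unchanged. The function name `scale_action` is hypothetical, and the per-dimension handling is an assumption; skipping normalization for the whole space would be an equally valid reading of the comment.

```python
import numpy as np


def scale_action(action, lb, ub):
    """Map action from [-1, 1] to [lb, ub], skipping unbounded dimensions."""
    action = np.asarray(action, dtype=float)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)

    # A dimension is rescaled only if both of its bounds are finite.
    finite = np.isfinite(lb) & np.isfinite(ub)

    # Substitute placeholder bounds so no arithmetic touches inf.
    safe_lb = np.where(finite, lb, 0.0)
    safe_ub = np.where(finite, ub, 0.0)

    scaled = safe_lb + (action + 1.0) * 0.5 * (safe_ub - safe_lb)
    scaled = np.clip(scaled, safe_lb, safe_ub)

    # Unbounded dimensions keep the original action value.
    return np.where(finite, scaled, action)
```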


def _apply_normalize_obs(self, obs):
    self._update_obs_estimate(obs)
    return (flatten_gym_space(obs, self._env.observation_space) -
Collaborator

For completeness, the returned obs should be "unflattened" again so that it stays consistent with what the original env returns. You should probably support unflattening for Discrete, Box, and Tuple spaces, since your flatten function handles these cases.
If unflattening adds significant computational overhead (benchmarking would clarify this), I would suggest a constructor parameter, e.g. flatten_obs (False by default), that lets the user skip the costly unflattening.
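
A rough sketch of how the suggested flag could work for array-valued (Box-like) observations. The class name `ObsNormalizer`, the `alpha` parameter, and the exponential-moving-average update are assumptions made for illustration; the PR's `_update_obs_estimate` is not shown in this thread, so this is not its actual implementation.

```python
import numpy as np


class ObsNormalizer:
    """Normalize observations, optionally restoring their original shape."""

    def __init__(self, obs_shape, flatten_obs=False, alpha=0.001):
        self._shape = tuple(obs_shape)
        self._flatten_obs = flatten_obs
        self._alpha = alpha
        dim = int(np.prod(self._shape))
        self._mean = np.zeros(dim)
        self._var = np.ones(dim)

    def __call__(self, obs):
        flat = np.asarray(obs, dtype=float).ravel()

        # Assumed estimator: exponential moving average of mean and variance.
        self._mean = (1 - self._alpha) * self._mean + self._alpha * flat
        self._var = ((1 - self._alpha) * self._var
                     + self._alpha * np.square(flat - self._mean))

        normalized = (flat - self._mean) / (np.sqrt(self._var) + 1e-8)

        if self._flatten_obs:
            # Caller opted out of unflattening; return the 1-D vector.
            return normalized
        # Default: return the same shape the wrapped env produced.
        return normalized.reshape(self._shape)
```

With `flatten_obs=False` the wrapper stays a drop-in replacement for the original env; for a Box space the reshape is cheap, so any overhead worth benchmarking would more likely come from Tuple or Discrete spaces.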

    lb, ub = self._env.action_space.bounds
    scaled_action = lb + (action + 1.) * 0.5 * (ub - lb)
    scaled_action = np.clip(scaled_action, lb, ub)
else:
Collaborator

I think Discrete is also a common action space and should be handled here.

Collaborator Author

I think Discrete does not need to be scaled.
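
Both points can be reconciled by handling Discrete explicitly as a pass-through, so that it no longer falls into the NotImplementedError branch. The sketch below uses small stand-in classes for `gym.spaces.Box` and `gym.spaces.Discrete` so that it runs without gym installed; the stand-ins and the function name `rescale_action` are hypothetical. It reads the bounds from `low` and `high`, which are the attribute names on gym's Box.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Box:
    """Stand-in for gym.spaces.Box (only the bounds are modeled)."""
    low: np.ndarray
    high: np.ndarray


@dataclass
class Discrete:
    """Stand-in for gym.spaces.Discrete."""
    n: int


def rescale_action(action, space):
    """Rescale Box actions from [-1, 1]; pass Discrete actions through."""
    if isinstance(space, Box):
        action = np.asarray(action, dtype=float)
        scaled = space.low + (action + 1.0) * 0.5 * (space.high - space.low)
        return np.clip(scaled, space.low, space.high)
    if isinstance(space, Discrete):
        # Discrete actions are indices, so there is nothing to scale.
        return action
    raise NotImplementedError(
        "unsupported action space: {}".format(type(space).__name__))
```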

@@ -0,0 +1,30 @@
import gym
from rllab.envs.normalized_gym_env import NormalizedGymEnv
Owner

PEP8: import grouping

Collaborator Author

Yes, I agree about the utils. I asked @jonashen to add them to his PR, since he is refactoring gym.Env, so let me reopen this after his PR.

@ryanjulian ryanjulian added this to the Week of June 4th milestone Jun 7, 2018
Owner

@ryanjulian ryanjulian left a comment

Since all envs will be gym.Envs as of #118, shouldn't this just replace normalized?

@zhanpenghe
Collaborator Author

Yes, the normalized class will be deleted when PR #129 is done.

Owner

@ryanjulian ryanjulian left a comment

Please open a GitHub issue to rename/replace normalized with this once the gym.Env change has landed.

@ryanjulian
Owner

Please reopen this PR against https://github.com/rlworkgroup/garage

@zhanpenghe zhanpenghe closed this Jun 18, 2018