[question] Is VecNormalize for PPO2 necessary? #694
Comments
Hello, you should also normalize the reward (but not during testing). Btw, why did you change the default values of
EDIT: your gamma looks quite small compared to the "classic" range
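For context, VecNormalize's reward normalization scales each reward by the standard deviation of a running estimate of the discounted return, which is why the `gamma` passed to the wrapper should match the algorithm's discount factor. Below is a minimal NumPy sketch of that idea; it is an illustration of the technique, not the library's exact implementation, and the names (`RunningStats`, `normalize_rewards`) are invented here.

```python
import numpy as np

class RunningStats:
    """Welford-style running mean/variance tracker (illustrative)."""
    def __init__(self):
        self.mean, self.var, self.count = 0.0, 1.0, 1e-4

    def update(self, x):
        batch_mean, batch_var, n = np.mean(x), np.var(x), len(x)
        delta = batch_mean - self.mean
        tot = self.count + n
        m_a = self.var * self.count
        m_b = batch_var * n
        m2 = m_a + m_b + delta ** 2 * self.count * n / tot
        self.mean = self.mean + delta * n / tot
        self.var, self.count = m2 / tot, tot

def normalize_rewards(rewards, gamma=0.99, clip=10.0):
    """Scale rewards by the std of a running discounted-return estimate."""
    stats = RunningStats()
    ret = 0.0
    out = []
    for r in rewards:
        ret = ret * gamma + r          # running discounted return
        stats.update(np.array([ret]))  # track its variance
        out.append(float(np.clip(r / np.sqrt(stats.var + 1e-8), -clip, clip)))
    return out
```

Because the variance of the discounted return depends on `gamma`, an unusually small `gamma` changes the scale the rewards are normalized to, which is likely why it was flagged in the comment above.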
@araffin Thanks for your reply!
```python
from stable_baselines.common.policies import MlpPolicy, FeedForwardPolicy
from stable_baselines.common.vec_env import DummyVecEnv, VecNormalize
from stable_baselines import PPO2
import robosuite as suite
import tensorflow as tf

env = suite.make('HumanReachMultiDir',
                 use_camera_obs=False,
                 has_renderer=True,
                 ignore_done=False,
                 has_offscreen_renderer=False,
                 horizon=500,
                 use_her=False,
                 use_indicator_object=True,
                 reward_shaping=True,
                 )
env = DummyVecEnv([lambda: env])
env = VecNormalize(env, norm_obs=True, norm_reward=False,
                   clip_obs=10., gamma=0.99)  # when training, norm_reward=True

model = PPO2.load('trained_model')
model.set_env(env)

obs = env.reset()
for _ in range(500):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()
```

Thank you very much!
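One thing worth knowing here: a freshly constructed VecNormalize starts with empty running statistics, so a reloaded policy initially sees observations scaled very differently from training, which can explain degraded test performance. A stripped-down sketch of the observation side (pure NumPy; `ObsNormalizer` and its attributes are illustrative, not the library's API):

```python
import numpy as np

class ObsNormalizer:
    """Running mean/std observation normalizer (illustrative)."""
    def __init__(self, shape, clip=10.0, eps=1e-8):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = 1e-4
        self.clip, self.eps = clip, eps
        self.training = True  # freeze updates at evaluation time

    def __call__(self, obs):
        if self.training:
            # update running statistics with the new observation
            self.count += 1
            delta = obs - self.mean
            self.mean = self.mean + delta / self.count
            self.var = self.var + (delta * (obs - self.mean) - self.var) / self.count
        return np.clip((obs - self.mean) / np.sqrt(self.var + self.eps),
                       -self.clip, self.clip)

# Training: statistics adapt. Testing: freeze updates and reuse the
# *same* normalizer so the policy sees familiar observation scales.
rng = np.random.default_rng(0)
norm = ObsNormalizer(shape=(3,))
for _ in range(1000):
    norm(rng.normal(2.0, 5.0, size=3))  # simulated training observations
norm.training = False                   # freeze for evaluation
```

In stable-baselines the running statistics live inside the VecNormalize wrapper itself, so they need to be saved alongside the trained model and restored before evaluation rather than re-created from scratch with a new wrapper.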
yes
Your code looks right, but in fact, for your two questions, I would suggest you use the RL Zoo; it does (almost) everything for you (and you can see tuned hyperparameters for similar envs).
@araffin Thanks for your suggestion and I will check them!
Best is to look at the code for those questions, but yes, it is for both.
Related: #698
Hi, thanks for the good repo!
I trained the same agent (a human model in MuJoCo) in the same environment with PPO2, once with only DummyVecEnv and once with DummyVecEnv plus VecNormalize. The result shows that the agent with VecNormalize is much worse than the one without it. I wonder if the reason is the way I load the model and reset?
Thank you for everyone's reply!