
max_grad_norm and use_clipped_value_loss #160

Closed
seungjaeryanlee opened this issue Dec 28, 2018 · 2 comments

@seungjaeryanlee
Hello! I was documenting your PPO code (algo/ppo.py) to improve my understanding of the algorithm, and I got confused by max_grad_norm and use_clipped_value_loss.

If I am understanding this correctly, max_grad_norm is given to nn.utils.clip_grad_norm_() to cap the gradient norm, and use_clipped_value_loss toggles a clipped value-function loss. However, I could not find the relevant details in the paper Proximal Policy Optimization Algorithms. If either is explicitly mentioned there, would you please point it out for me?
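For context, here is a minimal sketch of how I understand max_grad_norm being applied in an update step (variable names and values are illustrative, not the repo's exact code):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                        # stand-in for the actor-critic network
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
max_grad_norm = 0.5                            # hypothetical value

loss = model(torch.randn(8, 4)).pow(2).mean()  # dummy loss for illustration
optimizer.zero_grad()
loss.backward()
# Rescale all gradients together so their global L2 norm is at most max_grad_norm.
nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
optimizer.step()
```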

For L^VF, the paper seems to use the simple squared-error loss, equivalent to use_clipped_value_loss=False, but I could not find anything about the use_clipped_value_loss=True case. Is this a trick not mentioned in the paper?

Thank you in advance for your help. Happy holidays!

@ikostrikov
Owner

They introduced the new loss in the implementation of PPO2:
https://github.com/openai/baselines/blob/master/baselines/ppo2/model.py#L63

Also see gradient norm clipping here:
https://github.com/openai/baselines/blob/master/baselines/ppo2/model.py#L102
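In PyTorch terms, the clipped value loss from that first link looks roughly like this (a sketch with illustrative names, assuming clip_param is the same clip range used for the policy ratio):

```python
import torch

def value_loss(values, old_values, returns, clip_param, use_clipped_value_loss=True):
    if use_clipped_value_loss:
        # Keep the new value prediction within clip_param of the old prediction,
        # mirroring the policy's clipped surrogate objective.
        values_clipped = old_values + (values - old_values).clamp(-clip_param, clip_param)
        unclipped = (values - returns).pow(2)
        clipped = (values_clipped - returns).pow(2)
        # Take the element-wise maximum (the pessimistic bound) of the two losses.
        return 0.5 * torch.max(unclipped, clipped).mean()
    # Plain squared-error loss, i.e. the paper's L^VF.
    return 0.5 * (values - returns).pow(2).mean()
```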

@seungjaeryanlee
Author

Thank you for the links! I see how they correspond to those parts of PPO2 in OpenAI Baselines.

It's unfortunate that these changes are not written up in any paper. I guess I will have to read the openai/baselines code as well.
