
Divide by zero #1

Open
pedronahum opened this issue Mar 28, 2018 · 4 comments

Comments

@pedronahum

Hi,
First and foremost, thanks for sharing the code. This is greatly appreciated.

I am currently testing ARS in other learning environments and found that, in very difficult environments, users of the code may hit a divide-by-zero error, particularly in the early stages of training (i.e., zero reward in all of the initial rollouts), at this line:

# normalize rewards by their standard deviation
rollout_rewards /= np.std(rollout_rewards)
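
For reference, a minimal NumPy sketch (the rollout_rewards name just mirrors the snippet above) of what happens when every rollout returns zero reward:

import numpy as np

rollout_rewards = np.zeros(8)    # all initial rollouts gave zero reward
std = np.std(rollout_rewards)    # std is exactly 0.0 here
rollout_rewards /= std           # 0/0 -> NaNs plus a NumPy runtime warning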

Thanks,

@hari-sikchi

I experienced this kind of difficulty in all sparse-reward settings. Is ARS a good way to go for these optimization landscapes?

@ashutoshtiwari13

Can we use a .clip(min=1e-2) to avoid that?
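
Something like this, as a rough sketch (reusing rollout_rewards from the snippet above; the 1e-2 floor is arbitrary):

import numpy as np

std = np.std(rollout_rewards)
# floor the scalar std before dividing so it can never be zero
rollout_rewards /= np.clip(std, 1e-2, None)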

@pedronahum
Author

pedronahum commented Feb 18, 2019

In my case, adding 1e-8 to the divisor did the trick...
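
Roughly, the same line from the original post with a small epsilon added to the divisor:

rollout_rewards /= (np.std(rollout_rewards) + 1e-8)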

@ashutoshtiwari13

Yeah @pedronahum, that would do it too!
