Dirichlet noise #59

Closed
pathway opened this issue May 10, 2018 · 3 comments

Comments

@pathway

pathway commented May 10, 2018

From p. 24 of https://deepmind.com/documents/119/agz_unformatted_nature.pdf:

Additional exploration is achieved by adding Dirichlet noise
to the prior probabilities in the root node s0, specifically
P(s, a) = (1 − ε)p_a + εη_a, where η ∼ Dir(0.03) and ε = 0.25;
this noise ensures that all moves may be tried, but the
search may still overrule bad moves

Searching this repo brings up no matches for "dirichlet". I think you want numpy.random.dirichlet.
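
For illustration, the formula quoted above could be sketched roughly as follows. This is only a minimal sketch, not code from this repo: the function name `add_dirichlet_noise` and the `valid_mask` argument are hypothetical, and restricting the noise to legal moves is my own assumption rather than something the paper excerpt states. It uses NumPy's `Generator.dirichlet`, the newer counterpart of `numpy.random.dirichlet`.

```python
import numpy as np


def add_dirichlet_noise(priors, valid_mask, alpha=0.03, epsilon=0.25, rng=None):
    """Return P(s, a) = (1 - epsilon) * p_a + epsilon * eta_a for the root node.

    `valid_mask` is an assumption of this sketch: noise is drawn only over
    legal moves, and illegal moves keep a prior of zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    priors = np.asarray(priors, dtype=np.float64)
    valid = np.asarray(valid_mask, dtype=bool)

    n_valid = int(valid.sum())
    if n_valid == 0:
        # Nothing to perturb; return the priors unchanged.
        return priors.copy()

    # eta ~ Dir(alpha) over the legal moves; a single legal move gets all the mass.
    eta = rng.dirichlet(np.full(n_valid, alpha)) if n_valid > 1 else np.ones(1)

    noisy = np.zeros_like(priors)
    noisy[valid] = (1.0 - epsilon) * priors[valid] + epsilon * eta
    return noisy
```

If the priors over legal moves already sum to 1, the result still sums to 1, because both terms of the mixture are probability vectors.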

Thanks for sharing this repo!

@pathway
Author

pathway commented May 10, 2018

Btw, how do you achieve exploration currently? Is it purely via temperature?
Different exploration strategies are quite interesting.

@suragnair
Owner

Yep, the exploration is purely via temperature in the code. Should be fairly easy to incorporate Dirichlet noise. It was a little unclear whether they use the noise in the evaluation stage.
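
For readers unfamiliar with the term, temperature-based exploration usually means turning the MCTS visit counts into a move distribution, roughly as sketched below. This is an illustrative sketch and not necessarily how this repo does it: the name `policy_from_counts` is hypothetical, and it assumes at least one move has a nonzero visit count.

```python
import numpy as np


def policy_from_counts(counts, temp=1.0):
    """Turn visit counts N(s, a) into move probabilities proportional to N^(1/temp).

    temp = 1 samples in proportion to visit counts (more exploration);
    temp = 0 plays the most visited move deterministically.
    Assumes at least one count is nonzero.
    """
    counts = np.asarray(counts, dtype=np.float64)
    if temp == 0:
        probs = np.zeros_like(counts)
        probs[int(np.argmax(counts))] = 1.0  # ties go to the first best move
        return probs
    scaled = counts ** (1.0 / temp)
    return scaled / scaled.sum()
```

A move would then be sampled with something like `np.random.default_rng().choice(len(probs), p=probs)`.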

@evg-tyurin
Contributor

@pathway there is a sample implementation of Dirichlet noise in the following repo:

https://github.com/evg-tyurin/alpha-nagibator/blob/48b2ebd3ca272f388c13277297edbb60d98eb64b/MCTS.py#L191

The noise is used in self-play only.
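
As a rough illustration of the "self-play only" idea (this is not the code at the link above; the function name `root_priors` and the `self_play` flag are hypothetical), the noise can be gated so that evaluation games use the network priors untouched:

```python
import numpy as np


def root_priors(priors, self_play, alpha=0.03, epsilon=0.25, rng=None):
    """Return root priors, mixing in Dirichlet noise only during self-play.

    Assumes `priors` has at least two entries and covers legal moves only.
    """
    priors = np.asarray(priors, dtype=np.float64)
    if not self_play:
        # Evaluation / competitive play: no added exploration noise.
        return priors
    rng = np.random.default_rng() if rng is None else rng
    eta = rng.dirichlet(np.full(priors.size, alpha))
    return (1.0 - epsilon) * priors + epsilon * eta
```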

3 participants