Is 0.55 too high for replace_rate, given that Reversi games can end in a draw? #7

Closed
gooooloo opened this issue Nov 8, 2017 · 3 comments

Comments

@gooooloo
Contributor

gooooloo commented Nov 8, 2017

I know the DeepMind paper uses a replace_rate of 0.55. Under the Go rules used there, a game cannot end in a draw, so 0.55 is reasonable. Reversi does allow draws, though, so is 0.55 too high a replace rate here?

With 0.55, the next-generation model has to beat the best model in more than half of the games, and a draw does not count toward that. That seems difficult. The best model is then replaced less often, so the self-play policy improves more slowly, and so does the training data.

Or, to ask it another way: in your evaluations, how often does a game end in a draw? In my local runs it happens in about 1/8 of evaluation games. I am still at an early stage of training, and I also rewrote the self-play part, so I don't know whether this 1/8 rate is reasonable. Just curious what draw rate you see.

Thanks.

@gooooloo
Contributor Author

gooooloo commented Nov 8, 2017

I see. You just drop the drawn games when evaluating, so it doesn't matter how often draws happen.
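
To make the difference concrete, here is a minimal sketch of the two ways draws could enter the replace decision. The function names, the `count_draws` flag, and the `>=` comparison are assumptions for illustration, not code taken from this repository.

```python
def win_rate(wins, losses, draws, count_draws=False):
    """Challenger win rate over a batch of evaluation games.

    count_draws=False drops drawn games from the denominator.
    count_draws=True keeps them, so every draw counts against the challenger.
    """
    games = wins + losses + (draws if count_draws else 0)
    if games == 0:
        return 0.0
    return wins / games


def should_replace(wins, losses, draws, replace_rate=0.55, count_draws=False):
    """Hypothetical decision rule: replace the best model if the rate is reached."""
    return win_rate(wins, losses, draws, count_draws) >= replace_rate


# Same 200 games, two different verdicts:
print(should_replace(100, 80, 20))                    # draws dropped: 100/180
print(should_replace(100, 80, 20, count_draws=True))  # draws counted: 100/200
```

With draws dropped, the 20 drawn games simply shrink the sample; with draws counted, the same record falls short of 0.55.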

gooooloo closed this as completed Nov 8, 2017

@mokemokechicken
Owner

As you said, it is difficult to decide on a good replace_rate.

> you just drop out the draw games when evaluating

Yes, but I don't know whether it is a good choice.
In my evaluation configuration, the number of evaluation games is 200; the paper uses 400.
Some false positives are still likely to occur.
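
As a rough illustration of the false-positive concern, the sketch below computes how often a challenger that is exactly as strong as the best model would still reach the replace rate by luck. It assumes independent games, no draws, and a per-game win probability of 0.5; `false_positive_rate` is a hypothetical helper, not part of this project.

```python
from math import comb


def false_positive_rate(n_games, replace_rate=0.55, p_win=0.5):
    """P(win fraction >= replace_rate) when each game is won with p_win.

    Exact binomial tail: sum over every win count k that meets the rate.
    """
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(n_games + 1)
        if k / n_games >= replace_rate
    )


for n in (200, 400):
    print(n, false_positive_rate(n))
```

Under these assumptions the probability for 400 games comes out smaller than for 200, which matches the intuition that a shorter evaluation lets more equal-strength challengers through.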

> Just curious what rate of drawing you got.

  • total evaluation: 928/25812 ≒ 3.6%
  • first 1000 games: 48/1000 ≒ 4.8%
  • latest 1000 games: 38/1000 ≒ 3.8%

So your 1/8 = 12.5% is a rather high rate.

@gooooloo
Contributor Author

gooooloo commented Nov 8, 2017

> Yes, but I don't know whether it is a good choice.

I think it depends on whether a draw is a common result between two players of the same level. If draws are rare in practice, then dropping them is a good choice (though I would exclude the drawn games entirely rather than counting them toward the 200). If draws are common, then it is not. But that is basically a property of the game, not of the training program. I am not good at Reversi, so I can't tell.
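
A minimal sketch of what "not counting draws toward the 200" could look like: keep playing until 200 decisive games have been collected. Here `play_game` is a hypothetical callable returning "win", "loss", or "draw" from the challenger's side, and `max_games` is an assumed safety cap; neither comes from the actual evaluator.

```python
def evaluate(play_game, n_decisive=200, max_games=1000):
    """Play until n_decisive non-drawn games are collected or max_games is hit.

    Returns (wins, losses, draws) from the challenger's point of view.
    """
    wins = losses = draws = 0
    while wins + losses < n_decisive and wins + losses + draws < max_games:
        result = play_game()
        if result == "win":
            wins += 1
        elif result == "loss":
            losses += 1
        else:
            draws += 1  # recorded, but does not use up an evaluation slot
    return wins, losses, draws


# Usage with a scripted sequence of results standing in for real games:
results = iter(["win", "draw", "loss", "win"])
print(evaluate(lambda: next(results), n_decisive=3))
```

The win rate is then taken over a fixed number of decisive games, so a high draw rate lengthens the evaluation instead of shrinking the sample.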

> your 1/8≒12.5% is a little high rate.

I see. I will wait and see what my further results look like.
