I know the DeepMind paper uses a replace_rate of 0.55. In Go under that ruleset there is no draw result, so 0.55 is reasonable. In Reversi, however, draws do happen, so is 0.55 too high for the replace rate?
With 0.55, the next generation has to beat the best model in a clear majority of games, and a draw doesn't count. That seems difficult. The best model is then replaced less often, which means the self-play policy improves more slowly, and so does the training data.
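To put rough numbers on this, here is a small back-of-the-envelope sketch (just an illustration, not this repo's evaluation code): if draws count toward the game total but not as wins, the win rate an evenly matched challenger can expect drops with the draw rate, so the effective gap to the 0.55 bar widens.

```python
# Illustration: how draws raise the effective bar when they count
# toward the game total but not as wins.
REPLACE_RATE = 0.55

for draw_rate in (0.0, 1 / 8, 1 / 4):
    # An evenly matched challenger wins half of the decisive games,
    # so its expected win rate over all games is (1 - draw_rate) / 2.
    even_match_win_rate = (1 - draw_rate) / 2
    print(f"draw rate {draw_rate:.3f}: expected win rate {even_match_win_rate:.3f}, "
          f"gap to threshold {REPLACE_RATE - even_match_win_rate:.3f}")
```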
Or another question: in your practice, how often does a game end in a draw during evaluation? In my local run it happens at a rate of about 1/8. I am still at an early stage of training, and I also rewrote the self-play part, so I don't know whether this 1/8 rate is reasonable. Just curious what draw rate you got.
Thanks.
As you said, it is difficult to decide a good replace_rate.
> you just drop out the draw games when evaluating
Yes, but I don't know whether it is a good choice.
In my evaluation configuration, the number of games for evaluation is 200; the paper uses 400.
A few false positives are still likely to occur.
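To get a feel for that, here is a rough sketch of the false-positive probability, i.e. the chance that a challenger of exactly equal strength clears the 0.55 bar by luck. It assumes draws are excluded and each decisive game is a fair coin flip, and uses scipy only for the binomial tail.

```python
# Rough false-positive estimate: probability that an equally strong
# challenger still reaches a 0.55 win rate over n evaluation games.
# Assumes draws are excluded and each decisive game is won with p = 0.5.
import math
from scipy.stats import binom

def false_positive_prob(n_games: int, replace_rate: float = 0.55) -> float:
    wins_needed = math.ceil(replace_rate * n_games)
    # P(wins >= wins_needed) for a Binomial(n_games, 0.5) variable.
    return binom.sf(wins_needed - 1, n_games, 0.5)

print(false_positive_prob(200))  # ~0.09 with 200 games
print(false_positive_prob(400))  # ~0.03 with 400 games, as in the paper
```

So moving from 200 to 400 games roughly cuts the chance of a lucky replacement from about 9% to about 3%.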
> Yes, but I don't know whether it is a good choice.
I think it depends on whether a draw is a common outcome between two players of the same level. If it is rarely seen in practice, then it is a good choice (although I would just ignore those draw games rather than count them toward the 200); if draws are common, then it is not. But that is basically a property of the game, not of the training program. I am not good at Reversi, so I can't tell.
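For what it's worth, "ignoring the draw games" here would just mean dropping draws from the denominator instead of letting them count against the challenger. A minimal sketch of the two options (illustrative only, not this repo's actual evaluation code):

```python
# Two ways to score an evaluation run that contains draws.
REPLACE_RATE = 0.55

def win_rate_counting_draws(wins: int, losses: int, draws: int) -> float:
    # Draws stay in the denominator, so they count against the challenger.
    return wins / (wins + losses + draws)

def win_rate_ignoring_draws(wins: int, losses: int, draws: int) -> float:
    # Draws are dropped; only decisive games decide replacement.
    decisive = wins + losses
    return wins / decisive if decisive else 0.0

# Example: 200 evaluation games with the ~1/8 draw rate mentioned above.
wins, losses, draws = 100, 75, 25
print(win_rate_counting_draws(wins, losses, draws) >= REPLACE_RATE)  # 0.500 -> False
print(win_rate_ignoring_draws(wins, losses, draws) >= REPLACE_RATE)  # ~0.571 -> True
```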
reversi-alpha-zero/src/reversi_zero/configs/normal.py, line 4 in 61922cc