Progress stalled? #34
Comments
Yes, I also think it stalls. |
If old nets beat the newest one with winrate>60%, it obviously stalls. |
Matches of 50, 40, or 30 games are all too small to get a true win rate for two nets of similar strength. |
Yes, the smaller match tests whether the new net is terrible, but we need a bigger
match to test whether it has stalled.
l1t1 <notifications@github.com> wrote on Sunday, 3 November 2019 at 14:38:
… Matches of 50, 40, or 30 games are all too small to get a true win rate for two
nets of similar strength,
e.g.
sai52 beat sai51 (65.45%) in a 50-game match,
but
sai51 beat sai52 (58.82%) in a 30-game match
|
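To put rough numbers on the sample-size point above, here is a minimal sketch of a 95% Wilson score interval for a match win rate. It assumes independent games and no draws; the function name and the example win counts are illustrative, not taken from the SAI server.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate Wilson score interval for a win rate (z=1.96 is about 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

# Hypothetical results, all close to a 55-57% observed win rate.
for wins, games in ((17, 30), (28, 50), (220, 400)):
    low, high = wilson_interval(wins, games)
    print(f"{wins}/{games}: {low:.3f} .. {high:.3f}")
```

With only 30 to 50 games the interval spans well over 20 percentage points, so two nets of similar strength can easily swap places between matches.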
Visits increased to 1200 |
How to judge whether it stalls? |
There's no general rule. We try to understand what's happening and make decisions accordingly |
Did the number of training games per generation increase after moving to 1200 visits? |
No, there was a problem with the script. Corrected this morning. |
I suggest setting the number of training games to a constant times the promotion win rate, e.g. SAI 55 would get 20000 * 0.33 = 6600 games and SAI 54 would get 20000 * 0.64 = 12800 games, so that the stronger nets produce more of the higher-quality games. |
@l1t1 It is a bit complicated to implement, but I will think about this. |
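The arithmetic in this suggestion can be sketched as follows. The base of 20000 games and the two win rates come from the comment; the function name is hypothetical and nothing here reflects how the SAI server actually schedules games.

```python
def training_games(base_games, promotion_winrate):
    """Scale a generation's self-play budget by its promotion win rate."""
    if not 0.0 <= promotion_winrate <= 1.0:
        raise ValueError("promotion_winrate must be between 0 and 1")
    return round(base_games * promotion_winrate)

print(training_games(20000, 0.33))  # budget for the net promoted with 33%
print(training_games(20000, 0.64))  # budget for the net promoted with 64%
```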
Visits increased to 1600 from next generation |
If you are wondering what happened with the present promotion, with this sudden and huge strength jump... Previously there was a stupid error in the training script. The learning rate was never dropped as intended and was still 0.05. Now we are quenching (slowly). The first change is from 0.05 to 0.01. (I thought I was training at 0.00025.) |
@Vandertic would it be possible to publish the parameters used (in the repository)? As it is, nobody could have spotted this error without asking for them, while if they were in the repository, people could see them and maybe spot the error. The same goes for the webpage and the changes from 800->1200->1600 visits. People who do not watch this issue have no chance of even knowing when a change happened. |
You are right. Will be done ASAP. (I have a conference Tuesday, so after that.) |
@sheeryjay just a quick update before I write some real documentation next week: the bug was (still is) in the repository: The rate should be configured here: But then it is not used here: |
BTW, I am temporarily increasing the number of nets for promotion and the steps between each net and the following one, so as to get an updated measure of which numbers make sense with the current training rate. |
Did SAI 96 update the learning rate again? |
Yes. Down to 0.004. That's why we also have more promotion candidates for this generation. |
We are at 0.0005 since generation 108. Almost stalled. Time to scale-up to 9x192, as soon as the training of that structure reaches current network in a couple of days. |
Heatmap of SAI net # 116
Heatmap of the final 6x128 LZ net, # 91
Policy with --nrsymm:
|
There is always the --randomvisits command, or you could do like MuZero and reduce temp to 0.5 until it stalls again, then to 0.25. |
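For readers unfamiliar with the temperature idea mentioned here, this is a minimal sketch of how a temperature reshapes move probabilities derived from visit counts (probability proportional to visits^(1/T)). The function name and the visit counts are illustrative; this is not code from SAI or MuZero.

```python
def visit_policy(visits, temperature):
    """Turn visit counts into move probabilities proportional to N**(1/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(visits)
    if top <= 0:
        raise ValueError("at least one move needs a positive visit count")
    # Divide by the maximum first so large exponents cannot overflow.
    weights = [(v / top) ** (1.0 / temperature) for v in visits]
    total = sum(weights)
    return [w / total for w in weights]

for t in (1.0, 0.5, 0.25):
    print(t, [round(p, 3) for p in visit_policy([800, 400], t)])
```

Lowering the temperature concentrates the probability on the most-visited move, so self-play games become less random.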
Visits raised to 2400 |
it seems good |
Black and White are moving away from the 3-3 opening style. It reminds me that the minigo V16 weights have the same opening style, with Black opening 3-3 + 4-4 (White later plays a komoku joseki).
In 3 games the first 4 moves are the same.
|
In my match tests at 400 visits, SAI120 vs. LZ081 scored 51/47 (100 games total) and 194/197 (400 games total). The SGF files are commented in Simplified Chinese. |
Are reference and comparison games staying at 1600 visits? |
It's a bit frustrating indeed. I was convinced that 6x128 had not reached its max yet, but I expected that 9x192 would progress faster. |
A large network progresses more slowly. Look at Leela Zero with 10 and 15 blocks: growth was not as fast as with 5 blocks. And compare what SAI and Leela Zero had achieved after 700,000 games! What was the level of Leela Zero? 5x64, Elo 1650! |
I agree with @cryptpark, but if you have any idea on how to improve faster, we are happy to consider options. |
LZ's progress was slower at 15 blocks because it was stronger. Look at AGZ and AZ: their ratings grow very quickly at the beginning even if the size of the net doesn't change. The main reason for switching from 6x128 to 9x192 was that progress is supposed to stall at some point, and the fact that the 9x192 net was stronger looks like evidence that 6x128 was close to its max (even if some progress was still possible). I don't see any reason for the 9x192 not to learn faster, at least at the beginning. But I agree that it's not an issue. A concern at most. |
The win rate on an empty board is a good indicator, as we found from 9x9 training. For recent leelaz weights 249, 254, and 255rc, White's win rates are 56.8%, 57.1%, and 57.3%. For recent SAI weights 140, 145, and 150, White's win rates are 51.4%, 50.8%, and 50.3%, because SAI hasn't reached leelaz's strength yet. Black has found more ways to avoid losing games, which IMO we would also see with 6x128 if it were still training. Today, with SAI152 and SAI153, White's win rates are 50.3% and 50.8%, rising again. Comparing the opening variations from 140 to 153 at 40K visits, we notice that SAI knows 3-3 is a solid move that avoids fighting but is slow, and it may be phased out. In the SAI153 promotion match I saw a lot of good moves in the 4-4 and 3-4 corners, for example in game 24 the moves after move 20, which are the early stage of the most fashionable star-point joseki (the corner enclosure and invasion at moves 6 and 11 are also amazing). I guess that in the matches of the 160s there will be no 3-3 openings and fashionable AI joseki will appear. Although recent Elo ratings are very close (9550), I think progress is still obvious. |
Is the learning rate correct? Shouldn't it perhaps be higher? I am asking because the current behaviour of promotions reminds me of the situation when the rate was too high and we were getting some nets with 0% promotion wins, except that this looks just like what happened when the rate was gradually, and rightly, lowered. My naive and intuitive guess would be that after training up to the first released 9x192, the learning rate would be increased to allow the learning process to make bigger jumps initially. Or would this not help? Having 12+ networks make such slow progress is kind of weird and, I guess, unexpected at this point? |
Maybe shift to 15x192? minigo and KataGo did not train small nets. |
155 is weaker than many previously promoted nets in the reference matches, so why is its Elo higher than theirs?
|
Even if the blue line on the rating graph looks flat, the cloud of green dots seems to rise faster. This is somewhat puzzling. Wait and see... |
I think the green-dot Elo is calculated after the promotion match; if some of those nets don't play in any other match, their Elo won't change. |
For every generation the (at the moment 11) candidate networks come from three different training experiments (or replications). We change rate and window size, so we are dynamically trying to understand what training rate and window is best at all times. But I agree with you and after some generations where the higher training rate showed to work at least as well as the lower, now we may try to raise it again a step. |
It would slow down progress terribly. I think it's too early. I think we may wait a little bit and see if we get improvements. SAI153 showed the best performance to date against LZ092. |
Correct |
By the way, we will add a few useful commits in the next week or two and then make a new release. Any suggestions are welcome. |
Of course, that's why there are so many green dots above the blue line. Nonetheless, it can show a trend over a dozen generations. |
Well, we are now below the first 9x192. The situation is becoming more and more concerning. I would be happy if I had any clue about what is happening. Could it be just an issue with the evaluation of the nets? |
I tried to compare the last network 6e65 against the one with the highest elo 4 generations before, on an empty Goban and some middle game positions, but nothing major came out of the comparison... |
My test shows that, of the four 9x192 nets that played against lz 250 v1, lz 139 is the best. |
Maybe the fall of SAI157 changed things and now we will start to improve. We increased somewhat the training rate as suggested. We are also increasing a bit the number of candidates for one or two generations, to test parameters more thoroughly. |
I have done some analysis on the pipeline hyperparameters. You can find it here. |
That was impressive you could reverse engineer all that information! So in summary if the training falls below ~0.4 elo on average for a long period of time then the training window size should be experimented with more? |
Since Deepmind anchored Alphago to human elo, doesn't that change things a bit? |
Not for the derivative. It is just an additive constant. |
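The point about the additive constant can be written out explicitly. In this sketch, $R(g)$ denotes the rating at generation $g$ and $c$ the anchoring offset; both symbols are introduced here for illustration only.

```latex
R'(g) = R(g) + c
\quad\Longrightarrow\quad
R'(g+1) - R'(g) = \bigl(R(g+1) + c\bigr) - \bigl(R(g) + c\bigr) = R(g+1) - R(g)
```

So the Elo gain per generation is the same whichever anchor is chosen.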
I would say that if we fall below 1-2 Elo/generation before we reach a much higher level, then we have a problem and should generally look for solutions. (Increase network size, improve training somehow, ...) |
Improvement per generation seems to have slowed quite a bit since we had that huge boost of clients. SAI has improved by about 1.5 Elo per generation since net 214. Does SAI training use SWA? If not, it should allow us to crank up the learning rate a lot and see if it's just stuck in a local minimum. |
We are trying different things and occasionally we also make quick attempts to increase the rate, but it seems that current progress is the best we can get, for now. Consider that progress is now roughly as fast as it was for LZ at this stage. And we are clearly slowly improving against LZ as well. In fact LZ092 is finally well behind and LZ098 was beaten today for the first time. This time we are also prepared in case there is another boost of clients. And, no, we are not using SWA. Edit: BTW, the playing style of SAI changed quite a lot in the last 20-30 generations, despite the apparently small increase in strength, so maybe we are really improving steadily. I also believe that matches are not the best way to prove how good a net is at this high level of play. As you can see, there should be 400 Elo points between LZ103 and LZ113 (from LZ values), but we have similar performances against the two nets, implying that the huge progress between LZ103 and LZ113 does not immediately convert into match performance. |
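For context, SWA (stochastic weight averaging) keeps a running average of the weights from successive training checkpoints and evaluates that average. The sketch below shows only the averaging step on plain lists of floats; the function name and values are hypothetical, and it is not taken from the SAI or Leela Zero training code.

```python
def swa_update(avg_weights, new_weights, n_averaged):
    """Fold one more checkpoint into an equal-weight running average.

    n_averaged is the number of checkpoints already in avg_weights.
    """
    if len(avg_weights) != len(new_weights):
        raise ValueError("weight vectors must have the same length")
    return [a + (w - a) / (n_averaged + 1)
            for a, w in zip(avg_weights, new_weights)]

# Three hypothetical checkpoints with two parameters each.
avg = [0.0, 2.0]
avg = swa_update(avg, [2.0, 4.0], 1)
avg = swa_update(avg, [4.0, 6.0], 2)
print(avg)
```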
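To make the "400 Elo points" remark concrete, here is a minimal sketch of the standard logistic Elo relation between rating difference and expected score. That the SAI and LZ rating lists both use exactly this 400-point scale is an assumption here; the function names are illustrative.

```python
import math

def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score):
    """Inverse relation: rating difference implied by a score in (0, 1)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

print(round(expected_score(400), 3))       # score implied by a 400-point gap
print(round(elo_diff_from_score(0.55), 1)) # gap implied by a 55% score
```

Under this model a 400-point gap corresponds to an expected score of about 91%, which is why similar results against both LZ nets are surprising.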
A line chart may be better, and a smoothed one would be much better. |
Hello! We are moving to 12x256 with the next promotion. Stay tuned! |
I just wanted to reassure everyone that if the progress stalls we are going to increase the visits, and we believe that in a few generations the upgrade will restore a good rate of improvement.