8M games - training progress note #1569

Open
gcp opened this issue Jun 18, 2018 · 72 comments

@gcp
Member

gcp commented Jun 18, 2018

Small note, because I know many people will start "panicking" as we have 200k games with no notable strength progress (i.e. no PASS).

One reason why progress dropped a bit is that the ELF games left the training window. This wasn't intentional. I tried a quick fix (I was traveling) of increasing the window to 800k but the result wasn't particularly good. I somewhat rewrote the dumping code now to correctly deal with this and the next run should be closer to what was intended (~230k ELF + 250k regular). We'll see if this improves things. There'll be some delay before the next bunch of networks as I was fiddling with the training setup to get it debugged.
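
(As a rough sketch of the intent, not the actual dump code — the file layout and names below are made up — the window is now built from the newest regular selfplay games plus a fixed block of ELF games, rather than a single most-recent-N slice that lets the older ELF games fall out:)

```python
# Hypothetical sketch of building a mixed training window: take the newest
# regular selfplay chunks plus a fixed block of ELF chunks, instead of one
# most-recent-N window that lets the (older) ELF games age out.
from pathlib import Path

def build_window(data_dir, n_regular=250_000, n_elf=230_000):
    regular = sorted(Path(data_dir, "selfplay").glob("*.gz"))[-n_regular:]
    elf = sorted(Path(data_dir, "elf").glob("*.gz"))[-n_elf:]
    return regular + elf  # shuffling is left to the training pipeline

window = build_window("train_data")
print(len(window), "games in the training window")
```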

If these fixes don't produce progress, the next step will be a lowering of the learning rate. Because we're using SWA this might not necessarily produce a jump either.
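
(For context: with SWA, roughly speaking, the network that gets tested is an average of several recent checkpoints, which is why an optimizer change does not necessarily show up as an immediate jump. A minimal sketch of the averaging, with made-up tensor names, not the actual training code:)

```python
# Minimal sketch of stochastic weight averaging (SWA): the published network
# is an equal average of the last few training checkpoints.
def swa_average(checkpoints):
    """checkpoints: list of dicts mapping tensor name -> list of floats."""
    n = len(checkpoints)
    avg = {}
    for name in checkpoints[0]:
        avg[name] = [sum(ckpt[name][i] for ckpt in checkpoints) / n
                     for i in range(len(checkpoints[0][name]))]
    return avg

nets = [{"conv1": [0.1, 0.2]}, {"conv1": [0.3, 0.4]}, {"conv1": [0.2, 0.6]}]
print(swa_average(nets))  # ~ {'conv1': [0.2, 0.4]}
```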

It would be good to be able to start training and comparing 256x20 now, to figure out if network size is the issue (it sounds likely, but we do need confirmation in the form of a stronger 256x20 without which we can't do anything anyway!). It is really extremely unfortunate that we have the problem in #167 now. If this does not get resolved in a few days I will see if I can get some other hosting set up. (Data loss is not an issue, I have 2 full backups of the database...)

@herazul

herazul commented Jun 18, 2018

How many gigabytes is the necessary training data? (I mean, the data we need to train a new network can't go back that far; 128x10 training games can't be valuable now, for example.)
What is the total size in GB of the 192x15 games plus the ELF games?
Honestly, it may be doable to just torrent it. If we have several hundred people ready to run autogtp every day, we might also have enough people to host the data ourselves.
For example, I would be ready to seed it around the clock on a gigabit connection on my server, and I don't think I would be the only one. Even if there were only about 10 of us, it would be a lot better, faster and more reliable than almost any hosting we could find.

@gcp
Member Author

gcp commented Jun 18, 2018

A full window is about 20G-30G IIRC.

The problem is that it evolves a few times a day so it's not very fit for torrenting. The storage server that broke was very nice because I could update it almost incrementally with rsync.

At worst I'll rent a dedicated server with a big HDD for a while.

@gcp
Member Author

gcp commented Jun 18, 2018

The problem is that it evolves a few times a day so it's not very fit for torrenting.

To be clear, the data for old networks doesn't change any more, so that part you can torrent. But of course if you're trying to beat the latest network, you want the data for that latest network, and this gets a bit messier.

@herazul

herazul commented Jun 18, 2018

Yeah, it would not be as practical, but it would be possible to create a new torrent on a schedule, say every 2 days. That much delay might be acceptable for training experiments.

Renting a server could be doable too; don't hesitate to create a GoFundMe or something like it to help with the cost. I think many of us would donate a little to help with the renting.

@barrtgt

barrtgt commented Jun 18, 2018

Is there room for improvement with SWA?

@LRGB

LRGB commented Jun 18, 2018

@gcp FYI, bjiyxo released v15 of his 256x20 network a few days ago. It hasn't been queued yet.

@l1t1

l1t1 commented Jun 19, 2018

The webpage shows the training games of 256x20 v15 as 8.121M, which is too high; as bjiyxo says, it was only trained up to LZ146.

@roy7
Collaborator

roy7 commented Jun 19, 2018

I wasn't sure what number to use. Should I change it to read 7690965?

I changed it to 7690965.

@bjiyxo

bjiyxo commented Jun 19, 2018

@gcp Can you post the new raw training data (including 147, 148, and ELF) on Google Drive or Dropbox? Then I can train a new 20b to catch up to the latest 15b.

@carljohanr

Did the previous 20b include any ELF games?

@bjiyxo

bjiyxo commented Jun 19, 2018

@carljohanr Yes, including ~ 150k ELF games.

@ThorAvaTahr

You attribute the slow progress to the lack of ELF games in the window. However, if that is the case, I am rather worried about the intrinsic rate of progress of our training pipeline.

It looks as though our training hardly improves strength, while apparently there is still a lot of room for improvement. Of course I understand that learning goes faster with a good tutor, but it seems to have nearly stalled without one.

@gcp
Member Author

gcp commented Jun 19, 2018

You attribute the slow progress to the lack of ELF games in the window. However, if that is the case, I am rather worried about the intrinsic rate of progress of our training pipeline.

Our progress would clearly have been slower without the ELF data. That's why we are using it! I am not sure whether we would have stalled without it, or would have needed to jump to 256x20 sooner, but it seems likely.

(Of course, in terms of time, generating the ELF data took time that could have been used for "regular" training games)

@Friday9i

@ThorAvaTahr Not so worried on my side. I think we are "not so far" from the theoretical maximum of a 192x15 network, hence the selfplay improvement necessarily slows down. What kept a relatively good pace of progress was ELF, which strongly accelerates the process compared to a pure selfplay approach. Once you remove ELF, it is not accelerated any more, and as we are close to the max, it is (very) slow... JMHO.

@john45678

Did anyone notice that each time GCP posted about the current situation and what the plans were...we'd have a PASS in the next few hours :)

@TFiFiE
Contributor

TFiFiE commented Jun 19, 2018

Not to be a spoilsport, but the irony of that disappears with the knowledge these posts tend to be accompanied with a lowering of the learning rate.

@l1t1

l1t1 commented Jun 19, 2018

Maybe LZ148 is better than 149: it beat 20 weights and lasted 20k games, while 149 only beat 1 weight, was beaten by 2 weights, and only lasted 6000 games.

@gcp
Member Author

gcp commented Jun 19, 2018

Not to be a spoilsport, but the irony of that disappears with the knowledge these posts tend to be accompanied with a lowering of the learning rate.

The learning rate was not changed, fixing the dump of the window was enough (for now).

@gcp
Member Author

gcp commented Jun 19, 2018

Can you post the new raw training data (including 147, 148, and ELF) on Google Drive or Dropbox?

I'll update the #167 topic with some temporary links.

@2ji3150

2ji3150 commented Jun 20, 2018

The selfplay games of ELF are approaching 250k. Will we train a super ELF (224x20) to replace ELF for generating new, better selfplay games?

@l1t1

l1t1 commented Jun 20, 2018

The ELF selfplay will hit 250k games today; will @gcp stop ELF play tomorrow?

@TFiFiE
Contributor

TFiFiE commented Jun 20, 2018

So it's normal to expect a point at which a network of a given size can still be improved but only by training games that come from a larger network?

@Friday9i

@TFiFiE It's not true theoretically: after an infinite amount of efficient selfplay training, the net should reach its maximum performance whether or not it is trained with a larger net.
But practically speaking, the convergence slows down exponentially... Hence, it's probably more efficient to train a larger net to get stronger games and then train the smaller net on them: it should reach its maximum level faster. That seems logical, and the experience with ELF seems to confirm it.

@alphaladder

alphaladder commented Jun 20, 2018

@2ji3150 ELF is just a helper for LZ. Why do we need to discard our LZ and train a so-called "super ELF II"?

But we do need to enlarge our network for a better LZ, in order to beat and absorb the ELF weights.

@ryouiki

ryouiki commented Jun 21, 2018

2ji3150 means that since we have a good training set (>250k games) to enhance ELF itself, we could try to train a better ELF separately. If we got a better ELF net, it might help LZ training too.

@l1t1

l1t1 commented Jun 22, 2018

The ELF selfplay games are over 250k.

7933493 total self-play games (20519 in past 24 hours, 625 in past hour, includes 250252 ELF).

@wonderingabout
Contributor

wonderingabout commented Jun 22, 2018

Most of you seem to think only about how to make LZ stronger, instead of appreciating the progress of the project as it is.
I think I can say confidently that LZ will never reach AlphaGo level, and even ELF level is still far off.
I think the LZ project should think more about how useful it is rather than how powerful.

Also, 250k is just an estimate; it could be 260k, 300k or 350k, depending on how things go.
And I still believe 192x15 has more room for improvement, provided we wait long enough (no network in 5 days is not dramatic), so I would still wait a few more weeks and see how things go.

My suggestion would be fewer matches: instead of running them every 30k games, every 50-60k games should be quite safe, and start with the +64.0k networks (as +32.0k rarely passes). The computing power saved can be invested in more selfplay training.

EDIT: +30k new games after 8M total games naturally have much less impact than +30k new games at 1M total games, so maybe it's the right time to widen the window.

@alphaladder

alphaladder commented Jun 22, 2018

@wonderingabout I disagree with your suggestion, which would just disturb normal LZ training.
Note that our goal is to obtain another AGZ. Besides, you could start another project based on your own ideas, and that might be reasonable.

@wonderingabout
Contributor

wonderingabout commented Jun 22, 2018

@alphaladder

Sorry, but you're dreaming...
With the current computing power it would take 10-20 years for LZ to reach AlphaGo level, and by that time other tools like ELF may have been released. So again, the strength race, while enjoyable, should not be LZ's main objective; as an open-source project it should focus on being useful for the Go community, amateurs as well as the pro scene. (And for AlphaGo, the learning curve starts with a peak and then stays close to flat for a long time; see the AlphaGo curve.)
So far, LZ has had a very positive effect on the Go community: promoting the game of Go, creating a positive dynamic that thrills and motivates Go players, giving a better understanding of the game and a refreshing non-human approach to it, providing review tools and a big selfplay game database for online viewing, and pushing big companies like Tencent or Facebook to open-source their AIs. From a programmer's and developer's perspective, the data and experiments could be applicable to other, non-Go projects, and it has inspired projects like the neural-net Leela Chess Zero, etc.

So I think LZ continuing to work in the same direction would be much appreciated, at least by me. Among possible projects, the ones that first come to my mind are generalized komi (being able to play with -50.5 komi against dan players, for example), support for handicap games with realistic moves, and customizable, realistic and reliable difficulty (I know you can pick earlier networks to lower the difficulty, but these have many flaws, in particular falling into ladders).
Again, while enjoyable, I think if you're following LZ only for the strength race, you're missing its most interesting points.
Personally, I'm also looking forward to LZ's improvement and growth (the "how" more than the "up to which point"), but I don't expect it to reach AlphaGo; I just enjoy seeing it improve as much as it can.

See my earlier comment for the EDIT.

EDIT 2: For customizable difficulty, make LZ play moves that aim for e.g. a 30% winrate and count that as a "win"? Or a variable target winrate during the game?
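
(A rough illustration of that idea, with a purely hypothetical helper rather than an existing LZ option: from the candidate moves and winrates a search reports, pick the move whose winrate is closest to the target.)

```python
# Hypothetical sketch of "target winrate" difficulty: pick the candidate move
# whose evaluated winrate is closest to the desired target, instead of the best one.
def pick_move_for_target(candidates, target_winrate=0.30):
    """candidates: list of (move, winrate) pairs, e.g. from a search summary."""
    return min(candidates, key=lambda mv: abs(mv[1] - target_winrate))[0]

# Example: with a 30% target the engine deliberately keeps the game close but losing.
moves = [("Q16", 0.55), ("D4", 0.48), ("C3", 0.31), ("T1", 0.05)]
print(pick_move_for_target(moves))  # -> "C3"
```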

@bjiyxo

bjiyxo commented Jun 22, 2018

Don't take ELF for granted. FAIR is not a Go company; we might need to reach AlphaGo Zero by ourselves. As far as I'm concerned, I will always support a Go AI that constantly improves itself, no matter how long it takes.

@l1t1

l1t1 commented Jun 27, 2018

I am afraid that if the new weights become too strong, others can train with them, and the risk of LZ losing in AI competitions increases.

@wonderingabout
Contributor

wonderingabout commented Jun 27, 2018

I'm curious to see how LZ152 compares to ELF now.
I wonder if it can push ELF below an 80% winrate; either way, the test would be interesting.
The last test was on 3 June, against LZ147, with an 86% winrate for ELF.
Is a match planned for the future?

@Friday9i

Friday9i commented Jun 27, 2018

The difference was about 315 Elo with an 86% winrate. Since then, LZ has progressed around 205 Elo, but that is selfplay Elo, so the real improvement is more like ~80 Elo, meaning the remaining difference should be around 230 Elo, i.e. a ~79% winrate for ELF vs LZ152. But LZ vs ELF is becoming a kind of selfplay scale too, with an inflated rating, so the difference should be smaller than 230 Elo: hard to guess, but I'd say ~180 Elo, i.e. a ~74% winrate.
All in all, I'd say around 75% for ELF vs LZ152. Hopefully we'll know soon.
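
(For reference, those winrate figures follow from the standard logistic Elo model; a quick sanity check in Python:)

```python
# Standard Elo-difference -> expected winrate conversion used for the estimates above.
def elo_to_winrate(elo_diff):
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

for diff in (315, 230, 180):
    print(diff, round(elo_to_winrate(diff) * 100, 1))
# 315 -> ~86.0%, 230 -> ~79.0%, 180 -> ~73.8%
```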

@john45678

I'm a little less optimistic, so I'll say 79% for ELF (vs. LZ152).

@Friday9i

78% after (only) 37 games: it seems we are not too far off, but this is still extremely noisy; we have to wait for at least 100 games to get a better idea!

@Mardak
Collaborator

Mardak commented Jun 27, 2018

I just updated the 3-3 knight's move tracking #1442

The prior for the critical move of the joseki increased suddenly: 17% at LZ148, 32% at LZ150, 46% at LZ152. Perhaps, in addition to ELF reaching half of the window, newer training data is pushing out data from much older networks that would have trained towards a ~10% prior instead of ELF's ~100% prior.

Here's the progress of the joseki for just the 192x15 networks every 5 since turning on ELF:
[graph: prior for the joseki move across the 192x15 networks, every 5 networks since ELF was enabled]

@bjiyxo

bjiyxo commented Jul 7, 2018

@gcp We should lower the learning rate in my opinion if there isn't any progress on 15b today.

@diadorak

diadorak commented Jul 8, 2018

200k without a promotion. It's time for LR drop :)

@jkiliani

jkiliani commented Jul 9, 2018

@bjiyxo's point still holds in my opinion. With progress as slow as it currently is, you eventually get lucky passes. A change to the LR and a plan to switch to 256x20 soon would make a lot of sense now.

@l1t1

l1t1 commented Jul 9, 2018

Too early to call it promoted: in the next 58 games it needs 30 wins to pass, so it can afford only 28 losses.

@Marcin1960

@jkiliani "A change to LR and a plan to soon switch to 256x20 would make a lot of sense now."

Perhaps, once lowering the LR is exhausted, we could try lowering the promotion threshold to 52% or 51% for a week or so? It would not hurt, would it?

@l1t1

l1t1 commented Jul 9, 2018

It finally passed; now we can wait another 10 days for the next one.

@kityanhem

Two nets appeared in a row.
I feel like 15b has no limit (just kidding!)

@wonderingabout
Contributor

wonderingabout commented Jul 9, 2018


No promotion in 10 days, then 7 nets appear in a row.
It was so fast that 2 networks didn't have time to be promoted (one with 83%!!).
One network was even promoted twice; is that normal behaviour?
I feel like 15b has no limit (just kidding @l1t1, I like your contributions to LZ, so it's a fun way to thank you).

@gcp
Member Author

gcp commented Jul 9, 2018

I lowered the learning rate at around 200k games, but 7ff1 passed while still using the old rate.

@wonderingabout
Contributor

wonderingabout commented Jul 9, 2018

On a more serious note, I find it a bit strange that LZ gets 2 (and maybe 3) promotions in a row after this 200k-game stagnation.
It could be that LZ figured out a lot of things in the last few games, but I find this explanation strange and baseless.
Or it is possible that one of the networks was a false positive, making it easier to beat once (and maybe twice, with a "random" double promotion).
This is all interesting though; I'm curious about the impact of the lower learning rate in the future.

@remdu
Contributor

remdu commented Jul 9, 2018

@gcp What is the current learning rate ?

@jokkebk

jokkebk commented Jul 9, 2018

I've also wondered if there is a trend where many recent "long-reigning champions" (networks that last 50k+ selfplay games without any new net reaching 55% against them) are first beaten, and then those new networks usually get beaten themselves within 5k games or so. "The revolution eats its children"?

The learning rate was not the culprit for many of these, as it was only changed now. Could be just random noise, of course.

@l1t1

l1t1 commented Jul 10, 2018

@gcp Why can a lower learning rate speed up promotion?
I found an article at https://blog.csdn.net/Uwr44UOuQcNsUQb60zk2/article/details/78566465 in which the author says:
Training should begin with a relatively large learning rate, because at the start the random initial weights are far from the optimal values. During training, the learning rate should be reduced to allow finer-grained weight updates.

@NhanHo

NhanHo commented Jul 10, 2018

@jokkebk That is actually fairly easy to explain: those were networks from the same training batch. As they were trained with more training steps on the same data, they got stronger.


@Tsun-hung

@NhanHo Do you plan to train a new network from 1286 using later raw data, as you did for the 645 network?

@gcp
Member Author

gcp commented Jul 10, 2018

Why can a lower learning rate speed up promotion?

If we're in between your two examples, it's as if the parameters jump from one side of the ramp to the other without going down (but the steps are not so big that they diverge upward either). If you halve (or further reduce) the step size, it will go right down to the bottom.
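
(A toy illustration of that effect with plain gradient descent on a 1-D quadratic — not the actual training setup:)

```python
# Toy example: gradient descent on f(x) = x^2 (gradient 2x).
# With lr = 0.9 the updates overshoot and keep hopping across the minimum;
# halving the learning rate lets the same start converge smoothly toward 0.
def descend(lr, x=1.0, steps=10):
    for _ in range(steps):
        x -= lr * 2 * x  # gradient step
    return x

print(descend(0.9))   # ~0.107: still bouncing from side to side
print(descend(0.45))  # ~1e-10: goes right down to the bottom
```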

@gcp
Member Author

gcp commented Jul 10, 2018

and then those new networks usually get beaten themselves within 5k games or so

One other factor might be that on a new iteration the learning starts from the best network, so you're now starting the learning off of a very good spot and adding a lot of iterations onto that. This could especially be true for 256k promotes.

@gcp
Member Author

gcp commented Jul 10, 2018

@gcp What is the current learning rate ?

0.0001 @ batch size = 256.

@jkiliani

@gcp: Could you please queue @bjiyxo V18 for a match?

@wonderingabout
Contributor

wonderingabout commented Jul 15, 2018

LZ just got a new promotion.
Since it has been 400k games and 4 promotions since the last ELF match, is it the right time to queue another ELF match, or should we wait for one more network?

@l1t1

l1t1 commented Jul 16, 2018

lz 156 vs elf winrate 23-25%

@Mardak
Collaborator

Mardak commented Jul 16, 2018

Anyone know if ELF is correctly resigning in the 300+ move games?

http://zero.sjeng.org/match-games/5b4be7022f06263c66c692a7

http://zero.sjeng.org/viewmatch/580fc26fdfa45d8707374cef8018a03424df3d06eff03cd0c68fa295da6132b9?viewer=wgo
http://zero.sjeng.org/viewmatch/695570224e2b2a9e617afb9e411821787afa9b357ed5da2b044aae3382f45c4f?viewer=wgo

Looking at the average network eval (no search), ELF thinks the win rate is around 15% while LZ156 thinks it's closer to 25% and 40%. So I suppose at least both agree that ELF was in a losing position…

@ChinChangYang
Contributor

To verify whether the resignation was correct or not, could Leela Zero continue the game without resigning? Then score the final position to see which side (black or white) wins.
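
(As a rough illustration of that idea, here is an untested sketch assuming a leelaz-style GTP engine; the flags, file names and the side to move are placeholders to check against the engine's --help, while loadsgf/genmove/final_score are standard GTP commands:)

```python
# Rough sketch (untested): reload a resigned game into a GTP engine with
# resignation disabled, let it play the position out, then score it.
import subprocess

def gtp(proc, command):
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    reply = []
    while True:
        line = proc.stdout.readline()
        if line.strip() == "":              # GTP responses end with a blank line
            return "".join(reply).strip().lstrip("= ")
        reply.append(line)

engine = subprocess.Popen(
    ["./leelaz", "--gtp", "--weights", "best-network.gz", "--resignpct", "0"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

gtp(engine, "loadsgf resigned_game.sgf")    # position where the resignation happened
to_move, last_move = "white", ""            # assume White resigned; adjust as needed
while True:
    move = gtp(engine, f"genmove {to_move}").lower()
    if move == "pass" and last_move == "pass":
        break                               # both sides passed: game over
    last_move = move
    to_move = "black" if to_move == "white" else "white"

print(gtp(engine, "final_score"))           # e.g. "B+3.5"
```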

@wonderingabout
Contributor

wonderingabout commented Jul 16, 2018

Some games are wrongly resigned, but it is generally less than 5%, so over 400 games we can assume the proportion of wrongly resigned games is about the same.
So, even with a margin of error of around 5%, it seems Leela Zero got significantly stronger in the last 400k games.

Thanks for doing the ELF test; I was surprised to see it improve that much (keeping the possible error margin in mind).
