[RFC] closed positions book. #2646

vondele · 2020-04-25T10:27:32Z

I have made a pull request to the official book repo with a closed positions book.
official-stockfish/books#8
This still needs some testing, but should eventually be available.

I first want to do some testing comparing this to the noob_3moves book on fishtest before we possibly start using this, so that we have a feeling for its quality. My initial impression is rather good.

There are several options we can first discuss here before I decide on this.

Allow patches to be tested against this book at normal STC and LTC, leaving the choice up to the submitter.
Test all patches against this book, i.e. just switch for a couple of weeks.
First retest a couple of patches that were aiming at closed positions but didn't pass.
etc.

MJZ1977 · 2020-04-25T12:13:50Z

This is a very good idea, one I have suggested many times before!

But for me, pe->blockedcount() >= 4 alone is not enough. Many of the positions in the book are not blocked (>80%). Can we add some French and King's Indian positions by hand and remove clearly open positions?

Edit: we can allow patches with this book and test STC non-regression with the initial book.

vondele · 2020-04-25T12:36:03Z

@MJZ1977, thanks! Some related observations/notes:

The positions in the book are from before the blocked position is reached, i.e. the game from which a position is extracted becomes more blocked as SF plays on.
I hope that this still allows for some variety, and that improvements will come both from avoiding blocked positions when they are not advantageous and from playing blocked positions well.
Roughly only 1 out of 50 games currently played on fishtest matched the criterion 'blocked', so this is already 'a massive change' compared to the current state.
Adding by hand is not so easy: I had no way to get the ECO code of a game (fishtest games start from a FEN nowadays, not from moves), and one needs ~50k different positions to make a reasonable book. I assume a few people have more advanced tools, and could contribute another book constructed with a different strategy.
Very narrow opening books (e.g. just French) might be a bit risky; overfitting could be lurking there.

NKONSTANTAKIS · 2020-04-25T15:10:13Z

Thanks for this exciting incentive!

Both strategies should be valid, the specialized one would indeed require a non-regression step.
This is a versatile book with a stronger closed position signal, imo safe to use as normal book.
Probably more universal, due to the heavy underrepresentation of closed positions in the default book.
Distribution is evened out in regards to opening type instead of opening availability.

Another point is that for open positions search is a nifty tool, so it's closed positions which need elements.

vondele · 2020-04-25T19:51:27Z

Influence of the book on Elo difference. noob_3moves.epd vs closedpos.epd.
Basically, books have a similar Elo performance, for both SF10 - SF11,
as well as SF11 - SFdev.

SF11 vs master (STC)

closed:

ELO: 17.94 +-1.7 (95%) LOS: 100.0%
Total: 60000 W: 13779 L: 10684 D: 35537
Ptnml(0-2): 880, 6085, 13460, 8210, 1365
https://tests.stockfishchess.org/tests/view/5ea415c913fcd4bb2f00a0e4

noob:

ELO: 17.91 +-1.7 (95%) LOS: 100.0%
Total: 60000 W: 13292 L: 10202 D: 36506
Ptnml(0-2): 814, 6166, 13525, 8106, 1389
https://tests.stockfishchess.org/tests/view/5ea415c913fcd4bb2f00a0e4

SF10 vs SF11 (STC):

closed:

ELO: 50.59 +-1.8 (95%) LOS: 100.0%
Total: 60000 W: 17819 L: 9143 D: 33038
Ptnml(0-2): 586, 4917, 12288, 9653, 2556
https://tests.stockfishchess.org/tests/view/5ea413e913fcd4bb2f00a0d3

noob:

ELO: 48.18 +-1.8 (95%) LOS: 100.0%
Total: 60000 W: 17306 L: 9038 D: 33656
Ptnml(0-2): 619, 5006, 12298, 9642, 2435
https://tests.stockfishchess.org/tests/view/5ea415ac13fcd4bb2f00a0e1

SF11 vs master (LTC, Edit: final values)

closed:

ELO: 20.12 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 7149 L: 4835 D: 28016
Ptnml(0-2): 211, 3221, 11101, 4977, 490 
https://tests.stockfishchess.org/tests/view/5ea45e85b908f6dd28f34ada

noob:

ELO: 17.45 +-1.7 (95%) LOS: 100.0%
Total: 40000 W: 6357 L: 4350 D: 29293
Ptnml(0-2): 224, 3109, 11590, 4590, 487 
https://tests.stockfishchess.org/tests/view/5ea45e72b908f6dd28f34ad7

I think this indicates that the book is pretty general purpose.

I will now reschedule, with the new book, a few of the recent yellow LTCs that presumably target closed positions.
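The Elo figures in the results above can be recomputed from the W/L/D counts. The following is a minimal sketch, assuming the usual logistic Elo formula; the function name `elo_from_wld` is made up for illustration and is not taken from fishtest. It uses the LTC counts quoted above and also reports the draw ratio.

```python
import math

def elo_from_wld(wins, losses, draws):
    """Return (elo, draw_ratio) for a match given win/loss/draw counts.

    Assumes the standard logistic relation between score and Elo.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    elo = 400.0 * math.log10(score / (1.0 - score))
    return elo, draws / games

# LTC counts for SF11 vs master, copied from the results above.
closed_elo, closed_draws = elo_from_wld(7149, 4835, 28016)
noob_elo, noob_draws = elo_from_wld(6357, 4350, 29293)

print(f"closedpos:   {closed_elo:.2f} Elo, draw ratio {closed_draws:.2f}")
print(f"noob_3moves: {noob_elo:.2f} Elo, draw ratio {noob_draws:.2f}")
```

The error bars reported by fishtest are not reproduced here; they depend on the game-pair statistics (Ptnml), which this sketch ignores.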

vondele · 2020-04-25T20:08:54Z

Can I ask authors of recent yellow LTC patches (e.g. @Vizvezdenec @xoto10 @locutus2 @MJZ1977 @Lolligerhans) that target closed positions to resubmit them at LTC with the new closedpos.epd book, putting closedbook in the info field as well? It looks like a few of them will need rebasing, so I can't easily reschedule them.

I've rescheduled 2 that were based on current master:
https://tests.stockfishchess.org/tests/view/5ea49685b908f6dd28f34b85
https://tests.stockfishchess.org/tests/view/5ea4969ab908f6dd28f34b87

locutus2 · 2020-04-25T20:09:48Z

I will retest my pawn chain patches with the closed book. I had three similar versions, which all passed STC and failed LTC yellow.

Lolligerhans · 2020-04-25T21:37:24Z

@vondele I had no such patch. I kept track of yellows so I am pretty sure. :)

adentong · 2020-04-25T21:54:28Z

Unrelated to the current topic, but the last regression test showed only ~11 Elo, while @vondele's LTC tests are showing 18/20 Elo respectively for closed book/noob book. I know we use a different book for regression tests, but it is still a bit surprising.

xoto10 · 2020-04-25T22:01:20Z

Very interesting results! Am I right in thinking this book is about the same size as noob_3moves?

So we've used noob_3moves to play a lot of games, then sampled games we're interested in after 8 plies - is that 14 plies from startpos then? That might be a concern for long-term use as the standard book, but given the performance tests give very similar results to noob_3moves, I'm happy to test it out for a couple of weeks. Definitely a plus point to just update the main book instead of having a choice, and having to do non-regression tests against the main book, I just hadn't expected this to be an option. Interesting ...

Vizvezdenec · 2020-04-25T22:04:01Z

Well, side note: the last RT used a different master that was behind by 2 Elo patches and one simplification.
Also it's kinda expected, I guess, with 2 patches interacting with space/blocked positions...

adentong · 2020-04-26T00:23:01Z

Yea well usually I wouldn't expect a 7-9 elo difference with just two elo gaining patches lol...

NKONSTANTAKIS · 2020-04-26T01:29:42Z

@adentong RT's use 8_moves book, which has the lowest elo spread (around 10% less). This makes the +50 elo between versions more meaningful. On top of that are the 3 patches, an undefined small effect of book optimization, and double error-bars.

vondele · 2020-04-26T07:19:15Z

I indeed wouldn't focus too much on the comparison to the RT: it is not exactly the same version of the code, and the 8moves_v3 book is known to yield less Elo difference. The draw rate is slightly different between the books as well: 8moves 0.74, noob_3moves 0.73, closedpos 0.70.
This all looks good IMO.

There have been a number of tests overnight using the new book (on old yellow LTCs):
https://tests.stockfishchess.org/tests/view/5ea49685b908f6dd28f34b85
https://tests.stockfishchess.org/tests/view/5ea4b95ab908f6dd28f34bde
https://tests.stockfishchess.org/tests/view/5ea4a0dcb908f6dd28f34ba4
https://tests.stockfishchess.org/tests/view/5ea4969ab908f6dd28f34b87
https://tests.stockfishchess.org/tests/view/5ea4a14cb908f6dd28f34bab
https://tests.stockfishchess.org/tests/view/5ea4a0efb908f6dd28f34ba7
None of them passed, and IIRC one was yellow... probably not too surprising.

So let's get the expectations right. The closedpos book is not a magic bullet, and it will remain a real challenge to get patches passed.

vondele · 2020-04-26T07:26:00Z

Based on the data collected, my proposal is to switch the default book to closedpos.epd relatively soon, used for essentially all tests (but not RT), and just continue testing as before. In particular, after passed STC and LTC tests on closedpos, PRs can be made, no need for additional non-regression tests. After a couple of weeks (June?) this strategy is reassessed.

Give thumbs up or down if you agree or disagree with this proposal.

locutus2 · 2020-04-26T07:33:37Z

@vondele
I would prefer to do a non-regression test against the noob book, but more in the sense of monitoring, so that we are alarmed if things go really badly. Here we can probably use weaker bounds like [-2;0].

But the best approach, it seems to me, is a mixed book: 50% positions from the closed book and 50% positions from the noob book. That way we would have the best of both worlds: closed position testing but no overfitting to this type of position, IMO.

vondele · 2020-04-26T07:41:05Z

@locutus2 I plan to do the monitoring based on the usual 8moves RT runs.

My argument against doing additional non-regression tests is that I want to keep our procedure as simple as possible. I'm also pretty confident that regressions are unlikely. But if there is a strong feeling in favor of the additional testing on passed patches, I'm fine with it. So, let's see what the vibes are.

I'm not in favor of mixing the books. Let's try to get a clean signal. Again, the book is not extreme, and there will be opinions going in either direction (e.g. @MJZ1977 would like to see it more closed, you prefer a little more open).

locutus2 · 2020-04-26T07:55:50Z

@vondele
About the clean signal point:
OK, I understand that from a scientific standpoint it is good to get clean data about the closed book in order to assess it (here I am with you). But it is important how we go on from there. Say the closed book seems good: do we then take it further, or mix it with, for example, the noob book (which has also worked until now)? Only the second option seems to avoid biased development, and I think it is not good to go from one extreme (unusual open positions) to another (nearly closed positions), so mixing seems the best approach.

vondele · 2020-04-26T08:09:21Z

@locutus2 long term I can indeed see the point, and we can reassess.

Short term, let's figure out if the book actually matters much. I think this is an experiment to try and see if the perceived weakness in closed positions can actually be more easily fixed with a closed book (if one looks at the positions, it really is not that closed). We might find that this is not as important as we think.

This is in part an old discussion, the many years of development with the 2moves book, which really was not very sophisticated, illustrated that the book might not be the key ingredient to progress.

MJZ1977 · 2020-04-26T10:36:27Z

I think we can keep the 2 books for the moment and change the default once we have a clear picture. It would be interesting to find a patch that shows a big gap between the 2 books: green on the "closed book" and red on the "noob book". Then we can conclude.

xoto10 · 2020-04-26T10:45:21Z

Last night I was thinking this was a big development ... now seeing the results of the reruns, it seems it doesn't make much difference at all. Perhaps there is a subtle change that we will become aware of over time. At the moment (very early of course), it seems the lower draw rate is perhaps the main change (benefit?) of this.

My main concern if we switch to using this book for the medium term remains the beginning of the game. If we want sf to get better at the early moves, surely we need a test book that includes small ply openings (say 0-5) as well as longer ones?

miguel-l · 2020-04-26T12:22:46Z

The way I understand it, we get positions from games in which Stockfish goes on to close the position (please correct me if I misunderstood something). But what about games in which Stockfish fails to close the position? For example, when searching from the root, we very commonly see the Exchange French, etc. Something feels off about it.

NKONSTANTAKIS · 2020-04-26T13:21:15Z

I believe that the beginning of the game is too vague to be helped by eval, due to very high availability of viable options and different setups. But as the midgame eval becomes more accurate, it will show at openings via better steering of search.

This book should not be regarded as a specialized closed position book, but as an attempt at a more balanced general book with regard to position type. The conditioning is soft and leads to open positions too. The problem with typical books is that they are balanced with regard to viable opening availability, hence the tiny signal from truly closed positions. SF has problems with those for 3 reasons:

Rarity of occurrence, as explained
Vastly different characteristics
Inefficiency of search (owing to their long-term nature, where 1 pawn move can ruin the prospects forever by entering a distant dead end)

Search inefficiency (and unfortunate setup selection) has partly to do with seeking generically favorable evals: a highly valued bonus in a static position acts like a black hole for the search. It sucks up all the resources in that direction, because it "believes" it's something supreme, blinding it to alternatives. An example is a very deep knight outpost on a totally blocked flank + space advantage. It is totally useless at a glance for chess players, but SF aims for it even from the early game.

Removing those black-holes completely will require "alien" tech like pattern-recognition, MCTS, NN, or a detailed categorization of cases. But an increased representation of black-hole situations will surely boost long-term health.

I don't believe SF needs training at positions that are very easy for it, nor is it in danger of regressing. At tactical cases the various paths are narrow and concrete and search shines.

xoto10 · 2020-04-26T22:05:47Z

"But what about games in which Stockfish fails to close the position?"

Good question. I guess there will be a few d4/e5 French advance structures in this book; perhaps this can be an iterative process and the book can be recreated occasionally? If we can improve SF's blocked position play a little, then it will choose more blocked positions ... then we can improve its play a little more ... etc.

Edit: or we could just get some games from somewhere else, no reason to only use fishtest? e.g. http://data.lczero.org/files/match_pgns/1/

vondele · 2020-04-28T07:37:19Z

I believe there have been some valid concerns raised in this thread, enough so that we should consider alternatives. I have now built a new book with a very different approach based on these comments. I'll again do some testing on fishtest later. The major concerns I have seen raised are:

balance between closed and open lines (e.g. closedpos.epd vs noob_3moves.epd)
need for short lines (2moves, noob_3moves)
need for long lines (8moves)
presence of particular openings like french advance, KID, etc (8moves)
absence of 'strange/rare openings' (2moves, noob_*)
Elo resolution

To address this, I made a book based on the frequency of FENs in games played at lichess (restricted to Elo > 1800, TC > 60). I retained the 200k most frequent FENs out of >8M games. (see official-stockfish/books#9)

This has the following advantages:

lines closed and open are balanced, reflecting human choice
short lines are present (e.g. startpos is the most frequent position)
long lines are present (i.e. popular deep lines are played relatively often).
has all named openings
'strange/rare' openings are absent or a very small fraction (e.g. no grob in the top 200'000)
Elo resolution needs to be measured on fishtest.

Of course, the choice of the initial database will somewhat influence the resulting FENs, but I think that's more or less secondary.
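The frequency-based selection described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used for the book: the function names are hypothetical, each game is assumed to be already available as a list of FEN strings (PGN parsing and the Elo/TC filters are left out), and counting a position at most once per game is my own assumption.

```python
from collections import Counter

def normalize_fen(fen):
    """Keep only placement, side to move, castling and en passant fields,
    so that move counters do not split identical positions."""
    return " ".join(fen.split()[:4])

def most_frequent_positions(games, top_n):
    """games: iterable of games, each an iterable of FEN strings.

    Counts each position at most once per game (an assumption of this
    sketch) and returns the top_n (fen, count) pairs.
    """
    counts = Counter()
    for game in games:
        counts.update({normalize_fen(fen) for fen in game})
    return counts.most_common(top_n)

# Tiny hypothetical sample: three games of one move each.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
E4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
D4 = "rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq d3 0 1"
games = [[START, E4], [START, D4], [START, E4]]

top = most_frequent_positions(games, 2)
for fen, count in top:
    print(count, fen)
```

In this toy sample the start position comes out on top, in line with the remark that startpos is the most frequent position.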

Edit: the Elo testing yielded the following:

SF11 -> master (STC)
 ELO: 11.89 +-1.6 (95%) LOS: 100.0%
Total: 60000 W: 13791 L: 11738 D: 34471
Ptnml(0-2): 763, 6016, 14647, 7553, 1021 
https://tests.stockfishchess.org/tests/view/5ea7e0a953a4548a0348ecb1

SF11 -> master (LTC)
ELO: 14.61 +-1.6 (95%) LOS: 100.0%
Total: 40000 W: 7331 L: 5650 D: 27019
Ptnml(0-2): 181, 3045, 11987, 4486, 301 
https://tests.stockfishchess.org/tests/view/5ea7e0d653a4548a0348ecb5

SF10 -> SF11 (STC)
ELO: 43.35 +-1.7 (95%) LOS: 100.0%
Total: 60000 W: 17566 L: 10119 D: 32315
Ptnml(0-2): 531, 4776, 13411, 9279, 2003 
https://tests.stockfishchess.org/tests/view/5ea7e0c353a4548a0348ecb3

So the Elo spread is somewhat small on this book.

Does anybody have a pointer to another PGN database of high-quality games (e.g. master level, ICCF)? It would need to contain > 2M games to be suitable for building a book, I would say.

Alternatively, a subset of high quality leela training games (again >2M) ?

xoto10 · 2020-04-29T08:14:35Z

noob_2/3moves books were selected to avoid drawish openings IIRC, but the closedpos book just turned out to have a good Elo spread without any explicit drawish checks. (I wonder why?)

Do you have any info on how many of these popularpos lines qualify as closed under the closedpos tests? Maybe we need a not-drawish test if we want to consider these popular and more open lines?

vdbergh · 2020-04-29T11:05:35Z

"noob_2/3moves books were selected to avoid drawish openings IIRC,"

No, they were not. In fact, their draw ratio is rather high. Note: for the same Elo you want the highest possible draw ratio (= least amount of noise). If you want to lower the draw ratio, convert every draw into a win or loss using a coin.
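The coin remark above can be checked with a small calculation. The sketch below uses hypothetical outcome probabilities (not taken from any test in this thread) and treats games as independent; it compares the per-game score variance before and after replacing every draw by a fair coin flip. The expected score, and hence the measured Elo, stays the same, while the variance grows by a quarter of the draw probability.

```python
def per_game_variance(win, draw, loss):
    """Variance of the per-game score (win = 1, draw = 0.5, loss = 0)."""
    mean = win + 0.5 * draw
    return win + 0.25 * draw - mean ** 2

def coin_flip_draws(win, draw, loss):
    """Replace every draw by a fair coin flip between a win and a loss."""
    return win + 0.5 * draw, 0.0, loss + 0.5 * draw

# Hypothetical win/draw/loss probabilities, for illustration only.
probs = (0.20, 0.70, 0.10)
flipped = coin_flip_draws(*probs)

var_with_draws = per_game_variance(*probs)
var_coin = per_game_variance(*flipped)

print("expected score:", probs[0] + 0.5 * probs[1], "vs", flipped[0])
print("variance with draws:", round(var_with_draws, 4))
print("variance after coin flips:", round(var_coin, 4))
```

So lowering the draw ratio this way only adds noise: the same Elo difference takes more games to resolve.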

vondele · 2020-04-30T20:25:35Z

I ran a second test on a book popularpos_lichess_v2.epd which was constructed retaining games from >2200 Elo players only. The result, however, is nearly identical:

 ELO: 43.41 +-1.7 (95%) LOS: 100.0%
Total: 59896 W: 16875 L: 9430 D: 33591
Ptnml(0-2): 492, 4789, 13408, 9300, 1959 
https://tests.stockfishchess.org/tests/view/5eab03cb09d25e8e5058169b

the noob_3moves book was not selected specifically to avoid drawish openings, but it might be a side effect of how the database has been constructed.

noobpwnftw · 2020-04-30T20:44:48Z

My books were built from one simple rule: pick moves that are top N and not worse than a score threshold.
I find it interesting that the result converges with a book built with human games.
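The selection rule described above could look roughly like this. It is only a sketch under stated assumptions: the function name is hypothetical, scores are taken to be centipawns from the side to move's point of view, the threshold is treated as an absolute score, and the example scores are invented. The actual book generation may differ on all of these points.

```python
def select_book_moves(scored_moves, top_n, min_score):
    """Keep moves that are among the top_n by score and not below min_score.

    scored_moves: dict mapping a move to its score (assumed centipawns,
    side-to-move point of view).
    """
    ranked = sorted(scored_moves.items(), key=lambda kv: kv[1], reverse=True)
    return [move for move, score in ranked[:top_n] if score >= min_score]

# Invented scores, for illustration only.
moves = {"e2e4": 30, "d2d4": 25, "g2g4": -90, "c2c4": 20, "h2h4": -40}

print(select_book_moves(moves, top_n=4, min_score=-50))
print(select_book_moves(moves, top_n=4, min_score=0))
```

Applying the rule recursively to each resulting position would give the lines of a 2-move or 3-move book.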

vondele · 2020-04-30T20:54:49Z

I did a quick analysis (depth 13) of the score of the book moves, and that highlights quite some difference between the 2 classes of books:

basically, the human games, even in these 'popular positions' have a much broader range of scores, i.e. essentially won or lost. This improves only very little with Elo of the players. I think the main problem is that these human games are mostly very short TC (>60s, but typically 180s). So, if anybody has a clean database of long TC games between good players...

vondele · 2020-05-02T20:59:42Z

So, the average number of nodes needed to reach depth 13:

book	nodes
noob_3moves	81385
closedpos	123145
popularpos	113054
popularpos_v2	111785
popularpos_v3	115037

noobpwnftw · 2020-05-02T21:01:19Z

Weird, so the theory is right, but the result went the opposite...

dorzechowski · 2020-05-03T22:56:56Z

Out of curiosity I checked depth 13 nodes in the 2moves_v2 book. The book is relatively small (12k positions), so I analyzed the whole book. The average is 134673 and the histogram looks like this:

Perft 5 nodes vs depth 13 nodes scatter plot looks like below. There is no correlation at all (R=0.14).

Position with max depth 13 nodes (385505):
rnbqkbnr/p1pp1ppp/1p2p3/8/3P4/4P3/PPP2PPP/RNBQKBNR w KQkq -

Position with min depth 13 nodes (28154):
rnbqkbnr/p1pp1ppp/4p3/1p6/5P2/2N5/PPPPP1PP/R1BQKBNR w KQkq -

All with latest SF (2 May 2020).

noobpwnftw · 2020-05-04T00:54:23Z

It makes sense now: Elo spread is related to the percentage of positions contained in the book that may be reached by playing SF topN moves. This is why closedpos had a good spread but popularpos didn't.

dorzechowski · 2020-05-04T01:41:01Z

I'm not sure. For example, the book 2moves_v1 contained basically random sequences of moves and had the same spread as noob_3moves. We measured it at the end of December and the results were as below. It looks like books constructed differently, and even with vastly different RMS bias, may give the same sensitivity.

book	Elo spread	draw ratio	RMS
2moves_v1	44.50	0.513	73.85
noob_3moves	44.90	0.566	31.47
noob_2moves	40.75	0.562	33.02

noobpwnftw · 2020-05-04T01:47:06Z

Well as for 2moves there are just 2 moves, so pretty much anything not losing a pawn's worth is within topN, and it did remove some outright bad moves.

dorzechowski · 2020-05-04T02:02:17Z

I added noob_2moves to the table above. Both 2moves books have very little in common it seems.

Actually, I now want to test the hypothesis that positions with a bigger depth 13 node count are more complex. I'm going to sort the 12k positions from 2moves_v2 by depth 13 nodes, split them into 3 equal parts, and then use the 1st and 3rd parts as new books to play 8000-game matches between SF11 and SF10. If it's true that a bigger node count means more complexity, then the book made from the 3rd part should give a significantly bigger spread than the first one. It would be interesting to either confirm or debunk it. Unfortunately I have only a measly laptop, so it may take some time before I get back with the results.
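The split described above is simple to sketch. The helper below is hypothetical, and the position keys and most node counts are placeholders; only the minimum (28154) and maximum (385505) echo figures quoted earlier in the thread. It sorts positions by their depth 13 node count and returns the lowest and highest thirds.

```python
def split_low_high(node_counts):
    """node_counts: dict mapping a position (e.g. a FEN) to its node count.

    Returns (low, high): the third of positions with the fewest nodes and
    the third with the most nodes; the middle third is dropped.
    """
    ordered = sorted(node_counts, key=node_counts.get)
    third = len(ordered) // 3
    return ordered[:third], ordered[len(ordered) - third:]

# Placeholder positions with illustrative node counts.
counts = {
    "pos1": 28154,
    "pos2": 385505,
    "pos3": 90000,
    "pos4": 120000,
    "pos5": 134673,
    "pos6": 200000,
}

low_book, high_book = split_low_high(counts)
print("low: ", low_book)
print("high:", high_book)
```

Each of the two lists would then be written out as its own book file and used for a separate SF11 vs SF10 match.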

noobpwnftw · 2020-05-04T02:30:20Z

The difference between my 2moves and 3moves books is just making one more move that is not too bad, and my scores are back-propagated. Still, I think the coverage ratio among topN matters; the spread of 2moves_v1 might be because a higher RMS matters only for the first few moves, but not beyond.

vondele · 2020-05-04T05:26:37Z

I have #W #L #D (White POV) for the noob_3moves book from fishtest LTCs. It typically looks like:

  "rn1qkbnr/ppp2ppp/3p4/4pb2/2PP1P2/8/PP2P1PP/RNBQKBNR w KQkq -": [
    59,
    48,
    215
  ],
  "rnbqkb1r/pp1pppp1/2p2n1p/8/3P1P2/8/PPPBP1PP/RN1QKBNR w KQkq -": [
    38,
    27,
    186
  ],
  "rnbqkbnr/2pp1ppp/1p6/p3p3/8/3P4/PPPNPPPP/1RBQKBNR w Kkq -": [
    25,
    44,
    233
  ],
  "rn1qkb1r/pbpppppp/5n2/1p6/8/PP4P1/2PPPP1P/RNBQKBNR w KQkq -": [
    39,
    35,
    226
  ],

So, openings appear winnable from both sides. I don't directly see a pattern. @vdbergh do you think that this data could be used to select good positions for a book?

bookstats_noob_3moves.json.zip

NKONSTANTAKIS · 2020-05-05T01:21:44Z

A lot of 150K-350K eval yellows recently. Maybe check them on closedpos?
I am thinking it's getting harder and harder to get 1 Elo with a single patch.
As most of those should be around +0.5 to +1.3, I like the idea of a standardized decider.
Different environment + excellent spread scaling of book...how about at a bit higher LTC?
It feels wasteful to throw them away after having spent so many LTC games.
The higher the game count, the closer they are to +1. Well probably around 0.9, due to selection bias.

Also with too many tests + low success rate, eventually some will pass out of luck. With a closer examination of the best performers the harvesting will be safer.

Atm it seems to me that too many resources are used on an extreme amount of different versions on very low pass rate, and thus a higher confidence would be logical.

noobpwnftw · 2020-05-05T01:38:57Z

closedpos will not make them pass. The LTC bounds are very narrow, so it is expected to take a large number of games to resolve patches that fall within this Elo diff range. This is the price to pay so that fewer patches pass by luck. A low success rate and too many similar tests cannot be solved by lowering the bar, and I'm colorblind, so I cannot tell the difference between a yellow and a red SPRT test.

dorzechowski · 2020-05-05T02:00:40Z

@vondele I think we could calculate SNR of each book position by normalized Elo formula or just check z=(w-l)/sqrt(w+l) and get rid of positions with z close to zero as they don't give any signal. But it would be also good to get confirmation from @vdbergh of course.
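As a rough illustration of the z suggestion above, the sketch below applies z = (w - l) / sqrt(w + l) to the four sample positions posted earlier in the thread. The cut-off value is hypothetical and chosen only to show the filtering step; whether such a filter helps at all is exactly the open question.

```python
import math

def z_score(wins, losses):
    """z = (w - l) / sqrt(w + l); draws carry no signal in this measure."""
    if wins + losses == 0:
        return 0.0
    return (wins - losses) / math.sqrt(wins + losses)

# [W, L, D] from White's point of view, copied from the sample above.
book_stats = {
    "rn1qkbnr/ppp2ppp/3p4/4pb2/2PP1P2/8/PP2P1PP/RNBQKBNR w KQkq -": [59, 48, 215],
    "rnbqkb1r/pp1pppp1/2p2n1p/8/3P1P2/8/PPPBP1PP/RN1QKBNR w KQkq -": [38, 27, 186],
    "rnbqkbnr/2pp1ppp/1p6/p3p3/8/3P4/PPPNPPPP/1RBQKBNR w Kkq -": [25, 44, 233],
    "rn1qkb1r/pbpppppp/5n2/1p6/8/PP4P1/2PPPP1P/RNBQKBNR w KQkq -": [39, 35, 226],
}

MIN_ABS_Z = 1.0  # hypothetical cut-off, not a value proposed in the thread

z_values = {fen: z_score(w, l) for fen, (w, l, d) in book_stats.items()}
kept = [fen for fen, z in z_values.items() if abs(z) >= MIN_ABS_Z]

for fen, z in z_values.items():
    print(f"{z:+.3f}  {fen}")
print(len(kept), "of", len(book_stats), "positions kept")
```

With only a few hundred games per opening, much of the spread in z may be statistical noise, so a real filter would need a great deal of data per position.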

NKONSTANTAKIS · 2020-05-05T02:21:42Z

@noobpwnftw I want less patches to pass by luck, not more. Atm the pass rates are extremely low, but the amount of tested patches is huge, so inevitably the quality decreases & resources are wasted. For colorblind purposes the yellow can be regarded as red without lowering the elo bar but with an even higher amount of games. A higher spread will enable better performance.

closedpos had equal spread at STC but +2.7 at LTC, a very good indication.

So it might not make them pass as you say, but it can make them fail faster!

noobpwnftw · 2020-05-05T02:30:31Z

I hope so but with the large number of games their elo measurement is actually very accurate, they do fall around +0.5 range and they would still cost similar resources to conclude, and book probably won't change that.
In fact, if it does, then I see trouble.

NKONSTANTAKIS · 2020-05-05T02:58:17Z

Well at this point maybe even a +0.5 at worst is nice. Using millions of LTC games for little gain feels ineffective. What if without you? I also think that testing many versions of same patches with slight changes is bad practice. One might get lucky in the end, worth 0.5, but at a very high price.
The beast needs to be fed I guess...so why not to get our +0.5 in a smarter way?

Btw I like the system more than ever, but I think it's very beneficial to keep evolving it, not only SF.

noobpwnftw · 2020-05-05T03:23:03Z

For that then I think it is important to understand how to manipulate elo spread.

This is my scored list of all unique positions after 2 moves without any filtering:
https://www.chessdb.cn/downloads/2moves_scores.zip

I think I have calculated scores for any position up to 4 moves but the data is quite large.

vondele · 2020-05-05T06:09:36Z

@noobpwnftw could you make that scores data available for 3moves ? Either all if less than a few GB, or just for the positions in the noob_3moves book ? That will be interesting to correlate with ' z=(w-l)/sqrt(w+l)'

vondele · 2020-05-05T06:36:25Z

Apart from a 'feature' near zero (not sure where this is coming from), the distribution of (w - l) / sqrt(w + l) is very Gaussian for the noob_3moves book. This could be because of the limited statistics for each of the openings? It might nevertheless be interesting to try to split the positions into two sets.

vondele · 2020-05-05T11:27:50Z

So, I locally did a test, splitting the noob_3moves according to the abs( (w-l) / sqrt(w+l)) > 0.167 (roughly 1 sigma), and there is no measurable difference (60k games) between the low and high parts of the book. So I start suspecting the broad Gaussian is just the noise, and the feature near 0 is the signal.... this is using the results of 44M LTC fishtest games using the noob_3moves book.

noobpwnftw · 2020-05-06T16:56:03Z

@vondele Full scores of positions after 3 moves: https://www.chessdb.cn/downloads/3moves_scores.zip

vondele · 2020-05-06T17:40:23Z

Interesting distribution of the scores of all positions after 3 moves...

noobpwnftw · 2020-05-06T17:45:41Z

The features around -15 and 0 are probably caused by the way I calculate things; the distribution might actually be smooth, but it doesn't matter when you sample moves with a wider range.

dorzechowski · 2020-05-06T23:12:31Z

No difference in my tests between book created from positions with low or high node count on depth 13 (TC 10+0.1).
Low:
Score of Stockfish_11 vs Stockfish_10: 2296 - 1236 - 4468 [0.566] 8000
High:
Score of Stockfish_11 vs Stockfish_10: 2276 - 1276 - 4448 [0.562] 8000

vondele · 2020-05-11T18:54:57Z

so, with #2670 we have a first patch that resulted from the closedpos book. Let's call this a success :-)

I don't think we have particular evidence to change the default book, but I'm sure we now know that we still don't know quite a few things about opening books.

I'll thus close this issue, keeping noob_3moves the default book. The other books can be used as non-default books, either for experimenting or to create Elo gainers, but we'll test patches for non-regression against noob_3moves to gather experience with this setup, asserting that we prefer generic solutions rather than specialized ones.

xoto10 · 2020-05-11T19:04:34Z

See also: https://tests.stockfishchess.org/tests/view/5eb1e2dd2326444a3b6d33f9 #2662 :)
Although the stc was with noob_3moves, don't remember why I made those choices. Probably intended to use closedpos with the stc but forgot to set it, then made sure I did for the ltc.

vondele · 2020-05-11T19:09:09Z

OK, I overlooked that... should have been in the PR a little more clearly ;-). Extra credit for the book.

xoto10 referenced this issue in SFisGOD/Stockfish Apr 25, 2020

Try. Bench: 4797620

ff5ed07

snicolet added the books label May 4, 2020

vondele mentioned this issue May 10, 2020

Pawn value #2670

Closed

vondele closed this as completed May 11, 2020

Lolligerhans mentioned this issue Nov 30, 2020

Refine rook penalty on closed files #3242

Closed
