Threefold repetition detection #925

saproj · 2016-12-13T18:26:57Z

Implement a threefold repetition detection. Below are the examples of problems fixed by this change.

1. Losing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested the losing move "bestmove a8a7"; the new code suggests "bestmove a8b7", which leads to a draw.

2. Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0"; the new code's evaluation is around "cp -50", which is adequate.

Brings 0.5-1 ELO gain. Passes [-3.00,1.00].

STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746

LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397
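
As a rough cross-check of the "0.5-1 ELO gain" figure, the win/loss/draw totals above can be converted to an Elo difference with the standard logistic formula. This is only a sketch: the function name `elo_from_wld` is made up for illustration, and it is not the statistic Fishtest itself uses to compute the LLR.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Estimate an Elo difference from game counts (logistic model)."""
    games = wins + losses + draws
    # Score fraction: a win counts 1, a draw counts 0.5.
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))


# Totals copied from the STC and LTC results quoted above.
stc = elo_from_wld(8537, 8461, 30746)
ltc = elo_from_wld(4739, 4639, 27397)
print(f"STC: {stc:+.2f} Elo, LTC: {ltc:+.2f} Elo")
```

Both point estimates land between roughly half an Elo point and one Elo point, which is consistent with the claim; the uncertainty on each estimate is considerably larger than the estimate itself.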

MichaelB7 · 2016-12-13T18:33:18Z

A nice effort with a satisfactory result. Congrats.

Stefano80 · 2016-12-13T18:59:32Z

Why -3, 1? It adds code, so it should be 0, 5

saproj · 2016-12-13T19:04:35Z

Can be considered a bug fix, therefore tested for no-regression.

joergoster · 2016-12-13T20:07:16Z

@Stefano80 Similar tries were also tested with bugfix bounds in the past. But I admit this patch adds a LOT of code. OTOH, this is the first one that passed STC and LTC!
I don't envy the maintainer(s) for the decision they have to make.
@saproj Congrats!

atumanian · 2016-12-13T21:08:30Z

Good work! The 3-fold repetition blindness has always been annoying.
But why does this change the bench signature? The bench test doesn't have pre-root positions, so the patch shouldn't affect it.

saproj · 2016-12-13T21:42:56Z

Because the root position itself is reenterable unless it is a repetition (like in the case of bench). Why? I tested the alternative and it was weaker.

syzygy1 · 2016-12-14T08:48:07Z

It does not seem to make sense to allow a first repetition of the root position. Did keeping the old behaviour for the root position fail on fishtest?

A possible way to remove the st->draw flags: after setting up the position in uci.cpp, walk from the root back into history and set all non-repeated st->key values to zero.
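
A minimal sketch of that idea, assuming the game history is available as a plain list of position keys with the root key last, and assuming 0 is never a valid key. The function name `zero_unrepeated_keys` is hypothetical. Keeping the root key untouched is an interpretation based on the later remark in this thread that this variant does not allow the root position to be repeated even once during search.

```python
from collections import Counter


def zero_unrepeated_keys(keys):
    """Blank out pre-root keys that occur only once in the game history.

    keys[-1] is the root position; it is always kept, so a return to the
    root during search is still seen as a repetition.
    """
    counts = Counter(keys)
    cleaned = [key if counts[key] > 1 else 0 for key in keys[:-1]]
    cleaned.append(keys[-1])
    return cleaned


# Key 22 occurred twice before the root (44); keys 11 and 33 only once.
history = [11, 22, 33, 22, 44]
print(zero_unrepeated_keys(history))
```

After this pass, an unchanged "any earlier match is a draw" check only fires for pre-root positions that had already occurred at least twice, so no extra flag per state is needed.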

saproj · 2016-12-14T12:28:54Z

> It does not seem to make sense to allow a first repetition of the root position. Did keeping the old behaviour for the root position fail on fishtest?

Yes, it did.

> A possible way to remove the st->draw flags: after setting up the position in uci.cpp, walk from the root back into history and set all non-repeated st->key values to zero.

Interesting idea. I'll try it.

vdbergh · 2016-12-14T17:48:02Z

@saproj The fact that the first version failed SPRT(-3,1) does not mean it is weaker. You cannot draw any conclusion from a failed SPRT(-3,1).

Also in case of a small regression (like -1 elo as could be the case here) you can make a SPRT(-3,1) pass by repeating it enough times (possibly combining it with placebo changes a la "Take 1", "Take 2", "optimized version" etc...).
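
The arithmetic behind "repeating it enough times" can be sketched as follows. The per-test pass probability used here (0.2) is a made-up illustrative number, not a measured Fishtest figure, and the sketch assumes the repeated tests are independent.

```python
def chance_of_any_pass(pass_probability, attempts):
    """Probability that at least one of several independent tests passes."""
    return 1.0 - (1.0 - pass_probability) ** attempts


# Hypothetical: a slightly regressing patch passes a single test 20% of the time.
for attempts in (1, 5, 10):
    print(attempts, round(chance_of_any_pass(0.2, attempts), 3))
```

Under these assumptions the chance of seeing at least one pass grows quickly with the number of resubmissions, which is why a single pass after several attempts is weak evidence.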

I think in this case the maintainers should simply be ready to (possibly) sacrifice 1 elo in order to repair what is obviously a bug in Stockfish.

MichaelB7 · 2016-12-14T17:56:02Z

I know I don't have a vote - but I would vote for this patch. There are many users of Stockfish that would appreciate the nuance corrected here.

Stefano80 · 2016-12-14T19:26:09Z

I really don't understand why it should be tested with [-3, 1]. It adds more than 30 lines of code. It does not fix anything, and for certain not a bug. It has not been approved in advance by maintainers. It is just an arbitrary abuse of the framework. The guidelines say:

> These tests are also used for bug fixes and other special cases, but only after being discussed and approved in advance to avoid people testing with no-regression mode becoming their preferred toy, instead of using the stricter standard mode.

Please either stick to the guidelines or propose to change them.

@vdbergh : it has been repeated hundreds of times that counter-intuitive eval in an arbitrary class of positions is not considered a bug.

Mindbreaker1 · 2016-12-14T20:21:25Z

I think 30 lines is an exaggeration. Some is documentation, spacing, and brackets. I see 13 without that. It fixes a problem and it does not appear to hurt. And is it possible that this might help when contempt is present achieving more at a lower setting? That is just a guess, not an assertion.

At the very least, it means less grumbles on the forum. That has to be worth something.

saproj · 2016-12-14T20:42:43Z

@vdbergh I did direct comparison locally. Not allowing to reenter the root position is, surprisingly, worse than allowing it. By several ELO points. Why? It's a mystery.

saproj · 2016-12-14T20:50:03Z

@Stefano80 Just to clarify my position: I believe I'm fixing a bug. Surprised by your "for certain not a bug".

ddugovic · 2016-12-15T00:13:02Z

Regardless we can agree, "It has not been approved in advance by maintainers."

> The old code evaluated it as "cp 0", the new code evaluation is around "cp -50" which is adequate.

How is "cp 0" a bug? (In my fork I do address this issue, however I cannot prove "cp 0" is a bug according to terminology used by maintainers.)

jhellis3 · 2016-12-15T02:59:36Z

To understand why this is a "bug" is fairly straightforward - the current behavior simply does not adhere to the actual rules of chess. Using a small eval difference is a bit disingenuous. Say the eval is -10, but human players repeated the previous position. Showing an eval of draw is clearly in error, since the first repetition does not result in an immediate draw under the actual rules of the game. While it may not make any difference for Stockfish in playing a game, it does cause a clear discontinuity in evaluation when used for analysis. As analysis is the primary use case for the engine, ensuring correct output should at least be worthy of consideration, IMHO.

saproj · 2016-12-15T11:36:31Z

@syzygy1 Zeroing non-repeated st->key (at least the way I implemented it) is not better. See http://tests.stockfishchess.org/tests/view/58514cd80ebc5903140c5c04

saproj · 2016-12-15T13:43:29Z

@ddugovic,
"cp 0" is a bug because it is a wrong evaluation. This 0 eval may be assigned to a position which is in fact very different from equality.

You mentioned you address this issue in your fork. I do not know about it. So, is there also a fix for the 1st problem from my list? I mean the problem when SF makes a losing move in a drawn position.

ddugovic · 2016-12-15T14:10:31Z

My fork's change probably looks familiar to some; unfortunately I forget who the original author is that I copied the idea from.

People keep using the words "bug" and "fix" and while I only endorse the word "problem" my fork does not play a8a7 in that study position.

saproj · 2016-12-15T14:18:53Z

@ddugovic Familiar idea. It introduces an ELO-stealing slowdown because of additional loop iterations.

Rocky640 · 2016-12-15T15:13:18Z

Congrats to the author !!!

There is one helper function, and a few line changes.
This looks quite reasonable to me.

Let me take some examples:
Let's say that engine would systematically output cp=0 whenever the last move was e.p.
I think we would want to fix this.

If search would never consider promotions to bishop, (hello old Rybka),
I think we would want to fix this. No ELO improvement,
but we want to design an engine which plays according to the rules of chess.

If I have a car which runs perfectly, but would stop working each time I "leave from work" (but would be ok if I leave from all the other places), I would say that this is a problem.

As it is, each time current master "leaves from a repetition", it virtually stops doing its duty which is
to keep trying to output the most realistic evaluation according to the available information.

It's not like a specific funny position which is misevaluated. This is a systemic error which does not really depend on the position. It depends only on the fact that there was a repetition in the game history.

We expect an engine to properly handle e.p., castling rights, 50 moves rule, (which all relates to the game history) AND also repetitions.

It might even be a small ELO gainer. In a time-limited game, SF might go for a first repetition, and on the next move, using the info in the TT and the time allocated by the time management, it might go for a better fighting continuation or simply keep trying with something else.

lantonov · 2016-12-15T15:25:17Z

Having tried myself to fix this problem (bug?) and failing badly, I know how complicated it is if you dig deeper. The author of this patch did a good job, improving on his previous idea.

syzygy1 · 2016-12-15T21:08:20Z

@saproj
I would not expect zeroing st->key instead of adding st->draw flags to do measurably better, but it cannot do worse and, arguably, it is a slightly simpler solution (not needing an additional field in StateInfo, for example).

atumanian · 2016-12-15T23:02:33Z

@syzygy1, but your version treats the root position differently: it doesn't allow repeating it even once during a search, unlike saproj's version. He has already written about this here.

saproj · 2016-12-15T23:03:20Z

@syzygy1 This idea of zeroing st->key looked appealing to me at first. But now that I have tested it, I want to stick to adding st->draw. By the way, this new field resides in an alignment gap (under x86_64) and StateInfo's size did not change.

ElbertoOne · 2016-12-16T09:31:03Z

I'm in favor of this patch, even if it adds some complexity. To remove all doubt, we might consider a reverse simplification test (master against patch instead of patch versus master).

jcalovski · 2016-12-16T11:52:10Z

If I were a maintainer I'd list a few "exceptions":

3 fold rep
fortress detection
reworked contempt
and so on...

For which I'd relax the rules slightly.

stockfishdeveloper · 2016-12-16T15:31:34Z

I guess this patch cuts to the root of Stockfish's purpose.
The question it brings up is:
Should Stockfish be made for playing or analyzing?

joergoster · 2016-12-25T11:18:03Z

@mcostalba Nice! But this again excludes the root position to be repeated once without directly assigning a draw score ...

vdbergh · 2016-12-25T19:42:30Z

@mcostalba If I read correctly this is precisely the version that failed several times in the past. It seemed to be a -1 elo regression. So if you keep on testing it, it will eventually pass... which would be a good thing IMHO.

mcostalba · 2016-12-25T19:57:40Z

@vdbergh Can you please post a link to validate your statement, or are you just going by memory? FYI this is fully equivalent to the original patch but the recursion at root.


vdbergh · 2016-12-25T20:25:28Z

@mcostalba AFAICS your version does not contain the optimization "calc_draw". This is probably a placebo but one cannot be sure.

atumanian · 2016-12-25T21:24:00Z

This test shows that it's better to allow repetition of the root position: http://tests.stockfishchess.org/tests/view/58514cd80ebc5903140c5c04
@syzygy1, I understand the difference between repeating the root position and repeating positions after the root. My only argument for repeating the root is the results of tests.

vdbergh · 2016-12-25T21:48:55Z

@mcostalba This seems to be the first version. It failed SPRT(-4,0).

https://groups.google.com/forum/?fromgroups=#!searchin/fishcooking_results/3fold_fix%7Csort:relevance/fishcooking_results/Jvg9Ytp5mY4/JyQ7uviPMkIJ

mcostalba · 2016-12-26T07:49:38Z

These versions are similar but not equivalent. In particular, they do not verify that the boundary is exactly at the root position (it is tricky to find the exact boundary; I did this with some debug code, which you can see in a previous commit of my pushed branch).


mcostalba · 2016-12-26T07:55:37Z

@vdbergh Regarding calc_draw(): if the patch passed just because of that, then it is better for it to fail and not be committed, because it means it passed by pure luck. In real games the cases where we reach plies before root are very, very few (measured with debug instrumentation), and even in those cases the extra search to reach the game ply base is absolutely unmeasurable. Again, if this is the case, then a statistical fluke is a much more probable explanation for the successful test.

vdbergh · 2016-12-26T08:06:38Z

@mcostalba Thanks for repeating my points (your last post). As for your earlier post. I think the old version from 2014 is equivalent to your "new" one (there is nothing tricky about this). I have not checked it carefully however (it was not my patch).


vdbergh · 2016-12-26T08:16:39Z

BTW I was simply responding to your claim: "FYI this is fully equivalent to the original patch but the recursion at root.". This statement is false as the (probably placebo) optimization is dropped. Also: please do not use "FYI". I am perfectly capable of reading your code.

mcostalba · 2016-12-26T08:17:01Z

Ok, I have submitted a couple of tests: with root position included and excluded from extended draw repetition.

ddugovic · 2016-12-26T15:01:32Z

I apologize for the untimeliness of the following idea as I do not fully understand HGM's comment as it applies to Stockfish:

> One way to solve [threefold repetition] is to remove all positions from the game history that occurred only once, during loading of a position in AnalyzeMode. This is code that is unlikely to be patched very frequently. E.g. you could spoil the incrementally updated hash key before parsing the moves by XOR'ing it with some constant, and repair it afterwards. And repair the key of any encountered position that would get stored in the move-history table whenever you detect it is a repetition.

Given that the position ... moves command reconstructs the entire game history I wonder, after that command would it be safe (for positions only existing once in the game history) to spoil stp->key (with or without "repair" code)?

atumanian · 2016-12-26T15:16:55Z

@ddugovic, I've actually tried this idea on Fishtest: http://tests.stockfishchess.org/tests/view/58223a4b0ebc5910626b9c64
My version isn't perfect, so you shouldn't take it literally, but you can look at it.
Before every search it reconstructs pre-root states so that all positions which were repeated only once are deleted. The patch adds a lot of code, but the code which tests for a draw is almost unchanged.

mcostalba · 2016-12-31T13:34:28Z

Both tests passed, so I think I will commit the one with root position excluded from extended draw repetition because it seems the most logical to me.

atumanian · 2016-12-31T16:54:15Z

@mcostalba, but only the version with the root position included works correctly with MultiPV > 1, as I have said earlier.

atumanian · 2016-12-31T16:57:04Z

We can also make a match between the two versions to decide which is stronger.


mcostalba · 2017-01-01T09:43:16Z

@atumanian The argument about MultiPV is not clear; in particular, it has been tentatively refuted by @syzygy1, and I think his arguments make sense. Overall, including the root because of MultiPV leads to questionable and subjective argumentation on both sides, not to a clear black-or-white condition. So, because the benefit in the MultiPV case is not clear and because, apart from this very specific and debatable point, there is no logical or measurable added benefit, I will commit the version with root excluded.


Implement a threefold repetition detection. Below are the examples of problems fixed by this change.

Losing move in a drawn position.
position fen 8/k7/3p4/p2P1p2/P2P1P2/8/8/K7 w - - 0 1 moves a1a2 a7a8 a2a1
The old code suggested the losing move "bestmove a8a7"; the new code suggests "bestmove a8b7", which leads to a draw.

Incorrect evaluation (happened in a real game in TCEC Season 9).
position fen 4rbkr/1q3pp1/b3pn2/7p/1pN5/1P1BBP1P/P1R2QP1/3R2K1 w - - 5 31 moves e3d4 h8h6 d4e3
The old code evaluated it as "cp 0"; the new code's evaluation is around "cp -50", which is adequate.

Brings 0.5-1 ELO gain. Passes [-3.00,1.00].

STC: http://tests.stockfishchess.org/tests/view/584ece040ebc5903140c5aea
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 47744 W: 8537 L: 8461 D: 30746

LTC: http://tests.stockfishchess.org/tests/view/584f134d0ebc5903140c5b37
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 36775 W: 4739 L: 4639 D: 27397

The patch has been rewritten into its current form for simplification, and the logic has been slightly changed so that it returns a draw score if the position repeats once earlier but after or at the root, or repeats twice strictly before the root. In its original form, a repetition at root was not returned as an immediate draw. After retesting both versions with SPRT[-3, 1], both passed successfully, but this version was chosen because it is more natural. There is an argument about MultiPV in which an extended draw at root may be sensible. See the discussion here: #925

For documentation, the current version passed at both STC and LTC:

STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 51562 W: 9314 L: 9245 D: 33003

LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 115663 W: 14904 L: 14906 D: 85853

bench: 5468995
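
The rule stated in the commit message (draw if the position repeats once at or after the root, or twice strictly before the root) can be sketched like this. It is an illustration of the rule only, not the merged C++ code: the function name `is_repetition_draw`, the list-of-keys representation, and `root_index` are assumptions, and the sketch ignores the fifty-move counter and null-move limits that a real engine would use to bound the walk.

```python
def is_repetition_draw(keys, root_index):
    """keys[i] is the position key after ply i; keys[-1] is the node being searched.

    root_index is the index of the search root within keys.
    """
    current = len(keys) - 1
    seen_before_root = 0
    # The same side is to move only every second ply, and the earliest
    # possible repetition of a position is 4 plies back.
    for i in range(current - 4, -1, -2):
        if keys[i] != keys[current]:
            continue
        if i >= root_index:
            return True  # one earlier occurrence at or after the root
        seen_before_root += 1
        if seen_before_root == 2:
            return True  # two earlier occurrences strictly before the root
    return False


# First example position: after a1a2 a7a8 a2a1 the root is index 3, and the
# searched move a8a7 returns to the starting position (index 0).
print(is_repetition_draw(["P0", "P1", "P2", "P3", "P0"], root_index=3))
```

In that example the only earlier occurrence lies strictly before the root, so the node is not scored as a draw and the search has to evaluate a8a7 on its merits.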

mcostalba · 2017-01-01T10:00:42Z

Merged with 881a9df

atumanian · 2017-01-01T15:18:24Z

@mcostalba, I've created an issue where I've tried to explain my point with an example position: #948.

saproj added 2 commits December 12, 2016 16:04

Threefold repetition detection with draw search depth optimization.

0cd9a74

Code comments added

abc87e1

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Dec 26, 2016

Rewrite 3-fold detection (root excluded)

78210dc

Based on official-stockfish/Stockfish#925 bench: 5255881

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Dec 26, 2016

Rewrite 3-fold detection (root included)

220d4d1

Based on official-stockfish/Stockfish#925 bench: 5493489

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Jan 1, 2017

Rewrite 3-fold detection (root excluded)

253df50

Based on official-stockfish/Stockfish#925 bench: 5255881

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Jan 1, 2017

Rewrite 3-fold detection (root included)

ec391e9

Based on official-stockfish/Stockfish#925 bench: 5493489

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Jan 1, 2017

Rewrite 3-fold detection (root excluded)

cdf3683

Based on official-stockfish/Stockfish#925 bench: 5255881

mcostalba added a commit to mcostalba/Stockfish that referenced this pull request Jan 1, 2017

Rewrite 3-fold detection (root included)

e42479d

Based on official-stockfish/Stockfish#925 bench: 5493489

mcostalba closed this Jan 1, 2017

Mardak mentioned this pull request Mar 30, 2020

Allow treating 2-fold repetition as draw with --draw-repetitions=1 LeelaChessZero/lc0#1161

Closed

ddugovic mentioned this pull request Mar 8, 2021

endgame still passes unnecessarily sometimes! domino14/macondo#45

Closed
