Introduce bad outpost penalty #2803

Closed
wants to merge 9 commits into from

SFisGOD
Contributor

@SFisGOD SFisGOD commented Jul 10, 2020

In some French games, Stockfish likes to bring the Knight to a bad outpost spot. This is evident in TCEC S18 Superfinal Game 63, where there is a Knight outpost on the queenside that is actually useless. Stockfish is effectively playing a piece down while holding ground against Leela's break on the kingside.

This patch turns the +56 mg bonus for a Knight outpost into a -7 mg penalty if it satisfies the following conditions:

  1. The outpost square is not on the CenterFiles (i.e. not on files C,D,E and F)
  2. The knight is not attacking non pawn enemies.
  3. The side where the outpost is located contains only few enemies.
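
The three conditions can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the `Bitboard` alias, the bit layout (bit 0 = a1), the names `OutpostBonusMg`, `BadOutpostMg`, `several` and `knight_outpost_mg` are all assumptions made for the example, and only the midgame values quoted above are modelled. The "few enemies" test uses the `b & (b-1) & (b-2)` expression discussed later in this thread.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: bit 0 = a1, bit 63 = h8

constexpr Bitboard CenterFiles = 0x3C3C3C3C3C3C3C3CULL;  // files C, D, E and F

// Midgame values quoted in the description; the names are made up for this sketch.
constexpr int OutpostBonusMg = 56;
constexpr int BadOutpostMg   = -7;

// "More than a few enemies", using the expression the thread attributes to the patch.
constexpr bool several(Bitboard b) { return (b & (b - 1) & (b - 2)) != 0; }

// knightSq must be a one-bit board for a knight already known to sit on an outpost.
constexpr int knight_outpost_mg(Bitboard knightSq, Bitboard knightAttacks,
                                Bitboard enemyNonPawns, Bitboard enemiesOnSameSide) {
    const bool offCenter      = (knightSq & CenterFiles) == 0;         // condition 1
    const bool attacksNothing = (knightAttacks & enemyNonPawns) == 0;  // condition 2
    const bool fewEnemies     = !several(enemiesOnSameSide);           // condition 3
    return offCenter && attacksNothing && fewEnemies ? BadOutpostMg : OutpostBonusMg;
}

// Usage: a knight on b3 (bit 17) attacking a1, c1, d2, d4, a5 and c5.
inline void outpost_example() {
    constexpr Bitboard b3 = 1ULL << 17;
    constexpr Bitboard b3Attacks = (1ULL << 0) | (1ULL << 2) | (1ULL << 11)
                                 | (1ULL << 27) | (1ULL << 32) | (1ULL << 34);
    // Enemy pieces only on e1 and g1: nothing is attacked and the queenside is empty.
    assert(knight_outpost_mg(b3, b3Attacks, (1ULL << 4) | (1ULL << 6), 0) == BadOutpostMg);
    // An enemy piece on c1 is attacked, so the normal bonus applies.
    assert(knight_outpost_mg(b3, b3Attacks, 1ULL << 2, 1ULL << 2) == OutpostBonusMg);
}
```

A knight on a center file (for example d5) always keeps the bonus in this sketch, whatever the other two conditions say.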

Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 6960 W: 1454 L: 1247 D: 4259
Ptnml(0-2): 115, 739, 1610, 856, 160
https://tests.stockfishchess.org/tests/view/5f08221059f6f0353289477e

Passed LTC:
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 21440 W: 2767 L: 2543 D: 16130
Ptnml(0-2): 122, 1904, 6462, 2092, 140
https://tests.stockfishchess.org/tests/view/5f0838ed59f6f035328947a2
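
As a rough cross-check of the W/L/D totals above, the usual logistic formula turns a match score into an Elo difference. The helper name `elo_from_wld` is made up for this sketch, and the figure it gives carries no error bars, so the linked test pages may report a slightly different estimate.

```cpp
#include <cassert>
#include <cmath>

// Logistic Elo difference implied by a W/L/D record (a draw counts as half a point).
inline double elo_from_wld(long wins, long losses, long draws) {
    const long total = wins + losses + draws;
    assert(total > 0);
    const double score = (wins + 0.5 * draws) / total;
    assert(score > 0.0 && score < 1.0);  // the formula is undefined at 0% and 100%
    return 400.0 * std::log10(score / (1.0 - score));
}
```

With the STC totals (W: 1454, L: 1247, D: 4259) this gives roughly +10 Elo, and with the LTC totals (W: 2767, L: 2543, D: 16130) between +3 and +4 Elo.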

Bench: 4651788


Thank you to apospa...@gmail.com for bringing this to my attention and for providing insights.
See https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dEXNzSIBgZU
Reference game: https://tcec-chess.com/#div=sf&game=63&season=18

@SFisGOD
Contributor Author

SFisGOD commented Jul 10, 2020

@vondele I prefer this version since it's just the "more_than_two" name that is wrong and it passed 4 STCs with flying colors (different tweaks of mg value)
https://tests.stockfishchess.org/tests/view/5f08221059f6f0353289477e
https://tests.stockfishchess.org/tests/view/5f081f8f59f6f0353289477a
https://tests.stockfishchess.org/tests/view/5f081a7959f6f03532894774
https://tests.stockfishchess.org/tests/view/5f0818c159f6f03532894771

And 3 LTCs passed too
https://tests.stockfishchess.org/tests/view/5f0838ed59f6f035328947a2
https://tests.stockfishchess.org/tests/view/5f0835b959f6f03532894796
https://tests.stockfishchess.org/tests/view/5f082e7559f6f03532894786

I submitted an LTC for the 4th one with priority -1 but then cancelled it after one of the LTCs passed.

Not looking good for popcount and more_than_one versions
https://tests.stockfishchess.org/tests/view/5f08625859f6f035328947b9
https://tests.stockfishchess.org/tests/view/5f08702559f6f035328947c0

For those who are interested to check the game, here's a screenshot from S18 Sufi Game 63:
[image: screenshot from S18 Sufi Game 63]

@vondele
Member

vondele commented Jul 10, 2020

@SFisGOD you'll need to add a comment to the now misnamed 'more_than_two' function, which explains what it is doing.... it can't be committed in this way. However, since it addresses a long-standing problem, with seemingly good results, it is even more important we understand what it is doing to be able to further improve it.

@SFisGOD
Contributor Author

SFisGOD commented Jul 10, 2020

@vondele I observed the same as what protonspring said when I was playing with it, sometimes it is popcount(b) > 1 AND other times popcount(b) > 2

@vondele
Member

vondele commented Jul 10, 2020

@SFisGOD so can we understand when it is popcount(b)>1, and when it is popcount(b)>2.... seemingly that's the magic bit that makes the patch work so well, since using either popcount(b)>1 or popcount(b)>2 all the time fails.

@NKONSTANTAKIS

NKONSTANTAKIS commented Jul 10, 2020

Thanks SFisGOD, it feels amazing when people can translate what I see into code that performs this well.

My take on the function is that it is exact when >2 and <2, while when =2 we get a benefit similar to the randomization one:
By reporting false 2/3 of the time and true 1/3 of the time, it's like saving, 2/3 of the time, the resources going into fat outpost variations, but not in so strict a manner as to completely exclude ourselves from cases where it can be good.

At this borderline territory we introduce a bit of vagueness, for "not putting all eggs in 1 basket"

So this is mostly an eval improvement, but in the case of =2, it's a search improvement as well.

I propose that similar vagueness is tried in other concepts as well: for everything that is clearly good above a threshold and clearly bad below it, use this "quantum behavior" at the borderline instead of a polarized hard switch.

Also, randomizing stuff is generically helpful but neutral, and neutral is arbitrary. Instead it would be superior when things are taken into a more fitting analogy to our chess-related perspective. In this case it's 2/3 of the time vs 1/3.

The TT is enriched with less determinism, and search is more versatile.

Just visualize the search tree: instead of being absolute and allocating all resources towards the one or the other direction, sometimes it will switch exploration, not only as a sanity check, but also for boosting truth seeking via transpositions.

Likewise, we humans, as scientists, philosophers, or whatever, are better off if, in between what we consider true and what we consider false, we include room for "maybe". Our perception is limited.

@vondele
Member

vondele commented Jul 10, 2020

@NKONSTANTAKIS the idea of randomly picking one or the other can be checked. First version here:
https://tests.stockfishchess.org/tests/view/5f08d53059f6f03532894803

@NKONSTANTAKIS

@vondele Ok, this is an interesting experiment, but I am wondering: what if we profit by biasing the randomness in one direction?

If this patch tweaks it to 1/3 vs 2/3, and you try 1/2 vs 1/2, imo another interesting experiment would be to tweak it even more, like 1/6 vs 5/6. This is with the intention of saving even more resources, as we will treat more outposts as bad.
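
The biased-randomness idea could be sketched as below. This is purely hypothetical: it is not the code of the test linked above, and `fuzzy_more_than_two`, `count_bits` and the `pTrue` parameter are names invented for this example. The function is exact below and above two set bits and only flips a biased coin at exactly two.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

using Bitboard = std::uint64_t;  // assumption: stand-in for the engine's bitboard type

// Portable bit count, one cleared bit per iteration.
constexpr int count_bits(Bitboard b) {
    int n = 0;
    for (; b; b &= b - 1) ++n;
    return n;
}

// Exact for fewer or more than two bits; at exactly two bits, answer
// "true" with probability pTrue (e.g. 1.0/3, 1.0/2 or 1.0/6).
template <class Rng>
bool fuzzy_more_than_two(Bitboard b, Rng& rng, double pTrue) {
    assert(pTrue >= 0.0 && pTrue <= 1.0);
    const int n = count_bits(b);
    if (n != 2)
        return n > 2;
    return std::bernoulli_distribution(pTrue)(rng);
}
```

Since the penalty fires when the function returns false, a smaller `pTrue` treats more borderline outposts as bad, which is the direction suggested above.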

@vondele
Member

vondele commented Jul 10, 2020

yes, that's why it is take 1..... I guess I need some statistics on the original implementation to see what the bias would be to have it most similar with the passed patch.

@mstembera
Contributor

Just FYI, if it turns out to be useful to implement more_than_two() correctly w/o popcount then it's

constexpr bool more_than_two(Bitboard b) {
  return more_than_one(b & (b - 1));
}
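
A quick way to convince oneself that the snippet above really means "three or more bits set" is to compare it against a plain bit count. The sketch below assumes `Bitboard` is a 64-bit unsigned integer and that `more_than_one` is the usual `b & (b - 1)` test; `count_bits` and `agrees_with_bit_count` are helpers invented for the check.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: stand-in for the engine's bitboard type

// Assumed definition: clearing the lowest set bit leaves something
// only if at least two bits were set.
constexpr bool more_than_one(Bitboard b) { return (b & (b - 1)) != 0; }

// The suggestion above: drop the lowest set bit, then ask whether
// more than one bit remains.
constexpr bool more_than_two(Bitboard b) { return more_than_one(b & (b - 1)); }

// Reference: count bits one at a time.
constexpr int count_bits(Bitboard b) {
    int n = 0;
    for (; b; b &= b - 1) ++n;
    return n;
}

// Compare both on every 16-bit pattern, placed at the bottom and at the top of the board.
inline bool agrees_with_bit_count() {
    for (Bitboard b = 0; b < (1ULL << 16); ++b) {
        const Bitboard hi = b << 48;
        if (more_than_two(b) != (count_bits(b) > 2)) return false;
        if (more_than_two(hi) != (count_bits(hi) > 2)) return false;
    }
    return true;
}

inline void check_more_than_two() { assert(agrees_with_bit_count()); }
```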

@SFisGOD
Contributor Author

SFisGOD commented Jul 10, 2020

@vondele
We consider a Black Knight on b3 represented by N, same as Sufi game above. The 1s represent the attacked squares by the knight.

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0
0 N 0 0 0 0 0 0
0 0 0 1 0 0 0 0
1 0 1 0 0 0 0 0

If there are white pieces on b1 and d1 only, then the bitboard is 1010.

b      1010
b-1    1001
b-2    1000

And so b & (b-1) & (b-2) => 1000 is true. So if there's a white piece on b1 then b & (b-1) & (b-2) is just like more_than_one

Same if there's a white piece on d1
In short, if Knight on b3 then b & (b-1) & (b-2) is more_than_one


For Black Knight on a3

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 0
N 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 1 0 0 0 0 0 0

If white pieces are on a1 and c1 then bitboard is 101. And b & (b-1) & (b-2) is false and so this triggers the penalty.

b & b-1 & b-2
1 & 1 & 0 -> 0
0 & 0 & 1 -> 0
1 & 0 & 1 -> 0

If there's an additional piece on the queenside then the outpost won't be penalized. So it works like the correct more_than_two

Same case if there are pieces on a1 and d1

In short, if there's a white piece on a1 then it works just like the correct more_than_two
If there's no piece on a1 then it behaves like more_than_one

Same for the other Black outpost squares on the queenside actually, you just need to check if there's a piece on a1 square.
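
The cases worked through above can be replayed mechanically. In this sketch the bit layout (bit 0 = a1, bit 1 = b1, and so on) and the name `patch_expression` are assumptions for illustration; the expression itself is the one analysed in this comment.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: bit 0 = a1, bit 1 = b1, bit 2 = c1, bit 3 = d1

// The expression under discussion, converted to a bool.
constexpr bool patch_expression(Bitboard b) { return (b & (b - 1) & (b - 2)) != 0; }

// Pieces on b1 and d1 (1010): true, so two pieces already count as "many".
static_assert(patch_expression(0b1010), "b1 + d1 behaves like more_than_one");

// Pieces on a1 and c1 (101), or on a1 and d1 (1001): false, the penalty can trigger.
static_assert(!patch_expression(0b0101), "a1 + c1 is not enough");
static_assert(!patch_expression(0b1001), "a1 + d1 is not enough");

// A third piece next to a1 flips the result, as a correct more_than_two would.
static_assert(patch_expression(0b1101), "a1 + c1 + d1 is enough");

// Runtime version of the same checks, for use outside constant expressions.
inline void replay_cases() {
    assert(patch_expression(0b1010));
    assert(!patch_expression(0b0101));
    assert(patch_expression(0b1101));
}
```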


Summary
Black Knight on Queenside outpost squares
Check if there's a piece on a1 square. If there's a piece then b & (b-1) & (b-2) is more_than_two
If no piece then more_than_one

Black Knight on Kingside outpost squares
b & (b-1) & (b-2) behaves just like more_than_one

White Knight on Queenside outpost squares
You need to check a1 but it's very rare for a black piece to be on a1 in the opening or early middlegame. If there's no piece on a1 then b & (b-1) & (b-2) is more_than_one

White Knight on Kingside outpost squares
b & (b-1) & (b-2) behaves just like more_than_one

So the b & (b-1) & (b-2) condition is really asymmetric. Most of the time, it behaves like more_than_one
But for a Black Knight outpost, it's a bit different.

Sadly though, more_than_one version is not looking good at STC
https://tests.stockfishchess.org/tests/view/5f08702559f6f035328947c0

@NKONSTANTAKIS

So "b & (b-1) & (b-2)" just happened to match the power pattern of different outposts, taking asymmetry into account too?
Sounds like hitting the lottery, but more realistic than my previous take, given the new data.

Can we verify that it's indeed asymmetric, meaning that the exact same position with colors reversed & deterministic search will produce different evals?

If yes, then let's try to match the performance with a symmetric take?
How about copying the black outpost behavior, as the Elo might have come from fixing the French Defense?

@SFisGOD
Contributor Author

SFisGOD commented Jul 11, 2020

@NKONSTANTAKIS By asymmetric, I mean that a Black Knight outpost on the queenside is treated a bit differently than in the other quadrants.

@NKONSTANTAKIS

NKONSTANTAKIS commented Jul 11, 2020

If the exact same position with reversed colors, i.e. with a black knight on a3, is evaluated differently than with a white knight on a6, then it's asymmetric with regard to color, something that we don't have in SF yet (but the NNs have).

But if K-side and Q-side outposts are asymmetric, then it's like our pawn PSQT.
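
Whether the expression itself is color-asymmetric can be checked on bare bitboards, without running the engine. The sketch assumes the bitboard handed to the expression is in absolute board coordinates (bit 0 = a1, bit 56 = a8) rather than flipped to the side to move; `patch_expression` and `flip_vertical` are names made up for the example.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: bit 0 = a1, bit 56 = a8

// The expression discussed in this thread, converted to a bool.
constexpr bool patch_expression(Bitboard b) { return (b & (b - 1) & (b - 2)) != 0; }

// Mirror the board top to bottom (rank 1 <-> rank 8), as a color swap would.
constexpr Bitboard flip_vertical(Bitboard b) {
    Bitboard r = 0;
    for (int rank = 0; rank < 8; ++rank)
        r |= ((b >> (8 * rank)) & 0xFFULL) << (8 * (7 - rank));
    return r;
}

// Two enemy pieces on a1 and c1, and the mirrored pair on a8 and c8.
constexpr Bitboard a1c1 = 0b101;
constexpr Bitboard a8c8 = flip_vertical(a1c1);

inline void check_color_asymmetry() {
    assert(!patch_expression(a1c1));  // a1 occupied: needs a third piece
    assert(patch_expression(a8c8));   // a1 empty: two pieces are enough
}
```

So the same two-piece configuration gets a different answer after mirroring, at least at the level of this one expression. Whether that shows up as different evals depends on which bitboards the evaluation actually passes in.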

@SFisGOD
Contributor Author

SFisGOD commented Jul 11, 2020

@NKONSTANTAKIS Actually maybe I'll try to do that too: check if there's a piece on a8 if we have a white knight outpost on the queenside, and maybe in other corners as well :)

@AlexandreMasta

AlexandreMasta commented Jul 11, 2020

Chess is asymmetric. The sides are not exactly symmetric... they are "mirrored". That difference is what you are talking about and "discovering" now. As with outposts, many other square values change because of the position of the black or white squares in comparison with each side. It would be symmetrical if, first of all, we really had 2 of each piece for both sides, and we all know that there is only one queen and one king. What's more, you would need an axis at the same distance from the initial pieces, passing through the exact center of the board. Yeah! Through the central point of the squares e4, e5, d4 and d5. Seems absurd on an 8x8 board. For the game to be symmetrical it would need a 9x9 board with 2 queens on each side and 2 bishops of the same color for each side. The problem is that all bishops would have to be the same color (lol).

On the contrary, we have each king facing the other, only one queen on each side, and an 8x8 board, which causes the asymmetric behavior.

A long time ago I said this here, but Lucas disagreed, and many other people continue trying to use symmetrical formulas. Chess is not symmetrical. It is "mirrored".

@Nordlandia

Nordlandia commented Jul 11, 2020

9x9 chess observation.

The light-squared bishop is worth more than the dark-squared one, because the e5 square can reach more squares and is the true center of the board. @AlexandreMasta

@vondele
Member

vondele commented Jul 11, 2020

@SFisGOD (et al.) thanks for the analysis. I can confirm that the function is equivalent to:

constexpr bool magic_more_than_two(Bitboard b) {
  return b & (1ULL << SQ_A1) ? more_than_two(b) : more_than_one(b);
}
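
The stated equivalence can be spot-checked by brute force. This sketch assumes `SQ_A1` is bit 0 and that `more_than_one` is the usual `b & (b - 1)` test; `patch_expression` and `matches_magic_form` are names invented for the check, and only a subset of all 64-bit boards is enumerated.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;  // assumption: stand-in for the engine's bitboard type
constexpr int SQ_A1 = 0;         // assumption: a1 is the least significant bit

constexpr bool more_than_one(Bitboard b) { return (b & (b - 1)) != 0; }
constexpr bool more_than_two(Bitboard b) { return more_than_one(b & (b - 1)); }

// The expression from the patch, converted to a bool.
constexpr bool patch_expression(Bitboard b) { return (b & (b - 1) & (b - 2)) != 0; }

// The closed form given above.
constexpr bool magic_more_than_two(Bitboard b) {
    return (b & (1ULL << SQ_A1)) ? more_than_two(b) : more_than_one(b);
}

// Try every 16-bit pattern low on the board, high on the board,
// and high on the board with a1 additionally occupied.
inline bool matches_magic_form() {
    for (Bitboard b = 0; b < (1ULL << 16); ++b) {
        const Bitboard hi = b << 48;
        const Bitboard hiWithA1 = hi | 1ULL;
        if (patch_expression(b) != magic_more_than_two(b)) return false;
        if (patch_expression(hi) != magic_more_than_two(hi)) return false;
        if (patch_expression(hiWithA1) != magic_more_than_two(hiWithA1)) return false;
    }
    return true;
}

inline void check_equivalence() { assert(matches_magic_form()); }
```

The intuition: subtracting 1 clears the lowest set bit, and subtracting 2 only clears a second bit when bit 0 was the lowest one, so the a1 square decides which of the two thresholds applies.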

@vondele vondele added the to be merged Will be merged shortly label Jul 11, 2020
@vondele
Member

vondele commented Jul 11, 2020

So, I'll merge this version with added comment, and we'll use the standard process to improve and generalize it.

@vondele vondele closed this in 1f3bd96 Jul 11, 2020
@vondele
Member

vondele commented Jul 11, 2020

Thanks!

Interesting patch for sure. I hope we can squeeze more out of it.

@vondele vondele mentioned this pull request Jul 11, 2020
@locutus2
Member

@mstembera

Just FYI, if it turns out to be useful to implement more_than_two() correctly w/o popcount then it's

constexpr bool more_than_two(Bitboard b) {
  return more_than_one(b & (b - 1));
}

I recently looked at this topic and came up with the same correct version of more_than_two as you. This is the one I would prefer too.
Have you or someone else tried this as a non-regression test? I haven't found any tests, but this is a long thread and perhaps I overlooked it. Otherwise I would start a test.

@vondele
Member

vondele commented Jul 11, 2020

At least the popcount version has been tried.

There is also https://tests.stockfishchess.org/tests/view/5f099c3359f6f03532894880 running, which goes further, but I'm not sure how to put that in the perspective of the 4x STC mentioned #2803 (comment)

@locutus2
Member

MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this pull request Jul 12, 2020
"

Test results:

---------------------------------------------------------------------------------------------------------
   1 HoSib              3103   0.0   12   12  1000  507.5  50.7  173  158  669  17.3  66.9  3097
   2 Honey-dev-071020   3097   5.2   12   12  1000  492.5  49.2  158  173  669  15.8  66.9  3103
---------------------------------------------------------------------------------------------------------
@vondele
Member

vondele commented Jul 12, 2020

I had a look at how this influences the response to 1 e4, and this patch indeed avoids 1 .. e6 in favor of 1 .. c5, i.e. at depth >35 it goes for c5 in a stable way, whereas before e6 was still present once at depth 46.

as a function of depth (1 core, bestmove depth 1 - 50):
before the patch:

 e7e6 e7e6 b7b6 d7d5 d7d5 d7d5 d7d5 d7d5 d7d5 d7d5
 c7c5 c7c5 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6
 e7e6 e7e6 e7e6 e7e6 e7e6 c7c5 c7c5 e7e6 e7e6 e7e6
 e7e6 e7e6 e7e6 e7e6 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5
 c7c5 c7c5 c7c5 c7c5 c7c5 e7e6 c7c5 c7c5 c7c5 c7c5

after the patch

 e7e6 e7e6 b7b6 d7d5 d7d5 d7d5 d7d5 d7d5 d7d5 d7d5
 c7c5 c7c5 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6 e7e6
 e7e6 e7e6 e7e6 e7e6 e7e6 c7c5 c7c5 c7c5 c7c5 c7c5
 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5
 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5 c7c5

@ssj100

ssj100 commented Jul 13, 2020

@vondele You mention analysing with 1 core. What hash and contempt settings are you using? I can't reproduce your results with 1 core here with 128 MB hash and contempt 0. SF wants e6 from depths 27-37, switches to c5 from depths 38-40, and then switches back to e6 from depths 41-42. Currently I have it switching back to c5 from depth 43....

@vondele
Member

vondele commented Jul 17, 2020

@ssj100 that was with default contempt, and 512 Hash:

setoption name Hash value 512
position startpos moves e2e4
go depth 50
ucinewgame

but I'm sure this is a bit of a poor test anyway.

noobpwnftw pushed a commit to noobpwnftw/Stockfish that referenced this pull request Aug 15, 2020
In some French games, Stockfish likes to bring the Knight to a bad outpost spot. This is evident in TCEC S18 Superfinal Game 63, where there is a Knight outpost on the queenside that is actually useless. Stockfish is effectively playing a piece down while holding ground against Leela's break on the kingside.

This patch turns the +56 mg bonus for a Knight outpost into a -7 mg penalty if it satisfies the following conditions:

* The outpost square is not on the CenterFiles (i.e. not on files C,D,E and F)
* The knight is not attacking non pawn enemies.
* The side where the outpost is located contains only few enemies, with a particular conditional_more_than_two() implementation

Thank you to apospa...@gmail.com for bringing this to our attention and for providing insights.
See https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/dEXNzSIBgZU
Reference game: https://tcec-chess.com/#div=sf&game=63&season=18

Passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 6960 W: 1454 L: 1247 D: 4259
Ptnml(0-2): 115, 739, 1610, 856, 160
https://tests.stockfishchess.org/tests/view/5f08221059f6f0353289477e

Passed LTC:
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 21440 W: 2767 L: 2543 D: 16130
Ptnml(0-2): 122, 1904, 6462, 2092, 140
https://tests.stockfishchess.org/tests/view/5f0838ed59f6f035328947a2

various related tests show strong test results, but so far no generalizations or simplifications of conditional_more_than_two() are found. See PR for details.

closes official-stockfish#2803

Bench: 4366686