Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[EXPERIMENT] NNUE #21

Closed
wants to merge 55 commits into from
Closed

[EXPERIMENT] NNUE #21

wants to merge 55 commits into from

Conversation

niklasf
Copy link
Member

@niklasf niklasf commented Aug 15, 2020

Loading a NNUE file.

Performance is horrible. Very roughly, for go depth 25 in starting position:

  • 1600 knps native x86-64-sse41-popcnt
  • 1000 knps native x86-64-sse41-popcnt nnue
  • 1000 knps wasm master branch
  • 10 knps wasm nnue
  • 60 knps wasm nnue with -msimd128 (allows: sse, ssse3, sse41), requires --experimental-wasm-simd flag in Chrome

Not known where performance is lost.

nodchip and others added 30 commits August 6, 2020 16:37
This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.

Both the NNUE and the classical evaluations are available, and can be used to
assign a value to a position that is later used in alpha-beta (PVS) search to find the
best move. The classical evaluation computes this value as a function of various chess
concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation
computes this value with a neural network based on basic inputs. The network is optimized
and trained on the evalutions of millions of positions at moderate search depth.

The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward.
It can be evaluated efficiently on CPUs, and exploits the fact that only parts
of the neural network need to be updated after a typical chess move.
[The nodchip repository](https://github.com/nodchip/Stockfish) provides additional
tools to train and develop the NNUE networks.

This patch is the result of contributions of various authors, from various communities,
including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather,
rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler,
dorzechowski, and vondele.

This new evaluation needed various changes to fishtest and the corresponding infrastructure,
for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.

The first networks have been provided by gekkehenker and sergiovieri, with the latter
net (nn-97f742aaefcd.nnue) being the current default.

The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option,
provided the `EvalFile` option points the the network file (depending on the GUI, with full path).

The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on
the hardware, and is expected to improve quickly, but is currently on > 80 Elo on fishtest:

60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885

40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017

At the same time, the impact on the classical evaluation remains minimal, causing no significant
regression:

sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520

sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68

The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.

Guidelines for testing new nets can be found at
https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests

Integration has been discussed in various issues:
official-stockfish/Stockfish#2823
official-stockfish/Stockfish#2728

The integration branch will be closed after the merge:
official-stockfish/Stockfish#2825
https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip

closes official-stockfish/Stockfish#2912

This will be an exciting time for computer chess, looking forward to seeing the evolution of
this approach.

Bench: 4746616
The idea is to use NNUE only on quite balanced material positions. This bring a big speedup on research since NNUE eval is slower than classical eval for most of the hardwares and specially on unbalanced positions with LazyEval.

STC: https://tests.stockfishchess.org/tests/view/5f2c2680b3ebe5cbfee85b61
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 3168 W: 560 L: 400 D: 2208
Ptnml(0-2): 21, 294, 819, 404, 46

LTC: https://tests.stockfishchess.org/tests/view/5f2c2ca6b3ebe5cbfee85b69
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 3200 W: 287 L: 183 D: 2730
Ptnml(0-2): 4, 149, 1191, 251, 5

closes official-stockfish/Stockfish#2916

Bench 4746616
STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 10608 W: 1507 L: 1358 D: 7743
Ptnml(0-2): 94, 945, 3074, 1100, 91
https://tests.stockfishchess.org/tests/view/5f2c5921b3ebe5cbfee85b8b

LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 7536 W: 556 L: 448 D: 6532
Ptnml(0-2): 9, 383, 2881, 481, 14
https://tests.stockfishchess.org/tests/view/5f2c6f4461e3b6af64881e95

closes official-stockfish/Stockfish#2919

Bench: 4746616
Passed STC:
https://tests.stockfishchess.org/tests/view/5f2aa49fa5abc164f05e4d1b
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 40888 W: 7977 L: 7726 D: 25185
Ptnml(0-2): 665, 4806, 9333, 4893, 747

Passed LTC:
https://tests.stockfishchess.org/tests/view/5f2b1059b3ebe5cbfee85ae7
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 51264 W: 6445 L: 6134 D: 38685
Ptnml(0-2): 328, 4564, 15580, 4789, 371

closes official-stockfish/Stockfish#2920

bench: 4314943
STC https://tests.stockfishchess.org/tests/view/5f2955b1a5abc164f05e4c85
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 29216 W: 5560 L: 5416 D: 18240
Ptnml(0-2): 466, 3329, 6902, 3417, 494

LTC https://tests.stockfishchess.org/tests/view/5f299154a5abc164f05e4ca1
LLR: 2.92 (-2.94,2.94) {-1.50,0.50}
Total: 54144 W: 6635 L: 6594 D: 40915
Ptnml(0-2): 372, 4859, 16536, 4966, 339

closes official-stockfish/Stockfish#2910

Bench: 4609008
This alllows to simplify the code because the move counter haven't to be
decremented later if a move isn't legal. As a side effect now illegal
pruned moves doesn't included anymore in move counter. So slightly less
pruning and reductions are done.

STC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 111016 W: 21106 L: 21077 D: 68833
Ptnml(0-2): 1830, 13083, 25736, 12946, 1913
https://tests.stockfishchess.org/tests/view/5f28816fa5abc164f05e4c26

LTC:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 39264 W: 4909 L: 4843 D: 29512
Ptnml(0-2): 263, 3601, 11854, 3635, 279
https://tests.stockfishchess.org/tests/view/5f297902a5abc164f05e4c8e

closes official-stockfish/Stockfish#2906

Bench: 4390086
Net created at 20200806-1802

passed STC:
https://tests.stockfishchess.org/tests/view/5f2d00b461e3b6af64881f21
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 6672 W: 1052 L: 898 D: 4722
Ptnml(0-2): 63, 600, 1868, 730, 75

passed LTC:
https://tests.stockfishchess.org/tests/view/5f2d052a61e3b6af64881f29
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 7576 W: 573 L: 463 D: 6540
Ptnml(0-2): 8, 392, 2889, 480, 19

closes official-stockfish/Stockfish#2923

Bench: 4390086
STC https://tests.stockfishchess.org/tests/view/5f2d237161e3b6af64881f43
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 12712 W: 1823 L: 1664 D: 9225
Ptnml(0-2): 122, 1166, 3627, 1313, 128

LTC https://tests.stockfishchess.org/tests/view/5f2d473061e3b6af64881f6f
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 12104 W: 912 L: 788 D: 10404
Ptnml(0-2): 13, 665, 4582, 769, 23

closes official-stockfish/Stockfish#2930

bench: 4271421
Allow any pawn in front of a minor piece to replace the pawn protection
requirement for outposts.

  +-------+  +-------+
  | . . o |  | o . . |    o  Their pawns
  | . o x |  | o . . |    x  Our pawns
  | o N . |  | x o B |  N,B  New (reachable) outpost
  | . . . |  | . _ . |    _  Reachable square behind a pawn
  +-------+  +-------+
  N outpost  B reaches
               outpost

  We want outposts to be secured by pawns against major pieces. If
a minor is shielded by any pawn from above, it is rarely at the same
time protected by our pawn attacks from below. However, the pawn shield
in itself offers some degree of protection.
  A pawn shield will now suffice to replace the pawn protection for the
outpost (and reachable outpost) bonus.

This effect stacks with the existing "minor behind pawn" bonus.

STC
https://tests.stockfishchess.org/tests/view/5f2bcd14b3ebe5cbfee85b2c
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 27248 W: 5353 L: 5119 D: 16776
Ptnml(0-2): 462, 3174, 6185, 3274, 529

LTC
https://tests.stockfishchess.org/tests/view/5f2bfef5b3ebe5cbfee85b5a
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 99432 W: 12580 L: 12130 D: 74722
Ptnml(0-2): 696, 8903, 30049, 9391, 677

Closes #2935

Bench: 4143673
Reintroduce vondele's late irreversible move extension for fortress keeping.
This was removed when we only had classical eval.
Now that we have the NNUE net, it seems that this is useful again.

STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 5352 W: 787 L: 653 D: 3912
Ptnml(0-2): 34, 451, 1579, 571, 41
https://tests.stockfishchess.org/tests/view/5f2dc8ad61e3b6af64881ff0

LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 14416 W: 1013 L: 891 D: 12512
Ptnml(0-2): 15, 722, 5623, 822, 26
https://tests.stockfishchess.org/tests/view/5f2e0e3661e3b6af6488201e

closes official-stockfish/Stockfish#2936

Bench: 4154696
This patch increases LMRdepth threshold for futility pruning at parent nodes so it can apply more often.
With radical change to evaluation approach it seems that search is really far from optimal state, especially it parts that use static evaluation of position.

passed STC
https://tests.stockfishchess.org/tests/view/5f2da75661e3b6af64881fd0
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 8744 W: 1305 L: 1156 D: 6283
Ptnml(0-2): 75, 789, 2500, 928, 80

passed LTC
https://tests.stockfishchess.org/tests/view/5f2dcb2a61e3b6af64881ff3
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 17728 W: 1256 L: 1117 D: 15355
Ptnml(0-2): 22, 961, 6774, 1070, 37

Bench: 4067325
STC https://tests.stockfishchess.org/tests/view/5f2deb1661e3b6af6488200f
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 10376 W: 1481 L: 1359 D: 7536
Ptnml(0-2): 91, 953, 2981, 1069, 94

LTC: https://tests.stockfishchess.org/html/live_elo.html?5f2e0a0461e3b6af64882019
LLR: 2.99 (-2.94,2.94) {-1.50,0.50}
Total: 5040 W: 375 L: 315 D: 4350
Ptnml(0-2): 7, 263, 1926, 311, 13

closes official-stockfish/Stockfish#2934

Bench: 4067325
Possible after the recent reording pos.legal(move) check

official-stockfish/Stockfish#2941

No functional change.
STC: https://tests.stockfishchess.org/tests/view/5f2dc38561e3b6af64881fec
LLR: 2.99 (-2.94,2.94) {-0.50,1.50}
Total: 6120 W: 903 L: 758 D: 4459
Ptnml(0-2): 44, 535, 1775, 644, 62

LTC: https://tests.stockfishchess.org/tests/view/5f2dd55f61e3b6af64882003
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 7424 W: 577 L: 463 D: 6384
Ptnml(0-2): 16, 375, 2824, 473, 24

closes official-stockfish/Stockfish#2942

bench 4107833
This patch lines up with other patches which use better eval to produce more aggressive cutoffs based on static evaluation of position, it allows more aggressive futility pruning for captures - so now we will be producing them with bigger evaluation of position, so more often.

passed STC
https://tests.stockfishchess.org/tests/view/5f2da79e61e3b6af64881fd2
LLR: 3.87 (-2.94,2.94) {-0.50,1.50}
Total: 27256 W: 3809 L: 3593 D: 19854
Ptnml(0-2): 221, 2578, 7830, 2762, 237

passed LTC
https://tests.stockfishchess.org/tests/view/5f2df92061e3b6af64882012
LLR: 4.97 (-2.94,2.94) {0.25,1.75}
Total: 43624 W: 3095 L: 2820 D: 37709
Ptnml(0-2): 66, 2410, 16608, 2639, 89

closes official-stockfish/Stockfish#2946

Bench: 4272280
This patch tries to run multiple LTO threads in parallel, speeding up
the build process of optimized builds if the -j make parameter is used.
This mitigates the longer linking times of optimized builds since the
integration of the NNUE code. Roughly 2x build speedup.

I've tried a similar patch some two years ago but it ran into trouble
with old compiler versions then. Since we're on the C++17 standard now
these old compilers should be obsolete.

closes official-stockfish/Stockfish#2943

No functional change.
Tweak depth.

STC https://tests.stockfishchess.org/tests/view/5f2d22ec61e3b6af64881f40
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 17984 W: 2603 L: 2441 D: 12940
Ptnml(0-2): 133, 1751, 5094, 1849, 165

LTC https://tests.stockfishchess.org/tests/view/5f2d5a6a61e3b6af64881f7f
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 85808 W: 5956 L: 5621 D: 74231
Ptnml(0-2): 149, 4748, 32785, 5063, 159

closes official-stockfish/Stockfish#2950

fixes two README.md typos:
fixes official-stockfish/Stockfish#2932

bench: 4022669
All credit to Vizvezdenec, the original author of the idea.

STC https://tests.stockfishchess.org/tests/view/5f2d606a61e3b6af64881f88
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 8440 W: 1191 L: 1048 D: 6201
Ptnml(0-2): 59, 754, 2467, 865, 75

LTC https://tests.stockfishchess.org/tests/view/5f2d84ad61e3b6af64881fbd
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 21896 W: 1557 L: 1406 D: 18933
Ptnml(0-2): 33, 1185, 8378, 1302, 50

closes official-stockfish/Stockfish#2951

bench: 4084753
small rewording, but also print the download url for the default net.

closes official-stockfish/Stockfish#2954

No functional change
introduced with d7a2689

closes official-stockfish/Stockfish#2959

No functional change.
The idea of this patch is that positions are usually more complex and hard to evaluate even if there are more pawns.
This patch adjusts NNUE threshold usage depending on number of pawns in position, if pawn count is <3 we use the
classical evaluation more often, for pawn count = 3 patch the is non-functional,
with pawn count > 3 NNUE evaluation is used more often.

passed STC
https://tests.stockfishchess.org/tests/view/5f2f02d09081672066536b1f
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 36520 W: 5011 L: 4823 D: 26686
Ptnml(0-2): 299, 3482, 10548, 3594, 337

passed LTC
https://tests.stockfishchess.org/tests/view/5f2f4c329081672066536b5c
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 39272 W: 2630 L: 2433 D: 34209
Ptnml(0-2): 53, 2066, 15218, 2229, 70

closes official-stockfish/Stockfish#2960

bench 4084753
after some testing, no version of MinGW/gcc has been found where this code is still necessary.
Probably older code (pre-c++17?)

closes official-stockfish/Stockfish#2891

No functional change
the stateInfo at the rootPos is no longer read-only, as the NNUE accumulator is part of it.
Threads can thus not share this object and need their own copy.

tested for no regression
https://tests.stockfishchess.org/tests/view/5f3022239081672066536bce
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 52800 W: 6843 L: 6802 D: 39155
Ptnml(0-2): 336, 4646, 16399, 4679, 340

closes official-stockfish/Stockfish#2957

fixes official-stockfish/Stockfish#2933

No functional change
This reverts commit a6e8929.

The offending setup has been found as gcc/mingw 7.3 (on Ubuntu 18.04).

fixes official-stockfish/Stockfish#2963

closes official-stockfish/Stockfish#2968

No functional change.
First trained net using search eval instead of pv leaf static eval.

Net created at: 20200810-0744

passed STC: https://tests.stockfishchess.org/tests/view/5f30995d90816720665373f8
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 15416 W: 2071 L: 1920 D: 11425
Ptnml(0-2): 123, 1376, 4563, 1519, 127

passed LTC: https://tests.stockfishchess.org/tests/view/5f30a104908167206653742b
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 29792 W: 2003 L: 1834 D: 25955
Ptnml(0-2): 50, 1541, 11550, 1700, 55

closes official-stockfish/Stockfish#2966

Bench: 4084753
STC https://tests.stockfishchess.org/tests/view/5f3059d1908167206653736b:
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 12520 W: 766 L: 727 D: 11027
Ptnml(0-2): 13, 624, 4949, 659, 15

LTC: https://tests.stockfishchess.org/tests/view/5f30863a90816720665373d1
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 12520 W: 766 L: 727 D: 11027
Ptnml(0-2): 13, 624, 4949, 659, 15

closes: official-stockfish/Stockfish#2965

Bench: 4084753
despite usage of alignas, the generated (avx2/avx512) code with older compilers needs to use
unaligned loads with older gcc (e.g. confirmed crash with gcc 7.3/mingw on abrok).

Better performance thus requires gcc >= 9 on hardware supporting avx2/avx512

closes official-stockfish/Stockfish#2969

No functional change
Extend castling only if there are few friendly pieces on the castling side.

Inspired by silversolver1's (Rahul Dsilva) test
https://tests.stockfishchess.org/tests/view/5f0fef560640035f9d2978cf

STC:
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 7096 W: 947 L: 818 D: 5331
Ptnml(0-2): 32, 604, 2181, 665, 66
https://tests.stockfishchess.org/tests/view/5f309f729081672066537426

LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 4712 W: 300 L: 215 D: 4197
Ptnml(0-2): 2, 190, 1895, 259, 10
https://tests.stockfishchess.org/tests/view/5f30a2039081672066537430

closes official-stockfish/Stockfish#2970

Bench: 4094850
Makefile targets x86-64-sse42, x86-sse3 are removed; x86-64-sse41
is renamed to x86-64-sse41-popcnt (it did enable popcnt).

Makefile variables sse3, sse42, their associated compilation flags
and code in misc.cpp are removed.

closes official-stockfish/Stockfish#2922

No functional change
mstembera and others added 23 commits August 10, 2020 14:38
AVX512 +4% faster
AVX2 +1% faster
SSSE3 +5% faster

passed non-regression STC:
STC https://tests.stockfishchess.org/tests/view/5f31249f90816720665374f6
LLR: 2.96 (-2.94,2.94) {-1.50,0.50}
Total: 17576 W: 2344 L: 2245 D: 12987
Ptnml(0-2): 127, 1570, 5292, 1675, 124

closes official-stockfish/Stockfish#2962

No functional change
This patch allows old x86 CPUs, from AMD K8 (which the x86-64 baseline
targets) all the way down to the Pentium MMX, to benefit from NNUE with
comparable performance hit versus hand-written eval as on more modern
processors.

NPS of the bench with NNUE enabled on a Pentium III 1.13 GHz (using the
MMX code):
  master: 38951
  this patch: 80586

NPS of the bench with NNUE enabled using baseline x86-64 arch, which is
how linux distros are likely to package stockfish, on a modern CPU
(using the SSE2 code):
  master: 882584
  this patch: 1203945

closes official-stockfish/Stockfish#2956

No functional change.
STC https://tests.stockfishchess.org/tests/view/5f31219090816720665374ec
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 3376 W: 487 L: 359 D: 2530
Ptnml(0-2): 17, 253, 1042, 337, 39

LTC https://tests.stockfishchess.org/tests/view/5f3127f79081672066537502
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 8360 W: 581 L: 475 D: 7304
Ptnml(0-2): 11, 407, 3238, 513, 11

closes official-stockfish/Stockfish#2971

bench: 4733874
and rename a variable

closes official-stockfish/Stockfish#2819

No functional change
This adds -save-temps to the linker flags when parallel LTO is used on
MinGW/MSYS.

fixes #2977

closes official-stockfish/Stockfish#2978

No functional change.
Move to posix_memalign for those platforms, in particular android,
that do not fully support c++17 std::aligned_alloc() (and are not windows)

see official-stockfish/Stockfish#2860

closes official-stockfish/Stockfish#2973

No functional change
avoids an intrinsic that is missing in gcc < 10.

For this target, might trigger another gcc bug on windows that
requires up-to-date gcc 8, 9, or 10, or usage of clang.

Fixes official-stockfish/Stockfish#2975

closes official-stockfish/Stockfish#2976

No functional change
…rofile-build) of the NNUE part of the code.

Joint work gvreuls / vondele

* Download the default NNUE net in AppVeyor
* Download net in travis CI `make net`
* Adjust tests to cover more archs, speedup instrumented testing
* Introduce 'mixed' bench as default, with further options:

classical, NNUE, mixed.

mixed (default) and NNUE require the default net to be present,
which can be obtained with

```
make net
```

Further examples (first is equivalent to `./stockfish bench`):

```
./stockfish bench 16 1 13 default depth mixed
./stockfish bench 16 1 13 default depth classical
./stockfish bench 16 1 13 default depth NNUE
```

The net is now downloaded automatically if needed for `profile-build`
(usual `build` works fine without net present)

PGO gives a nice speedup on fishtest:

passed STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 3360 W: 469 L: 343 D: 2548
Ptnml(0-2): 20, 246, 1030, 356, 28
https://tests.stockfishchess.org/tests/view/5f31b5499081672066537569

passed LTC:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 8824 W: 609 L: 502 D: 7713
Ptnml(0-2): 8, 430, 3438, 519, 17
https://tests.stockfishchess.org/tests/view/5f31c87b908167206653757c

closes official-stockfish/Stockfish#2931

fixes official-stockfish/Stockfish#2907

requires fishtest updates before commit

Bench: 4290577
Change condition from three friendly pieces to two. This now means that we only extend castling on the king side if there are no other friendly pieces aside from king and rook. For the queen side, we only extend if there is only a rook and another friendly piece or if there is only a single rook and no other friendly piece but this is very rare.

STC:
LLR: 3.20 (-2.94,2.94) {-0.50,1.50}
Total: 31144 W: 4086 L: 3903 D: 23155
Ptnml(0-2): 227, 2843, 9278, 2968, 256
https://tests.stockfishchess.org/tests/view/5f31487f9081672066537516

LTC:
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 57816 W: 3786 L: 3538 D: 50492
Ptnml(0-2): 92, 2991, 22488, 3251, 86
https://tests.stockfishchess.org/tests/view/5f3167c3908167206653753d

closes official-stockfish/Stockfish#2980

Bench: 4244812
this workaround is possibly rather a windows & gcc specific problem. See e.g.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c25

on Linux with gcc 8 this patch brings roughly a 8% speedup.
However, probably needs some testing in the wild.

includes a workaround for an old msys make (3.81) installation (fixes #2984)

No functional change
fails to build on that target, because of missing Intel Intrinsics.
macOS has posix_memalign() since ~2014 so we can simplify the code and just use that for all Apple platforms.

closes official-stockfish/Stockfish#2985

No functional change.
Adds support for Vector Neural Network Instructions (avx512), as available on Intel Cascade Lake

The _mm512_dpbusd_epi32() intrinsic (vpdpbusd instruction) is taylor made for NNUE.

on a cascade lake CPU (AWS C5.24x.large, gcc 10) NNUE eval is at roughly 78% nps of classical
(single core test)

bench 1024 1 24 default depth:
target 	classical 	NNUE 	ratio
vnni 	2207232 	1725987 	78.20
avx512 	2216789 	1671734 	75.41
avx2 	2194006 	1611263 	73.44
modern 	2185001 	1352469 	61.90

closes official-stockfish/Stockfish#2987

No functional change
was missing in the list of outputs, slightly reorder flags.
explicitly add -msse2 if USE_SSE2 (is implicit already, -msse -m64).

closes official-stockfish/Stockfish#2990

No functional change.
Net created at: 20200812-2257

passed STC: https://tests.stockfishchess.org/tests/view/5f340ca99e5f2effc089da17
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 5744 W: 756 L: 627 D: 4361
Ptnml(0-2): 28, 485, 1731, 586, 42

passed LTC: https://tests.stockfishchess.org/tests/view/5f341eba9e5f2effc089da23
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 17136 W: 1041 L: 917 D: 15178
Ptnml(0-2): 13, 813, 6807, 907, 28

closes official-stockfish/Stockfish#2992

Bench: 3935117
Do not show the details of the default architecture for a simple "make help"
invocation, as the details are most likely to confuse beginners. Instead we
make it clear which architecture is the default and put an example at the end
of the Makefile as an incentative to use "make help ARCH=blah" to discover
the flags used by the different architectures.

```
    make help
    make help ARCH=x86-64-ssse3
```

Also clean-up and modernize a bit the Makefile examples while at it.

closes official-stockfish/Stockfish#2996

No functional change
check SHA of the available and downloaded file.

Document the format requirement on the default net.

Also allow curl to make possibly insecure connections, as needed for old curl.

fixes official-stockfish/Stockfish#2998

closes official-stockfish/Stockfish#3000

No functional change.
Move the existing dampening function last so that NNUE evaluations are
also handled as we approach the 50 move rule.

STC:
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 4792 W: 695 L: 561 D: 3536
Ptnml(0-2): 19, 420, 1422, 478, 57
https://tests.stockfishchess.org/tests/view/5f3164179081672066537534

LTC:
LLR: 8.62 (-2.94,2.94) {0.25,1.75}
Total: 286744 W: 18494 L: 17430 D: 250820
Ptnml(0-2): 418, 14886, 111745, 15860, 463
https://tests.stockfishchess.org/tests/view/5f316b039081672066537541

closes official-stockfish/Stockfish#3004

Bench: 4001800
The idea is that since we are mixing NNUE and classical evals matching their magnitudes closer allows for better comparisons.

STC https://tests.stockfishchess.org/tests/view/5f35a65411a9b1a1dbf18e2b
LLR: 2.94 (-2.94,2.94) {-0.50,1.50}
Total: 9840 W: 1150 L: 1027 D: 7663
Ptnml(0-2): 49, 772, 3175, 855, 69

LTC https://tests.stockfishchess.org/tests/view/5f35bcbe11a9b1a1dbf18e47
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 44424 W: 2492 L: 2294 D: 39638
Ptnml(0-2): 42, 2015, 17915, 2183, 57

also corrects the location to clamp the evaluation (non-function on bench).

closes official-stockfish/Stockfish#3003

bench: 3905447
@vondele
Copy link
Contributor

vondele commented Aug 16, 2020

@niklasf
one guess, you set sse3 = yes in the Makefile, should probably be ssse3 = yes (one more s)

@niklasf
Copy link
Member Author

niklasf commented Aug 16, 2020

Oh, wow, thanks! That would have taken me forever to find. Unfortunately it didn't do much for nps.

@niklasf niklasf closed this Dec 14, 2020
@hi-ogawa hi-ogawa mentioned this pull request Feb 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet