[EXPERIMENT] NNUE #21

niklasf · 2020-08-15T17:06:54Z

Loading a NNUE file.

Performance is horrible. Very roughly, for `go depth 25` from the starting position:

1600 knps native x86-64-sse41-popcnt
1000 knps native x86-64-sse41-popcnt nnue
1000 knps wasm master branch
10 knps wasm nnue
60 knps wasm nnue with -msimd128 (allows: sse, ssse3, sse41), requires --experimental-wasm-simd flag in Chrome

It is not yet known where the performance is lost.

This patch ports the efficiently updatable neural network (NNUE) evaluation to Stockfish.

Both the NNUE and the classical evaluations are available, and can be used to assign a value to a position that is later used in alpha-beta (PVS) search to find the best move. The classical evaluation computes this value as a function of various chess concepts, handcrafted by experts, tested and tuned using fishtest. The NNUE evaluation computes this value with a neural network based on basic inputs. The network is optimized and trained on the evaluations of millions of positions at moderate search depth.

The NNUE evaluation was first introduced in shogi, and ported to Stockfish afterward. It can be evaluated efficiently on CPUs, and exploits the fact that only parts of the neural network need to be updated after a typical chess move. [The nodchip repository](https://github.com/nodchip/Stockfish) provides additional tools to train and develop the NNUE networks.

This patch is the result of contributions of various authors, from various communities, including: nodchip, ynasu87, yaneurao (initial port and NNUE authors), domschl, FireFather, rqs, xXH4CKST3RXx, tttak, zz4032, joergoster, mstembera, nguyenpham, erbsenzaehler, dorzechowski, and vondele.

This new evaluation needed various changes to fishtest and the corresponding infrastructure, for which tomtor, ppigazzini, noobpwnftw, daylen, and vondele are gratefully acknowledged.

The first networks have been provided by gekkehenker and sergiovieri, with the latter net (nn-97f742aaefcd.nnue) being the current default.

The evaluation function can be selected at run time with the `Use NNUE` (true/false) UCI option, provided the `EvalFile` option points to the network file (depending on the GUI, with full path).
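The "only parts of the network need to be updated" idea can be illustrated with a toy first layer. This is a minimal sketch, not Stockfish's implementation: the feature count, layer width, random weights, and the names `refresh` and `update` are all assumptions made for illustration.

```python
# Toy sketch of an incrementally updated first layer (not Stockfish's code).
# All sizes, weights, and function names are illustrative assumptions.
import random

NUM_FEATURES = 16   # toy number of binary input features
HIDDEN = 4          # toy width of the first layer

random.seed(0)
# weights[f] is the column added to the accumulator when feature f is active
weights = [[random.randint(-8, 8) for _ in range(HIDDEN)]
           for _ in range(NUM_FEATURES)]
bias = [random.randint(-8, 8) for _ in range(HIDDEN)]


def refresh(active):
    """Recompute the accumulator from scratch: bias plus all active columns."""
    acc = list(bias)
    for f in active:
        for i in range(HIDDEN):
            acc[i] += weights[f][i]
    return acc


def update(acc, removed, added):
    """Return a new accumulator, touching only the features that changed."""
    new = list(acc)
    for f in removed:
        for i in range(HIDDEN):
            new[i] -= weights[f][i]
    for f in added:
        for i in range(HIDDEN):
            new[i] += weights[f][i]
    return new


before = {1, 5, 9}
after = {1, 5, 12}            # one feature disappeared (9), one appeared (12)
acc = update(refresh(before), removed={9}, added={12})
assert acc == refresh(after)  # incremental result matches a full recompute
```

A quiet move changes only a handful of input features, so the incremental path does work proportional to the number of changed features rather than to the number of active ones.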
The performance of the NNUE evaluation relative to the classical evaluation depends somewhat on the hardware, and is expected to improve quickly, but is currently at > 80 Elo on fishtest:

60000 @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f28fe6ea5abc164f05e4c4c
ELO: 92.77 +-2.1 (95%) LOS: 100.0%
Total: 60000 W: 24193 L: 8543 D: 27264
Ptnml(0-2): 609, 3850, 9708, 10948, 4885

40000 @ 20+0.2 th 8
https://tests.stockfishchess.org/tests/view/5f290229a5abc164f05e4c58
ELO: 89.47 +-2.0 (95%) LOS: 100.0%
Total: 40000 W: 12756 L: 2677 D: 24567
Ptnml(0-2): 74, 1583, 8550, 7776, 2017

At the same time, the impact on the classical evaluation remains minimal, causing no significant regression:

sprt @ 10+0.1 th 1
https://tests.stockfishchess.org/tests/view/5f2906a2a5abc164f05e4c5b
LLR: 2.94 (-2.94,2.94) {-6.00,-4.00}
Total: 34936 W: 6502 L: 6825 D: 21609
Ptnml(0-2): 571, 4082, 8434, 3861, 520

sprt @ 60+0.6 th 1
https://tests.stockfishchess.org/tests/view/5f2906cfa5abc164f05e4c5d
LLR: 2.93 (-2.94,2.94) {-6.00,-4.00}
Total: 10088 W: 1232 L: 1265 D: 7591
Ptnml(0-2): 49, 914, 3170, 843, 68

The needed networks can be found at https://tests.stockfishchess.org/nns
It is recommended to use the default one as indicated by the `EvalFile` UCI option.

Guidelines for testing new nets can be found at https://github.com/glinscott/fishtest/wiki/Creating-my-first-test#nnue-net-tests

Integration has been discussed in various issues:
official-stockfish/Stockfish#2823
official-stockfish/Stockfish#2728

The integration branch will be closed after the merge:
official-stockfish/Stockfish#2825
https://github.com/official-stockfish/Stockfish/tree/nnue-player-wip

closes official-stockfish/Stockfish#2912

This will be an exciting time for computer chess, looking forward to seeing the evolution of this approach.

Bench: 4746616
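The Elo figures quoted above for the two fixed-game tests can be approximately reproduced from the W/L/D counts. The sketch below assumes the usual logistic Elo model, which is an assumption about how the reported numbers were derived; the function name `elo_from_wld` is made up for this example.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Elo difference implied by a match result under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games      # average points per game
    return 400.0 * math.log10(score / (1.0 - score))


# W/L/D counts copied from the two fixed-game tests above
print(round(elo_from_wld(24193, 8543, 27264), 2))
print(round(elo_from_wld(12756, 2677, 24567), 2))
```

Both values should land close to the 92.77 and 89.47 reported by fishtest.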

The idea is to use NNUE only on positions with fairly balanced material. This brings a big speedup in search, since NNUE eval is slower than classical eval on most hardware, especially on unbalanced positions with LazyEval. STC: https://tests.stockfishchess.org/tests/view/5f2c2680b3ebe5cbfee85b61 LLR: 2.95 (-2.94,2.94) {-0.50,1.50} Total: 3168 W: 560 L: 400 D: 2208 Ptnml(0-2): 21, 294, 819, 404, 46 LTC: https://tests.stockfishchess.org/tests/view/5f2c2ca6b3ebe5cbfee85b69 LLR: 2.98 (-2.94,2.94) {0.25,1.75} Total: 3200 W: 287 L: 183 D: 2730 Ptnml(0-2): 4, 149, 1191, 251, 5 closes official-stockfish/Stockfish#2916 Bench 4746616
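A rough sketch of the selection rule described above, assuming the decision is made from a cheap material-balance score. The threshold value and the name `choose_eval` are hypothetical and are not taken from the patch.

```python
# Hypothetical sketch: choose the evaluator from the material imbalance.
NNUE_THRESHOLD = 500  # assumed cutoff in internal units, not from the patch


def choose_eval(material_balance, threshold=NNUE_THRESHOLD):
    """Use the slower NNUE eval only when material is roughly balanced."""
    if abs(material_balance) < threshold:
        return "nnue"
    return "classical"   # lopsided positions keep the cheaper classical eval


print(choose_eval(30))     # small imbalance
print(choose_eval(-900))   # large imbalance
```

The point of the design is cost: positions that are already clearly decided on material do not pay for the more expensive evaluation.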

STC: LLR: 2.93 (-2.94,2.94) {-0.50,1.50} Total: 10608 W: 1507 L: 1358 D: 7743 Ptnml(0-2): 94, 945, 3074, 1100, 91 https://tests.stockfishchess.org/tests/view/5f2c5921b3ebe5cbfee85b8b LTC: LLR: 2.94 (-2.94,2.94) {0.25,1.75} Total: 7536 W: 556 L: 448 D: 6532 Ptnml(0-2): 9, 383, 2881, 481, 14 https://tests.stockfishchess.org/tests/view/5f2c6f4461e3b6af64881e95 closes official-stockfish/Stockfish#2919 Bench: 4746616
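The SPRT results in this thread all list the bounds (-2.94, 2.94). These match Wald's stopping bounds for error rates alpha = beta = 0.05; that choice of error rates is an assumption inferred from the printed bounds, and `sprt_bounds` is a made-up helper name.

```python
import math


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's log-likelihood-ratio stopping bounds for an SPRT."""
    lower = math.log(beta / (1.0 - alpha))    # accept H0 at or below this LLR
    upper = math.log((1.0 - beta) / alpha)    # accept H1 at or above this LLR
    return lower, upper


lo, hi = sprt_bounds()
print(f"({lo:.2f}, {hi:.2f})")   # prints "(-2.94, 2.94)"
```

A test is reported as passed once its LLR reaches the upper bound, which is why most passing results above sit just at or past 2.94.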

Passed STC: https://tests.stockfishchess.org/tests/view/5f2aa49fa5abc164f05e4d1b LLR: 2.95 (-2.94,2.94) {-0.50,1.50} Total: 40888 W: 7977 L: 7726 D: 25185 Ptnml(0-2): 665, 4806, 9333, 4893, 747 Passed LTC: https://tests.stockfishchess.org/tests/view/5f2b1059b3ebe5cbfee85ae7 LLR: 2.98 (-2.94,2.94) {0.25,1.75} Total: 51264 W: 6445 L: 6134 D: 38685 Ptnml(0-2): 328, 4564, 15580, 4789, 371 closes official-stockfish/Stockfish#2920 bench: 4314943

STC https://tests.stockfishchess.org/tests/view/5f2955b1a5abc164f05e4c85 LLR: 2.96 (-2.94,2.94) {-1.50,0.50} Total: 29216 W: 5560 L: 5416 D: 18240 Ptnml(0-2): 466, 3329, 6902, 3417, 494 LTC https://tests.stockfishchess.org/tests/view/5f299154a5abc164f05e4ca1 LLR: 2.92 (-2.94,2.94) {-1.50,0.50} Total: 54144 W: 6635 L: 6594 D: 40915 Ptnml(0-2): 372, 4859, 16536, 4966, 339 closes official-stockfish/Stockfish#2910 Bench: 4609008

This allows the code to be simplified, because the move counter no longer has to be decremented later if a move is not legal. As a side effect, illegal pruned moves are no longer included in the move counter, so slightly less pruning and fewer reductions are done. STC: LLR: 2.94 (-2.94,2.94) {-1.50,0.50} Total: 111016 W: 21106 L: 21077 D: 68833 Ptnml(0-2): 1830, 13083, 25736, 12946, 1913 https://tests.stockfishchess.org/tests/view/5f28816fa5abc164f05e4c26 LTC: LLR: 2.94 (-2.94,2.94) {-1.50,0.50} Total: 39264 W: 4909 L: 4843 D: 29512 Ptnml(0-2): 263, 3601, 11854, 3635, 279 https://tests.stockfishchess.org/tests/view/5f297902a5abc164f05e4c8e closes official-stockfish/Stockfish#2906 Bench: 4390086

Net created at 20200806-1802 passed STC: https://tests.stockfishchess.org/tests/view/5f2d00b461e3b6af64881f21 LLR: 2.94 (-2.94,2.94) {-0.50,1.50} Total: 6672 W: 1052 L: 898 D: 4722 Ptnml(0-2): 63, 600, 1868, 730, 75 passed LTC: https://tests.stockfishchess.org/tests/view/5f2d052a61e3b6af64881f29 LLR: 2.96 (-2.94,2.94) {0.25,1.75} Total: 7576 W: 573 L: 463 D: 6540 Ptnml(0-2): 8, 392, 2889, 480, 19 closes official-stockfish/Stockfish#2923 Bench: 4390086

STC https://tests.stockfishchess.org/tests/view/5f2d237161e3b6af64881f43 LLR: 2.96 (-2.94,2.94) {-0.50,1.50} Total: 12712 W: 1823 L: 1664 D: 9225 Ptnml(0-2): 122, 1166, 3627, 1313, 128 LTC https://tests.stockfishchess.org/tests/view/5f2d473061e3b6af64881f6f LLR: 2.96 (-2.94,2.94) {0.25,1.75} Total: 12104 W: 912 L: 788 D: 10404 Ptnml(0-2): 13, 665, 4582, 769, 23 closes official-stockfish/Stockfish#2930 bench: 4271421

Allow any pawn in front of a minor piece to replace the pawn protection requirement for outposts.

+-------+    +-------+
| . . o |    | o . . |    o    Their pawns
| . o x |    | o . . |    x    Our pawns
| o N . |    | x o B |    N,B  New (reachable) outpost
| . . . |    | . _ . |    _    Reachable square behind a pawn
+-------+    +-------+
N outpost    B reaches outpost

We want outposts to be secured by pawns against major pieces. If a minor is shielded by any pawn from above, it is rarely at the same time protected by our pawn attacks from below. However, the pawn shield in itself offers some degree of protection. A pawn shield will now suffice to replace the pawn protection for the outpost (and reachable outpost) bonus. This effect stacks with the existing "minor behind pawn" bonus.

STC https://tests.stockfishchess.org/tests/view/5f2bcd14b3ebe5cbfee85b2c LLR: 2.94 (-2.94,2.94) {-0.50,1.50} Total: 27248 W: 5353 L: 5119 D: 16776 Ptnml(0-2): 462, 3174, 6185, 3274, 529 LTC https://tests.stockfishchess.org/tests/view/5f2bfef5b3ebe5cbfee85b5a LLR: 2.96 (-2.94,2.94) {0.25,1.75} Total: 99432 W: 12580 L: 12130 D: 74722 Ptnml(0-2): 696, 8903, 30049, 9391, 677 Closes #2935 Bench: 4143673

Reintroduce vondele's late irreversible move extension for fortress keeping. This was removed when we only had classical eval. Now that we have the NNUE net, it seems that this is useful again. STC: LLR: 2.93 (-2.94,2.94) {-0.50,1.50} Total: 5352 W: 787 L: 653 D: 3912 Ptnml(0-2): 34, 451, 1579, 571, 41 https://tests.stockfishchess.org/tests/view/5f2dc8ad61e3b6af64881ff0 LTC: LLR: 2.94 (-2.94,2.94) {0.25,1.75} Total: 14416 W: 1013 L: 891 D: 12512 Ptnml(0-2): 15, 722, 5623, 822, 26 https://tests.stockfishchess.org/tests/view/5f2e0e3661e3b6af6488201e closes official-stockfish/Stockfish#2936 Bench: 4154696

This patch increases the LMRdepth threshold for futility pruning at parent nodes so it can apply more often. With the radical change to the evaluation approach it seems that search is really far from an optimal state, especially in the parts that use static evaluation of the position. passed STC https://tests.stockfishchess.org/tests/view/5f2da75661e3b6af64881fd0 LLR: 2.93 (-2.94,2.94) {-0.50,1.50} Total: 8744 W: 1305 L: 1156 D: 6283 Ptnml(0-2): 75, 789, 2500, 928, 80 passed LTC https://tests.stockfishchess.org/tests/view/5f2dcb2a61e3b6af64881ff3 LLR: 2.98 (-2.94,2.94) {0.25,1.75} Total: 17728 W: 1256 L: 1117 D: 15355 Ptnml(0-2): 22, 961, 6774, 1070, 37 Bench: 4067325

STC https://tests.stockfishchess.org/tests/view/5f2deb1661e3b6af6488200f LLR: 2.96 (-2.94,2.94) {-1.50,0.50} Total: 10376 W: 1481 L: 1359 D: 7536 Ptnml(0-2): 91, 953, 2981, 1069, 94 LTC: https://tests.stockfishchess.org/html/live_elo.html?5f2e0a0461e3b6af64882019 LLR: 2.99 (-2.94,2.94) {-1.50,0.50} Total: 5040 W: 375 L: 315 D: 4350 Ptnml(0-2): 7, 263, 1926, 311, 13 closes official-stockfish/Stockfish#2934 Bench: 4067325

Possible after the recent reordering of the pos.legal(move) check official-stockfish/Stockfish#2941 No functional change.

STC: https://tests.stockfishchess.org/tests/view/5f2dc38561e3b6af64881fec LLR: 2.99 (-2.94,2.94) {-0.50,1.50} Total: 6120 W: 903 L: 758 D: 4459 Ptnml(0-2): 44, 535, 1775, 644, 62 LTC: https://tests.stockfishchess.org/tests/view/5f2dd55f61e3b6af64882003 LLR: 2.95 (-2.94,2.94) {0.25,1.75} Total: 7424 W: 577 L: 463 D: 6384 Ptnml(0-2): 16, 375, 2824, 473, 24 closes official-stockfish/Stockfish#2942 bench 4107833

This patch lines up with other patches that use the better eval to produce more aggressive cutoffs based on the static evaluation of the position. It allows more aggressive futility pruning for captures, so these cutoffs are now produced at higher evaluations of the position, and therefore more often. passed STC https://tests.stockfishchess.org/tests/view/5f2da79e61e3b6af64881fd2 LLR: 3.87 (-2.94,2.94) {-0.50,1.50} Total: 27256 W: 3809 L: 3593 D: 19854 Ptnml(0-2): 221, 2578, 7830, 2762, 237 passed LTC https://tests.stockfishchess.org/tests/view/5f2df92061e3b6af64882012 LLR: 4.97 (-2.94,2.94) {0.25,1.75} Total: 43624 W: 3095 L: 2820 D: 37709 Ptnml(0-2): 66, 2410, 16608, 2639, 89 closes official-stockfish/Stockfish#2946 Bench: 4272280

This patch tries to run multiple LTO threads in parallel, speeding up the build process of optimized builds if the -j make parameter is used. This mitigates the longer linking times of optimized builds since the integration of the NNUE code. Roughly 2x build speedup. I've tried a similar patch some two years ago but it ran into trouble with old compiler versions then. Since we're on the C++17 standard now these old compilers should be obsolete. closes official-stockfish/Stockfish#2943 No functional change.

Tweak depth. STC https://tests.stockfishchess.org/tests/view/5f2d22ec61e3b6af64881f40 LLR: 2.94 (-2.94,2.94) {-0.50,1.50} Total: 17984 W: 2603 L: 2441 D: 12940 Ptnml(0-2): 133, 1751, 5094, 1849, 165 LTC https://tests.stockfishchess.org/tests/view/5f2d5a6a61e3b6af64881f7f LLR: 2.95 (-2.94,2.94) {0.25,1.75} Total: 85808 W: 5956 L: 5621 D: 74231 Ptnml(0-2): 149, 4748, 32785, 5063, 159 closes official-stockfish/Stockfish#2950 fixes two README.md typos: fixes official-stockfish/Stockfish#2932 bench: 4022669

All credit to Vizvezdenec, the original author of the idea. STC https://tests.stockfishchess.org/tests/view/5f2d606a61e3b6af64881f88 LLR: 2.95 (-2.94,2.94) {-0.50,1.50} Total: 8440 W: 1191 L: 1048 D: 6201 Ptnml(0-2): 59, 754, 2467, 865, 75 LTC https://tests.stockfishchess.org/tests/view/5f2d84ad61e3b6af64881fbd LLR: 2.95 (-2.94,2.94) {0.25,1.75} Total: 21896 W: 1557 L: 1406 D: 18933 Ptnml(0-2): 33, 1185, 8378, 1302, 50 closes official-stockfish/Stockfish#2951 bench: 4084753

fixes official-stockfish/Stockfish#2921 closes official-stockfish/Stockfish#2927 No functional change

small rewording, but also print the download url for the default net. closes official-stockfish/Stockfish#2954 No functional change

introduced with d7a2689 closes official-stockfish/Stockfish#2959 No functional change.

The idea of this patch is that positions are usually more complex and harder to evaluate when there are more pawns. This patch adjusts the NNUE usage threshold depending on the number of pawns in the position: if the pawn count is < 3 we use the classical evaluation more often, for pawn count = 3 the patch is non-functional, and with pawn count > 3 the NNUE evaluation is used more often. passed STC https://tests.stockfishchess.org/tests/view/5f2f02d09081672066536b1f LLR: 2.96 (-2.94,2.94) {-0.50,1.50} Total: 36520 W: 5011 L: 4823 D: 26686 Ptnml(0-2): 299, 3482, 10548, 3594, 337 passed LTC https://tests.stockfishchess.org/tests/view/5f2f4c329081672066536b5c LLR: 2.98 (-2.94,2.94) {0.25,1.75} Total: 39272 W: 2630 L: 2433 D: 34209 Ptnml(0-2): 53, 2066, 15218, 2229, 70 closes official-stockfish/Stockfish#2960 bench 4084753
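One way to read the description above is as a threshold that grows linearly with the pawn count and is unchanged at exactly three pawns. The sketch below follows that reading; the base value, the per-pawn step, and both function names are assumptions, not values taken from the patch.

```python
# Hypothetical sketch of a pawn-dependent NNUE threshold.
BASE_THRESHOLD = 500   # assumed threshold that applies at exactly 3 pawns
PER_PAWN = 20          # assumed adjustment per pawn
NEUTRAL_PAWNS = 3      # pawn count at which the patch changes nothing


def nnue_threshold(pawn_count):
    """Threshold on material imbalance below which NNUE is used."""
    return BASE_THRESHOLD + PER_PAWN * (pawn_count - NEUTRAL_PAWNS)


def use_nnue(material_balance, pawn_count):
    """More pawns raise the threshold, so NNUE is chosen more often."""
    return abs(material_balance) < nnue_threshold(pawn_count)


print(nnue_threshold(3))   # same as the base threshold
print(use_nnue(510, 8))    # many pawns: NNUE despite the imbalance
print(use_nnue(510, 0))    # no pawns: classical eval
```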

after some testing, no version of MinGW/gcc has been found where this code is still necessary. Probably older code (pre-c++17?) closes official-stockfish/Stockfish#2891 No functional change

the stateInfo at the rootPos is no longer read-only, as the NNUE accumulator is part of it. Threads can thus not share this object and need their own copy. tested for no regression https://tests.stockfishchess.org/tests/view/5f3022239081672066536bce LLR: 2.96 (-2.94,2.94) {-1.50,0.50} Total: 52800 W: 6843 L: 6802 D: 39155 Ptnml(0-2): 336, 4646, 16399, 4679, 340 closes official-stockfish/Stockfish#2957 fixes official-stockfish/Stockfish#2933 No functional change

This reverts commit a6e8929. The offending setup has been found as gcc/mingw 7.3 (on Ubuntu 18.04). fixes official-stockfish/Stockfish#2963 closes official-stockfish/Stockfish#2968 No functional change.

First trained net using search eval instead of pv leaf static eval. Net created at: 20200810-0744 passed STC: https://tests.stockfishchess.org/tests/view/5f30995d90816720665373f8 LLR: 2.93 (-2.94,2.94) {-0.50,1.50} Total: 15416 W: 2071 L: 1920 D: 11425 Ptnml(0-2): 123, 1376, 4563, 1519, 127 passed LTC: https://tests.stockfishchess.org/tests/view/5f30a104908167206653742b LLR: 2.93 (-2.94,2.94) {0.25,1.75} Total: 29792 W: 2003 L: 1834 D: 25955 Ptnml(0-2): 50, 1541, 11550, 1700, 55 closes official-stockfish/Stockfish#2966 Bench: 4084753

STC https://tests.stockfishchess.org/tests/view/5f3059d1908167206653736b: LLR: 2.94 (-2.94,2.94) {-1.50,0.50} Total: 12520 W: 766 L: 727 D: 11027 Ptnml(0-2): 13, 624, 4949, 659, 15 LTC: https://tests.stockfishchess.org/tests/view/5f30863a90816720665373d1 LLR: 2.94 (-2.94,2.94) {-1.50,0.50} Total: 12520 W: 766 L: 727 D: 11027 Ptnml(0-2): 13, 624, 4949, 659, 15 closes: official-stockfish/Stockfish#2965 Bench: 4084753

Despite the usage of alignas, the (avx2/avx512) code generated by older gcc needs to use unaligned loads (e.g. confirmed crash with gcc 7.3/mingw on abrok). Better performance thus requires gcc >= 9 on hardware supporting avx2/avx512. closes official-stockfish/Stockfish#2969 No functional change

Extend castling only if there are few friendly pieces on the castling side. Inspired by silversolver1's (Rahul Dsilva) test https://tests.stockfishchess.org/tests/view/5f0fef560640035f9d2978cf STC: LLR: 2.94 (-2.94,2.94) {-0.50,1.50} Total: 7096 W: 947 L: 818 D: 5331 Ptnml(0-2): 32, 604, 2181, 665, 66 https://tests.stockfishchess.org/tests/view/5f309f729081672066537426 LTC: LLR: 2.96 (-2.94,2.94) {0.25,1.75} Total: 4712 W: 300 L: 215 D: 4197 Ptnml(0-2): 2, 190, 1895, 259, 10 https://tests.stockfishchess.org/tests/view/5f30a2039081672066537430 closes official-stockfish/Stockfish#2970 Bench: 4094850

Makefile targets x86-64-sse42, x86-sse3 are removed; x86-64-sse41 is renamed to x86-64-sse41-popcnt (it did enable popcnt). Makefile variables sse3, sse42, their associated compilation flags and code in misc.cpp are removed. closes official-stockfish/Stockfish#2922 No functional change

AVX512 +4% faster AVX2 +1% faster SSSE3 +5% faster passed non-regression STC: STC https://tests.stockfishchess.org/tests/view/5f31249f90816720665374f6 LLR: 2.96 (-2.94,2.94) {-1.50,0.50} Total: 17576 W: 2344 L: 2245 D: 12987 Ptnml(0-2): 127, 1570, 5292, 1675, 124 closes official-stockfish/Stockfish#2962 No functional change

This patch allows old x86 CPUs, from AMD K8 (which the x86-64 baseline targets) all the way down to the Pentium MMX, to benefit from NNUE with comparable performance hit versus hand-written eval as on more modern processors. NPS of the bench with NNUE enabled on a Pentium III 1.13 GHz (using the MMX code): master: 38951 this patch: 80586 NPS of the bench with NNUE enabled using baseline x86-64 arch, which is how linux distros are likely to package stockfish, on a modern CPU (using the SSE2 code): master: 882584 this patch: 1203945 closes official-stockfish/Stockfish#2956 No functional change.

STC https://tests.stockfishchess.org/tests/view/5f31219090816720665374ec LLR: 2.96 (-2.94,2.94) {-0.50,1.50} Total: 3376 W: 487 L: 359 D: 2530 Ptnml(0-2): 17, 253, 1042, 337, 39 LTC https://tests.stockfishchess.org/tests/view/5f3127f79081672066537502 LLR: 2.93 (-2.94,2.94) {0.25,1.75} Total: 8360 W: 581 L: 475 D: 7304 Ptnml(0-2): 11, 407, 3238, 513, 11 closes official-stockfish/Stockfish#2971 bench: 4733874

and rename a variable closes official-stockfish/Stockfish#2819 No functional change

This adds -save-temps to the linker flags when parallel LTO is used on MinGW/MSYS. fixes #2977 closes official-stockfish/Stockfish#2978 No functional change.

Move to posix_memalign for those platforms, in particular android, that do not fully support c++17 std::aligned_alloc() (and are not windows) see official-stockfish/Stockfish#2860 closes official-stockfish/Stockfish#2973 No functional change

avoids an intrinsic that is missing in gcc < 10. For this target, might trigger another gcc bug on windows that requires up-to-date gcc 8, 9, or 10, or usage of clang. Fixes official-stockfish/Stockfish#2975 closes official-stockfish/Stockfish#2976 No functional change

…rofile-build) of the NNUE part of the code.

Joint work gvreuls / vondele

* Download the default NNUE net in AppVeyor
* Download net in travis CI `make net`
* Adjust tests to cover more archs, speedup instrumented testing
* Introduce 'mixed' bench as default, with further options: classical, NNUE, mixed.

mixed (default) and NNUE require the default net to be present, which can be obtained with

```
make net
```

Further examples (first is equivalent to `./stockfish bench`):

```
./stockfish bench 16 1 13 default depth mixed
./stockfish bench 16 1 13 default depth classical
./stockfish bench 16 1 13 default depth NNUE
```

The net is now downloaded automatically if needed for `profile-build` (usual `build` works fine without net present)

PGO gives a nice speedup on fishtest:

passed STC: LLR: 2.93 (-2.94,2.94) {-0.50,1.50} Total: 3360 W: 469 L: 343 D: 2548 Ptnml(0-2): 20, 246, 1030, 356, 28 https://tests.stockfishchess.org/tests/view/5f31b5499081672066537569

passed LTC: LLR: 2.97 (-2.94,2.94) {0.25,1.75} Total: 8824 W: 609 L: 502 D: 7713 Ptnml(0-2): 8, 430, 3438, 519, 17 https://tests.stockfishchess.org/tests/view/5f31c87b908167206653757c

closes official-stockfish/Stockfish#2931 fixes official-stockfish/Stockfish#2907 requires fishtest updates before commit Bench: 4290577

Change condition from three friendly pieces to two. This now means that we only extend castling on the king side if there are no other friendly pieces aside from king and rook. For the queen side, we only extend if there is only a rook and another friendly piece or if there is only a single rook and no other friendly piece but this is very rare. STC: LLR: 3.20 (-2.94,2.94) {-0.50,1.50} Total: 31144 W: 4086 L: 3903 D: 23155 Ptnml(0-2): 227, 2843, 9278, 2968, 256 https://tests.stockfishchess.org/tests/view/5f31487f9081672066537516 LTC: LLR: 2.93 (-2.94,2.94) {0.25,1.75} Total: 57816 W: 3786 L: 3538 D: 50492 Ptnml(0-2): 92, 2991, 22488, 3251, 86 https://tests.stockfishchess.org/tests/view/5f3167c3908167206653753d closes official-stockfish/Stockfish#2980 Bench: 4244812

This workaround is possibly rather a windows & gcc specific problem. See e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c25 On Linux with gcc 8 this patch brings roughly an 8% speedup. However, it probably needs some testing in the wild. Includes a workaround for an old msys make (3.81) installation (fixes #2984) No functional change

fails to build on that target, because of missing Intel Intrinsics. macOS has posix_memalign() since ~2014 so we can simplify the code and just use that for all Apple platforms. closes official-stockfish/Stockfish#2985 No functional change.

Adds support for Vector Neural Network Instructions (avx512), as available on Intel Cascade Lake. The _mm512_dpbusd_epi32() intrinsic (vpdpbusd instruction) is tailor-made for NNUE. On a Cascade Lake CPU (AWS C5.24x.large, gcc 10), NNUE eval is at roughly 78% of the nps of classical (single core test).

bench 1024 1 24 default depth:

target  classical     NNUE  ratio
vnni      2207232  1725987  78.20
avx512    2216789  1671734  75.41
avx2      2194006  1611263  73.44
modern    2185001  1352469  61.90

closes official-stockfish/Stockfish#2987 No functional change
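The ratio column in the bench numbers above is the NNUE nps expressed as a percentage of the classical nps for the same target. A short check using the figures from this commit message (the dictionary layout and the name `nnue_ratio` are just for illustration):

```python
# nps figures copied from the bench results above: (classical, NNUE)
bench = {
    "vnni":   (2207232, 1725987),
    "avx512": (2216789, 1671734),
    "avx2":   (2194006, 1611263),
    "modern": (2185001, 1352469),
}


def nnue_ratio(classical_nps, nnue_nps):
    """NNUE speed as a percentage of classical speed."""
    return 100.0 * nnue_nps / classical_nps


for target, (classical, nnue) in bench.items():
    print(f"{target:8s}{nnue_ratio(classical, nnue):6.2f}")
```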

was missing in the list of outputs, slightly reorder flags. explicitly add -msse2 if USE_SSE2 (is implicit already, -msse -m64). closes official-stockfish/Stockfish#2990 No functional change.

Net created at: 20200812-2257 passed STC: https://tests.stockfishchess.org/tests/view/5f340ca99e5f2effc089da17 LLR: 2.96 (-2.94,2.94) {-0.50,1.50} Total: 5744 W: 756 L: 627 D: 4361 Ptnml(0-2): 28, 485, 1731, 586, 42 passed LTC: https://tests.stockfishchess.org/tests/view/5f341eba9e5f2effc089da23 LLR: 2.94 (-2.94,2.94) {0.25,1.75} Total: 17136 W: 1041 L: 917 D: 15178 Ptnml(0-2): 13, 813, 6807, 907, 28 closes official-stockfish/Stockfish#2992 Bench: 3935117

Do not show the details of the default architecture for a simple "make help" invocation, as the details are most likely to confuse beginners. Instead we make it clear which architecture is the default and put an example at the end of the Makefile as an incentive to use "make help ARCH=blah" to discover the flags used by the different architectures.

```
make help
make help ARCH=x86-64-ssse3
```

Also clean up and modernize the Makefile examples a bit while at it. closes official-stockfish/Stockfish#2996 No functional change

check SHA of the available and downloaded file. Document the format requirement on the default net. Also allow curl to make possibly insecure connections, as needed for old curl. fixes official-stockfish/Stockfish#2998 closes official-stockfish/Stockfish#3000 No functional change.
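A minimal sketch of such a check, assuming the net file name embeds the first 12 hex digits of the file's SHA-256 digest (as the `nn-97f742aaefcd.nnue` naming suggests). That convention, the regular expression, and the function name are assumptions for illustration; the patch itself does this check from the Makefile.

```python
import hashlib
import re


def net_matches_name(filename, data):
    """Check that the hash prefix in an nn-<12 hex>.nnue name matches the data."""
    match = re.fullmatch(r"nn-([0-9a-f]{12})\.nnue", filename)
    if match is None:
        return False          # name does not follow the assumed format
    digest = hashlib.sha256(data).hexdigest()
    return digest[:12] == match.group(1)


# Toy payload standing in for a downloaded network file
payload = b"toy network bytes"
good_name = "nn-" + hashlib.sha256(payload).hexdigest()[:12] + ".nnue"

print(net_matches_name(good_name, payload))        # intact download
print(net_matches_name(good_name, b"corrupted"))   # damaged download
```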

Move the existing dampening function last so that NNUE evaluations are also handled as we approach the 50 move rule. STC: LLR: 2.95 (-2.94,2.94) {-0.50,1.50} Total: 4792 W: 695 L: 561 D: 3536 Ptnml(0-2): 19, 420, 1422, 478, 57 https://tests.stockfishchess.org/tests/view/5f3164179081672066537534 LTC: LLR: 8.62 (-2.94,2.94) {0.25,1.75} Total: 286744 W: 18494 L: 17430 D: 250820 Ptnml(0-2): 418, 14886, 111745, 15860, 463 https://tests.stockfishchess.org/tests/view/5f316b039081672066537541 closes official-stockfish/Stockfish#3004 Bench: 4001800
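The ordering change can be sketched as follows: the dampening is applied to whichever evaluation was chosen, instead of only inside the classical path. The linear scaling toward zero at 100 plies is an assumed form of the dampening function, and both function names are hypothetical.

```python
def damp_by_rule50(value, rule50_count):
    """Scale an evaluation toward 0 as the halfmove clock nears 100 plies.

    The linear form is an assumption made for this sketch.
    """
    plies = max(0, min(100, rule50_count))
    return int(value * (100 - plies) / 100)


def evaluate(classical_value, nnue_value, use_nnue, rule50_count):
    """Pick the raw evaluation first, then apply the dampening last."""
    raw = nnue_value if use_nnue else classical_value
    return damp_by_rule50(raw, rule50_count)


print(evaluate(150, 200, use_nnue=True, rule50_count=50))
print(evaluate(150, 200, use_nnue=False, rule50_count=0))
```

Because the scaling is the final step, NNUE scores also shrink as the position approaches a 50-move-rule draw.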

The idea is that, since we are mixing NNUE and classical evals, matching their magnitudes more closely allows for better comparisons. STC https://tests.stockfishchess.org/tests/view/5f35a65411a9b1a1dbf18e2b LLR: 2.94 (-2.94,2.94) {-0.50,1.50} Total: 9840 W: 1150 L: 1027 D: 7663 Ptnml(0-2): 49, 772, 3175, 855, 69 LTC https://tests.stockfishchess.org/tests/view/5f35bcbe11a9b1a1dbf18e47 LLR: 2.93 (-2.94,2.94) {0.25,1.75} Total: 44424 W: 2492 L: 2294 D: 39638 Ptnml(0-2): 42, 2015, 17915, 2183, 57 Also corrects the location to clamp the evaluation (non-functional on bench). closes official-stockfish/Stockfish#3003 bench: 3905447

vondele · 2020-08-16T10:10:03Z

@niklasf
one guess, you set sse3 = yes in the Makefile, should probably be ssse3 = yes (one more s)

niklasf · 2020-08-16T10:48:27Z

Oh, wow, thanks! That would have taken me forever to find. Unfortunately it didn't do much for nps.

nodchip and others added 30 commits August 6, 2020 16:37

Remove unnecessary legality check

450b60a

Possible after the recent reordering of the pos.legal(move) check official-stockfish/Stockfish#2941 No functional change.

Use fallback implementation for C++ aligned_alloc

d7a2689

fixes official-stockfish/Stockfish#2921 closes official-stockfish/Stockfish#2927 No functional change

Improve error message on missing net.

320fa1b

small rewording, but also print the download url for the default net. closes official-stockfish/Stockfish#2954 No functional change

Fix aligned_alloc on MinGW

cd1bb27

introduced with d7a2689 closes official-stockfish/Stockfish#2959 No functional change.

Avoid special casing for MinGW

a6e8929

after some testing, no version of MinGW/gcc has been found where this code is still necessary. Probably older code (pre-c++17?) closes official-stockfish/Stockfish#2891 No functional change

Revert "Avoid special casing for MinGW"

651ec3b

This reverts commit a6e8929. The offending setup has been found as gcc/mingw 7.3 (on Ubuntu 18.04). fixes official-stockfish/Stockfish#2963 closes official-stockfish/Stockfish#2968 No functional change.

mstembera and others added 23 commits August 10, 2020 14:38

Add comments to probCut code

a72cec1

and rename a variable closes official-stockfish/Stockfish#2819 No functional change

Fix parallel LTO issues on Windows

4ab8b0b

This adds -save-temps to the linker flags when parallel LTO is used on MinGW/MSYS. fixes #2977 closes official-stockfish/Stockfish#2978 No functional change.

Output the SSE2 flag in compiler_info

69cfe28

was missing in the list of outputs, slightly reorder flags. explicitly add -msse2 if USE_SSE2 (is implicit already, -msse -m64). closes official-stockfish/Stockfish#2990 No functional change.

Merge remote-tracking branch 'official-stockfish/master' into nnue

4e456a5

stockfish.wasm: Remove EvalFile option

e38ac20

stockfish.wasm: Fetch eval file

d313d30

stockfish.wasm: Add debug for NNUE loading

fc24878

stockfish.wasm: Enable SIMD

2b9474f

stockfish.wasm: Fix ssse3

6f0f56e

niklasf mentioned this pull request Nov 18, 2020

Make nnue build by merging from master #25

Closed

Merge remote-tracking branch 'origin/master' into nnue

87d56f5

niklasf closed this Dec 14, 2020

hi-ogawa mentioned this pull request Feb 10, 2021

WASM SIMD for NNUE #30

Closed
