Skip to content


Subversion checkout URL

You can clone with
Download ZIP

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also compare across forks.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also compare across forks.
Checking mergeability… Don’t worry, you can still create the pull request.
This comparison is big! We’re only showing the most recent 250 commits
Commits on Jan 20, 2015
@mcostalba mcostalba Don't use _pext_u64() directly
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.

But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.

The correct solution is of not using _pext_u64 directly.

This patch fixes a compile error with some mingw64
package when compiling with x86-64.

No functional change.
Commits on Jan 21, 2015
@mcostalba mcostalba Fun with lambdas
Use lambda functions instead of has_positive_value()
and toggle_case()

No functional change.
@mcostalba mcostalba Explicitly defaulted and deleted members
Better than a bit obscure implicit ones.

No functional change.
@mcostalba mcostalba Rearrange Endgames
Remove references to EndgameBase and use instead
Value and ScaleFactor as template parameters of
the endgames maps.

No functional change.
@mcostalba mcostalba Document how to enable PEXT with MSVC
When not using Makefile, e.g. with MSVC, if hardware
supports BMI2 instructions, then USE_PEXT should be
added in project configuration to enable pext support.

No functional change.
Commits on Jan 22, 2015
@mcostalba mcostalba Rearrange bitbases C++11 way
No functional change.
Commits on Jan 24, 2015
@mcostalba mcostalba Additional work in bitbases
Verified the generated bitbases are unchanged.

No functional change.
@mcostalba mcostalba Try hard to retrieve a ponder move
In case we stop the search during a fail-high
it is possible we return to GUI without a ponder
move. This patch try harder to find a ponder move
retrieving it from TT. This is important in games
played with 'ponder on'.

bench: 8080602

Resolves #221
@mcostalba mcostalba Don't use _pext_u64() directly
This intrinsic to call BMI2 PEXT instruction is
defined in immintrin.h. This header should be
included only when USE_PEXT is defined, otherwise
we define _pext_u64 as 0 forcing a nop.

But under some mingw platforms, even if we don't
include the header, immintrin.h gets included
anyhow through an include chain that starts with
STL <algorithm> header. So we end up both defining
_pext_u64 function and at the same time defining
_pext_u64 as 0 leading to a compile error.

The correct solution is of not using _pext_u64 directly.

This patch fixes a compile error with some mingw64
package when compiling with x86-64.

No functional change.

Resolves #222
@zamar zamar Stockfish 6 Release Candidate 2
- Fix a compilation issue related to BMI2 PEXT instruction
- Retrieve a ponder move from TT if PV is only one move long

Bench: 8080602

No functional change
Commits on Jan 25, 2015
@mcostalba mcostalba Fix a MSVC warning at W4
Warning is C4512 (assignment operator could not be generated)

Now, apart the foreign syzygy code, everything compiles
without warnings at warning level 4.

Backported from C++11 branch.

No functional change.
@mcostalba mcostalba Re-arrange Skill struct
Instead of swapping sub-optimal move in Skill
d'tor, make it explicitly at the end of the search.

Also streamline and clarify relation with multiPV
and pass it directly instead of relying on the hacky
'candidates' member.

No functional change.
@zamar zamar Revert "Fix profile build for gcc on Mac OSX"
Seems to be a performance regression for standard build.

For SF6 people compiling on Mac OSX using profile-build option
just need to make necessary adjustments manually...

No functional change

Resolves #223
Stefan Geschwentner Fix a skill level problem: Don't allow move pruning at root node
Bench: 8918745

Resolves #231
@zamar zamar Stockfish 6 Release Candidate 3
- Fix a skill level problem: Don't allow move pruning at root node
- Revert "Fix profile build for gcc on Mac OSX". Results for a faster binary in x86-64.
- Fix a MSVC warning

Bench: 8918745
Commits on Jan 27, 2015
@zamar zamar Stockfish 6
Stockfish bench signature is: 8918745
Commits on Jan 28, 2015
@zamar zamar Restore development version
No functional change
@NicklasPersson NicklasPersson King safety tuning with values obtained by SPSA.
Part I:


LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 11529 W: 2075 L: 1882 D: 7572

Part II:


ELO: 2.07 +-2.1 (95%) LOS: 97.3%
Total: 34859 W: 5967 L: 5759 D: 23133

Bench: 7374604

Resolves #228
@Rocky640 Rocky640 Simplify backward pawn definition
Make use of 'lever' attribute

No functional change

Resolves #234
Commits on Jan 29, 2015
@mcostalba mcostalba Simplify skill level and reduce ELO
This patch has two positive effects:

- Retire a hackish formula and leave
  just a natural, simple and plain one.

- Reduce strenght at very low level, but
  don't impact medium/high levels.

Actually even at level 0, SF is still too
strong for many beginners (this was reported
many times for instance on Droidfish user
comments on Google Play).

Test on fishtest shows that ELO drop is around
170 ELO at level 0 (good!), 130 ELO at level 1
and smoothly reduces (as expected) until level
10 where the drop is just of 8 ELO.

No functional change.
Commits on Jan 30, 2015
@mcostalba mcostalba Sync with master
bench: 7374604
@jromang jromang Ressurrect hashfull patch
This is an old patch from Jean-Francois Romang to send
UCI hashfull info to the GUI:

It was wrongly judged as a slowdown, but it takes much
less than 1 ms to run, indeed on my core i5 2.6Ghz it
takes about 2 microsecs to run!

Regression test is good:

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 7352 W: 1548 L: 1401 D: 4403

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 61432 W: 10307 L: 10251 D: 40874

I have set the name of the author to the original

No functional change.
Commits on Jan 31, 2015
@mcostalba mcostalba Fix a MSVC warning
warning C4805: '|' : unsafe mix of type 'Bitboard' and type 'bool' in operation

No functional change.
@mcostalba mcostalba Convert Reductions[] from int8_t to Depth
This is the type anyhow. Assorted cleanup while there.

No functional change.
@mcostalba mcostalba Move uci_pv under UCI namespace
That's the correct place.

No functional change.
@mcostalba mcostalba Use C++ loops in insert_pv_in_tt
Also small tweak to extract_ponder_from_tt

No functional change.
@mcostalba mcostalba Sync with master
bench: 7374604
@mcostalba mcostalba Another small tweak to skills
No functional change.
@mcostalba mcostalba Implicit conversion from ExtMove to Move
Verified with perft there is no speed regression,
and code is simpler. It is also conceptually correct
becuase an extended move is just a move that happens
to have also a score.

No functional change.
@mcostalba mcostalba Use C++11 loops in MovePicker
No functional change.
@mcostalba mcostalba More readable score<CAPTURES>()
No functional change.
Commits on Feb 01, 2015
@mcostalba mcostalba Silence a warning under MSVC
warning C4100: 'ci' : unreferenced formal parameter

It is a silly and wrong one, but just silent it.

No functional change.
@mcostalba mcostalba Small tweaks in movepick.cpp
No functional change.
@mcostalba mcostalba Delay checking for duplicated killer moves
Follow the usual approach to delay computation
as far as possible, in case an earlier killer
cut-offs we avoid to do useless work.

This also greatly simplifies the code.

No functional change.
@mcostalba mcostalba Allow to assign a Move to an ExtMove
After defining ExtMove::operator Move(), this is a
natural extension.

No fnctional change.
@mcostalba mcostalba Use move assignment in movegen.h
No functional change and same speed (tested with perft)
Commits on Feb 02, 2015
@NicklasPersson NicklasPersson Improved King Safety values
From an SPSA-session on king safety.

ELO: 3.21 +-2.1 (95%) LOS: 99.8%
Total: 40000 W: 8181 L: 7812 D: 24007

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 47765 W: 8091 L: 7785 D: 31889

Bench: 8589262

Resolves #241
Commits on Feb 03, 2015
mstembera Profile build options
I went through all the individual compile options that differ between
-fprofile-generate/-fprofile-use  and  -fprofile-arcs/-fbranch-probabilities
and distilled the speed difference down to only turning off
-fno-peel-loops and -fno-tracer.  Using this we still get the full speedup
(maybe a bit more because other optimizations stay on) and it's also much cleaner
because we can get rid of the "@rm -f ucioption.gc*" hack for all versions of gcc.

No functional change.

Resolves #237
Stefan Geschwentner Add bonus for pawn attack threats
Latent pawn attacks: Add a bonus to safe pawn pushes which attacks an
enemy piece.  Based on an idea of Lyudmil Tsvetkov.

LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 7925 W: 1666 L: 1537 D: 4722

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 40109 W: 6841 L: 6546 D: 26722

Bench: 7696257

Resolves #240
Commits on Feb 07, 2015
@lucasart lucasart Removes useless templates, some of which lead to code duplication: is…
…_K*() functions.

No functional change

Resolves #245
@mcostalba mcostalba Sync with master
bench: 7696257
@mcostalba mcostalba Rename dbg_hit_on_c() to dbg_hit_on()
Use an overload instead of a new named function.

I have found this handier and easier when adding
some quick debug code.

No functional change.
@mcostalba mcostalba Rename dbg_hit_on_c() to dbg_hit_on()
Use an overload instead of a new named function.

I have found this handier and easier when adding
some quick debug code.

No functional change.
@mcostalba mcostalba Micro-optimize SEE
Results for 10 tests for each version (gcc 4.8.3 on mingw):

            Base      Test      Diff
    Mean    1502447   1507917   -5470
    StDev   3119      1364      4153

p-value: 0,906
speedup: 0,004

Results for 10 tests for each version (MSVC 2013):

            Base      Test      Diff
    Mean    1400899   1403713   -2814
    StDev   1273      2804      2700

p-value: 0,851
speedup: 0,002

No functional change.
@mcostalba mcostalba Rewrite pos_is_ok()
No functional change.
@mcostalba mcostalba Avoid casting to char* in prefetch()
Funny enough, gcc __builtin_prefetch() expects
already a void*, instead Windows's _mm_prefetch()
requires a char*.

The patch allows to remove ugly casts from caller

No functional change.
Commits on Feb 08, 2015
@mcostalba mcostalba Small tweaks in do_move and friends
Also remove useless StateCopySize64 optimization:
compiler uses SSE movups instruction anyhow and
does not need this trick (verified with fishbench).

No functional change.
@mcostalba mcostalba Shuffle put_piece() and friends signatures
It is more consistent with the others member functions.

No functional change.
Joona Kiiski Pawn Center Bind Bonus
Bonus for two pawns controlling the same central square


LLR: 3.14 (-2.94,2.94) [-1.50,4.50]
Total: 15974 W: 3291 L: 3133 D: 9550


LLR: 3.24 (-2.94,2.94) [0.00,6.00]
Total: 10449 W: 1837 L: 1674 D: 6938

Idea from Lyudmil Tsvetkov.

Bench: 7699138

Resolves #248
@mcostalba mcostalba Sync with master
bench: 7699138
Commits on Feb 09, 2015
@hxim hxim Fix KingDanger[] array initialization
Use integer arithmetic instead of floating point arithmetic.
Floating point arithmetic was causing different results for some 32-bit compiles

No functional change

Resolves #249
Resolves #250
Commits on Feb 13, 2015
@mcostalba mcostalba Reformat tracing functions
No functional change.
@snicolet snicolet Small bonus for all safe pawn pushes
Pawn flexibility: add a small bonus for all safe pawn pushes

LLR: 2.70 (-2.94,2.94) [-1.50,4.50]
Total: 18233 W: 3705 L: 3557 D: 10971

LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 17684 W: 3042 L: 2854 D: 11788

Bench: 7369224

Resolves #253
Commits on Feb 14, 2015
@mcostalba mcostalba Sync with master
Bench: 7369224
@mcostalba mcostalba Further simplify KingDanger init
And remove a tale whitespace while there.

No functional change.
@mcostalba mcostalba Further simplify KingDanger init
And remove a tale whitespace while there.

No functional change.
Commits on Feb 15, 2015
@mcostalba mcostalba Revert "Delayed killers checking"
It seems a slowdown when run with fishbench.

No functional change.
@lucasart lucasart Compute checkers from scratch
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):

stat        test     master    diff
mean   2,403,118  2,390,904  12,214
stdev     12,043     10,620   3,677

speedup       0.51%
P(speedup>0) 100.0%

No functional change.
@mcostalba mcostalba Retire one do_move() overload
After Lucas patch it is almost useless.

No functional change.
Commits on Feb 16, 2015
@lucasart lucasart Compute checkers from scratch
This micro-optimization only complicates the code and provides no benefit.
Removing it is even a speedup on my machine (i7-3770k, linux, gcc 4.9.1):

stat        test     master    diff
mean   2,403,118  2,390,904  12,214
stdev     12,043     10,620   3,677

speedup       0.51%
P(speedup>0) 100.0%

No functional change.
@zamar zamar Improve smp performance for high number of threads
Balance threads between split points.

There are huge differences between different machines and autopurging makes it very difficult to measure the improvement in fishtest, but the following was recorded for 16 threads at 15+0.05:

    For Bravone (1000 games): 0 ELO
    For Glinscott (1000 games): +20 ELO
    For bKingUs (1000 games): +50 ELO
    For fastGM (1500 games): +50 ELO

The change was regression for no one, and a big improvement for some, so it should be fine to commit it.
Also for 8 threads at 15+0.05 we measured a statistically significant improvement:
ELO: 6.19 +-3.9 (95%) LOS: 99.9%
Total: 10325 W: 1824 L: 1640 D: 6861

Finally it was verified that there was no (significant) regression for

4 threads:
ELO: 0.09 +-2.8 (95%) LOS: 52.4%
Total: 19908 W: 3422 L: 3417 D: 13069

2 threads:
ELO: 0.38 +-3.0 (95%) LOS: 60.0%
Total: 19044 W: 3480 L: 3459 D: 12105

1 thread:
ELO: -1.27 +-2.1 (95%) LOS: 12.3%
Total: 40000 W: 7829 L: 7975 D: 24196

Resolves #258
Commits on Feb 17, 2015
@mcostalba mcostalba Simplify attackUnits formula
Use '/ 8' instead of '* 31 / 256'

Passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55077 W: 10999 L: 10940 D: 33138

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 14751 W: 2530 L: 2400 D: 9821

bench: 7911944
Commits on Feb 18, 2015
@mcostalba mcostalba Compute SplitPoint::spLevel on the fly
And retire a redundant field. This is important also
from a concept point of view becuase we want to keep
SMP structures as simple as possible with the only
strictly necessary data.

Verified with

dbg_hit_on(sp->spLevel != level)

that the values are 100% the same out of more 50K samples.

No functional change.
Commits on Feb 19, 2015
@mcostalba mcostalba Remove useless condition in late join
In case of Threads.size() == 2 we have that sp->allSlavesSearching
is always false (because we have finished our search), bestSp is
always NULL and we never late join, so there is no need to special
case here.

Tested with dbg_hit_on(sp && sp->allSlavesSearching) and
verified it never fires.

No functional change.
@mcostalba mcostalba Add a couple of asserts to late join
Document and clarify that we cannot rejoin on ourselves
and that we never late join if we are master and all
slaves have finished, inded in this case we exit idle_loop.

No functional change.
@mcostalba mcostalba Fix a warning under MSVC
Assignment of size_t to int.

No functional change.
@mcostalba mcostalba Retire redundant sp->slavesCount field
It should be used slavesMask.count() instead.

Verified 100% equivalent when sp->allSlavesSearching:

dbg_hit_on(sp->allSlavesSearching, sp->slavesCount != sp->slavesMask.count());

No functional change.
@mcostalba mcostalba Use size_t consistently across thread code
No functional change.
@mcostalba mcostalba Clarify we don't late join with only 2 threads
Thanks to Gary for pointing this out.

No functional change.
Commits on Feb 20, 2015
@mcostalba mcostalba Sync with master
bench: 7911944
@mcostalba mcostalba Use range-based-for in late join
No functional change.
@mcostalba mcostalba Improve comments in SMP code
No functional change.
@snicolet snicolet Mobile phalanxes
Try to create mobile phalanxes

LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 52393 W: 10912 L: 10656 D: 30825

LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 30398 W: 5315 L: 5063 D: 20020

Bench: 8253813

Resolves #261
@snicolet snicolet Fix comment for kingAdjacentZoneAttacksCount
The comment for kingAdjacentZoneAttacksCount[] was bogus, using
reversed semantics for color.

No functional change

Resolves #262
Commits on Feb 21, 2015
@mcostalba mcostalba Use sp->master instead of bestThread
Verified with:

dbg_hit_on(th != sp->master);

It is 100% equivalent on more than 200K hits.

No functional change.
@mcostalba mcostalba Further refine SMP code
Backported from C++11 branch:


Fully verified it is equivalent to master (see log msg
of individual commits for details).

No functional change.
Commits on Feb 22, 2015
@mcostalba mcostalba Use only 'level' as late join metric
It seems other metric are useless, this allow us
to simplify the code and to prune useless stuff.

STC 20K games 4 threads
ELO: -0.76 +-2.8 (95%) LOS: 29.9%
Total: 20000 W: 3477 L: 3521 D: 13002

STC 10K games 16 threads
ELO: 1.36 +-3.9 (95%) LOS: 75.0%
Total: 10000 W: 1690 L: 1651 D: 6659

bench: 8253813
@mcostalba mcostalba Sync with master
bench: 8253813
@mcostalba mcostalba Fix build under OS X
Reported by Vince Negri

No functional change.
Commits on Feb 23, 2015
@mcostalba mcostalba Sync with master
bench: 8253813
@mcostalba mcostalba Introduce Spinlock class
Initialization is more complex than what I'd like due
to MSVC compatibility that for some reason does not like:

std::atomic_flag lock = ATOMIC_FLAG_INIT;

No functional change.
@mcostalba mcostalba Use spinlock instead of mutex for Threads and SplitPoint
It is reported to be defenitly faster with increasing
number of threads, we go from a +3.5% with 4 threads
to a +15% with 16 threads.

The only drawback is that now when testing with more
threads than physical available cores, the speed slows
down to a crawl. This is expected and was similar at what
we had setting the old sleepingThreads to false.

No functional change.
@mcostalba mcostalba Improve spinlock implementation
Calling lock.test_and_set() in a tight loop creates expensive
memory synchronizations among processors and penalize other
running threads. So syncronize only only once at the beginning
with fetch_sub() and then loop on a simple load() that puts much
less pressure on the system.

Reported about 2-3% speed up on various systems.

Patch by Ronald de Man.

No functional change.
Commits on Feb 24, 2015
@mcostalba mcostalba Small tweaks in pawns.cpp
No functional change.
@mcostalba mcostalba Don't assume the type of Time::point
But instead use the proper definition. Also
rewrite chrono functions while there.

No functional change.
Commits on Feb 26, 2015
@Rocky640 Rocky640 Apex Pawns
Pawns which are supported already have a bonus. Apex are pawns which are
supported twice.
This patch gives an additional 50% bonus for them.

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 6549 W: 1333 L: 1209 D: 4007

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 18002 W: 3037 L: 2850 D: 12115

Bench: 8069601

Resolves #267
@mcostalba mcostalba Normalize twice supported pawns
Align codying style to current conventions and move
formula for twice supported pawns to Pawns::init()
where it should be.

No functional change.
@mcostalba mcostalba Sync with master
bench: 8069601
Commits on Feb 27, 2015
@mcostalba mcostalba Retire apply_weight()
Use the more natural operator*() instead.

No functional change.
Commits on Feb 28, 2015
@snicolet snicolet Raise penalty for knight attacked by pawn
Raise a bit the penalty for knight attacked by pawn.

LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 27744 W: 5563 L: 5380 D: 16801

LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 87571 W: 14747 L: 14314 D: 58510

Bench: 8285241

Resolves #270
@snicolet snicolet Simplify pawn code a bit
Simplify a bit the number of bitwise operators used to calculate the
pawn evaluation in pawns.cpp

No functional change.

Resolves #269
@mcostalba mcostalba Sync with master
bench: 8285241
Commits on Mar 01, 2015
@mcostalba mcostalba Rename available_to()
Change this API to be more natural and simple.

Inspired by a patch by Joona.

No functional change.
@mcostalba mcostalba Allow to disable spinlocks
And use mutex instead. You may never want to do this.
It is a workaround to run c++11 on fishtest where many
machiens have HTenabled and this can be a problem when
number of cores set is higher than number of physical cores.

To disable spinlocks, just compile with -DNO_SPINLOCK flag

No functional change.
Commits on Mar 02, 2015
@mcostalba mcostalba Disable spinlocks
Now that c++11 branch has been merged in master,
disable unconditionally the spinlocks and use mutex
instead. This will allow to run fishtest even on HT
machines withouth changes.

In the future we will reintorduce spinlocks, once
we will have took care of fishtest.

No functional change.
Commits on Mar 05, 2015
@snicolet snicolet Update Makefile for Mac OS X compilation
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.

To use the default (fastest) settings, compile with:

make build ARCH=x86-64-modern

To test the clang settings, compile with

make build ARCH=x86-64-modern COMP=clang

Beware that the clang settings may provide a slightly slower (6%)

No functional change

Resolves #275
Commits on Mar 07, 2015
@zamar zamar Revert C++11 merge
Restore the state of repo back to commit 'Simplify pawn code a bit' (1e6d21d)

No functional change
@mcostalba mcostalba Re-enable spinlocks
For branch C++11, that doe snot run on fishtest,
there is no need of this kludge, let only master
have it.

No functional change.
@snicolet snicolet Update Makefile for Mac OS X compilation
This change in the Makefile restores the possibility to compile
Stockfish on Mac OS X 10.9 and 10.10 after the C++11 has been merged.

To use the default (fastest) settings, compile with:

make build ARCH=x86-64-modern

To test the clang settings, compile with

make build ARCH=x86-64-modern COMP=clang

Beware that the clang settings may provide a slightly slower (6%)

Backported from master.

No functional change

Resolves #275
@hxim hxim Transform minKingPawnDistance into a local variable
minKingPawnDistance is used only as local variable in one place so we don't need it to be part of "Pawns::Entry" structure.

No functional change.

Resolves #277
@mcostalba mcostalba Sync with master
No functional change.
@mcostalba mcostalba Sync with master
bench: 8285241
Commits on Mar 10, 2015
@mcostalba mcostalba Add thread_win32.h header
Workaround slow std::thread implementation in mingw
and gcc for Windows with our own old low level thread

No functional change.
@mcostalba mcostalba Disable spinlocks
To allow testing on fishtest.

No functional change.
@mcostalba mcostalba Cleanup thread_win.h
No functional change.
Commits on Mar 11, 2015
@mcostalba mcostalba Retire spinlocks
Use Mutex instead.

This is in preparaation for merging with master branch,
where we stilll don't have spinlocks.

Eventually spinlocks will be readded in some future
patch, once c++11 has been merged.

No functional change.
Joona Kiiski Use thread specific mutexes instead of a global one.
This is necessary to improve the scalability with high number of cores.

There is no functional change in a single thread mode.

Resolves #281
Commits on Mar 12, 2015
Stefan Geschwentner Introduce Counter Move History tables
Introduce a counter move history table which additionally is indexed by the last move's piece and target square.
For quiet move ordering use now the sum of standard and counter move history table.

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 4747 W: 1005 L: 885 D: 2857

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 5726 W: 1001 L: 872 D: 3853

Because of reported low NPS on multi core test
STC (7 threads):
ELO: 7.26 +-3.3 (95%) LOS: 100.0%
Total: 14937 W: 2710 L: 2398 D: 9829

Bench: 7725341

Resolves #282
mstembera New easy move implementation
Spend much less time in positions where one move is much better than all other alternatives.
We carry forward pv stability information from the previous search to identify such positions.
It's based on my old InstaMove idea but with two significant improvements.

1) Much better instability detection inside the search itself.
2) When it's time to make a FastMove we no longer make it instantly but still spend at least 10% of normal time verifying it.

Credit to Gull for the inspiration.
BIG thanks to Gary because this would not work without accurate PV!

ELO: 8.22 +-3.0 (95%) LOS: 100.0%
Total: 20000 W: 4203 L: 3730 D: 12067

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 23266 W: 4662 L: 4492 D: 14112

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12470 W: 2091 L: 1931 D: 8448

Resolves #283
Commits on Mar 13, 2015
@zamar zamar Remove check for gcc version from Makefile.
This check is obsolete.
very old gcc versions can't compile c++11 code.

No functional change

Resolves #285
Commits on Mar 14, 2015
@zamar zamar Introduce yielding spin locks
Idea and original implementation by Stephane Nicolet

7 threads 15+0.05
ELO: 3.54 +-2.9 (95%) LOS: 99.2%
Total: 17971 W: 2976 L: 2793 D: 12202

There is no functional change in single thread mode
@mcostalba mcostalba Link with -static in mingw
Fixes reported startup error about missing libwinpthread-1.dll
when the dll is not in the path.

The current -static-xxxx flags, introduced with:


Only take in account standard libraries, but not thread

No functional change.

Resolves #289
@joergoster joergoster New values for Mobility and Outposts.
Both are the result of a SPSA tuning session with a custom book, 50k iterations each.

After an additional tuning session of the mobility values, tuning the delta values, with following result.

40k games at 9+0.05:
ELO: 4.13 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 8581 L: 8106 D: 23313

and LTC
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 36518 W: 6049 L: 5782 D: 24687

Bench: 8567402

Resolves #284
Commits on Mar 15, 2015
@zamar zamar Do not sleep, but yield
During the search, do not block on condition variable, but instead use std::this_thread::yield().

Clear gain with 16 threads. Again results vary highly depending on hardware, but on average it's a clear gain.

ELO: 12.17 +-4.3 (95%) LOS: 100.0%
Total: 7998 W: 1407 L: 1127 D: 5464

There is no functional change in single thread mode

Resolves #294
@zamar zamar Fix dependency generation for C++11
No functional change

Resolves #291
@cuddlestmonkey cuddlestmonkey Fix dependency generation for MacOSX
No functional change

Resolves #290
Commits on Mar 16, 2015
@mcostalba mcostalba Use acquire() and release() for spinlocks
It is more idiomatick than lock() and unlock()

No functional change.
@mcostalba mcostalba Re-arrange history update code
Unify the quites moves loop for both cases,
the compiler optimizes away the

 if (is_ok((ss-1)->currentMove))

inside loop, so that the result is same
speed as original.

No functional change.
Commits on Mar 17, 2015
@mcostalba mcostalba Fix a bogus use of mutex
Spinlock must be used instead.

Tested for no regression at 15+0.05 th 4:

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 25928 W: 4303 L: 4190 D: 17435

No functional change.

Resolves #297
Commits on Mar 18, 2015
@mcostalba mcostalba Simplify nosleep logic
Avoid redundant 'while' conditions. It is enough to
check them in the outer loop.

Quick tested for no regression 10K games at 4 threads
ELO: -1.32 +-3.9 (95%) LOS: 25.6%
Total: 10000 W: 1653 L: 1691 D: 6656

No functional change.
@mcostalba mcostalba Reformat FastMove
Align to SF coding style.

Verified no regression:

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 55938 W: 10893 L: 10835 D: 34210

No functional change.
@lucasart lucasart connected should be bool, not Bitboard
There's no reason to define it as a Bitboard, so for consistency, use bool.

This is even a speedup on my machine: i7-3770k, using gcc 4.9.1 (linux):

    stat        test     master    diff
    mean   2,341,338  2,327,998  13,134
    stdev     15,765     14,717   5,405

    speedup       0.56%
    P(speedup>0) 100.0%

No functional change.

Resolves #298
@joergoster joergoster Fix the comment for Position::is_draw()
We no longer check for insufficient material.

No functional change

Resolves #299
Commits on Mar 20, 2015
@joergoster joergoster Tuned mobility with another SPSA run
Further improved mobility values after another SPSA session, 50k

Elo measure at very fast 9+0.05":
ELO: 3.40 +-2.2 (95%) LOS: 99.9%
Total: 40000 W: 8434 L: 8042 D: 23524

and LTC SPRT[0, 4]:
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 11052 W: 1874 L: 1687 D: 7491

Bench: 8226843

Resolves #301
@mcostalba mcostalba Retire ConditionVariable
Now that we use spinlocks everywhere and don't put
threads to sleep while idle, we can use the slower
(but no more in hot path) std::condition_variable_any
instead of our homwgrown ConditionVariable struct.

Verified fo rno regression at STC with 7 threads:
ELO: -0.66 +-2.7 (95%) LOS: 31.8%
Total: 20000 W: 3210 L: 3248 D: 13542

No functional change
@lucasart lucasart Fix comment
We always probe, but we do not prune at PV nodes.

No functional change.

Resolves #300
Commits on Mar 21, 2015
@mcostalba mcostalba Use only one ConditionVariable to sync UI
To sync UI with main thread it is enough a single
condition variable because here we have a single
producer / single consumer design pattern.

Two condition variables are strictly needed just for
many producers / many consumers case.

Note that this is possible because now we don't send to
sleep idle threads anymore while searching, so that now
only UI can wake up the main thread and we can use the
same ConditionVariable for both threads.

The natural consequence is to retire wait_for_think_finished()
and move all the logic under MainThread class, yielding the
rename of teh function to join()

No functional change.
Commits on Mar 23, 2015
@mcostalba mcostalba Get rid of nativeThread
No functional change.
@mcostalba mcostalba Double magics generation speed
Profiling shows that resetting attacks table after
a failed candidate magic attempt is the biggest
time consumer, so rewrite the logic avoiding the

Magics init for rook+bishop goes from 200msecs to
under 100msec.

No functional change.
@mcostalba mcostalba Allow Bitbases::init() to be called more than once
Currently if we call it more than once, we crash.

This is not a real problem, because this function is
indeed called just once. Nevertheless with this small fix,
that gets rid of a hidden 'static' variable, we cleanly
resolve the issue.

While there, fix also ThreadPool::exit to return in a
consistent state. Now all the init() functions but
UCI::init() are reentrant and can be called multiple

No functional change.
Commits on Mar 24, 2015
@zamar zamar Fully yielding locks, no spinning
7 threads:

ELO: 2.00 +-2.7 (95%) LOS: 92.4%
Total: 20000 W: 3276 L: 3161 D: 13563

There is no functional change in single thread mode

Resolves #304
@VoyagerOne VoyagerOne Introduce a new counter move history penalty
Extra penalty for TT move in previous ply when it gets refuted


LLR: 2.94 (-2.94,2.94) [-1.50,4.50]
Total: 31303 W: 6216 L: 6025 D: 19062


LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 6950 W: 1189 L: 1054 D: 4707

Bench: 8191926

Resolves #309
@joergoster joergoster Tuned values for the pawn piece square table
Quick measure at very fast tc:
ELO: 4.77 +-2.2 (95%) LOS: 100.0%
Total: 40124 W: 8711 L: 8160 D: 23253

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 52284 W: 8880 L: 8559 D: 34845

Bench: 8865736

Resolves #311
Commits on Mar 25, 2015
@mcostalba mcostalba Clean up previous patch
No functional change.
Commits on Mar 28, 2015
@VoyagerOne VoyagerOne Use CounterMoveHistory when calculating LMR for cut nodes
If the sum of CounterMoveHistory heuristic and History heuristic is below zero,
then reduce an extra ply in cut nodes

LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 6479 W: 1099 L: 967 D: 4413

Bench: 7773299

Resolves #315
@mbootsector mbootsector Retire follow-up move heuristic
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34891 W: 6904 L: 6808 D: 21179

LLR: 3.10 (-2.94,2.94) [-3.00,1.00]
Total: 182653 W: 29866 L: 29993 D: 122794

Bench: 8396161

Resolves #310
@ajithcj ajithcj Give a reduced bonus for threats by hanging pawns
Passed STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 105539 W: 20389 L: 20001 D: 65149

and LTC:
LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 9629 W: 1577 L: 1432 D: 6620

Bench: 7658627

Resolves #317
@Rocky640 Rocky640 PSV3_1
Small speed-up in pawn.cpp
Results for 10 tests for each version:

Base      Test      Diff
Mean    1435636   1445238   -9602
StDev   22576     23189     1848

p-value: 1
speedup: 0.007

No functional change

Resolves #295
Commits on Mar 29, 2015
@lucasart lucasart Remove some difficult to understand C++11 constructs
Code like this is more a case of showing off one's C++ knowledge, rather than
using it adequately, IMHO.

**First loop (std::generate)**

Iterators are inadequate here, because they lose the key information which is
idx. As a result, we need to carry a redundant idx variable, and increment it
along the way. Very clumsy.
Usage of std::generate and a lambda function only obfuscate the code, which is
merely a simple and stupid loop over the elements of a vector.

**Second loop (std::accumulate)**

This code is thoroughlly incomprehensible. Restore the original, which was much
simpler to understand.

**Third loop (range based loop)**

Again, a range based loop is inadequate, because we lose idx! To resolve this
artificially created problem, the data model was made redundant (idx is a data
member of db[] elements!?), which is ugly and unjustified. A simple and stupid
for loop with idx does the job much better.

No functional change.

Resolves #313
@zamar zamar Fix indentations for hanging pawns code
No functional change
@mcostalba mcostalba Assorted code style of latest commits
No functional chnage.
Commits on Apr 02, 2015
mstembera Simplification to use only one counter move.
LLR: 2.95 (-2.94,2.94) [-3.50,0.50]
Total: 18868 W: 3638 L: 3530 D: 11700

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 69767 W: 11019 L: 10973 D: 47775

Extracted from

Original patch by hxim.  All credit goes to him.

Bench: 7664249

Resolves #320
Commits on Apr 03, 2015
@mcostalba mcostalba Introduce elapsed_time()
And reformat a bit time manager code.

Note that now we set starting search time in think() and
no more in ThreadPool::start_thinking(), the added delay
is less than 1 msec, so below timer resolution (5msec) and
should not affect time lossses ratio.

No functional change.
@mcostalba mcostalba Rename of TimeMgr and friends
More natural naming IMO.

No functional change.
@mcostalba mcostalba Add support for playing in 'nodes as time' mode
When running more games in parallel, or simply when running a game
with a background process, due to how OS scheduling works, there is no
guarantee that the CPU resources allocated evenly between the two
players. This introduces noise in the result that leads to unreliable
result and in the worst cases can even invalidate the result. For
instance in SF test framework we avoid running from clouds virtual
machines because are a known source of very unstable CPU speed.

To overcome this issue, without requiring changes to the GUI, the idea
is to use searched nodes instead of time, and to convert time to
available nodes upfront, at the beginning of the game.

When nodestime UCI option is set at a given nodes per milliseconds
(npmsec), at the beginning of the game (and only once), the engine
reads the available time to think, sent by the GUI with 'go wtime x'
UCI command. Then it translates time in available nodes (nodes =
npmsec * x), then feeds available nodes instead of time to the time
management logic and starts the search. During the search the engine
checks the searched nodes against the available ones in such a way
that all the time management logic still fully applies, and the game
mimics a real one played on real time. When the search finishes,
before returning best move, the total available nodes are updated,
subtracting the real searched nodes. After the first move, the time
information sent by the GUI is ignored, and the engine fully relies on
the updated total available nodes to feed time management.

To avoid time losses, the speed of the engine (npms) must be set to a
value lower than real speed so that if the real TC is for instance 30
secs, and npms is half of the real speed, the game will last on
average 15 secs, so much less than the TC limit, providing for a
safety 'time buffer'.

There are 2 main limitations with this mode.

1. Engine speed should be the same for both players, and this limits
the approach to mainly parameter tuning patches.

2. Because npms is fixed while, in real engines, the speed increases
toward endgame, this introduces an artifact that is equivalent to an
altered time management. Namely it is like the time management gives
less available time than what should be in standard case.

May be the second limitation could be mitigated in a future with a
smarter 'dynamic npms' approach.

Tests shows that the standard deviation of the results with 'nodestime'
is lower than in standard TC, as is expected because now all the introduced
noise due the random speed variability of the engines during the game is
fully removed.

Original NIT idea by Michael Hoffman that shows how to play in NIT mode
without requiring changes to the GUI. This implementation goes a bit
further, the key difference is that we read TC from GUI only once upfront
instead of re-reading after every move as in Michael's implementation.

No functional change.
@mcostalba mcostalba Fix elapsed()
Messed up during merge.

No functional change.
@mcostalba mcostalba Fix MSVC warning from previous patch
No functional change.
Commits on Apr 09, 2015
@snicolet snicolet Use minimumSplitDepth = 5
Using minimumSplitDepth = 5 seems to be the best compromise in the
current SMP implementation

STC, 11 threads:

ELO: 14.87 +-4.1 (95%) LOS: 100.0%
Total: 8509 W: 1497 L: 1133 D: 5879

STC, 4 threads:

ELO: 0.30 +-2.8 (95%) LOS: 58.2%
Total: 20000 W: 3365 L: 3348 D: 13287

STC, 2 threads:

ELO: -1.02 +-2.0 (95%) LOS: 16.4%
Total: 40000 W: 7087 L: 7204 D: 25709

Resolves #324
@lucasart lucasart Prune evasions when we can castle
A minor simplification.


LLR: 2.95 (-2.94,2.94) [-3.50,0.50]
Total: 67877 W: 12882 L: 12904 D: 42091


LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20677 W: 4023 L: 3901 D: 12753


LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 12221 W: 2022 L: 1888 D: 8311

Bench: 7911336

Resolves #326
Stefan Geschwentner update stats also in check
Update stats also if in check (drop condition).

LLR: 3.22 (-2.94,2.94) [-3.00,1.00]
Total: 87472 W: 16929 L: 16913 D: 53630

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 39971 W: 6436 L: 6345 D: 27190

Bench: 7086031

Resolves #327
Commits on Apr 10, 2015
mstembera New formula for quiet move scoring: 3 * cmh + 1 * hist

LLR: 2.97 (-2.94,2.94) [-1.50,4.50]
Total: 45363 W: 8759 L: 8532 D: 28072


LLR: 3.51 (-2.94,2.94) [0.00,4.00]
Total: 125092 W: 20032 L: 19468 D: 85592

Bench: 7058819

Resolves #328
Stefan Geschwentner Update stats at pv nodes
If a quiet best move is found at a pv node then always update stats.

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 41485 W: 8047 L: 7830 D: 25608

LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 14351 W: 2420 L: 2250 D: 9681

Bench: 6985247

Resolves #330
@mcostalba mcostalba Allow Position::init() to be called more than once
Currently Zobrist::castling[] are not properly zeroed
and rely on the compiler to do this at startup, but this
makes Position::init() to set different values every time
it is called!

This is a bit odd, and although not impacting normal usage,
can yield to subtle misbehaviour, very difficult to track
down, in case we happen to call it more than once for some
reason. I found this while developing tuning support and
it took me a while to track it down.

So properly init Zobrist::castling[]

No functional change.

Resolves #329
Commits on Apr 11, 2015
@mcostalba mcostalba Assorted cleanup of last patches
No functional change.
Commits on Apr 12, 2015
@VoyagerOne VoyagerOne Removed extra condition (history < 0) in LMR to help sync up with mov…
…e ordering.

LMR condition is now cmh+history<0
Instead of history<0 OR cmh+history<0

LLR: 2.96 (-2.94,2.94) [-3.00, 1.00]
Total: 26446 W: 5092 L: 4980 D: 16374

LLR: 2.96 (-2.94,2.94) [-3.00, 1.00]
Total: 14129 W: 2340 L: 2209 D: 9580

Bench: 7815183

Resolves #331
Commits on Apr 15, 2015
@lucasart lucasart Retire FORCE_INLINE
No speed regression on my machine (i7-3770k, gcc 4.9.1, linux 3.16):

        stat        test     master   diff
        mean   2,482,415  2,474,987  7,906
        stdev      4,603      5,644  2,497

        speedup        0.32%
        P(speedup>0)  100.0%

Fishtest 9+0.03:

ELO: 0.26 +-1.8 (95%) LOS: 61.2%
Total: 60000 W: 12437 L: 12392 D: 35171

No functional change.

Resolves #334
Commits on Apr 18, 2015
@Rocky640 Rocky640 Exclude queen from Rook Contact Check computation
In ei.attackedBy, Queen does not x-ray through Rook, but the Rook does
X-ray through the Queen.

So most of the rook contact checks supported by queen are, in fact,
Queen Contact Checks and they are already scored separately.

Bench: 7762189

Resolves #338
Commits on Apr 26, 2015
@VoyagerOne VoyagerOne Change extra ply LMR condition to: cmh <= 0 && hist < 0
Extra ply LMR condition is now cmh <= 0 && h < 0
Instead of cmh + h < 0

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 55210 W: 10812 L: 10557 D: 33841

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 13212 W: 2239 L: 2045 D: 8928

Bench: 8420865

Resolves #339
Commits on Apr 28, 2015
@Stefano80 Stefano80 Replace MVV/LVA by MVV for good captures
Passed STC

LLR: 3.71 (-2.94,2.94) [-3.00,1.00]
Total: 64363 W: 12299 L: 12214 D: 39850

and LTC

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 69976 W: 11056 L: 11011 D: 47909

Bench: 8012532

Resolves #340
Commits on May 03, 2015
@Stefano80 Stefano80 Improve ordering of good captures using rank term
Rank based term improved approximation of pos.see() for scoring good

LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 4632 W: 945 L: 827 D: 2860

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 25770 W: 4184 L: 3964 D: 17622

Bench: 7593704

Resolves #342
@mcostalba mcostalba Split PSQT init from Position init
Easier for tuning psq tables:

TUNE(myParameters, PSQT::init);

Also move PSQT code in a new *.cpp file, and retire the
old and hacky psqtab.h that required to be included only
once to work correctly, this is not idiomatic for a header

Give wide visibility to psq tables (previously visible only
in position.cpp), this will easy the use of psq tables outside
Position, for instance in move ordering.

Finally trivial code style fixes of the latest patches.

Original patch of Lucas Braesch.

No functional change.
@mcostalba mcostalba Halve PSQT row data
Use symmetry along vertical middle axis of the board
to reduce the number of parameters.

For instance psqt value of SQ_A5 == SQ_A4 and value of
SQ_F8 == SQ_F1.

This is always true, at least until now nobody came in
with an asymmetric psqt table that worked.

Original patch by Lucas.

No functional change.
Commits on May 06, 2015
@lucasart lucasart Never clear stats
Based on an idea and patch by VoyagerOne.

Small simplification, but was tedted for an ELO gain anyway.

LLR: 2.95 (-2.94,2.94) [-1.00,4.00]
Total: 5375 W: 1119 L: 977 D: 3279

LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 17893 W: 2984 L: 2792 D: 12117

bench 8322847
Commits on May 07, 2015
@lucasart lucasart Restore deterministic search state
Introduce helper function Search::reset() which clears all kind of search
memory, in order to restore a deterministic search state.

Generalize TT.clear() into Search::reset() for the following use cases:
- bench: needed to guarantee deterministic bench (ie. if you call bench from
interactive command line twice in a row you get the same value).
- Clear Hash: restore clean search state, which is the purpose of this button.
- ucinewgame: ditto.

No functional change.

Resolves #346
Commits on May 09, 2015
@lucasart lucasart Edge distance
Instead of crafting a clever formula to calculate the array offset, simply use a
3 dimensional array. Remove the comment while at it, because now the code is

No functional change.

Resolves #344
mstembera Smart TT save
Don't overwrite more valuable data with less valuable data

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 21132 W: 4108 L: 3946 D: 13078

LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 13381 W: 2149 L: 1987 D: 9245

STC 16MB regression w/ zero effective hash pressure
LLR: 2.96 (-2.94,2.94) [-5.00,0.00]
Total: 18944 W: 3607 L: 3564 D: 11773

Bench: 8787152

Resolves #347
Commits on May 10, 2015
@mcostalba mcostalba Cleanup work in misc.cpp
Also some code style tidy up of latest patches.

Also renamed checkSq -> checkSquares because it
is a bitboard and not a square.

No functional change.
Commits on May 18, 2015
@lucasart lucasart Tuned PSQT
LLR: 3.11 (-2.94,2.94) [-0.50,4.50]
Total: 58764 W: 11530 L: 11185 D: 36049

LLR: 2.96 (-2.94,2.94) [0.00,4.50]
Total: 282710 W: 46339 L: 45209 D: 191162

Bench: 8512947

Resolves #349
Stefan Geschwentner Remove Gain Stats
Additionally in futility pruning the margin is raised for compensation.

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 48481 W: 9229 L: 9156 D: 30096

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 32058 W: 5134 L: 5031 D: 21893

Bench: 8098149

Resolves #350
Commits on May 21, 2015
@lucasart lucasart Fix merge error for Tuned PSQT
Fall-out from 411e704

Bench: 7907776

Resolves #352
Commits on May 24, 2015
@zamar zamar Resolve build failure for Mac
Remove '-Wl' switch from gcc arguments when compiling for Mac

No functional change

Resolves #353
Commits on May 27, 2015
@lucasart lucasart Simplify backward pawn scoring

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52322 W: 10011 L: 9945 D: 32366


LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 14143 W: 2334 L: 2203 D: 9606

Bench: 7976423

Resolves #354
Commits on May 29, 2015
@mcostalba mcostalba Checking for rook color when setting castling
In Chess960 we can have legal positions with
opponent rook in A or H file and with castling
available, for instance:

4k3/pppppppp/8/8/8/8/PPPPPPPP/rR2K3 w Q - 0 1

In those cases we pick up the wrong rook when
setting castling.

Fix it by checking the color of the rook.

Bug reported by Matthew Lai.

No functional change.
Commits on Jun 02, 2015
@snicolet snicolet Tweak backward pawns definition
Advanced pawns cannot be backward. Also lower the backward penalty in

Passed STC:
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 18534 W: 3588 L: 3433 D: 11513

and LTC:
LLR: 2.96 (-2.94,2.94) [0.00,6.00]
Total: 21319 W: 3415 L: 3217 D: 14687

Bench: 7271152

Resolves #359
@lucasart lucasart Tune pawn shelter/storm
LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 31679 W: 6183 L: 5912 D: 19584

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 43550 W: 6885 L: 6602 D: 30063

Bench: 9219343

Resolves #360
Commits on Jun 04, 2015
@mcostalba mcostalba Rename stages
Hopefully more clear.

No functional change.
Commits on Jun 05, 2015
@cuddlestmonkey cuddlestmonkey Remove intermediate re-search in LMR
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20149 W: 3830 L: 3707 D: 12612

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 45384 W: 7089 L: 7006 D: 31289

Bench: 8110365

Resolves #361
Commits on Jun 07, 2015
@lucasart lucasart Simplify outpost evaluation

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 114149 W: 21986 L: 22032 D: 70131


LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52232 W: 8468 L: 8396 D: 35368

Bench: 6716940

No functional change

Resolves #363
Commits on Jun 13, 2015
@lucasart lucasart Retire -Wno-long-long
long long is part of the C++11 standard.

No functional change.

Resolves #364
Commits on Jun 17, 2015
@joergoster joergoster Small coding style fix for Outpost array
No functional change

Resolves #367
Commits on Jun 25, 2015
@mcostalba mcostalba Fix compile on icc
Error is:

  a value of type "int" cannot be assigned to an entity of type "Value"

Also retire the now unused squares_of_color() function.

No functional change.
Commits on Jun 27, 2015
@VoyagerOne VoyagerOne LMR Tweak: Decrease reduction if cmh>0 && history>0.
LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 9627 W: 1879 L: 1748 D: 6000

LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 21543 W: 3433 L: 3234 D: 14876

Bench: 8646407

Resolves #370 #371
Commits on Jun 28, 2015
@mcostalba mcostalba Correctly check for no-makefile compile
Under Windows with MSVC we use the IDE to compile,
in this case we infer some compiler flags usually
set by Makefile.

The condition to check this was wrong, namely when compiling
with mingw under Windows 64 bit we always set IS_64BIT and
USE_BSFQ even if compiled with ARCH=x86-32 (this is how I
found it).

Small code style touches while there.

No functional change.
Commits on Jul 04, 2015
@VoyagerOne VoyagerOne CMH Fix: Exclude captures for TT move refutation penalty
This will make sure we store only quiet moves for TT Penalty.

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 34748 W: 6617 L: 6420 D: 21711

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19975 W: 3259 L: 3137 D: 13579

Bench: 8063826

Resolves #373
Commits on Jul 09, 2015
@mcostalba mcostalba Remove useless razoring condition
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 20626 W: 3977 L: 3855 D: 12794

LLR: 3.10 (-2.94,2.94) [-3.00,1.00]
Total: 87334 W: 13675 L: 13648 D: 60011

Retire also the now unused pawn_on_7th() helper.

bench: 8248166
Commits on Jul 12, 2015
@joergoster joergoster Use distance<file>() function in endgame.cpp
This one occurance of distance function was most likely overlooked.

No functional change.

Resolves #376
Commits on Jul 15, 2015
mstembera Consistent TT replace policy
This fixes an inconsistency bug where TT entries were valued differently
depending on which pointer they were accessed through.

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11301 W: 2176 L: 2038 D: 7087

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 17732 W: 2870 L: 2745 D: 12117

LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 17401 W: 3324 L: 3227 D: 10850

Bench: 8248164

Resolves #377
@VoyagerOne VoyagerOne LMR Simplification: Remove countermove condition
Removed countermove condition for decreasing reduction.

LLR: 3.01 (-2.94,2.94) [-3.00,1.00]
Total: 32410 W: 5092 L: 4986 D: 22332

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 24450 W: 4632 L: 4517 D: 15301

Bench: 6943812

Resolves #378
Commits on Jul 16, 2015
@mcostalba mcostalba Fix formatting of previous patch
No functional change.
Commits on Jul 19, 2015
@mcostalba mcostalba Tidy up in movepick.cpp
Some formattng after recent changes.

No functional change.
Commits on Jul 24, 2015
mstembera Tuned version of TT replacement policy
If the used multiplier of 8 was any number larger than DEPTH_MAX
this would be a non functional patch.

LLR: 2.96 (-2.94,2.94) [-1.50,4.50]
Total: 16353 W: 3216 L: 3066 D: 10071

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 134618 W: 21276 L: 20716 D: 92626

LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 22549 W: 4257 L: 4178 D: 14114

Bench: 7372460

Resolves #380
Commits on Jul 29, 2015
@Rocky640 Rocky640 MobilityArea (simplified)
Based off of Pull request #383:

Include squares occupied by some pawns in the MobilityArea
a) not blocked
b) on rank 4 and above
c) or captures

Passed STC
LLR: 2.95 (-2.94,2.94) [-1.50,4.50]
Total: 8157 W: 1644 L: 1516 D: 4997

LLR: 2.97 (-2.94,2.94) [0.00,6.00]
Total: 26086 W: 4274 L: 4051 D: 17761


Then, a simplification test failed, trying to remove b and c)
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 6048 W: 1117 L: 1288 D: 3643

Another simplification test, was run to remove just (c)
Passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 28073 W: 5364 L: 5255 D: 17454

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 34652 W: 5448 L: 5348 D: 23856

A parameter tweak test showed that changing b) for "on rank 3 and above"
does not work
LLR: -2.95 (-2.94,2.94) [0.00,4.00]
Total: 5233 W: 937 L: 1077 D: 3219

Finally, a small rewrite, and we have this version

Include squares occupied by some pawns in the MobilityArea which are
a) not blocked
b) on rank 4 and above

Bench: 8977899

Resolves #385
@VoyagerOne VoyagerOne PV refutation penalty
Extra penalty for PV move in previous ply when it gets refuted.

LLR: 4.49 (-2.94,2.94) [-1.50,4.50]
Total: 41094 W: 7889 L: 7620 D: 25585

LLR: 2.95 (-2.94,2.94) [0.00,6.00]
Total: 12304 W: 1967 L: 1811 D: 8526

Bench: 8373608

Resolves #386
Commits on Jul 30, 2015
@mcostalba mcostalba Simplify IID depth formula
Restore original formula messed up during
half-ply removal.

LLR: 4.11 (-2.94,2.94) [-3.00,1.00]
Total: 21349 W: 4091 L: 3909 D: 13349

LLR: 5.42 (-2.94,2.94) [-3.00,1.00]
Total: 52819 W: 8321 L: 8122 D: 36376

bench: 8040572
Commits on Aug 04, 2015
@mcostalba mcostalba Rename Position::list
Use Position::square and Position::squares instead.

This allow us to remove king_square(), simplify
endgames and to have more naming uniformity.

Moreover, this is a prerequisite step in case
in the future we decide to retire piece lists
altoghter and use pop_lsb() to loop across
pieces and serialize the moves. In this way
we just need to change definition of Position::square
to something like:

template<PieceType Pt> inline
Square Position::square(Color c) const {
  return lsb(byColorBB[c]);

No functional change.
Commits on Aug 08, 2015
mstembera Revert TT replacement strategy changes (#380)
It could cause problems with high depths and long time controls

Bench: 8626315

Resolves #390
Commits on Aug 09, 2015
@DiscanX DiscanX Tuned values for mid and end game passed pawns.
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 22691 W: 4468 L: 4228 D: 13995

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 13620 W: 2216 L: 2023 D: 9381

Bench: 8384669

Resolves #391
Commits on Aug 15, 2015
mstembera TT entry value based on depth and relative age
Calculate TT replace value as depth minus eight times relative age.

LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 45258 W: 8595 L: 8279 D: 28384

LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 56401 W: 8809 L: 8489 D: 39103

LLR: 2.96 (-2.94,2.94) [-4.00,0.00]
Total: 34764 W: 6565 L: 6529 D: 21670

Bench: 9069474

Resolves #395
@mcostalba mcostalba Reformat PassedPawnsBonus
Align to SF coding standards.

No functional change.
Commits on Aug 17, 2015
@Rocky640 Rocky640 Retire PawnSafePush bonus
PawnSafePush, with the value S(5,5) proved not "necessary"
possibly due to recent changes to MobilityArea and other changes to Connected bonus.

LLR: 3.22 (-2.94,2.94) [-3.00,1.00]
Total: 98528 W: 18757 L: 18759 D: 61012

LLR: 5.30 (-2.94,2.94) [-3.00,1.00]
Total: 204194 W: 31698 L: 31734 D: 140762

Bench: 7620871

Resolves #396
Commits on Aug 20, 2015
@lucasart lucasart Retire dangerous flag
Replace by its value where it is used. Code is more clear that

No functional change.

Resolves #402
mstembera Better document entry age calculation used in TT replace.
No functional change.

Resolves #401
Resolves #400
Commits on Aug 28, 2015
@lucasart lucasart Prune castling moves
Align the behaviour with reductions. Initially castling moves had to be
treated differently, because the SEE did not handle them correctly. But now it

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 83750 W: 15722 L: 15711 D: 52317

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 97183 W: 15120 L: 15115 D: 66948

bench 7759837

Resolves #403
Commits on Aug 29, 2015
@mcostalba mcostalba Reformat trace code
Apart from usual renaiming, take advantage of
C++11 function template default parmeter to
get rid of Eval trampoline functions.

Some triviality fixes while there.

No functional change.
@zamar zamar History gravity
Instead of using hard coded Min and Max values for history,
always adjust the old value slightly downwards before adding a new value.

The adjustment acts like gravity that prevents the value escaping too
far from zero.

Bench: 8020484

Resolves #407
Commits on Aug 30, 2015
@guliashvili guliashvili A small code simplification
No functional change

Resolves #411
Commits on Sep 06, 2015
mstembera Fix syzygy en passant issue
v = value without ep capture being considered
v1 = value of the ep capture

The correct logic is:
if without e.p. capture we are losing, and the value of e.p is either draw, or win or "loss, but 50 move rule saves us", then we should use the value of ep capture.

Credit and thanks to syzygy1 and lantonov !

No functional change (except with syzygy bases)

Resolves #415
Resolves #394
Commits on Sep 07, 2015
mstembera Minor clean up of some function parameters
No function change

Resolves #416
Commits on Sep 10, 2015
@zamar zamar Careful SMP locking - Fix very occasional hangs
Louis Zulli reported that Stockfish suffers from very occasional hangs with his 20 cores machine.

Careful SMP debugging revealed that this was caused by "a ghost split point slave", where thread
was marked as a split point slave, but wasn't actually working on it.

The only logical explanation for this was double booking, where due to SMP race, the same thread
is booked for two different split points simultaneously.

Due to very intermittent nature of the problem, we can't say exactly how this happens.

The current handling of Thread specific variables is risky though. Volatile variables are in some
cases changed without spinlock being hold. In this case standard doesn't give us any kind of
guarantees about how the updated values are propagated to other threads.

We resolve the situation by enforcing very strict locking rules:
- Values for key thread variables (splitPointsSize, activeSplitPoint, searching)
can only be changed when the thread specific spinlock is held.
- Structural changes for splitPoints[] are only allowed when the thread specific spinlock is held.
- Thread booking decisions (per split point) can only be done when the thread specific spinlock is held.

With these changes hangs didn't occur anymore during 2 days torture testing on Zulli's machine.

We probably have a slight performance penalty in SMP mode due to more locking.

STC (7 threads):
ELO: -1.00 +-2.2 (95%) LOS: 18.4%
Total: 30000 W: 4538 L: 4624 D: 20838

However stability is worth more than 1-2 ELO points in this case.

No functional change

Resolves #422
Commits on Sep 15, 2015
Stefan Geschwentner Scales the endgame score by the number of pawns.
Credits goes also to Stephane Nicolet for his great idea of scaling by pawns.

LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 9994 W: 1929 L: 1760 D: 6305

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 11240 W: 1789 L: 1626 D: 7825

bench 7298564

Resolves #423
Commits on Sep 18, 2015
mstembera Remove unnecessary generation check in TT save
Checking for generation is unnecessary because if the key matches then the entry was probed and refreshed earlier.

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 57391 W: 10671 L: 10613 D: 36107

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 60732 W: 9260 L: 9199 D: 42273

LLR: 2.95 (-2.94,2.94) [-4.00,0.00]
Total: 23443 W: 4369 L: 4293 D: 14781

No functional change

Resolves #427
mstembera Reduce writes in TT::probe().
Only refresh TT entry when it's really necessary.
This should give a small speed boost for some machines.
And it's a risk-free change.

No functional change.

Resolves #429
Commits on Sep 19, 2015
@jcalovski jcalovski Refine ranks and increase resulting bonus.
LLR: 2.94 (-2.94,2.94) [0.00,4.00]
Total: 272379 W: 51773 L: 50658 D: 169948

LLR: 3.06 (-2.94,2.94) [0.00,4.00]
Total: 41504 W: 6555 L: 6273 D: 28676

bench: 7658406

Resolves #430
Commits on Sep 30, 2015
@mcostalba mcostalba Rework lock protecting
When changing 'search' and 'splitPointsSize' we have to
use thread locks, not split point ones, because can_join()
is called under the formers.

Verified succesfully with 24 hours toruture tests with 20
cores machine by Louis Zulli: it does not hangs.

Verifyed for no regressions with STC, 7 threads:
LLR: 2.94 (-2.94,2.94) [-3.00,1.00]
Total: 52804 W: 8159 L: 8087 D: 36558

No functional change.
Commits on Oct 03, 2015
@jcalovski jcalovski Bonus for checking moves
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 14531 W: 2765 L: 2576 D: 9190

LLR: 3.20 (-2.94,2.94) [0.00,5.00]
Total: 52518 W: 8107 L: 7782 D: 36629

Bench: 7556477

Resolves #435
Stefan Geschwentner File based passed pawn bonus
Add file based bonus for passed pawns. Values tuned by SPSA.

LLR: 3.33 (-2.94,2.94) [0.00,5.00]
Total: 36889 W: 6805 L: 6507 D: 23577

LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 32301 W: 5101 L: 4858 D: 22342

Bench: 8073614

Resolves #436
Commits on Oct 05, 2015
@mcostalba mcostalba Run PVS-STUDIO analyzer
Fix issues after a run of PVS-STUDIO analyzer.
Mainly false positives but warnings are anyhow
useful to point out not very readable code.

Noteworthy is the memset() one, where PVS prefers ss-2
instead of stack. This is because memeset() could
be optimized away by the compiler when using 'stack',
due to stack being a local variable no more used after
memset. This should normally not happen, but when
it happens it leads to very sublte and difficult
to find bug, so better to be safe than sorry.

No functional change.
@mcostalba mcostalba Fix a comment in TTEntry::save
Comment was slightly incorrect.

No functional change.
@mcostalba mcostalba Add Trevis CI support
Add Travis CI support to GitHub repo.

After every push to master, Travis will build
the sources directly from GitHub repo according
to .travis.yml and verify everything is ok.

No functional change.
Commits on Oct 06, 2015
@Rocky640 Rocky640 Remove queen threat evaluation
Threats by queen seem to be worthless.

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13627 W: 2607 L: 2473 D: 8547

LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 19146 W: 2950 L: 2827 D: 13369

Bench: 8222484

Resolves #439
@Stefano80 Stefano80 Tuning of assorted values
Passed STC

LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 45401 W: 8590 L: 8274 D: 28537

Passed LTC

LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 36089 W: 5589 L: 5331 D: 25169

Bench: 8397672

Resolves #445
@mcostalba mcostalba Travis CI: add clang and osx
Extend builds to clang and osx platforms.

And check bench numbers.

No functional change.
Commits on Oct 07, 2015
@mcostalba mcostalba Travis CI: add gcc 4.8 for osx
This setup was still missing.

Suggested by Stéphane Nicolet.

No functional change.
@jcalovski jcalovski Retire rook contact checks
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 34114 W: 6363 L: 6265 D: 21486

LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 61776 W: 9349 L: 9289 D: 43138

LTC (after rebasing):
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15261 W: 2343 L: 2214 D: 10704

Bench: 7523382

Resolves #442
Commits on Oct 12, 2015
@VoyagerOne VoyagerOne Combination of two ideas:
Apply bonus for the prior CMH that caused a fail low.

Balance Stats: CMH and History bonuses are updated differently.
This eliminates the "fudge" factor weight when scoring moves. Also
eliminated discontinuity in the gravity history stat formula. (i.e. stat
scores will no longer inverse when depth exceeds 22)

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 21802 W: 4107 L: 3887 D: 13808

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 46036 W: 7046 L: 6756 D: 32234

Bench: 7677367
Commits on Oct 16, 2015
@snicolet snicolet Asymmetry bonus for the attacking side
Use asymmetry in the position (king separation, pawn structure) to
compute an "initiative bonus" for the attacking side.

Passed STC:
LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 14563 W: 2826 L: 2636 D: 9101

And LTC:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 14363 W: 2317 L: 2141 D: 9905

Bench: 8116244

Resolves #462
Commits on Oct 20, 2015
@mbootsector mbootsector Lazy SMP
Start all threads searching on root position and
use only the shared TT table as synching scheme.

It seems this scheme scales better than YBWC for
high number of threads.

Verified for nor regression at STC 3 threads
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40232 W: 6908 L: 7130 D: 26194

Verified for nor regression at LTC 3 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 28186 W: 3908 L: 3798 D: 20480

Verified for nor regression at STC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 3607 W: 674 L: 526 D: 2407

Verified for nor regression at LTC 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 4235 W: 671 L: 528 D: 3036

Tested with fixed games at LTC with 20 threads
ELO: 44.75 +-7.6 (95%) LOS: 100.0%
Total: 2069 W: 407 L: 142 D: 1520

Tested with fixed games at XLTC (120secs) with 20 threads
ELO: 28.01 +-6.7 (95%) LOS: 100.0%
Total: 2275 W: 349 L: 166 D: 1760

Original patch of mbootsector, with additional work
from Ivan Ivec (log formula), Joerg Oster (id loop
simplification) and Marco Costalba (assorted formatting
and rework).

Bench: 8116244
Commits on Oct 21, 2015
@Stefano80 Stefano80 Almost passed tuning attempts
Collect and give a second try to some almost passed tuning attempts and
one-line tweaks from the last month.

Passed STC

LLR: 3.07 (-2.94,2.94) [0.00,4.00]
Total: 15124 W: 2974 L: 2756 D: 9394


LLR: 2.95 (-2.94,2.94) [0.00,4.00]
Total: 21577 W: 3507 L: 3289 D: 14781

Bench: 8855226

Resolves #464
Commits on Oct 22, 2015
@mcostalba mcostalba Update authors
Fishtest is a key factor of SF success.

Thanks to Fishtest we have not only greately
improved ELO but, even more important, we
have enabled a kind of joint development and
testing that it is the herat of on open
source project like SF.

Open sourcing is not just about open code, it is
about commuity development. In case of a chess engine
this has never been possible before due to missing
a strong and strict testing environment that allows
many people to contribute in a safe and coordinate way.

Fishtest is a new way of developing chess engines,
something that has never exsisted before.

No functional change.
Commits on Oct 24, 2015
@VoyagerOne VoyagerOne History pruning
Prune moves with negative History
and CMH scores at low depth.

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 24182 W: 4672 L: 4439 D: 15071

LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 12579 W: 1959 L: 1792 D: 8828

bench: 8907701
@Rocky640 Rocky640 Simplify threats
Using less parameters and code to compute Threats
Includes also a few spacing edits.

Run as a simplification.

Passed STC 10+0.1
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 18879 W: 3725 L: 3600 D: 11554

Passed LTC 60+0.4
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 74116 W: 11001 L: 10958 D: 52157

bench: 8004751
@mcostalba mcostalba Cleanup history stats
And other assorted trivia.

No functional change.
@Stefano80 Stefano80 KRPPKRP endgame: Simplify ugly switch statement
No functional change

Resolves #470
Commits on Oct 25, 2015
@lucasart lucasart Use atomics instead of volatile
Rely on well defined behaviour for message passing, instead of volatile. Three
versions have been tested, to make sure this wouldn't cause a slowdown on any

v1: Sequentially consistent atomics

No mesurable regression, despite the extra memory barriers on x86. Even with 15
threads and extreme time pressure, both acting as a magnifying glass:

threads=15, tc=2+0.02
ELO: 2.59 +-3.4 (95%) LOS: 93.3%
Total: 18132 W: 4113 L: 3978 D: 10041

threads=7, tc=2+0.02
ELO: -1.64 +-3.6 (95%) LOS: 18.8%
Total: 16914 W: 4053 L: 4133 D: 8728

v2: Acquire/Release semantics

This version generates no extra barriers for x86 (on the hot path). As expected,
no regression either, under the same conditions:

threads=15, tc=2+0.02
ELO: 2.85 +-3.3 (95%) LOS: 95.4%
Total: 19661 W: 4640 L: 4479 D: 10542

threads=7, tc=2+0.02
ELO: 0.23 +-3.5 (95%) LOS: 55.1%
Total: 18108 W: 4326 L: 4314 D: 9468

As suggested by Joona, another test at LTC:

threads=15, tc=20+0.05
ELO: 0.64 +-2.6 (95%) LOS: 68.3%
Total: 20000 W: 3053 L: 3016 D: 13931

v3: Final version: SeqCst/Relaxed

threads=15, tc=10+0.1
ELO: 0.87 +-3.9 (95%) LOS: 67.1%
Total: 9541 W: 1478 L: 1454 D: 6609

Resolves #474
Commits on Oct 29, 2015
@snicolet snicolet Some code and comment cleanup
- Remove all references to split points
- Some grammar and spelling fixes

No Functional change

Resolves #478
Commits on Oct 30, 2015
@ajithcj ajithcj Reduce variation in rootDepth between different threads
Reduce the variation in Root Depth between different threads. This
prevents threads from searching at a depth much higher than Main Thread.

Performed well at STC 24 Threads:
ELO: 3.44 +-3.8 (95%) LOS: 96.1%
Total: 10000 W: 1627 L: 1528 D: 6845

And LTC 24 Threads
LLR: 1.43 (-2.94,2.94) [0.00,4.00]
Total: 3804 W: 500 L: 420 D: 2884
ELO : +7.31
p-value: 73.16%

Passed no regression at STC 3 Threads:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40457 W: 7148 L: 7060 D: 26249

And LTC 3 Threads:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 17704 W: 2489 L: 2364 D: 12851

Raising a pull request early as 24 Thread tests are very expensive and
this is clearly a positive gain at high thread counts and high time
controls. The change is a small parameter tweak with no additional

No functional change for single thread mode.

Resolves #481
Commits on Oct 31, 2015
@VoyagerOne VoyagerOne New History Bonus Formula
bonus = d^2 + d - 1

Bench: 8639247

Resolves #484
@mcostalba mcostalba Assorted trivia in search.cpp
The only interesting change is the moving of
stack[MAX_PLY+4] back to its original position
in id_loop (now renamed Thread::search).

No functional change.
Commits on Nov 02, 2015
@mbootsector mbootsector Pick bestmove from the deepest thread.
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 26930 W: 4441 L: 4214 D: 18275

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 7783 W: 1017 L: 876 D: 5890

No functional change in single thread mode

Resolves #485
Commits on Nov 03, 2015
@mcostalba mcostalba Get rid of timer thread
Unfortunately std::condition_variable::wait_for()
is not accurate in general case and the timer thread
can wake up also after tens or even hundreds of
millisecs after time has elapsded. CPU load, process
priorities, number of concurrent threads, even from
other processes, will have effect upon it.

Even official documentation says: "This function may
block for longer than timeout_duration due to scheduling
or resource contention delays."

So retire timer and use a polling scheme based on a
local thread counter that counts search() calls and
a small trick to keep polling frequency constant,
independently from the number of threads.

Tested for no regression at very fast TC 2+0.05 th 7:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 32969 W: 6720 L: 6620 D: 19629

TC 2+0.05 th 1:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 7765 W: 1917 L: 1765 D: 4083

And at STC TC, both single thread
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 15587 W: 3036 L: 2905 D: 9646

And with 7 threads
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 8149 W: 1367 L: 1227 D: 5555

bench: 8639247
Commits on Nov 10, 2015
@lucasart lucasart Ensure that rootDepth < DEPTH_MAX
Indeed, if we use a depth >= DEPTH_MAX, we start having negative depth in the
TT (due to int8_t cast).

No functional change in single thread mode

Resolves #490
@lucasart lucasart Avoid friend
operator<<(os, pos) does not need to access any private members of pos.

No functional change.

Resolves #492
Commits on Nov 13, 2015
@mcostalba mcostalba Fix broken UCI 'wait for stop'
When we reach the maximum depth, we can finish the
search without a raise of Signals.stop. However, if
we are pondering or in an infinite search, the UCI
protocol states that we shouldn't print the best move
before the GUI sends a "stop" or "ponderhit" command.

It was broken by lazy smp. Fix it by moving the stopping
of the threads after waiting for GUI.

No functional change.
@mcostalba mcostalba Retire ThreadBase
Now that we don't have anymore TimerThread, there is
no need of this long class hierarchy.

Also assorted reformatting while there.

To verify no regression, passed at STC with 7 threads:
LLR: 2.97 (-2.94,2.94) [-5.00,0.00]
Total: 30990 W: 4945 L: 4942 D: 21103

No functional change.
Commits on Nov 14, 2015
Stefan Geschwentner Bonus for reachable outpost
Give a bonus for outpost squares which in reach of a bishop or knight.

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 22725 W: 4570 L: 4339 D: 13816

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 15019 W: 2333 L: 2157 D: 10529

Bench: 8503181

Resolves #495
Commits on Nov 16, 2015
@kenta2 kenta2 Do not conceal the invocation of the benchmark program
It is better to be able to see what arguments it is being called with.

No functional change

Resolves #497
@VoyagerOne VoyagerOne History Pruning: Don't prune the main killer move.
Also increased pruned depth to 4.

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 23380 W: 4581 L: 4350 D: 14449

LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 28934 W: 4329 L: 4105 D: 20500

Bench: 8369743

Resolves #498
Commits on Nov 21, 2015
@mcostalba mcostalba Rewrite how threads are spawned
Instead of creating a running std::thread and
returning, wait in Thread c'tor that the native
thread of execution goes to sleep in idle_loop().

In this way we can simplify how search is started,
because when main thread is idle we are sure also
all other threads will be idle, in any case, even
at thread creation and startup.

After lazy smp went in, we can simpify and rewrite
a lot of logic that is now no more needed. This is
hopefully the final big cleanup.

Tested for no regression at 5+0.1 with 3 threads:
LLR: 2.95 (-2.94,2.94) [-5.00,0.00]
Total: 17411 W: 3248 L: 3198 D: 10965

No functional change.
@lucasart lucasart Fix TT comment and static_assert()
Comment is based on a misunderstanding of what unaligned memory access is. Here
is an article that explains it very clearly:

No matter how we define TTEntry or TTCluster, there will never be any unaligned
memory access. This is because the complier knows the alignment rules, and does
the necessary adjustments to make sure unaligned memory access does not occur.

The issue being adressed here has nothing to do with unaligned memory access. It
is about cache performance. In order to achieve best cache performance:
- we prefetch the cacheline as soon as possible.
- we ensure that TT clusters do not spread across two cachelines. If they did,
  we would need to prefetch 2 cachelines, which could hurt cache performance.

Therefore the true conditions to achieve this are:
1/ start adress of TT is cache line aligned. void TranspositionTable::resize()
enforces this.
2/ TT cluster size should *divide* the cache line size. Currently, we pack 2
clusters per cache lines. It used to be 1 before "TT sardines". Does not matter
what the ratio is, all we want is to fit an integer number of clusters per cache

No functional change.

Resolves #506
mstembera Clean up RootMove less operator
This is used by std::stable_sort() to sort moves from highest score to lowest score.

1) The comment is incorrect since highest to lowest means descending.
2) It's more natural to implement a less operator using another less operator rather than a greater operator.

No functional change.

Resolves #504