
A better contempt implementation for Stockfish (#1325)

* A better contempt implementation for Stockfish

Round 2 of TCEC season 10 demonstrated the benefit of a good contempt implementation: it lets the strongest programs in the tournament slow the game down when they judge the position to be slightly worse, preferring to stay in a complicated (even if slightly risky) middlegame rather than simplifying by force into a drawn endgame.

The current contempt implementation of Stockfish is inadequate, and this patch is an attempt to provide a better one.

Passed STC non-regression test against master:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 83360 W: 15089 L: 15075 D: 53196
http://tests.stockfishchess.org/tests/view/5a1bf2de0ebc590ccbb8b370
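
For readers who want to relate the W/L/D totals above to an Elo figure, here is a minimal sketch using the standard logistic Elo model. The function name `elo_from_wld` is hypothetical and is not part of Stockfish or of the testing framework; the framework's own LLR statistic is computed differently and is not reproduced here.

```cpp
#include <cmath>

// Illustrative helper (hypothetical name): Elo difference implied by a match
// result under the logistic model. Assumes the score is strictly between
// 0 and 1, i.e. the match was not a clean sweep.
double elo_from_wld(long wins, long losses, long draws) {
    double games = static_cast<double>(wins + losses + draws);
    double score = (wins + 0.5 * draws) / games;  // fraction of points scored
    return 400.0 * std::log10(score / (1.0 - score));
}
```

Calling `elo_from_wld(15089, 15075, 53196)` gives a difference of well under a tenth of an Elo point, which is what one would expect from a non-regression test that leaves default play unchanged.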

This contempt implementation is showing promising results in certain situations. For instance, it gained about 30 Elo when playing with contempt=40 against Stockfish 7, compared to current master:

• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo

This was the result of real cooperative work from the Stockfish team, with key ideas coming from Stefan Geschwentner (locutus2) and Chris Cain (ceebo) while most of the community helped with feedback and computer time.

In this commit the bench is unchanged by default, but you can test at home with the new contempt in the UCI options. The style of play will change a lot when using a nonzero contempt (I repeat: this is not enabled by default in this version)!

The Stockfish team is still deliberating over the best default contempt value in self-play and the best contempt modeling strategy, to help users choose a contempt value when playing against much weaker programs. This information will be given in future commits when available :-)

Bench: 5051254

* Remove the prefetch

No functional change.
snicolet authored and mcostalba committed Dec 5, 2017
1 parent d193482 commit be382bb0cf5927dc10ff9be882f6980a78d1484a
Showing with 11 additions and 11 deletions.
  1. +2 −1 src/evaluate.cpp
  2. +2 −0 src/evaluate.h
  3. +7 −10 src/search.cpp
src/evaluate.cpp
@@ -840,7 +840,7 @@ namespace {
// Initialize score by reading the incrementally updated scores included in
// the position object (material + piece square tables) and the material
// imbalance. Score is computed internally from the white point of view.
- Score score = pos.psq_score() + me->imbalance();
+ Score score = pos.psq_score() + me->imbalance() + Eval::Contempt;
// Probe the pawn hash table
pe = Pawns::probe(pos);
@@ -903,6 +903,7 @@ namespace {
} // namespace
+ Score Eval::Contempt = SCORE_ZERO;
/// evaluate() is the evaluator for the outer world. It returns a static evaluation
/// of the position from the point of view of the side to move.
src/evaluate.h
@@ -31,6 +31,8 @@ namespace Eval {
const Value Tempo = Value(20); // Must be visible to search
+ extern Score Contempt;
std::string trace(const Position& pos);
Value evaluate(const Position& pos);
src/search.cpp
@@ -96,8 +96,6 @@ namespace {
Move best = MOVE_NONE;
};
- Value DrawValue[COLOR_NB];
template <NodeType NT>
Value search(Position& pos, Stack* ss, Value alpha, Value beta, Depth depth, bool cutNode, bool skipEarlyPruning);
@@ -202,8 +200,9 @@ void MainThread::search() {
TT.new_search();
int contempt = Options["Contempt"] * PawnValueEg / 100; // From centipawns
- DrawValue[ us] = VALUE_DRAW - Value(contempt);
- DrawValue[~us] = VALUE_DRAW + Value(contempt);
+ Eval::Contempt = (us == WHITE ? make_score(contempt, contempt / 2)
+ : -make_score(contempt, contempt / 2));
if (rootMoves.empty())
{
@@ -444,7 +443,7 @@ void Thread::search() {
int improvingFactor = std::max(229, std::min(715, 357 + 119 * F[0] - 6 * F[1]));
Color us = rootPos.side_to_move();
- bool thinkHard = DrawValue[us] == bestValue
+ bool thinkHard = bestValue == VALUE_DRAW
&& Limits.time[us] - Time.elapsed() > Limits.time[~us]
&& ::pv_is_draw(rootPos);
@@ -532,8 +531,7 @@ namespace {
{
// Step 2. Check for aborted search and immediate draw
if (Threads.stop.load(std::memory_order_relaxed) || pos.is_draw(ss->ply) || ss->ply >= MAX_PLY)
- return ss->ply >= MAX_PLY && !inCheck ? evaluate(pos)
- : DrawValue[pos.side_to_move()];
+ return ss->ply >= MAX_PLY && !inCheck ? evaluate(pos) : VALUE_DRAW;
// Step 3. Mate distance pruning. Even if we mate at the next move our score
// would be at best mate_in(ss->ply+1), but if alpha is already bigger because
@@ -1074,7 +1072,7 @@ namespace {
if (!moveCount)
bestValue = excludedMove ? alpha
- : inCheck ? mated_in(ss->ply) : DrawValue[pos.side_to_move()];
+ : inCheck ? mated_in(ss->ply) : VALUE_DRAW;
else if (bestMove)
{
// Quiet best move: update move sorting heuristics
@@ -1142,8 +1140,7 @@ namespace {
// Check for an instant draw or if the maximum ply has been reached
if (pos.is_draw(ss->ply) || ss->ply >= MAX_PLY)
- return ss->ply >= MAX_PLY && !InCheck ? evaluate(pos)
- : DrawValue[pos.side_to_move()];
+ return ss->ply >= MAX_PLY && !InCheck ? evaluate(pos) : VALUE_DRAW;
assert(0 <= ss->ply && ss->ply < MAX_PLY);
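
The core idea of the diff can be sketched outside the engine: instead of shifting the draw score per side, the contempt is folded into the evaluation as a bias from White's point of view and negated when the engine plays Black. The snippet below is a simplified illustration, not the engine's code. The `Score` struct, the helper names `contempt_score` and `from_side_to_move`, and the value of `kPawnValueEg` are all assumptions chosen for readability (Stockfish packs both components into a single integer and defines its own `PawnValueEg`).

```cpp
// Simplified stand-ins for Stockfish types; everything here is hypothetical
// and chosen for illustration only.
enum Color { WHITE, BLACK };

struct Score { int mg; int eg; };  // middlegame / endgame components

constexpr Score operator-(Score s) { return Score{-s.mg, -s.eg}; }

// Placeholder for the endgame pawn value in internal units. This is NOT the
// engine's real constant; 200 just keeps the arithmetic easy to follow.
constexpr int kPawnValueEg = 200;

// Convert the UCI "Contempt" option (centipawns) into a White-point-of-view
// bias: full weight in the middlegame, half weight in the endgame, negated
// when the engine plays Black. Mirrors the assignment in MainThread::search().
constexpr Score contempt_score(int contemptCp, Color us) {
    int c = contemptCp * kPawnValueEg / 100;  // from centipawns
    Score s{c, c / 2};
    return us == WHITE ? s : -s;
}

// The evaluation is computed from White's point of view and then reported
// from the side to move's point of view.
constexpr int from_side_to_move(int whitePovValue, Color sideToMove) {
    return sideToMove == WHITE ? whitePovValue : -whitePovValue;
}
```

With these placeholder numbers, `contempt_score(40, WHITE)` is `{80, 40}` and `contempt_score(40, BLACK)` is `{-80, -40}`. In both cases the bias is positive when viewed from the engine's own side, so an otherwise balanced position scores above `VALUE_DRAW` for the engine and below it for the opponent, which is what lets draws keep a plain `VALUE_DRAW` score in the search.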
