Introduce Null Threat extension
When a null search at low depth fails low because of a threat, then rather than returning beta-1 (to cause a re-search at full depth in the parent node), we set a flag threatExtension = true (false by default). In the search that follows, this flag causes moves that prevent the threat to be extended by one ply.

The idea and patch are by Lucas Braesch.

Lucas also ran the tests:

1500 games in 5"+0.05":
SF_threatExtension vs SF_20121222: 366 - 331 - 803 [51.2%] LOS=90.8%

3000 games in 10"+0.1":
SF_threatExtension vs SF_20121222: 610 - 559 - 1831 [50.8%] LOS=93.2%

Tests confirmed by Gary after 10570 games:
ELO: 2.79 +- 99%: 8.72 95%: 6.63
LOS: 94.08%
Wins: 1523 Losses: 1438 Draws: 7607

And finally by me at 15"+0.05, single thread, 3824 games:
threatExtension vs master 768 - 692 - 2364 +7 ELO

bench 4918443
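The behaviour described in the commit message can be illustrated with a toy model. This is only a sketch, not the Stockfish code: apart from `threatExtension`, every name here (`NullSearchResult`, `threat_extension_flag`, `child_depth`, `kOnePly`) is hypothetical, and the depth unit of 1 is an assumption made for readability.

```cpp
#include <cassert>

// Toy model only. Apart from threatExtension (named in the commit message),
// all identifiers are illustrative and do not come from the Stockfish source.
constexpr int kOnePly = 1;  // assumed depth unit for this sketch

struct NullSearchResult {
    bool failedLow;       // the null search returned a score below beta
    bool threatDetected;  // the fail low was attributed to a threat move
};

// After the null search: should the node be flagged for threat extension?
// Per the commit message, this is the situation that used to return beta-1.
inline bool threat_extension_flag(const NullSearchResult& r, bool lowDepth) {
    return lowDepth && r.failedLow && r.threatDetected;
}

// In the move loop: depth passed to the child search for one move.
// Only moves that prevent the threat receive the one-ply extension.
inline int child_depth(int depth, bool threatExtension, bool movePreventsThreat) {
    assert(depth >= kOnePly);
    const int extension = (threatExtension && movePreventsThreat) ? kOnePly : 0;
    return depth - kOnePly + extension;
}
```

With the flag set, a move that prevents the threat is searched at the same depth as its parent, while every other move still loses one ply.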
Marco,
Really a two-cent cosmetic detail here, but I don't really like the names yields_threat() and prevents_threat().
How about allows_move() and prevents_move()? These have a more general appeal, at least, and are more self-documenting:
(in practice m2 is always going to be the threat move, but who knows, maybe it will be used somewhere else in the future)
Marco,
I have another patch for you, which is the natural extension of this one:
8000 games in 10"+0.1" (in progress) on my new powerful computer:
test vs SF_20121225: 608 - 510 - 1562 [0.518] 2680
I forgot the signatures:
I was quite surprised by the large increase in number of nodes on the ./stockfish bench. But that's hardly relevant. Results are still looking good:
test vs SF_20121225: 684 - 565 - 1761 [0.520] 3010
Ok, that should be enough games, given that the results are quite clear:
test vs SF_20121225: 1249 - 1091 - 3260 [0.514] 5600
LOS = 99.95%
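The LOS (likelihood of superiority) values quoted in this thread are consistent with the usual normal approximation, which uses only wins and losses and ignores draws. The thread does not say which tool produced them, so the formula below is an assumption that happens to reproduce the quoted numbers; the function name `los` is illustrative.

```cpp
#include <cassert>
#include <cmath>

// Likelihood of superiority from wins and losses (draws are ignored):
//   LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L))))
// Assumed formula: the thread does not state how its LOS values were computed.
inline double los(int wins, int losses) {
    assert(wins + losses > 0);
    const double diff = static_cast<double>(wins - losses);
    const double decisive = static_cast<double>(wins + losses);
    return 0.5 * (1.0 + std::erf(diff / std::sqrt(2.0 * decisive)));
}
```

For example, `los(1249, 1091)` evaluates to roughly 0.9995, in line with the 99.95% quoted above.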
I think we've milked the null threat area dry for the moment. Perhaps raising the depth limit in the "depth < 5*ONE_PLY" condition a little could improve something, but I doubt it would give much.
I'm going to play with the eval now, and see if I can remove some useless things there. The more I experiment with my eval, the more I understand the "less is more" concept ;-)
Hi Lucas,
Great stuff :). I'm running a test of the (ss-1)->reduction idea right now, and the results don't seem quite as clear locally. I will keep the test running. One thing I've noticed is that running games on all cores of my system made the results inaccurate. So on my 4-core machine, I typically run three 1-thread games at a time.
ELO: -1.31 +- 99%: 15.28 95%: 11.60
LOS: 34.32%
Wins: 511 Losses: 524 Draws: 2412
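The ELO figure in a result block like this one is consistent with the standard logistic conversion from score fraction to rating difference. The sketch below assumes that conversion (the thread does not name the tool used) and does not attempt to reproduce the error margins; `score_fraction` and `elo_diff` are illustrative names.

```cpp
#include <cassert>
#include <cmath>

// Fraction of points scored, counting each draw as half a point.
inline double score_fraction(int wins, int losses, int draws) {
    const int games = wins + losses + draws;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Elo difference implied by a score fraction s under the logistic model:
//   elo = -400 * log10(1 / s - 1)
// Undefined for s == 0 or s == 1 (one side won every point).
inline double elo_diff(int wins, int losses, int draws) {
    const double s = score_fraction(wins, losses, draws);
    assert(s > 0.0 && s < 1.0);
    return -400.0 * std::log10(1.0 / s - 1.0);
}
```

For example, `elo_diff(511, 524, 2412)` gives about -1.31, and `elo_diff(768, 692, 2364)` gives about +6.9, matching the "+7 ELO" in the commit message.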
I've stopped testing the (ss-1)->reduction change by itself, and started testing it together with Marco's idea. It was still at almost exactly 50% when I stopped.
Current version: https://github.com/glinscott/Stockfish/compare/null_threat
Gary,
You're right about testing with N-1 games in parallel when you have N processors. I'm going to do that from now on. It seems that everyone who knows what they are doing does that. I remember a post on talkchess where Robert Houdart was running 31 concurrent games on a 32 CPU machine.
I am going to run the same experiment as you. Ideally that means we can sum our results, which gives us more games and a tighter error bar.
master = 4918443
test = 5045645
test running with 7 concurrent games (1 thread for each stockfish process): 8000 games in 10"+0.1".
I'm really not SMP knowledgeable at all :-(
Do you think that enabling hyper threading in the BIOS will allow me to run 8 concurrent games safely (on 8 CPUs)?
Or is it better to switch hyper threading off, and play 7 concurrent games?
PS: I'm using Linux 3.5.0-15 (x86_64 SMP)
On second thought, I'm not sure it's a good idea to do the threat extension with captures. At low depths, quiet moves are often victims of aggressive reduction/pruning in SF, but captures not so much. What we typically don't want to miss here are quiet moves that may refute the threat. So I am restarting a test with this patch only, as I did before:
I just want to verify that my result wasn't biased by (i) early stopping, or (ii) using 8 concurrent games instead of 7. This time I'll run 6000 games and won't stop until they are finished, regardless of the result (early stopping is a very dangerous source of bias).
signatures:
I would definitely avoid hyperthreading when running engine matches. Engines push the CPU cores so hard that hyperthreading is usually not very effective. So if there are 8 physical CPU cores, running 7 games with no pondering is safe. That leaves a core for the OS/XWindows/ssh/etc.
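The N-1 rule of thumb discussed here can be sketched as a small helper. The function names are hypothetical. Note that `std::thread::hardware_concurrency()` is only a hint, may return 0, and on typical systems counts logical processors, so with hyperthreading enabled the physical core count should be passed in explicitly.

```cpp
#include <cassert>
#include <thread>

// Number of single-threaded games to run at once: one fewer than the number
// of cores, leaving one core free for the OS. Never returns less than 1.
inline unsigned concurrent_games(unsigned cores) {
    assert(cores >= 1);
    return cores > 1 ? cores - 1 : 1;
}

// Convenience wrapper using the standard library's hint. This typically
// reports logical processors, so it overestimates physical cores when
// hyperthreading is on; it returns 0 when the value cannot be determined.
inline unsigned default_concurrent_games() {
    const unsigned n = std::thread::hardware_concurrency();
    return concurrent_games(n == 0 ? 1 : n);
}
```

With 8 physical cores, `concurrent_games(8)` gives 7, and with 4 cores it gives 3, matching the setups described in this thread.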
OK, I tested both versions, and my conclusion is that neither is an improvement or a regression (within the error bar, after 6000 games, concurrency=7, no early stopping bias).
(1) signature=5627788: 2221-2222-3557 49.99% LOS=49.40% (ss-1)->reduction
(2) signature=5045645: 2235-2312-3453 49.52% LOS=12.67% (ss-1)->reduction + Marco's modification to prevents_threat() for captures
So you can commit whichever version you prefer, or nothing at all. Up to you Marco: you're the boss!
You're right Marco: less is more :-)