WDL value head support #635
The strength increase in testing probably comes from the additional training on TB-rescored test10 data rather than from the WDL head. I wouldn't expect the WDL head by itself to make it any stronger. All convolutional layers and the policy head were set to non-trainable, so the only difference in this network is the fully connected layers of the value head. Tests fail because protobuf needs to be updated.
Added the tree search part. The previous commit also included a parameter for adjusting the score of a draw, but I removed it since it didn't gain any Elo in testing. Verbose-move-stats was modified to report WDL scores for testing purposes.
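For readers following along, here is a minimal sketch of how three WDL outputs could be collapsed into the scalar value the search uses. It assumes the head emits three raw logits that go through a softmax, and that the scalar is win minus loss; the `Wdl` struct, both function names, and the `draw_score` argument are hypothetical and only illustrate the kind of draw-score parameter mentioned above, not the code in this PR.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical container for the three value-head outputs (not lc0's type).
struct Wdl {
  float win;
  float draw;
  float loss;
};

// Numerically stable softmax over three raw logits ordered win, draw, loss.
// Assumption: the network returns logits rather than probabilities.
Wdl SoftmaxWdl(const std::array<float, 3>& logits) {
  const float max_logit = *std::max_element(logits.begin(), logits.end());
  const float w = std::exp(logits[0] - max_logit);
  const float d = std::exp(logits[1] - max_logit);
  const float l = std::exp(logits[2] - max_logit);
  const float sum = w + d + l;
  return {w / sum, d / sum, l / sum};
}

// Scalar value in [-1, 1] from the side to move's point of view.
// draw_score is a hypothetical knob; 0 gives plain win minus loss.
float ExpectedScore(const Wdl& wdl, float draw_score = 0.0f) {
  return wdl.win - wdl.loss + draw_score * wdl.draw;
}
```

With `draw_score` left at 0 a draw counts as neutral, so a position with equal win and loss probability evaluates to 0 no matter how likely the draw is.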
src/mcts/search.cc
<< ") ";

oss << "(U: " << std::setw(6) << std::setprecision(5) << edge.GetU(U_coeff)
Verbose stats still needs to print U. (And Q+U is convenient.)
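A rough sketch of what keeping U and Q+U in the verbose line could look like, reusing the `std::setw` and `std::setprecision` pattern from the snippet under review. `FormatMoveStats` is a hypothetical helper that takes plain floats instead of an edge, and the use of `std::fixed` and the Q field width are assumptions, not something taken from search.cc.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of search.cc): formats Q, U and Q+U
// in the same parenthesized style as the verbose move stats.
std::string FormatMoveStats(float q, float u) {
  std::ostringstream oss;
  oss << std::fixed;  // Assumption: fixed-point output with 5 decimals.
  oss << "(Q: " << std::setw(8) << std::setprecision(5) << q << ") ";
  oss << "(U: " << std::setw(6) << std::setprecision(5) << u << ") ";
  oss << "(Q+U: " << std::setw(8) << std::setprecision(5) << q + u << ") ";
  return oss.str();
}
```

Calling `FormatMoveStats(0.25f, 0.125f)` yields a string with separate Q, U and Q+U fields, so the exploration term stays visible next to the value.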
This implements #79, right? Like I said there, it would then also be possible to shorten the average training game length by allowing draws by agreement.
Yes, but we can do that as a follow up PR.
But wouldn't the usefulness of that come from the ability to do stuff like force the engine to play for a draw or for a win or with some other contempt-like factor?
More work can be done on this aspect after the PR is submitted.
Needs modifications for protobuf changes in: LeelaChessZero/lczero-common#8
Maybe change https://github.com/orgs/LeelaChessZero/projects/1#card-10518034 to point to this instead?
Merged with master and implemented the protobuf changes. Test network with WDL value head and convolutional policy head: http://hforsten.com/leelaz/128x10-az-pol-map-wdl-200000.pb.gz
Just tested that 32 bit builds work.
iirc, currently move_count is disabled |
WDL head support for all backends. Includes just the backend support and doesn't do anything yet with the draw information.
Needs updated protobuf: LeelaChessZero/lczero-common#6
Training code for replacing the old value head with WDL in existing old type network: https://github.com/Ttl/lczero-training/tree/wdl_surgery
11248 with WDL value head: http://hforsten.com/leelaz/11248-wdl.pb.gz (Obsolete due to protobuf changes)