Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porting certainty propagation to new codebase. Notes and Outline. #13

Open
Videodr0me opened this issue Aug 17, 2018 · 18 comments
Open
Labels
documentation Improvements or additions to documentation

Comments

@Videodr0me
Copy link

Certainty propagation

Certainty propagation is known in literature as MCTS-Solver (Winands et. al). If we reach terminal (which are "certain") nodes, we can propagate the "certain" results from these nodes efficiently up the search-tree. For example if one side can play a move thats leads to a terminal win node, we can make the parent node a "certain" node scored as a loss from the opponents view. Further, nodes whose children are all certain, become certain themselves with the -max q among the certain childs. With this even MCTS can solve some shallow mates, and steer exploring the tree more efficiently and avoid branches that are "proven" to no longer need exploration. See also https://github.com/Videodr0me/leela-chess-experimental/wiki/MCTS-Solver---Certainty-Propagation-and-Autoextending

Features of full Implementation (as in my experimental build):

  • One ply look-ahead. Checking for terminals is relatively cheap compared to NN evals. Checking when creating the skeleton-childs (pre edge-node separation) detects terminals (and make the skeleton-child a full terminal node) one ply earlier, making leela see some mating combos earlier. Though not impossible with the new codebase, this can be skipped initially.
  • Two-Fold draw scoring. I folded this into certainty propagation, but this could easily be a separate option. I reused the certain state of a node to indicate a two-fold, and as soon as the node becomes root (tree-reuse), I uncertain it and recalculate its stats (more about this later). Not strictly needed, can be skipped initially.
  • Root-selection rule. In the lastest version of this, I went as far as to not search certain childs of root at all (so if a root-child became certain it would stop getting visits). This requires some changes, to (1) PUCT to choose non-certain childs at root, to (2) GetBestChild, so that the most visited is chosen unless there is a certain move with a higher q and to (3) the best move update along the same lines as GetBestChild. This also ins not necessary as a first step, even though I would propose to at least (independent of visits) choose a winning certain child of root and avoid certain loosing childs at root in GetBestChild and best move update. PUCT can remain unchanged (and if changed should NOT avoid loosing moves except at root). For choosing between a terminal threefold and a two-fold, I used NN eval of parent. But again, this is maybe not for an inital version.
  • Mate Display. Purely Cosmetic can be omitted. I do it in a hacky way, in that if best move is a certain win (or certain loss), I just take the length of x = PV/2 as the length of mate (this is approximate, as we only know its winning, but do not know what is the shortest path). Uci Output in this case is modified from cp score to mate x (or -x). Not really needed, and also if the certain state of a node is overloaded with TB results, it might not be a mate but just the path to a TB win.
  • Autoextending single legal moves. Plays well with certainty propagation (especially for forced mates). But for such a little change, was awkward to implement. I opted for further extending if picknodeextend returned a move with only one single move, and when backuping that score copying it from child to parent (also for chains of single legal moves, even though extremely rare in chess). Even though it helps with forced mate combos, it can be skipped initially.

Note on the basic implementation (what are the essentials):

-IsCertain(). Nodes get a new attribute "certain". All terminal nodes are certain, but not all certain nodes are terminal. MakeCertain(v) makes a node certain. MakeTerminal(result) makes a node terminal and certain. Most checks in the main loop are changed from IsTerminal() to IsCertain(), This changes nothing in regard to terminals (all terminals are also certain). The terminal state can be "overloaded" with this, but i opted for a separate bool for clarity and so i can turn the feature of more easily via uci options (for self-play testing). Later this became useful as for example two-fold draw-scoring needs to differentiate between terminal draws and two-fold draws. I could imagine something similar with TB and a "swindle" mode when playing against non TB opponents.

-The main propagation is done in the (new codebase) DoBackupUpdate(). I store the certain scores in both Q and V, and opted to fetch them via V. But fetching them via Q would work as well. I use hacky float == x.x compares, which works because i set them exactly as such, but there are nicer ways (for example use game:result or something). For efficiency I only do the certainty check if the update originated from a certain node (which can also be a terminal) -> v_is_certain = true. Root is never made certain (actually it is made uncertain before search, if it was certain due to tree-reuse). I "overloaded" the code to at root also aid in the best move determination (thats what the n==root_node_ condition is for) and for tie breaking (two-folds, certain wins vs. terminal wins.). This is a simplified version without the overloading, its actually really simple:

  // MCTS Solver or Proof-Number-Search
  // kCertaintyProp > 0
     if (kCertaintyProp && v_is_certain && !n->IsCertain())) {
	bool children_all_certain = true;
        float best_certain_v = -3.0f;
        Node *best_certain_node = nullptr;
	for (Node* iter : n->Children()) {
	  if (!iter->IsCertain()) { children_all_certain = false; }
	    else if ((iter->GetV()) >= best_certain_v) {
	      best_certain_node = iter;
	      best_certain_v = iter->GetV();
	      }
	 }
	 if (((children_all_certain) || (best_certain_v == 1.0f)) && (n != root_node_)) {
	     v = -best_certain_v;
	     n->MakeCertain(v);
	    ++madecertain_;
         }

This is basically it, if all childs are certain we make the node certain with the -max of the certain childs. Or if we find at least one certain winning child, then the node is a certain loss for the opponent. This gets backpropagated in the tree.

The full version of this with the following best move determination and the tie break rules are here https://github.com/Videodr0me/leela-chess-experimental/blob/master/src/mcts/search.cc#L339 I currently do best move determination in two stages (mainly because of the not searching certain childs of root). First the normal update (which is preliminary) and then at root (one ply higher) I resuse the info about the best certain move from the linked full version. This again can be done simpler initially.

The changes to GetBestMove are here https://github.com/Videodr0me/leela-chess-experimental/blob/d2088837fff01779dbf8fc606af6e92c98c1a449/src/mcts/search.cc#L505 These compute normal best move and best certain move, and then decide between them. This can greatly be simplified if we initially continue searching certain childs at root (basically prioritizing everything over certain losses and always take certain wins.). Also the tiebreak rules can safely be ignored (not really sure they yield any meaningful elo anyhow for example in prefering a certain loss over a terminal loss, in hope the opponent did not see the mate).

The changes only affect node.h, node.cc, search.h and search.cc. Only conditionals with if (kCertaintyProb) are relevant. And of those only those that are general "kCertaintyProb" or "kCertaintyProp==1". The rest is irrelevant. I hope this helps for a start. The changes in ExtendNode (old codebase) are only relevant for the one-ply lookahead and two-fold draw scoring (and the tree balancing).

@RedDenver
Copy link

@Videodr0me @dubslow I originally came over to this project with the thought of doing certainty propagation. If I have some time this weekend, I'll try to get this implemented.

@Videodr0me
Copy link
Author

Videodr0me commented Aug 18, 2018

@RedDenver Great! Feel free to add any questions you might have to this issue, i will try to clarify and can also look at your code and give input. Don't expect to much, though. In my testing doing this is approx. +10 Elo without TB in selfplay, I suspect that this will get more valuable with TB and some tests indicate it also helps a little more against a/b opponents, but nothing major. It is kind of nice that it finds shallow mates, sometimes missed by standard lc0 and also is a little better than human mating abilities, sometimes finding "hidden" mates in 8 or 9.

@dubslow
Copy link
Member

dubslow commented Aug 18, 2018

Of course once this makes it into the training it will be worth at least 100 elo, is my guess

@RedDenver
Copy link

Should I wait for egtb to get merged and then build off of that since tb hits will be certain?

@Videodr0me
Copy link
Author

I am under the impression that the merge of TB code is imminent. So you might wait, even though there are almost no interactions between two code changes. The main one is that TB scored nodes are also made certain (and not terminal - if you do not want to "overload" the terminal bool), except for that almost nothing should change in the TB code by adding certainty prop.

@dubslow
Copy link
Member

dubslow commented Aug 19, 2018

I would say also wait on, or start from, the Node rearrangement PR. Adding multibool support will be conflicts galore when that PR is added. I also might start soon based on that PR, perhaps we should make a branch in the main repo?

@RedDenver
Copy link

I'll have some time tonight to get started on this, so let me know if you branch; otherwise, I'll just start off of current master.

@dubslow
Copy link
Member

dubslow commented Aug 20, 2018

@RedDenver
Copy link

I'm having issues with protobuf and can't really do much since I can't run it: LeelaChessZero/lc0#283

@Videodr0me
Copy link
Author

I hope the protobug issues are sorted now!? I would recommend to make this a separate branch at first. Actually i am advocating for a separate search extension branch in general and hope it gathers some traction. All search changes (for example: minimaxq, certainty prop, sticky checkmate) would after some testing go to that branch first. Only after a second round of testing and some experience in the "wild" they would get merged to master. This "search" branch would be the cutting edge and somewhat experimental. It would prevent frequent changes to master so that contributors have a stable version for a longer period of time to contribute training games.

@gonzalezjo
Copy link

gonzalezjo commented Aug 25, 2018

@dubslow

once this makes it into the training it will be worth at least 100 elo

Was that a typo? That's huge. If it wasn't a typo, it might be a good idea to wait for this before any reset. But is 100 Elo realistic?

@gonzalezjo
Copy link

Interesting game from self-play, where Leela misses many opportunities for a mate in 3. https://lichess.org/o6XsH9X6#180

Would certainty propagation solve this?

@dubslow
Copy link
Member

dubslow commented Sep 6, 2018

Depends on how temperature interacts with certainty. Playing mates in 1 should never be temperature-ified, as per my sticky checkmate code, but I'm thinking we still want temperature even with certain mates seen, because certain mates doesn't exclude possibly shorter mates.

I do think the constant t=1 is rather suboptimal, but that's slightly outside the scope of this topic. In scenarios where we improve the temperature scheme, we can "reduce the temperature when certain mates are found", so to speak.

@oscardssmith
Copy link
Contributor

@dubslow how will temperature help find shorter mates? Leela's training says that mate is mate, so policy of different mating positions won't change much on average.

@dubslow
Copy link
Member

dubslow commented Sep 7, 2018

Leela's training says that mate is mate

Actually this isn't true at the moment. Even in training, policy is trained that non-mate alternatives to mate should get some portion of the policy. I think this is an oversight and proposed fixing that some weeks ago, but it was tabled.

And shorter mates doesn't have anything to do with the mating move, but rather with finding the moves before the mate that lead to that quicker mate.

@Videodr0me
Copy link
Author

As Dubslow is MIA, i am porting it right now. A basic version will probably be ready for bug-hunting in a week.

@mooskagh mooskagh changed the title Porting certainty propagation to new codebase. Notes and Outline. Porting certainty propagation to new codebase. Nov 21, 2018
@mooskagh mooskagh changed the title Porting certainty propagation to new codebase. Porting certainty propagation to new codebase. Notes and Outline. Nov 21, 2018
@mooskagh
Copy link
Member

Outdated unfortunately. But worth keeping in documentation.

@mooskagh mooskagh transferred this issue from LeelaChessZero/lc0 Feb 27, 2020
@mooskagh mooskagh added the documentation Improvements or additions to documentation label Feb 27, 2020
@Dboingue
Copy link

Dboingue commented Feb 27, 2020

I agree, it may serve as an educational basis for those of lc0-wide willing to enter the discussions seriously by exposing the points where one needs to be careful (logical joints in the discussion). The type of ideas that were receivable, and the keywords. Would help me build a discussion glossary a bit like I started in the forum, from sifting though those issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

7 participants