Full implementation of resign #47

gonzalezjo · 2018-06-06T20:25:33Z

I'm seriously concerned about code quality here, because I'm really not the best at C++. I feel like there were better ways to do things. That said, these edits seem to work perfectly fine in my testing.

GetBestMove was modified because I couldn't think of a cleaner way of doing things. If you know a better way (there has to be one), then I'd say go ahead and implement it. I would've used a struct, but I wasn't sure if that would be okay with the project or not.

Really concerned about code quality here and in search.cc. Better approaches are welcome.

Gets rid of a comment from when resign was only partially implemented.

mooskagh · 2018-06-06T21:07:09Z

src/mcts/search.cc

@@ -83,6 +85,9 @@ void Search::PopulateUciParams(OptionsParser* options) {
                            "policy-softmax-temp") = 1.0f;
  options->Add<IntOption>(kAllowedNodeCollisionsStr, 0, 1024,
                          "allowed-node-collisions") = 0;
+  options->Add<IntOption>(kResignPercentageStr, 0, 100, 


FloatOption

Good catch! Thanks.

mooskagh · 2018-06-06T21:08:36Z

src/mcts/search.cc

@@ -49,6 +49,8 @@ const char* Search::kExtraVirtualLossStr = "Extra virtual loss";
 const char* Search::kPolicySoftmaxTempStr = "Policy softmax temperature";
 const char* Search::kAllowedNodeCollisionsStr =
    "Allowed node collisions, per batch";
+const char* Search::kResignPercentageStr = 
+    "Minimum estimated win chance before resign. Zero is off.";


Long UCI param names don't fit in many GUIs, so consider shortening it.
Also, UCI param names don't have a period at the end (and --help adds the period itself).

mooskagh · 2018-06-06T21:12:42Z

src/mcts/search.cc

@@ -718,17 +725,29 @@ std::pair<Node*, bool> Search::PickNodeToExtend(Node* node,
  }
 }

-std::pair<Move, Move> Search::GetBestMove() const {
+std::tuple<Move, Move, int> Search::GetBestMove() const {


If this is a setting of the search and not of the calling class, then GetBestMove() should return the decision, not the percentage.
In that case, it's better to replace the tuple with a struct (maybe reuse BestMoveInfo, but it actually has too many unneeded fields, so probably make another struct).

However, as it doesn't really make sense in some modes (UCI), maybe it's better to move this setting out of search.cc into tournament.cc or game.cc, and instead of GetBestMove() returning the decision, have a separate function float Search::GetBestMoveWinrate() const.

> maybe it's better to move this setting out of search.cc into tournament.cc or game.cc, and instead of GetBestMove()

Not sure how to do this cleanly with regard to getting an OptionsParser in game.cc. I'll take a look tomorrow, but if you can offer any tips or code for that, I'd be super grateful!
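
For readers following the thread, the two API shapes suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: `Move` is a stand-in type, and `BestMoveResult` and `SearchSketch` are hypothetical names, not part of the actual codebase. Only `GetBestMove()` and `GetBestMoveWinrate()` are names taken from the review comments.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for the engine's Move type (assumption, for illustration only).
struct Move {
  std::string uci;
};

// Shape 1 (hypothetical): a named struct in place of
// std::tuple<Move, Move, int>. Callers read result.should_resign
// instead of std::get<2>(result).
struct BestMoveResult {
  Move best;
  Move ponder;
  bool should_resign;
};

// Shape 2 (hypothetical class): search only reports an evaluation and
// the caller (e.g. the game driver) owns the resign policy.
class SearchSketch {
 public:
  SearchSketch(Move best, Move ponder, float winrate)
      : best_(std::move(best)), ponder_(std::move(ponder)), winrate_(winrate) {
    assert(winrate >= 0.0f && winrate <= 1.0f);
  }

  // Unchanged signature: best move and ponder move.
  std::pair<Move, Move> GetBestMove() const { return {best_, ponder_}; }

  // Estimated win chance of the best move, in [0, 1].
  float GetBestMoveWinrate() const { return winrate_; }

 private:
  Move best_;
  Move ponder_;
  float winrate_;
};
```

With the second shape, the resign threshold never has to reach the search code, which sidesteps the question of how to pass search options around for this setting.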

mooskagh · 2018-06-06T21:15:47Z

src/mcts/search.cc


+  // if resigning is enabled, we check if we should resign.
+  auto win_percent = GetBestNodeInternal()->GetQ(0, 0) + 1;


GetQ returns values from -1 to 1, so the percentage will be 0% .. 200% in this case. It should be divided by 2.
Also, win_percent doesn't really contain a percentage: it's on a scale of 0..2 at that point, not 0..200, so the name is incorrect.

> GetQ returns values from -1 to 1. Therefore percent will be 0% .. 200% in this case. Should be divided by 2.

This was intentional. From my understanding, it only matters when it's negative, because that's what the bot that would resign would see. If GetQ is positive, the current side has an advantage, but it's not like it can decide for the other player to resign anyway. I'm probably really misunderstanding something, though.

> Also win_percent doesn't really contain percents, it's in scale of 0..2 at that point, not 0..200, so the name is incorrect.

Good catch.
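
The scaling discussed in this thread can be checked with a small sketch. It assumes only what the review states, namely that Q lies in [-1, 1]; `UnscaledPercent` and `QToWinPercent` are hypothetical helper names used for illustration, not functions from the patch.

```cpp
#include <cassert>

// What the patch under review effectively computes: (Q + 1) * 100.
// For Q in [-1, 1] this ranges over 0..200, so it is not a percentage.
float UnscaledPercent(float q) {
  assert(q >= -1.0f && q <= 1.0f);
  return (q + 1.0f) * 100.0f;
}

// With the suggested division by 2, the result ranges over 0..100:
// Q = -1 maps to 0%, Q = 0 maps to 50%, Q = +1 maps to 100%.
float QToWinPercent(float q) {
  assert(q >= -1.0f && q <= 1.0f);
  return (q + 1.0f) / 2.0f * 100.0f;
}
```

For an even position (Q = 0) the unscaled value is 100 while the halved value is 50, so a resign threshold expressed as a win percentage would be compared against a number twice as large as intended if the division is left out.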

mooskagh · 2018-06-06T21:17:27Z

src/mcts/search.cc

+  return std::make_tuple(
+    best.first, 
+    best.second, 
+    win_percent * 100 < kResignPercentage ? 1 : 0


So it doesn't return the winrate but rather the decision? Why int then instead of bool?

Used to C. My bad. That should've been a pretty obvious simplification.

mooskagh · 2018-06-06T21:20:31Z

src/mcts/search.h

@@ -69,7 +69,8 @@ class Search {
  void Wait();

  // Returns best move, from the point of view of white player. And also ponder.
-  std::pair<Move, Move> GetBestMove() const;
+  // The integer represents whether or not the engine believes this is a helpless move based on resign settings.


If you have clang installed, run
clang-format -i -style=google filename.cc
on the modified files.

mooskagh · 2018-06-06T21:21:50Z

src/mcts/search.h


 private:
  // Can run several copies of it in separate threads.
  void Worker();

+  Node* GetBestNodeInternal() const;


If it's not expected to be modified, consider returning const Node*

mooskagh · 2018-06-06T21:25:24Z

src/selfplay/game.cc

+    if (std::get<2>(move_data)) { 
+      game_result_ = blacks_move ? 
+        GameResult::WHITE_WON : GameResult::BLACK_WON;
+      puts("Will resign.");


Is that intended? It surely shouldn't write anything to stdout, because stdout is for the protocol (UCI, interaction with the client, or whatever).
As for logging to stderr, I think it's not important enough to output unconditionally.

If we had proper logging (like glog), it would surely be good to log it there, but we don't have it yet.

I copied what lczero does: https://github.com/glinscott/leela-chess/blob/39009b46065af65ab23181308aa56ab609d2f7f0/src/UCTSearch.cpp#L228

I will remove it, though.

Fix type mismatch + accidentally removing line... sigh

gonzalezjo · 2018-06-07T01:28:30Z

I made some changes. I'm still looking for help with the GetQ/win_percent stuff and with the organization of GetBestMove(). I replied to your requested changes (very nicely presented, by the way, thanks!) with more specific questions and comments. Hopefully you or someone else can help me out more with that, but I'm not going to be able to do anything else today.

Tilps · 2018-06-11T02:55:58Z

src/mcts/search.cc

+  // std::min()'d to make the variable name accurate 
+  // someone, *please* suggest a better name.
+  auto win_percent = std::min(
+    (GetBestNodeInternal()->GetQ(0, 0) + 1) * 100, 100.0f);


I don't think this is what we want.
There are 3 options that seem valid to me.

1. An overload of GetBestNodeInternal which disables temperature to ensure we use the actual best node.
2. Use the Q of the current head instead (if that is being updated by the search code, I've not checked).
3. Use the node which corresponds to the selected move, rather than resampling again at random.

The scripts which are used to decide on a resignpct which is acceptable have been tuned on one of these three, so it should probably match those scripts - or the scripts will need updating and a new resignpct threshold decided.

I don't really like option 3 because it might have a q value based off one visit, which is a very poor estimate so it makes it more likely to resign incorrectly.
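
Option 1 can be sketched as follows. This is a simplified illustration, not the project's code: `ChildStats` is a stand-in for a node's statistics, `BestChildNoTemperature` is a hypothetical free function, and it assumes that "best" with temperature disabled means the most-visited child.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified stand-in for per-child search statistics (assumption).
struct ChildStats {
  int visits;  // how many times search visited this child
  float q;     // mean value in [-1, 1]
};

// Picks the child deterministically by visit count, i.e. with no
// temperature-based random sampling. Assumes at least one child.
const ChildStats* BestChildNoTemperature(
    const std::vector<ChildStats>& children) {
  assert(!children.empty());
  return &*std::max_element(
      children.begin(), children.end(),
      [](const ChildStats& a, const ChildStats& b) {
        return a.visits < b.visits;
      });
}
```

In a position where one child has 800 visits and another has a single visit with a very low Q, temperature sampling could occasionally select the single-visit child, and basing the resign decision on its Q would rest on a one-sample estimate. The deterministic pick avoids that.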

mm, I see what you mean.

dubslow · 2018-06-12T05:14:40Z

I'm thinking that the resign option should be in game.cc, not in search.cc -- search doesn't really care about resigning or not, it's just evaluating. The game-driving code is what cares -- either engine.cc in the user-mode case, or game.cc in the training-mode case, and since UCI doesn't do resign, there's not much point in putting it in engine.cc.

So GetBestMove should return the float eval rather than a resign bool, and let the driver do what it wants with that.

Having said all that, Tilps is correct that how it is right now isn't quite right, since resign needs to be decided on the best move's eval, not on the chosen-with-temperature move's eval. And you are also correct that the GetBest{Move,Child} hierarchy is a bit convoluted; I've run into similar problems with my dynamic temp implementation as well.

I think what I'm going to do is submit a separate PR which redesigns Search's GetBest* API altogether and makes it both cleaner and more suitable to resign/dynamic temp purposes. Then we can rebase both this PR and my dynamictemp one on top of that redesigned API.

dubslow · 2018-06-12T07:21:22Z

Upon further consideration, I believe crem's current setup is indeed ~optimal. It's not the first time that "more thought" has led me to "oh, crem was right the whole time" :) What resign should do is leave the current Search functions untouched (except possibly renaming GetBestChild to GetBestChildNoTemperature) and instead add a new, separate function named GetBestEval or similar, which returns the eval of the no-temp-best move. Then selfplay/game.cc can use that to make an adjudication decision.

Edit: And of course all of this is stuff that crem has already suggested. Doh.

I could make the changes and make my own PR, or push my own changes to your branch here, or if you prefer you can just do it yourself
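
The adjudication step described above might be sketched like this on the game-driver side. Assumptions are labeled in the comments: `AdjudicateResign` is a hypothetical function, the `UNDECIDED` enumerator is assumed for illustration (only `WHITE_WON` and `BLACK_WON` appear in the reviewed diff), and the eval is taken to be in [-1, 1] from the point of view of the side to move.

```cpp
#include <cassert>

// WHITE_WON and BLACK_WON appear in the reviewed diff; UNDECIDED is an
// assumed placeholder meaning "keep playing".
enum class GameResult { UNDECIDED, WHITE_WON, BLACK_WON };

// Hypothetical driver-side check.
//   best_eval:      eval of the no-temperature best move, in [-1, 1],
//                   from the side to move (assumption).
//   resign_percent: minimum win chance in percent; 0 disables resigning.
//   blacks_move:    true if black is the side to move.
GameResult AdjudicateResign(float best_eval, float resign_percent,
                            bool blacks_move) {
  assert(best_eval >= -1.0f && best_eval <= 1.0f);
  const float win_percent = (best_eval + 1.0f) * 50.0f;  // 0..100
  if (resign_percent <= 0.0f || win_percent >= resign_percent) {
    return GameResult::UNDECIDED;
  }
  // The side to move resigns, so the opponent wins.
  return blacks_move ? GameResult::WHITE_WON : GameResult::BLACK_WON;
}
```

Keeping this check in the driver means the search code only has to expose the evaluation, which matches the split between evaluating and deciding that the comments above argue for.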

gonzalezjo · 2018-06-12T19:56:40Z

Fixing up the GetBests (either through lots of refactoring, or just the rename + new function) would be a really good place to start. It seems like you've already gone ahead with that in PR 80 (and then some), which cleans things up a lot. I'm completely Git-clueless, so I think continuing on your PR is the best way forward.

gonzalezjo added 7 commits June 5, 2018 22:58

Update search.cc

9e46741

Finish adding no-op resign percentage support.

7edee6c

Add support for resign in search.cc & refactor

94afad8

Corresponding changes to search.h

6d82ae8

Add resign code to game.cc.

26931f9

Remove outdated comment.

b4553b3

This was referenced Jun 6, 2018

Add support for no-op resign #41

Closed

Implement resign in lc0 #20

Closed

mooskagh requested changes Jun 6, 2018

gonzalezjo added 4 commits June 6, 2018 21:21

clang-formatted + misc. code cleanup

8f4ed28

clang-format'd + misc. code cleanup

aa24bdc

clang-format'd + misc. code cleanup

815471a

Fix two dumb mistakes.

6ed09aa

Tilps reviewed Jun 11, 2018

dubslow requested changes Jun 12, 2018

dubslow mentioned this pull request Jun 12, 2018

Implement resign in selfplay games #80

Merged

gonzalezjo closed this Jun 12, 2018
