One can implement MCTS based on the value model to improve model performance. This is not intended to be used during data generation due to the high cost of computation.
I implemented a first MCTS version, which can already be used in interactive mode. I considered implementing it as a model wrapper, but the interface doesn't quite match. Maybe we can find a way to merge the two interfaces so that MCTS can be compared against models without MCTS without too much extra code.
The c_puct parameter certainly needs to be tuned. Also, the prior policy is currently set to 1 uniformly for every move. A softmax over the move evaluations might make more sense and should certainly be tried.
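
To make the prior discussion concrete, here is a minimal sketch of how a softmax prior could plug into PUCT selection. It assumes the common AlphaZero-style score `Q + c_puct * P * sqrt(N_parent) / (1 + n_child)`; the formula actually used in this implementation may differ. The names `softmax`, `puct_scores`, `select_child` and the `temperature` parameter are hypothetical and are not taken from this repository.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw move evaluations into a probability distribution (hypothetical helper)."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def puct_scores(q_values, visit_counts, priors, c_puct=1.0):
    """Assumed PUCT variant: Q + c_puct * P * sqrt(sum of child visits) / (1 + n)."""
    sqrt_total = math.sqrt(sum(visit_counts))
    return [
        q + c_puct * p * sqrt_total / (1 + n)
        for q, p, n in zip(q_values, priors, visit_counts)
    ]


def select_child(q_values, visit_counts, priors, c_puct=1.0):
    """Return the index of the child with the highest PUCT score."""
    scores = puct_scores(q_values, visit_counts, priors, c_puct)
    return max(range(len(scores)), key=scores.__getitem__)


# Uniform prior of 1 for every move, as described above.
uniform_priors = [1.0, 1.0, 1.0]
# Alternative: softmax over the value model's move evaluations.
softmax_priors = softmax([1.0, 2.0, 3.0])
```

With a uniform prior, the exploration term depends only on visit counts, so the least-visited move is favoured whenever the Q values are tied. With a softmax prior, moves the value model already rates highly are explored first. A larger `c_puct` gives the prior and visit counts more weight relative to Q, which is why the two would probably need to be tuned together.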