GPU accelerated batch MCTS #10
Batch moving successfully implemented in cfc92db. Speed comparisons coming...
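For reference, a minimal sketch of one way the batched move can be vectorized in PyTorch. The `(N, 4, 4)` tensor of tile exponents and the `compress`/`move_left` names are assumptions of mine for illustration, not necessarily what cfc92db actually does:

```python
import torch

def compress(boards):
    # Slide non-zero tiles to the left of each row, preserving their order:
    # a stable sort on the "is empty" flag pushes the zeros to the right.
    order = torch.sort((boards == 0).to(torch.int8), dim=-1, stable=True).indices
    return torch.gather(boards, -1, order)

def move_left(boards):
    # boards: (N, 4, 4) integer tensor of tile exponents, 0 = empty cell.
    b = compress(boards)
    for c in range(3):  # merge equal neighbours, leftmost pair first, no chained merges
        can_merge = (b[:, :, c] != 0) & (b[:, :, c] == b[:, :, c + 1])
        b[:, :, c] = torch.where(can_merge, b[:, :, c] + 1, b[:, :, c])
        b[:, :, c + 1] = torch.where(can_merge, torch.zeros_like(b[:, :, c + 1]), b[:, :, c + 1])
    return compress(b)  # close the gaps opened by the merges
```

The other three directions can be obtained by flipping or transposing the batch before and after the call, so only one code path is needed.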
Speed comparisons (in seconds) for various numbers of games. The result for 1 game is the average of 10 games; standard deviation in parentheses.
[Table: CPU and CUDA timings for each number of games]
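A sketch of how timings like these can be collected (the `move_left` here is the assumed helper from the sketch above, not the repo's actual function); the important part is synchronizing before reading the clock, since CUDA kernels are launched asynchronously:

```python
import time
import torch

def time_move(n_games, device, repeats=10):
    boards = torch.randint(0, 11, (n_games, 4, 4), dtype=torch.int64, device=device)
    for _ in range(3):            # warm-up: exclude one-off CUDA init/allocator costs
        move_left(boards)         # assumed helper from the sketch above
    if device.type == "cuda":
        torch.cuda.synchronize()  # flush queued kernels before starting the timer
    start = time.perf_counter()
    for _ in range(repeats):
        move_left(boards)
    if device.type == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

for n in (1, 100, 10_000):
    row = f"{n:>6} games  CPU {time_move(n, torch.device('cpu')):.5f}s"
    if torch.cuda.is_available():
        row += f"  CUDA {time_move(n, torch.device('cuda')):.5f}s"
    print(row)
```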
At 10,000 games with 3 additional randomly generated tiles,
Functions are slower on the GPU when the data size is small. For the function
Timings for
Timings for `play_nn`, which does not do MCTS (it only plays 1 game). Even this is slower due to slower
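For context, a network-only rollout in the spirit of `play_nn` looks roughly like the sketch below; `net`, `legal_moves`, `apply_move`, and `add_random_tile` are hypothetical placeholders, not the repo's API. The point is that the network only ever sees a batch of size 1, which is exactly the regime where CUDA overhead dominates:

```python
import torch

@torch.no_grad()
def play_nn(net, board, device):
    """Play a single game greedily from the policy network, with no tree search."""
    net.eval()
    while True:
        legal = legal_moves(board)                 # hypothetical helper: list of valid move indices
        if not legal:
            return board                           # game over
        x = board.to(device).float().unsqueeze(0)  # batch of size 1 -> poor GPU utilisation
        logits = net(x).squeeze(0)
        mask = torch.full_like(logits, float("-inf"))
        mask[legal] = 0.0                          # keep only the legal moves
        move = int(torch.argmax(logits + mask))
        board = apply_move(board, move)            # hypothetical helper
        board = add_random_tile(board)             # hypothetical helper
```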
Self-play game generation is still too slow. However, timing tests suggest that the minimum batch size at which the GPU beats the CPU is around 200,000 in parallel. It is very hard for me to reach these numbers when only searching 50 lines per move or 200 games per mcts_nn; I would need to run 1000 mcts_nn in parallel (1000 × 200 games = 200,000) to get that benefit. For now, I am focusing on improving speed on the CPU.
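If the GPU path is ever worth revisiting, the batching would have to happen across searches rather than within one: collect the leaf positions from many parallel `mcts_nn` instances and evaluate them in a single forward pass. A rough sketch of that pattern, with entirely hypothetical `pending_leaves`/`backup` methods standing in for whatever the search object would expose:

```python
import torch

@torch.no_grad()
def evaluate_leaves(net, searches, device, max_batch=200_000):
    # Gather leaf states from many parallel searches, evaluate them in one
    # large forward pass, and hand each value back to its own tree.
    pending = []                                  # (search, leaf, state) awaiting evaluation
    for search in searches:                       # e.g. ~1000 parallel mcts_nn instances
        for leaf in search.pending_leaves():      # hypothetical: leaves waiting for a value
            pending.append((search, leaf, leaf.state))
            if len(pending) == max_batch:
                _flush(net, pending, device)
                pending = []
    if pending:
        _flush(net, pending, device)

def _flush(net, pending, device):
    states = torch.stack([s for _, _, s in pending]).to(device).float()
    values = net(states).cpu()                    # one big batch amortises the launch overhead
    for (search, leaf, _), value in zip(pending, values):
        search.backup(leaf, value)                # hypothetical: propagate the value up the tree
```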
Using PyTorch