Speed up the playouts #201

Closed
ujh opened this Issue Nov 8, 2015 · 4 comments

Owner

ujh commented Nov 8, 2015

We've come a long way already, but unfortunately we're still only at 100 pps (playouts per second) per thread on 19x19. Judging from the numbers other people report, we should aim to be at least 10x faster.

There are many things to try, but most boil down to doing less work. It seems to be especially important to do less counting of liberties. See for example this thread on the computer go mailing list.
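
As a rough illustration of what "less counting of liberties" could look like: instead of flood-filling a chain every time its liberties are needed, keep a running pseudo-liberty count per chain (each empty point is counted once per adjacent stone of the chain) and adjust it only for the chains a new stone touches. A chain has zero pseudo-liberties exactly when it has zero real liberties, which is enough for capture detection. This is a sketch under assumptions, not the project's code: `Board`, `play` and `pseudo_liberties` are made-up names, the board is 5x5, and capture handling is left out to keep it short.

```rust
// Hypothetical sketch: incremental pseudo-liberty counts with union-find.
// Captures are deliberately not handled here.
const SIZE: usize = 5;

#[derive(Clone, Copy, PartialEq)]
enum Color { Empty, Black, White }

struct Board {
    color: [Color; SIZE * SIZE],
    parent: [usize; SIZE * SIZE],    // union-find parent of each point
    pseudo_libs: [u32; SIZE * SIZE], // only meaningful at chain roots
}

impl Board {
    fn new() -> Board {
        let mut parent = [0; SIZE * SIZE];
        for (i, p) in parent.iter_mut().enumerate() { *p = i; }
        Board { color: [Color::Empty; SIZE * SIZE], parent, pseudo_libs: [0; SIZE * SIZE] }
    }

    // Orthogonal neighbours of point `p` (row-major index).
    fn neighbours(p: usize) -> Vec<usize> {
        let (x, y) = (p % SIZE, p / SIZE);
        let mut n = Vec::with_capacity(4);
        if x > 0 { n.push(p - 1); }
        if x + 1 < SIZE { n.push(p + 1); }
        if y > 0 { n.push(p - SIZE); }
        if y + 1 < SIZE { n.push(p + SIZE); }
        n
    }

    // Root of the chain containing `p`.
    fn root(&self, mut p: usize) -> usize {
        while self.parent[p] != p { p = self.parent[p]; }
        p
    }

    // Place a stone on an empty point, touching only the adjacent chains.
    fn play(&mut self, p: usize, c: Color) {
        assert!(self.color[p] == Color::Empty && c != Color::Empty);
        self.color[p] = c;
        self.parent[p] = p;
        let mut own = 0;
        for n in Board::neighbours(p) {
            if self.color[n] == Color::Empty {
                own += 1; // empty neighbour: pseudo-liberty of the new stone
            } else {
                let r = self.root(n);
                self.pseudo_libs[r] -= 1; // `p` is no longer empty for that chain
            }
        }
        self.pseudo_libs[p] = own;
        // Merge with friendly neighbouring chains, summing their counts.
        for n in Board::neighbours(p) {
            if self.color[n] == c {
                let (a, b) = (self.root(p), self.root(n));
                if a != b {
                    let add = self.pseudo_libs[b];
                    self.parent[b] = a;
                    self.pseudo_libs[a] += add;
                }
            }
        }
    }

    // Constant-time query apart from the root lookup; no flood fill.
    fn pseudo_liberties(&self, p: usize) -> u32 {
        self.pseudo_libs[self.root(p)]
    }
}
```

With this, playing Black on points 12 and 13 of an empty board and then White on 14 leaves the black chain with 5 pseudo-liberties and the white stone with 2, without ever walking the chain.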

@ujh ujh added this to the 0.4.0 milestone Nov 8, 2015

@ujh ujh modified the milestones: 0.3.2, 0.4.0 Feb 23, 2016

@ujh ujh self-assigned this Feb 23, 2016

Owner

ujh commented Feb 24, 2016

Version 0.3.1 has the following performance characteristics:

| | 9x9 | 13x13 | 19x19 |
|---|---|---|---|
| pps (per thread) | 1023 | 428 | 106 |
| pps (all threads) | 8187 | 3421 | 846 |
| full_uct_cycle benchmark | 404,103 ns/iter (+/- 68,514) | 1,067,521 ns/iter (+/- 587,417) | 4,094,998 ns/iter (+/- 9,257,384) |
| playout benchmark | 357,326 ns/iter (+/- 114,342) | 1,079,648 ns/iter (+/- 819,840) | 3,575,719 ns/iter (+/- 10,425,836) |
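
For reading these numbers, two small conversions help: an `ns/iter` figure turns into iterations per second as 1e9 / ns, and the ratio of the all-thread to the per-thread pps gives the implied thread count. The helper names below are made up for illustration. It is also an assumption that one benchmark iteration is comparable to one playout, so the converted rates are only a rough cross-check against the pps rows.

```rust
// Hypothetical helpers for interpreting the table above.

/// Convert an "ns/iter" benchmark figure into iterations per second.
fn iters_per_second(ns_per_iter: f64) -> f64 {
    1e9 / ns_per_iter
}

/// Ratio of all-thread to per-thread throughput, i.e. the implied thread count.
fn implied_threads(pps_all: f64, pps_per_thread: f64) -> f64 {
    pps_all / pps_per_thread
}
```

For the 19x19 column, `iters_per_second(3_575_719.0)` comes out at roughly 280 per second for the playout benchmark, and `implied_threads(846.0, 106.0)` at just under 8.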

Owner

ujh commented Feb 24, 2016

Running full_uct_cycle_19x19 through a profiler shows that most of the time seems to be spent in fix_atari, both during the playouts and when calculating the priors.

Collaborator

iopq commented Feb 24, 2016

Of course it is, it's reading out ladders.

Owner

ujh commented Feb 24, 2016

I will have to take a closer look at the numbers, but on 19x19 it seems to be a 4x slowdown, which is massive. I wonder if it's really worth it. I guess I will have to run the benchmarks. :)

@ujh ujh referenced this issue Apr 27, 2016

Merged

Optimise atari #279

@ujh ujh added the 2 - Working <= 5 label May 2, 2016

@ujh ujh added 4 - Done and removed 2 - Working <= 5 labels Jul 2, 2016

@ujh ujh closed this Jul 2, 2016

@ujh ujh removed the 4 - Done label Jul 15, 2016
