Tensor accum #5

Merged
merged 188 commits into from
Apr 27, 2019


godmoves
Owner

Try to merge 'tensor-accum-0.17' branch into 'distributed'

ihavnoid and others added 30 commits October 14, 2018 07:28
… commandqueue. (Reverted "one queue for each GPU".)
Hersmunch and others added 29 commits April 2, 2019 13:19
Added extra support for "TM" and "OT" and other sgf time control
properties on printsgf and loadsgf GTP commands.

* Added parsing and loading of "TM" and "OT" sgf properties on GTP command
  loadsgf. Only supports "OT" syntax matching output from a printsgf GTP
  command.
* Change SGFTree to have a shared_ptr for a time control.
* Added saving and loading of "BL", "WL", "OB" and "OW" sgf properties on
  GTP commands printsgf and loadsgf.
* Change to make TimeControl::make_from_text_sgf() a time control factory
  and other minor tidying.
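As a rough illustration of the properties listed above, the following sketch pulls the time-related properties out of an SGF string. It is not the SGFTree or TimeControl code from this commit; the function name `extract_time_props` and the sample record (including the `OT` value, whose real format is whatever printsgf emits) are made up for the example.

```python
import re

# Time-related SGF properties named in the commit message above.
TIME_PROPS = ("TM", "OT", "BL", "WL", "OB", "OW")

def extract_time_props(sgf_text):
    """Return {property: [raw values]} for the time properties found.

    Hypothetical helper: a property is taken to be one or two uppercase
    letters followed by a bracketed value; escaped brackets are allowed.
    """
    found = {}
    pattern = r"(?<![A-Z])([A-Z]{1,2})\[((?:\\.|[^\]\\])*)\]"
    for ident, value in re.findall(pattern, sgf_text):
        if ident in TIME_PROPS:
            found.setdefault(ident, []).append(value)
    return found

# Made-up sample record; the OT text is illustrative only.
sample = ("(;GM[1]FF[4]SZ[19]TM[1800]OT[5x30 byo-yomi]"
          ";B[pd]BL[1795.2];W[dp]WL[1790]OW[5])")
props = extract_time_props(sample)
print(props)
```

The values are returned as raw strings on purpose: turning them into an actual time control is what the factory method mentioned above is responsible for.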

Pull request leela-zero#2172.
As noted in pull request leela-zero#2172, the default
constructor set byo yomi stones but no time or
periods.
We currently will either crash or do strange things if we're
fed a weights file that doesn't match the board size we're compiled
for.

See issue leela-zero#2289.
Add an lz-analyze tag to suggest the minimum number of moves the
engine should post info about (rather than only those it considers
interesting, i.e. the ones with at least one visit).

This allows some very flexible constructs:

Getting a heatmap:

    lz-setoption name visits value 1
    lz-analyze interval 1 minmoves 361

Forcing a move among the top policy moves only:

    lz-setoption name visits value 1
    lz-analyze interval 1 minmoves 2
    (store those moves, e.g. A1, B1)
    lz-setoption name visits value 0
    lz-genmove_analyze b interval 1 allow b A1 1 allow b B1 1
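
A client driving the engine could assemble the command strings shown above along these lines. This is only a sketch: the helper names are hypothetical, and the trailing `1` in each `allow` clause is copied verbatim from the example rather than interpreted.

```python
def heatmap_commands(board_size=19):
    """GTP commands for a heatmap: one visit, info for every board point."""
    return [
        "lz-setoption name visits value 1",
        f"lz-analyze interval 1 minmoves {board_size * board_size}",
    ]

def restricted_genmove(color, moves, interval=1):
    """Build an lz-genmove_analyze command that allows only `moves`."""
    # The final "1" in each clause mirrors the example in the commit message.
    allows = " ".join(f"allow {color} {move} 1" for move in moves)
    return f"lz-genmove_analyze {color} interval {interval} {allows}"

print(heatmap_commands())
print(restricted_genmove("b", ["A1", "B1"]))
```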
Only pass when winning or low on legal moves.
Disabled in self-play.

Fixes issue leela-zero#2273.
Based on pull request leela-zero#2277.

Pull request leela-zero#2301.
Adding the minmoves tag exposes a small bug in the PV
output formatting. Avoid extra blank spaces.

Small style fixups.
As pointed out by @gjm11 in leela-zero#2277, when there are few legal moves we might
want to allow passing even if this loses on the board count. The
alternative might be to self-destruct large groups and carry the game
on endlessly even if the policy wouldn't want to.

No difference in "dumbpass" mode.
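
One possible reading of the pass rule described in the commits above, written as a small predicate. This is a sketch and not the engine's code: the function name, the `few_moves_threshold` value, and the treatment of self-play as "no restriction" are all assumptions.

```python
def may_pass(score_for_mover, legal_moves, self_play=False,
             few_moves_threshold=5):
    """Hypothetical filter: is passing acceptable in this position?

    score_for_mover: board count from the mover's point of view.
    legal_moves: number of legal non-pass moves available.
    few_moves_threshold: assumed cutoff for "low on legal moves".
    """
    if self_play:
        return True          # assumed: the restriction is off in self-play
    if score_for_mover > 0:
        return True          # passing while winning on the board count
    # Losing on the count: only allow the pass when few moves remain.
    return legal_moves <= few_moves_threshold

print(may_pass(-3.5, 120))   # losing, many moves left
print(may_pass(-3.5, 2))     # losing, but almost nothing left to play
```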
Seems like the previous test regex is causing MSVC's regex engine to run
out of stack space.
leela-zero's default build directory is `build`.

It is very annoying when using leela as a git submodule that 
the repository updates whenever it builds.

Pull request leela-zero#2199.
Group evaluations and run them in parallel. Roughly 50% speedup on my setup, but a couple of points are debatable.

- Thread / batch sizing heuristics: This PR changes how the default thread count and default batch size are picked. See Leela.cpp.
- Batch-forming heuristic: See OpenCLScheduler.cpp. The heuristic exists so that we can wait for the rest of the engine to create more NN evaluations and run larger batches. We can't wait indefinitely, since there are cases where we enter 'serial' paths. Since heuristics are heuristics, these might need testing on a larger variety of systems.

I verified that the winrate improves when running default vs. default command line `./leelaz -w (weight file)` at time parity.
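
The batch-forming trade-off described above (wait for more evaluations, but never indefinitely) can be sketched with a bounded wait on a queue. This is an illustration only, not the logic in OpenCLScheduler.cpp; `form_batch` and its parameters are hypothetical names.

```python
import queue
import time

def form_batch(pending, batch_size, max_wait_s):
    """Collect up to batch_size requests, waiting at most max_wait_s
    for stragglers once the first request has arrived."""
    batch = [pending.get()]            # block until there is work at all
    deadline = time.monotonic() + max_wait_s
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break                      # stop waiting, run a partial batch
        try:
            batch.append(pending.get(timeout=remaining))
        except queue.Empty:
            break                      # nothing else arrived in time
    return batch

requests = queue.Queue()
for i in range(5):
    requests.put(i)

full = form_batch(requests, batch_size=4, max_wait_s=1.0)
partial = form_batch(requests, batch_size=4, max_wait_s=0.05)
print(full, partial)
```

The bounded wait is what keeps a single-threaded ("serial") stretch of the search from stalling: a lone request is still evaluated after `max_wait_s`, just in a smaller batch.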

Pull request leela-zero#2188.
* Calculate node variance.
* Use normal distribution LCB to choose the played move.
* Cached student-t.
* Sort lz-analyze output according to LCB.
* Don't choose nodes with very few visits even if LCB is better.

Guard against NN misevaluations when the top move has a lot of visits.
Without this it's possible for a move with a few hundred visits to be picked
over a move with over ten thousand visits.

The problem is that the evaluation distribution isn't really a normal
distribution. Evaluations correlate, and the distribution can change
if the search finds a better alternative deeper in the tree.
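
A minimal sketch of LCB-based move selection with a visit-count guard, under stated assumptions: a fixed normal quantile `z` stands in for the cached Student-t values, and `min_visit_ratio` is an illustrative cutoff rather than the value used in the engine.

```python
import math

def lcb(mean, variance, visits, z=1.96):
    """Lower confidence bound of the mean evaluation (normal approximation)."""
    if visits < 2:
        return float("-inf")           # too few samples to estimate spread
    return mean - z * math.sqrt(variance / visits)

def choose_move(children, min_visit_ratio=0.1):
    """children: list of (move, mean_eval, variance, visits) tuples.

    Moves with far fewer visits than the most-visited move are skipped,
    even when their LCB is higher.
    """
    max_visits = max(c[3] for c in children)
    eligible = [c for c in children if c[3] >= min_visit_ratio * max_visits]
    return max(eligible, key=lambda c: lcb(c[1], c[2], c[3]))[0]

# Made-up numbers for illustration.
children = [
    ("D4", 0.55, 0.04, 10000),
    ("Q16", 0.60, 0.04, 300),
    ("K10", 0.57, 0.04, 4000),
]
guarded = choose_move(children)
unguarded = choose_move(children, min_visit_ratio=0.0)
print(guarded, unguarded)
```

With the guard, the 300-visit move is ignored despite having the best bound; without it, that move would be played ahead of moves with thousands of visits.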

Pull request leela-zero#2290.
* Add mixed precision training support.
* Do not use loss scale if training with fp32
* Fix potential reg_term overflow of large networks.
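
To illustrate why a loss scale matters for fp16 but not for fp32 training, the sketch below simulates half-precision rounding with the standard `struct` module. It is a toy model, not the training script's code; the scale of 128 and the gradient value are arbitrary assumptions.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_gradient(true_grad, loss_scale=1.0):
    """Simulate a gradient stored in fp16 after scaling the loss,
    then unscaled again in higher precision."""
    stored = to_fp16(true_grad * loss_scale)   # what fp16 can represent
    return stored / loss_scale                 # undo the scaling

tiny = 1e-8                                    # arbitrary small gradient
unscaled = fp16_gradient(tiny)                 # underflows to zero
scaled = fp16_gradient(tiny, loss_scale=128.0) # survives, with some error
print(unscaled, scaled)
```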

Pull request leela-zero#2191.
Don't autodetect or default to fp32 when all cards have
Tensor Cores. We will assume fp16 is the fastest.

This avoids problems in tune-only mode which does not
detect the precision to use and would use fp32 on such cards.

Pull request leela-zero#2312.
We have a first implementation of batching now.
AutoGTP will always send --batchsize, but CPU-only
builds don't support the option. Ignore the option
in those builds.

The same problem exists with --tune-only, but quitting
immediately happens to be sane behavior so we don't need
to fix that.

Pull request leela-zero#2313.
It will recursively include OpenCL.h and that
is bad.

Pull request leela-zero#2314.
# Conflicts:
#	src/Network.cpp
#	src/Network.h
#	src/OpenCLScheduler.cpp
#	src/UCTNode.cpp
#	src/UCTSearch.cpp
#	src/kernels/convolve3.opencl
# Conflicts:
#	src/GTP.cpp
#	src/Leela.cpp
#	src/UCTSearch.cpp
#	src/UCTSearch.h
# Conflicts:
#	src/OpenCL.cpp
#	src/Tuner.cpp
#	src/kernels/clblast/hgemm_tensorcore.opencl
#	src/kernels/tensorcore_test.opencl
# Conflicts:
#	src/GTP.cpp
#	src/GTP.h
#	src/Leela.cpp
#	src/OpenCL.cpp
#	src/OpenCL.h
#	src/OpenCLScheduler.cpp
#	src/OpenCLScheduler.h
#	src/UCTSearch.cpp
# Conflicts:
#	src/GTP.cpp
#	src/Leela.cpp
#	src/OpenCLScheduler.cpp
#	src/Training.cpp
#	src/UCTNode.cpp
#	src/UCTNode.h
#	src/UCTNodePointer.h
#	src/UCTSearch.cpp
@godmoves godmoves merged commit cffc01b into distributed Apr 27, 2019