
compiling on Windows #2

Closed
alreadydone opened this issue Mar 1, 2019 · 51 comments

Comments

@alreadydone

commented Mar 1, 2019

Will update Windows build here.

Complete package: https://drive.google.com/file/d/1bdIlVDJ3x6FZtX5fmuG6wNbb57GFU8S0/view (6/27/2019, use new cudnn DLL)


Original content:

I tried to compile the GTP engine on Windows but failed:

>------ Build started: Project: CMakeLists, Configuration: RelWithDebInfo ------
  [1/53] "e:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.14.26428\bin\HostX64\x64\cl.exe"  /nologo /TP -DUSE_CUDA_BACKEND -IC:\KataGo\cpp\external -IC:\KataGo\cpp\external\tclap-1.2.1\include -IE:\zlib\include -IE:\CUDA\include /DWIN32 /D_WINDOWS /W3 /GR /EHsc /MD /Zi /O2 /Ob1 /DNDEBUG   -std:c++14 /showIncludes /FoCMakeFiles\main.dir\core\elo.cpp.obj /FdCMakeFiles\main.dir\ /FS -c C:\KataGo\cpp\core\elo.cpp
  FAILED: CMakeFiles/main.dir/core/elo.cpp.obj 
  "e:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.14.26428\bin\HostX64\x64\cl.exe"  /nologo /TP -DUSE_CUDA_BACKEND -IC:\KataGo\cpp\external -IC:\KataGo\cpp\external\tclap-1.2.1\include -IE:\zlib\include -IE:\CUDA\include /DWIN32 /D_WINDOWS /W3 /GR /EHsc /MD /Zi /O2 /Ob1 /DNDEBUG   -std:c++14 /showIncludes /FoCMakeFiles\main.dir\core\elo.cpp.obj /FdCMakeFiles\main.dir\ /FS -c C:\KataGo\cpp\core\elo.cpp
c:\katago\cpp\core\global.h(32): error C3646: "__attribute__": unknown override specifier
...

When I run code analysis, I was told
C:/KataGo/cpp/core/global.cpp(14): fatal error C1083: Cannot open include file: "dirent.h": No such file or directory
and when I looked into core/global.cpp, I found
#include <dirent.h> //TODO this is not portable to windows, use C++17 filesystem library when C++17 is available

Is this the only obstruction to porting to Windows? (Seems it can work under Windows though: https://github.com/tronkko/dirent)

I experienced some problems with the Git portion of CMakeLists.txt, so I removed it (not sure what it is for), but some source files require program/gitinfo.h; would it work if I renamed gitinfotemplate.h to gitinfo.h?

@lightvector

Owner

commented Mar 1, 2019

Possibly. I don't develop on Windows, so I have not attempted to compile there, but it's plausible that this is the only obstruction. I've attempted to write all my code so that in theory it generalizes between Windows and Linux, but this was one of the points where I skipped that. I'd be happy to work on a fix for that in the next few days, if you like.

Renaming gitinfotemplate.h to gitinfo.h would "work" but isn't ideal. The git portion in CMakeLists.txt is supposed to regenerate gitinfo.h based on gitinfotemplate.h to contain a #define for the current git revision hash of the repo whenever the git revision hash changes. This #define is used in a few places so that the executable can be queried for what git revision it was compiled from, as well as being automatically output in various log files so that when you run self-play or other processes there is a record of what version of the code was used to produce that data. If you hack it to use the template directly, then you won't get this behavior.
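The configure-time mechanism described could look something like this hypothetical CMake sketch (not the repo's actual script; variable and target names are illustrative):

```cmake
# Capture the current git revision at configure time...
find_package(Git QUIET)
execute_process(
  COMMAND ${GIT_EXECUTABLE} rev-parse HEAD
  WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
  OUTPUT_VARIABLE GIT_REVISION
  OUTPUT_STRIP_TRAILING_WHITESPACE
)
# ...and substitute it into the template to produce program/gitinfo.h,
# which the sources then #include for a #define of the hash.
configure_file(program/gitinfotemplate.h program/gitinfo.h)
```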

@alreadydone

Author

commented Mar 1, 2019

Making progress:

  • removed __attribute__ ((noreturn)) and __attribute__ ((pure)) (I guess they're mainly for compiler optimization)
  • #define BYTE_ORDER LITTLE_ENDIAN for Windows
  • #include <algorithm> when using std::max
  • Now get an error at

    KataGo/cpp/game/board.cpp

    Lines 228 to 233 in e7fae0b

    int Board::getNumLibertiesAfterPlay(Loc loc, Player pla, int max) const
    {
    Player opp = getOpp(pla);
    int numLibs = 0;
    Loc libs[max];

    Since when does C++ support runtime-sized arrays, and which compilers accept them? I'm surprised that some people compiled successfully (even on Linux)...
@lightvector

Owner

commented Mar 1, 2019

Oh, that makes things more interesting.
Yeah, looking online, this appears to be a feature supported by g++ (including the MinGW and/or Cygwin toolchains for Windows) and is also part of the C99 standard, but it is not part of the C++ standard, so I'm guessing most compilers other than g++ don't support it.

Unfortunately, there are probably a nontrivial number of locations that do this, as it's incredibly convenient for tiny arrays that need to live only for the duration of a function without paying the cost of dynamic allocation or the mess of passing in buffers as arguments, and I've never really thought about not doing it when that was what the situation needed. Are there any simple alternatives?

@alreadydone

Author

commented Mar 1, 2019

Probably no good alternatives... It looks like variable-sized arrays were proposed for addition to the C++ standard but retracted or rejected for some reason. It's amazing that g++ allocates them on the stack. I replaced them with new and delete since I think those perform better than std::vector.

I am not familiar with MinGW or Cygwin. Are binaries compiled with their g++ easy to use? I.e., could I just distribute the .exe (not sure about the license), and with the DLLs (CUDA, cuDNN, etc.) it would then work on any Windows machine (with CUDA 10 installed)? Or does the user need to have MinGW/Cygwin installed?

I am doing this because your engine includes many long-desired features and deserves to be widely used.

@lightvector

Owner

commented Mar 1, 2019

For Cygwin, users will either need to have Cygwin installed, or else the executable needs to come bundled with the appropriate Cygwin dll files that it will depend on, but for MinGW, it compiles to a native windows binary.

So it sounds like if you're able to try MinGW, that's a promising option, and could be less work than getting the code to compile with MSVC. I've used MinGW before as many years ago I did some C++ coding on windows. Let me know if I can help out. I could also investigate the feasibility of getting rid of variable length stack allocations in the code, but that may be some work.

@alreadydone

Author

commented Mar 2, 2019

Thanks! I finally compiled successfully in VS (CUDA can't work without VS, and I'm not sure how, or whether it's possible, to treat it separately). When I launch main.exe it displays the possible arguments, but main.exe gtp (and many other arguments) crashes immediately. When I debug with main.exe gtp I see:
[screenshot of the debugger error]
Any ideas? I don't think any changes I made could corrupt the stack...

If you'd like to look at it, you can see what are changed at master...alreadydone:master
(In hindsight I really should change all runtime-sized arrays to vectors...)

@l1t1


commented Mar 2, 2019

If g++ is OK, try compiling with https://nuwen.net/mingw.html

@l1t1


commented Mar 2, 2019

@lightvector

Owner

commented Mar 2, 2019

@alreadydone - I looked over your code and didn't see any obvious problems. Do you have the callstack for that error?

Perhaps more illustrative, you could try running main.exe runtests to run a bunch of tests of the low-level components of the code, and see if you fail in any of them, rather than going straight to the full gtp engine. Also main.exe runoutputtests runs some somewhat higher-level end-to-end stuff and dumps a big pile of output that should exactly equal the contents of tests/results/runOutputTests.txt (what's intended to be tested is that its output should equal the contents of this file, and of course that it doesn't hit any asserts or exceptions).

@lightvector

Owner

commented Mar 2, 2019

@alreadydone - I just pushed a branch "pedantic": https://github.com/lightvector/KataGo/tree/pedantic

This makes the code compile under g++ using the "-pedantic" flag, which flags things like variable length arrays and other non-standard C++ constructs. It also does so in a way more likely to preserve good performance (many of the locations you found in board.cpp were fairly inner-loopish; changing them to use new/delete could be very harmful for performance).

All tests for me pass with this code. You'll have to redo your changes involving __attribute__ ((noreturn)) and the Git revision logging and a few of the defines you had to add, but otherwise, let me know if this is working for you or if you still run into stack corruption.

@alreadydone

Author

commented Mar 2, 2019

Thanks for the work! I'll try that branch.
The crash I got seems to happen at a very early stage:
[call stack screenshot]
Could it be endianness?

@lightvector

Owner

commented Mar 2, 2019

Yeah, there's a chance it's something like that. The SHA2 implementation is not my own implementation, but an open source one I found online.

If you still crash around that place, reply back. I'll still be here to chat and help figure things out. :)

@l1t1


commented Mar 3, 2019

Can you upload the Windows binary?

@intenseG


commented Mar 7, 2019

I tried to compile @alreadydone's fork on Windows, but it failed.

  Building Custom Rule C:/Users/inten/Desktop/BSK/KataGo/cpp/CMakeLists.txt
  CMake does not need to re-run because C:/Users/inten/Desktop/BSK/KataGo/cpp/CMakeFiles/generate.stamp is up-to-date.
  Microsoft(R) C/C++ Optimizing Compiler Version 19.16.27027.1 for x64
  Copyright (C) Microsoft Corporation.  All rights reserved.
  
  cl /c /IC:\Users\inten\Desktop\BSK\KataGo\cpp\external /I"C:\Users\inten\Desktop\BSK\KataGo\cpp\external\tclap-1.2.1\include" /I"C:\Users\inten\Desktop\sai-sai-0.15\msvc\packages\boost.1.68.0.0\lib\native\include" /I"C:\Users\inten\Desktop\sai-sai-0.15\msvc\packages\zlib-msvc14-x64.1.2.11.7795\build\native\include" /I"C:\Users\inten\Desktop\sai-sai-0.15\msvc\packages\libzip.1.1.2.7\build\native\include" /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\include" /Zi /W3 /WX- /diagnostics:classic /Od /Ob0 /D WIN32 /D _WINDOWS /D USE_CUDA_BACKEND /D "CMAKE_INTDIR=\"Debug\"" /D _MBCS /Gm- /EHsc /RTC1 /MDd /GS /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /std:c++14 /Fo"main.dir\Debug\\" /Fd"main.dir\Debug\vc141.pdb" /Gd /TP /FC /errorReport:prompt C:\Users\inten\Desktop\BSK\KataGo\cpp\core\global.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\config_parser.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\elo.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\fancymath.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\hash.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\logger.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\makedir.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\md5.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\rand.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\sha2.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\core\timer.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\game\board.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\game\rules.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\game\boardhistory.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\dataio\sgf.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\dataio\numpywrite.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\dataio\trainingwrite.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\dataio\loadmodel.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\dataio\lzparse.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\neuralnet\nninputs.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\neuralnet\modelversion.cpp 
C:\Users\inten\Desktop\BSK\KataGo\cpp\neuralnet\nneval.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\neuralnet\cudabackend.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\timecontrols.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\searchparams.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\mutexpool.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\search.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\asyncbot.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\search\distributiontable.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\program\setup.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\program\play.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testboardarea.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testboardbasic.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testrules.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testscore.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testnninputs.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testsearch.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testtime.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\tests\testtrainingwrite.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\evalsgf.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\gatekeeper.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\gtp.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\match.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\matchauto.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\selfplay.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\misc.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\runtests.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\lzcost.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\sandbox.cpp C:\Users\inten\Desktop\BSK\KataGo\cpp\main.cpp
error	LNK1120	38 unresolved externals	main	C:\Users\inten\Desktop\BSK\KataGo\cpp\Debug\main.exe	1	

An unresolved external reference error appears when generating main.exe.

I searched for a long time, but could not resolve the error.
Is there a way to fix it?

@alreadydone

Author

commented Mar 7, 2019

The readme says CUDA 10.0 is required, so CUDA 9.0 may not work. Also, have you installed cuDNN? https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-windows
BTW, I tried @lightvector's pedantic branch but couldn't get rid of the stack corruption. I followed execution step by step until the error popped up but couldn't identify any out-of-bounds writes or reads.

@intenseG


commented Mar 8, 2019

As @alreadydone says, I may be having a CUDA version issue; the unresolved external reference errors came from cudabackend.obj.
I will try to create an instance on GCP (I am not the author of the leela-zero project).
Thank you!

@lightvector

Owner

commented Mar 8, 2019

@alreadydone - Sounds like I should try at some point to compile on windows myself to track down where the stack corruption you're seeing might be from. Given that the pedantic branch didn't work for you, that means it was not related to any of the variable length array fixes you had to do for the stack, which means it's... something else? Interesting. I'll see if I can investigate in the next few days.
@intenseG - CUDA 10.0 is the version I've been using that I know the code works with. There's some chance that CUDA 9.0 works too, but I haven't tested it, so better use CUDA 10.0 if you can.

@lightvector

Owner

commented Mar 11, 2019

@alreadydone - I found a long-standing bug in sha2.cpp that would indeed cause stack corruption. Presumably the way g++ lays out the stack purely by chance it never caused any problems, so I hadn't noticed it, but it is now fixed! I also verified that a better address sanitizer than the one I had tested with before does flag the problem, and doesn't flag any other problems as far as I could test. Unfortunately, it was not compatible with CUDA, so I was not able to do a full test up to the point of a real search, but test searches without a neural net now run cleanly with the stricter address checking as well.

I also went ahead and cleaned up a few more things along the lines of the changes you had to make to get it to compile under Visual Studio, although not all of them. In particular, I just went ahead and made <algorithm> a global include, and also added some preprocessor checking to screen out "NORETURN" and "PURE" if not on g++.

I spent some time spinning up a windows machine on AWS with a GPU (since I don't have any here) and tried to actually compile myself, but having never used Visual Studio before, I couldn't get past the phase where I get all the libraries installed in ways that Visual Studio is happy with and is able to find in order to compile the project. So this is untested on Windows but let me know if the new fix makes it work for you. I updated both 'master' and 'pedantic'.

@alreadydone

Author

commented Mar 11, 2019

@lightvector Good catch! Not sure how I missed the stupid *buffer = (char)0;. It's also present in https://github.com/HowardHinnant/hash_append/blob/master/sha2.c, for example, and it looks like buffer = (char*)0; was intended. The stack corruption indeed goes away after the fix.

Thanks for the efforts to make compilation with VS easier. I might look into creating a VS project (to manage packages conveniently through NuGet) instead of using CMake (which I don't know how to direct to find packages on Windows), though I haven't created one from scratch before.

@alreadydone

Author

commented Mar 13, 2019

I successfully compiled and ran several genmove and things seem to be working properly. I made an archive https://userscloud.com/vtfosn1sgqqe with the binary and the released 15b net, along with some notes on the configs, mostly comparisons with similar parameters in LZ. I changed chosenMoveTemperature(Early) to 0 for maximal strength, and I am tempted to change numNNServerThreadsPerModel to 2 because I think one server thread won't be able to saturate a GPU, but I abandoned the idea when I saw that memory usage doubled.
The up-to-date branch that compiles with Visual Studio (when include/library directories are properly set in CMakeLists.txt) is at https://github.com/alreadydone/KataGo/tree/vs.
Strangely, zlib (zstr) won't read a .gz file properly for me. For the 15b net, an "Error parsing model file" error is thrown. When I examine the input stream, I see it terminates at -0.06479036 -0.02955167 0.02968, the 86th weight in conv1. Reading the decompressed .txt file works without trouble.

@alreadydone

Author

commented Mar 13, 2019

https://userscloud.com/uug7j4awz0w7
Updated my readme.txt a bit and removed the restriction to board sizes 9-19. The engine plays reasonably on a 4x4 board, but on 21x21 it hangs on genmove. Does it take the engine equal time to evaluate a 9x9 position and a 19x19 position, or is the former faster?

I see that board sizes beyond 19 are impossible:

namespace NNPos {
  //Currently, neural net policy output can handle a max of 19x19 boards.
  const int MAX_BOARD_LEN = 19;
@lightvector

Owner

commented Mar 13, 2019

There's an excellent chance that if you change that number to be larger, along with the number in game/board.h (search for "19" in that file as well), it will work. Try it!

Although note that the tests in "main.exe runtests" will not pass since it will be generating zobrist hashes for a larger maximum board size, so all the hash values of all the board states will change.

Edit: Tested bumping it up to 23, it looks like at least a short search from the empty position works fine. Also, I'm not sure why zlib isn't working for you, maybe something about how zstr works with the Windows version of the library isn't good or something. Not exactly sure how to go about debugging that.

Edit#2 - By the way, in case you're curious, in the past with a much older neural net version (trained on LZ data) I tried measuring the effect of temperature for making moves. Recalling from memory without digging deep into my notes, the effect was a loss of the order of 20 to 40 Elo when using 400 or 800 visits for temperatures as high as using 0.3 for the whole game. So actually surprisingly little in the grand scheme of things for such a noticeable temperature.

My intuition is that temperatures of much less than that should have very little harm on the strength, particularly very early in the game. For example, at 0.1, it will be happy to choose between two moves that are very close in visits randomly, but if one of them even gets, say, a 30% advantage in visits over the other, that translates to only a 7% chance that it will play the lesser-visited one, and by the time you get to double the visits of the lesser one, that's only a 1:1000 chance of playing the lesser one. This allows for a bit more opening diversity between nearly equal moves beyond just neural net symmetry randomization, but only when two moves are very close.

@alreadydone

Author

commented Mar 13, 2019

Thanks for the info. On my notebook (940M) with batchsize 16 (default) I got
486 batches in 23.75s on 8x8, 20.46 b/s
856 batches in 55.98s on 19x19, 15.29 b/s
so it seems inference is faster on small boards, but the gain is small. Should I increase batch size as board size decreases?

Now that I look at more log entries it seems it's also around 20 b/s on 19x19. Do I need to adjust NNPos::MAX_BOARD_LEN to speed up inference on small boards? That would require compiling one binary for each board size.

BTW, are there ways to query/interact with the engine to get more search/eval info other than looking into gtp.log?

@alreadydone

Author

commented Mar 13, 2019

I compiled a board-size-37 version, and the row/column numbering disappears when calling showboard. Moreover, genmove b yields (3,34) (no longer letter+number! Where is the conditional switch?). GTP really needs to be extended to handle large board sizes; in the future people will watch exhibition matches on large boards between AIs. The speed indeed dropped a lot: now 986 batches in 115.7s, amounting to 8.5 b/s. The same speed on 9x9. Memory usage also jumps (up from ~1GB to ~2GB).

So how did you take advantage of smaller boards during self-play? From my tests it seems inference speed is completely determined by the compile-time constant NNPos::MAX_BOARD_LEN. Did you only benefit from the shorter length of games and not from faster NN evaluations? To take advantage of faster NN evals with the current code, it seems you need to quit the engine after each game and start another engine compiled for a different board size.

BTW I am a bit surprised that 0.3 for the whole game only loses 20-40 Elo even when chosenMoveSubtract = 0 and chosenMovePrune = 1 (the default settings).

@lightvector

Owner

commented Mar 14, 2019

I didn't try to take advantage of smaller boards during self play computationally. One could do so in theory, but it would be more complicated, since you would not be able to batch the neural net queries from each together, in which case they would be running separately and then the logic for how to keep them running at the right proportions or how to feed the queries to the right GPU server threads would get a lot more complicated. Taking advantage of the shorter games and greater effectiveness of search was already good enough. Also, once you start hitting pro level, my intuition is that although 19x19 generalizes downward pretty well, smaller boards will not generalize upward quite so well, which means you want a lot of your training on the largest sizes anyways where you don't gain anything.

You should be able to do much better than recompiling for a particular size if you want faster evaluations on small boards, although it's not prominently documented. Try adding "maxBoardSizeForNNBuffer=9" into the config for a 9x9 game even when you've compiled for up to 19x19. That should make the tensors on the GPU a lot smaller. I haven't done the work to make this buffer sizing dynamic based on the GTP commands.

Turning locations to strings is handled in game/board.cpp, in Location::toString. I agree GTP is a pretty hacky protocol, it gives no advice on how to do lettering for boards larger than 25x25 and says they're not supported, so I just did something arbitrary in that case. If there's a different arbitrary choice that works better with other programs then I could implement that instead.

@lightvector

Owner

commented Mar 14, 2019

There aren't any more ways to interact with the engine built into GTP right now, since that would also require adding nonstandard commands, and up until recently I had no idea which nonstandard extensions were starting to become de-facto standard simply because people were implementing them in practice. Are there important such de-facto extensions you think would be useful to implement?

You can evaluate individual positions in an existing sgf file using the "evalsgf" subcommand of the program. That will dump to stdout much the same kind of output that gets dumped to the log, but with slightly more options for interacting with it. Of course, more options still could be implemented, I've only been implementing things as I've needed for getting things to work (e.g. playing on OGS) or for my own debugging use, but it's pretty easy for me to add more things.

@alreadydone

Author

commented Mar 14, 2019

OK, thanks for letting me know about the hidden parameter maxBoardSizeForNNBuffer which I didn't see in gtp_example.cfg. (I guess I just need to dig a bit more into program/setup.cpp to find more.)

Evaluating one move at a time seems a strange idea; software like GoReviewPartner analyzes one game at a time. It would be desirable to implement lz-analyze when KataGo gets stronger, but currently I think people just want to see its winrate estimate (to see if it thinks it has caught up in handicap games, for example). (I told people that KataGo never gives up and has no ladder problems, and people have been testing the 15x192 net's performance in handicap games. The 10x128 and 6x96 nets (along with Zen7 and LZ 10b nets) have been serving as "punching bags" (given handicap) in handicap games.)

@lightvector

Owner

commented Mar 14, 2019

Well, ideally maxBoardSizeForNNBuffer wouldn't exist at all for GTP, it would just do it based on the GTP-sent board size. :)

Yeah evalsgf has mostly been a debugging tool, not a user tool. I had only been implementing exactly what I needed myself and nothing more, since doing otherwise would mean more weeks before being able to complete my paper.

I'll look into implementing lz-analyze. Thanks!

@alreadydone

Author

commented Mar 14, 2019

For Windows users, I want to clarify that with the cudart, cublas, and cudnn CUDA 10 DLLs you can run the compiled binaries above even with only CUDA 9 installed. (A friend tested this since he didn't want to break TensorFlow by installing CUDA 10.)

@l1t1


commented Mar 14, 2019

Can you add support for CUDA 9?

@lightvector

Owner

commented Mar 14, 2019

I'm not sure. Does it work with CUDA 9 right now or is there something that breaks? I specified CUDA 10 since that's what my cloud setup had and so it's the one I'm using, but there's a chance it already works with CUDA 9.

@alreadydone

Author

commented Mar 14, 2019

@l1t1 Have you tried to run it with CUDA9 installed? What error did you get (if any)? Probably it will say you are missing one of cublas64_100.dll, cudart64_100.dll, cudnn64_7.dll, or msvcr110.dll. The first three are bundled in https://userscloud.com/c8llnul1lmrr, and the last one can be installed with vcredist_x64.exe included in KataGo GTP engine and networks (up to 37x37, arbitrary komi, handicap play).

@l1t1


commented Mar 14, 2019

thanks @alreadydone

@l1t1


commented Mar 14, 2019

Do you know which CUDA version lc0 uses?
https://github.com/LeelaChessZero/lc0/releases

@l1t1


commented Mar 14, 2019

I see, they use cublas64_92.dll

@Friday9i


commented Mar 15, 2019

Seems nice @alreadydone, but could you ideally share the zip somewhere else? Usercloud seems doubtful...: my antivirus stopped a "malicious script" (whether it is really malicious or not, I don't know, but I won't risk it), and to download the file I also need to install a Chrome extension that changes my default search engine: no way...
But then I have no access to the compiled version ; -(
So if you can upload it somewhere else, that would be very nice. Thanks a lot

@l1t1


commented Mar 15, 2019

When using Usercloud, you don't need to install anything in the IE/Firefox browser; don't click on popup windows, just keep clicking the continue button on the first page until the real link shows.

@lightvector

Owner

commented Mar 16, 2019

@alreadydone - I implemented enough of lz-analyze that I hope it works with Lizzie. Changes are pushed to master and pedantic. Note that Lizzie attempts to parse the version output from GTP and complains if it is not the version it expects from Leela Zero, so to make it work you will need to pass -override-version 0.16 when running main gtp, making KataGo pretend it is version 0.16 and mimic what Leela Zero would say so that Lizzie does not complain. I did not test other tools that use lz-analyze.

I was not able to fully test Lizzie either, though, so let me know if there are issues. I don't actually have a local computer with a GPU, only cloud and remote machines, and could not figure out how to make a local Windows machine connect through an ssh tunnel to run KataGo on Linux remotely for a local Lizzie. I did eventually figure out how to run Lizzie remotely as well and X11-forward the window. Unfortunately the X11 was so laggy as to be unusable, allowing me to barely verify that KataGo's lz-analyze output is a close enough mimic that Lizzie starts and displays some numbers, but not to do any further testing for other problems.

I also implemented kata-analyze. Same as lz-analyze but it does not multiply the winrates and such by 10000 and round them, it just leaves them as floats, and it also reports the expected score.

Additionally, with another (unpushed) hack, I got the version running on OGS (https://online-go.com/player/592684/) to report the winrate and mean expected score each move. Yay. Although the PV isn't displayed as nicely as roy7 did for Leela Zero, since that requires quite a bit more work with the OGS api than I had time to dig into - sending a chat message is just a matter of sending a string, sending a full variation requires constructing some more complex json object.

@Friday9i @l1t1 - Let me know if there's anything reasonable I can help with on the windows stuff, although it sounds like I don't have much to contribute over @alreadydone having not actually compiled on Windows myself.

@thorsilver


commented Mar 16, 2019

@l1t1 the link does not work at all for me. I just get bombarded with popups and the link loads but leads to an error page. Have tried repeatedly and always the same result.

Can someone put the compiled version somewhere else please?


@Friday9i


commented Apr 10, 2019

I tried to use the Windows version (provided above) with Sabaki, but no success ; -(
Does anyone know the parameters to use? I tried several variants of the 3 lines in "Manage Engines", around:

  • "C:[...]\KataGo\main37.exe" (or main.exe)
  • "-gtp -model C:[...]\KataGo\15b.txt.gz -config C:[...]\KataGo\configs\gtp_example.cfg" (and I tried also with -v 1000)
  • Nothing on the third line (or "time_settings 0 1 0" as for LZ)

But whatever I tried, I get a "connection failed".
Thanks a lot

@alreadydone

Author

commented Apr 11, 2019

  • There should not be a dash before gtp.
  • Maybe 15x192.txt instead of 15b.txt.gz? It's reported above that gzipped model doesn't work...
  • Maybe (definitely for Lizzie) you need to append -override-version 0.16 (after gtp_example.cfg).
  • I think time_settings works but -v 1000 doesn't.
  • You may want to refer to the readme.txt or 说明 Instructions.txt in the archives I made.
@Friday9i


commented Apr 11, 2019

Thanks a lot for the advice @alreadydone!
I had already read the readme.txt and tried without the dash before gtp, but I also forgot to use .txt instead of .gz: that may be why it didn't work. I'll try again tonight (European time) and will give an update on the result here, as well as the precise parameters used if it works ; -)
Thanks a lot!

@alreadydone

Author

commented Apr 11, 2019

You can run the command in the console for a more visible and faster check. Though KataGo won't print a message when the model finishes loading, you can type showboard or genmove to check whether things initialized correctly (optionally, watch the task manager until memory usage stabilizes, which is a sign that model loading has completed; look into gtp.log to see if you get any errors).

I checked that it works fine in Lizzie, but it seemed to have problems with mylizzie or Sabaki's analysis mode.

@Friday9i


commented Apr 11, 2019

The problem was the .gz, thx! Hence, in Sabaki under "Manage Engines" I use:

  • "D:\[...]\KataGo\main37.exe"
  • "gtp -model D:\[...]\KataGo\15b.txt -config D:\[...]\KataGo\configs\gtp_example.cfg"
  • "time_settings 0 6 1" (or 0 1 0 and visits/playouts in the cfg file)

Note: to modify the cfg file, I change it to a .txt extension, modify it, then change it back to .cfg.

@lightvector

Owner

commented Apr 13, 2019

@alreadydone pedantic branch changes are all merged into master now, along with various bugfixes, implementation of LCB, and other minor things.

@alreadydone

Author

commented Apr 14, 2019

Thanks for the work!
Updated executable main.zip and my vs branch.

@l1t1


commented Jun 19, 2019

Could you post a step-by-step compile guide from scratch?

@petgo3


commented Jul 16, 2019

@alreadydone: Since I still can't compile on Windows, could you perhaps merge lightvector's latest pushes into your fork and compile?
There is at least one interesting addition for handicap play included.

@lightvector

Owner

commented Jul 16, 2019

@alreadydone

Author

commented Jul 23, 2019

Closing as obsolete :)
