Small fixes to atomic and NNUE support based on Fairy-Stockfish changes #606
Conversation
Nice. I would assume that KotH, atomic, and extinction could be made to work within the scope of this project with the official SF NNUE architecture, and 3check to some degree if you add a check bonus on top of the NNUE eval. Racing kings, crazyhouse, antichess, and horde are much more problematic due to symmetry, pockets, and king count, so they probably only make sense in Fairy-SF.
Regarding the other variants, yeah, it would be quite a lot of work (which was already done brilliantly in Fairy-Stockfish, big props to you @ianfab). I did not test extinction or KotH, but in principle they should work, and losers and racing kings as well. I did think about reverting the architecture to match Fairy-Stockfish's NNUE, but I don't think the eval would be that different, considering it's from the same family; the only difference is that this one has access to antichess and atomic tablebases (though reading the Discord, back then, before using Leela data, they didn't even generate data with tablebases). I also thought about re-tuning the HCE parameters here with Texel tuning, or with SPSA using NNUE, but I'm not sure how different the result would be, and I don't know how to do it myself either. Fairy-Stockfish already has all the tools and the appropriate balance between HCE and NNUE for variants, and it covers everything really well. Again, big props to both of you for amazing projects!
Yes, losers I missed, that should be fine as well. On racing kings I disagree: while it might technically work, if you do not change the symmetry assumptions, NNUE will likely return rather nonsensical evals, so it would at least require some extra code in the engine and trainer to be usable. Due to its higher degree of specialization, this project should in principle always have a higher Elo ceiling than Fairy-SF for the variants it supports; the main question is of course whether this edge is worth the extra effort of also supporting NNUE here. For the low-hanging fruit it might be worth it, although a split in focus between the two projects is also suboptimal. But of course I am a bit biased.
Oh wow, thanks for this submission. Although I am currently overwhelmed and struggle to maintain this repository, I am glad to merge PRs such as this one that do not appear to raise security concerns.
@ianfab Yeah, about racing kings you are absolutely right, I didn't consider the symmetry at all (and possibly pawns should also be removed from the pieces considered in training). But it's always nice to see two different "entities", if we can call them that, although from what I observed, the more data used in training for all variants, the more similarly both entities play. Since official Stockfish completely removed the HCE, it's basically impossible for MV-Stockfish to keep up with their changes, even more so since they now rely entirely on Leela's data (billions and billions of positions) and their architecture has been extended to support enormous amounts of data. Fairy-Stockfish is already superhuman with just HCE, even more so with NNUE. And again, big props to both of you for amazing projects.
Hey, I modified MultiVariant a bit and fixed some bugs regarding pawn captures (i.e., pawns being ignored when in check) and promotions not working correctly (the atomic rule involved is sketched below).
It's working, but I feel like there's something missing.
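For context, here is a minimal sketch of the atomic explosion rule that the pawn handling revolves around: a capture removes the captured piece, the capturing piece, and every non-pawn piece adjacent to the capture square, which is why pawns need special treatment in check evasion. The board representation and function names below are hypothetical illustration, not the engine's actual code.

```python
# Hypothetical illustration of the atomic chess explosion rule.
# `board` maps a square index (0..63, a1 = 0) to a piece string
# like "wP" or "bK", or to None for an empty square.

def adjacent_squares(sq):
    """All squares in the 3x3 area around sq, excluding sq itself."""
    rank, file = divmod(sq, 8)
    for dr in (-1, 0, 1):
        for df in (-1, 0, 1):
            if dr == df == 0:
                continue
            r, f = rank + dr, file + df
            if 0 <= r < 8 and 0 <= f < 8:
                yield r * 8 + f

def apply_atomic_capture(board, from_sq, to_sq):
    """Apply a capture on to_sq with the atomic explosion rule: both
    the captured and the capturing piece are removed, and so is every
    *non-pawn* piece adjacent to the capture square. Pawns next to
    the blast survive, which is what makes evasion generation tricky."""
    board[from_sq] = None   # the capturing piece explodes too
    board[to_sq] = None     # the captured piece is always removed
    for sq in adjacent_squares(to_sq):
        piece = board[sq]
        if piece is not None and not piece.endswith("P"):
            board[sq] = None
```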
Regardless, I did train an NNUE for atomic, which can be downloaded at:
https://www.kaggle.com/datasets/chocobakery/multi-variant-atomic-v0
I merged the tools from official-stockfish and generated games in the .bin format, since .binpack is strictly for standard chess, as described by the author of Fairy-Stockfish (a sketch of the .bin record layout follows).
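For reference, this is the commonly documented layout of one .bin training record (PackedSfenValue, 40 bytes per position) together with a small reader; field packing details can differ between tool branches, so treat it as a sketch rather than a spec.

```python
import struct

# One .bin record (PackedSfenValue) as commonly documented for the
# Stockfish training-data tools: a 32-byte packed position, an int16
# score, a uint16 move, a uint16 game ply, an int8 game result, and
# one padding byte -- 40 bytes in total.
RECORD = struct.Struct("<32s h H H b x")

def read_bin_records(path):
    """Yield (packed_position, score, move, game_ply, game_result)."""
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            if len(chunk) < RECORD.size:
                break  # truncated trailing record
            yield RECORD.unpack(chunk)

# Example: count the positions in a data file.
# n = sum(1 for _ in read_bin_records("atomic.bin"))
```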
The NNUE was trained using the official https://github.com/official-stockfish/nnue-pytorch trainer, with the layer size changed.
On top of that, training with the official nnue-pytorch on plain .bin data is much slower than with binpacks, so it was somewhat painful.
Reading the official Stockfish Discord, the formula I used to relate epochs and batch size was:

TotalEpochs = 3M / (DataSize / BatchSize)

Since DataSize / BatchSize is the number of optimizer steps in one pass over the data, this targets roughly three million training steps in total, as described here:
https://discord.com/channels/435943710472011776/882956631514689597/1102536960544870441
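As a worked example of that formula (the dataset and batch sizes below are illustrative placeholders, not the values actually used):

```python
# Worked example of the epoch formula above.
TOTAL_STEPS = 3_000_000  # the "3M": total optimizer steps to target

def total_epochs(data_size, batch_size):
    steps_per_epoch = data_size / batch_size  # batches per pass over the data
    return TOTAL_STEPS / steps_per_epoch

# e.g. 500M positions at a batch size of 16384:
print(total_epochs(500_000_000, 16_384))  # ~98 epochs
```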
I learned later that the full network is actually already overkill; reducing the layer size for variants would still work and would result in faster training.
The architecture is obviously different from Fairy-Stockfish's (which uses HalfKAv2); this one uses HalfKAv2_hm, i.e. horizontally mirrored king buckets (sketched below).
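To illustrate what the mirroring does, here is a rough sketch of the idea behind HalfKAv2_hm: when the king stands on files e-h, every square is mirrored horizontally so the king always lands on files a-d, halving the number of king buckets compared to plain HalfKAv2. This only demonstrates the square transformation, not the full feature indexing of either network.

```python
# Rough sketch of the horizontal mirroring ("hm") in HalfKAv2_hm.
# Squares are 0..63 with a1 = 0 and h8 = 63.

def hm_transform(king_sq, sq):
    """Return sq after the horizontal mirror implied by king_sq."""
    if king_sq % 8 >= 4:   # king on files e-h: mirror the board
        return sq ^ 7      # flips the file: a<->h, b<->g, c<->f, d<->e
    return sq              # king already on files a-d: no change

def king_bucket(king_sq):
    """Index the king into 32 buckets (4 files x 8 ranks) instead of
    64 squares, so each bucket sees twice as many training samples."""
    king_sq = hm_transform(king_sq, king_sq)
    rank, file = divmod(king_sq, 8)
    return rank * 4 + file  # file is guaranteed 0..3 after mirroring
```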
I know the project is archived, but it left a mark on me, so I wanted to give at least some contribution back to it.
Best Regards,