
Small fixes to atomic and NNUE support based on Fairy-Stockfish changes #606

Merged · 2 commits · Jul 3, 2024

Conversation

chocolatebakery

Hey, I modified MultiVariant a bit and fixed some bugs regarding pawn captures (i.e., pawns being ignored when in check) and promotions not working correctly.
It's working, but I feel like there's still something missing.
Regardless, I did train a NNUE for atomic which can be downloaded at:
https://www.kaggle.com/datasets/chocobakery/multi-variant-atomic-v0

I merged the tools from official-stockfish and generated games in .bin format (since .binpack is strictly for standard chess, as described by the author of Fairy-Stockfish).

The NNUE was trained using the official https://github.com/official-stockfish/nnue-pytorch by changing the layer size.

To add to that, training with the official nnue-pytorch on plain .bin data is much slower than using binpacks, so it was somewhat painful.

Reading the official Stockfish Discord, the formula I used for epochs/batch size was:

Total Epochs = 3M / (DataSize / BatchSize)

as described here: https://discord.com/channels/435943710472011776/882956631514689597/1102536960544870441
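Since DataSize / BatchSize is the number of optimizer steps per epoch, that rule of thumb amounts to training for roughly 3M optimizer steps in total. A minimal sketch of the arithmetic (the 500M-position dataset and 16384 batch size are hypothetical example values, not numbers from this PR):

```python
def total_epochs(data_size: int, batch_size: int,
                 total_steps: int = 3_000_000) -> int:
    """Epochs needed so that the total optimizer steps ~= total_steps."""
    steps_per_epoch = data_size / batch_size
    return round(total_steps / steps_per_epoch)

# Hypothetical example: 500M positions at batch size 16384
print(total_epochs(500_000_000, 16384))  # 98
```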

I learnt later that the full network is already overkill; reducing the layer size for variants would still work, resulting in faster training.
The architecture is obviously different from Fairy-Stockfish's (which uses HalfKA_v2): this one has access to HalfKA_v2_hm (horizontally mirrored king buckets).
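For illustration only (a hedged sketch, not the actual nnue-pytorch implementation): the "hm" in HalfKA_v2_hm stands for horizontal mirroring of the king square, so a king on files e..h is reflected onto files a..d and the 64 king squares collapse into 32 buckets:

```python
def mirrored_king_bucket(king_sq: int) -> int:
    """Map a 0..63 square (a1=0, h8=63) to one of 32 mirrored buckets.

    Sketch of the HalfKA_v2_hm idea: kings on files e..h are reflected
    to files d..a, halving the king-square dimension of the feature set.
    """
    file, rank = king_sq % 8, king_sq // 8
    if file >= 4:          # e..h -> mirror to d..a
        file = 7 - file
    return rank * 4 + file

# e1 (square 4) and d1 (square 3) land in the same bucket
print(mirrored_king_bucket(4), mirrored_king_bucket(3))  # 3 3
```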

I know the project is archived, but it has left its mark on me, so I wanted to give at least some contribution back.

Best Regards,

@ianfab
Collaborator

ianfab commented Jul 3, 2024

Nice. I would assume that KotH, atomic, and extinction should (be possible to make) work in scope of this project with official SF NNUE architecture, and 3check to some degree if you add a check bonus on top of the NNUE eval. Racing kings, crazyhouse, antichess, and horde are much more problematic due to symmetry, pockets, and king count, so they probably only make sense in Fairy-SF.

@chocolatebakery
Author

Regarding the other variants: yeah, it would be quite a lot of work (which was already done brilliantly in Fairy-Stockfish; big props to you, @ianfab).

I did not test extinction or KOTH, but in principle, it should work, even losers and racing kings as well.

I did think about reverting the architecture to match Fairy-Stockfish's NNUE, but I don't think the eval would be that different, considering it's from the same family; the only difference is that this one has access to antichess and atomic tablebases (though reading the Discord, back then, before using Leela data, they didn't even generate data with tablebases).

I also thought about re-tuning the HCE parameters here with Texel tuning, or with SPSA using the NNUE, but I'm not sure how different it would become, and I don't know how to do it myself either.

Fairy-Stockfish already has all the tools and the appropriate balance with HCE and NNUE for variants, and it covers everything really well. Again, big props to both of you for amazing projects!

@ianfab
Collaborator

ianfab commented Jul 3, 2024

> I did not test extinction or KOTH, but in principle, it should work, even losers and racing kings as well.

Yes, losers I missed; that should be fine as well. On racing kings I disagree: while it might technically work, if you do not change the symmetry assumptions, NNUE will likely return rather nonsensical evals, so it would at least require some extra code in the engine and trainer to be usable.

Due to its higher degree of specialization, this project should in principle always have a higher Elo ceiling than Fairy-SF for the variants it supports; the main question is of course whether this edge is worth the extra effort of also supporting NNUE here. For the low-hanging fruit it might be worth it, although a split in focus between the two projects is also suboptimal. But of course I am a bit biased.

@ddugovic
Owner

ddugovic commented Jul 3, 2024

Oh wow, thanks for this submission.

Although I am overwhelmed at present and struggle to maintain this repository, I am glad to merge PRs such as this one, which appear not to raise security concerns.

@chocolatebakery
Author

chocolatebakery commented Jul 3, 2024

🧡 @ddugovic

@ianfab Yeah, about racing kings you are absolutely right; I didn't consider the symmetry at all (and possibly pawns should be removed from the pieces considered in training as well).
As for 3check, I've read on Discord that it's already hard to do in Fairy-Stockfish.
Regarding the NNUE eval and maintaining two different projects, the extra effort is probably not worth it, as you said.

But it's always nice to see two different "entities", if we can call them that, although from what I observed, the more data used in training for all variants, the more similarly both entities will play.

Since official Stockfish completely removed the HCE, it's basically impossible for MV-Stockfish to keep up with their changes, even more so now that they rely entirely on Leela's data (billions and billions of positions) and their architecture has been extended to support enormous amounts of data.
Unless someone went all out and paid for a VPS to generate trillions of positions for each variant, then trained a big/small NNUE pair per variant, but that's far too much computational power and time to spend.

Fairy-Stockfish is already superhuman with just the HCE, even more so with NNUE.

And again big props to both of you for amazing projects.

@ddugovic ddugovic merged commit 8a59ec9 into ddugovic:master Jul 3, 2024
22 of 25 checks passed