Dual head network like AlphaZero #560

Open
b0n541 opened this issue Jun 8, 2023 · 3 comments

Comments

Contributor

b0n541 commented Jun 8, 2023

I would like to use KotlinDL to create an AlphaZero-like dual-head network for my game.

Unfortunately I haven't found any hint on how to accomplish that.

Is this already possible or would it be a new feature to be implemented?

Collaborator

zaleslaw commented Jun 9, 2023

Hi @b0n541, if you want to create an AlphaZero-style model, could you please share some notes on a Keras/PyTorch implementation of it?

After that I can probably say more about whether it is possible or not.

Contributor Author

b0n541 commented Jun 9, 2023

I have found the following information:

https://medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0
https://github.com/deepmind/open_spiel/blob/master/docs/alpha_zero.md#model
https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/algorithms/alpha_zero/model.py

What we would need is a way to split the network, after the first few shared layers, into two separate stacks of layers, each with its own output: one for the game value prediction and one for the move prediction.
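To make the requested structure concrete, here is a minimal, framework-free Python sketch of a shared trunk feeding a policy head and a value head. It is only an illustration under stated assumptions: the class name `DualHeadNet`, the layer sizes, the random initialisation, and the single dense trunk layer are all hypothetical and are not taken from KotlinDL or from the linked OpenSpiel model. The loss follows the usual AlphaZero form (squared value error plus policy cross-entropy), with the regularisation term left out.

```python
import math
import random

def dense(x, weights, biases):
    """Fully connected layer: one output per (weight row, bias) pair."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

class DualHeadNet:
    """Hypothetical dual-head network: shared trunk, policy head, value head."""

    def __init__(self, n_inputs, n_hidden, n_moves, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, n_inputs, n_hidden)        # shared layers
        self.policy_head = init_layer(rng, n_hidden, n_moves)   # move prediction
        self.value_head = init_layer(rng, n_hidden, 1)          # game value prediction

    def forward(self, state):
        shared = relu(dense(state, *self.trunk))
        # The network splits here: both heads read the same shared features.
        policy = softmax(dense(shared, *self.policy_head))      # distribution over moves
        value = math.tanh(dense(shared, *self.value_head)[0])   # scalar in [-1, 1]
        return policy, value

def combined_loss(policy, value, target_policy, target_value):
    """Squared value error plus policy cross-entropy (no regularisation)."""
    value_loss = (target_value - value) ** 2
    policy_loss = -sum(t * math.log(p) for t, p in zip(target_policy, policy))
    return value_loss + policy_loss

# Example: a 3x3 board encoded as 9 numbers, with 9 possible moves.
net = DualHeadNet(n_inputs=9, n_hidden=16, n_moves=9)
state = [0.0] * 4 + [1.0] + [0.0] * 4
policy, value = net.forward(state)
loss = combined_loss(policy, value, target_policy=[1.0 / 9] * 9, target_value=0.0)
```

The point relevant to this request is that one input produces two outputs, and that training sums the losses of both heads so that gradients from each flow back into the shared layers.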

EgoLLC commented May 12, 2024

Do you mean reinforcement learning?

Can I implement reinforcement learning in KotlinDL to play chess or any other 1v1 board game, with the model playing against itself?
