Connect4 seems not working. Am I wrong? #36

Closed
verystrongjoe opened this issue Apr 24, 2020 · 2 comments

Comments

@verystrongjoe

I am running your code on the connect4 game. It has already run for more than 250k steps, but the reward value keeps declining and is approaching the bottom.

@werner-duvaud
Owner

Hi,

Running MuZero on connect4 requires a lot of computing power that we don't have yet. The default hyperparameters may not allow good learning.

We tested it quickly today with a slightly larger number of blocks and a larger replay buffer, and it seems to learn slowly.
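For readers who want to try a similar adjustment, here is a minimal sketch of bumping the two settings mentioned above. The class name `Connect4Config`, the field names `blocks` and `replay_buffer_size`, and all default values are hypothetical stand-ins for illustration; they are not taken from the project's actual configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Connect4Config:
    # Hypothetical field names and default values, for illustration only.
    blocks: int = 3                 # number of residual blocks in the network
    replay_buffer_size: int = 3000  # number of self-played games kept for training


base = Connect4Config()

# "Slightly increased" is not quantified in the thread; these increments are assumptions.
tuned = replace(
    base,
    blocks=base.blocks + 2,
    replay_buffer_size=base.replay_buffer_size * 2,
)

print(tuned)
```

Keeping the base configuration immutable and deriving a tuned copy makes it easy to compare runs against the defaults.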

Which metric are you calling the reward value?
How many self-played games do you have after 250k training steps?
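The second question is about the balance between training and self-play, which can be summarized as training steps per self-played game. The sketch below computes that ratio; the helper name `steps_per_game` is an assumption, and the game count in the example is a made-up placeholder, not a figure from this issue.

```python
def steps_per_game(training_steps: int, played_games: int) -> float:
    """Return the number of training steps taken per self-played game."""
    if played_games <= 0:
        raise ValueError("played_games must be positive")
    return training_steps / played_games


# 250k training steps comes from the report above; the number of games
# is a hypothetical placeholder, not a measurement from this issue.
ratio = steps_per_game(250_000, 5_000)
print(f"{ratio:.1f} training steps per self-played game")
```

A very high ratio would suggest the network is being trained repeatedly on a small set of games, which may be worth checking before tuning anything else.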

@fidel-schaposnik
Contributor

If you haven't done so already, you may want to check https://medium.com/oracledevs/lessons-from-alpha-zero-part-6-hyperparameter-tuning-b1cfcbe4ca9a and the previous articles in that series to get an idea of the hyperparameter values you could use (many translate directly from AlphaZero).
