No issue here, just wanted to ask a couple of questions #1
Comments
Hey, I'll help if I can - I'm by no means an expert on RL. I assume you have seen this?
Yeah, I've read it a couple of times, but some of the things I have questions about are only covered in the code, and I've never worked with JAX/Haiku before, so fully understanding the code is not the easiest task in the world (I'm working with PyTorch).
Also, what reward system did you use? I'm curious how you rewarded the first two rolls, where the bot is supposed to re-roll the dice.
I didn't explore this very well so I can't say much. Pre-training helped a lot in my case. You could try (lots of) supervised pre-training iterations with a simple heuristic, then see what happens if you switch on the agent (for example, does it revert to random play, keep the pre-trained policy, or start learning on top of it).
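To make the pre-training idea concrete, here is a minimal sketch of how labelled examples could be generated from a simple heuristic. Everything in it is an assumption for illustration, not the code from this repo: the "keep the most common face" rule, the names `heuristic_keep_mask` and `make_pretraining_set`, and the five-dice setup (Yamb would need six dice and different rules).

```python
import random
from collections import Counter

NUM_DICE = 5  # assumption: standard Yahtzee; a Yamb variant would differ


def roll(n, rng):
    """Roll n six-sided dice using the given random generator."""
    return [rng.randint(1, 6) for _ in range(n)]


def heuristic_keep_mask(dice):
    """Hypothetical heuristic: keep every die showing the most common face.

    Ties between faces are broken in favour of the higher face.
    Returns a list of 0/1 flags, one per die (1 = keep, 0 = re-roll).
    """
    counts = Counter(dice)
    target = max(counts, key=lambda face: (counts[face], face))
    return [1 if d == target else 0 for d in dice]


def make_pretraining_set(num_examples, seed=0):
    """Build (dice, keep_mask) pairs to use as supervised targets."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_examples):
        dice = roll(NUM_DICE, rng)
        data.append((dice, heuristic_keep_mask(dice)))
    return data


if __name__ == "__main__":
    for dice, mask in make_pretraining_set(3):
        print(dice, "->", mask)
```

The pairs could then be fed to the policy head with a cross-entropy loss for many iterations before the RL updates are switched on.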
That's almost the same as what I use; the only difference is that I pass the count of dice. So it's
I use a reward of 0 for the first two rolls. Only the final roll receives a reward (the score).
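As a rough illustration of what that reward scheme means for the returns the critic sees, here is a small sketch. The function name `discounted_returns`, the discount factor, and the example score of 25 are assumptions for illustration only, not values taken from this repo.

```python
def discounted_returns(rewards, gamma=1.0):
    """Compute the return G_t = r_t + gamma * G_{t+1} for each step."""
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the next step.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # One turn: two re-roll decisions with zero reward, then the score
    # for the chosen category (25 is a made-up example score).
    turn_rewards = [0.0, 0.0, 25.0]
    print(discounted_returns(turn_rewards, gamma=1.0))  # [25.0, 25.0, 25.0]
```

With no discounting inside a turn, both re-roll decisions are credited with the full final score, so the zero intermediate rewards do not stop the early actions from receiving a learning signal.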
Looks like you have a bug in your implementation :)
Heyy,
First of all, great project, I'm glad to see it succeeded! I'm working on a similar project: getting an A2C model to learn to play Yamb (a variation of Yahtzee played with 6 dice, of which you keep 5, and with different categories). Since the problems are pretty similar, I was wondering if you could find a couple of minutes to answer some of my questions about how you got the agent to learn.
Sorry for opening an issue, I just had no idea how to message you other than this :)
Regards,
Aleksandar