flamme rouge game initial version #5
Conversation
This looks great! Can you add the name of the game in the README under Getting Started and I'll be happy to merge it in - many thanks!
Sure! Done.
Hey, just had a look at your Flamme Rouge code @zorgluf - nice work :) Could you share the training parameters you used on the command line? I'll try running on my multi-core server and see how well we can train the agent.
@zorgluf - The main thing I would recommend is to change the shape of the observation, so that additional information is added as channels rather than along the width or height dimension of the input. For example, the board shape is (120, 3, 7) = (length, lanes, tile types). At the moment, extra information is being added along the length dimension. This isn't ideal, as the convolutional layer won't be able to easily synthesise information from the early cells of the board (e.g. (0,...), (1,...)) with the extra information, which at the moment is literally stored at the other end of the board ((120,...), (121,...) etc.). Instead, this extra info needs to be distributed across the entire board as extra channels (exactly like you've done for the position of the cyclists). Also, note that it should be smeared across the board (i.e. repeated) rather than padded with zeros. In total it looks like you're creating 84 extra channels for a 2-player game (12 × 2 played cards, 12 discarded cards, 12 hand cards, 36 actions), so the final observation object should have shape (120, 3, 95). Hope that makes sense - I'm seriously impressed with the implementation. I think with a few tweaks to the design it will be able to reach superhuman performance. Can't wait to see it in action!
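For illustration, the "smear across the board as extra channels" idea could be sketched in NumPy like this (the channel counts follow the numbers in the comment above; the variable names and the use of a random vector are just placeholders, not the actual SIMPLE code):

```python
import numpy as np

BOARD_LEN, LANES, TILE_TYPES = 120, 3, 7
N_EXTRA = 84  # per the discussion: 24 played + 12 discarded + 12 hand + 36 actions

# Board tiles as one-hot channels over the 7 tile types -> shape (120, 3, 7).
board = np.zeros((BOARD_LEN, LANES, TILE_TYPES), dtype=np.float32)

# Extra, non-spatial info (cards, actions) as a flat vector of length 84.
extra = np.random.rand(N_EXTRA).astype(np.float32)

# Smear (repeat) it across the board: every (length, lane) cell
# sees the same 84 values, instead of padding one end with zeros.
smeared = np.broadcast_to(extra, (BOARD_LEN, LANES, N_EXTRA))

# Stack everything along the channel dimension.
obs = np.concatenate([board, smeared], axis=-1)
print(obs.shape)  # (120, 3, 91)
```

With the cyclist-position channels added in the same way, the final observation reaches the (120, 3, 95) shape mentioned above.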
Thanks for the comments and advice, I'll have a look. By the way, I've already started improving the model on my fork https://github.com/zorgluf/SIMPLE/tree/frouge, so that learning is done separately: conv2D on the board channels and dense layers on the card channels (without the unnecessary empty dimensions). It improves the learning time significantly. Before opening a new pull request, I have to review the experiment - it was a few weeks ago, and I have the memory of a goldfish...
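The two-branch design described here could be sketched roughly as follows (a minimal PyTorch sketch for illustration only - SIMPLE itself is built on stable-baselines, and every layer size, kernel shape, and name here is an assumption, not the actual fork's code):

```python
import torch
import torch.nn as nn

class TwoBranchNet(nn.Module):
    """Hypothetical two-branch feature extractor: Conv2d over the
    spatial board, Dense over the flat card vector, then concatenate."""

    def __init__(self, board_channels=7, n_cards=84, hidden=64):
        super().__init__()
        self.board_branch = nn.Sequential(
            nn.Conv2d(board_channels, 32, kernel_size=(5, 3), padding=(2, 1)),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse (length, lanes) to (1, 1)
            nn.Flatten(),             # -> (batch, 32)
        )
        self.card_branch = nn.Sequential(
            nn.Linear(n_cards, hidden),
            nn.ReLU(),
        )
        self.head = nn.Linear(32 + hidden, hidden)

    def forward(self, board, cards):
        # board: (batch, channels, length, lanes); cards: (batch, n_cards)
        z = torch.cat([self.board_branch(board), self.card_branch(cards)], dim=1)
        return torch.relu(self.head(z))

net = TwoBranchNet()
out = net(torch.zeros(2, 7, 120, 3), torch.zeros(2, 84))
print(out.shape)  # torch.Size([2, 64])
```

Keeping the flat card information out of the convolutional input avoids the empty spatial dimensions mentioned above, which is plausibly where the training-time saving comes from.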
Ok great - I've also just committed the changes described above as a new branch (for testing I've simplified to the basic board layout and a 2-player game): https://github.com/davidADSP/SIMPLE/tree/frouge_df. I just wondered whether you changed the default arguments of the train.py command-line function, e.g. batch_size, the number of timesteps per training iteration, etc. They probably don't need changing - it's always a bit of trial and error with these things. I'm training now using my updated branch and seeing promising results - will leave it overnight and see what happens.
Implementation of the "Flamme Rouge" game (https://www.ultraboardgames.com/flamme-rouge/game-rules.php), with the "Peloton" expansion, for 5 players. Proposed for merging into the source project, if it is relevant for you.
The best_model needs much more training (only 2 hours on one CPU so far). The model can be improved a lot - I am still learning deep learning...
@davidADSP: great, thanks for sharing this wonderful project!