flamme rouge game initial version #5

zorgluf · 2021-05-10T21:01:28Z

Implementation of the "Flamme Rouge" game (https://www.ultraboardgames.com/flamme-rouge/game-rules.php), with the "Peloton" extension, for 5 players. Proposed for merge into the source project, if it is relevant for you.
The best_model needs much more training (only 2 hours on one CPU so far). The model can be greatly improved; I am still learning deep learning...

@davidADSP: many thanks for sharing this wonderful project!

davidADSP · 2021-05-11T20:19:59Z

This looks great! Can you add the name of the game to the README under Getting Started? Then I'll be happy to merge it in - many thanks!

zorgluf · 2021-05-12T13:04:54Z

Sure ! Done.

davidADSP · 2021-06-12T10:56:19Z

Hey, just had a look at your Flamme Rouge code @zorgluf - nice work :) Could you share the training parameters you used on the command line? I'll try running it on my multi-core server and see how well we can train the agent.

davidADSP · 2021-06-12T15:02:08Z

@zorgluf - The main thing I would recommend is to change the shape of the observation, so that additional information is added as channels, rather than along the width or height dimension of the input.

For example, the board shape is (120, 3, 7) = (length, lanes, tile types).
Let's say we're playing a 2 player game, so the observation shape gets expanded to (120, 3, 11) - (4 extra channels for rolleur and sprinteur location of each player) 👍

At the moment, extra information is being added along the length dimension:
e.g. played cards = (12, ) -> resized and padded with zeros to (12, 3, 11) -> appended along length dimension to obs = (132, 3, 11).
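As a rough illustration of the layout described above, here is a minimal NumPy sketch. The shapes are the ones quoted in this comment; the card values and the exact cell the vector is written to are assumptions made for the example, and the actual code in the PR may place the values differently.

```python
import numpy as np

LENGTH, LANES, CHANNELS = 120, 3, 11  # board shape quoted above (2-player case)

board = np.zeros((LENGTH, LANES, CHANNELS), dtype=np.float32)
played_cards = np.arange(12, dtype=np.float32)  # placeholder values, not real game data

# Zero-padded block of shape (12, 3, 11) holding the 12 card values.
# Writing them into lane 0, channel 0 is an assumption for illustration.
block = np.zeros((played_cards.size, LANES, CHANNELS), dtype=np.float32)
block[:, 0, 0] = played_cards

# Appended along the length axis, so the cards end up "after" the last board cell.
obs_padded = np.concatenate([board, block], axis=0)
print(obs_padded.shape)  # (132, 3, 11)
```

The card values occupy rows 120 to 131 only, so a small convolution kernel centred on row 0 never sees them.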

This isn't ideal, as the convolutional layer won't be able to easily combine information from the early cells of the board, e.g. (0, ...), (1, ...), with the extra information, which at the moment is literally stored at the other end of the board :) i.e. at (120, ...), (121, ...) etc.

Instead this extra info needs to be distributed across the entire board as extra channels (exactly like you've done for the position of the cyclists). Also, note that it should be smeared across the board (i.e. repeated) rather than padded with zeros.

So for example:
played cards = (12, ) -> repeated and reshaped to (120, 3, 12) -> appended along channel dimension to obs = (120, 3, 23).

In total it looks like you're creating 84 extra channels for a 2 player game (12 * 2 players played cards, 12 discarded cards, 12 hand card, 36 actions), so the final observations object should have shape (120, 3, 95).
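The channel arithmetic above can be checked with a short NumPy sketch. The helper name `smear` and the all-zero placeholder vectors are assumptions for the example, not code from either branch; only the shapes come from this comment.

```python
import numpy as np

LENGTH, LANES = 120, 3  # board length and lanes quoted above


def smear(vec, length=LENGTH, lanes=LANES):
    """Repeat a 1D vector at every board cell, giving shape (length, lanes, len(vec))."""
    vec = np.asarray(vec, dtype=np.float32)
    return np.broadcast_to(vec, (length, lanes, vec.size)).copy()


# 7 tile-type channels + 4 cyclist-position channels for a 2-player game.
board = np.zeros((LENGTH, LANES, 11), dtype=np.float32)

# Placeholder vectors with the sizes listed above.
played = [smear(np.zeros(12)) for _ in range(2)]  # 12 * 2 players
discarded = smear(np.zeros(12))
hand = smear(np.zeros(12))
actions = smear(np.zeros(36))

# Everything is stacked along the channel (last) axis.
obs = np.concatenate([board, *played, discarded, hand, actions], axis=-1)
print(obs.shape)  # (120, 3, 95)
```

Because the vector is repeated rather than zero-padded, every board cell carries the same card information, so a convolution kernel at any position can use it.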

Hope that makes sense - I'm seriously impressed with the implementation. I think with a few tweaks to the design it will be able to get to superhuman performance. Can't wait to see it in action!

zorgluf · 2021-06-12T16:24:46Z

Thanks for the comments and advice, I will have a look. By the way, I have already started improving the model on my fork https://github.com/zorgluf/SIMPLE/tree/frouge, so that learning is done separately with Conv2D on the board and dense layers on the card channels (without the unnecessary empty dimensions). It improves the learning time significantly. Before suggesting a new pull request, I have to review the experiment: even though it was only a few weeks ago, I have the memory of a goldfish...
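To give a feel for why splitting the inputs can be cheaper, here is a small NumPy sketch that compares input sizes only. It reuses the 2-player shapes quoted earlier in the thread (11 board channels, 84 card and action values); those sizes are an assumption here, since the fork targets a different player count, and no network layers are modelled.

```python
import numpy as np

LENGTH, LANES, BOARD_CHANNELS = 120, 3, 11  # 2-player board shape from the thread
CARD_VALUES = 84                            # card and action values from the thread

# Single-input layout: card values repeated at every board cell.
smeared_obs = np.zeros((LENGTH, LANES, BOARD_CHANNELS + CARD_VALUES), dtype=np.float32)

# Two-input layout: a 3D board for the convolutional branch
# and a flat vector for the dense branch.
board_input = np.zeros((LENGTH, LANES, BOARD_CHANNELS), dtype=np.float32)
card_input = np.zeros((CARD_VALUES,), dtype=np.float32)

print(smeared_obs.size)                    # 34200
print(board_input.size + card_input.size)  # 4044
```

The two-input layout feeds the network far fewer values per observation, which is consistent with the shorter learning time reported here, though the comparison says nothing about final playing strength.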
By "training parameters", what do you mean? I used your train script without any extra parameters.
As for the "superhuman" performance, there is still a gap: I frequently manage to win against the best model so far... I will follow your advice and see whether a new model does better.
Thanks !

davidADSP · 2021-06-12T16:55:51Z

Ok great - I've also just committed the changes described above as a new branch (for testing I've just simplified to the basic board layout and 2 player game).

https://github.com/davidADSP/SIMPLE/tree/frouge_df
See what you think. Your idea is also interesting - using 3D array input for the board and 1D array input for the cards - that would also work I'm sure.

I just wondered whether you changed the default arguments of the train.py command-line function, e.g. batch_size, number of timesteps per training iteration, etc. They probably don't need changing - it's always a bit of trial and error with these things.

I'm training now using my updated branch and seeing promising results - will leave it overnight and see what happens.

flamme rouge game v1

4124d63

Update readme

b82a7d0

davidADSP merged commit 2acfd25 into davidADSP:main May 18, 2021
