OhHellStrategyEvolution is an exploration into using Genetic Programming with an AST-like paradigm as a hyper-heuristic to evolve strategies for the card game Oh Hell, evaluating their fitness by playing tournaments against a selection of (varyingly) naive (fixed) algorithmic strategies using an included multi-player game engine.
Well you did ask.
- The game engine can play (and log in a human-friendly way) entire computer-only games or tournaments (multiple games).
- Evolution is orchestrated using the excellent Watchmaker [Java] framework, with a few additions and hacks to support this usage.
- Crossover breeding and mutation of the strategies is supported
- Designed throughout for use of Java 8's functional and stream features.
- Most of the many, many rule and scoring variants are pluggable, though for now only one typical setup is used (related aside: see an earlier product of this real-world problem, HellOh Android app)
- Something that can run Java 8 and Swing
- The build uses Maven 3, and some of the more common plugins
- JUnit, AssertJ and Mockito are used for testing, and currently Travis.ci runs these per commit.
The overall steps the app goes through
- Set up the engine and UI
- For each generation:
- Randomly generate bidding evaluation trees of configurable depths (more on this later)
- Create a
GeneticStrategybased on this bidding evaluator - Generate
n(bot) players based on these, and a playing strategy (currently: random (!)) - Split the
nplayers up intontournaments, addingm"outsiders" (bots with algorithmic strategies) - Start the Fitness Evaluators, using a threadpool to maximise throughput on multi-core machines to run tournaments (a series of games with the same players). For each of the
ntournaments:- Run a configurable (but small) number of games. For each game:
- Play the game through its sequences of hands (typically 13-19 rounds):
- For each hand (of size
h), bidding is done by evalutaing each possible bid using the bid evaluator - Play the hand, consisting of
htricks (more on this later)
- For each hand (of size
- Keep track of scores by using the scorer on tricks scored vs tricks bid
- Play the game through its sequences of hands (typically 13-19 rounds):
- Then, sum the scores across games for each player and normalising slightly (this seems to work slightly better than averaged rankings)
- Run a configurable (but small) number of games. For each game:
- Collate the separate tournaments to overall results for the candidates (game-playing strategies)
- Repeatedly select candidates for generating new operations (currently this uses Sigma-Scaling selection), and generate using the
EvolutionaryOperatorpipeline. This consists of a heady mix of Mutation, Crossover (sexual reproduction, of sorts), Simplification (of the node trees) and so on. - The framework assesses if a termination condition has been met, and finishes if so.
- If not, update engine / UI, move to next generation
As of April 2015: yes, a bit. The problem domain is (deliberately) unusual and therefore possibly ill-suited to GP in general; in addition the usual GP (and more generally EA) concerns exist - balancing crossover, mutation, simplification, elitism, population size, initial tree complexity, symbol sets and so on to produce enough but not too much diversity to get at least some reasonable solutions. It rapidly becomes clear there are no silver bullets.
There's now a basic Swing UI, also using the Watchmaker base classes.
A lot! See the issues, really, but here are some general ideas so far:
- A separate, but related playing strategy evaluator, which co-evolves.
- Some kind of persistence, to start and stop, and eventually use these
- Lots of bug-fixes, and performance enhancements (though this has got a lot better recently)
- Lots of tweaks to the (many) EA parameters and symbol sets that can make the difference between success and failure.
- Enhancements to the UI, possibly even to the level of being able to freeze evolution and see individual strategies playing their games.