Puzzle RLE

Project description

Puzzle RLE (Reinforcement Learning Environment) is an environment for learning the puzzle gameplay of the mobile game Puzzle and Dragons (Gungho Online Entertainment, Tokyo, Japan). The environment is a re-implementation based on the orb-matching and clearing mechanics encountered during normal gameplay.

The environment supports the following:

  • pygame environment visualization engine
  • 5 Actions: select-orb, move left, up, right, down
  • Baseline random agent
  • OpenAI Baselines agents
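
As a rough illustration of the five actions and the random baseline, here is a minimal sketch. It is not the repository's code: the `ToyBoard` class, the action numbering, the 5x6 board with 6 colors, and the assumption that select-orb toggles between holding and releasing an orb are all hypothetical. The one game rule it relies on is that dragging a held orb swaps it with the orb it moves onto.

```python
import random

# Hypothetical action encoding; the real environment may number these differently.
SELECT, LEFT, UP, RIGHT, DOWN = range(5)
MOVES = {LEFT: (0, -1), UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0)}


class ToyBoard:
    """Toy stand-in for the puzzle board: a cursor that can pick up and drag an orb."""

    def __init__(self, rows=5, cols=6, n_colors=6, seed=None):
        self.rows, self.cols, self.n_colors = rows, cols, n_colors
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Fill the board with random orb colors and put the cursor in the top-left corner.
        self.grid = [[self.rng.randrange(self.n_colors) for _ in range(self.cols)]
                     for _ in range(self.rows)]
        self.cursor = (0, 0)
        self.holding = False
        return self.grid

    def step(self, action):
        if action == SELECT:
            # Assumption: select-orb toggles between holding and releasing.
            self.holding = not self.holding
            return self.grid
        dr, dc = MOVES[action]
        r, c = self.cursor
        nr, nc = r + dr, c + dc
        if 0 <= nr < self.rows and 0 <= nc < self.cols:
            if self.holding:
                # Dragging swaps the held orb with the orb it moves onto.
                self.grid[r][c], self.grid[nr][nc] = self.grid[nr][nc], self.grid[r][c]
            self.cursor = (nr, nc)
        # Moves that would leave the board are ignored.
        return self.grid


def random_agent(board, n_steps, seed=None):
    """Baseline: pick one of the five actions uniformly at random each step."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        board.step(rng.randrange(5))
    return board.grid
```

Because moves only ever swap orbs, a random rollout rearranges the board without changing how many orbs of each color it holds; matches are only resolved once the move ends.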

Project Milestones & Plans

  • Implement basic movement, clearing, skyfall and cascade mechanics (✔️ Sep 2, '17)
  • Implement a random agent and run experiments to generate baseline statistics (✔️ Sep 2, '17)
  • Implement Deep Q Network learning agent (✔️ Sep 9, '17)
  • Abandon the previous goal of implementing RL algorithms. Use OpenAI Baselines and stable-baselines instead! (✔️ Sep '18)
  • Update the environment to follow the OpenAI Gym style, with spaces, step(), etc. (✔️ Sep '18)
  • Make the environment renderable via pygame (✔️ Oct '18)
  • Train a successful agent (in progress)
  • Update agent to take rendered pygame pixels
    • Represent selected orb on-screen
    • Represent environment timer on-screen
    • Make the timer work in real time: reset the clock when an orb is selected and end the episode when the timer runs out.
    • Allow an "unselect" option meaning "end episode now", for when the agent estimates that it has already reached the maximum reward for this episode
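
The clearing, skyfall and cascade mechanics named in the first milestone can be sketched roughly as follows. This is an illustrative assumption rather than the environment's actual implementation: the function names are hypothetical, a match is taken to be any horizontal or vertical run of three or more orbs of the same color, and the sketch counts cleared orbs instead of grouping them into combos as the real game does.

```python
import random

EMPTY = None


def find_matches(grid):
    """Return the set of (row, col) cells in horizontal or vertical runs of 3+ equal orbs."""
    rows, cols = len(grid), len(grid[0])
    matched = set()
    for r in range(rows):
        for c in range(cols):
            color = grid[r][c]
            if color is EMPTY:
                continue
            if c + 2 < cols and grid[r][c + 1] == color and grid[r][c + 2] == color:
                matched.update({(r, c), (r, c + 1), (r, c + 2)})
            if r + 2 < rows and grid[r + 1][c] == color and grid[r + 2][c] == color:
                matched.update({(r, c), (r + 1, c), (r + 2, c)})
    return matched


def clear_and_drop(grid, matched, rng=None, n_colors=6):
    """Remove matched orbs, let the rest fall, and refill each column from the top.

    With rng=None the gaps stay EMPTY (no skyfall); otherwise new random orbs fall in.
    """
    rows, cols = len(grid), len(grid[0])
    for c in range(cols):
        survivors = [grid[r][c] for r in range(rows) if (r, c) not in matched]
        n_new = rows - len(survivors)
        new = [rng.randrange(n_colors) if rng else EMPTY for _ in range(n_new)]
        column = new + survivors
        for r in range(rows):
            grid[r][c] = column[r]


def resolve(grid, rng=None, n_colors=6):
    """Cascade: keep clearing and dropping until no matches remain; return orbs cleared."""
    total = 0
    while True:
        matched = find_matches(grid)
        if not matched:
            return total
        total += len(matched)
        clear_and_drop(grid, matched, rng, n_colors)
```

For example, on a 4x3 board whose second row is a single color sitting on top of two orbs that match the orb above that row in the same column, clearing the row lets the upper orb fall and complete a vertical run, so `resolve` clears two groups in one call. Passing `rng=random.Random(0)` turns on skyfall with new random orbs.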

Tested OpenAI baseline agents

Agent   Tested-running   Performance
DeepQ   no               bad
A2C     no               bad
HER     no               bad
PPO2    yes              bad

During a long hiatus from this project there have been some developments in relational reinforcement learning (arXiv). OpenAI's Baselines implementations have also been greatly improved and expanded.



Environment based on mobile game Puzzle & Dragons

The DQN algorithms were implemented with reference to the original papers:

Much credit goes to the series of blog posts and Jupyter notebooks by awjuliani on reinforcement learning:


MIT license? The one that allows free use with citation.

