# SynchronousGoExplore

A first bare bones paralleled implementation of Go Explore as described by the Uber Engineering blog post

Currently no deep learning is incorporated in the project. The available exploration policies are random and Markov chain.
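The Markov-chain policy lives in `markov.py`; its actual interface isn't reproduced here, so the following is only a minimal sketch of what a first-order Markov chain over discrete actions could look like (the class and method names are hypothetical):

```python
import random
from collections import defaultdict

class MarkovChainPolicy:
    """First-order Markov chain over actions: the next action is sampled
    in proportion to how often it has followed the previous action."""

    def __init__(self, n_actions, seed=None):
        self.n_actions = n_actions
        # Laplace-style prior of 1 so every transition stays possible.
        self.counts = defaultdict(lambda: [1] * n_actions)
        self.prev = None
        self.rng = random.Random(seed)

    def act(self):
        if self.prev is None:
            # No history yet: pick the first action uniformly.
            action = self.rng.randrange(self.n_actions)
        else:
            action = self.rng.choices(range(self.n_actions),
                                      weights=self.counts[self.prev])[0]
        self.prev = action
        return action

    def update(self, prev_action, next_action):
        # Reinforce transitions observed in promising trajectories.
        self.counts[prev_action][next_action] += 1
```

A policy like this stays cheap enough to run inside every exploration worker, which is the point of using it before any deep learning is added.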

The notebook syncGoExplore.ipynb demonstrates the use of Go Explore to create a speedrun of a level in a gym environment using multiple threads.
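To make the idea concrete, here is a serial, toy sketch of the core Go Explore loop that the notebook parallelises: keep an archive of the best trajectory reaching each discretised "cell", repeatedly return to an archived cell, and explore randomly from it. Everything below (the function names, the one-dimensional toy environment, the identity cell function) is illustrative and not taken from the repository:

```python
import random

def go_explore(step_fn, initial_state, cell_fn, n_iters=200, explore_len=10, seed=0):
    """Bare-bones Go Explore: archive maps cell -> (state, action list).
    Returning to a state is done here by storing it directly; with an
    emulator this would be a savestate restore instead."""
    rng = random.Random(seed)
    archive = {cell_fn(initial_state): (initial_state, [])}
    for _ in range(n_iters):
        # Select an archived cell to return to (uniformly, for simplicity).
        state, actions = rng.choice(list(archive.values()))
        for _ in range(explore_len):
            a = rng.randrange(2)              # random exploration policy
            state = step_fn(state, a)
            actions = actions + [a]
            cell = cell_fn(state)
            # Keep the shortest known trajectory to each cell.
            if cell not in archive or len(actions) < len(archive[cell][1]):
                archive[cell] = (state, actions)
    return archive

# Toy 1-D environment: action 1 moves right, action 0 moves left (floor at 0).
step = lambda s, a: max(0, s + (1 if a == 1 else -1))
archive = go_explore(step, 0, cell_fn=lambda s: s)
```

In the synchronous version, each iteration fans the return-and-explore step out to several workers (e.g. ray actors) and merges their archive updates before the next selection round.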

Dependencies:

- ray (Linux and macOS only)
- gym retro
- imageio (also needs freeimage)
- a ROM file for the game environment

Original reddit discussion with some more information: https://www.reddit.com/r/MachineLearning/comments/agf43s/d_go_explore_vs_sonic_the_hedgehog/

Original blog post by Uber: https://eng.uber.com/go-explore/

To do:

- Add smarter exploration policies (fast simple models and deep learning)
- Asynchronous Go Explore, i.e. allow workers to play constantly and update only when ready/necessary
- Add iterative deepening
- Add procedures for experiments to search for good hyperparameters
- Add the comb operation: sequentially go to each state encountered in a run that reaches the end of the level
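The comb operation in the last item could be sketched as a single replay pass: walk a winning run's action sequence, and re-archive every intermediate state so later exploration can branch from each of them in order. The sketch below assumes an archive shaped as cell -> (state, action list) and a deterministic step function; all names are hypothetical:

```python
def comb(actions, step_fn, initial_state, cell_fn, archive):
    """Replay a run that reached the end of the level, archiving each
    intermediate state if it improves on the stored trajectory."""
    state, prefix = initial_state, []
    for a in actions:
        state = step_fn(state, a)
        prefix = prefix + [a]
        cell = cell_fn(state)
        # Only overwrite when the replayed prefix is strictly shorter.
        if cell not in archive or len(prefix) < len(archive[cell][1]):
            archive[cell] = (state, prefix)
    return archive

# Toy 1-D environment: action 1 moves right, action 0 moves left (floor at 0).
step = lambda s, a: max(0, s + (1 if a == 1 else -1))
archive = comb([1, 1, 0, 1], step, 0, lambda s: s, {0: (0, [])})
```

Because the winning run visits cells in level order, a comb pass seeds the archive with return points spread along the whole route rather than only where random exploration happened to land.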

Some early gameplay:

A very polished run: