Teaching a machine to play tic-tac-toe


Tic Tac Toe

A tale about trying to train a machine to play Tic Tac Toe through Reinforcement Learning

To run the Jupyter notebooks in Binder, press the Binder badge.

The goal of this series is to implement and test several different approaches to training a computer to play Tic Tac Toe. We will create:

  • A player that plays completely randomly.
  • Two players that implement simple forms of the Min-Max algorithm.
  • Several players that we will train through Reinforcement Learning:
    • A Tabular Q-Learning player.
    • A Simple Neural Network Q-Learning player.
    • A Deep Neural Network Q-Learning player.
    • A Policy Gradient Descent based player.
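
To give a feel for the first two kinds of player in the list above, here is a minimal, self-contained sketch of a random player and a Min-Max player. It is not the code from the notebooks: the board encoding (a tuple of 9 cells) and all names (`random_move`, `minimax_move`, `play_game`, and so on) are assumptions made for this illustration only.

```python
import random
from functools import lru_cache

# Hypothetical encoding for this sketch: a board is a tuple of 9 cells.
EMPTY, X, O = 0, 1, -1
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    """Return X or O if that side has three in a row, otherwise EMPTY."""
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return EMPTY


def legal_moves(board):
    """Indices of all empty squares."""
    return [i for i, cell in enumerate(board) if cell == EMPTY]


def play(board, move, side):
    """Return a new board with `side` placed on square `move`."""
    return board[:move] + (side,) + board[move + 1:]


def random_move(board, side):
    """Random player: pick any empty square with equal probability."""
    return random.choice(legal_moves(board))


@lru_cache(maxsize=None)
def minimax_value(board, side):
    """Value of `board` from X's point of view (+1 X wins, 0 draw, -1 O wins),
    assuming `side` moves next and both sides play perfectly."""
    w = winner(board)
    if w != EMPTY:
        return w
    moves = legal_moves(board)
    if not moves:
        return 0  # board full, nobody won: a draw
    values = [minimax_value(play(board, m, side), -side) for m in moves]
    return max(values) if side == X else min(values)


def minimax_move(board, side):
    """Min-Max player: X maximises the game value, O minimises it."""
    best = max if side == X else min
    return best(legal_moves(board),
                key=lambda m: minimax_value(play(board, m, side), -side))


def play_game(x_player, o_player):
    """Play one game between two move functions; return X, O, or EMPTY (draw)."""
    board, side = (EMPTY,) * 9, X
    while winner(board) == EMPTY and legal_moves(board):
        mover = x_player if side == X else o_player
        board = play(board, mover(board, side), side)
        side = -side
    return winner(board)
```

For example, `play_game(random_move, minimax_move)` plays a random X against a Min-Max O. Because Min-Max searches the whole game tree, the Min-Max side should never lose, which makes it a useful benchmark opponent for players that learn.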

I assume you are familiar with:

  • The rules and basic strategy of playing Tic Tac Toe.
  • Basic Python 3 programming and use of a Python IDE or Jupyter Notebooks.
  • TensorFlow and Neural Networks: at least a rudimentary knowledge is helpful, but you may be able to manage without it. Give it a try, and if it is too overwhelming, work through some beginner tutorials and then try again.