Reinforcement learning in python -- battleship

The included notebook contains a policy-gradient trained, deep RL algorithm that learns to play the game Battleship in 1-d. That is, the algorithm trains a network to suggest good next moves, allowing it to find a hidden ship quickly. To run the notebook you should have the following packages installed: tensorflow, jupyter notebook, numpy, and matplotlib.