Mononofu/reinforcement-learning

Reinforcement Learning: An Introduction

Implementing exercises for Reinforcement Learning: An Introduction.

Chapter 2 - Bandit Problems

Figure: greedy, ε = 0.1, and ε = 0.01 epsilon-greedy agents

nbandit.py implements greedy and epsilon-greedy agents for the n-armed bandit problem. For an explanation of how they work, read the book ;)
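For readers without the book at hand, here is a minimal sketch of the idea. It is not the code from nbandit.py: the class and function names are made up for illustration, and it assumes sample-average value estimates and Gaussian rewards in the style of the book's 10-armed testbed.

```python
import random


class EpsilonGreedyAgent:
    """Hypothetical sample-average agent; epsilon=0 makes it purely greedy."""

    def __init__(self, n_arms, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng
        self.estimates = [0.0] * n_arms  # running mean reward per arm
        self.counts = [0] * n_arms       # how often each arm was pulled

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.estimates))
        best = max(self.estimates)
        # Break ties randomly so the first arm is not favoured.
        return self.rng.choice(
            [arm for arm, q in enumerate(self.estimates) if q == best]
        )

    def update(self, arm, reward):
        # Incremental mean: Q <- Q + (R - Q) / N
        self.counts[arm] += 1
        self.estimates[arm] += (reward - self.estimates[arm]) / self.counts[arm]


def run_bandit(epsilon, n_arms=10, steps=1000, seed=0):
    """Play one bandit task; return the average reward and the agent."""
    rng = random.Random(seed)
    # Assumption: true arm values and rewards are unit-variance Gaussians.
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    agent = EpsilonGreedyAgent(n_arms, epsilon, rng)
    total = 0.0
    for _ in range(steps):
        arm = agent.choose()
        reward = rng.gauss(true_values[arm], 1.0)
        agent.update(arm, reward)
        total += reward
    return total / steps, agent
```

Calling `run_bandit(0.0)`, `run_bandit(0.1)` and `run_bandit(0.01)` gives one run each for the three agents compared above; averaging over many seeds would be needed for smooth curves.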

Playing Catch

As a more interesting test, I next tried my hand at a very simple game: Catch.

A ball starts at a random position at the top of a 5x5 playing field and moves down one row each round. The player controls a bat to catch the ball with; the bat can move left, move right, or stand still. Catching the ball gives a reward of +1, missing it gives -1.
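The rules above can be sketched as a small environment. This is an illustration only, not the repository's code: the `Catch` class and its `reset`/`step` interface are hypothetical, and it assumes a one-cell-wide bat on the bottom row that starts in the middle column, stops at the walls, and earns a reward of 0 on every step before the ball lands.

```python
import random


class Catch:
    """Hypothetical 5x5 Catch environment following the rules described above."""

    ACTIONS = (-1, 0, 1)  # index 0: move left, 1: stand still, 2: move right

    def __init__(self, size=5, rng=None):
        self.size = size
        self.rng = rng if rng is not None else random.Random()

    def reset(self):
        self.ball_row = 0
        self.ball_col = self.rng.randrange(self.size)  # random start column
        self.bat_col = self.size // 2  # assumption: bat starts in the middle
        return self.state()

    def state(self):
        return (self.ball_row, self.ball_col, self.bat_col)

    def step(self, action):
        # Move the bat, keeping it inside the playing field.
        move = self.ACTIONS[action]
        self.bat_col = min(self.size - 1, max(0, self.bat_col + move))
        # The ball drops one row per round.
        self.ball_row += 1
        done = self.ball_row == self.size - 1
        reward = 0
        if done:
            reward = 1 if self.ball_col == self.bat_col else -1
        return self.state(), reward, done
```

Under these assumptions an episode lasts four steps, and the ball is always reachable because the bat starts at most two columns away from it.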

A naive table-based agent learns to play perfectly after ~500 episodes; the neural-network-based ones (with 1 and 2 hidden layers) take quite a bit longer, about 3000 episodes:

Figure: table-based Q-learning, 1- and 2-hidden-layer neural networks, and a random agent
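To give an idea of what a table-based agent looks like, here is a self-contained sketch of tabular Q-learning on a compact version of Catch. It is not the repository's implementation: all function names are hypothetical, the hyperparameters (`alpha`, `gamma`, `epsilon`) are guesses rather than the values behind the plot, and the game details (bat starts in the middle, reward 0 until the ball lands) are assumptions.

```python
import random
from collections import defaultdict

SIZE = 5
MOVES = (-1, 0, 1)  # move left, stand still, move right


def reset(rng):
    # State is (ball_row, ball_col, bat_col); the bat starts in the middle.
    return (0, rng.randrange(SIZE), SIZE // 2)


def step(state, action):
    ball_row, ball_col, bat_col = state
    bat_col = min(SIZE - 1, max(0, bat_col + MOVES[action]))
    ball_row += 1
    done = ball_row == SIZE - 1
    reward = (1 if ball_col == bat_col else -1) if done else 0
    return (ball_row, ball_col, bat_col), reward, done


def greedy_action(q, state, rng):
    # Pick the highest-valued action, breaking ties randomly.
    values = q[state]
    best = max(values)
    return rng.choice([a for a, v in enumerate(values) if v == best])


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(MOVES))  # the Q-table
    for _ in range(episodes):
        state, done = reset(rng), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(MOVES))  # explore
            else:
                action = greedy_action(q, state, rng)  # exploit
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def evaluate(q, episodes=200, seed=1):
    """Fraction of balls caught when acting greedily on the learned table."""
    rng = random.Random(seed)
    caught = 0
    for _ in range(episodes):
        state, done = reset(rng), False
        while not done:
            state, reward, done = step(state, greedy_action(q, state, rng))
        caught += reward == 1
    return caught / episodes
```

`evaluate(train())` reports the catch rate of the greedy policy after training. With only 100 non-terminal states, the table stays tiny, which is likely why this approach needs far fewer episodes than a function approximator.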
