Thompson Sampling for Gridworld

Final project for Reinforcement Learning course

Authors: Lorenzo Basile, Irene Brugnara

Source files:

gridworld.py contains the implementation of the class Gridworld, defining a two-dimensional grid environment in which an agent moves from an initial cell to a target cell. The position of the target cell is not known, but at each time step the agent receives a random binary signal from the target depending on its distance from the target. The method gridworld_search implements a search algorithm based on Thompson sampling (or a greedy algorithm if greedy=True);
animation.py contains an example animated run of the search algorithm;
benchmark.py and analysis.py respectively contain code to collect and process data on larger-scale runs of both Thompson algorithm and greedy algorithm (the former produces a pickle file which is to be read by the latter).

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.gitignore		.gitignore
README.md		README.md
analysis.py		analysis.py
animation.py		animation.py
benchmark.py		benchmark.py
gridworld.py		gridworld.py
slides.pdf		slides.pdf

Provide feedback