GridWorld is a common MDP (Markov Decision Process) used in teaching AI and reinforcement learning. This package provides a Gym environment that you can import and use to implement basic algorithms. Both the states and the actions are discrete.
git clone https://github.com/k--chow/gym_gridworld.git
cd gym_gridworld
pip install -e .
To use in code:
import gym
import gym_gridworld
env = gym.make('gridworld-v0')
action = 0 # move north
action = 1 # move east
action = 2 # move south
action = 3 # move west
This is a 3 x 4 grid.
Gridworld has a rock, which is an invalid state, and two exit (game-ending) states, red and green, which return rewards of -1 and +1 respectively. Transitions in this MDP are stochastic: an intended action does not always produce the intended move. If we choose to go north (action 0), there is a 0.8 probability that we go north and a 0.1 probability that we go in each orthogonal direction (0.1 east, 0.1 west).
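The noise model described above can be sketched in a few lines of plain Python. This is only an illustration of the 0.8 / 0.1 / 0.1 split, not the environment's own implementation; the helper name `sample_actual_action` is hypothetical, and the sketch assumes the action indices listed earlier (0 = north, 1 = east, 2 = south, 3 = west).

```python
import random

# Action indices as listed above: 0 = north, 1 = east, 2 = south, 3 = west.
NORTH, EAST, SOUTH, WEST = 0, 1, 2, 3


def sample_actual_action(intended, rng=random):
    """Return the direction actually taken for an intended action.

    Hypothetical helper: keeps the intended direction with probability 0.8
    and slips to each of the two orthogonal directions with probability 0.1.
    """
    left = (intended - 1) % 4   # e.g. north -> west
    right = (intended + 1) % 4  # e.g. north -> east
    return rng.choices([intended, left, right], weights=[0.8, 0.1, 0.1])[0]


# Empirical check: intending north should never result in moving south,
# and the observed frequencies should be close to 0.8 / 0.1 / 0.1.
rng = random.Random(0)
counts = {a: 0 for a in range(4)}
for _ in range(10_000):
    counts[sample_actual_action(NORTH, rng)] += 1
print(counts)
```

Because the two slip directions are computed modulo 4, the same helper works for any intended action, not just north.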
- Policy Evaluation
- TD Learning
- Monte Carlo
- Value Iteration
- Policy Iteration
- Q-learning
- Proximal Policy Optimization (PPO)
- Policy Gradient
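As one illustration of the algorithms listed above, here is a minimal value iteration sketch that runs without the environment installed. Several details are assumptions rather than facts about this package: the rock at row 1, column 1; the green exit in the top-right corner with the red exit directly below it; a discount factor of 0.9; a reward that is paid on entering an exit; and a move into a wall or the rock leaving the agent in place. All function names are hypothetical.

```python
# Assumed layout (not specified by the package): row 0 is the top row.
ROWS, COLS = 3, 4
ROCK = (1, 1)
EXITS = {(0, 3): 1.0, (1, 3): -1.0}  # assumed: green (+1) above red (-1)
GAMMA = 0.9                          # assumed discount factor
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}  # north, east, south, west


def move(state, direction):
    """Deterministic move; hitting a wall or the rock leaves the state unchanged."""
    r = state[0] + MOVES[direction][0]
    c = state[1] + MOVES[direction][1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or (r, c) == ROCK:
        return state
    return (r, c)


def transitions(state, action):
    """Yield (probability, next_state): 0.8 intended, 0.1 per orthogonal direction."""
    for prob, direction in ((0.8, action),
                            (0.1, (action - 1) % 4),
                            (0.1, (action + 1) % 4)):
        yield prob, move(state, direction)


def q_value(values, state, action):
    """Expected return of `action` in `state`; exits pay their reward on entry."""
    total = 0.0
    for prob, nxt in transitions(state, action):
        if nxt in EXITS:
            total += prob * EXITS[nxt]
        else:
            total += prob * GAMMA * values[nxt]
    return total


def value_iteration(tol=1e-6):
    """Repeat Bellman optimality backups until the largest change is below `tol`."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) != ROCK and (r, c) not in EXITS]
    values = {s: 0.0 for s in states}
    while True:
        new_values = {s: max(q_value(values, s, a) for a in MOVES) for s in states}
        delta = max(abs(new_values[s] - values[s]) for s in states)
        values = new_values
        if delta < tol:
            return values


values = value_iteration()
# Greedy policy with respect to the converged values.
policy = {s: max(MOVES, key=lambda a: q_value(values, s, a)) for s in values}
print(policy[(0, 2)])  # action chosen in the cell just left of the green exit
```

Under these assumptions, values increase along the top row toward the green exit. With the real environment, the same backups would need the transition model and layout that the package actually uses.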
[x] Add visual rendering