GridWorld Gym Environment


GridWorld is a common MDP (Markov Decision Process) used in teaching AI and Reinforcement Learning. This package provides it as a Gym environment that you can import and use to implement basic algorithms. Both the states and the actions are discrete.

git clone https://github.com/k--chow/gym_gridworld.git
cd gym_gridworld
pip install -e .

To use in code:

import gym
import gym_gridworld

env = gym.make('gridworld-v0')
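
Once the environment is created, a typical first step is to drive it with random actions. The following is a minimal sketch, not part of this package: `run_random_episode` is a hypothetical helper, and it assumes the environment follows the usual Gym `reset()` / `step(action)` interface, accepting either the older 4-tuple or the newer 5-tuple return value from `step`.

```python
import random


def run_random_episode(env, n_actions=4, max_steps=100, seed=None):
    """Run one episode with uniformly random actions and return the total reward.

    Hypothetical helper: `env` is assumed to expose Gym-style reset() and step().
    """
    rng = random.Random(seed)
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = rng.randrange(n_actions)  # 0..3, matching the four moves
        result = env.step(action)
        if len(result) == 5:
            # Newer Gym API: (obs, reward, terminated, truncated, info)
            _, reward, terminated, truncated, _ = result
            done = terminated or truncated
        else:
            # Older Gym API: (obs, reward, done, info)
            _, reward, done, _ = result
        total_reward += reward
        if done:
            break
    return total_reward
```

With the package installed, the intended call would be `run_random_episode(gym.make('gridworld-v0'))`.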

Actions

action = 0 # move north
action = 1 # move east
action = 2 # move south
action = 3 # move west
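
To avoid bare integers in agent code, the four actions can be given names. This is an illustrative sketch only: the `Action` enum and the `DELTAS` table are not provided by the package, and the (row, column) convention with row 0 at the top is an assumption.

```python
from enum import IntEnum


class Action(IntEnum):
    """Named aliases for the integer actions listed above (hypothetical helper)."""
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3


# Assumed (row, col) offsets, with row 0 at the top of the grid.
DELTAS = {
    Action.NORTH: (-1, 0),
    Action.EAST: (0, 1),
    Action.SOUTH: (1, 0),
    Action.WEST: (0, -1),
}
```

Because `Action` is an `IntEnum`, its members compare equal to plain integers, so a call such as `env.step(Action.NORTH)` should behave the same as `env.step(0)` in environments that expect an integer.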

Observation Space

This is a 3 x 4 grid.

Gridworld has a rock, which is an invalid state, and two exit (game-ending) states, red and green, which return rewards of -1 and +1 respectively. Transitions in this MDP are stochastic: an intended action does not always succeed. If we choose to go north (action 0), there is a 0.8 probability that we go north and a 0.1 probability that we go in each perpendicular direction (0.1 east, 0.1 west).
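
The noise model described above can be written down in a few lines. This is a sketch for intuition rather than the environment's actual code; `action_distribution` and `sample_actual_action` are hypothetical names, and the perpendicular directions are taken to be the neighbours of the intended action in the 0 to 3 numbering.

```python
import random


def action_distribution(intended):
    """Map each action that may actually be executed to its probability."""
    return {
        intended: 0.8,
        (intended + 1) % 4: 0.1,  # one perpendicular direction
        (intended - 1) % 4: 0.1,  # the other perpendicular direction
    }


def sample_actual_action(intended, rng=random):
    """Draw the action that is really executed for a given intended action."""
    dist = action_distribution(intended)
    actions = list(dist)
    weights = [dist[a] for a in actions]
    return rng.choices(actions, weights=weights)[0]
```

For north (action 0) this gives 0.8 for north, 0.1 for east (action 1), and 0.1 for west (action 3), matching the example in the text.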

Challenge: Algorithms to implement

  • Policy Evaluation
  • TD Learning
  • Monte Carlo
  • Value Iteration
  • Policy Iteration
  • Q-learning
  • Proximal Policy Optimization (PPO)
  • Policy Gradient
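
As a starting point for the challenge, here is a rough sketch of value iteration on a standalone model of the grid, so it runs without the environment. Several details are assumptions that this README does not specify: the rock at (1, 1), the green exit at (0, 3), the red exit at (1, 3), a living reward of 0, a discount of 0.9, and moves into a wall or the rock leaving the agent in place. All function names are hypothetical.

```python
ROWS, COLS = 3, 4
ROCK = (1, 1)                           # assumed position
EXITS = {(0, 3): 1.0, (1, 3): -1.0}     # assumed positions; rewards as stated above
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}  # north, east, south, west


def move(state, action):
    """Deterministic move; bumping into a wall or the rock leaves the state unchanged."""
    row, col = state
    d_row, d_col = MOVES[action]
    nxt = (row + d_row, col + d_col)
    in_bounds = 0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS
    return nxt if in_bounds and nxt != ROCK else state


def outcomes(state, action):
    """List (probability, next_state, reward) under the 0.8 / 0.1 / 0.1 noise."""
    triples = []
    for prob, actual in ((0.8, action), (0.1, (action + 1) % 4), (0.1, (action - 1) % 4)):
        nxt = move(state, actual)
        triples.append((prob, nxt, EXITS.get(nxt, 0.0)))  # reward on entering an exit
    return triples


def q_value(values, state, action, gamma):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(p * (r + gamma * values[n]) for p, n, r in outcomes(state, action))


def value_iteration(gamma=0.9, tol=1e-8):
    """Return (state values, greedy policy) for the assumed grid model."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) != ROCK and (r, c) not in EXITS]
    values = {s: 0.0 for s in states}
    values.update({s: 0.0 for s in EXITS})  # exits are terminal: no future value
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(values, s, a, gamma) for a in MOVES)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(MOVES, key=lambda a: q_value(values, s, a, gamma)) for s in states}
    return values, policy


values, policy = value_iteration()
```

Under these assumptions the greedy policy in the cell just left of the green exit is to move east. The same `outcomes` model could be reused for policy evaluation and policy iteration, while the sample-based methods in the list (TD learning, Monte Carlo, Q-learning) would interact with the environment through `step` instead.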

TODO

[x] Add visual rendering