Replication code for "Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning" by Kallus and Uehara

Taxi Environment

Taxi is a 2D grid world that simulates a taxi moving across a grid. At each time step the taxi moves North, East, South, or West, or attempts to pick up or drop off a passenger. It receives a reward of 20 when it successfully picks up a passenger or drops her off at the right place, and a reward of -1 at each time step otherwise. The original Taxi environment terminates once the taxi has picked up a passenger and dropped her off at the right place. We modify the environment to make it infinite horizon by allowing passengers to randomly appear and disappear at every corner of the map at each time step. We use a grid size of 5 * 5, which yields 2000 states in total (25 * 2^4 * 5, corresponding to 25 taxi locations, 2^4 passenger appearance statuses, and 5 taxi statuses: empty, or carrying a passenger to one of 4 destinations).
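The state count above can be checked with a small sketch. This is only an illustration of the 25 * 2^4 * 5 factorization, not the encoding used in the replication code: the names `encode` and `decode`, the ordering of the three factors, and the bit layout of the passenger flags are all assumptions made for this example.

```python
GRID = 5                    # 5 * 5 grid
N_LOC = GRID * GRID         # 25 taxi locations
N_CORNERS = 4               # a passenger may be waiting at each of the 4 corners
N_PASS = 2 ** N_CORNERS     # 2^4 = 16 passenger appearance statuses
N_TAXI = 5                  # empty, or carrying a passenger to one of 4 destinations

N_STATES = N_LOC * N_PASS * N_TAXI


def encode(row, col, waiting, taxi_status):
    """Map (taxi position, per-corner waiting flags, taxi status) to one index.

    Hypothetical layout chosen for illustration only.
    """
    loc = row * GRID + col
    pass_bits = sum(bit << i for i, bit in enumerate(waiting))
    return (loc * N_PASS + pass_bits) * N_TAXI + taxi_status


def decode(index):
    """Invert encode(): recover the state components from an integer index."""
    rest, taxi_status = divmod(index, N_TAXI)
    loc, pass_bits = divmod(rest, N_PASS)
    row, col = divmod(loc, GRID)
    waiting = tuple((pass_bits >> i) & 1 for i in range(N_CORNERS))
    return row, col, waiting, taxi_status


if __name__ == "__main__":
    print(N_STATES)
    print(encode(0, 0, (0, 0, 0, 0), 0), encode(4, 4, (1, 1, 1, 1), 4))
```

Under this assumed layout, every index from 0 to `N_STATES - 1` corresponds to exactly one combination of taxi location, passenger appearance status, and taxi status.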


Example run

$ python --nt 200 --ts 400 --gm 1.0

Here nt denotes the number of iterations, ts denotes the truncated trajectory size, and gm denotes the discount factor gamma (gamma = 1 corresponds to the long-run average case).
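As a rough sketch of how the three flags might be consumed, the snippet below parses `--nt`, `--ts`, and `--gm` with `argparse` and shows one common way to treat gamma = 1 as an average-reward criterion. The function names `build_parser` and `trajectory_value`, the default values, and the normalization used for gamma < 1 are assumptions for illustration; the repository's own script may parse and use these arguments differently.

```python
import argparse


def build_parser():
    """Parser for the three flags shown in the example run (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Taxi off-policy evaluation (sketch)")
    parser.add_argument("--nt", type=int, default=200, help="number of iterations")
    parser.add_argument("--ts", type=int, default=400, help="truncated trajectory size")
    parser.add_argument("--gm", type=float, default=1.0,
                        help="discount factor gamma; 1.0 means long-run average case")
    return parser


def trajectory_value(rewards, gamma):
    """Value of one truncated trajectory.

    Assumed convention: a plain average of rewards when gamma == 1, and a
    normalized discounted sum (weights gamma^t divided by their total) otherwise.
    """
    if gamma == 1.0:
        return sum(rewards) / len(rewards)
    weights = [gamma ** t for t in range(len(rewards))]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Toy reward sequence: one successful pickup followed by ordinary steps.
    toy_rewards = [20.0] + [-1.0] * (args.ts - 1)
    print(args.nt, args.ts, args.gm, trajectory_value(toy_rewards, args.gm))
```

Normalizing the discounted sum keeps the gamma < 1 and gamma = 1 values on the same per-step scale, which is one reason this convention is often used when comparing the two settings.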

