
# Reinforcement Learning
### env: _taxi-v2_

In [1]:
import gym

## Loading and initializing an environment

In [3]:
env = gym.make('Taxi-v2')
env.reset()

[2017-10-17 12:34:41,080] Making new env: Taxi-v2


84

### Observation states

In [4]:
print('Total number of possible states = {:,}'.format(env.observation_space.n))

Total number of possible states = 500
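
The figure of 500 can be reproduced by counting combinations. The sketch below assumes the usual Taxi layout: a 5x5 grid for the taxi, five passenger locations (the four lettered stops plus "inside the taxi"), and four possible destinations. The variable names are illustrative and do not come from Gym.

```python
from itertools import product

# Assumed components of a Taxi state (names are illustrative).
taxi_rows = range(5)
taxi_cols = range(5)
passenger_locations = range(5)  # R, G, Y, B, or inside the taxi
destinations = range(4)         # R, G, Y, or B

# Every combination of the four components is one discrete state.
states = list(product(taxi_rows, taxi_cols, passenger_locations, destinations))
print('Total number of possible states = {:,}'.format(len(states)))
```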


### Visualizing the state
In the rendered grid, the yellow square is the taxi, each pipe character ("|") is a wall, the blue letter marks the pick-up location, and the purple letter marks the drop-off location. The taxi turns green once a passenger is aboard. Although we see colors and shapes, the algorithm does not perceive the environment the way we do: it receives only a flattened state, which in this case is a single integer.

In [5]:
env.render()

+---------+
|[35mR[0m: | : :[34;1m[43mG[0m[0m|
| : : : : |
| : : : : |
| | : | : |
|Y| : |B: |
+---------+
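
To connect the integer state to the picture, here is a minimal sketch that unpacks a state into its parts. It assumes the mixed-radix packing used by Gym's Taxi implementation (taxi row, taxi column, passenger location, destination). The helper names `encode` and `decode` are hypothetical and written for this example.

```python
# Assumed packing:
# state = ((taxi_row * 5 + taxi_col) * 5 + passenger) * 4 + destination
LOCATIONS = ["R", "G", "Y", "B"]  # passenger index 4 means "inside the taxi"


def encode(taxi_row, taxi_col, passenger, destination):
    """Pack the four components into one integer state."""
    return ((taxi_row * 5 + taxi_col) * 5 + passenger) * 4 + destination


def decode(state):
    """Unpack an integer state into (taxi_row, taxi_col, passenger, destination)."""
    state, destination = divmod(state, 4)
    state, passenger = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger, destination


# 84 is the state that env.reset() returned earlier in this notebook.
print(decode(84))  # (0, 4, 1, 0)
```

Read against the grid, `(0, 4, 1, 0)` places the taxi in the top-right cell, with the passenger waiting at G and the destination at R.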



### Action space
The action space contains six discrete actions. Gym does not always document what each action means, but in this case they are: down (0), up (1), right (2), left (3), pick-up (4), and drop-off (5). The renderer labels the four moves South, North, East, and West.

In [6]:
env.action_space.n

6
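
Bare integers are easy to mix up, so it can help to give the actions names. The enum below is a hypothetical convenience for this notebook, not something Gym provides. The numbering follows the list above.

```python
from enum import IntEnum


class TaxiAction(IntEnum):
    """Illustrative names for the six Taxi actions (not part of Gym)."""
    DOWN = 0
    UP = 1
    RIGHT = 2
    LEFT = 3
    PICKUP = 4
    DROPOFF = 5


# An IntEnum member behaves like its integer value, so it can be
# passed wherever the plain action number is expected.
print(TaxiAction.UP, int(TaxiAction.UP))
print([action.name for action in TaxiAction])
```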

### Overriding and moving the agent state

In [10]:
env.env.s = 114  # override the internal state of the wrapped environment with 114
env.render()

+---------+
|R: | : :G|
|[43m [0m: : : : |
| : : : : |
| | : | : |
|[35mY[0m| : |[34;1mB[0m: |
+---------+
  (North)


In [11]:
# move up (1)
t = env.step(1)
print(t)
env.render()

(14, -1, False, {'prob': 1.0})
+---------+
|[43mR[0m: | : :G|
| : : : : |
| : : : : |
| | : | : |
|[35mY[0m| : |[34;1mB[0m: |
+---------+
  (North)
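
The tuple printed by `env.step` is easier to read once unpacked. The sketch below reuses the exact tuple shown above rather than calling Gym. The names follow the common convention for this older four-value return format, and newer Gym releases return a different tuple shape.

```python
# The tuple printed above, copied verbatim so this runs without Gym.
result = (14, -1, False, {'prob': 1.0})

# Conventional names for the four values in the older Gym step API.
next_state, reward, done, info = result

print('next state :', next_state)          # integer state after the move
print('reward     :', reward)              # -1 for this ordinary step
print('done       :', done)                # False: the episode continues
print('probability:', info['prob'])        # 1.0: the transition is deterministic
```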


In [12]:
# move left (3): the taxi is already against the left wall, so the state stays 14
t = env.step(3)
print(t)
env.render()

(14, -1, False, {'prob': 1.0})
+---------+
|[43mR[0m: | : :G|
| : : : : |
| : : : : |
| | : | : |
|[35mY[0m| : |[34;1mB[0m: |
+---------+
  (West)


In [13]:
# move right (2)
t = env.step(2)
print(t)
env.render()

(34, -1, False, {'prob': 1.0})
+---------+
|R:[43m [0m| : :G|
| : : : : |
| : : : : |
| | : | : |
|[35mY[0m| : |[34;1mB[0m: |
+---------+
  (East)
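
The three moves above can be reproduced without Gym by a small model of the grid. This is a sketch under assumptions: the map is copied from the rendered output in this notebook, a ":" between two cells is treated as passable and a "|" as a wall, and the state packing is the one Gym's Taxi implementation is assumed to use. The functions `move` and `encode` are hypothetical helpers, and pick-up, drop-off, and rewards are left out.

```python
# Grid copied from the rendered output above.
GRID = [
    "+---------+",
    "|R: | : :G|",
    "| : : : : |",
    "| : : : : |",
    "| | : | : |",
    "|Y| : |B: |",
    "+---------+",
]
DOWN, UP, RIGHT, LEFT = 0, 1, 2, 3


def move(row, col, action):
    """Return the taxi's new (row, col); walls and grid edges block movement."""
    if action == DOWN:
        row = min(row + 1, 4)
    elif action == UP:
        row = max(row - 1, 0)
    elif action == RIGHT and GRID[row + 1][2 * col + 2] == ":":
        col += 1
    elif action == LEFT and GRID[row + 1][2 * col] == ":":
        col -= 1
    return row, col


def encode(row, col, passenger, destination):
    """Assumed packing of the four state components into one integer."""
    return ((row * 5 + col) * 5 + passenger) * 4 + destination


# State 114: taxi at row 1, column 0; passenger at B (3); destination Y (2).
row, col, passenger, destination = 1, 0, 3, 2
visited = []
for action in (UP, LEFT, RIGHT):
    row, col = move(row, col, action)
    visited.append(encode(row, col, passenger, destination))

print(visited)  # [14, 14, 34]
```

The repeated 14 is the blocked move to the left, matching the unchanged state in the second step above.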
