## Classic Control

### MountainCar (v0)

An underpowered car must climb a one-dimensional hill to reach a target. Unlike MountainCarContinuous v0, the action here is discrete: on each step the agent chooses to accelerate left, accelerate right, or not accelerate at all.

The target is on top of a hill on the right-hand side of the car. If the car reaches it or goes beyond, the episode terminates.

On the left-hand side, there is another hill. Climbing this hill can be used to gain potential energy and accelerate towards the target. On top of this second hill, the car cannot go further than a position of -1.2, as if there were a wall. Hitting this limit does not generate a penalty (it might in a more challenging version).

[Environment source code (GitHub)](https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py)

In [None]:
import engine

In [None]:
env = engine.instantiate("MountainCar-v0")

print("Observation space type:", type(env.observation_space))
print("Observation space size:", env.observation_space.shape, "*", env.observation_space.dtype)
print("Observation space max values:", env.observation_space.high)
print("Observation space min values:", env.observation_space.low)

print("Action space type:", env.action_space)

print("Reward range:", env.reward_range)

### Observation Space - _Box(2)_

| Index | Observation Type | Min Value | Max Value |
|:-----:|------------------|:---------:|:---------:|
|   0   | Car Position     |   -1.2    |    0.6    |
|   1   | Car Velocity     |   -0.07   |   0.07    |

### Action Space - _Discrete(3)_

| Action | Action Type             |
|:------:|-------------------------|
|   0    | Accelerate to the Left  |
|   1    | Don't accelerate        |
|   2    | Accelerate to the Right |

_**Note**: The chosen action does not change the gravitational pull on the car. Gravity contributes to the car's velocity regardless of which action is taken._
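To make the interaction between the engine and gravity concrete, here is a minimal sketch of a single update step. It is not taken from this notebook: `step_dynamics` is a hypothetical helper, and the constants (`FORCE`, `GRAVITY`, `MAX_SPEED`) are assumed to match the values in the linked Gym source, so check them against that file before relying on them.

```python
import math

# Assumed constants, believed to match the Gym MountainCar-v0 source.
FORCE = 0.001
GRAVITY = 0.0025
MIN_POSITION, MAX_POSITION = -1.2, 0.6
MAX_SPEED = 0.07


def step_dynamics(position, velocity, action):
    """Advance the car by one step for a discrete action in {0, 1, 2}."""
    # The engine term depends on the action; the gravity term depends only on position.
    velocity += (action - 1) * FORCE - GRAVITY * math.cos(3 * position)
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position += velocity
    position = max(MIN_POSITION, min(MAX_POSITION, position))
    # The left boundary behaves like a wall: the car stops rather than bouncing.
    if position == MIN_POSITION and velocity < 0:
        velocity = 0.0
    return position, velocity
```

Because the engine force is small compared with gravity on the steep part of the slope, accelerating right from the valley floor is not enough on its own, which is why the car has to swing back and forth to build momentum.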

### Reward
A reward of 0 is awarded if the agent reaches the flag (position = 0.5) on top of the mountain.
A reward of -1 is awarded on every step where the agent's position is less than 0.5.

### Starting State
The position of the car is assigned a uniform random value in [-0.6, -0.4].
The starting velocity of the car is always 0.

### Termination Conditions
1) The car's position reaches or exceeds 0.5
2) The episode length reaches 200 steps
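The reward and termination rules can be sketched together as one small function. This is only an illustration, not the environment's implementation: `reward_and_done` and its arguments are hypothetical names, and the sketch assumes that touching the flag exactly counts as reaching it and that the step limit triggers on step 200.

```python
GOAL_POSITION = 0.5       # flag position from the Reward section
MAX_EPISODE_STEPS = 200   # step limit from the termination conditions


def reward_and_done(position, steps_taken):
    """Return (reward, done) for the state reached after `steps_taken` steps."""
    reached_goal = position >= GOAL_POSITION
    # 0 at the flag, -1 everywhere else, as described in the Reward section.
    reward = 0.0 if reached_goal else -1.0
    # The episode ends at the flag or when the step budget is used up.
    done = reached_goal or steps_taken >= MAX_EPISODE_STEPS
    return reward, done
```

Under these rules an episode that never reaches the flag accumulates a total reward of -200, which is the score a random agent will usually get.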

### Solved Requirements
The environment is considered solved when the average reward is at least -110.0 over 100 consecutive trials.
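One way to check this criterion during training is to average the most recent 100 episode rewards. The helper below is a hypothetical sketch (`is_solved` is not part of this notebook's `engine` or `agents` modules) and assumes `episode_rewards` is a list of per-episode total rewards in chronological order.

```python
SOLVED_THRESHOLD = -110.0
WINDOW = 100


def is_solved(episode_rewards, threshold=SOLVED_THRESHOLD, window=WINDOW):
    """Return True if the mean of the last `window` episode rewards meets the threshold."""
    # Fewer than `window` episodes cannot satisfy the criterion yet.
    if len(episode_rewards) < window:
        return False
    recent = list(episode_rewards)[-window:]
    return sum(recent) / window >= threshold
```

For example, `is_solved([-200.0] * 100)` is `False`, while `is_solved([-105.0] * 100)` is `True`.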

In [None]:
from agents import RandomAgent

agent = RandomAgent(env)

In [None]:
from IPython.display import clear_output

# Run the agent 100 times, replacing the previous output on each iteration.
for i in range(100):
    clear_output(wait=True)
    reward = engine.run(env, agent)
    print("Iteration", i, "->", "Reward", reward)