[Official documentation: MountainCar-v0](https://gym.openai.com/envs/MountainCar-v0/)

[GitHub source code](https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py)

[OpenAI Gym documentation](https://gym.openai.com/docs/)

In [4]:
import gym


Each call to `env.step(action)` returns four values:

- **observation** (object): an environment-specific object representing your observation of the environment. For example, pixel data from a camera, joint angles and joint velocities of a robot, or the board state in a board game.
- **reward** (float): amount of reward achieved by the previous action. The scale varies between environments, but the goal is always to increase your total reward.
- **done** (boolean): whether it’s time to reset the environment again. Most (but not all) tasks are divided up into well-defined episodes, and done being True indicates the episode has terminated. (For example, perhaps the pole tipped too far, or you lost your last life.)
- **info** (dict): diagnostic information useful for debugging. It can sometimes be useful for learning (for example, it might contain the raw probabilities behind the environment’s last state change). However, official evaluations of your agent are not allowed to use this for learning.
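
As a minimal sketch of how these four values are typically consumed, the loop below unpacks the tuple returned by `step` and accumulates the reward until `done` is True. `ToyEnv` and `run_episode` are hypothetical names used only for illustration; `ToyEnv` is a stand-in that mimics the `reset`/`step` interface and is not part of Gym.

```python
import random


class ToyEnv:
    """Hypothetical stand-in that mimics the reset/step interface (not part of Gym)."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # number of steps before the episode ends
        self.t = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        return 0.0

    def step(self, action):
        # Advance one timestep and return (observation, reward, done, info).
        self.t += 1
        observation = float(self.t)
        reward = -1.0
        done = self.t >= self.horizon
        info = {"t": self.t}
        return observation, reward, done, info


def run_episode(env, policy):
    """Run one episode and return the total reward and the last info dict."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    info = {}
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, info


# Random policy over three actions, as in a Discrete(3) action space.
total, info = run_episode(ToyEnv(horizon=5), lambda obs: random.randrange(3))
print(total, info)
```

The policy only reads `observation`; `info` is returned for inspection but is not used to choose actions, in line with the description above.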


<img src="./assets/MDP.png" width="380" />

[Section 3.1 of the textbook](http://www.incompleteideas.net/book/RLbook2018.pdf#page=70)

**Observation**: 

     Type:  Box(2)
     Num    Observation               Min            Max
     0      Car Position              -1.2           0.6
     1      Car Velocity              -0.07          0.07
         
**Actions**:

     Type: Discrete(3)
     Num    Action
     0      Accelerate to the Left
     1      Don't accelerate
     2      Accelerate to the Right

     Note: The action does not change the velocity contribution from the gravitational pull acting on the car
        
**Reward**:

     Reward of 0 is awarded if the agent reaches the flag (position = 0.5) on top of the mountain
     Reward of -1 is awarded at each step where the position of the agent is less than 0.5
        
**Starting State**:

     The position of the car is assigned a uniform random value in [-0.6 , -0.4]
     The velocity of the car is always assigned to 0
        
**Episode Termination**:

     The car reaches the flag (position >= 0.5)
     The episode length reaches 200 steps
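
The rules above can be sketched as a single transition function in plain Python. The bounds, goal, and reward come from the description above; the `FORCE` and `GRAVITY` constants and the exact update order are assumptions based on a reading of the linked source file, so treat that file as the reference. The 200-step limit is not modelled here.

```python
import math

MIN_POSITION, MAX_POSITION = -1.2, 0.6  # position bounds from the table above
MAX_SPEED = 0.07                        # velocity bound from the table above
GOAL_POSITION = 0.5                     # flag position
FORCE = 0.001     # assumed push strength per action
GRAVITY = 0.0025  # assumed gravity coefficient


def step(position, velocity, action):
    """Apply one action (0 = left, 1 = none, 2 = right) and return the next state."""
    # The action adds -FORCE, 0, or +FORCE; the gravity term depends only on position.
    velocity += (action - 1) * FORCE - math.cos(3 * position) * GRAVITY
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position += velocity
    position = max(MIN_POSITION, min(MAX_POSITION, position))
    # Hitting the left wall while moving left stops the car.
    if position == MIN_POSITION and velocity < 0:
        velocity = 0.0
    done = position >= GOAL_POSITION
    # Reward as described in the text above: 0 at the flag, -1 otherwise.
    reward = 0.0 if done else -1.0
    return position, velocity, reward, done


# One step to the left from a starting state inside [-0.6, -0.4].
position, velocity, reward, done = step(-0.40890625, 0.0, 0)
print(position, velocity, reward, done)
```

Because the gravity term does not depend on the action, choosing action 1 still changes the velocity whenever the car sits on a slope.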

In [28]:
env = gym.make("MountainCar-v0")
observation = env.reset() 

# Action space object
print("The action space is: {0}\n".format(env.action_space))
# Number of discrete actions
print("The number of actions is: {0}\n".format(env.action_space.n))
# Observation space object
print("The observation space is: {0}\n".format(env.observation_space))
# Shape of the observation space
print("The shape of the observation space is: {0}\n".format(env.observation_space.shape))
# Upper and lower bounds of the observation space
print("The high values in the observation space are {0}, the low values are {1}\n".format(
    env.observation_space.high, env.observation_space.low))


The action space is: Discrete(3)

The number of actions is: 3

The observation space is: Box(2,)

The shape of the observation space is: (2,)

The high values in the observation space are [0.6  0.07], the low values are [-1.2  -0.07]



In [7]:
# Outer loop: number of episodes (each episode restarts the experiment)

# Inner loop: maximum number of timesteps the agent takes per episode
# (100 here, which is below the 200-step limit, so the time limit never sets `done`)

for i_episode in range(20):
    # Reset the environment and get the initial observation
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()


restart
[-0.40890625  0.        ]
[-0.41074957 -0.00184332]
[-0.41342318 -0.00267361]
[-0.41690815 -0.00348497]
[-0.42217971 -0.00527156]
[-0.42820022 -0.00602052]
[-0.4339265  -0.00572628]
[-0.44131724 -0.00739074]
[-0.44831884 -0.0070016 ]
[-0.45688025 -0.00856141]
[-0.46693872 -0.01005847]
[-0.4774201  -0.01048138]
[-0.48924671 -0.01182661]
[-0.5023305  -0.01308379]
[-0.5155737 -0.0132432]
[-0.52987708 -0.01430338]
[-0.54413337 -0.01425629]
[-0.55723575 -0.01310238]
[-0.57008629 -0.01285053]
[-0.58158929 -0.01150301]
[-0.59265956 -0.01107027]
[-0.60321557 -0.01055601]
[-0.61218012 -0.00896455]
[-0.62048809 -0.00830798]
[-0.62707959 -0.0065915 ]
[-0.63190739 -0.0048278 ]
[-0.63593709 -0.0040297 ]
[-0.63814011 -0.00220302]
[-0.64050088 -0.00236077]
[-0.64300274 -0.00250187]
[-0.64462811 -0.00162536]
[-0.64636556 -0.00173746]
[-0.64820295 -0.00183738]
[-6.48127411e-01  7.55357519e-05]
[-0.64613948  0.00198793]
[-0.64225306  0.00388642]
[-0.63749541  0.00475765]
[-0.63190006  0.0055953

[-0.56804893 -0.00324464]
[-0.57096118 -0.00291225]
[-0.5745194  -0.00355822]
[-0.5776972 -0.0031778]
[-0.58147104 -0.00377384]
[-0.58581302 -0.00434198]
[-0.5886911  -0.00287808]
[-0.59208408 -0.00339298]
[-0.59496702 -0.00288295]
[-0.59631879 -0.00135176]
[-0.59712947 -0.00081068]
[-5.97393129e-01 -2.63661761e-04]
[-0.59810784 -0.00071471]
[-5.98268381e-01 -1.60537800e-04]
[-5.97873568e-01  3.94812329e-04]
[-0.59692629  0.00094727]
[-0.59443349  0.00249281]
[-0.59141341  0.00302008]
[-0.58788822  0.00352519]
[-0.58388385  0.00400437]
[-0.5804298   0.00345405]
[-0.57555158  0.00487822]
[-0.57128529  0.00426629]
[-0.56666257  0.00462272]
[-0.56071776  0.0059448 ]
[-0.55449514  0.00622262]
[-0.54804113  0.00645401]
[-0.54240397  0.00563716]
[-0.53662584  0.00577813]
[-0.53175003  0.00487581]
[-0.5278131   0.00393693]
[-0.52284456  0.00496854]
[-0.51788167  0.00496288]
[-0.51296166  0.00492001]
[-0.50812142  0.00484024]
[-0.50439722  0.00372421]
[-0.50081694  0.00358027]
[-0.4974074   0.

[-0.68213809  0.00241461]
[-0.67857876  0.00355933]
[-0.67389851  0.00468025]
[-0.66712882  0.00676969]
[-0.65931561  0.00781321]
[-0.64951241  0.0098032 ]
[-0.63778716  0.01172525]
[-0.62622215  0.01156501]
[-0.61289957  0.01332258]
[-0.59891521  0.01398436]
[-0.58437078  0.01454444]
[-0.56937307  0.01499771]
[-0.55503314  0.01433993]
[-0.53945779  0.01557534]
[-0.52276355  0.01669424]
[-0.50507558  0.01768798]
[-0.48752645  0.01754912]
[-0.46924734  0.01827911]
[-0.45037406  0.01887328]
[-0.43104557  0.0193285 ]
[-0.41340233  0.01764323]
[-0.3965706   0.01683173]
[-0.37966868  0.01690191]
[-0.36281301  0.01685567]
[-0.34811701  0.014696  ]
[-0.33567717  0.01243984]
[-0.32557326  0.01010391]
[-0.31686871  0.00870455]
[-0.30961713  0.00725158]
[-0.30286244  0.00675469]
[-0.2976449   0.00521754]
[-0.29299518  0.00464972]
[-0.2899403   0.00305489]
[-0.28949782  0.00044248]
[-2.89670288e-01 -1.72465980e-04]
[-0.29045671 -0.00078642]
[-0.29185258 -0.00139587]
[-0.29584987 -0.00399729]
[-0.

[-0.55689654 -0.00531748]
[-0.56196471 -0.00506816]
[-0.56574576 -0.00378106]
[-0.56921156 -0.00346579]
[-0.57133632 -0.00212476]
[-0.57410428 -0.00276796]
[-0.57749489 -0.00339061]
[-0.58148304 -0.00398815]
[-0.58403924 -0.0025562 ]
[-0.58514461 -0.00110537]
[-5.84791011e-01  3.53602143e-04]
[-5.84981040e-01 -1.90029508e-04]
[-5.84713300e-01  2.67740039e-04]
[-5.84989765e-01 -2.76464660e-04]
[-0.5858084  -0.00081863]
[-5.86163158e-01 -3.54762399e-04]
[-5.86051438e-01  1.11720239e-04]
[-5.85474058e-01  5.77379664e-04]
[-0.58443527  0.00103878]
[-0.58294275  0.00149253]
[-0.58200748  0.00093526]
[-5.81636396e-01  3.71088076e-04]
[-0.58083222  0.00080417]
[-5.80600903e-01  2.31319109e-04]
[-5.80944148e-01 -3.43245546e-04]
[-5.80859422e-01  8.47263658e-05]
[-0.57934735  0.00151207]
[-0.57841911  0.00092824]
[-0.57708157  0.00133754]
[-0.57534462  0.00173694]
[-0.57422114  0.00112348]
[-5.73719453e-01  5.01690730e-04]
[-0.57284327  0.00087618]
[-0.5705991   0.00224417]
[-0.56700359  0.0035

[-0.56730441 -0.00781733]
[-0.57479489 -0.00749048]
[-0.5829029  -0.00810801]
[-0.59056847 -0.00766557]
[-0.59673515 -0.00616667]
[-0.60235769 -0.00562254]
[-0.60839502 -0.00603733]
[-0.61480322 -0.0064082 ]
[-0.61953588 -0.00473267]
[-0.62255891 -0.00302303]
[-0.6238506  -0.00129168]
[-6.24401677e-01 -5.51079370e-04]
[-6.24208206e-01  1.93471809e-04]
[-6.24271568e-01 -6.33623307e-05]
[-6.24591311e-01 -3.19742749e-04]
[-6.25165145e-01 -5.73833980e-04]
[-6.24988963e-01  1.76181410e-04]
[-0.62306403  0.00192494]
[-0.62040412  0.0026599 ]
[-0.61602834  0.00437578]
[-0.61096819  0.00506015]
[-0.60626024  0.00470795]
[-0.60193867  0.00432157]
[-0.59803494  0.00390373]
[-0.59257757  0.00545737]
[-0.58660654  0.00597103]
[-0.58116576  0.00544078]
[-0.57629537  0.00487039]
[-0.57203141  0.00426397]
[-0.56740547  0.00462593]
[-0.56245193  0.00495354]
[-0.55820765  0.00424428]
[-0.55370427  0.00450338]
[-0.54797541  0.00572886]
[-0.54206388  0.00591153]
[-0.53501394  0.00704994]
[-0.5278784   0.
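
Each printed pair is `[position, velocity]`, so the action taken between two consecutive lines can be recovered from the change in velocity. The sketch below does this under assumptions: `infer_action` is a hypothetical helper, and the `FORCE` and `GRAVITY` constants are taken from a reading of the linked source file. The result is not reliable on steps where the velocity was clipped or the car hit the left wall.

```python
import math

FORCE = 0.001     # assumed push strength per action
GRAVITY = 0.0025  # assumed gravity coefficient


def infer_action(obs, next_obs):
    """Estimate the action (0, 1, or 2) that led from obs to next_obs."""
    position, velocity = obs
    _, next_velocity = next_obs
    # Velocity change caused by gravity alone at the current position.
    gravity_term = -math.cos(3 * position) * GRAVITY
    # Whatever remains is the push from the action: -FORCE, 0, or +FORCE.
    push = next_velocity - velocity - gravity_term
    return round(push / FORCE) + 1


# First three observations printed above.
first = infer_action((-0.40890625, 0.0), (-0.41074957, -0.00184332))
second = infer_action((-0.41074957, -0.00184332), (-0.41342318, -0.00267361))
print(first, second)  # 0 1
```

Under these assumptions, the first step accelerated to the left and the second did not accelerate.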