<h1>CartPole Environment Demo</h1>

This is a simple demo of how [OpenAI Gym's CartPole-v0 environment](https://gym.openai.com/envs/CartPole-v0/) works. The first example steps the environment with random actions for 500 timesteps, regardless of what happens, so you may see the cart move off the screen or the pole fall too far off balance. The next example records what the simulation returns after each individual action: **observations, rewards, done criteria, and other raw simulation information.** If you aren't familiar with OpenAI Gym, skim the [docs](https://gym.openai.com/docs/) first.

In [1]:
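import gym
# Create the environment and put it in its initial state.
env = gym.make('CartPole-v0')
env.reset()
for _ in range(500):
    env.render()  # draw the current frame
    # Take a random action; the returned values are ignored on purpose.
    env.step(env.action_space.sample())
env.close()  # close the render window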
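
To make "off the screen" and "too far off balance" concrete, here is a minimal sketch that names the four entries of a CartPole observation and checks them against the failure limits. The limits (a cart position of 2.4 and a pole angle of 12 degrees) are assumptions taken from Gym's classic-control CartPole source, and the helper names `label_observation` and `is_out_of_bounds` are hypothetical, not part of Gym. CartPole-v0 is also registered with a 200-step cap per episode, which this sketch does not model.

```python
import math

# Assumed limits, taken from Gym's classic-control CartPole source.
X_LIMIT = 2.4                    # cart position limit
THETA_LIMIT = math.radians(12)   # pole angle limit, about 0.2095 rad


def label_observation(observation):
    """Name the four entries of a CartPole observation (hypothetical helper)."""
    cart_position, cart_velocity, pole_angle, pole_angular_velocity = observation
    return {
        "cart_position": cart_position,
        "cart_velocity": cart_velocity,
        "pole_angle": pole_angle,
        "pole_angular_velocity": pole_angular_velocity,
    }


def is_out_of_bounds(observation):
    """Return True if the cart or the pole is past the assumed limits."""
    state = label_observation(observation)
    return bool(
        abs(state["cart_position"]) > X_LIMIT
        or abs(state["pole_angle"]) > THETA_LIMIT
    )


# A sample observation whose pole angle is just under the assumed limit.
sample = [-0.17730611, -0.38034213, 0.20743108, 0.90374229]
print(label_observation(sample))
print(is_out_of_bounds(sample))
```

Because the first example ignores these limits and keeps stepping, the simulation continues even after the pole has fallen.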



This demo stores the observation, reward, done condition, and additional info returned by each action. We simulate 20 episodes of random actions, where each episode ends either after 1000 timesteps or when the *done* criterion is met. Calling `env.close()` is needed to prevent crashing when rendering the simulation.
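
The values returned by each step can also be gathered into a per-episode record instead of being printed. The sketch below assumes the older Gym API used in this demo, where `reset()` returns an observation and `step()` returns `(observation, reward, done, info)`. `run_episode` and `CountdownEnv` are hypothetical names; `CountdownEnv` is a stand-in environment so the sketch runs without Gym or a display.

```python
import random


def run_episode(env, policy, max_steps=1000):
    """Run one episode; return a list of (observation, action, reward, done).

    Assumes the older Gym API: reset() returns the observation and
    step() returns (observation, reward, done, info).
    """
    observation = env.reset()
    history = []
    for _ in range(max_steps):
        action = policy(observation)
        next_observation, reward, done, info = env.step(action)
        history.append((observation, action, reward, done))
        observation = next_observation
        if done:
            break
    return history


class CountdownEnv:
    """Hypothetical stand-in environment that reports done after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return [float(self.t), 0.0, 0.0, 0.0], 1.0, done, {}


# Random policy over two actions, mirroring CartPole's left/right choice.
history = run_episode(CountdownEnv(length=5), policy=lambda obs: random.choice([0, 1]))
episode_length = len(history)
total_reward = sum(reward for _, _, reward, _ in history)
print(episode_length, total_reward)  # prints: 5 5.0
```

With Gym installed, passing `gym.make('CartPole-v0')` as `env` and `lambda obs: env.action_space.sample()` as `policy` should give the same kind of record for the random-action episodes, provided the installed Gym version still returns four values from `step()`.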

In [1]:
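import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    # Start a new episode and get its initial observation.
    observation = env.reset()
    for t in range(1000):
        env.render()
        print(observation)
        # Pick a random action and advance the simulation by one step.
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        print(reward)
        # Stop as soon as the environment reports the episode is over.
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()  # close the render window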

[-0.02825124  0.0236406  -0.03860474 -0.00245405]
1.0
[-0.02777842 -0.17090705 -0.03865383  0.27780301]
1.0
[-0.03119656 -0.36545685 -0.03309776  0.55804834]
1.0
[-0.0385057  -0.56009893 -0.0219368   0.8401226 ]
1.0
[-0.04970768 -0.75491463 -0.00513435  1.12582692]
1.0
[-0.06480597 -0.55972577  0.01738219  0.83153801]
1.0
[-0.07600049 -0.75508091  0.03401295  1.12963656]
1.0
[-0.09110211 -0.56042051  0.05660568  0.84781273]
1.0
[-0.10231052 -0.36611447  0.07356194  0.57345367]
1.0
[-0.10963281 -0.56218653  0.08503101  0.88837407]
1.0
[-0.12087654 -0.75835316  0.10279849  1.2065299 ]
1.0
[-0.1360436  -0.5646986   0.12692909  0.9477511 ]
1.0
[-0.14733757 -0.37149312  0.14588411  0.69749133]
1.0
[-0.15476743 -0.17866296  0.15983394  0.45405648]
1.0
[-0.15834069 -0.37564123  0.16891507  0.79254913]
1.0
[-0.16585352 -0.57262947  0.18476605  1.13325131]
1.0
[-0.17730611 -0.38034213  0.20743108  0.90374229]
1.0
Episode finished after 17 timesteps
[ 0.02173086 -0.00917372 -0.01704039  0.038832

[0.00885946 0.01396773 0.0524177  0.05272136]
1.0
[ 0.00913881  0.20830042  0.05347213 -0.22297355]
1.0
[ 0.01330482  0.40261894  0.04901266 -0.49832141]
1.0
[ 0.0213572   0.20684146  0.03904623 -0.19060389]
1.0
[ 0.02549403  0.40138369  0.03523415 -0.47071814]
1.0
[ 0.0335217   0.20578222  0.02581979 -0.16714135]
1.0
[ 0.03763735  0.40052525  0.02247696 -0.45156835]
1.0
[ 0.04564785  0.59532223  0.01344559 -0.73708234]
1.0
[ 0.0575543   0.79025592 -0.00129605 -1.02550357]
1.0
[ 0.07335942  0.98539511 -0.02180613 -1.31859314]
1.0
[ 0.09306732  0.79055558 -0.04817799 -1.03281382]
1.0
[ 0.10887843  0.59610639 -0.06883426 -0.75563732]
1.0
[ 0.12080056  0.79210625 -0.08394701 -1.06916217]
1.0
[ 0.13664268  0.98823207 -0.10533025 -1.38696637]
1.0
[ 0.15640732  1.18449741 -0.13306958 -1.71064266]
1.0
[ 0.18009727  0.9911317  -0.16728244 -1.46216543]
1.0
[ 0.19991991  1.18786186 -0.19652574 -1.80209338]
1.0
Episode finished after 21 timesteps
[ 0.03958987  0.00541041  0.01918162 -0.01190149]


[ 0.00424693  0.01139244  0.01637801 -0.04655703]
1.0
[ 0.00447478  0.20627576  0.01544687 -0.33402788]
1.0
[ 0.00860029  0.40117449  0.00876632 -0.62179991]
1.0
[ 0.01662378  0.59617294 -0.00366968 -0.91170907]
1.0
[ 0.02854724  0.79134435 -0.02190386 -1.2055431 ]
1.0
[ 0.04437413  0.59651221 -0.04601473 -0.91980423]
1.0
[ 0.05630437  0.40204139 -0.06441081 -0.64193032]
1.0
[ 0.0643452   0.59799925 -0.07724942 -0.95418127]
1.0
[ 0.07630518  0.79407068 -0.09633304 -1.27010006]
1.0
[ 0.0921866   0.99028146 -0.12173504 -1.59132978]
1.0
[ 0.11199223  1.18662021 -0.15356164 -1.91936015]
1.0
[ 0.13572463  0.99344705 -0.19194884 -1.67798003]
1.0
Episode finished after 24 timesteps
[-0.04690845 -0.04674046 -0.01922688 -0.02759014]
1.0
[-0.04784326 -0.24158148 -0.01977868  0.25896502]
1.0
[-0.05267489 -0.04618283 -0.01459938 -0.03989018]
1.0
[-0.05359855 -0.24109242 -0.01539719  0.24815098]
1.0
[-0.05842039 -0.43599113 -0.01043417  0.5359378 ]
1.0
[-6.71402164e-02 -6.30964825e-01  2.84587778e-