# OpenAI Gym Tutorial



## Run a Basic Environment

The codes below will run an instance of the CartPole-v0 environment for 1000 timesteps, rendering the environment at each step. 

In [None]:
import gym

# env = gym.make('CartPole-v0')
# env = gym.make('MountainCar-v0')
# env = gym.make('MsPacman-v0')
# env = gym.make('Hopper-v1')

env = gym.make('CartPole-v0')
env.reset()
for _ in range(1000):
    env.render()
    env.step(env.action_space.sample()) # take a random action

## Run Basic Environment - Part 2

Run 20 episodes of CartPole-v0 environment (the right way). Note the following:
* env.reset() returns an observation
* env.step() returns observation, reward, done and info
* done is True if the episode is terminated

In [5]:
import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        print(action,reward,done)
        if done:  # Episode has terminated
            print("Episode finished after {} timesteps".format(t+1))
            break   # Break to restart episode

[2017-08-27 16:09:17,364] Making new env: CartPole-v0


[ 0.03952292 -0.00720098 -0.01664389  0.0007348 ]
1 1.0 False
[ 0.0393789   0.18815566 -0.0166292  -0.29715266]
1 1.0 False
[ 0.04314202  0.38351067 -0.02257225 -0.59503341]
0 1.0 False
[ 0.05081223  0.1887118  -0.03447292 -0.30954521]
0 1.0 False
[ 0.05458647 -0.00590246 -0.04066382 -0.0279303 ]
1 1.0 False
[ 0.05446842  0.18977833 -0.04122243 -0.33316072]
1 1.0 False
[ 0.05826398  0.38546202 -0.04788564 -0.63855286]
1 1.0 False
[ 0.06597322  0.58121781 -0.0606567  -0.94592295]
1 1.0 False
[ 0.07759758  0.77710195 -0.07957516 -1.25703093]
0 1.0 False
[ 0.09313962  0.58308358 -0.10471578 -0.99029472]
0 1.0 False
[ 0.10480129  0.38950719 -0.12452167 -0.73225088]
1 1.0 False
[ 0.11259143  0.58610969 -0.13916669 -1.06138607]
0 1.0 False
[ 0.12431363  0.39307725 -0.16039441 -0.81541985]
0 1.0 False
[ 0.13217517  0.2004722  -0.17670281 -0.57717462]
0 1.0 False
[ 0.13618462  0.00820958 -0.1882463  -0.34495207]
0 1.0 False
[ 0.13634881 -0.18380567 -0.19514534 -0.11703555]
0 1.0 False
[ 0.1326

## Action and Observation Spaces

Every environment comes with Space objects that describe the valid actions and observations.

In [3]:
import gym
env = gym.make('CartPole-v0')
print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)
print(env.observation_space.high)
#> array([ 2.4       ,         inf,  0.20943951,         inf])
print(env.observation_space.low)
#> array([-2.4       ,        -inf, -0.20943951,        -inf])

[2017-08-26 16:14:48,966] Making new env: CartPole-v0


Discrete(2)
Box(4,)
[  4.80000000e+00   3.40282347e+38   4.18879020e-01   3.40282347e+38]
[ -4.80000000e+00  -3.40282347e+38  -4.18879020e-01  -3.40282347e+38]


In [1]:
from gym import spaces

space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
print ("Sample from Space:", x)
assert space.contains(x)
assert space.n == 8

Sample from Space: 4


## Environments

OpenAI gym's main purpose is to provide a large collection of environments that expose a common interface and are versioned to allow for comparisons. envs.registry.all() give you a list of EnvSpecs.

In [2]:
from gym import envs
print(envs.registry.all())

dict_values([EnvSpec(AirRaid-v0), EnvSpec(Venture-v0), EnvSpec(MsPacman-v0), EnvSpec(Frostbite-v0), EnvSpec(BankHeistDeterministic-v0), EnvSpec(OffSwitchCartpoleProb-v0), EnvSpec(Skiing-ramDeterministic-v4), EnvSpec(Asteroids-v0), EnvSpec(Frostbite-ramDeterministic-v0), EnvSpec(RepeatCopy-v0), EnvSpec(SeaquestNoFrameskip-v4), EnvSpec(BankHeistDeterministic-v4), EnvSpec(Assault-ramDeterministic-v0), EnvSpec(SemisuperPendulumNoise-v0), EnvSpec(DemonAttack-v4), EnvSpec(DemonAttackNoFrameskip-v4), EnvSpec(GopherNoFrameskip-v4), EnvSpec(Pooyan-ram-v0), EnvSpec(Centipede-v0), EnvSpec(PrivateEye-v4), EnvSpec(TimePilotNoFrameskip-v4), EnvSpec(Zaxxon-ramDeterministic-v4), EnvSpec(Robotank-ram-v0), EnvSpec(PhoenixNoFrameskip-v0), EnvSpec(RobotankDeterministic-v0), EnvSpec(WizardOfWor-ramDeterministic-v0), EnvSpec(CarnivalDeterministic-v0), EnvSpec(FishingDerby-ramNoFrameskip-v4), EnvSpec(MsPacmanNoFrameskip-v4), EnvSpec(SpaceInvaders-v0), EnvSpec(UpNDown-ramNoFrameskip-v0), EnvSpec(Robotank-v4),

In [3]:
#!/usr/bin/env python
from gym import envs
envids = [spec.id for spec in envs.registry.all()]
for envid in sorted(envids):
    print(envid)

Acrobot-v1
AirRaid-ram-v0
AirRaid-ram-v4
AirRaid-ramDeterministic-v0
AirRaid-ramDeterministic-v4
AirRaid-ramNoFrameskip-v0
AirRaid-ramNoFrameskip-v4
AirRaid-v0
AirRaid-v4
AirRaidDeterministic-v0
AirRaidDeterministic-v4
AirRaidNoFrameskip-v0
AirRaidNoFrameskip-v4
Alien-ram-v0
Alien-ram-v4
Alien-ramDeterministic-v0
Alien-ramDeterministic-v4
Alien-ramNoFrameskip-v0
Alien-ramNoFrameskip-v4
Alien-v0
Alien-v4
AlienDeterministic-v0
AlienDeterministic-v4
AlienNoFrameskip-v0
AlienNoFrameskip-v4
Amidar-ram-v0
Amidar-ram-v4
Amidar-ramDeterministic-v0
Amidar-ramDeterministic-v4
Amidar-ramNoFrameskip-v0
Amidar-ramNoFrameskip-v4
Amidar-v0
Amidar-v4
AmidarDeterministic-v0
AmidarDeterministic-v4
AmidarNoFrameskip-v0
AmidarNoFrameskip-v4
Ant-v1
Assault-ram-v0
Assault-ram-v4
Assault-ramDeterministic-v0
Assault-ramDeterministic-v4
Assault-ramNoFrameskip-v0
Assault-ramNoFrameskip-v4
Assault-v0
Assault-v4
AssaultDeterministic-v0
AssaultDeterministic-v4
AssaultNoFrameskip-v0
AssaultNoFrameskip-v4
Asterix-ra

## Upload and Record Results

To record your algorithm's performance on an environment and take videos of its learning, wrap your environment with a Monitor Wrapper:

In [6]:
import gym
from gym import wrappers
env = gym.make('CartPole-v0')
env = wrappers.Monitor(env, '/tmp/cartpole-experiment-1', force=True)   # force=TRUE clear the directory's 
                                                                        # previous content
for i_episode in range(200):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
            
env.close()  # Remember to close the environment

[2017-08-26 18:04:40,294] Making new env: CartPole-v0
[2017-08-26 18:04:40,314] Clearing 8 monitor files from previous run (because force=True was provided)
[2017-08-26 18:04:40,316] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000000.mp4


[ 0.01811564 -0.01988677  0.04899066 -0.01158837]
[ 0.0177179   0.17449961  0.04875889 -0.2884206 ]
[ 0.0212079  -0.02128252  0.04299048  0.01923301]
[ 0.02078224  0.17319737  0.04337514 -0.25958199]
[ 0.02424619  0.36767414  0.0381835  -0.53827447]
[ 0.03159967  0.17203676  0.02741801 -0.23380909]
[ 0.03504041  0.36675645  0.02274183 -0.51771896]
[ 0.04237554  0.1713218   0.01238745 -0.21795727]
[ 0.04580197  0.3662645   0.0080283  -0.50670706]
[ 0.05312726  0.56127241 -0.00210584 -0.79684921]
[ 0.06435271  0.36617942 -0.01804282 -0.50482948]
[ 0.0716763   0.17131633 -0.02813941 -0.21788672]
[ 0.07510263 -0.0233923  -0.03249715  0.0657887 ]
[ 0.07463478 -0.21803363 -0.03118137  0.34804406]
[ 0.07027411 -0.02248237 -0.02422049  0.04569392]
[ 0.06982446  0.17297836 -0.02330661 -0.25453132]
[ 0.07328403 -0.02180319 -0.02839724  0.03071014]
[ 0.07284797 -0.21650664 -0.02778304  0.3142999 ]
[ 0.06851783 -0.02100016 -0.02149704  0.0129862 ]
[ 0.06809783  0.17442338 -0.02123731 -0.28640103]


[2017-08-26 18:04:41,643] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000001.mp4


Episode finished after 36 timesteps
[-0.03049624  0.03075248  0.04508895  0.00725519]
[-0.02988119  0.22519976  0.04523405 -0.27086764]
[-0.0253772   0.0294625   0.0398167   0.03573221]
[-0.02478795 -0.16620715  0.04053134  0.34070706]
[-0.02811209  0.02831538  0.04734548  0.06107586]
[-0.02754578 -0.1674523   0.048567    0.36831269]
[-0.03089483  0.02694712  0.05593326  0.09133048]
[-0.03035589 -0.16893007  0.05775987  0.40112259]
[-0.03373449 -0.36482172  0.06578232  0.711442  ]
[-0.04103092 -0.17066939  0.08001116  0.44016946]
[-0.04444431  0.02323445  0.08881455  0.17374239]
[-0.04397962 -0.17303894  0.09228939  0.49306925]
[-0.0474404  -0.36933311  0.10215078  0.81335205]
[-0.05482706 -0.17574746  0.11841782  0.5544674 ]
[-0.05834201  0.01753008  0.12950717  0.30131496]
[-0.05799141  0.21059124  0.13553347  0.05211712]
[-0.05377958  0.01381249  0.13657581  0.38430333]
[-0.05350333 -0.18295738  0.14426188  0.71673823]
[-0.05716248 -0.37975002  0.15859664  1.05112689]
[-0.06475748 -

[2017-08-26 18:04:44,352] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000008.mp4


[-0.03386989 -0.04616582  0.01758758  0.07507417]
[-0.0347932  -0.24153543  0.01908906  0.37325375]
[-0.03962391 -0.43692328  0.02655414  0.67189397]
[-0.04836238 -0.6324041   0.03999202  0.97281782]
[-0.06101046 -0.43784092  0.05944837  0.69296093]
[-0.06976728 -0.63373501  0.07330759  1.00375051]
[-0.08244198 -0.8297556   0.0933826   1.31852483]
[-0.09903709 -0.63593028  0.1197531   1.05646819]
[-0.1117557  -0.44258122  0.14088246  0.80364502]
[-0.12060732 -0.63932469  0.15695536  1.13711892]
[-0.13339381 -0.83611182  0.17969774  1.47462683]
[-0.15011605 -0.64358272  0.20919028  1.24303068]
Episode finished after 16 timesteps
[ 0.04863384 -0.03877173  0.00932951  0.0478068 ]
[ 0.0478584   0.15621521  0.01028564 -0.24191804]
[ 0.05098271 -0.03905215  0.00544728  0.05399142]
[ 0.05020166  0.15599128  0.00652711 -0.23696788]
[ 0.05332149 -0.03922331  0.00178775  0.05776672]
[ 0.05253702  0.15587296  0.00294309 -0.23435162]
[ 0.05565448  0.35095274 -0.00174394 -0.52610475]
[ 0.06267354  

[2017-08-26 18:04:51,615] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000027.mp4


[-0.07366323 -0.05524144  0.12924955  0.49705099]
[-0.07476805  0.13784382  0.13919057  0.24773257]
[-0.07201118 -0.05896314  0.14414522  0.58087794]
[-0.07319044  0.13387621  0.15576278  0.3368509 ]
[-0.07051292 -0.06307964  0.1624998   0.67431732]
[-0.07177451  0.12945566  0.17598615  0.43688404]
[-0.0691854  -0.06766369  0.18472383  0.77946937]
[-0.07053867  0.12450341  0.20031322  0.55012319]
Episode finished after 26 timesteps
[-0.01917993 -0.00820223  0.01452665  0.03286382]
[-0.01934398  0.18670842  0.01518392 -0.25520063]
[-0.01560981  0.38161033  0.01007991 -0.54305588]
[ -7.97760076e-03   5.76589177e-01  -7.81208338e-04  -8.32545854e-01]
[ 0.00355418  0.77172179 -0.01743213 -1.12547437]
[ 0.01898862  0.57683259 -0.03994161 -0.83830976]
[ 0.03052527  0.77247655 -0.05670781 -1.14328135]
[ 0.0459748   0.57813953 -0.07957343 -0.86890757]
[ 0.05753759  0.77424865 -0.09695159 -1.18551034]
[ 0.07302256  0.58050853 -0.12066179 -0.92472404]
[ 0.08463274  0.77703544 -0.13915627 -1.2527

[2017-08-26 18:05:05,818] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000064.mp4


[ 0.02372256  0.14667433  0.03571564  0.04941262]
[ 0.02665605  0.34126643  0.0367039  -0.23179122]
[ 0.03348137  0.14563976  0.03206807  0.07223967]
[ 0.03639417 -0.0499269   0.03351286  0.37486524]
[ 0.03539563 -0.24550846  0.04101017  0.67792378]
[ 0.03048546 -0.44117545  0.05456864  0.9832311 ]
[ 0.02166195 -0.63698438  0.07423327  1.29254252]
[ 0.00892227 -0.83296728  0.10008412  1.6075118 ]
[-0.00773708 -0.63916079  0.13223435  1.34763248]
[-0.0205203  -0.83567356  0.159187    1.67859194]
[-0.03723377 -0.64271523  0.19275884  1.43941751]
Episode finished after 32 timesteps
[ 0.00455898  0.0372442   0.0326114   0.01516056]
[ 0.00530387  0.23188366  0.03291461 -0.26705733]
[ 0.00994154  0.03630779  0.02757346  0.03582272]
[ 0.0106677  -0.1591985   0.02828992  0.33707624]
[ 0.00748373  0.03550969  0.03503144  0.05344696]
[ 0.00819392  0.23011228  0.03610038 -0.2279807 ]
[ 0.01279617  0.42470023  0.03154077 -0.50906124]
[ 0.02129017  0.61936395  0.02135954 -0.79164014]
[ 0.03367745  

[2017-08-26 18:05:26,767] Starting new video recorder writing to /tmp/cartpole-experiment-1/openaigym.video.2.6880.video000125.mp4


[-0.08502032 -0.57269567  0.09134685  0.86677371]
[-0.09647423 -0.76893405  0.10868233  1.18672268]
[-0.11185292 -0.57537612  0.13241678  0.92998854]
[-0.12336044 -0.77201278  0.15101655  1.26117827]
[-0.13880069 -0.57910986  0.17624012  1.01934743]
[-0.15038289 -0.77608654  0.19662706  1.36178244]
Episode finished after 21 timesteps
[-0.01184577  0.03180636  0.03391105  0.00258222]
[-0.01120964 -0.1637851   0.0339627   0.30576868]
[-0.01448534  0.03083683  0.04007807  0.02398725]
[-0.01386861 -0.16483628  0.04055782  0.32904101]
[-0.01716533 -0.36051143  0.04713864  0.6342331 ]
[-0.02437556 -0.16607761  0.0598233   0.35675997]
[-0.02769711  0.02814506  0.0669585   0.08352469]
[-0.02713421  0.22224647  0.06862899 -0.1873039 ]
[-0.02268928  0.02621316  0.06488291  0.12621494]
[-0.02216502  0.2203485   0.06740721 -0.14531313]
[-0.01775805  0.02432921  0.06450095  0.16785077]
[-0.01727146  0.21847144  0.06785797 -0.10380687]
[-0.01290203  0.41255862  0.06578183 -0.37433267]
[-0.00465086  

[2017-08-26 18:05:51,965] Finished writing results. You can upload them to the scoreboard via gym.upload('/tmp/cartpole-experiment-1')


[ 0.0407179   0.16160421 -0.0123577  -0.2860109 ]
[ 0.04394999  0.3569002  -0.01807791 -0.58256559]
[ 0.05108799  0.55227071 -0.02972923 -0.88088808]
[ 0.0621334   0.35756495 -0.04734699 -0.5976976 ]
[ 0.0692847   0.55331633 -0.05930094 -0.90491043]
[ 0.08035103  0.74918906 -0.07739915 -1.21562734]
[ 0.09533481  0.94521937 -0.10171169 -1.53152512]
[ 0.1142392   1.14140963 -0.1323422  -1.85414129]
[ 0.13706739  0.94796745 -0.16942502 -1.60531047]
[ 0.15602674  0.75520615 -0.20153123 -1.36988149]
Episode finished after 11 timesteps


To upload the result:

In [7]:
import gym

gym.upload('/tmp/cartpole-experiment-1', api_key='sk_1oIzEVMTFW8LVGHnRRC6w')

[2017-08-26 18:06:10,259] [CartPole-v0] Uploading 200 episodes of training data
[2017-08-26 18:06:22,409] [CartPole-v0] Uploading videos of 6 training episodes (17736 bytes)
[2017-08-26 18:06:23,514] [CartPole-v0] Creating evaluation object from /tmp/cartpole-experiment-1 with learning curve and training video
[2017-08-26 18:06:23,987] 
****************************************************
You successfully uploaded your evaluation on CartPole-v0 to
OpenAI Gym! You can find it at:

    https://gym.openai.com/evaluations/eval_v5lvKVsmQGGIPlgjf3lRTQ

****************************************************


## Solving CartPole-v0

According to Andrej Karpathy, instead of starting with more advanced algorithms, we should always begin with a simpler approach such as CEM. So I simply used the cem.py provided by OpenAI Gym in the examples\agents\cem.py. And it solves the "CartPole-v0" environment in 59 episodes.

In [16]:
run -i "examples/agents/cem.py"

[2017-08-27 11:03:25,468] Making new env: CartPole-v0
[2017-08-27 11:03:25,468] Making new env: CartPole-v0
[2017-08-27 11:03:25,486] Clearing 16 monitor files from previous run (because force=True was provided)
[2017-08-27 11:03:25,486] Clearing 16 monitor files from previous run (because force=True was provided)
[2017-08-27 11:03:25,489] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000000.mp4
[2017-08-27 11:03:25,489] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000000.mp4
[2017-08-27 11:03:25,748] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000001.mp4
[2017-08-27 11:03:25,748] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000001.mp4
[2017-08-27 11:03:26,057] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000008.mp4
[2017-08-27 11:03:26,057] Starting new video reco

Iteration  0. Episode mean reward:  25.640


[2017-08-27 11:03:26,908] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000064.mp4
[2017-08-27 11:03:26,908] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000064.mp4


Iteration  1. Episode mean reward:  86.040


[2017-08-27 11:03:29,344] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000125.mp4
[2017-08-27 11:03:29,344] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000125.mp4


Iteration  2. Episode mean reward: 171.120
Iteration  3. Episode mean reward: 196.080
Iteration  4. Episode mean reward: 198.480


[2017-08-27 11:03:32,964] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000216.mp4
[2017-08-27 11:03:32,964] Starting new video recorder writing to /tmp/cem-agent-results/openaigym.video.3.2333.video000216.mp4


Iteration  5. Episode mean reward: 200.000
Iteration  6. Episode mean reward: 200.000
Iteration  7. Episode mean reward: 200.000


[2017-08-27 11:03:36,453] Finished writing results. You can upload them to the scoreboard via gym.upload('/tmp/cem-agent-results')
[2017-08-27 11:03:36,453] Finished writing results. You can upload them to the scoreboard via gym.upload('/tmp/cem-agent-results')
[2017-08-27 11:03:36,454] Successfully ran cross-entropy method. Now trying to upload results to the scoreboard. If it breaks, you can always just try re-uploading the same results.
[2017-08-27 11:03:36,454] Successfully ran cross-entropy method. Now trying to upload results to the scoreboard. If it breaks, you can always just try re-uploading the same results.
[2017-08-27 11:03:36,460] [CartPole-v0] Uploading 250 episodes of training data
[2017-08-27 11:03:36,460] [CartPole-v0] Uploading 250 episodes of training data


Iteration  8. Episode mean reward: 200.000
Iteration  9. Episode mean reward: 200.000


[2017-08-27 11:03:38,983] [CartPole-v0] Uploading videos of 7 training episodes (51695 bytes)
[2017-08-27 11:03:38,983] [CartPole-v0] Uploading videos of 7 training episodes (51695 bytes)
[2017-08-27 11:03:40,034] [CartPole-v0] Creating evaluation object from /tmp/cem-agent-results with learning curve and training video
[2017-08-27 11:03:40,034] [CartPole-v0] Creating evaluation object from /tmp/cem-agent-results with learning curve and training video
[2017-08-27 11:03:40,429] 
****************************************************
You successfully uploaded your evaluation on CartPole-v0 to
OpenAI Gym! You can find it at:

    https://gym.openai.com/evaluations/eval_kBbT9l3QuWFAIu8sRMbw

****************************************************
[2017-08-27 11:03:40,429] 
****************************************************
You successfully uploaded your evaluation on CartPole-v0 to
OpenAI Gym! You can find it at:

    https://gym.openai.com/evaluations/eval_kBbT9l3QuWFAIu8sRMbw

*************

## Gym Environments

We will showcase some environment types.

Algorithmic


In [1]:
import gym
env = gym.make('Copy-v0')
env.reset()
env.render()

[2017-08-27 18:43:23,339] Making new env: Copy-v0


Total length of input instance: 3, step: 0
Observation Tape    :   [42mC[0mDA  
Output Tape         :   
Targets             :   CDA  








<ipykernel.iostream.OutStream at 0x7f70fe8bb278>

### Atari Environments

The Atari environments are a variety of Atari video games:

In [2]:
import gym
env = gym.make('SpaceInvaders-v0')
env.reset()
env.render()

[2017-08-27 18:46:29,480] Making new env: SpaceInvaders-v0


### Board Game Environments

The board game environments are a variety of board games:

In [3]:
import gym
env = gym.make('Go9x9-v0')
env.reset()
env.render()

[2017-08-27 18:54:00,212] Making new env: Go9x9-v0


To play: black
Move:   0  Komi: 0.0  Handicap: 0  Captures B: 0 W: 0
      A B C D E F G H J  
    +-------------------+
  9 | . . . . . . . . . |
  8 | . . . . . . . . . |
  7 | . . . . . . . . . |
  6 | . . . . . . . . . |
  5 | . . . . . . . . . |
  4 | . . . . . . . . . |
  3 | . . . . . . . . . |
  2 | . . . . . . . . . |
  1 | . . . . . . . . . |
    +-------------------+


<ipykernel.iostream.OutStream at 0x7f70fe8bb278>

### Box2d Environments

Box2d is a 2D physics engine. But something is wrong with the current Gym setup. It has to do with Box2d-Kengz.

In [8]:
import gym
env = gym.make('LunarLander-v2')
env.reset()
env.render()

[2017-08-27 19:17:55,462] Making new env: LunarLander-v2


AttributeError: module 'Box2D._Box2D' has no attribute 'RAND_LIMIT'

### MuJoCo Environments

MuJoCo is a physics engine which can do very detailed efficient simulations with contacts. We have not installed it properly yet.

In [9]:
import gym
env = gym.make('Humanoid-v1')
env.reset()
env.render()

[2017-08-27 19:41:56,406] Making new env: Humanoid-v1


MujocoDependencyError: To use MuJoCo, you need to either populate ~/.mujoco/mjkey.txt and ~/.mujoco/mjpro131, or set the MUJOCO_PY_MJKEY_PATH and MUJOCO_PY_MJPRO_PATH environment variables appropriately. Follow the instructions on https://github.com/openai/mujoco-py for where to obtain these.

### Toy text Environments

Toy environments which are text-based. There's no extra dependency to install.

In [1]:
import gym
env = gym.make('FrozenLake-v0')
env.reset()
env.render()

[2017-08-27 19:43:31,185] Making new env: FrozenLake-v0



[41mS[0mFFF
FHFH
FFFH
HFFG
