# Getting Started with Gym
---

*The gym library is a collection of test problems, called environments, that you can use to develop and test your reinforcement learning algorithms. All environments share a common interface, so you can write general algorithms.*

In [1]:
import gym

In [2]:
env = gym.make("CartPole-v0")  # Or MountainCar-v0, MsPacman-v0
env.reset()

for _ in range(1000):
    env.render()
    # Take a random action; start a new episode once the current one ends
    _, _, done, _ = env.step(env.action_space.sample())
    if done:
        env.reset()

env.close()




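`step` returns four values:
- `observation` is an environment-specific object representing your observation of the environment
- `reward` is a `float` giving the reward earned by the previous action
- `done` is a `bool` that tells you whether the episode has ended and you need to `reset` the environment
- `info` is a `dict` of diagnostic information, useful for debugging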
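
Because every environment returns the same four values from `step`, an episode loop can be written once and reused. The following is a minimal sketch of that idea, not part of gym itself: `run_episode` and `CountdownEnv` are hypothetical names, and `CountdownEnv` is a toy stand-in that only mimics `reset` and `step` so the example runs without rendering anything.

```python
class CountdownEnv:
    """Hypothetical toy environment (not part of gym) with a gym-like interface."""

    def __init__(self, length=5):
        self.length = length
        self.remaining = length

    def reset(self):
        # Start a new episode and return the initial observation
        self.remaining = self.length
        return self.remaining

    def step(self, action):
        # The action is ignored here; each step simply counts down by one
        self.remaining -= 1
        observation = self.remaining
        reward = 1.0
        done = self.remaining <= 0
        info = {}
        return observation, reward, done, info


def run_episode(env, policy, max_steps=100):
    """Run one episode and return (total_reward, steps_taken)."""
    observation = env.reset()
    total_reward = 0.0
    for t in range(max_steps):
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            return total_reward, t + 1
    return total_reward, max_steps


# A constant policy that always picks action 0
total, steps = run_episode(CountdownEnv(length=5), policy=lambda obs: 0)
print(total, steps)  # 5.0 5
```

Under the same assumptions, passing a gym environment and `lambda obs: env.action_space.sample()` as the policy should give a random agent with the same loop.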

In [4]:
env = gym.make("CartPole-v0")

try:
    for i_episode in range(20):
        observation = env.reset()
        for t in range(100):
            env.render()
            print("Observation[{}]: {}".format(i_episode, observation))
            action = env.action_space.sample()
            observation, reward, done, info = env.step(action) # See above.
            if done:
                print("Episode finished after {} timesteps".format(t+1))
                break
finally:
    env.close()

Observation[0]: [-0.00965659 -0.02293371  0.04553614  0.01350137]
Observation[0]: [-0.01011526  0.17150662  0.04580617 -0.26447379]
Observation[0]: [-0.00668513 -0.02423819  0.04051669  0.04229774]
Observation[0]: [-0.00716989  0.17028005  0.04136265 -0.23733151]
Observation[0]: [-0.00376429 -0.02540767  0.03661602  0.06810616]
Observation[0]: [-0.00427244  0.16917071  0.03797814 -0.21280307]
Observation[0]: [-0.00088903  0.36372968  0.03372208 -0.49326819]
Observation[0]: [ 0.00638556  0.16814874  0.02385671 -0.19015116]
Observation[0]: [ 0.00974854 -0.02730624  0.02005369  0.10996109]
Observation[0]: [ 0.00920241 -0.22270973  0.02225291  0.40890278]
Observation[0]: [ 0.00474822 -0.02791024  0.03043097  0.1233177 ]
Observation[0]: [ 0.00419001 -0.22345465  0.03289732  0.42544384]
Observation[0]: [-2.79079155e-04 -4.19026767e-01  4.14061985e-02  7.28313356e-01]
Observation[0]: [-0.00865961 -0.61469591  0.05597247  1.03373545]
Observation[0]: [-0.02095353 -0.42036119  0.07664717  0.7591
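
For CartPole, the four numbers in each observation are the cart position, cart velocity, pole angle, and pole angular velocity, in that order. As a small sketch, the first observation printed above can be unpacked into named fields; `CartPoleObservation` is a hypothetical helper introduced here for readability, not something gym provides.

```python
from collections import namedtuple

# Hypothetical helper type; the field order follows CartPole's observation layout
CartPoleObservation = namedtuple(
    "CartPoleObservation",
    ["cart_position", "cart_velocity", "pole_angle", "pole_angular_velocity"],
)

# First observation from the output above
raw = [-0.00965659, -0.02293371, 0.04553614, 0.01350137]
obs = CartPoleObservation(*raw)

print(obs.cart_position)  # -0.00965659
print(obs.pole_angle)     # 0.04553614
```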

In [3]:
env = gym.make("CartPole-v0")
print(env.action_space)
print(env.observation_space)

Discrete(2)
Box(-3.4028234663852886e+38, 3.4028234663852886e+38, (4,), float32)
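
The output above says that CartPole has two discrete actions (`0` and `1`) and that each observation is an array of four `float32` values. The sketch below, which assumes a gym version where `CartPole-v0` is still registered, shows a few ways to inspect these spaces: `sample` draws a random element, `contains` checks membership, and `n`, `shape`, `low`, and `high` describe their size and bounds.

```python
import gym

env = gym.make("CartPole-v0")

# Discrete(2): the valid actions are the integers 0 and 1
action = env.action_space.sample()
n_actions = env.action_space.n
action_is_valid = env.action_space.contains(action)
out_of_range_is_valid = env.action_space.contains(2)

# Box with shape (4,): each observation is an array of 4 floats
obs_shape = env.observation_space.shape
obs_low = env.observation_space.low
obs_high = env.observation_space.high

print(n_actions, action_is_valid, out_of_range_is_valid)
print(obs_shape)
print(obs_low, obs_high)

env.close()
```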
