First, let's install OpenAI Gym and its Atari environments:

In [3]:
!pip install "gym[atari]"



In [6]:
import gym
import time

# CartPole example

This example shows how to use an OpenAI Gym environment.
First, we choose and initialize the environment:

In [9]:
env = gym.make('CartPole-v0')
env.reset()

array([-0.0078811 , -0.01404202, -0.03591595, -0.0421268 ])

Then we iterate a large number of times. At each iteration, we do the following:
* Render the environment
* Take a random action and get the observation, the reward and the `done` variable (it indicates whether we lost or not)
* (Optional) Sleep for 20 ms to make the animation smoother
* (Optional) If we lost (`done == True`), reset the environment
    

In [21]:
for _ in range(1000):
    env.render()
    obs, rew, done, info = env.step(env.action_space.sample())  # take a random action
    time.sleep(0.02)  # 20 ms pause, makes the animation smoother
    # Comment out the two following lines to see cases of "failure"
    if done:
        env.reset()
env.close()
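The four-value unpacking above matches the gym releases this notebook was written for. Newer releases (gym 0.26 and later, and Gymnasium) return five values from `env.step()`, splitting `done` into `terminated` and `truncated`. As a minimal sketch, a small helper can accept both shapes; `unpack_step` is a hypothetical name, not part of gym:

```python
def unpack_step(result):
    """Normalize the tuple returned by env.step() to (obs, rew, done, info).

    Hypothetical helper: accepts the older 4-tuple and the newer 5-tuple.
    """
    if len(result) == 5:
        # Newer API: (obs, reward, terminated, truncated, info)
        obs, rew, terminated, truncated, info = result
        return obs, rew, terminated or truncated, info
    # Older API: (obs, reward, done, info)
    obs, rew, done, info = result
    return obs, rew, done, info


# Intended usage inside the loop (requires a gym environment):
# obs, rew, done, info = unpack_step(env.step(env.action_space.sample()))
```

The helper only covers `step()`; in the newer releases `env.reset()` also returns an `(obs, info)` pair instead of the observation alone.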

`obs` is an array of length 4, whose values are: <br/>
`[position of cart, velocity of cart, angle of pole, rotation rate of pole]`
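To avoid remembering which index is which, the four values can be given names. The sketch below uses the observation printed by `env.reset()` earlier as sample data; `CartPoleObs` and its field names are our own labels, not something gym provides:

```python
from collections import namedtuple

# Hypothetical container: field names follow the order listed above.
CartPoleObs = namedtuple(
    "CartPoleObs",
    ["cart_position", "cart_velocity", "pole_angle", "pole_rotation_rate"],
)

# Sample observation copied from the output of env.reset() above.
sample = [-0.0078811, -0.01404202, -0.03591595, -0.0421268]

labeled = CartPoleObs(*sample)
print(labeled.pole_angle)
```

With a live environment, `CartPoleObs(*obs)` should work the same way, since `obs` is a length-4 array.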

# BattleZone environment


Gym actually provides 12 BattleZone environments. The main difference between them is the observation type:
* The "normal" environments: observations are the frames of the game. They are bigger and more complex to analyse, but more intuitive.
* The RAM environments: observations are 128-byte arrays representing the RAM of the Atari console. They are lighter to use, but we do not know what each byte represents.

The game itself is provided as a binary, so we cannot modify it. Besides, the logic behind the spawning of enemies is hard to replicate, so we chose to use the OpenAI environments directly.

In [23]:
battlezone_envs = ['BattleZone-v0', 
        'BattleZone-v4',
        'BattleZoneDeterministic-v0', 
        'BattleZoneDeterministic-v4', 
        'BattleZoneNoFrameskip-v0', 
        'BattleZoneNoFrameskip-v4',
        'BattleZone-ram-v0',
        'BattleZone-ram-v4',
        'BattleZone-ramDeterministic-v0', 
        'BattleZone-ramDeterministic-v4', 
        'BattleZone-ramNoFrameskip-v0',
        'BattleZone-ramNoFrameskip-v4']
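The twelve names follow a regular pattern: an observation type (frames or `-ram`), a frame-skip variant (none, `Deterministic` or `NoFrameskip`) and a version suffix (`v0` or `v4`). As a sketch, the same list can be generated instead of typed by hand; the variable names here are our own, and the pattern is only read off the list above:

```python
from itertools import product

obs_types = ["", "-ram"]                         # frames vs. RAM observations
variants = ["", "Deterministic", "NoFrameskip"]  # frame-skip variants
versions = ["v0", "v4"]

# product() varies the last factor fastest, which reproduces the
# ordering of the hand-written list.
generated_envs = [
    f"BattleZone{obs_type}{variant}-{version}"
    for obs_type, variant, version in product(obs_types, variants, versions)
]

print(len(generated_envs))
print(generated_envs[0], generated_envs[6])
```

Indices 0 to 5 are then the frame-based environments and indices 6 to 11 the RAM ones, which is why the cells below use `env_number = 0` and `env_number = 6`.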

Normal environment:

In [35]:
env_number = 0
env = gym.make(battlezone_envs[env_number])

env.reset()
for _ in range(200):
    env.render()
    obs, rew, done, info = env.step(env.action_space.sample()) # take a random action
    time.sleep(0.02)
env.close()

In [29]:
obs.shape

(210, 160, 3)

RAM environment:

In [36]:
env_number = 6
env = gym.make(battlezone_envs[env_number])

env.reset()
for _ in range(100):
    env.render()
    obs, rew, done, info = env.step(env.action_space.sample()) # take a random action
    time.sleep(0.02)
env.close()

In [32]:
obs.shape

(128,)
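To put the two observation types side by side, we can count the values in each shape. This is plain arithmetic on the shapes printed above and does not need gym:

```python
import math

frame_shape = (210, 160, 3)  # shape of a "normal" observation
ram_shape = (128,)           # shape of a RAM observation

frame_values = math.prod(frame_shape)
ram_values = math.prod(ram_shape)
ratio = frame_values / ram_values

print(frame_values, ram_values, ratio)  # 100800 128 787.5
```

A frame observation therefore carries roughly 800 times more values per step than a RAM observation.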