In [1]:
import gym 
import random

In [2]:
class RandomActionWrapper(gym.ActionWrapper):
  """Replaces the agent's action with a random one with probability epsilon."""
  def __init__(self, env, epsilon=0.1):
    super(RandomActionWrapper, self).__init__(env)
    self.epsilon = epsilon   # epsilon is probability of a random action

  def action(self, action):
    # Called by ActionWrapper on every step() with the agent's action
    if random.random() < self.epsilon:
      print('Random!')
      return self.env.action_space.sample()
    return action

The action() method is the one we need to override from the parent class to
tweak the agent's actions. On every call we roll the die, and with probability
epsilon we sample a random action from the action space and return it instead of
the action the agent sent to us. Note that by relying on the action_space and
wrapper abstractions, we were able to write generic code that will work with any
environment from Gym.
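
To see the effect of epsilon in isolation, here is a minimal sketch of the same
decision rule that does not need Gym installed. ToySpace, maybe_randomize and
override_stats are hypothetical names introduced only for illustration (they are
not part of Gym), and the seeded random.Random instance is an assumption made so
that the run is repeatable. The sketch counts how often the agent's action is
overridden, which should come out close to epsilon.

```python
import random


class ToySpace:
    """Hypothetical stand-in for a discrete action space (not a Gym class)."""

    def __init__(self, n, rng):
        self.n = n
        self.rng = rng

    def sample(self):
        # Uniformly pick one of the n discrete actions.
        return self.rng.randrange(self.n)


def maybe_randomize(action, space, epsilon, rng):
    """Apply the epsilon rule; return (action_to_use, was_overridden)."""
    if rng.random() < epsilon:
        return space.sample(), True
    return action, False


def override_stats(epsilon=0.1, trials=10_000, seed=0):
    """Return the fraction of overridden and of actually changed actions."""
    rng = random.Random(seed)
    space = ToySpace(2, rng)
    overridden = 0
    changed = 0
    for _ in range(trials):
        # The agent always proposes action 0, as in the CartPole loop.
        new_action, was_overridden = maybe_randomize(0, space, epsilon, rng)
        overridden += was_overridden
        changed += new_action != 0
    return overridden / trials, changed / trials


if __name__ == "__main__":
    overridden, changed = override_stats()
    print("overridden: %.3f, actually changed: %.3f" % (overridden, changed))
```

Because the sampled action can coincide with the agent's own choice, the
fraction of steps on which the action actually changes is smaller than epsilon
(roughly half of it when there are only two actions).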

In [3]:
env = RandomActionWrapper(gym.make('CartPole-v0'))

Now it's time to apply our wrapper. We create a normal CartPole environment and
pass it to our wrapper's constructor. From here on, we use the wrapper as a
normal Env instance in place of the original CartPole. Because the Wrapper class
inherits from the Env class and exposes the same interface, we can nest wrappers
in any combination we want.
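
The nesting idea can be sketched without Gym installed. ToyEnv, ToyWrapper,
FlipAction, ScaleReward and run_episode below are hypothetical stand-ins that
only mimic the delegation pattern (the real gym.Wrapper does more than this).
The point they illustrate is that each wrapper exposes the same reset()/step()
interface as the object it wraps, so wrappers can be stacked in any order.

```python
class ToyEnv:
    """Hypothetical two-action environment: reward 1.0 for action 1, else 0.0."""

    def reset(self):
        self.steps = 0
        return 0  # dummy observation

    def step(self, action):
        self.steps += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.steps >= 3  # the episode ends after three steps
        return 0, reward, done, {}


class ToyWrapper:
    """Forwards reset() and step() to the wrapped object unchanged."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


class FlipAction(ToyWrapper):
    """Swaps actions 0 and 1 before passing them on."""

    def step(self, action):
        return self.env.step(1 - action)


class ScaleReward(ToyWrapper):
    """Multiplies every reward by a constant factor."""

    def __init__(self, env, factor):
        super().__init__(env)
        self.factor = factor

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs, reward * self.factor, done, info


def run_episode(env, action=0):
    """Send the same action until the episode ends; return the total reward."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(action)
        total += reward
    return total
```

In this toy setup, run_episode(ToyEnv()) collects no reward because action 0
pays nothing, whereas both ScaleReward(FlipAction(ToyEnv()), 10.0) and
FlipAction(ScaleReward(ToyEnv(), 10.0)) run through the same loop unchanged and
collect the flipped, scaled reward.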

In [7]:
obs = env.reset()
total_reward = 0.0
while True:
  # The agent always sends action 0; the wrapper occasionally replaces it
  obs, reward, done, _ = env.step(0)
  total_reward += reward
  if done:
    break
print("Reward got: %.2f" % total_reward)

Random!
Random!
Random!
Reward got: 13.00
