# How to use `gym` wrappers to modify environments

## Example: Modify the stepwise reward in `CartPole-v1`

<img src="images/cartpole.jpg" width="300"/>

- `CartPole-v1` gives a reward of 1 for every step until the episode reaches a terminal state.

In [1]:
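import gym

# Classic gym API (gym < 0.26): reset() returns the observation
# and step() returns (obs, reward, done, info).
env = gym.make("CartPole-v1")
obs = env.reset()
obs, r, done, _ = env.step(env.action_space.sample())  # take one random action
print(f"reward is {r}")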

reward is 1.0


## Modify `CartPole-v1`: Change the stepwise reward to 2 instead of 1

**Solution: Write a `gym` wrapper.**

In [3]:
class ScaleReward(gym.Wrapper):  # must inherit from gym.Wrapper
    def __init__(self, env, scaling_factor=2):
        super().__init__(env)
        self.scaling_factor = scaling_factor
        
    def step(self, action):
        obs, r, done, info = self.env.step(action)
        r *= self.scaling_factor
        return obs, r, done, info

## How to use the `ScaleReward` wrapper

In [4]:
original_env = gym.make("CartPole-v1")
double_reward_env = ScaleReward(env=original_env)
obs = double_reward_env.reset()
obs, r, done, _ = double_reward_env.step(double_reward_env.action_space.sample())
print(f"reward is {r}")

reward is 2.0


## Scaling rewards by any scaling factor

In [5]:
# Wrappers can accept arguments that change their behavior.
# The scaling_factor argument sets the factor by which ScaleReward multiplies each reward.
triple_reward_env = ScaleReward(original_env, scaling_factor=3)
obs = triple_reward_env.reset()
obs, r, done, _ = triple_reward_env.step(triple_reward_env.action_space.sample())
print(f"reward is {r}")

reward is 3.0


## How `gym.Wrapper`'s attributes and methods are mapped to the original environment

<img src="images/wrapper/wrapper.png" width="700"/>

- A bare `gym.Wrapper(env=original_env)` overrides nothing, so it behaves identically to `original_env`.

In [None]:
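class ScaleReward(gym.Wrapper):
    def __init__(self, env, scaling_factor=2):
        super().__init__(env)  # stores the wrapped environment as self.env
        self.scaling_factor = scaling_factor

    def step(self, action):
        # step() is overridden: call the wrapped env, then modify the reward
        obs, r, done, info = self.env.step(action)
        r *= self.scaling_factor
        return obs, r, done, info

    # Methods that are not overridden (reset, render, close, ...)
    # are forwarded to the wrapped env by gym.Wrapper.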
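
The mapping described above can be sketched in plain Python. This is a simplified stand-in rather than `gym`'s actual implementation: `ToyEnv`, `ForwardingWrapper`, and `ToyScaleReward` are hypothetical names, and the `__getattr__` fallback is just one way to forward anything the wrapper does not define itself.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment (not a real gym env)."""

    max_steps = 500  # an ordinary attribute on the original env

    def reset(self):
        return 0.0  # initial observation

    def step(self, action):
        # classic 4-tuple: observation, reward, done, info
        return 0.0, 1.0, False, {}


class ForwardingWrapper:
    """Simplified sketch of a wrapper; gym.Wrapper itself is more elaborate."""

    def __init__(self, env):
        self.env = env

    def __getattr__(self, name):
        # Called only when normal lookup fails on the wrapper,
        # so anything not overridden is read from the wrapped env.
        return getattr(self.env, name)


class ToyScaleReward(ForwardingWrapper):
    def __init__(self, env, scaling_factor=2):
        super().__init__(env)
        self.scaling_factor = scaling_factor

    def step(self, action):
        obs, r, done, info = self.env.step(action)
        return obs, r * self.scaling_factor, done, info


wrapped = ToyScaleReward(ToyEnv())
print(wrapped.max_steps)   # forwarded attribute -> 500
print(wrapped.reset())     # forwarded method    -> 0.0
print(wrapped.step(0)[1])  # overridden method   -> 2.0
```

Only `step` is defined on the wrapper, so it is the only call whose result differs from the original environment's.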

## Wrappers can be chained

- Since wrapped environments behave like any other `gym` env, you can pass a wrapped environment to a wrapper.

```
double_wrapped_env = wrapper(env=wrapped_env)
```
- This way, you can make progressive modifications to any environment.

In [7]:
double_wrapped_env = ScaleReward(ScaleReward(original_env, scaling_factor=2), scaling_factor=3)
obs = double_wrapped_env.reset()
obs, r, done, _ = double_wrapped_env.step(double_wrapped_env.action_space.sample())
print(f"reward is {r}")

reward is 6.0
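
Progressive modification can also be written as a loop over a list of wrappers. The sketch below is only an illustration: `apply_wrappers` is a hypothetical helper, not part of `gym`, and `ConstantRewardEnv` and `ToyScaleReward` are plain-Python stand-ins so that the snippet runs without `gym` installed. With real environments, the same helper should work with `ScaleReward` and a `gym` env in their place.

```python
from functools import partial, reduce


class ConstantRewardEnv:
    """Hypothetical stand-in env that always returns a reward of 1.0."""

    def reset(self):
        return 0.0

    def step(self, action):
        return 0.0, 1.0, False, {}


class ToyScaleReward:
    """Plain-Python analogue of the ScaleReward wrapper above."""

    def __init__(self, env, scaling_factor=2):
        self.env = env
        self.scaling_factor = scaling_factor

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, r, done, info = self.env.step(action)
        return obs, r * self.scaling_factor, done, info


def apply_wrappers(env, wrappers):
    """Wrap env with each callable in order; the last one ends up outermost."""
    return reduce(lambda wrapped, wrap: wrap(wrapped), wrappers, env)


chained_env = apply_wrappers(
    ConstantRewardEnv(),
    [
        partial(ToyScaleReward, scaling_factor=2),  # innermost: 1.0 -> 2.0
        partial(ToyScaleReward, scaling_factor=3),  # outermost: 2.0 -> 6.0
    ],
)
obs = chained_env.reset()
obs, r, done, _ = chained_env.step(0)
print(f"reward is {r}")  # reward is 6.0
```

`functools.partial` fixes each wrapper's arguments up front, so every list entry is a callable that takes only the environment to wrap.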
