# Tasks

1. Modify the environment so that observations have a standard range, such as $\left( -1, 1 \right)$ or $\left( 0, 1 \right)$.
2. Modify the environment so that actions have a standard range, such as $\left( -1, 1 \right)$ or $\left( 0, 1 \right)$.

## RL algos work better/faster when rewards are non-sparse and have low variance

| Variable | Good variance | Acceptable variance | Bad variance |
| --- | --- | --- | --- |
| rewards | 1 | 10 | 1000 |
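
To see which column an environment falls into, one option is to estimate the variance of its rewards from a sample. The sketch below uses a made-up list of rewards and a hypothetical `reward_scale` constant (neither comes from the inventory environment). It only illustrates that dividing rewards by a constant $c$ divides their variance by $c^2$.

```python
from statistics import pvariance

# Made-up sample of per-step rewards, for illustration only.
rewards = [0.0, 100.0, -50.0, 30.0, -80.0]

# Hypothetical scale constant; in practice it would be chosen per environment.
reward_scale = 50.0
scaled_rewards = [r / reward_scale for r in rewards]

raw_variance = pvariance(rewards)
scaled_variance = pvariance(scaled_rewards)

# Dividing rewards by c divides their variance by c ** 2.
print(raw_variance, scaled_variance)
```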

## Option 1: Write a `gym` wrapper

1. Set the wrapper's action space to `Box(low=np.array([-1,]), high=np.array([1,]))` $\rightarrow$ the wrapped environment will accept actions in this standard range.
2. In the wrapper's `step()` method, map the action in the standard range to the original environment's action space.

<img src="images/action_norm/1.png" width="500"/>

3. Use the mapped action to step through the original environment.
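
The three steps above can be sketched as follows. This is a minimal, library-free illustration rather than the actual wrapper: `ToyEnv`, its bounds `action_low` and `action_high`, `to_original_range` and `NormalizedActionEnv` are all hypothetical names. With `gym`, the same mapping would typically live in a subclass of `gym.ActionWrapper` that overrides `action()`.

```python
class ToyEnv:
    """Hypothetical stand-in for the original environment; actions lie in [0, 50]."""

    action_low = 0.0
    action_high = 50.0

    def step(self, action):
        # Echo the received action back as the observation so the mapping is visible.
        return action, 0.0, False, {}


def to_original_range(action, low, high):
    """Affinely map an action from [-1, 1] to [low, high]."""
    return low + (action + 1.0) * (high - low) / 2.0


class NormalizedActionEnv:
    """Accepts actions in [-1, 1] and forwards them in the wrapped env's own range."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        # Step 2: map the normalized action to the original action space.
        mapped = to_original_range(action, self.env.action_low, self.env.action_high)
        # Step 3: step through the original environment with the mapped action.
        return self.env.step(mapped)


env = NormalizedActionEnv(ToyEnv())
obs, reward, done, info = env.step(0.0)  # the midpoint of [-1, 1]
```

Because `to_original_range` uses only arithmetic, it should also work elementwise on NumPy arrays such as `np.array([0.5])`.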

## Option 2: Use `gym`'s built-in `RescaleAction` wrapper

In [6]:
from gym.wrappers import RescaleAction
import numpy as np

from inventory_env.inventory_env import InventoryEnv

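# Wrap the environment so that it accepts actions in [-1, 1];
# RescaleAction maps them to the original action space.
normalized_action_env = RescaleAction(
    InventoryEnv(), min_action=np.array([-1]), max_action=np.array([1])
)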

`gym` has a lot of useful built-in wrappers. Check them out here:

- https://github.com/openai/gym/tree/v0.21.0/gym/wrappers
- https://www.gymlibrary.dev/api/wrappers/ (careful: this documents the latest version of `gym`, not the one we are using)

## You can chain wrappers 

In [7]:
from inventory_env.wrappers import MyNormalizeObservation

# The inner wrapper normalizes observations; the outer one rescales actions.
norm_action_obs_env = RescaleAction(
    MyNormalizeObservation(InventoryEnv()),
    min_action=np.array([-1]),
    max_action=np.array([1]),
)

## Option 3: `rllib` implements action normalization by default

- The `"normalize_actions"` key in the [common algorithm configuration](https://docs.ray.io/en/releases-1.11.1/rllib/rllib-training.html#common-parameters) is set to `True` by default.

<img src="images/action_norm/2.png" width="700"/>

- When using `rllib` for learning, we don't need to use any `gym` wrappers for action normalization.
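
For reference, here is how the setting could be spelled out in a config dictionary. This is only a sketch: the dictionary is built by hand, nothing is imported from `rllib`, and the trainer call in the closing comment is an assumption about how such a config is typically used.

```python
# Config fragment; every key not listed here keeps its RLlib default.
config = {
    # Already the default; written out only to make the behaviour explicit.
    "normalize_actions": True,
}

# Assumed usage (not run here): pass the dictionary to a trainer, e.g.
# PPOTrainer(env=InventoryEnv, config=config)
```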