# Wrappers

## Non-Evaluable Practical Exercises

This is a non-evaluable practical exercise, but it is recommended that students complete it fully and individually, since it is an important part of the learning process.

The solution is provided, but students are advised not to consult it until they have completed the exercise.

## Defining the base environment

In this activity, we extend the functionality of the [Frozen Lake](https://gymnasium.farama.org/environments/toy_text/frozen_lake/) environment by using [wrappers](https://gymnasium.farama.org/api/wrappers/).

The following code runs one episode and prints the observation at each time step $t$. As can be seen, the observation is an **integer** in the range $[0, n^2-1]$, where $n$ is the grid size.

Suppose we want to implement an agent that needs its observations as a **binary vector** in which exactly one position is 1 and all the others are 0. To achieve this, we define a **wrapper** that takes the environment's `Discrete(16)` observation space and exposes a new environment whose states use a [one-hot](https://es.wikipedia.org/wiki/One-hot) encoding, suitable, for example, as input to a neural network.
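The encoding itself can be sketched in isolation, before any environment is wrapped. The helper name `one_hot` and the choice of `uint8` as the element type are illustrative assumptions, not part of the exercise statement.

```python
import numpy as np

def one_hot(state: int, n_states: int) -> np.ndarray:
    """Return a length-n_states vector with a 1 at index `state` and 0 elsewhere."""
    if not 0 <= state < n_states:
        raise ValueError(f"state {state} is outside [0, {n_states - 1}]")
    vec = np.zeros(n_states, dtype=np.uint8)  # all positions start at 0
    vec[state] = 1                            # mark the active state
    return vec

# State 4 on a 16-state grid
print(one_hot(4, 16))  # [0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
```

The wrapper asked for in the question only has to apply this kind of conversion to every observation the environment returns.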

<u>Questions</u>:
1. **Modify the environment** to fulfil the above description.

In [5]:
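import gymnasium as gym

env = gym.make("FrozenLake-v1")
# reset() returns an (observation, info) tuple in Gymnasium
obs, info = env.reset()
done = False

while not done:
    # Sample a random action and advance the environment by one step
    act = env.action_space.sample()
    obs, rew, terminated, truncated, _ = env.step(act)
    # The episode ends on termination or on truncation (time limit)
    done = terminated or truncated

    print(obs)

print("Episode finished!")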

0
0
4
8
12
Episode finished!


In [6]:
print(env.observation_space)
print(env.observation_space.sample())

Discrete(16)
7
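To read the integer observations, recall that Frozen Lake numbers the cells row by row, so a state index equals `row * n + col`. The sketch below converts between the two representations for the default $4 \times 4$ grid; the function names `state_to_cell` and `cell_to_state` are hypothetical helpers introduced only for illustration.

```python
def state_to_cell(state: int, n: int = 4) -> tuple[int, int]:
    """Convert a flat state index into (row, col) on an n x n grid."""
    return divmod(state, n)

def cell_to_state(row: int, col: int, n: int = 4) -> int:
    """Convert (row, col) back into the flat state index."""
    return row * n + col

# Observations printed by the episode above
for s in [0, 0, 4, 8, 12]:
    print(s, "->", state_to_cell(s))
```

For the episode shown earlier, the agent stays in column 0 and moves down one row at a time, from `(0, 0)` to `(3, 0)`.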


<div class="alert alert-block alert-danger">
<strong>Solution</strong>
</div>

- In this example we define the class `DiscreteToBoxWrapper`, derived from `ObservationWrapper`, and override the observation space, setting it to `Box(low=0, high=1, shape=(16,), dtype=np.uint8)`.
- We create the environment through this _wrapper_.
- The main loop is unchanged: the _wrapper_ applies the conversion **transparently**, so the rest of the code does not need to be modified.

In [7]:
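import numpy as np

class DiscreteToBoxWrapper(gym.ObservationWrapper):
    """Expose the Discrete(n) observation as a one-hot vector of length n."""

    def __init__(self, env):
        super().__init__(env)
        # Number of discrete states in the wrapped environment (16 for the 4x4 map)
        self.n = self.observation_space.n
        # Declare the new observation space: n entries, each in [0, 1]
        self.observation_space = gym.spaces.Box(0, 1, (self.n,), dtype=np.uint8)

    def observation(self, obs):
        # Zero vector of length n (np.zeros defaults to float64) with a 1 at index obs
        new_obs = np.zeros(self.n)
        new_obs[obs] = 1
        print("{:2d} -> {}".format(obs, new_obs))
        return new_obs

env = DiscreteToBoxWrapper(gym.make("FrozenLake-v1"))

# reset() returns (observation, info); the wrapper also converts this first observation
obs, info = env.reset()
done = False
while not done:
    act = env.action_space.sample()
    obs, rew, terminated, truncated, _ = env.step(act)
    done = terminated or truncated

print("Episode finished!")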

 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 4 -> [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 1 -> [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 1 -> [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 2 -> [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 2 -> [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 2 -> [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 1 -> [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 1 -> [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 0 -> [1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 4 -> [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.

In [8]:
print(env.observation_space)
print(env.observation_space.sample())

Box(0, 1, (16,), uint8)
[0 1 0 1 0 0 1 0 1 1 0 0 0 0 0 1]
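One detail is worth checking in the solution: the space is declared with `dtype=np.uint8`, while `np.zeros(self.n)` produces `float64` values, which is why the printed vectors show `1.` and `0.`. The sketch below uses only NumPy to compare the two element types. Passing `dtype=np.uint8` to `np.zeros` inside `observation` would be one way to keep them aligned; this is a suggestion under that assumption, not a change to the solution above.

```python
import numpy as np

n = 16

# What the wrapper currently builds: np.zeros defaults to float64
default_vec = np.zeros(n)
default_vec[4] = 1

# Variant whose element type matches the declared uint8 space
aligned_vec = np.zeros(n, dtype=np.uint8)
aligned_vec[4] = 1

# A float64 array cannot be safely cast to uint8; a uint8 array trivially can
print(default_vec.dtype, np.can_cast(default_vec.dtype, np.uint8))  # float64 False
print(aligned_vec.dtype, np.can_cast(aligned_vec.dtype, np.uint8))  # uint8 True
```

Note also that the vector returned by `sample()` above contains several ones: the `Box` space only bounds each entry to $[0, 1]$, so a random sample is not necessarily a valid one-hot state.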
