# Tutorial 3: Converters

In this tutorial, we will discuss [converters](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/index.html).

The following will be covered:
- The three different converters, i.e. [SpaceConverter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html), [Processor](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/processor.html) and [Converter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/converter.html)
- Specifying the parameters of converters
- Creating a custom [SpaceConverter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html)

The remainder of this tutorial covers these concepts in more detail.

At the end of this notebook you will find exercises.
For these exercises you will have to add or modify a few lines of code, which are marked by:

```python

# START EXERCISE [BLOCK_NUMBER]

# END EXERCISE [BLOCK_NUMBER]
```

## Pendulum Swing-up

We will create an environment for solving the classic control problem of swinging up an underactuated pendulum, very similar to the [Pendulum-v1 environment](https://www.gymlibrary.ml/environments/classic_control/pendulum/).
Our goal is to swing up this pendulum to the upright position and keep it there, while minimizing the velocity of the pendulum and the input voltage.

Since the dynamics of a pendulum actuated by a DC motor are well known, we can simulate the pendulum by integrating the corresponding ordinary differential equations (ODEs):


$\mathbf{x} = \begin{bmatrix} \theta \\ \dot{\theta} \end{bmatrix} \\ \dot{\mathbf{x}} = \begin{bmatrix} \dot{\theta} \\ \frac{1}{J}(\frac{K}{R}u - mgl \sin{\theta} - b \dot{\theta} - \frac{K^2}{R}\dot{\theta})\end{bmatrix}$

with $\theta$ the angle with respect to the upright position, $\dot{\theta}$ the angular velocity, $u$ the input voltage, $J$ the inertia, $m$ the mass, $g$ the gravitational constant, $l$ the length of the pendulum, $b$ the motor viscous friction constant, $K$ the motor constant and $R$ the electric resistance.
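
To make these dynamics concrete, the following minimal sketch evaluates the right-hand side exactly as written above and advances it with a single forward Euler step. The function names `pendulum_ode` and `euler_step` and all parameter values are assumptions chosen for illustration; they are not the values or the integrator used by the *OdeBridge*.

```python
import math


def pendulum_ode(x, u, J=1e-3, m=0.05, l=0.04, b=1e-5, K=0.05, R=10.0, g=9.81):
    """Return [theta_dot, theta_ddot] for state x = [theta, theta_dot] and voltage u.

    All default parameter values are placeholders chosen for illustration only.
    """
    theta, theta_dot = x
    theta_ddot = (
        K / R * u - m * g * l * math.sin(theta) - b * theta_dot - K**2 / R * theta_dot
    ) / J
    return [theta_dot, theta_ddot]


def euler_step(x, u, dt):
    """Advance the state by one forward Euler step of size dt."""
    dx = pendulum_ode(x, u)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]


# One step at 30 Hz from rest with a constant input of 1 V (hypothetical values).
x_next = euler_step([0.0, 0.0], u=1.0, dt=1 / 30)
```

Forward Euler is used here only because it is short; a dedicated ODE solver would normally be more accurate for the same step size.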

## Notebook Setup

In order to be able to run the code, we need to install the *eagerx_tutorials* package and ROS.

In [None]:
try:
    import eagerx_tutorials
except ImportError:
    !{"echo 'Installing eagerx-tutorials with pip.' && pip install eagerx-tutorials >> /tmp/eagerx_install.txt 2>&1"}
if 'google.colab' in str(get_ipython()):
    !{"curl 'https://raw.githubusercontent.com/eager-dev/eagerx_tutorials/master/scripts/setup_colab.sh' > ~/setup_colab.sh"}
    !{"bash ~/setup_colab.sh"}

# Setup interactive notebook
# Required in interactive notebooks only.
from eagerx_tutorials import helper
helper.setup_notebook()
env = None

# Allows reloading of registered entities from changed files
# Required in interactive notebooks only.
%reload_ext autoreload
%autoreload 1

## Let's get started

We start by importing the required packages and initializing EAGERx.

In [None]:
import eagerx
import eagerx_tutorials.pendulum  # Registers Pendulum
import eagerx_ode  # Registers OdeBridge

# Initialize eagerx (starts roscore if not already started.)
eagerx.initialize("eagerx_core")

We will again create an environment with the *Pendulum* object, like we did in the [first](https://colab.research.google.com/github/eager-dev/eagerx_tutorials/blob/master/tutorials/pendulum/1_environment_creation.ipynb) and [second](https://colab.research.google.com/github/eager-dev/eagerx_tutorials/blob/master/tutorials/pendulum/2_reset_and_step.ipynb) tutorials.
However, first we would like to clarify the converter types of EAGERx, i.e. [Converter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/converter.html), [SpaceConverter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html) and [Processor](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/processor.html).
The [Converter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/converter.html) converts messages from one message type into another.
The [SpaceConverter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html) makes it possible to connect entities to actions and observations and creates the appropriate [Gym spaces](https://gym.openai.com/docs/#spaces).
Finally, the [Processor](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/processor.html) converts messages without changing the message type.
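
The difference between the three types can be sketched in plain Python. The classes below are hypothetical stand-ins that do not use the EAGERx API; `FloatMsg` mimics a `Float32` message and a plain dictionary stands in for a Gym space. They only illustrate which kind of input and output each converter type deals with.

```python
from dataclasses import dataclass


@dataclass
class FloatMsg:
    """Hypothetical stand-in for a ROS Float32 message."""

    data: float


class ClipProcessor:
    """Processor-like: FloatMsg in, FloatMsg out (the message type is unchanged)."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def convert(self, msg):
        return FloatMsg(min(max(msg.data, self.low), self.high))


class FloatToListConverter:
    """Converter-like: FloatMsg in, list out (the message type changes)."""

    def convert(self, msg):
        return [msg.data]


class FloatSpaceConverter:
    """SpaceConverter-like: converts in both directions and describes the bounds."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def to_observation(self, msg):
        return [msg.data]

    def to_message(self, values):
        return FloatMsg(values[0])

    def get_space(self):
        # A plain dict stands in for a Gym space in this sketch.
        return {"low": [self.low], "high": [self.high]}


msg = FloatMsg(12.0)
clipped = ClipProcessor(-10.0, 10.0).convert(msg)
as_list = FloatToListConverter().convert(msg)
space = FloatSpaceConverter(-10.0, 10.0).get_space()
```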

Let's use the *Pendulum* object to illustrate this.
Remember that we can print information about an object as follows:

In [None]:
eagerx.Object.info("Pendulum")

The printed info shows, amongst other things, the sensors, actuators and states of the *Pendulum* and their corresponding message types.
For example, the sensor *theta* has message type [Float32](http://docs.ros.org/en/melodic/api/std_msgs/html/msg/Float32.html), while the sensor *image* has message type [Image](http://docs.ros.org/en/noetic/api/sensor_msgs/html/msg/Image.html).
When creating connections in the [graph](https://eagerx.readthedocs.io/en/master/guide/api_reference/graph/graph.html?highlight=graph) we should make sure that the correct message types are sent and received.
By using converters, we can ensure that data is received in each callback with the correct message type and format.

Converters can be specified during connection through the argument `converter` in the [connect method](https://eagerx.readthedocs.io/en/master/guide/api_reference/graph/graph.html#eagerx.core.graph.Graph.connect), but they can also be specified in the object definition.
Doing the latter for space converters makes sense for sensors and actuators, because it eases connecting them to observations and actions, respectively.

Let's make the *Pendulum* object and inspect the space converter we have defined for the sensor *theta*.

In [None]:
pendulum = eagerx.Object.make("Pendulum", "pendulum", actuators=["u"], sensors=["theta", "dtheta", "image"], states=["model_state"])
pendulum.sensors.theta

Here we see that the space converter *Space_Float32* is specified for the sensor *theta*, which converts between values of a Gym `Box` space and `Float32` messages.
Its parameters, such as `low` and `high`, can be modified easily.

In [None]:
pendulum.sensors.theta.space_converter.low = -10
pendulum.sensors.theta.space_converter.high = 10
pendulum.sensors.theta

However, this space converter is not ideal for unnormalized angles.
We can reduce the size of the observation space by normalizing the angle to $-\pi \le \theta < \pi$ before providing it to the agent.
So let's create our own space converter that does exactly this.
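
As a minimal standalone sketch of this normalization, the helper below maps any angle in radians to the interval $[-\pi, \pi)$. The function name `wrap_angle` is hypothetical; the expression is the same one that appears as a comment in the `B_to_A` method of the space converter in this tutorial.

```python
import math


def wrap_angle(theta):
    """Map an angle in radians to the interval [-pi, pi)."""
    return theta - 2 * math.pi * math.floor((theta + math.pi) / (2 * math.pi))


# Angles that differ by a full turn map to the same normalized value.
wrapped = [wrap_angle(th) for th in (0.3, 2 * math.pi + 0.5, -2 * math.pi - 0.5, 4.0)]
```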

We can create a space converter by inheriting from the class [SpaceConverter](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html).
This class has the following abstract methods we need to implement:

- [spec()](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html#eagerx.core.entities.SpaceConverter.spec): Specifies the parameters of the space converter.
- [initialize()](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html#eagerx.core.entities.SpaceConverter.initialize): Initializes the space converter.
- [A_to_B()](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html#eagerx.core.entities.SpaceConverter.A_to_B): Converts MSG_TYPE_A to MSG_TYPE_B.
- [B_to_A()](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html#eagerx.core.entities.SpaceConverter.B_to_A): Converts MSG_TYPE_B to MSG_TYPE_A.
- [get_space()](https://eagerx.readthedocs.io/en/master/guide/api_reference/converter/space_converter.html#eagerx.core.entities.SpaceConverter.get_space): Returns Gym Space.

and the following class properties should be set:

- MSG_TYPE_A
- MSG_TYPE_B

In [None]:
%%writefile space_converter.py

# ROS IMPORTS
from std_msgs.msg import Float32

# EAGERX IMPORTS
import eagerx
from eagerx.core.specs import ConverterSpec

# OTHER
import numpy as np
from gym.spaces import Box


class ExampleSpaceConverter(eagerx.SpaceConverter):
    MSG_TYPE_A = np.ndarray
    MSG_TYPE_B = Float32

    @staticmethod
    @eagerx.register.spec("ExampleSpaceConverter", eagerx.SpaceConverter)
    def spec(spec: ConverterSpec, low=None, high=None, dtype="float32"):
        # Initialize spec with default arguments
        spec.initialize(ExampleSpaceConverter)
        spec.config.update(low=low, high=high, dtype=dtype)

    def initialize(self, low=None, high=None, dtype="float32"):
        self.low = np.array(low, dtype=dtype)
        self.high = np.array(high, dtype=dtype)
        self.dtype = dtype

    def get_space(self):
        return Box(self.low, self.high, dtype=self.dtype)

    def A_to_B(self, msg):
        # Only the Float32 -> ndarray direction (B_to_A) is needed in this example
        raise NotImplementedError()

    def B_to_A(self, msg_b):
        th = msg_b.data
        
        # START EXERCISE 1.1
        # th -= 2 * np.pi * np.floor((th + np.pi) / (2 * np.pi))
        msg_a = np.array([np.sin(th), np.cos(th)], dtype=self.dtype)
        # END EXERCISE 1.1
        
        return msg_a

We can make this space converter in the same way we make objects and then assign it to the *theta* sensor:

In [None]:
%aimport space_converter
import space_converter
import numpy as np

# START EXERCISE 1.2
space_converter = eagerx.SpaceConverter.make("ExampleSpaceConverter", low=[-1, -1], high=[1, 1])
# END EXERCISE 1.2

pendulum.sensors.theta.space_converter = space_converter

Next we will construct the graph with the *Pendulum* similar to the previous tutorials.

In [None]:
# Define rate in Hz
rate = 30.0

# Initialize empty graph
graph = eagerx.Graph.create()

# Add pendulum to the graph
graph.add(pendulum)

# Connect the pendulum to an action and observation
# We will now explicitly set the window size
graph.connect(action="voltage", target=pendulum.actuators.u, window=1)
graph.connect(source=pendulum.sensors.theta, observation="angle", window=1)
graph.connect(source=pendulum.sensors.dtheta, observation="angular_velocity", window=1)

# Render image
graph.render(source=pendulum.sensors.image, rate=rate)

# Make OdeBridge
bridge = eagerx.Bridge.make("OdeBridge", rate=rate)

Finally, we will initialize the environment and train the agent using [Stable Baselines3](https://stable-baselines3.readthedocs.io/en/master/), again similar to the first two tutorials.

In [None]:
from typing import Dict
import stable_baselines3 as sb
from stable_baselines3.common.env_checker import check_env
from eagerx.wrappers import Flatten


# Define step function
def step_fn(prev_obs: Dict[str, np.ndarray], obs: Dict[str, np.ndarray], action: Dict[str, np.ndarray], steps: int):
    
    # Get angle and angular velocity
    # Take first element because of window size (covered in other tutorial)
    
    # START EXERCISE 1.3
    sin_th, cos_th = obs["angle"][0]
    th = np.arctan2(sin_th, cos_th)
    # END EXERCISE 1.3
    
    thdot = obs["angular_velocity"][0]
    
    # Convert from numpy array to float
    u = float(action["voltage"])
    
    # Calculate cost
    # Penalize angle error, angular velocity and input voltage
    cost = th**2 + 0.1 * thdot**2 + 0.001 * u**2  
    
    # Determine when the episode is over
    # Currently just a timeout after 100 steps
    done = steps > 100
    
    # Set info, tell the algorithm the termination was due to a timeout
    # (the episode was truncated)
    info = {"TimeLimit.truncated": steps > 100}
    
    return obs, -cost, done, info

# Initialize Environment
env = eagerx.EagerxEnv(name="PendulumEnv", rate=rate, graph=graph, bridge=bridge, step_fn=step_fn)

# Toggle render
env.render("human")

# Stable Baselines3 expects flattened actions & observations
# Convert observation and action space from Dict() to Box()
env = Flatten(env)

# Check that env follows Gym API and returns expected shapes
check_env(env)

# Initialize learner
model = sb.SAC("MlpPolicy", env, verbose=1)

# Train for 1 minute (sim time)
model.learn(total_timesteps=int(60 * rate))

env.shutdown()
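
To get a feel for the reward returned by `step_fn` above, its cost term can be evaluated on its own. This is a minimal sketch: the helper name `swing_up_cost` is hypothetical and only repeats the expression used in `step_fn`, with $\theta$ measured with respect to the upright position as defined earlier.

```python
import math


def swing_up_cost(th, thdot, u):
    """Cost from step_fn: penalize angle error, angular velocity and input voltage."""
    return th**2 + 0.1 * thdot**2 + 0.001 * u**2


# Upright and at rest with zero input: no cost, so the reward is zero.
reward_upright = -swing_up_cost(0.0, 0.0, 0.0)

# Hanging down (theta = pi) at rest: the angle term dominates the cost.
reward_down = -swing_up_cost(math.pi, 0.0, 0.0)
```

The weights show that the angle error matters most, followed by the angular velocity, while the input voltage is only lightly penalized.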

# Exercises

In these exercises you will improve the sample efficiency of the learning problem by modifying the space converter.

For these exercises, you will need to modify or add some lines of code in the cells above.
These lines are indicated by the following comments:

```python
# START EXERCISE [BLOCK_NUMBER]

# END EXERCISE [BLOCK_NUMBER]
```

However, feel free to play with the other code as well if you are interested.
We recommend that you restart and rerun all code after each section (in Colab there is the option *Restart and run all* under *Runtime*).


## 1. Angle Decomposition

In the code as provided above, we reduced the observation space by normalizing $\theta$.
This will improve the sample efficiency, but we can do even better.
Normalizing $\theta$ results in discontinuous observations of $\theta$, i.e. the sign flips when the angle increases past $\pi$ or decreases below $-\pi$.
Many (reinforcement) learning algorithms have difficulties with such discontinuities.
Therefore, it is better to choose a representation for $\theta$ without discontinuities, e.g. its sine and cosine components: $[\sin(\theta), \cos(\theta)]$.
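
The effect can be checked numerically with a short standalone sketch. The helper names `decompose` and `reconstruct` are hypothetical and the code does not depend on EAGERx. It compares how far each representation moves when the angle crosses $\pi$ by a small amount.

```python
import math


def decompose(theta):
    """Represent an angle by its sine and cosine components."""
    return math.sin(theta), math.cos(theta)


def reconstruct(sin_th, cos_th):
    """Recover the angle in [-pi, pi] from its components."""
    return math.atan2(sin_th, cos_th)


eps = 1e-3
before, after = math.pi - eps, math.pi + eps

# The reconstructed (normalized) angle jumps by almost 2*pi across the boundary ...
jump_normalized = abs(reconstruct(*decompose(after)) - reconstruct(*decompose(before)))

# ... while the decomposed representation moves by only about 2*eps.
s0, c0 = decompose(before)
s1, c1 = decompose(after)
jump_decomposed = math.hypot(s1 - s0, c1 - c0)
```

Because `math.atan2` inverts the decomposition, the angle can still be recovered when it is needed, for example to compute a cost.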


### Add your code to the following blocks: 

1.1 Instead of the normalized angle, the `B_to_A` method should return the decomposed angle: $[\sin(\theta), \cos(\theta)]$.  
1.2 The values of `low` and `high` of the Gym space should be updated accordingly.  
1.3 The function `step_fn` should be updated as well. Reconstruct $\theta$, since it is no longer observed directly by the agent.  