# Tutorial to create a customized PMSM motor environment

This example notebook aims to provide an introduction to the usage of the gym-electric-motor (GEM) toolbox. The first section provides a quick start guide to use the GEM toolbox.
Further sections provide a step by step guide to customize the different features offered by the toolbox and its application. In this example, a guide to create a customized discrete permanent magnet synchronous motor (PMSM) environment is presented.

<!-- The following code snippets are only needed if you are executing this file directly from a cloned GitHub repository where you don't have GEM installed -->

## 1.    A Brief Introduction to GEM

The gym-electric-motor (GEM) package is a software toolbox for the simulation of different electric motors.
The toolbox is built upon the OpenAI gym environments for reinforcement learning. Therefore, the toolbox is specifically designed for running reinforcement learning algorithms to train agents controlling electric motors.
Besides electrical motors, converters and load models are also implemented.

The components of GEM are structured as shown in the figure below:

![Motor Setup](img/SCML_Setting.svg)


### 1.1  Installation
Before you can start, you need to make sure that you have gym-electric-motor installed. You can install it easily using pip:

<code>
pip install gym-electric-motor
</code>
    
Alternatively, you can install the latest development version directly from GitHub:
https://github.com/upb-lea/gym-electric-motor


### 1.2 Basic PMSM environment
As a quick start, the following example creates a basic motor environment.
Calling the gem.make() function with the environment ID of the required electric motor creates a motor environment with default parameters and settings for all relevant sub-components.
More information on the motor environment ID and the default parameters can be found in the [documentation.](https://upb-lea.github.io/gym-electric-motor/parts/environments/environment.html)

The following sections provide a detailed guide to customizing the individual sub-components of the motor environment.


In [2]:
import gym_electric_motor as gem

basic_env = gem.make("PMSMDisc-v1")  # pass the motor environment ID as an argument to gem.make()
basic_env

<gym_electric_motor.envs.gym_pmsm.perm_mag_syn_motor_env.DiscPermanentMagnetSynchronousMotorEnvironment at 0x7f2e20848320>

## 2.  Physical System
The system consists of a voltage supply, a power electronic converter, an electrical motor and the mechanical load (SCML), as shown in the figure above. Additionally, each SCML system has an ODE solver for the simulation.


### 2.1 Voltage Supply

The voltage supply module of the GEM toolbox provides both DC and AC voltage supplies. 
- The DC supplies provided are ideal and non-ideal DC voltage sources.
- The AC supplies provided are single-phase and three-phase AC sources.

More documentation regarding the voltage supplies of GEM can be found [here.](https://upb-lea.github.io/gym-electric-motor/parts/physical_systems/voltage_supplies/voltage_supply.html)  
For the PMSM environment example, a non-ideal DC voltage supply is created. 
Here, the DC-link is modeled as an RC-circuit loaded from an ideal DC voltage source. This is illustrated in the figure below.


<img src="img/non-ideal-supply.png" alt="non-ideal-supply" style="width: 600px;"/>

The non-ideal DC supply in GEM is named 'RCVoltageSupply', and the supply_parameter dictionary consists of the resistance R in ohms and the capacitance C in farads.


In [3]:
supply = 'RCVoltageSupply'
supply_parameter=dict(R=10, C=4e-3)  # R and C values here are not realistic. 
                                     # Values here are just for reference
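
To make the RC model of the DC-link more concrete, the following is a minimal sketch in plain Python and does not use the GEM implementation. It assumes the capacitor is charged through R from an ideal source u_0 while a constant load current i_load is drawn from it, i.e. C * du/dt = (u_0 - u) / R - i_load. The function name, the source voltage of 350 V and the load current are illustrative assumptions.

```python
def simulate_rc_supply(u_0, R, C, i_load, tau, n_steps):
    """Explicit-Euler simulation of the DC-link voltage of an RC supply.

    Assumed model: C * du/dt = (u_0 - u) / R - i_load
    """
    u = u_0  # start with a fully charged DC-link
    trajectory = [u]
    for _ in range(n_steps):
        du_dt = (u_0 - u - R * i_load) / (R * C)
        u += tau * du_dt
        trajectory.append(u)
    return trajectory


# R and C as in the cell above (time constant R * C = 40 ms).
# u_0 and i_load are illustrative values only.
voltages = simulate_rc_supply(u_0=350.0, R=10.0, C=4e-3, i_load=5.0,
                              tau=1e-4, n_steps=5000)
print(f"DC-link voltage after 0.5 s: {voltages[-1]:.2f} V")
```

Under this model, the DC-link voltage settles towards u_0 - R * i_load, so a larger R or a larger load current leads to a larger voltage drop.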

### 2.2 Converter and Motor
The converters are divided into two classes: the continuously controlled and the discretely controlled converters.

In the continuous case, the converter's output voltage is modulated directly by means of a duty cycle. In the discrete case, the converter's output voltage is determined by the state of the converter switches at a given instant, so only a finite number of options is available. This environment uses the discrete B6 bridge converter, whose three half-bridges each have two switching states, which gives a total of eight possible actions.
<!-- ![Motor Setup](img/B6.svg) -->
<img src="img/B6.svg" alt="non-ideal-supply" style="width: 400px;"/>
<br>
<br>
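
The eight switching states of the B6 bridge can be enumerated with a short sketch. It assumes a star-connected, symmetric load, for which the phase-to-neutral voltage of phase a is u_dc / 3 * (2 * s_a - s_b - s_c). The numbering of the actions below is hypothetical and need not match the action encoding used by GEM.

```python
from itertools import product


def b6_phase_voltages(switch_state, u_dc):
    """Phase-to-neutral voltages of a symmetric star-connected load.

    switch_state: tuple (s_a, s_b, s_c); 1 means the upper switch of that
    leg conducts, 0 means the lower switch conducts.
    """
    s_a, s_b, s_c = switch_state
    return (
        u_dc / 3 * (2 * s_a - s_b - s_c),
        u_dc / 3 * (2 * s_b - s_c - s_a),
        u_dc / 3 * (2 * s_c - s_a - s_b),
    )


# Hypothetical action numbering: the action index is the position in this list.
switching_states = list(product((0, 1), repeat=3))
for action, legs in enumerate(switching_states):
    u_a, u_b, u_c = b6_phase_voltages(legs, u_dc=350.0)
    print(f"action {action}: legs {legs} -> "
          f"u_a={u_a:7.1f} V, u_b={u_b:7.1f} V, u_c={u_c:7.1f} V")
```

Two of the eight states, (0, 0, 0) and (1, 1, 1), apply zero voltage to all phases; the remaining six are the active voltage vectors.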


The electric motor is the **Permanent Magnet Synchronous Motor**.
The motor schematic is the following:


<!-- ![Motor Setup](img/ESBdq1.svg) -->
<img src="img/ESBdq1.svg" alt="non-ideal-supply" style="width: 400px;"/>

<br>
<br>

And the electrical ODEs for that motor are:

<h3 align="center">

$\frac{\mathrm{d}i_{sq}}{\mathrm{d}t} = \frac{u_{sq}-p\omega_{me}(L_di_{sd}+\Psi_p)-R_si_{sq}}{L_q}$

$\frac{\mathrm{d}i_{sd}}{\mathrm{d}t} = \frac{u_{sd}+pL_q\omega_{me}i_{sq}-R_si_{sd}}{L_d}$

$\frac{\mathrm{d}\epsilon_{el}}{\mathrm{d}t} = p\omega_{me}$

</h3>
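
As a rough illustration of how such ODEs are integrated, the sketch below performs one explicit Euler step. It is written in the common textbook form of the dq model, which includes the permanent-magnet flux psi_p in the q-axis equation and a positive cross-coupling term in the d-axis equation; treat these terms and signs as assumptions of this sketch rather than as the GEM implementation. The function names are hypothetical, and the parameter values are the ones used for the motor in this tutorial.

```python
def pmsm_derivatives(i_sd, i_sq, u_sd, u_sq, omega_me, p, r_s, l_d, l_q, psi_p):
    """Right-hand side of the dq-frame PMSM equations (textbook form)."""
    omega_el = p * omega_me  # electrical angular velocity
    di_sd = (u_sd + omega_el * l_q * i_sq - r_s * i_sd) / l_d
    di_sq = (u_sq - omega_el * (l_d * i_sd + psi_p) - r_s * i_sq) / l_q
    d_epsilon = omega_el
    return di_sd, di_sq, d_epsilon


def euler_step(state, u_sd, u_sq, omega_me, tau, **params):
    """One explicit Euler step for the state (i_sd, i_sq, epsilon_el)."""
    i_sd, i_sq, epsilon = state
    di_sd, di_sq, d_epsilon = pmsm_derivatives(
        i_sd, i_sq, u_sd, u_sq, omega_me, **params)
    return (i_sd + tau * di_sd, i_sq + tau * di_sq, epsilon + tau * d_epsilon)


motor_params = dict(p=3, r_s=17.932e-3, l_d=0.37e-3, l_q=1.2e-3, psi_p=65.65e-3)

# Short-circuited motor (zero voltages) spinning at 100 rad/s, starting at zero current.
next_state = euler_step((0.0, 0.0, 0.0), u_sd=0.0, u_sq=0.0,
                        omega_me=100.0, tau=1e-5, **motor_params)
print(next_state)
```

With zero applied voltage, the back-EMF of the spinning rotor drives a negative q-axis current, which is the expected braking behavior in this model.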


The motor environment ID for the discrete PMSM motor is **"PMSMDisc-v1"**. This is later passed to the 'make' function to create the motor environment. The environment IDs for other available motor environments can be found in the [documentation.](https://upb-lea.github.io/gym-electric-motor/parts/environments/environment.html)
The parameters of the specific motor are passed by the user as a motor parameter dictionary. Default parameters are used if no motor parameters are provided. The default converter for the environment ID used here is the discrete B6 bridge converter. More details can be found in the [documentation.](https://upb-lea.github.io/gym-electric-motor/parts/physical_systems/converters/B6C.html) <br>
The nominal and limit values, which define the operating region of the motor, are also passed as dictionaries.

In [4]:
import numpy as np

motor_env_id = "PMSMDisc-v1"
tau = 1e-5    # The duration of each sampling step
motor_parameter = dict(p=3,  # [p] = 1, nb of pole pairs
                       r_s=17.932e-3,  # [r_s] = Ohm, stator resistance
                       l_d=0.37e-3,  # [l_d] = H, d-axis inductance
                       l_q=1.2e-3,  # [l_q] = H, q-axis inductance
                       psi_p=65.65e-3,  # [psi_p] = Vs, magnetic flux of the permanent magnet
                       )  # BRUSA

nominal_values=dict(omega=4000*2*np.pi/60,  # angular velocity: 4000 rpm converted to rad/s
                    i=230,                  # motor current in amps
                    u=350                   # nominal voltage in volts
                    )
# limit values are taken as 1.3 times the nominal values in this case.
limit_values = {key: 1.3 * nomin for key, nomin in nominal_values.items()}
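
The arithmetic behind these values can be checked with a small standalone sketch. The helper names are hypothetical, and the normalization by the limit value is shown only to illustrate why a state stays within [-1, 1] as long as it respects its limit.

```python
import math


def rpm_to_rad_per_s(rpm):
    """Convert a rotational speed from revolutions per minute to rad/s."""
    return rpm * 2 * math.pi / 60


nominal = {"omega": rpm_to_rad_per_s(4000), "i": 230, "u": 350}
limits = {key: 1.3 * value for key, value in nominal.items()}


def normalize(value, limit):
    """Scale a physical quantity by its limit value (hypothetical helper)."""
    return value / limit


for key in nominal:
    print(f"{key}: nominal={nominal[key]:.2f}, limit={limits[key]:.2f}, "
          f"nominal/limit={normalize(nominal[key], limits[key]):.3f}")
```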


### 2.3 Motor state initializer
By default, the motor states (e.g. motor currents, rotational speed) are set to zero whenever the motor environment is reset. In order to generate diverse episodes, the motor state initializer can be used to draw the initial state values from a given probability distribution within the nominal operating range of the given motor.

The 'motor_initializer' is a dictionary that consists of the type of distribution, for example a uniform or Gaussian distribution, and the interval of values within the nominal operating region.
Here, the motor states, i.e. $i_{sd}$, $i_{sq}$ and motor angle are initialized with values sampled from a uniform distribution from the provided intervals. $i_{sd}$ and $i_{sq}$ are the motor currents in the d-q coordinate system. More details on the d-q coordinate system can be found [here.](https://en.wikipedia.org/wiki/Direct-quadrature-zero_transformation#:~:text=The%20direct%2Dquadrature%2Dzero%20,an%20effort%20to%20simplify%20analysis)


In [5]:
motor_initializer={'random_init': 'uniform', 'interval': [[-230, 230], [-230, 230], [-np.pi, np.pi]]}  
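
What a uniform initializer does conceptually can be sketched in a few lines of plain Python. The function below is a hypothetical stand-in, not the GEM implementation: it draws one value per state from the corresponding interval.

```python
import math
import random


def sample_initial_state(intervals, rng):
    """Draw one uniformly distributed value from each [low, high] interval."""
    return [rng.uniform(low, high) for low, high in intervals]


rng = random.Random(0)  # seeded for reproducibility
intervals = [[-230, 230], [-230, 230], [-math.pi, math.pi]]  # i_sd, i_sq, angle
initial_state = sample_initial_state(intervals, rng)
print(initial_state)
```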


### 2.4 Mechanical Load
The attached mechanical load in the GEM toolbox is represented by the function: <br>

$ T_L(\omega_{me}) = \mathrm{sign}(\omega_{me})\left(c\omega_{me}^2 + \mathrm{sign}(\omega_{me})\,b\,\omega_{me} + a\right) $  <br>

The parameters are: constant load torque $a$, viscous friction coefficient $b$ and aerodynamic load torque coefficient $c$. These parameters, as well as the moment of inertia of the load $J_{load}$, can be freely defined by the user to simulate different loads.
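
The load torque function above translates directly into code. The following sketch uses illustrative coefficient values; the function names are not part of GEM.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def load_torque(omega_me, a, b, c):
    """T_L = sign(w) * (c * w**2 + sign(w) * b * w + a) for w = omega_me."""
    s = sign(omega_me)
    return s * (c * omega_me ** 2 + s * b * omega_me + a)


# Illustrative coefficients only.
for omega in (-10.0, 0.0, 10.0):
    print(f"omega_me={omega:6.1f} rad/s -> T_L={load_torque(omega, a=1.0, b=0.1, c=0.01):6.2f} Nm")
```

Because of the sign terms, the load torque always opposes the direction of rotation and vanishes at standstill.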

In this example, we use the load type: "ConstSpeedLoad" which initializes the load with a constant speed at the start of each episode. The initialization value for speed is sampled from a uniform distribution defined by the given interval.


In [6]:
from gym_electric_motor.physical_systems import ConstantSpeedLoad

load = 'ConstSpeedLoad'
load_initializer = {'random_init': 'uniform', 'interval': [100, 200]}


## 3. Reward Function
The reward calculation is based on the current state and reference of the motor environment. It is calculated as a weighted sum of errors with a certain power as follows:

<!-- <h3 align="center"> -->
$ reward = - reward\_weights * (abs(state - reference)/ state\_length)^{reward\_power}$  

If a constraint violation is observed in the monitored states, an additional terminal reward is added. This value depends on the discount factor gamma as follows.
<!-- <h3 align="center"> -->
$limit\_violation\_reward = -1 / (1 - \gamma).$
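
The two formulas above can be evaluated by hand with a short sketch. It is a simplified, standalone re-implementation for illustration, not the GEM reward function; the state and reference values are made up, and a state length of 2 assumes states normalized to the range [-1, 1].

```python
def weighted_sum_of_errors(state, reference, weights, state_length, reward_power=1):
    """reward = -sum_i w_i * (|state_i - reference_i| / length_i) ** reward_power"""
    return -sum(
        weights[name]
        * (abs(state[name] - reference[name]) / state_length[name]) ** reward_power
        for name in weights
    )


def limit_violation_reward(gamma):
    """Terminal reward added when a constraint is violated."""
    return -1.0 / (1.0 - gamma)


# Made-up normalized states and references for illustration.
state = {"i_sq": 0.5, "i_sd": -0.1}
reference = {"i_sq": 0.3, "i_sd": 0.0}
weights = {"i_sq": 10, "i_sd": 10}
state_length = {"i_sq": 2.0, "i_sd": 2.0}

reward = weighted_sum_of_errors(state, reference, weights, state_length)
print("step reward:", reward)
print("violation reward for gamma=0.99:", limit_violation_reward(0.99))
```

The violation reward equals the discounted sum of receiving a reward of -1 forever, so a discount factor close to 1 makes a limit violation much more costly than any single step error.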


### 3.1. Constraint Monitor
The constraint monitor observes the system states and assesses whether they comply with the given limits or violate them. It returns the information that the reward function needs to calculate the corresponding reward value. The constraints on the system states can be defined by the user; by default, the physical environment limits are used as constraints.

The user-defined constraint "SqdCurrentMonitor" presented below observes the currents and raises a flag indicating a limit violation if:

$ i_{sd}^2 + i_{sq}^2 > i_{max}^2 $  
Here $i_{sd}$ and $i_{sq}$ are the motor currents in the d-q coordinate system that can be accessed from the motor states and $i_{max}$ is the maximum allowable current value.




In [7]:

gamma = 0.99  # Discount factor for the reward punishment. Should equal the agent's discount factor gamma.

class SqdCurrentMonitor:
    """
    Monitor for squared currents.

    Flags a limit violation if i_sd**2 + i_sq**2 > 1. The threshold of 1 assumes
    that the states passed to the monitor are normalized by their limit values.
    """

    def __call__(self, state, observed_states, k, physical_system):
        self.I_SD_IDX = physical_system.state_names.index('i_sd')  # index of motor state i_sd
        self.I_SQ_IDX = physical_system.state_names.index('i_sq')  # index of motor state i_sq
        sqd_currents = state[self.I_SD_IDX] ** 2 + state[self.I_SQ_IDX] ** 2
        return sqd_currents > 1

    
reward_function=gem.reward_functions.WeightedSumOfErrors(  # The function that computes the reward
            observed_states=['i_sq', 'i_sd'],              # Names of the observed states
            reward_weights={'i_sq': 10, 'i_sd': 10},       # Reward weight for each observed state
            constraint_monitor=SqdCurrentMonitor(),        # ConstraintMonitor for monitoring
                                                           # states regarding defined constraints
            gamma=gamma,    # Discount factor for the reward punishment. Should equal agent's 
                            # discount factor gamma.
            reward_power=1) # Reward power for each of the systems states

## 4. Reference Generator
The reference generator generates references for the observed states that the physical system is expected to follow. The GEM toolbox provides various reference generators, which can be found in the [documentation.](https://upb-lea.github.io/gym-electric-motor/parts/reference_generators/reference_generator.html)

In this example, we generate references to the motor currents $i_{sq}$ and $i_{sd}$.  <br>
The "WienerProcessReferenceGenerator" is used to generate random references for both $i_{sq}$ and $i_{sd}$. The Wiener process is a stochastic process $W(t)$ for $t \geq 0$ with $W(0)=0$ such that the increment $W(t)-W(s)$ is Gaussian with mean 0 and variance $\sigma^2 (t-s)$ for any $0 \leq s<t$ (with $\sigma = 1$ for the standard Wiener process), and increments over nonoverlapping time intervals are independent. More information can be found [here](https://en.wikipedia.org/wiki/Wiener_process).

The individual sub-reference generators are then combined using the "MultipleReferenceGenerator".

In [8]:
from gym_electric_motor.reference_generators import \
    MultipleReferenceGenerator,\
    WienerProcessReferenceGenerator

q_generator = WienerProcessReferenceGenerator(reference_state='i_sq') # sub-reference generator for i_sq
d_generator = WienerProcessReferenceGenerator(reference_state='i_sd') # sub-reference generator for i_sd
rg = MultipleReferenceGenerator([q_generator, d_generator])           # combine the sub-reference generators
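
To give an intuition for the kind of trajectory a Wiener process produces, the following standalone sketch simulates a discrete-time path with independent Gaussian increments of standard deviation sigma * sqrt(tau). It is not the GEM reference generator; the function name and the values of tau and sigma are illustrative assumptions.

```python
import math
import random


def wiener_path(n_steps, tau, sigma, rng):
    """Discrete-time Wiener process: W(0) = 0, independent Gaussian increments."""
    path = [0.0]
    for _ in range(n_steps):
        increment = rng.gauss(0.0, sigma * math.sqrt(tau))
        path.append(path[-1] + increment)
    return path


rng = random.Random(42)  # seeded for reproducibility
reference = wiener_path(n_steps=1000, tau=1e-5, sigma=1.0, rng=rng)
print(f"first value: {reference[0]}, last value: {reference[-1]:.4f}")
```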

## 5. Visualization
The visualization module provides an interface to observe and inspect the physical system's state, references, rewards, etc. 
GEM offers two forms of visualization:
- Motor dashboard: A graphical interface which provides visualization in the form of plots.
- Console printer: A simpler interface in the form of prints on the console.

This example demonstrates the usage of the motor dashboard for visualization.<br>
A list of variables to be plotted is passed to the MotorDashboard during initialization. The variables that can be plotted for a given motor environment can be found in the [documentation.](https://upb-lea.github.io/gym-electric-motor/parts/visualizations/motor_dashboard.html)

In [9]:
from gym_electric_motor.visualization import MotorDashboard

visualization = MotorDashboard(plots=['i_sq', 'i_sd', 'reward']) # plots the states i_sd and i_sq and reward.

## 6. Callbacks
GEM callbacks provide an easy-to-use interface to apply a set of functions to the motor environment at runtime. Callbacks can be used to inspect internal states, collect statistics, or modify certain motor parameters at runtime.

GEM callbacks can be used to interact with the motor environment:
- At the start/end of every step
- At the start/end of every reset
- At the start of a close() call
    
The following example provides a sample user defined callback implementation. The user defined callback object must be a sub-class of the 'Callback' class as shown. The objective of the user defined 'RewardLogger' callback object is to create a log of mean episode rewards of the experiment.

Some of the interfaces implemented here are:
- \_\_init\_\_()     : A suitable class constructor.
- on_step_end()    : A suitable task to be performed at the end of every step.
- on_reset_begin() : A suitable task at the beginning of each episode.
- on_close()       : A suitable task at the beginning of a close.

For the list of all the interfaces, check out the GEM API documentation.    


In [10]:
from pathlib import Path

import numpy as np
from gym_electric_motor.core import Callback

class RewardLogger(Callback):
    """ Logs the reward accumulated in each episode
    """
    def __init__(self):
        self._step_rewards = []
        self._mean_episode_rewards = []

    def on_step_end(self):
        """ stores the received reward at each step
        """
        self._step_rewards.append(self._env._reward)
    
    def on_reset_begin(self):
        """ stores the mean reward received in every episode
        """
        if self._step_rewards:  # skip resets before which no step was taken
            self._mean_episode_rewards.append(np.mean(self._step_rewards))
        self._step_rewards = []
        
    def on_close(self):
        """ writes the mean episode reward of the experiment to a file.
        """
        np.save(Path.cwd() /"rl_frameworks" / "saved_agents" / "EpisodeRewards.npy",
                np.array(self._mean_episode_rewards))
        
my_callback = [RewardLogger()]  # instantiate the callback object 
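
The order in which the callback hooks fire can be illustrated without GEM by a toy environment. Everything in this sketch (ToyCallback, ToyEnv, MeanRewardLogger) is hypothetical and only mimics the hook names listed above; the real environment passes its own rewards and may call further hooks.

```python
class ToyCallback:
    """Minimal stand-in for a callback base class (not the GEM class)."""

    def set_env(self, env):
        self._env = env

    def on_reset_begin(self):
        pass

    def on_step_end(self):
        pass

    def on_close(self):
        pass


class ToyEnv:
    """Toy environment that only dispatches the callback hooks."""

    def __init__(self, callbacks):
        self._callbacks = list(callbacks)
        self._reward = 0.0
        for callback in self._callbacks:
            callback.set_env(self)

    def reset(self):
        for callback in self._callbacks:
            callback.on_reset_begin()

    def step(self, reward):
        self._reward = reward  # a real environment would compute this
        for callback in self._callbacks:
            callback.on_step_end()

    def close(self):
        for callback in self._callbacks:
            callback.on_close()


class MeanRewardLogger(ToyCallback):
    """Collects the mean reward of every finished episode."""

    def __init__(self):
        self._step_rewards = []
        self.mean_episode_rewards = []

    def on_step_end(self):
        self._step_rewards.append(self._env._reward)

    def on_reset_begin(self):
        if self._step_rewards:
            mean = sum(self._step_rewards) / len(self._step_rewards)
            self.mean_episode_rewards.append(mean)
        self._step_rewards = []


logger = MeanRewardLogger()
toy_env = ToyEnv([logger])
toy_env.reset()
for r in (-1.0, -3.0):   # first episode
    toy_env.step(r)
toy_env.reset()
toy_env.step(-2.0)       # second episode
toy_env.reset()
toy_env.close()
print(logger.mean_episode_rewards)
```

Note that in this pattern an episode's mean reward is only stored at the next reset, so the rewards of the final episode are lost unless the environment is reset once more before closing.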

## 7. GEM Make Call

The make function takes the environment ID and several constructor arguments. Every environment also works without further parameters, using default values. These default parameters can be looked up in the API documentation of every GEM environment [here](https://upb-lea.github.io/gym-electric-motor/index.html).

The various components of the motor environment defined above are passed as arguments to the gem.make() function, which returns the discrete PMSM motor environment.

The motor environment can then be passed to a Reinforcement Learning agent in order to learn a controller.

In [11]:
# define a PMSM with discrete action space
env = gem.make(
    motor_env_id,
    # visualize the results
    visualization=visualization,
    # parameterize the PMSM and update limitations
    motor_parameter=motor_parameter,
    limit_values=limit_values,
    nominal_values=nominal_values,
    # define the random initialisation for load and motor
    load=load,
    load_initializer=load_initializer,
    motor_initializer=motor_initializer,
    reward_function=reward_function,
    tau=tau,
    supply=supply,
    supply_parameter=supply_parameter,
    reference_generator=rg,
    ode_solver='euler',
    callbacks=my_callback,
)

In [12]:
env

<gym_electric_motor.envs.gym_pmsm.perm_mag_syn_motor_env.DiscPermanentMagnetSynchronousMotorEnvironment at 0x7f8264b107b8>