FMUGym: An Interface for Reinforcement Learning-based Control of Functional Mock-up Units under Uncertainties
This project provides an interface for reinforcement learning-based control of functional mock-up units (FMUs) under uncertainties. The interface is designed to be compatible with the Functional Mock-up Interface (FMI) standard. It is implemented in Python and connects FMUs as control plants to pre-existing reinforcement learning libraries such as Stable Baselines3 or SKRL for training.
For more information, refer to the corresponding paper [FMUGym: An Interface for Reinforcement Learning-based Control of Functional Mock-up Units under Uncertainties](following soon).
You can easily install FMUGym via pip by running the following command:
pip install fmugym
To install the project from source, first clone this repo and then run the following command from its root directory:
pip install -e .
Further, we recommend using jupyter notebooks for testing. To install jupyter notebook, run the following command:
pip install notebook
As reinforcement learning libraries, we tested Stable Baselines3 and SKRL. To install Stable Baselines3, run the following command:
pip install stable-baselines3
To install SKRL, run the following command:
pip install skrl
This project contains the following content:
- fmugym: The library containing the abstract FMUGym interface and classes for the environment configuration.
- examples: Jupyter notebooks demonstrating the usage of FMUGym
- dummy default: A linear MIMO system connected to an SB3 SAC agent with continuous action and observation space. All possible kinds of uncertainties are included as well as comprehensive comments.
- dummy skrl: A linear MIMO system connected to an SKRL SAC agent with continuous action and observation space.
- dummy discrete: A linear MIMO system connected to an SB3 A2C agent with discrete action and observation space.
- nonlinear: A nonlinear MIMO system connected to an SB3 SAC agent with continuous action and observation space.
- FMUs: A collection of Modelica models and their corresponding FMUs used in the examples. We only provide FMUs exported with Open Modelica for Linux x86; if you are running on another OS / architecture, please export the FMUs accordingly. Further, the CVODE solver we chose in Open Modelica causes issues in long simulation runs, so we recommend exporting the Modelica models yourself with e.g. Dymola instead.
- trained_agents: Models of RL agents after completing training.
- tests: Unit tests for compliance of FMUGym library with gymnasium API.
The following code snippet demonstrates the usage of FMUGym for the simplest case: a linear MIMO system connected to an SB3 agent with continuous action and observation space. For specific implementations, please see the dummy example notebook.
The first step involves exporting the desired Modelica model as a co-simulation FMU compliant with FMI 1.0/2.0/3.0. The inputs and outputs of the model have to be clearly defined using the appropriate Modelica standard library blocks (Modelica.Blocks.Interfaces.RealInput for inputs and Modelica.Blocks.Interfaces.RealOutput for outputs). See below the model view of our dummy example in Open Modelica.
class FMUEnv(FMUGym):
...
The superclass FMUGym has to be inherited for model-specific utilization by implementing the following abstract methods:
- _get_info(): Used by step() and reset(); returns any relevant debugging information.
- _get_obs(): Retrieves FMU output values, possibly by calling self.fmu._get_fmu_output to handle different FMU versions, and stores them in the self.observation dictionary. It can also add output noise (using self._get_output_noise()) and update the set point (using self._setpoint_trajectory()) to return a goal-oriented observation dictionary.
- _get_input_noise(): Returns input noise for each input component, potentially by sampling from the self.input_noise dictionary.
- _get_output_noise(): Similar to self._get_input_noise(), generates output noise for each output component, potentially by sampling from the self.output_noise dictionary.
- _get_terminated(): Returns two booleans indicating first the termination and second the truncation status.
- _create_action_space(inputs): Constructs the action space from a VarSpace object representing the inputs. It can use gymnasium.spaces.Box for continuous action spaces.
- _create_observation_space(outputs): Constructs the observation space, returning it as a gymnasium.spaces.Dict. The observation space typically includes observation, achieved_goal, and desired_goal, each created from a VarSpace object.
- _noisy_init(): Applies random variations to initial system states and dynamic parameters by sampling from self.random_vars_refs and propagates them to the corresponding initial output values. It also allows direct manipulation and randomization of set point goals via the self.y_stop class variable.
- _process_action(action): Called by self.step() to add noise to the action from the RL library. May also be used to execute a low-level controller and adapt the action space.
- _setpoint_trajectory(): Determines the set point values at the current time step within the trajectory; called by self._get_obs().
- _process_reward(self, obs, acts, info): Interface between the step method of fmugym and compute_reward(); adjusts the necessary parameters for the reward method of the RL library (SKRL, SB3).
- compute_reward(): Computes and returns a scalar reward value (in the case of SB3) from achieved_goal, desired_goal, and possibly further parameters.
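A minimal sketch of such a subclass, written against a small stand-in base class so the snippet is self-contained: the real base class is fmugym.FMUGym (which additionally loads and steps the FMU), and the attribute names stop_time and the layout of the noise dictionary are assumptions.

```python
import random
from abc import ABC, abstractmethod

# Small stand-in for fmugym.FMUGym so the sketch runs without an FMU; the
# real base class additionally drives the FMU and defines more abstract methods.
class FMUGymBase(ABC):
    @abstractmethod
    def _get_terminated(self): ...

    @abstractmethod
    def _get_input_noise(self): ...

class DummyEnv(FMUGymBase):
    def __init__(self, stop_time=10.0):
        self.time = 0.0
        self.stop_time = stop_time
        # standard deviation of Gaussian noise per input component (assumption)
        self.input_noise = {"u1": 0.01, "u2": 0.01}

    def _get_terminated(self):
        # first boolean: termination, second: truncation (time limit reached)
        return False, self.time >= self.stop_time

    def _get_input_noise(self):
        return {name: random.gauss(0.0, std)
                for name, std in self.input_noise.items()}

    def _process_action(self, action):
        # perturb the RL action with input noise before applying it to the plant
        noise = self._get_input_noise()
        return {name: value + noise[name] for name, value in action.items()}
```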
config = FMUGymConfig(...)
By passing an FMUGymConfig object containing the necessary parameters to the FMUGym implementation, an environment instance can be created. Several custom data types simplify the environment configuration:
- VarSpace: Holds a dictionary of class variables. It provides methods like add_var_box(name: str, min: float, max: float) to add a gymnasium.spaces.Box entry or add_var_discrete(name: str, n: int, start: int) to add a gymnasium.spaces.Discrete.
- State2Out: Holds a dictionary of class variables. It offers an add_map(state: str, out: str) method to map a system state variable to an output variable.
- TargetValue: Holds a dictionary of class variables. The method add_target(name: str, value: float) incorporates a target value, such as a nominal initial output value.
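A sketch of how these helper types are populated. The stand-in classes below only mirror the documented method signatures so the snippet runs without fmugym installed; the real classes are imported from the fmugym package, and the constructor signatures and variable names here are assumptions.

```python
# Stand-ins mirroring the documented helper-type methods; internals of the
# real fmugym classes will differ (e.g. they create gymnasium spaces).
class VarSpace:
    def __init__(self):
        self.vars = {}

    def add_var_box(self, name, min, max):
        # corresponds to a gymnasium.spaces.Box entry in the real class
        self.vars[name] = ("box", min, max)

    def add_var_discrete(self, name, n, start):
        # corresponds to a gymnasium.spaces.Discrete entry in the real class
        self.vars[name] = ("discrete", n, start)

class State2Out:
    def __init__(self):
        self.map = {}

    def add_map(self, state, out):
        # maps a system state variable to an output variable
        self.map[state] = out

class TargetValue:
    def __init__(self):
        self.targets = {}

    def add_target(self, name, value):
        # e.g. a nominal initial output value
        self.targets[name] = value

inputs = VarSpace()
inputs.add_var_box("u1", -1.0, 1.0)
inputs.add_var_box("u2", -1.0, 1.0)

outputs = VarSpace()
outputs.add_var_box("y1", -10.0, 10.0)

state_map = State2Out()
state_map.add_map("plant.x1", "y1")

y_start = TargetValue()
y_start.add_target("y1", 0.0)
```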
dummyEnv = FMUEnv(config)
Create an instance of the previously implemented FMUEnv by passing the config object to it.
model = ...
Create the RL agent leveraging a pre-existing library like Stable Baselines3 or SKRL, or implement a custom RL method compatible with the gymnasium standard. Proceed with training the agent and save the resulting policy.
Run one inference episode with the trained agent while capturing trajectories. To statistically assess the trained agent's performance, the optimized policy can be deployed for multiple randomized episodes to evaluate the results.
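The inference episode follows the standard gymnasium loop. The tiny stand-in environment and policy below only make the sketch self-contained; a real run would use the FMUEnv instance and the trained SB3/SKRL model in their place.

```python
# Stand-ins following the gymnasium step/reset API (assumption: a real run
# uses the FMUEnv instance and a trained agent instead).
class TinyEnv:
    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return {"observation": 0.0}, {}

    def step(self, action):
        self.t += 1
        obs = {"observation": float(self.t)}
        reward = -abs(action)
        truncated = self.t >= self.horizon  # episode ends at the time limit
        return obs, reward, False, truncated, {}

class TinyPolicy:
    def predict(self, obs, deterministic=True):
        # mimics the (action, state) return of SB3's predict()
        return 0.0, None

env, model = TinyEnv(), TinyPolicy()
obs, info = env.reset()
trajectory, done = [], False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    trajectory.append((obs["observation"], reward))  # capture the trajectory
    done = terminated or truncated
```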
This project is licensed under MIT license, as found in the LICENSE file.
The corresponding paper was submitted to the European Group for Intelligent Computing in Engineering (EG-ICE) conference 2024 in Vigo. We will provide information for citing the paper as soon as it is published.