This repository contains the materials for the numerical experiments from the article "IPLooper: Industrial Processes Simulator for Benchmarking Reinforcement Learning-based Control". The experiments systematically study and compare the performance of traditional industrial process control algorithms and control algorithms obtained with reinforcement learning (RL).
The study uses iplooper, a flexible Python-based framework for modeling dynamic industrial processes, together with a methodology for packaging these models into OpenAI Gym-compatible environments, enabling the integration of RL algorithms. The framework was employed to create simulators for the Exothermic Continuous Stirred-Tank Reactor, a Distillation Column, and the Simplified Tennessee Eastman Process. These simulators facilitated a comparative analysis between traditional PID control algorithms and advanced control strategies derived from both online and offline RL techniques.
The general idea of the framework is illustrated in the figure. A node-based simulator provides measurements through sensor nodes to an AI algorithm, which controls the process by setting manipulated variables via input nodes.
The presented repository has the following structure:
- The config directory contains configuration files with settings for online and offline RL training on the environments used in the study.
- The control directory contains conventional control algorithms in the form used for the study.
- The data directory contains a single file, DC_IndustrialControlAlgorithm_data.csv, with data obtained by applying an industrial control algorithm provided by industrial partners to the DistillationColumn simulator.
- The mgym directory contains bindings for obtaining OpenAI gym/gymnasium environments from the industrial process simulators considered in the study, as well as standard settings for the RL algorithms used in the study.
- The utils directory contains various utilities that provide a template-based approach for generating data, working with supplementary directories, and other tasks. The constructors script in this directory contains methods for creating OpenAI gym/gymnasium-compatible environments for a particular process (see the sketch after this list).
- The root directory contains the main scripts for preparing data for offline training, for running both offline and online training, and for testing the obtained models within the predefined scenarios for each process.
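As an illustration of how the wrapped simulators are consumed, the sketch below follows the standard gym/gymnasium control loop. The make_env helper and the process identifier are hypothetical placeholders used only for illustration; the actual factory methods live in the constructors script mentioned above.

    # Minimal sketch, assuming a hypothetical make_env factory in the constructors script.
    from utils.constructors import make_env  # hypothetical name, for illustration only

    env = make_env("DistillationColumn")      # wrap a simulator as a gym-style environment
    obs = env.reset()                         # initial readings from the sensor nodes
    done = False
    while not done:
        action = env.action_space.sample()    # a trained policy or PID controller goes here
        # old-gym API shown; gymnasium returns (obs, reward, terminated, truncated, info)
        obs, reward, done, info = env.step(action)  # apply manipulated variables via input nodes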
Python 3.9.13 is used for the experiments. You will also need git, since requirements.txt lists GitHub projects as dependencies.
The IPlooper framework used in the study aims to keep the number of external dependencies to a minimum. However, this work uses proven RL frameworks compatible with the OpenAI gym/gymnasium standard, such as d3rlpy and ray, which themselves pull in a large number of dependencies and are sensitive to both the machine and the Python environment in which they run. Therefore, the dependencies used in the study are collected in the requirements.txt file in the root directory of the repository.
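For reference, GitHub-hosted dependencies in requirements.txt use the standard pip VCS requirement syntax; the package name and URL below are placeholders, not the actual entries:

    some_package @ git+https://github.com/<org>/<repo>.git@<tag_or_commit>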
A modified SMPL framework is used as the basis for wrapping the industrial process simulators into OpenAI gym/gymnasium-style environments. The base implementation of SMPL does not cover all requirements of the study (e.g. proper normalization routines). For this reason, our modified SMPL framework is already included in the mgym directory and does not need to be installed separately.
It is recommended to install all dependencies into a separate virtual environment (.venv) using the provided requirements.txt file as follows:
> pip install -r requirements.txt
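If the virtual environment has not been created yet, it can be created and activated first with the standard venv module:
For Windows:
> python -m venv .venv
> .venv\Scripts\activate
For MacOS/Linux:
> python -m venv .venv
> source .venv/bin/activate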
The ray==1.9.1 version used in the study requires particular versions of torch. To set up torch properly, run the following pip command within the virtual environment:
For Windows:
> pip install torch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 --index-url https://download.pytorch.org/whl/cu113
For MacOS:
> pip install torch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1
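After installation, the selected torch build can be checked from within the same virtual environment:
> python -c "import torch; print(torch.__version__, torch.cuda.is_available())"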
To run training, configure the .yaml settings file of the particular process and run the corresponding scripts.
For offline training, you first need to generate data for the desired process:
> python offlineRL_data_generation.py -p @ProcessName@
Then run the training as follows:
> python offlineRL_training.py -p @ProcessName@
Specific training parameters can be changed using the process configuration file in the configs folder.
By default, offlineRL_training tries all RL algorithms used in the study ('COMBO', 'MOPO', 'AWAC', 'DDPG', 'TD3', 'BC', 'CQL', 'PLAS', 'PLASWithPerturbation', 'BEAR', 'SAC', 'BCQ', 'CRR'). Particular algorithms can be specified with the --algs option of the offlineRL_training script.
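For example, a complete offline workflow for the Distillation Column might look as follows; the process identifier and the space-separated --algs syntax are assumptions, so check the configuration files for the exact names:
> python offlineRL_data_generation.py -p DistillationColumn
> python offlineRL_training.py -p DistillationColumn --algs CQL BC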
For online training, run the training as follows:
> python onlineRL_training.py -p @ProcessName@
By default, onlineRL_training tries all RL algorithms used in the study ('ppo', 'sac', 'ars', 'impala', 'a2c', 'a3c'). Particular algorithms can be specified with the --algs option of the onlineRL_training script.
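For example, to train only the PPO and SAC agents (assuming the same space-separated --algs syntax as above):
> python onlineRL_training.py -p @ProcessName@ --algs ppo sac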
To evaluate pretrained models, you need to place the data from the disk into the pretrained directory, which must be created in the root directory of the project. You can do this manually or use load_pretrained_models.py, which downloads the models and unzips them into the pretrained directory. Just call:
> python load_pretrained_models.py
To evaluate offline models, call:
> python offline_pretrained_models_assessment.py -p @ProcessName@
To evaluate online models, call:
> python online_pretrained_models_assessment.py -p @ProcessName@
The model assessment scripts also allow you to specify particular algorithms whose results should be shown via the --algs option. The list of algorithms corresponds to the lists in offlineRL_training and onlineRL_training.
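For example, to show only the CQL and BC results for a process (same assumption about the --algs syntax):
> python offline_pretrained_models_assessment.py -p @ProcessName@ --algs CQL BC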
Exothermic Continuous Stirred-Tank Reactor (ECSTR) results for offline RL models are presented in the image below.
Exothermic Continuous Stirred-Tank Reactor (ECSTR) results for online RL models are presented in the image below.
Distillation Column (DC) results for offline RL models are presented in the image below.
Distillation Column (DC) results for online RL models are presented in the image below.
Simplified Tennessee Eastman Process (STEP) results for offline RL models are presented in the image below.





