This repo hosts the code for the online and offline reinforcement learning experiments on the Simulated Industrial Manufacturing and Process Control Learning Environments (SMPL).
$ pip install -r requirements.txt
To run the online RL experiments, simply run the `online_experiments.sh` script; edit `online_experiments.yaml` for different configurations. Alternatively, you can go into the directory of a specific environment (e.g. `mabenv_experiments`) and execute its `online_experiments.sh` (which runs `online_experiments.py` for the online RL algorithms) there with that environment's own configurations, again editable in its `online_experiments.yaml`.
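Such a YAML configuration is typically loaded at the top of the training script. A minimal sketch of parsing it with PyYAML; the key names below are illustrative assumptions, not the repo's actual schema:

```python
import yaml  # PyYAML (third-party)

# Illustrative config text; the real online_experiments.yaml
# defines its own keys for environments, algorithms, etc.
config_text = """
env_name: mabenv
algorithms: [PPO, SAC]
training_steps: 100000
"""

config = yaml.safe_load(config_text)
print(config["env_name"])    # mabenv
print(config["algorithms"])  # ['PPO', 'SAC']
```

`yaml.safe_load` returns a plain dict, so the rest of the script can read settings with ordinary key lookups.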
After you have trained an online RL algorithm, you can run inference with `online_inference.py`. Set `env_name`, `model_names`, `best_checkpoint_paths`, and `config_dirs` accordingly so that the correct checkpoint(s) are loaded. You can also set the plot configurations to visualize how the trained algorithm actually performs. For more details, please consult the docstring in `online_inference.py` and this documentation.
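The inference settings are effectively parallel lists: each model name pairs with one checkpoint path and one config directory. A hedged sketch of such settings, with a sanity check before loading; the names and paths below are placeholders, not actual repo values:

```python
# Placeholder values; substitute your own environment, models, and paths.
env_name = "mabenv"
model_names = ["PPO", "SAC"]
best_checkpoint_paths = [
    "results/PPO/checkpoint_000100",
    "results/SAC/checkpoint_000100",
]
config_dirs = ["results/PPO", "results/SAC"]

# Each model needs exactly one checkpoint and one config directory,
# so the three lists must have the same length.
assert len(model_names) == len(best_checkpoint_paths) == len(config_dirs)
```

Checking the list lengths up front fails fast instead of loading the wrong checkpoint for a model.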
For the offline RL experiments, you first need to generate a dataset with the baseline algorithm using the `offline_data_generation.py` script located in `{env_name}_experiments`. After successfully generating the training, evaluation, and testing initial states and datasets, you can use `offlineRL_training.py` to train the offline RL algorithms. Don't forget that you can edit the configurations in `offline_experiments.yaml`.
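The data-generation step above produces separate training, evaluation, and testing splits of initial states. A minimal sketch of one way to make such a split; the number of states and the split ratios are illustrative assumptions, not the repo's actual values:

```python
import random

# Assume 100 candidate initial states; the real script samples them
# from the environment's state space.
initial_states = list(range(100))
random.seed(0)  # fix the shuffle so the split is reproducible
random.shuffle(initial_states)

# 80/10/10 split (ratios are an illustrative assumption)
train_states = initial_states[:80]
eval_states = initial_states[80:90]
test_states = initial_states[90:]
print(len(train_states), len(eval_states), len(test_states))  # 80 10 10
```

Keeping the three splits disjoint ensures the offline RL algorithms are evaluated and tested on initial states they never trained from.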
The `OFFLINE_BEST.yaml` in `{env_name}_experiments` specifies the location of your current offline RL experiments. For example, if you finished a Behavior Cloning experiment and put `"d3rlpy_logs/42"` in `OFFLINE_BEST.yaml`, then you should be able to locate the best checkpoint at `d3rlpy_logs/42/BC/best.pt`, which is the checkpoint used to perform inference with `offline_inference.py`. Again, you can set the plot configurations to analyze and visualize the results.
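Following the layout in the example above, the best-checkpoint path is just the configured log directory joined with the algorithm name and `best.pt`. A sketch of that lookup; the helper function below is hypothetical, not part of the repo:

```python
import os

def best_checkpoint(log_dir: str, algo: str) -> str:
    """Resolve the best-checkpoint path under a log directory,
    following the {log_dir}/{algo}/best.pt layout described above.
    (Hypothetical helper for illustration.)"""
    return os.path.join(log_dir, algo, "best.pt")

print(best_checkpoint("d3rlpy_logs/42", "BC"))
```

With the example values from `OFFLINE_BEST.yaml`, this resolves to `d3rlpy_logs/42/BC/best.pt`.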