Deep reinforcement learning for automated search of model parameters: photo-fenton wastewater disinfection case study
Code for optimizing the parametric photo-Fenton model from [1] using deep reinforcement learning. It includes several configurations of a Proximal Policy Optimization (PPO) agent [2]: balancing the memory of experiences, Hindsight Experience Replay [3], and incorporating expert knowledge. Read the full article at https://doi.org/10.1007/s00521-022-07803-3
To run an experiment:
python <file.py> <path to folder containing exp_config.yaml>
For example, to optimize the peroxide model using a PPO agent with "balanced" memory of experiences and "random" initialization, configure the exp_config.yaml file as:
# Optimization conf
actor_lr: 1e-5
critic_lr: 1e-5
batch_size: 128
exploration_noise: 5.0
epsilon: 1.0
epsilon_decay: 0.8
epsilon_min: 0.09 # exploration_noise*epsilon_min > 0.4
memory_size: 40
histogram_memory: False
n_stack: 20
n_step_return: 20
skip_states: 1
iter: 25
sodis_params: src/environments/fotocaos_complete_model/sodis_params.txt
perox_params: src/environments/fotocaos_complete_model/perox_params.txt
experiment_path: experimentos/
test_iter: 0
test: False
optimize_peroxide: True
optimize_bacteria: False
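The comment on `epsilon_min` states a lower bound on the product `exploration_noise*epsilon_min`. The following is a minimal sketch of that check, assuming the effective noise is `exploration_noise * epsilon` and that `epsilon` is multiplied by `epsilon_decay` once per update until it reaches `epsilon_min`. The decay rule and the helper names `epsilon_schedule` and `min_effective_noise` are illustrative assumptions, not taken from the repository.

```python
def epsilon_schedule(epsilon, decay, epsilon_min, steps):
    """Yield epsilon for each step under an ASSUMED multiplicative decay."""
    for _ in range(steps):
        yield epsilon
        # Assumption: decay is multiplicative and clipped at epsilon_min.
        epsilon = max(epsilon * decay, epsilon_min)


def min_effective_noise(exploration_noise, epsilon_min):
    """Smallest noise scale reached once epsilon has hit its floor."""
    return exploration_noise * epsilon_min


# Values copied from the example exp_config.yaml above.
conf = {
    "exploration_noise": 5.0,
    "epsilon": 1.0,
    "epsilon_decay": 0.8,
    "epsilon_min": 0.09,
    "iter": 25,
}

floor = min_effective_noise(conf["exploration_noise"], conf["epsilon_min"])
# Constraint from the config comment: exploration_noise*epsilon_min > 0.4
assert floor > 0.4, "exploration_noise*epsilon_min must stay above 0.4"

schedule = list(
    epsilon_schedule(
        conf["epsilon"], conf["epsilon_decay"], conf["epsilon_min"], conf["iter"]
    )
)
print(f"noise floor: {floor:.3f}, final epsilon: {schedule[-1]}")
```

If `exploration_noise` or `epsilon_min` is changed, rerunning this check shows whether the constraint in the comment still holds.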
Then, run the .py file and pass the path to the folder containing the exp_config.yaml file:
python Agent_DB-R.py ./
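The command-line argument is the folder that holds exp_config.yaml, not the file itself. As a rough sketch of how such an argument can be turned into a file path, the hypothetical helper below (`resolve_config_path` is not part of the repository, and the actual scripts may handle the argument differently) joins the folder with the expected file name and fails early if the file is missing.

```python
from pathlib import Path

CONFIG_NAME = "exp_config.yaml"  # file name used throughout this README


def resolve_config_path(folder):
    """Return the path to exp_config.yaml inside `folder`.

    Hypothetical helper for illustration only; raises FileNotFoundError
    when the folder does not contain the configuration file.
    """
    config_path = Path(folder) / CONFIG_NAME
    if not config_path.is_file():
        raise FileNotFoundError(f"{CONFIG_NAME} not found in {Path(folder).resolve()}")
    return config_path
```

Calling `resolve_config_path("./")` corresponds to the `./` argument in the command above.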
Install Python 3.6, then install the dependencies listed in requirements.txt:
pip install -r requirements.txt
@article{hernandez2023rlphotofenton,
title={Deep reinforcement learning for automated search of model parameters: photo-fenton wastewater disinfection case study},
author={Hern{\'a}ndez-Garc{\'\i}a, Sergio and Cuesta-Infante, Alfredo and Moreno-SanSegundo, Jos{\'e} {\'A}ngel and Montemayor, Antonio S},
journal={Neural Computing and Applications},
volume={35},
number={2},
pages={1379--1394},
year={2023},
publisher={Springer},
doi = {10.1007/s00521-022-07803-3}
}
[1] C. Casado, J. Moreno-SanSegundo, I. De la Obra, B. Esteban García, J. A. Sánchez Pérez, and J. Marugán, “Mechanistic modelling of wastewater disinfection by the photo-Fenton process at circumneutral pH,” Chemical Engineering Journal, vol. 403, p. 126335, 2021. https://doi.org/10.1016/j.cej.2020.126335
[2] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017. https://arxiv.org/abs/1707.06347
[3] M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, P. Abbeel, and W. Zaremba, “Hindsight experience replay,” in Advances in Neural Information Processing Systems, ser. NeurIPS, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30, 2017.