This repository contains the source code for the paper "Discovering Continuous-Time Memory-Based Symbolic Policies using Genetic Programming". The paper is available as the arXiv preprint cited at the end of this README. In the paper, we use genetic programming to evolve control policies consisting of symbolic expressions, both with and without memory, and compare them with black-box neural differential equations. The methods are tested in various settings on the harmonic oscillator, the acrobot and the continuous stirred tank reactor.
To use the code, clone the repository, then create and activate the conda environment by running:
conda env create -f environment.yml
conda activate dsp_env
All code used for our paper is located in src/.
The code for running the seven main experiments can be found in src/run.py. For each experiment, the hyperparameters are specified, and each of the five methods can be evaluated with those hyperparameters in that experiment setting. To reproduce the results, select an experiment, pick a random seed and run a method.
| Experiment ID | Environment | Observability | Other |
|---|---|---|---|
| 1 | Harmonic oscillator | Full state | -- |
| 2 | Harmonic oscillator | Partial state | -- |
| 3 | Harmonic oscillator | Full state | Changing environment parameters |
| 4 | Acrobot | Full state | -- |
| 5 | Acrobot | Partial state | -- |
| 6 | Acrobot | Full state | Two control inputs |
| 7 | Continuous stirred tank reactor | Partial state | Changing environment parameters |
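The experiment table above can also be written as a small lookup structure, which may be convenient when scripting runs over several experiments and seeds. The following is only an illustrative sketch: the names `ExperimentSetting`, `EXPERIMENTS` and `describe` are hypothetical and are not taken from src/run.py, whose actual interface should be checked in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExperimentSetting:
    """One row of the experiment table (hypothetical helper, not part of src/)."""
    environment: str
    observability: str
    other: Optional[str] = None


# Experiment ID -> setting, transcribed from the table above.
EXPERIMENTS = {
    1: ExperimentSetting("Harmonic oscillator", "Full state"),
    2: ExperimentSetting("Harmonic oscillator", "Partial state"),
    3: ExperimentSetting("Harmonic oscillator", "Full state",
                         "Changing environment parameters"),
    4: ExperimentSetting("Acrobot", "Full state"),
    5: ExperimentSetting("Acrobot", "Partial state"),
    6: ExperimentSetting("Acrobot", "Full state", "Two control inputs"),
    7: ExperimentSetting("Continuous stirred tank reactor", "Partial state",
                         "Changing environment parameters"),
}


def describe(experiment_id: int) -> str:
    """Return a one-line summary of the chosen experiment."""
    setting = EXPERIMENTS[experiment_id]
    details = [setting.observability]
    if setting.other is not None:
        details.append(setting.other)
    joined = "; ".join(details)
    return f"Experiment {experiment_id}: {setting.environment} ({joined})"


if __name__ == "__main__":
    for experiment_id in EXPERIMENTS:
        print(describe(experiment_id))
```

A lookup like this only mirrors the table; the hyperparameters and the methods themselves remain in src/run.py.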
To compare our method with time-lagged policies, run src/lag_experiment.py. This script focuses on the harmonic oscillator with partial state observability (experiment 2).
All data generated by our experiments is stored in subfolders of results/. src/generate_plots_ipynb provides the code for recreating the figures in the paper for each experiment, as well as the additional figures.
If you make use of this code in your research paper, please cite:
@article{de2024discovering,
title={Discovering Dynamic Symbolic Policies with Genetic Programming},
author={de Vries, Sigur and Keemink, Sander and van Gerven, Marcel},
journal={arXiv preprint arXiv:2406.02765},
year={2024}
}