Model predictive control-based value estimation for efficient reinforcement learning
This repository contains the implementation of model predictive control-based value estimation, evaluated in classic simulation environments and a custom-designed UAV dynamic obstacle avoidance environment. The classic environments include Cliff Walking (CW), CartPole (CP), Pendulum (PD), and Humanoid (HO).
How to use the code: open each directory; the instructions for running each simulation are given in its respective readme.md. Dependencies: the code is written in Python, and we recommend Python 3.7. The required packages are listed in the file environment.yml.
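As a sketch, assuming Anaconda or Miniconda is installed, the environment can typically be created from the provided environment.yml as follows (the environment name is defined inside that file and may differ from the placeholder below):

```shell
# Create a conda environment from the repository's environment.yml
conda env create -f environment.yml

# Activate it; replace <env-name> with the "name:" field in environment.yml
conda activate <env-name>
```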
Citation:
Q. Wu, K. Liu and L. Chen, "Model predictive control–based value estimation for efficient reinforcement learning," in IEEE Intelligent Systems, doi: 10.1109/MIS.2024.3386204 (https://ieeexplore.ieee.org/document/10494864)
Bibtex:
@ARTICLE{10494864,
author={Wu, Qizhen and Liu, Kexin and Chen, Lei},
journal={IEEE Intelligent Systems},
title={Model predictive control–based value estimation for efficient reinforcement learning},
year={2024},
volume={},
number={},
pages={1-10},
keywords={Predictive models;Neural networks;Trajectory;Data models;Computational modeling;Training;Optimization},
doi={10.1109/MIS.2024.3386204}}
If you have any questions or concerns, please raise an issue or email: wuqzh7@buaa.edu.cn