Reinforcement-Learning-with-CLF-s

This project contains code for several RL implementations in which a Control Lyapunov Function (CLF) is added to the reward function to accelerate training. It also contains MATLAB code for computing CLFs using Hamilton-Jacobi reachability analysis. Read the full paper here.

Below are example trajectories from the RL agents trained with our method.

1. Dubins Car Trajectory
2. Lunar Lander
3. Drone Landing (left: xz plane; right: y)

Paper Abstract

Recent methods using Reinforcement Learning (RL) have proven to be successful for training intelligent agents in unknown environments. However, RL has not been applied widely in real-world robotics scenarios. This is because current state-of-the-art RL methods require large amounts of data to learn a specific task, leading to unreasonable costs when deploying the agent to collect data in real-world applications. In this paper, we build from existing work that reshapes the reward function in RL by introducing a Control Lyapunov Function (CLF), which is demonstrated to reduce the sample complexity. Still, this formulation requires knowing a CLF of the system, but due to the lack of a general method, it is often a challenge to identify a suitable CLF. Existing work can compute low-dimensional CLFs via a Hamilton-Jacobi reachability procedure. However, this class of methods becomes intractable on high-dimensional systems, a problem that we address by using a system decomposition technique to compute what we call Decomposed Control Lyapunov Functions (DCLFs). We use the computed DCLF for reward shaping, which we show improves RL performance. Through multiple examples, we demonstrate the effectiveness of this approach, where our method finds a policy to successfully land a quadcopter in less than half the amount of real-world data required by the state-of-the-art Soft Actor-Critic algorithm.
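
To make the reward-shaping idea concrete, here is a minimal sketch of one common way to fold a CLF into a reward: a potential-based shaping term with potential -V. It is an illustration only and not necessarily the exact formulation used in the paper. The quadratic CLF, the function names, and the `gamma` and `weight` parameters are all assumptions made for this example.

```python
import numpy as np

def quadratic_clf(state, P):
    """Hypothetical CLF candidate V(x) = x^T P x, with P positive definite."""
    state = np.asarray(state, dtype=float)
    return float(state @ P @ state)

def shaped_reward(reward, state, next_state, clf, gamma=0.99, weight=1.0):
    """Add a potential-based shaping term with potential -V.

    The bonus is positive when the discounted CLF value decreases,
    i.e. when the transition moves the system toward the goal.
    """
    bonus = clf(state) - gamma * clf(next_state)
    return reward + weight * bonus

# Toy 2-D example with P = I, so V is the squared distance to the origin.
P = np.eye(2)
clf = lambda x: quadratic_clf(x, P)

toward = shaped_reward(0.0, [1.0, 1.0], [0.5, 0.5], clf)  # step toward origin
away = shaped_reward(0.0, [1.0, 1.0], [1.5, 1.5], clf)    # step away from origin
print(toward, away)
```

In this toy setup a step toward the origin earns a positive bonus and a step away earns a negative one, which is the signal that can help an agent learn from fewer samples.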

Dependencies

gym - 0.18.0
numpy - 1.19.5
matplotlib - 3.3.4
pytorch - 1.8.1

Implementation

Constructing CLF

src/Reachability_CLVF contains the MATLAB code for computing Lyapunov functions using reachability analysis.
ToolboxLS contains the code for using level set methods to solve Hamilton-Jacobi partial differential equations. For more information on how to use this toolbox, see here.
Before using the toolbox, open src/Reachability_CLVF/add_path_to_tollbox.m and update the paths to match your machine. Then run the script.
helperOC-master has integrated functions to facilitate using the Toolbox. See here for more information on how to use helperOC.
SystemDecomposition contains examples of using our method to compute DCLFs via system decomposition. Run /src/Reachability_CLVF/System Decomposition/Dubins Car CLVF/main_dubins.m to visualize the CLF for a Dubins Car example. This also creates the file V_1.mat, which is used in the subsequent RL training steps.
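
On the Python side, a value function computed on a grid has to be queried at arbitrary states during training. The sketch below shows one simple way to do that with a nearest-neighbor lookup on a uniform grid. The layout of V_1.mat (variable names, grid bounds, resolution) is not described here, so the example uses a synthetic array in its place; `make_grid_lookup` and the bounds are hypothetical. Loading the real file would need an extra step, for example `scipy.io.loadmat`, which is not among the listed dependencies.

```python
import numpy as np

def make_grid_lookup(values, lower, upper):
    """Return a function mapping a state to the nearest grid value.

    values: N-D array of CLF values on a uniform grid (assumed layout).
    lower, upper: grid bounds along each dimension (assumed known).
    """
    values = np.asarray(values, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    shape = np.array(values.shape)

    def lookup(state):
        state = np.asarray(state, dtype=float)
        # Map the state to fractional grid coordinates, then round.
        frac = (state - lower) / (upper - lower)
        idx = np.rint(frac * (shape - 1)).astype(int)
        # States outside the grid are clamped to the nearest boundary cell.
        idx = np.clip(idx, 0, shape - 1)
        return float(values[tuple(idx)])

    return lookup

# Synthetic stand-in for a Dubins Car value function over (x, y, theta).
xs = np.linspace(-2.0, 2.0, 41)
ys = np.linspace(-2.0, 2.0, 41)
thetas = np.linspace(-np.pi, np.pi, 21)
X, Y, TH = np.meshgrid(xs, ys, thetas, indexing="ij")
V = X**2 + Y**2

clf = make_grid_lookup(V, [-2.0, -2.0, -np.pi], [2.0, 2.0, np.pi])
v_inside = clf([1.0, 0.0, 0.0])    # a state that lies on a grid node
v_outside = clf([10.0, 0.0, 0.0])  # clamped to the x = 2 boundary
print(v_inside, v_outside)
```

Nearest-neighbor lookup keeps the example short. Multilinear interpolation would give a smoother shaping signal at the cost of a little more code.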