All code was written by me.
This repository follows the general directory structure for a ROS package.
- checkpoints: Contains .csv files for various models. Each model has its own folder, which holds files for the policy, the Q-matrix, and the measured metrics.
- launch: Launch files to run combinations of nodes simultaneously. Requires ROS (Robot Operating System) in order to use roslaunch, the command-line tool that runs these files.
- models: .dae mesh files generated by Blender for the walls in the simulation. Two lengths are provided: a long wall 10 meters long, and a short wall 5 meters long. Unfortunately, Gazebo requires the folder name models in order to read the files, which creates some confusion.
- msg: .msg files representing custom messages that get passed between ROS nodes. Only one custom message is used in this project: StateReward, which contains a state ID, a reward, and a boolean representing whether the state is terminal.
- plots: Python scripts for plotting the metrics recorded in the checkpoints folder, as well as the .png plots themselves.
  - four_way_hist.py (39 lines): Plot histograms of the times recorded for four different models: heuristic, manual, qlearning_0.5_q_7610, and sarsa_0.5_q_7610.
  - policy_updates.py (29 lines): Plot a running total of the number of changes made to a specific policy. Takes two arguments to produce a plot comparing two models.
- src: The source code for the ROS nodes. This folder contains all of my code (except for the plot utils). Two nodes, action and environment, set up the framework for the algorithm we want to train/test; the third node, which provides the policy algorithm to map a state/reward to an action, is a choice between follow_policy, heuristic, and learning.
  - utils: Files containing helper functions and classes that don't run as standalone ROS nodes.
    - checkpoint.py (98 lines): Support for loading and saving metrics like goal time and success rate to the checkpoints folder.
    - corner.py (233 lines): Wall and Corner classes to represent a corner in Gazebo, as well as helper functions to calculate robot state relative to the geometry.
    - state.py (473 lines): Classes to represent robot state. Contains implementations for TargetSector, ObstacleSector, the combined state objects, and helper functions to convert them to and from integer IDs.
  - action.py (56 lines): A node to execute an action. Converts action IDs into Twist messages that tell the robot how to move.
  - environment.py (174 lines): A node to read in the simulation state at a fixed time step and create a discretized state object.
  - follow_policy.py (61 lines): A node that loads a policy from a checkpoint CSV file and always takes the specified action.
  - heuristic.py (65 lines): A node that follows the heuristic algorithm designed to serve as a baseline for the reinforcement learning model.
  - learning.py (243 lines): A node that loads or initializes a policy and uses a temporal difference RL algorithm to learn an optimal policy. Contains implementations of both Q-Learning and Sarsa.
- worlds: .world files representing a Gazebo simulation environment. A single world, turtlebot3_drift.world, is used for the cornering task; it loads in the walls and is reused each time the world is reset.
- CMakeLists.txt and package.xml: Files required for the ROS package and messages to compile.
- Python 3
- ROS Noetic
- NumPy
- Matplotlib (plots only)
Ubuntu 20.04 is recommended, as it is the platform on which the code was developed and tested.
After installing the dependencies and following the ROS Noetic setup instructions, clone this repository to ~/catkin_ws/src and compile in the ~/catkin_ws directory with catkin_make.
The system is executed by running two launch files: