Designing a control system to exploit model-free deep reinforcement learning algorithms to solve a real-world autonomous driving task of a small robot.
- Table of Contents
- Introduction
- Abstract
- Showcase of some testing episodes
- Repository Structure
- Contributions and License
- References
The official repository of my master's thesis in Computer Engineering at Politecnico di Torino.
The project was developed at Eurecom (Biot, France) with prof. Pietro Michiardi (Eurecom) and prof. Elena Baralis (Politecnico di Torino).
- Piero Macaluso - pieromacaluso
- Prof. Pietro Michiardi - michiard
- Prof. Elena Baralis - elena.baralis
Because of their potential to thoroughly change mobility and transport, autonomous systems and self-driving vehicles are attracting much attention from both the research community and industry. Recent work has demonstrated that it is possible to rely on a comprehensive understanding of the immediate environment while following simple high-level directions, obtaining a more scalable approach that can make autonomous driving a ubiquitous technology. However, to date, most methods concentrate on deterministic control optimisation algorithms to select the right action, while deep learning and machine learning are used only for object detection and recognition.
Recently, we have witnessed a remarkable increase in interest in Reinforcement Learning (RL). It is a machine learning field focused on solving Markov Decision Processes (MDPs), where an agent learns to make decisions by mapping situations to actions according to the information it gathers from the surrounding environment and the reward it receives, which it tries to maximise. As researchers discovered, it can be surprisingly useful for solving tasks in simulated environments such as games, and it has shown encouraging performance in tasks with robotic manipulators. Furthermore, the great fervour produced by the widespread exploitation of deep learning opened the doors to function approximation with convolutional neural networks, developing what is nowadays known as deep reinforcement learning.
In this thesis, we argue that the generality of reinforcement learning makes it a useful framework for autonomous driving, one that injects artificial intelligence not only into the detection component but also into the decision-making one. Most reinforcement learning projects focus on simulated environments. However, a more challenging approach consists of applying this type of algorithm in the real world. For this reason, we designed and implemented a control system for Cozmo, a small toy robot developed by Anki, exploiting the Cozmo SDK, PyTorch and OpenAI Gym to build a standardised environment in which to apply any reinforcement learning algorithm: this is the first contribution of our thesis.
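As a rough illustration of what a "standardised environment" means here, the sketch below mirrors the classic Gym `reset`/`step` convention, where `step` returns an observation, a reward, a done flag and an info dictionary. It is only a toy stand-in under stated assumptions: `ToyTrackEnv`, `run_episode`, the track half-width, the kinematics and the distance-based reward are all hypothetical names and choices made for this example. It uses neither the Cozmo SDK nor Gym itself and is not the environment implemented in this repository.

```python
import math
from typing import Callable, Dict, Tuple

Observation = Tuple[float, float]  # (lateral offset in metres, heading in radians)
Action = Tuple[float, float]       # (forward speed in m/s, steering rate in rad/s)


class ToyTrackEnv:
    """Hypothetical toy environment exposing a Gym-like reset/step interface."""

    def __init__(self, half_width: float = 0.05, max_steps: int = 200, dt: float = 0.1):
        self.half_width = half_width  # assumed distance from centre line to track edge
        self.max_steps = max_steps    # assumed episode length limit
        self.dt = dt                  # assumed control period in seconds
        self.offset = 0.0
        self.heading = 0.0
        self.steps = 0

    def reset(self) -> Observation:
        """Put the robot back on the centre line and return the first observation."""
        self.offset, self.heading, self.steps = 0.0, 0.0, 0
        return (self.offset, self.heading)

    def step(self, action: Action) -> Tuple[Observation, float, bool, Dict[str, bool]]:
        """Apply one action and return (observation, reward, done, info)."""
        speed, steering = action
        self.heading += steering * self.dt
        self.offset += speed * math.sin(self.heading) * self.dt
        self.steps += 1

        # Reward is the distance covered along the track during this step.
        progress = speed * math.cos(self.heading) * self.dt
        off_track = abs(self.offset) > self.half_width
        done = off_track or self.steps >= self.max_steps
        reward = 0.0 if off_track else progress
        return (self.offset, self.heading), reward, done, {"off_track": off_track}


def run_episode(env: ToyTrackEnv, policy: Callable[[Observation], Action]) -> float:
    """Run one episode and return the accumulated reward (distance in metres)."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


env = ToyTrackEnv()
straight_return = run_episode(env, lambda obs: (0.1, 0.0))  # drive straight ahead
turning_return = run_episode(env, lambda obs: (0.1, 1.0))   # keep turning, leaves the track
print(straight_return, turning_return)
```

Because any agent only needs `reset` and `step`, the same episode loop works whether the policy is a hand-written rule, as above, or a learned network.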
Furthermore, we designed a circuit on which we could carry out experiments in the real world, the second contribution of our work. We started from a simplified environment in which to test algorithm functionalities, to motivate and discuss our implementation choices. We then implemented our version of Soft Actor-Critic (SAC), a model-free reinforcement learning algorithm suitable for real-world experiments, to solve the specific self-driving task with Cozmo. In the testing phase, the agent reached a maximum distance of more than 3.5 meters, which equals more than one complete tour of the track. Despite this significant result, it was not able to learn how to drive safely and stably. Thus, we focused on analysing the strengths and weaknesses of this approach, outlining what the next steps could be to make this cutting-edge technology concrete and efficient.
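For readers unfamiliar with SAC, the maximum-entropy objective it optimises can be written as below. The notation follows the SAC papers listed under References ([p4], [p5]) rather than the thesis itself: $\rho_\pi$ is the state-action distribution induced by the policy $\pi$, $\mathcal{H}$ is the entropy, and $\alpha$ is the temperature that trades off reward against exploration.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\!\left(\pi(\cdot \mid s_t)\right) \right]
```

Setting $\alpha = 0$ recovers the standard expected-return objective.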
Document | Status | Project Folder | |
---|---|---|---|
Master Thesis | WIP | π | π |
Summary | WIP | π | π |
Presentation | WIP | π | π |
- April 2019: Deep Deterministic Policy Gradient (DDPG) - REPORT
- May 2019: Soft Actor-Critic (SAC) - REPORT
- September 2019: Experiment Flow - REPORT
It is possible to fork the project and create your own, following the rules given by the LICENSE.
Please cite using the following BibTeX entry:
@mastersthesis{macaluso2020deep,
author = {Piero Macaluso},
title = {{Deep Reinforcement Learning for Autonomous Systems}},
school = {{Politecnico di Torino}, {Eurecom}},
year = {2020}
}
If you want to contribute or request a new feature, you can do so via the Issues section.
If you need any help setting up the project or want more information about it, feel free to reach us at @PieroMacaluso
on Telegram and ask away.
[v1] David Silver's Reinforcement Learning Course
[v2] Reinforcement Learning Udacity Course
[b1] Reinforcement Learning: An Introduction (2018) by Richard S. Sutton and Andrew G. Barto
[b2] Deep Reinforcement Learning Hands-On (2018) by Maxim Lapan
[p1] Learning to Drive in a Day (Sep 2018) by Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley & Amar Shah
[p2] Continuous Control with Deep Reinforcement Learning (Feb 2016) by Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver & Daan Wierstra
[p3] Deterministic Policy Gradient Algorithms (2014) by David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra & Martin Riedmiller
[p4] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (2018) by Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel & Sergey Levine
[p5] Soft Actor-Critic Algorithms and Applications (2018) by Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel & Sergey Levine