Simulation Architectures for Reinforcement Learning applied to Robotics

University of Manchester

2022

Abstract

There is no doubt that we are living in the age of data. In the last two decades, the scientific community has produced systems with superhuman capabilities by combining modern hardware advancements, novel learning algorithms and architectures, and advances in software frameworks. Such progress has revolutionised domains like computer vision and language processing, delivering performance previously out of reach. One might expect these results to transfer straightforwardly to other fields like robotics, but domain-specific characteristics and limitations hinder the potential of these learning methods. Generating enough data from real-world robots is often too expensive, or not even possible at the desired scale. Data sampled from robots is sequential in nature, and not all families of learning algorithms are effective in this context. Furthermore, most algorithms that excel in this sequential setting, such as those belonging to the Reinforcement Learning (RL) family, learn by trial and error, which could lead to trajectories that damage either the robots or their surroundings.

In this thesis, we attempt to answer the question: "How can modern technology help us generate synthetic data for humanoid robot planning and control?"

Motivated by the advancements in hardware accelerators that are revolutionising scientific computing, we limit our analysis to the simulation realm. In this context, we first introduce a software architecture for structuring robot learning environments that can be adopted to train and run RL policies regardless of whether the setting is simulated or real-world. Using its underlying simulation technology and a scheme based on reward shaping, we validate the architecture by training with RL a push-recovery controller capable of synthesising whole-body references for the humanoid robot iCub. Then, to overcome the bottlenecks caused by the poor sampling performance of traditional rigid-body simulators, we present a new physics engine in reduced coordinates that can simulate robots interacting with a ground surface on hardware accelerators like GPUs and TPUs. To this end, we present a contact-aware continuous state-space representation describing the dynamical evolution of floating-base robots that can be numerically integrated for simulation purposes. We adopt the new general-purpose Gazebo Sim simulator as our first solution to sample synthetic data, and exploit JAX and its hardware support to scale the sampling performance for highly parallel problems. Furthermore, we implement and benchmark the common Rigid Body Dynamics Algorithms that are part of the proposed physics engine on hardware accelerators and assess their scalability properties on different GPUs. These pieces of technology help to lower the computational barriers that are still among the main bottlenecks for obtaining intelligent agents, democratising the applicability of this family of learning-based methods.

Citing

@phdthesis{ferigo_phd_thesis_2022,
  title = {Simulation Architectures for Reinforcement Learning applied to Robotics},
  author = {Ferigo, Diego},
  school = {University of Manchester},
  type = {PhD Thesis},
  month = {July},
  year = {2022},
  url = {https://github.com/diegoferigo/phd-thesis/releases/latest/download/thesis.pdf},
}
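
As a minimal sketch of how the entry above could be used, assuming it has been saved in a bibliography file named references.bib (a hypothetical name chosen for illustration) and that the standard plain bibliography style is acceptable:

```latex
\documentclass{article}
\begin{document}

% Cite the thesis through the key defined in the BibTeX entry above.
The simulation architecture is described in~\cite{ferigo_phd_thesis_2022}.

% references.bib is a hypothetical file containing the entry;
% plain can be swapped for any other bibliography style.
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```

Only the citation key ferigo_phd_thesis_2022 comes from the entry itself; the document class, file name, and style are placeholders to adapt to your own project.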

Contributing

If you have a question or want to report an error, please open an issue.

If you want to fix the document yourself, please open a PR against the main branch (see branching details below). The Continuous Integration pipeline implemented in this repository will compile the LaTeX sources with your contribution and upload the PDF document as a workflow artifact for inspection.

Branching

This repository has two branches:

overleaf is the branch connected to my personal Overleaf project.
main is the branch associated with external contributions and releases.

The Overleaf Git system does not currently support branching. For this reason, I cannot select main as the default branch of the repository, even though it serves as the primary branch for contributions and releases.

If you want to contribute a new PR, please target the main branch.

