# Smart Control

This repository contains the source code for the jointcontrol ROS package. The goal of this software is to train a Reinforcement Learning (RL) model that learns the dynamics of a closed-loop controller found in a robot. The trained model should then be able to set controller parameters based on the achieved plant output. The model is trained using a physics simulation and an approximated version of the cascade controller found in the robot's actuators. Physics-based training is done using [the Pybullet simulator](https://pybullet.org/wordpress/). The standardised interface between environment and RL agent defined by [OpenAI Gym](https://gym.openai.com/) is provided by a custom Gym environment, which connects the Gym and Bullet components to each other using the [Robot Operating System (ROS)](https://www.ros.org/).
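
To make the idea of "setting controller parameters based on the achieved plant output" more concrete, the following is a minimal, self-contained sketch and not the code of the jointcontrol package. It assumes a cascade made of an outer P position loop and an inner PI velocity loop, and it replaces the Bullet simulation with a toy rotational plant (inertia plus viscous damping). All names (`CascadeController`, `step_response`, `reward`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CascadeController:
    """Assumed structure: outer P position loop feeding an inner PI velocity loop."""
    kp_pos: float
    kp_vel: float
    ki_vel: float
    dt: float = 0.001
    integral: float = field(default=0.0, init=False)

    def update(self, pos_target, pos, vel):
        # Outer loop: position error becomes a velocity setpoint.
        vel_target = self.kp_pos * (pos_target - pos)
        # Inner loop: discretised PI controller on the velocity error.
        vel_error = vel_target - vel
        self.integral += vel_error * self.dt
        return self.kp_vel * vel_error + self.ki_vel * self.integral


def step_response(params, steps=2000, dt=0.001, inertia=0.05, damping=0.1):
    """Simulate a unit position step on a toy plant (stand-in for Bullet)."""
    ctrl = CascadeController(*params, dt=dt)
    pos, vel = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        torque = ctrl.update(1.0, pos, vel)
        acc = (torque - damping * vel) / inertia
        vel += acc * dt   # semi-implicit Euler integration
        pos += vel * dt
        trajectory.append(pos)
    return trajectory


def reward(trajectory, target=1.0):
    """Negative mean squared tracking error: higher is better."""
    return -sum((target - p) ** 2 for p in trajectory) / len(trajectory)


# An agent's action would be the parameter triple; the reward scores the response.
tuned = step_response((20.0, 1.0, 5.0))
weak = step_response((1.0, 0.1, 0.0))
print(reward(tuned), reward(weak))
```

In this sketch, the parameter triple plays the role of the agent's action and the recorded trajectory plays the role of the observation. The reward function actually used for training may differ.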

The application is built so that an arbitrary number of instances of the custom Gym environment can connect to a shared simulation in Bullet. This approach allows multiple agents to be trained simultaneously in multiple environments, possibly with different physics properties. To keep the simulation accurate, each time-discrete simulation step is performed in sync across all active Gym environments.
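
The synchronisation described above can be sketched as a barrier: the shared simulation only advances once every registered environment has submitted its command for the current step. The class below is a hypothetical, pure-Python illustration. It leaves out the ROS transport and the Bullet calls entirely, and names such as `SharedSimulation` and `request_step` are not taken from the jointcontrol package.

```python
class SharedSimulation:
    """Hypothetical stand-in for the shared, synchronised Bullet simulation."""

    def __init__(self):
        self.registered = set()   # ids of all active environments
        self.pending = {}         # commands received for the current step
        self.applied = {}         # commands used in the most recent step
        self.step_count = 0

    def register(self, env_id):
        self.registered.add(env_id)

    def request_step(self, env_id, command):
        """Store a command; advance only when all environments have reported."""
        if env_id not in self.registered:
            raise KeyError(f"unknown environment: {env_id}")
        self.pending[env_id] = command
        if set(self.pending) == self.registered:
            self._advance()
            return True
        return False

    def _advance(self):
        # In a real setup, the physics engine would be stepped here.
        self.applied = dict(self.pending)
        self.pending.clear()
        self.step_count += 1


sim = SharedSimulation()
sim.register("env_a")
sim.register("env_b")
sim.request_step("env_a", 0.2)   # waits: env_b has not reported yet
sim.request_step("env_b", -0.1)  # all environments reported, simulation advances
print(sim.step_count, sim.applied)
```

The point of this design is that no environment can run ahead of the others, so every agent observes the same simulated time regardless of how fast it computes its actions.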

## Used Software Packages

Figure 1 shows the application's deployment structure and the openly available software packages it uses. The implementation is mainly done in [Python 3](https://www.python.org/downloads/).

<img src="images/AppDeploymentStructure.jpg" width="800"/>

_Figure 1: The system is implemented using the Pybullet physics simulator, OpenAI Gym and the Robot Operating System (ROS). It allows for simultaneous training of an arbitrary number of agents by instantiating multiple Gym environments that all connect to one shared and synchronised physics simulation._

<style>
  table {margin-left: 0 !important;}
</style>

Table 1 provides links to each of the packages used and describes which functionality of each package was incorporated into the application.

| Package Name  | Description     |
| ------------- |:--------------- |
| [Pybullet](https://pybullet.org/wordpress/) | Python-accessible version of the Bullet physics simulator |
| [Robot Operating System (ROS)](https://www.ros.org/) | Framework for implementing robot systems |
| [OpenAI Gym](https://gym.openai.com/) | Reinforcement Learning framework with standardised interface between agent and environment |
| [Stable Baselines](https://github.com/hill-a/stable-baselines) | Implementation of common RL algorithms for OpenAI Gym |
| [Tensorflow](https://www.tensorflow.org/) | Machine Learning library for Python |
| [Numpy](https://numpy.org/) | Python module for data management |
| [Matplotlib](https://matplotlib.org/) | Python module for data visualisation |
| [JupyterLab](https://jupyterlab.readthedocs.io/en/stable) | Python notebook IDE for interactive programming and data visualisation |
 
_Table 1: Multiple openly available software packages were used to implement the application._

## Code Reference

While the algorithms are documented using [JupyterLab](https://jupyter.org/), the core ROS package is documented using [rosdoc_lite](http://wiki.ros.org/rosdoc_lite) and [Doxygen](https://www.doxygen.nl/index.html). This section provides direct links to the documentation for each component. Links tagged with *[Notebook]* point to HTML versions of the interactive notebooks used for model training and testing. They are provided so that the notebooks can be viewed in a browser without installing a notebook viewer.

- Autogenerated documentation of the jointcontrol ROS package [**link**](./htmldoc/jointcontrol/html/index.html)
- Step response testing of each discretised control block [**link [Notebook]**](./htmldoc/testControllerBlocks.html) 
- **Deep Deterministic Policy Gradients** agent for controller parameter estimation [**link [Notebook]**](./htmldoc/DDPG.html)
- **Deep Q-Network** agent for controller parameter estimation [**link [Notebook]**](./htmldoc/DQN.html)
- **Proximal Policy Optimisation** agent for controller parameter estimation [**link [Notebook]**](./htmldoc/PPO1.html)

## Running the Application

The application itself is deployed using [Docker](https://www.docker.com/). For installation of Docker components and drivers, please refer to the documentation for the [used Docker environment](https://github.com/SimonSchwaiger/ros-ml-container).

These steps are required to get the application up and running as quickly as possible on Linux:
- [Install Docker](https://docs.docker.com/engine/install/ubuntu/)
- Run **sudo groupadd docker && sudo usermod -aG docker $USER** to allow Docker to run without sudo, then log out and back in for the changes to take effect
- Run **GRAPHICS_PLATFORM=opensource ./buildandrun.sh** in order to start the application in open-source mode (if you have proprietary Nvidia drivers installed, please use GRAPHICS_PLATFORM=nvidia instead)
- The application will be built and started inside of an Ubuntu 20.04 container. The build can take several minutes depending on the host hardware. Once up and running, the application can be closed using Ctrl+D

Each algorithm can be run in an interactive JupyterLab session. When the *buildandrun.sh* script is run, JupyterLab is hosted inside the container and the necessary ports are exposed to the host machine, so that JupyterLab can be accessed [**here**](http://127.0.0.1:8888/) using a browser of choice on the host machine. On first start, JupyterLab will ask for a password; enter *smart_control* and the session will start.

In the session, navigate to the **Notebooks** folder using the file browser on the left side of the page. This folder contains interactive notebooks that allow agents to be trained with each implemented algorithm. During training, a [Tensorboard](https://www.tensorflow.org/tensorboard) visualisation can be started and accessed [**here**](http://0.0.0.0:6006/#scalars) to monitor training progress.

<center>____________________________________________________________________________________________________________________________</center>