Efficient Reinforcement Learning for Jumping Monopods

Riccardo Bussola, Michele Focchi, Andrea Del Prete, Daniele Fontanelli, Luigi Palopoli

Corresponding author: Riccardo Bussola

This repository is a reduced version of Locosim (preprint), intended for reproducing the simulations and experiments presented in the manuscript.

In this work, we consider the complex problem of making a monopod perform an omni-directional jump on uneven terrain. We guide the learning process within an RL framework by injecting physical knowledge. This expedient brings widespread benefits, such as a drastic reduction of the learning time and the ability to learn and compensate for possible errors in the low-level controller executing the motion.

Check out our YouTube video.

Installing Locosim

Locosim is composed of a roscontrol node called ros_impedance_controller (written in C++) that interfaces the Python ROS node (where the controller is written) to a Gazebo simulator.


Locosim is compatible with Ubuntu 18/20. The installation instructions have been generalized accordingly. You need to replace a few strings with the appropriate values for your operating system, as follows:

Ubuntu 18: ROS_VERSION = melodic
Ubuntu 20: ROS_VERSION = noetic
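
The placeholder substitution can also be scripted. The following is a minimal sketch and not part of Locosim: the helper name fill_placeholders is hypothetical, and the value used in the usage line is simply the Ubuntu 20 entry listed above.

```python
def fill_placeholders(command: str, values: dict) -> str:
    """Replace each placeholder name (e.g. ROS_VERSION) with its value."""
    # Longer names are replaced first, so that a placeholder contained in
    # another one (PYTHON_VERSION inside PINOCCHIO_PYTHON_VERSION) cannot
    # corrupt it when both are present in `values`.
    for name in sorted(values, key=len, reverse=True):
        command = command.replace(name, values[name])
    return command


template = "sudo apt-get install ros-ROS_VERSION-desktop-full"
print(fill_placeholders(template, {"ROS_VERSION": "noetic"}))
```

Running each install line through such a helper before pasting it into a terminal avoids leaving a literal ROS_VERSION in a package name.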

Install ROS

Set up your sources list:

sudo sh -c 'echo "deb $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'

Set up your keys:

curl -sSL '' | sudo apt-key add -

Install the ROS main distro:

sudo apt-get install ros-ROS_VERSION-desktop-full

Install the ROS packages:

sudo apt-get install ros-ROS_VERSION-urdfdom-py
sudo apt-get install ros-ROS_VERSION-srdfdom
sudo apt-get install ros-ROS_VERSION-joint-state-publisher
sudo apt-get install ros-ROS_VERSION-joint-state-publisher-gui
sudo apt-get install ros-ROS_VERSION-joint-state-controller 
sudo apt-get install ros-ROS_VERSION-gazebo-msgs
sudo apt-get install ros-ROS_VERSION-control-toolbox
sudo apt-get install ros-ROS_VERSION-gazebo-ros
sudo apt-get install ros-ROS_VERSION-controller-manager
sudo apt install ros-ROS_VERSION-joint-trajectory-controller

Pinocchio stuff

Add robotpkg as source repository to apt:

sudo sh -c "echo 'deb [arch=amd64] $(lsb_release -sc) robotpkg' >> /etc/apt/sources.list.d/robotpkg.list"
sudo sh -c "echo 'deb [arch=amd64] $(lsb_release -sc) robotpkg' >> /etc/apt/sources.list.d/robotpkg.list"

Register the authentication certificate of robotpkg:

sudo apt install -qqy lsb-release gnupg2 curl
curl | sudo apt-key add -

You need to run apt update at least once to fetch the package descriptions:

sudo apt-get update

Now you can install Pinocchio and the other dependencies:

sudo apt install robotpkg-PINOCCHIO_PYTHON_VERSION-crocoddyl
sudo apt install robotpkg-PINOCCHIO_PYTHON_VERSION-eigenpy	
sudo apt install robotpkg-PINOCCHIO_PYTHON_VERSION-pinocchio
sudo apt-get install robotpkg-PINOCCHIO_PYTHON_VERSION-quadprog  

NOTE: If you have issues installing the robotpkg libraries, you can try installing them through ROS instead:

sudo apt-get install ros-ROS_VERSION-LIBNAME


Install the Python dependencies:

sudo apt-get install python3-scipy
sudo apt-get install python3-matplotlib
sudo apt-get install python3-termcolor
sudo apt install python3-pip
sudo pip install numpy==1.17.4
sudo pip install joblib==1.2.0
sudo pip install torchvision==0.15.1
sudo pip install tensorboard==2.11.0
sudo pip install torch==2.0.0

Download code and setup ROS workspace

Now that you have installed all the dependencies, you are ready to get the code, but first you need to create a ROS workspace to put the code in:

mkdir -p ~/ros_ws/src
cd ~/ros_ws/src

Now you need to run the following line manually (later you will see that it is done automatically in the .bashrc):

source /opt/ros/ROS_VERSION/setup.bash
cd ~/ros_ws/src

Now you can clone the repository inside the src folder of the ROS workspace you just created (LOCOSIM_DIR is set to $HOME/ros_ws/src/rl_pipeline below):

git clone

Now compile the workspace (this step is not needed again if you only work in Python, unless you modify or create ROS packages):

cd ~/ros_ws/
catkin_make install

The install step installs the ROS packages inside the "$HOME/ros_ws/install" folder rather than the devel folder. This folder will be added to the ROS_PACKAGE_PATH instead of the devel one.

Finally, run the following (you should do it any time you add a new ROS package):

rospack profile

There are some additional utilities that I strongly suggest installing. You can find the list here.

Configure environment variables

gedit ~/.bashrc

Copy the following lines at the end of the .bashrc, remembering to replace the placeholders ROS_VERSION and PYTHON_VERSION with the appropriate values as explained in the software versions section:

source /opt/ros/ROS_VERSION/setup.bash
source $HOME/ros_ws/install/setup.bash
export PATH=/opt/openrobots/bin:$PATH
export LOCOSIM_DIR=$HOME/ros_ws/src/rl_pipeline
export PYTHONPATH=/opt/openrobots/lib/pythonPYTHON_VERSION/site-packages:$PYTHONPATH
export PYSOLO_FROSCIA=$LOCOSIM_DIR/fddp_optimization
export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:/opt/openrobots/share/
export PKG_CONFIG_PATH=/opt/openrobots/lib/pkgconfig:$PKG_CONFIG_PATH
export LD_LIBRARY_PATH=/opt/openrobots/lib:$LD_LIBRARY_PATH

The .bashrc is a file that is automatically sourced whenever you open a new terminal.
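
After opening a new terminal, a quick check can confirm that the variables were picked up. This is a minimal sketch: the list of names is a subset taken from the .bashrc lines above, and the helper missing_vars is hypothetical, not part of Locosim.

```python
import os

# Names taken from the .bashrc lines above (a subset, chosen for illustration).
REQUIRED_VARS = ("LOCOSIM_DIR", "PYSOLO_FROSCIA", "PYTHONPATH", "ROS_PACKAGE_PATH")


def missing_vars(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the variables are defined; it does not verify that the paths they point to exist.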

Compile/Install the code

Whenever you modify any of the ROS packages (e.g. the ones that contain the xacro files inside the robot_description folder), you need to install them to be sure they have been updated in the ROS install folder.

cd ~/ros_ws/ 
 catkin_make install 


The first time you compile the code, the install folder does not exist yet. It is therefore not added to the PYTHONPATH by the command source $HOME/ros_ws/install/setup.bash, and you won't be able to import the package ros_impedance_controller. For this reason, after the first compilation only, source the .bashrc again:

source ~/.bashrc
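
To confirm that the package is now visible to Python, a check along these lines can be used. It is a minimal sketch based on the standard-library importlib; the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate a top-level module on sys.path."""
    return importlib.util.find_spec(module_name) is not None


if is_importable("ros_impedance_controller"):
    print("ros_impedance_controller found")
else:
    print("ros_impedance_controller not found: source your .bashrc again and retry")
```

The check only locates the package without importing it, so it does not start any ROS machinery.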

Code usage

The repository contains the implementation of the three approaches presented in the paper: Guided reinforcement learning (GRL), End-to-end reinforcement learning (E2E), and FDDP-based nonlinear trajectory optimization. Both the GRL and the E2E solutions can execute the RL agent in three different modes:

  • train: to start the policy training process
  • test: to test the learned policy in the pre-defined test region
  • inference: the policy is used without performing the training process


Policy weights

To try the learned policies, download the network weights and decompress them in the base_controllers/jumpleg_rl folder. The GRL policy weights are loaded from the runs folder, while the ones for the E2E policy are loaded from the runs_joints folder.

You can download the weights directly from the latest release of this repository.

Configuring the agent

Inside the base_controllers folder are the two files responsible for executing the GRL and E2E implementations, respectively.

Inside the constructor of the JumpLegController class, there are several configuration parameters:

class JumpLegController(BaseControllerFixed):

    def __init__(self, robot_name="jumpleg"):
        self.agentMode = 'inference'
        self.restoreTrain = False
        self.gui = False
        self.model_name = 'latest'
        self.EXTERNAL_FORCE = False
        self.DEBUG = False


  • agentMode(str): RL agent mode, one of {"train", "test", "inference"}. Set it to 'train' to train the NN; the NN weights will be updated and stored in a local folder (robot_control/jumpleg_rl/runs). Set it to 'test' to evaluate the NN on the test set (test region), or to 'inference' to evaluate it on random targets.
  • restoreTrain(bool): Restore training from a saved run
  • gui(bool): Enable/Disable the launch of the Gazebo view
  • model_name(str): Name of the model weights to load in the RL agent
    ! ATTENTION ! the weights have to be in the base_controllers/jumpleg_rl/runs folder
  • DEBUG(bool): Enable/Disable the plotting of robot's telemetry
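
Since agentMode is a free-form string, a typo would only surface at run time. A small guard like the following can catch it early. This is a sketch under the assumption that only the three modes listed above are valid; check_agent_mode is a hypothetical helper and is not part of the controller.

```python
# The three modes documented for agentMode.
VALID_MODES = ("train", "test", "inference")


def check_agent_mode(agent_mode: str) -> str:
    """Return agent_mode unchanged, or raise if it is not a known mode."""
    if agent_mode not in VALID_MODES:
        raise ValueError(
            f"agentMode must be one of {VALID_MODES}, got {agent_mode!r}"
        )
    return agent_mode


print(check_agent_mode("inference"))
```

Such a check could be called on the value before assigning it to self.agentMode in the constructor.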

Running the agent

Once configured, you can run the agent directly from your IDE or by executing one of the following commands:

python3 -i $LOCOSIM_DIR/robot_control/base_controllers/ 
python3 -i $LOCOSIM_DIR/robot_control/base_controllers/ 

Monitoring the execution

Each time the agent is executed, the corresponding agent mode folder is created/updated inside the runs folder. Inside each folder, there is a logs folder where Tensorboard event files are saved.

├── base_controllers
│   ├── ...
└── jumpleg_rl
    ├── runs_joints
    │   │── train
    │   │   ├── logs
    │   │   └── partial_weights
    │   └── inference
    │       ├── logs
    ├── runs
    │   │── train
    │   │   ├── logs
    │   │   └── partial_weights
    ├── ...

By launching Tensorboard in the desired folder, you can visualize telemetry from the experiment execution:

tensorboard --logdir runs_joints/train/logs/
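
The log folder for any combination of approach and mode follows the same pattern. The sketch below builds it from the folder names given in the "Policy weights" section (runs for GRL, runs_joints for E2E); the helper logs_dir is hypothetical, and the path is relative to the jumpleg_rl folder.

```python
from pathlib import PurePosixPath

# Runs folder per approach, as stated in the "Policy weights" section.
RUNS_FOLDER = {"GRL": "runs", "E2E": "runs_joints"}


def logs_dir(approach: str, mode: str) -> PurePosixPath:
    """Build <runs folder>/<mode>/logs relative to the jumpleg_rl folder."""
    return PurePosixPath(RUNS_FOLDER[approach]) / mode / "logs"


print(f"tensorboard --logdir {logs_dir('E2E', 'train')}")
```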


To run the FDDP optimization, run the script:

python3 -i $LOCOSIM_DIR/fddp_optimization/scripts/ 

This will solve the optimal control problem for all the points in test_points.txt and generate a file test_optim.csv that contains: target, error, landing position, computation time.
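
The resulting file can then be summarized with a few lines of Python. This is only a sketch: the exact layout of test_optim.csv is not documented here, so the header row and the column names error and computation_time are assumptions to adapt to the real file, and the sample rows are made up to show the call.

```python
import csv
import io
from statistics import mean


def summarize(csv_file, error_col="error", time_col="computation_time"):
    """Return (mean error, mean computation time) over all rows.

    The column names are assumptions; pass the real ones if they differ.
    """
    rows = list(csv.DictReader(csv_file))
    return (
        mean(float(row[error_col]) for row in rows),
        mean(float(row[time_col]) for row in rows),
    )


# Made-up rows for illustration only; real data comes from test_optim.csv.
sample = io.StringIO("error,computation_time\n0.5,2.0\n1.5,4.0\n")
print(summarize(sample))
```

With the real file, the io.StringIO object would be replaced by open("test_optim.csv", newline="").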


You can find all the plots shown in the video and the paper in the plots folder.