High-Speed Quadrupedal Locomotion by Imitation-Relaxation Reinforcement Learning

The IRRL learning paradigm is intended mainly to reduce the difficulty of designing the cost function for a reinforcement-learning-based robot controller. Inspired by DeepMimic, motion imitation guides the neural network controller to converge rapidly to a locally optimal solution that realizes a bionic, periodic gait. Starting from this solution, the mimic reward is removed so that the controller can adapt to the robot's own dynamics for better performance. According to a hyperplane analysis, motion imitation helps RL jump over local optima, and relaxation learning further exploits the potential of the robot system.
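The two stages described above can be sketched as a reward that includes a mimic term during imitation and drops it during relaxation. This is a minimal illustration only: the function name, the stage labels, and the `imitation_weight` value are assumptions and are not taken from the repository's reward code.

```python
def total_reward(task_reward, imitation_reward, stage, imitation_weight=0.5):
    """Combine reward terms for the current training stage.

    All names and the default weight are hypothetical, for illustration.
    stage == "imitation":  task reward plus a weighted mimic reward.
    stage == "relaxation": the mimic reward is removed entirely.
    """
    if stage == "imitation":
        return task_reward + imitation_weight * imitation_reward
    if stage == "relaxation":
        return task_reward
    raise ValueError("unknown stage: {!r}".format(stage))


if __name__ == "__main__":
    # Same raw terms, different stage: only the mimic contribution changes.
    print(total_reward(1.0, 2.0, "imitation"))
    print(total_reward(1.0, 2.0, "relaxation"))
```

In the actual project the reward terms are configured through default.yaml, so the real weighting may look quite different from this sketch.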

Installation Guide

Prerequisites

OS support: Ubuntu (tested on 16.04)
Python version: 3.5

Dependencies

Software/Package	Version
Raisim	>=1.0.0
raisimogre	-
numpy	-
stable-baselines	2.8.0
Matplotlib	-
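Before compiling, it may help to confirm that the Python-side dependencies are importable. The sketch below is one possible check, not part of the repository. The module names are assumptions about how each package is imported (for example, stable-baselines is imported as `stable_baselines`), and the Raisim Python bindings are left out because their import name depends on the local build.

```python
import importlib.util
import sys

# Import names are assumptions: the package name and the importable module
# name can differ (stable-baselines is imported as stable_baselines).
REQUIRED_MODULES = ["numpy", "stable_baselines", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that a module can be located. It does not verify versions such as stable-baselines 2.8.0.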

Tips for installation and compilation

Change the package locations in the CMakeLists.txt file to match where the corresponding packages are installed.

For example:

set(pybind11_DIR ~/raisim_build/share/cmake/pybind11)
set(raisimOgre_DIR ~/raisim_build/share/raisimOgre/cmake)
set(OGRE_DIR ~/raisim_build/lib/OGRE/cmake)

To install the reinforcement learning environment, just run:

cd ./IRRL/FlexibleRobotRaisimGym/
sh compile.sh

Usage

To run the pretrained LSTM controller, connect an Xbox gamepad to the computer to control the robot during the test. The left analog stick controls the forward and lateral velocity commands, and the right analog stick controls the rotational velocity. Press the Start button to finish the test and plot figures of some physical variables (with Matplotlib).

cd ./IRRL/script
python3 run_bp_v5.py --test --cfg ./config/bp5_test.yaml --model ./pkl/bp5_155.pkl --eval --wc
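The stick-to-command mapping described above could look roughly like the following sketch. It is not the repository's implementation: the function name, the axis assignment (left stick vertical axis for forward velocity), the velocity limits, and the deadzone are all assumptions chosen for illustration.

```python
def stick_to_command(left_x, left_y, right_x,
                     max_forward=1.0, max_lateral=0.5, max_yaw=1.0,
                     deadzone=0.1):
    """Map raw stick axes in [-1, 1] to a velocity command.

    Limits, deadzone, and axis assignment are hypothetical values,
    not read from the project's configuration.
    """
    def shape(axis):
        # Clamp to the valid range, then suppress small stick drift.
        axis = max(-1.0, min(1.0, axis))
        return 0.0 if abs(axis) < deadzone else axis

    return {
        "forward": shape(left_y) * max_forward,   # left stick, vertical
        "lateral": shape(left_x) * max_lateral,   # left stick, horizontal
        "yaw_rate": shape(right_x) * max_yaw,     # right stick, horizontal
    }


if __name__ == "__main__":
    print(stick_to_command(0.05, 1.0, -2.0))
```

A deadzone is a common choice for gamepads because analog sticks rarely rest at exactly zero.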


To train an LSTM controller for quadrupedal robots:

cd ./IRRL/script
python3 run_bp_v5.py --train --l 0.001 --max_iter 200000000

To relax a pretrained controller:

python3 run_bp_v5.py --train --l 0.0005 --max_iter 400000000 --load ./pkl/bp5_155.pkl
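The training and relaxation commands above can also be assembled programmatically, for example to launch both stages from one driver script. The sketch below only builds the argument lists from the flags shown in this README and does not run anything. The helper name `build_commands` is hypothetical, and the checkpoint path is the example path from above, since the location where a fresh training run saves its model is not stated here.

```python
def build_commands(pretrained="./pkl/bp5_155.pkl"):
    """Return (train_cmd, relax_cmd) as argument lists.

    Flags and values are copied from the README commands; the default
    checkpoint path is the README's example, not a guaranteed output path.
    """
    train_cmd = ["python3", "run_bp_v5.py", "--train",
                 "--l", "0.001", "--max_iter", "200000000"]
    relax_cmd = ["python3", "run_bp_v5.py", "--train",
                 "--l", "0.0005", "--max_iter", "400000000",
                 "--load", pretrained]
    return train_cmd, relax_cmd


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run(cmd, cwd="./IRRL/script", check=True)`, assuming the scripts are run from that directory as in the commands above.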

To tune the reward function, change the parameters in the default.yaml configuration file.

Data

The data used in this article is in the Exp_Raw_Data folder.

Analysis and post-processing

The post-processing code for Figures 2, 3, 4, and 5 is in Data_Visualization_Code.

Run the corresponding script to generate the corresponding figure.

Reference

License

This project is covered under the MIT License.

WoodenJin/High_Speed_Quadrupedal_Locomotion_by_IRRL