This repository provides an implementation of the paper:
DribbleBot: Dynamic Legged Manipulation in the Wild
Yandong Ji*, Gabriel B. Margolis* and Pulkit Agrawal
International Conference on Robotics and Automation (ICRA), 2023
paper / bibtex / project page
This training code, environment and documentation build on Walk these Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior by Gabriel Margolis and Pulkit Agrawal, Improbable AI Lab, MIT (Paper: https://arxiv.org/pdf/2212.03238.pdf) and the Isaac Gym simulator from NVIDIA (Paper: https://arxiv.org/abs/2108.10470). All redistributed code retains its original license.
Our initial release provides the following features:
- Train reinforcement learning policies for the Go1 robot, using PPO, Isaac Gym, and domain randomization, to dribble a soccer ball in simulation while following a random ball velocity command given in the global frame.
- Evaluate a pre-trained soccer policy in simulation.
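To make "ball velocity command in global frame" concrete, here is a minimal sketch, not the repository's implementation, of sampling a planar command in the world frame and expressing it in the robot's body frame from its yaw. The function names, the 1.5 m/s bound, and the x-forward / y-left body convention are illustrative assumptions.

```python
import math
import random


def sample_global_command(max_speed=1.5, rng=random):
    """Sample a planar velocity command (vx, vy) in the world frame.

    max_speed is an illustrative bound, not a value taken from the repository.
    """
    return (rng.uniform(-max_speed, max_speed),
            rng.uniform(-max_speed, max_speed))


def global_to_body(command, yaw):
    """Express a world-frame (vx, vy) in the body frame of a robot with heading yaw (radians)."""
    vx, vy = command
    c, s = math.cos(yaw), math.sin(yaw)
    # Apply the inverse (transpose) of the planar yaw rotation.
    return (c * vx + s * vy, -s * vx + c * vy)


if __name__ == "__main__":
    cmd = sample_global_command()
    print("global:", cmd)
    print("body frame at yaw = 90 deg:", global_to_body(cmd, math.pi / 2))
```

A global-frame command stays fixed while the robot turns, so its body-frame expression changes with yaw; a policy tracking such a command has to account for its own heading.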
Simulated Training and Evaluation: Isaac Gym requires an NVIDIA GPU. To train in the default configuration, we recommend a GPU with at least 10GB of VRAM. The code can run on a smaller GPU if you decrease the number of parallel environments (Cfg.env.num_envs). However, training will be slower with fewer environments.
conda create -n dribblebot python==3.8
conda activate dribblebot
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
- Download and install Isaac Gym Preview 4 from https://developer.nvidia.com/isaac-gym
- Extract the archive:
  tar -xf IsaacGym_Preview_4_Package.tar.gz
- Install the Python package:
  cd isaacgym/python && pip install -e .
- Verify the installation by running an example:
  python examples/1080_balls_of_solitude.py
- For troubleshooting, check the docs at isaacgym/docs/index.html
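Before moving on, a quick check like the following can confirm that the packages from the steps above are visible in the active conda environment. It is only a sketch: `check_packages` is a hypothetical helper, and it tests whether modules can be located, not whether the simulator actually runs.

```python
import importlib.util


def check_packages(names):
    """Return {module_name: bool} for whether each top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names assumed from the install steps above.
    status = check_packages(["isaacgym", "torch", "torchvision"])
    for name, found in status.items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
    if all(status.values()):
        # Isaac Gym raises an ImportError if torch has already been imported,
        # so import isaacgym first.
        import isaacgym  # noqa: F401
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

If any package is reported as missing, make sure the dribblebot environment is activated and repeat the corresponding install step.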
In this repository, run pip install -e .
If everything is installed correctly, you should be able to run the evaluation script with:
python scripts/play_dribbling_pretrained.py
You should see a robot manipulate a yellow soccer ball following random global velocity commands.
CODE STRUCTURE
The main environment for simulating a legged robot is in legged_robot.py. The default configuration parameters, including reward weightings, are defined in legged_robot_config.py::Cfg.
There are three scripts in the scripts directory:
scripts
├── __init__.py
├── play_dribbling_custom.py
├── play_dribbling_pretrained.py
└── train_dribbling.py
To train the Go1 controller from DribbleBot, run:
python scripts/train_dribbling.py
After initializing the simulator, the script will print out a list of metrics every ten training iterations.
Training with the default configuration requires about 12GB of GPU memory. If you have less memory available, you can
still train by reducing the number of parallel environments used in simulation (the default is Cfg.env.num_envs = 1000).
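As a rough illustration of lowering the environment count, the sketch below uses a stand-in `Cfg` class that mirrors only the `Cfg.env.num_envs` field named above; in practice you would set the same attribute on the repository's own `Cfg` before training starts. The helper `scale_num_envs` and its linear scaling rule are assumptions for a first guess, not a measured relationship between environment count and memory use.

```python
class Cfg:
    """Stand-in for the repository's nested config; only num_envs is mirrored."""

    class env:
        num_envs = 1000  # default stated above


def scale_num_envs(cfg, available_gb, reference_gb=12.0, minimum=1):
    """Shrink cfg.env.num_envs in proportion to the available GPU memory.

    reference_gb is the approximate memory quoted for the default configuration.
    Linear scaling is a heuristic starting point; adjust after a trial run.
    """
    if available_gb < reference_gb:
        scaled = int(cfg.env.num_envs * available_gb / reference_gb)
        cfg.env.num_envs = max(minimum, scaled)
    return cfg.env.num_envs


if __name__ == "__main__":
    print("num_envs for a 6 GB GPU:", scale_num_envs(Cfg, available_gb=6.0))
```

Fewer environments means fewer samples per iteration, so expect slower training in wall-clock terms.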
To visualize training progress, first set up Weights & Biases (wandb):
Weights & Biases is the service that provides a dashboard where you can see the progress log of your training runs, including statistics and videos.
First, follow the instructions here to create your wandb account: https://docs.wandb.ai/quickstart
Make sure to perform the wandb.login() step from your local computer.
Finally, use a web browser to go to the wandb IP (defaults to localhost:3001).
To evaluate a pretrained policy, run play_dribbling_pretrained.py. We provide a pretrained agent checkpoint in the ./runs/dribbling directory.
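If you also train your own policies, a small helper along these lines can locate the most recently modified run folder. This is a sketch under assumptions: `latest_run_dir` is a hypothetical helper, and it assumes only that runs are stored as subdirectories of ./runs/dribbling, without relying on any particular file names inside them.

```python
from pathlib import Path


def latest_run_dir(root="./runs/dribbling"):
    """Return the most recently modified subdirectory of root, or None if there is none."""
    root = Path(root)
    if not root.is_dir():
        return None
    subdirs = [p for p in root.iterdir() if p.is_dir()]
    # max() with default=None handles an empty directory without raising.
    return max(subdirs, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print("latest run:", latest_run_dir())
```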
We are working on a modular version of the vision processing code so that DribbleBot can be easily deployed on Go1. It will be added in a future release.