
Code for the paper Continual Learning from Demonstration of Robotic Skills


Continual Learning from Demonstration of Robotics Skills

Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movement skills without forgetting what was learned in the past. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers. We empirically demonstrate the effectiveness of this approach in remembering long sequences of trajectory learning tasks without the need to store any data from past demonstrations. Our results show that hypernetworks outperform other state-of-the-art continual learning approaches for learning from demonstration. In our experiments, we use the popular LASA benchmark as well as two new datasets of kinesthetic demonstrations collected with a real robot, which we introduce in our paper: the HelloWorld and RoboTasks datasets. We evaluate our approach on a physical robot and demonstrate its effectiveness in learning realistic robotic tasks involving changing positions as well as orientations. We report both trajectory error metrics and continual learning metrics, and we propose two new continual learning metrics. Our code, along with the newly collected datasets, is available in this repository.

Fig. 1: After learning one letter at a time with the same model, the robot can reproduce all letters from the HelloWorld dataset.

Fig. 2: The robot is able to reproduce any task after continually learning the four realistic tasks of the RoboTasks dataset (each task involves changing positions and orientations).

Here is a very short overview of our approach (also available on YouTube):

clfd.mp4

HelloWorld Dataset

HelloWorld is a dataset of kinesthetic demonstrations we collected using the Franka Emika Panda robot. The x and y coordinates of the robot's end-effector were recorded while a human user guided it kinesthetically to write the 7 lower-case letters h,e,l,o,w,r,d one at a time on a horizontal surface. The HelloWorld dataset consists of 7 tasks, each containing 8 slightly varying demonstrations of a letter. Each demonstration is a sequence of 1000 2-D points. After training on all the tasks, the objective is to make the robot write the words hello world. Our motivation for using this dataset is to test our approach on trajectories with loops and to show that it also works on kinesthetically recorded demonstrations using a real robot.

The data for each of the 7 tasks can be found as .npy files in the folder datasets/robot_hello_world/processed_demos.

Fig. 3: Tasks of the HelloWorld dataset.

Please check the file helloworld_dataset.ipynb to see how to load this dataset. Code for using this dataset in the training loop can be found in the training scripts tr_*_node.py (e.g. tr_hn_node.py).
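
For a quick look at the raw data, a single task file can also be loaded directly with numpy. The sketch below is illustrative (the file name and the expected array shape are assumptions based on the description above; the notebook shows the actual loading code):

import numpy as np

# Load the demonstrations of one HelloWorld task
# (illustrative file name -- see datasets/robot_hello_world/processed_demos
#  for the actual file names)
demos = np.load("datasets/robot_hello_world/processed_demos/h.npy")

# Expected: 8 demonstrations, each a sequence of 1000 2-D points
print(demos.shape)  # e.g. (8, 1000, 2)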

RoboTasks Dataset

RoboTasks is a dataset of kinesthetic demonstrations of realistic robot tasks we collected using the Franka Emika Panda robot. Each task involves learning trajectories of the position (in 3D space) as well as the orientation (in all 3 rotation axes) of the robot's end-effector. The tasks of this dataset are:

  • Task 0 - box opening: the lid of a box is lifted to an open position;
  • Task 1 - bottle shelving: a bottle in a vertical position is transferred to a horizontal position on a shelf;
  • Task 2 - plate stacking: a plate in a vertical position is transferred to a horizontal position on an elevated platform, while orienting the arm so as to avoid the blocks used for holding the plate in its initial vertical position;
  • Task 3 - pouring: a cup full of coffee beans is taken from an elevated platform and its contents are emptied into a container.

The data for each of the 4 tasks can be found as .npy files in the folder datasets/robottasks/pos_ori. Loading the data of a task yields a numpy array of shape [num_demos=9, trajectory_length=1000, data_dimension=7]. Each data point consists of 7 elements: px,py,pz,qw,qx,qy,qz (the 3D position followed by a quaternion in the scalar-first format). This represents the position and orientation of the end-effector at each point of a trajectory.
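
As a minimal sketch of how this data could be inspected (the file name below is illustrative; see the training scripts for the actual loading code):

import numpy as np

# Load the demonstrations of one RoboTasks task
# (illustrative file name -- check datasets/robottasks/pos_ori for the actual names)
data = np.load("datasets/robottasks/pos_ori/task0_box_opening.npy")
print(data.shape)  # expected: (9, 1000, 7)

# Split each data point into the 3D position and the scalar-first quaternion
positions = data[..., :3]    # px, py, pz
quaternions = data[..., 3:]  # qw, qx, qy, qz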

Fig. 4: Collecting demos for the RoboTasks dataset.

Code Instructions

Clone this repository

git clone https://github.com/sayantanauddy/clfd.git

Create a virtual environment and install the dependencies

# Create a virtual environment and activate it, e.g.:
cd <path/to/this/repository>
python3 -m venv .venv && source .venv/bin/activate

# Install the dependencies
python -m pip install -r requirements.txt

Install the GPU version of torch if needed.
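
To check that the installed torch build can actually use the GPU, a quick sanity check like the following can be run:

# Quick GPU sanity check
import torch
print(torch.__version__)
print(torch.cuda.is_available())  # True if a CUDA-enabled build and a GPU are available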

Execute a training run

Here we show the command for training a Hypernetwork that generates a NODE:

# DATASET: LASA
# NODE TYPE: NODE^T (with time input)

python3 tr_hn_node.py --data_dir datasets/LASA/DataSet --num_iter 15000 --tsub 20 --replicate_num 0 --lr 0.0001 --tnet_dim 2 --tnet_arch 100,100,100 --tnet_act elu --hnet_arch 200,200,200 --task_emb_dim 256 --explicit_time 1 --beta 0.005 --data_class LASA --eval_during_train 0 --seq_file datasets/LASA/lasa_sequence_all.txt --log_dir logs_clfd/lasa_explicit_time --plot_fs 10 --figw 16.0 --figh 3.3 --seed 200 --description tr_hn_node_LASA_t1

Reproducing results

The complete set of commands for reproducing all our experiments can be found in commands_LASA.txt, commands_HelloWorld.txt, and commands_RoboTasks.txt.

Log files are generated in the folder logs_clfd in the following structure:

logs_clfd
├── lasa_explicit_time
│   ├── tr_hn_node_LASA_t1
│   │   ├── 211123_190744_seed200
│   │   │   ├── commandline_args.json
│   │   │   ├── eval_results.json
│   │   │   ├── log.log
│   │   │   ├── models
│   │   │   │   └── hnet_25.pth
│   │   │   └── plot_trajectories_tr_hn_node_LASA_t1.pdf

Description of generated log files:

  • commandline_args.json: Contains the command line arguments used for running the training script.
  • eval_results.json: JSON file containing the evaluation results. For each task, evaluation is carried out for that task and all previous tasks.
  • log.log: Screen dump of the training run.
  • models/: Folder containing the trained model (in most cases only the final model is saved after all tasks are learned).
  • plot_trajectories_tr_hn_node_LASA_t1.pdf: A plot of the trajectories (of all previous tasks) predicted by the last model after all tasks are learned. This file is created only for the LASA and HelloWorld datasets.
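
Since the exact contents of these JSON files depend on the run, here is a minimal sketch for inspecting a log directory (the path below is illustrative and taken from the tree above):

import json

# Illustrative log directory from the example above
log_dir = "logs_clfd/lasa_explicit_time/tr_hn_node_LASA_t1/211123_190744_seed200"

# Command line arguments used for the run
with open(f"{log_dir}/commandline_args.json") as f:
    args = json.load(f)
print(args)

# Evaluation results (per-task metrics for the current and all previous tasks)
with open(f"{log_dir}/eval_results.json") as f:
    results = json.load(f)
print(list(results.keys()) if isinstance(results, dict) else type(results))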

View Trajectories Predicted by Trained Models

First, download the pretrained models and extract them into the directory trained_models:

cd <path/to/this/repository>
cd trained_models
wget https://iis.uibk.ac.at/public/auddy/clfd/trained_models/trained_models.tar.gz
tar -xvf trained_models.tar.gz --strip-components=1

Then run the notebook predict_traj_saved_models.ipynb to generate trajectory predictions using the pretrained models we provide.

Acknowledgements

Citation

If you use this code or our results in your research, please cite:

@article{AUDDY2023104427,
title = {Continual learning from demonstration of robotics skills},
journal = {Robotics and Autonomous Systems},
volume = {165},
pages = {104427},
year = {2023},
issn = {0921-8890},
doi = {10.1016/j.robot.2023.104427},
url = {https://www.sciencedirect.com/science/article/pii/S0921889023000660},
author = {Sayantan Auddy and Jakob Hollenstein and Matteo Saveriano and Antonio Rodríguez-Sánchez and Justus Piater},
keywords = {Learning from demonstration, Continual learning, Hypernetwork, Neural ordinary differential equation solver}
}