Wout Boerdijk, Martin Sundermeyer, Maximilian Durner and Rudolph Triebel
Accepted at ICRA 2021. Paper
If you find our work useful for your research, please consider citing:
@inproceedings{boerdijk2020learning,
title={" What's This?" - Learning to Segment Unknown Objects from Manipulation Sequences},
author={Boerdijk, Wout and Sundermeyer, Martin and Durner, Maximilian and Triebel, Rudolph},
booktitle={ICRA},
year={2021}
}
We present a novel framework for self-supervised grasped object segmentation with a robotic manipulator. Our method successively learns an agnostic foreground segmentation followed by a distinction between manipulator and object solely by observing the motion between consecutive RGB frames. In contrast to previous approaches, we propose a single, end-to-end trainable architecture which jointly incorporates motion cues and semantic knowledge. Furthermore, while the motion of the manipulator and the object are substantial cues for our algorithm, we present means to robustly deal with distractor objects moving in the background, as well as with completely static scenes. Our method neither depends on any visual registration of a kinematic robot or 3D object models, nor on precise hand-eye calibration or any additional sensor data. In an extensive experimental evaluation we demonstrate the superiority of our framework and provide detailed insights into its capability to deal with the aforementioned extreme cases of motion. We also show that training a semantic segmentation network with the automatically labeled data achieves results on par with manually annotated training data. Code and pretrained model are available at https://github.com/DLR-RM/DistinctNet.
This repository contains code to learn segmentation of moving objects as well as to distinguish the moving foreground into manipulator and object. It also contains scripts to generate the respective data. We recommend installing all required packages using conda:
conda env create -f environment.yaml
Note: BlenderProc builds its own Python environment, so don't run the BlenderProc data generation scripts from within the activated distinctnet environment!
We use yacs configs, and every training script can be run with
python train_{motion, motion_recurrent, semantic, semantic_recurrent}.py --config-file path/to/config.yaml [OPTIONAL_ARGS]
Please refer to the yacs documentation for further details.
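For orientation, the snippet below is a minimal sketch of how such a training script typically consumes the config file and the optional key/value overrides. The default keys shown here are placeholders, not the repository's actual config definition.

```python
# Minimal sketch of the usual yacs pattern behind --config-file and [OPTIONAL_ARGS].
# The default keys below are illustrative; the real defaults in the repository
# define every key that may appear in the YAML files.
import argparse

from yacs.config import CfgNode as CN

_C = CN()
_C.EXP = CN()
_C.EXP.ROOT = "/tmp/experiments"
_C.DATA = CN()
_C.DATA.ROOT = "/tmp/data"


def parse_config():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", required=True)
    parser.add_argument("opts", nargs=argparse.REMAINDER)  # the [OPTIONAL_ARGS]
    args = parser.parse_args()
    cfg = _C.clone()
    cfg.merge_from_file(args.config_file)  # values from the YAML file
    cfg.merge_from_list(args.opts)         # key/value overrides from the command line
    cfg.freeze()
    return cfg
```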
You can download a model pretrained to segment agnostic moving foreground here.
We provide a demo script which works with a webcam. After downloading the pretrained model (or using your own, e.g. a network fine-tuned for hand-object segmentation), run this script to segment any moving object in front of the camera. This also works with models that are trained with distractor objects.
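Below is a minimal sketch of what such a demo loop can look like, assuming the model takes two stacked consecutive RGB frames and returns a per-pixel class map at input resolution. The loading call and input convention are assumptions; the provided demo script is the actual reference.

```python
# Minimal webcam-demo sketch. The two-frame input convention and checkpoint
# handling are assumptions; the repository's demo script is the actual reference.
import cv2
import numpy as np
import torch

# Placeholder loading: the real demo script constructs the network and loads the
# downloaded state dict into it.
model = torch.load("path/to/pretrained_model.pth", map_location="cpu")
model.eval()

cap = cv2.VideoCapture(0)
ok, prev = cap.read()
while ok:
    ok, frame = cap.read()
    if not ok:
        break
    # Stack previous and current frame along the channel axis -> 1 x 6 x H x W.
    pair = np.concatenate([prev, frame], axis=2).transpose(2, 0, 1)
    inp = torch.from_numpy(pair).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        mask = model(inp).argmax(dim=1)[0].cpu().numpy()
    prev = frame.copy()
    # Highlight everything predicted as moving foreground.
    overlay = frame.copy()
    overlay[mask > 0] = (0, 0, 255)
    cv2.imshow("DistinctNet demo", overlay)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```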
You can skip this step if you plan on using our pretrained model.
- Generate the synthetic training dataset. See the respective README for further details.
- Run the training script:
python train_motion.py --config-file configs/motion.yaml
Make sure you've set the experiment and data paths in the config beforehand, or override them on the command line (e.g. python ... EXP.ROOT /path/to/experiment_root DATA.ROOT /path/to/data_root).
- We provide a recording script which you can use as a starting point.
- Then, run prediction on the recorded images. See the prediction script as a starting point, or have a look at the default predictor (a minimal prediction sketch follows below).
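The sketch below illustrates what such offline prediction can look like, assuming the same two-frame input convention as in the demo sketch above. The directory layout is a placeholder; the provided prediction script and default predictor define the real interface.

```python
# Sketch of offline prediction on a recorded sequence (same assumed two-frame
# convention as in the demo sketch); the directory layout is a placeholder and
# the provided prediction script / default predictor define the real interface.
from pathlib import Path

import cv2
import numpy as np
import torch


def predict_sequence(model, frame_dir, out_dir):
    frames = sorted(Path(frame_dir).glob("*.png"))
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for prev_path, cur_path in zip(frames[:-1], frames[1:]):
        prev = cv2.imread(str(prev_path))
        cur = cv2.imread(str(cur_path))
        pair = np.concatenate([prev, cur], axis=2).transpose(2, 0, 1)
        inp = torch.from_numpy(pair).float().unsqueeze(0) / 255.0
        with torch.no_grad():
            mask = model(inp).argmax(dim=1)[0].cpu().numpy().astype(np.uint8)
        # One pseudo-label mask per consecutive frame pair (binary foreground assumed).
        cv2.imwrite(str(Path(out_dir) / cur_path.name), mask * 255)
```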
- Generate occluding objects. See the respective README for further details.
- Generate training data by pasting the manipulator plus distractor object onto random backgrounds. We also provide a script for this (a minimal sketch of the pasting idea follows after this list).
- Run the training script:
python train_semantic.py --config-file configs/semantic.yaml [OPTIONAL_ARGS]
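The pasting step above roughly amounts to compositing the segmented foreground over randomly chosen backgrounds. The sketch below only illustrates that compositing idea under assumed inputs; the provided data generation script is the actual implementation.

```python
# Illustrative compositing sketch for the paste-on-random-background idea;
# the provided data generation script is the actual implementation.
import cv2
import numpy as np


def paste_on_background(foreground_bgr, foreground_mask, background_bgr):
    """Composite a segmented manipulator/object crop onto a background image."""
    h, w = foreground_bgr.shape[:2]
    bg = cv2.resize(background_bgr, (w, h))
    composed = bg.copy()
    composed[foreground_mask > 0] = foreground_bgr[foreground_mask > 0]
    label = (foreground_mask > 0).astype(np.uint8)
    return composed, label  # training image and its segmentation label
```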
You can extend the training in step 4 to also distinguish distractor objects.
Make sure that in step 3 you called the data generation script with --num-distractor-objects [INT], where [INT] is an integer larger than 0 (e.g. 5).
Then, adapt the following in the semantic config:
DATA.WITH_DISTRACTOR_OBJECTS: True # don't discard the distractor objects during loading
MODEL.OUT_CH: 5 # add another channel to the output layer
This also works for the recurrent semantic config (see also 5.b). Then proceed by running the training script (Step 4).
Both motion and semantic models can be trained on static scenes.
For this, run python train_{motion, semantic}_recurrent.py --config-file configs/{motion,semantic}_recurrent.yaml [OPTIONAL_ARGS]
Note that, as in the paper, we only fine-tune the last few layers, meaning the other layers require pretrained weights.
Add a path to the respective state dict in the config, or override the argument (e.g. python ... MODEL.WEIGHTS /path/to/state_dict.pth).
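A minimal sketch of what this setup amounts to in plain PyTorch is shown below. The "head" prefix and optimizer settings are assumptions; the recurrent training scripts and configs define which layers actually remain trainable.

```python
# Sketch of loading pretrained weights and fine-tuning only the last few layers.
# The "head" prefix is a placeholder; which parameters actually stay trainable
# is defined by the recurrent training scripts and configs.
import torch


def prepare_for_finetuning(model, state_dict_path, trainable_prefixes=("head",)):
    state_dict = torch.load(state_dict_path, map_location="cpu")
    model.load_state_dict(state_dict, strict=False)  # newly added layers may be missing
    for name, param in model.named_parameters():
        # Freeze everything except the layers selected for fine-tuning.
        param.requires_grad = any(name.startswith(p) for p in trainable_prefixes)
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=1e-4)
```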
For training on static object-manipulator scenes, make sure that during data generation (step 3) you generated multiple images per background by setting --sequence-length [INT], where [INT] is an integer larger than 2 (e.g. 5).