Companion website for the article Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry

Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry

Mirko Nava, Jérôme Guzzi, R. Omar Chavez-Garcia, Luca M. Gambardella and Alessandro Giusti

Dalle Molle Institute for Artificial Intelligence, USI-SUPSI, Lugano (Switzerland)

Abstract

We introduce a general self-supervised approach to predict the future outputs of a short-range sensor (such as a proximity sensor) given the current outputs of a long-range sensor (such as a camera); we assume that the former is directly related to some piece of information to be perceived (such as the presence of an obstacle in a given position), whereas the latter is information-rich but hard to interpret directly. We instantiate and implement the approach on a small mobile robot to detect obstacles at various distances using the video stream of the robot's forward-pointing camera, by training a convolutional neural network on automatically-acquired datasets. We quantitatively evaluate the quality of the predictions on unseen scenarios, qualitatively evaluate robustness to different operating conditions, and demonstrate usage as the sole input of an obstacle-avoidance controller. We additionally instantiate the approach on a different simulated scenario with complementary characteristics, to exemplify the generality of our contribution.
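As a rough, purely illustrative sketch of the idea (not the authors' model, which is defined in code/unified_model.py; the framework, input resolution and number of outputs below are arbitrary): a small CNN maps a single camera frame to one predicted short-range reading, e.g. the probability of an obstacle, at each of several positions along the robot's future path, and the training labels come for free from the short-range sensor and odometry.

```python
import torch
import torch.nn as nn

NUM_POSITIONS = 5  # hypothetical number of future positions at which to predict the sensor reading

# Purely illustrative CNN: one camera frame in, one obstacle logit per future position out.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(64), nn.ReLU(),
    nn.Linear(64, NUM_POSITIONS),
)

# Labels are produced automatically by the short-range sensor and odometry while the
# robot moves; a plain binary cross-entropy loss is enough for this sketch
# (positions whose label is unknown can simply be masked out of the loss).
loss_fn = nn.BCEWithLogitsLoss()

frame = torch.randn(1, 3, 64, 80)      # dummy camera frame (hypothetical resolution)
probs = torch.sigmoid(model(frame))    # predicted obstacle probability per future position
print(probs.shape)                     # torch.Size([1, 5])
```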

Prediction of a model trained with the proposed approach, applied to a camera mounted on a Mighty Thymio (a), on a TurtleBot (b), and on the belt of a person (c).

Simulation setup and results of the proposed approach applied to three cameras mounted on a Pioneer 3AT with different rotations. Left and center-left: robot setup and the cameras' views. Center-right: number of known labels extracted from 70 minutes of recording. Right: AUC score achieved by a model trained on 35 minutes of recording.

The preprint PDF of the article is available at arXiv:1809.07207.

BibTeX

@article{nava2019learning,
  author={M. Nava and J. Guzzi and R. O. Chavez-Garcia and L. Gambardella and A. Giusti},
  journal={IEEE Robotics and Automation Letters},
  title={Learning Long-Range Perception Using Self-Supervision from Short-Range Sensors and Odometry},
  year={2019},
  keywords={Range Sensing, Computer Vision for Other Robotic Applications, Deep Learning in Robotics and Automation},
  doi={10.1109/LRA.2019.2894849},
  issn={2377-3766}
}

Videos

All video material for models trained with the proposed approach on different scenarios, robots, and systems is available here.

Datasets

The real-world dataset is available at this link. It is stored as an HDF5 file containing two groups per recording, named "bag{index}_x" and "bag{index}_y" respectively, for a total of 11 recordings (22 groups).
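A minimal sketch of how the file can be inspected, assuming the h5py package is installed; "real_world.h5" is a placeholder for whatever the downloaded file is actually named:

```python
import h5py

# "real_world.h5" is a placeholder; substitute the actual name of the downloaded file.
with h5py.File("real_world.h5", "r") as f:
    # Expect a pair of groups per recording, e.g. "bag0_x" (inputs) and "bag0_y" (labels).
    for name, obj in f.items():
        print(name, obj)
```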

The simulation dataset is available at this link. It is stored as an HDF5 file containing one main group per recording, named "bag{index}". Each main group is divided into the subgroups "/x" and "/y", which in turn contain "/input_cam1", "/input_cam2", "/input_cam3" and "/output_target1", respectively.
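Similarly, a minimal sketch that walks the nested group structure, again assuming h5py and a placeholder filename "simulation.h5":

```python
import h5py

# "simulation.h5" is a placeholder; substitute the actual name of the downloaded file.
with h5py.File("simulation.h5", "r") as f:
    for bag_name, bag in f.items():            # one main group per recording, e.g. "bag0"
        for split_name, split in bag.items():  # the "x" and "y" subgroups
            for name, obj in split.items():    # "input_cam1..3" under x, "output_target1" under y
                print(f"{bag_name}/{split_name}/{name}:", obj)
```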

Code

The entire codebase is available here. To generate the datasets, whose download links are given above, launch the script preprocess.py, which creates the HDF5 dataset starting from a collection of ROS bagfiles stored in a given folder.

The script train.py trains the model, which is defined in unified_model.py, on a given HDF5 dataset. A list of the available parameters can be obtained by launching python train.py -h.

The script test.py tests the model, which is defined in unified_model.py, on a subset of the HDF5 groups defined in the script. A list of the available parameters can be obtained by launching python test.py -h.

The script visualize.py visualizes the collected real-world dataset, consisting of the camera's view and the ground-truth labels; visualize_output.py visualizes the same information together with the predictions of the selected models.