3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

Reference implementation of the publication 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods, presented at the 33rd IEEE Intelligent Vehicles Symposium (IV 2022) in Aachen, Germany.

This repository contains an RNN-based compression framework for range images, implemented in TensorFlow 2.9.0. It also provides preprocessing and inference nodes for ROS Noetic in order to perform the compression task on a /points2 sensor data stream.

3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods

Till Beemelmanns, Yuchen Tao, Bastian Lampe, Lennart Reiher, Raphael van Kempen, Timo Woopen, and Lutz Eckstein

Institute for Automotive Engineering (ika), RWTH Aachen University

Abstract — Storing and transmitting LiDAR point cloud data is essential for many AV applications, such as training data collection, remote control, cloud services or SLAM. However, due to the sparsity and unordered structure of the data, it is difficult to compress point cloud data to a low volume. Transforming the raw point cloud data into a dense 2D matrix structure is a promising way for applying compression algorithms. We propose a new lossless and calibrated 3D-to-2D transformation which allows compression algorithms to efficiently exploit spatial correlations within the 2D representation. To compress the structured representation, we use common image compression methods and also a self-supervised deep compression approach using a recurrent neural network. We also rearrange the LiDAR’s intensity measurements to a dense 2D representation and propose a new metric to evaluate the compression performance of the intensity. Compared to approaches that are based on generic octree point cloud compression or based on raw point cloud data compression, our approach achieves the best quantitative and visual performance.


Approach

Range Image Compression

This learning framework compresses range images projected from point clouds captured by a Velodyne LiDAR sensor. The network architectures are based on the work proposed by Toderici et al. and are implemented in this repository using TensorFlow 2.9.0.

The following architectures are implemented:

  • Additive LSTM - The architecture that we used for the results in our paper.
  • Additive LSTM Demo - A lightweight version of the Additive LSTM to test your setup.
  • Additive GRU - Uses Gated Recurrent Units instead of an LSTM cell; we achieved slightly worse results than with the LSTM variant.
  • Oneshot LSTM - Does not use the additive reconstruction path in the network, as shown below.

The differences between the additive and the oneshot framework are visualized below.

Additive reconstruction framework

This reconstruction framework adds the previous progressive reconstruction to the current reconstruction.

Oneshot reconstruction framework

The current reconstruction is directly computed from the bitstream by the decoder RNN.
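
As a toy numeric illustration of the additive idea (this is not the learned model, only the refinement principle): each iteration transmits one sign bit per pixel, and the decoder adds a progressively smaller correction on top of the previous reconstruction, so the residual shrinks with every iteration. In the oneshot framework, the decoder would instead recompute the full reconstruction from the accumulated bits.

import numpy as np

# Toy stand-in for the learned codec: each iteration sends one sign bit
# per pixel ("bits") and the decoder adds a progressively smaller
# correction on top of the previous reconstruction.
rng = np.random.default_rng(0)
patch = rng.uniform(-1.0, 1.0, size=(32, 32))    # stand-in range image patch
reconstruction = np.zeros_like(patch)

for k in range(8):                               # plays the role of num_iters
    residual = patch - reconstruction            # what the decoder still lacks
    bits = np.sign(residual)                     # 1 bit per pixel per iteration
    reconstruction += bits * 2.0 ** -(k + 1)     # additive refinement step
    print(k, np.abs(patch - reconstruction).mean())  # error halves each step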

Model Input - Output

The range representations are originally (32, 1812) single-channel 16-bit images. During training, they are randomly cropped online into (32, 32) image patches, as sketched below. The label is the input patch itself. Note that during validation, the random crop is performed offline to keep the validation set constant.

Input: 16-bit image of shape (32, 1812, 1), randomly cropped to shape (32, 32, 1) during training.

Output: reconstructed 16-bit image patch of shape (32, 32, 1).
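
A minimal sketch of how such training pairs can be built with tf.data; the random tensors stand in for real range images, and the repository's actual input pipeline may differ in its details:

import tensorflow as tf

# Stand-in for real range images of shape (32, 1812, 1), normalized floats.
images = tf.random.uniform((8, 32, 1812, 1))

def to_training_pair(image):
    # Online random crop; training is self-supervised,
    # so the label is the cropped patch itself.
    patch = tf.image.random_crop(image, size=(32, 32, 1))
    return patch, patch

train_ds = (tf.data.Dataset.from_tensor_slices(images)
            .map(to_training_pair, num_parallel_calls=tf.data.AUTOTUNE)
            .shuffle(64)
            .batch(4))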

Input Label Prediction

During inference, the model can be applied to range images of any width and height divisible by 32, as the following example shows:

Input range image representation:

Predicted range image representation:
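
Note that the native width of 1812 is not itself divisible by 32, so full-frame inference requires adjusting the image size first. A minimal helper that zero-pads to the next multiple of 32 (the repository's own preprocessing may instead crop or resize):

import numpy as np

def pad_to_multiple_of_32(img: np.ndarray) -> np.ndarray:
    # Zero-pad height and width up to the next multiple of 32.
    h, w = img.shape[:2]
    return np.pad(img, ((0, (-h) % 32), (0, (-w) % 32), (0, 0)))

padded = pad_to_multiple_of_32(np.zeros((32, 1812, 1), np.float32))
print(padded.shape)  # (32, 1824, 1)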

Model Training

Sample Dataset

A demo sample dataset can be found in range_image_compression/demo_samples in order to quickly test your environment.

Dependencies

The implementation is based on TensorFlow 2.9.0. All necessary dependencies can be installed with:

# /range_image_compression >$
pip install -r requirements.txt 

The architecture Oneshot LSTM uses GDN layers proposed by Ballé et al., which are provided by TensorFlow Compression. It can be installed with:

# /range_image_compression >$
pip3 install tensorflow-compression==2.9.0
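
For illustration, a minimal sketch of how GDN and its inverse (IGDN) from TensorFlow Compression plug into a Keras model; the Oneshot LSTM wires them into a larger recurrent network:

import tensorflow as tf
import tensorflow_compression as tfc

# A conv layer followed by a GDN nonlinearity and its inverse (IGDN),
# as used on the encoder and decoder sides, respectively.
inputs = tf.keras.Input(shape=(32, 32, 1))
x = tf.keras.layers.Conv2D(32, 3, padding="same")(inputs)
x = tfc.GDN()(x)               # encoder-side normalization
x = tfc.GDN(inverse=True)(x)   # decoder-side inverse (IGDN)
tf.keras.Model(inputs, x).summary()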

Run Training

A training with the lightweight demo network using the sample dataset can be started with the following command:

# /range_image_compression >$
python3 ./train.py \
        --train_data_dir="demo_samples/training" \
        --val_data_dir="demo_samples/validation" \
        --train_output_dir="output" \
        --model=additive_lstm_demo

If you downloaded the full dataset (see below), adapt the values for --train_data_dir and --val_data_dir accordingly.

Run Training with Docker

We also provide a Docker environment to train the model. First build the Docker image, then start the training:

# /docker >$
./docker_build.sh
./docker_train.sh

Configuration

Parameters for the training can be set in the configuration files located in the directory range_image_compression/configs. A powerful multi-GPU setup is recommended for efficient training, e.g. as sketched below.
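
A sketch of what multi-GPU training typically looks like with tf.distribute.MirroredStrategy; train.py in this repository may organize this differently, and the tiny model and random data below are placeholders:

import tensorflow as tf

# Variables created inside the strategy scope are mirrored across all
# visible GPUs; gradients are aggregated across replicas automatically.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(8, 3, padding="same", input_shape=(32, 32, 1)),
        tf.keras.layers.Conv2D(1, 3, padding="same"),
    ])
    model.compile(optimizer="adam", loss="mae")

x = tf.random.uniform((64, 32, 32, 1))
model.fit(x, x, batch_size=32, epochs=1)  # autoencoder-style: label = input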

Learning Rate Schedule

The following cosine learning rate scheduler is used:

$$\eta(e) = \eta_{\min} + \frac{1}{2}\left(\eta_{\text{init}} - \eta_{\min}\right)\left(1 + \cos\left(\pi \, \frac{e - e_{\text{start}}}{e_{\text{end}} - e_{\text{start}}}\right)\right)$$

where $\eta_{\text{init}}$ is init_learning_rate, $\eta_{\min}$ is min_learning_rate, $e_{\text{start}}$ is max_learning_rate_epoch, and $e_{\text{end}}$ is min_learning_rate_epoch.

Parameter                  Description                                           Example
init_learning_rate         initial learning rate                                 2e-3
min_learning_rate          minimum learning rate                                 1e-7
max_learning_rate_epoch    epoch at which the learning rate starts to decrease   0
min_learning_rate_epoch    epoch at which the minimum learning rate is reached   3000
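
For reference, a plain-Python re-implementation of this schedule under the reading above; the exact formula in the configs may differ slightly:

import math

def cosine_lr(epoch, init_lr=2e-3, min_lr=1e-7,
              max_lr_epoch=0, min_lr_epoch=3000):
    # Constant until max_lr_epoch, cosine decay until min_lr_epoch,
    # constant minimum afterwards.
    if epoch <= max_lr_epoch:
        return init_lr
    if epoch >= min_lr_epoch:
        return min_lr
    progress = (epoch - max_lr_epoch) / (min_lr_epoch - max_lr_epoch)
    return min_lr + 0.5 * (init_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# e.g. tf.keras.callbacks.LearningRateScheduler(lambda epoch, _: cosine_lr(epoch))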

Network Architecture

Default parameters for the network architectures

Parameter     Description                            Example
bottleneck    bottleneck size of the architecture    32
num_iters     maximum number of iterations           32

Model Zoo

Model Architecture    Filename                               Batch Size    Validation MAE    Evaluation SNNRMSE (iter=32)
Additive LSTM         additive_lstm_32b_32iter.hdf5          32            2.6e-04           0.03473
Additive LSTM Slim    additive_lstm_128b_32iter_slim.hdf5    128           2.3e-04           0.03636
Additive LSTM Demo    additive_lstm_128b_32iter_demo.hdf5    128           2.9e-04           0.09762
Oneshot LSTM          oneshot_lstm_b128_32iter.hdf5          128           2.9e-04           0.05137
Additive GRU          will be uploaded soon                  TBD           TBD               TBD

Download of Dataset, Models and Evaluation Frames

The dataset to train the range image compression framework can be retrieved by filling out the following form: https://www.unicaragil.de/de/point-cloud-compression. After submitting the form, you will receive a link to download the following files:

  • pointcloud_compression.zip (1.8 GB): ZIP File which contains three directories
    • train: 30813 range images for training
    • val: 6162 cropped 32 x 32 range images for validation
    • test: 1217 range images for testing
  • evaluation_frames.bag (17.6 MB): ROS bag which contains the Velodyne packet data used to evaluate the model. Frames in this bag file were used neither for training nor for validation.
  • additive_lstm_32b_32iter.hdf5 (271 MB): Trained model with a bottleneck size of 32
  • additive_lstm_128b_32iter_slim.hdf5 (67.9 MB): Trained model with a bottleneck size of 32
  • oneshot_lstm_b128_32iter.hdf5 (68.1 MB): Trained model with a bottleneck size of 32
  • additive_lstm_256b_32iter_demo.hdf5 (14.3 MB): Trained model with a bottleneck size of 32

Inference & Evaluation Node

Inference and evaluation are conducted with ROS. The inference and preprocessing nodes can be found under catkin_ws/src. The idea is to use recorded Velodyne packet data stored in a .bag file. This bag file is played back with rosbag play, and the preprocessing and inference nodes are applied to the raw sensor data stream. To run the inference and evaluation, execute the following steps.

1. Initialize Velodyne Driver

To run the evaluation code, we need the Velodyne driver from the official repository. The driver is integrated as a submodule at catkin_ws/src/velodyne. You can initialize the submodule with:

git submodule init
git submodule update

2. Pull Docker Image

We provide a Docker image which contains ROS Noetic and TensorFlow to execute the inference and evaluation nodes. You can pull the image from Docker Hub with:

docker pull tillbeemelmanns/pointcloud_compression:noetic

3. Download Weights

Download the model weights as .hdf5 from the download link and copy the file to catkin_ws/models. Then, insert the correct path for the parameter weights_path in the configuration file. Note that we will later mount the directory catkin_ws into the Docker container at /catkin_ws, so the path must start with /catkin_ws.

4. Download ROS Bag

Proceed the same way with the bag file: copy it into catkin_ws/rosbags. The filename should be evaluation_frames.bag, as this bag file is referenced by the launch file.

5. Start Docker Container

Start the Docker container using the script:

# /docker >$
./docker_eval.sh

The first call of this script starts the container. Then open a second terminal and execute ./docker_eval.sh again to attach to the running container with a bash shell.

# /docker >$
./docker_eval.sh

6. Build & Execute

Run the following commands inside the container to build and launch the compression framework.

# /catkin_ws >$
catkin build
source devel/setup.bash
roslaunch pointcloud_to_rangeimage compression.launch

You should see an RVIZ window visualizing the reconstructed point cloud.

7. GPU Inference

If you have a capable GPU, change line 44 in docker_eval.sh and set your GPU ID with the -g flag:

$DIR/run-ros.sh -g 0 $@

Authors of this Repository

Till Beemelmanns and Yuchen Tao

Mail: till.beemelmanns (at) rwth-aachen.de

Mail: yuchen.tao (at) rwth-aachen.de

Cite

@INPROCEEDINGS{9827270,
  author={Beemelmanns, Till and Tao, Yuchen and Lampe, Bastian and Reiher, Lennart and Kempen, Raphael van and Woopen, Timo and Eckstein, Lutz},
  booktitle={2022 IEEE Intelligent Vehicles Symposium (IV)}, 
  title={3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods}, 
  year={2022},
  volume={},
  number={},
  pages={345-351},
  doi={10.1109/IV51971.2022.9827270}}

Acknowledgement

This research is accomplished within the project "UNICARagil" (FKZ 16EMO0284K). We acknowledge the financial support for the project by the Federal Ministry of Education and Research of Germany (BMBF).
