By Harald Lilja
This repository contains an implementation of an encoder-decoder network for semantic scene segmentation based on 2D-LiDAR and depth data. The network employs sensor fusion on a network level with feature maps from both modalities being concatenated during upsampling in order to learn correlated information from the data inputs. Applying this strategy yields improvement in accuracy while acting independantly from textures in the scene.
The network design is based on two modified instances of [SalsaNet](https://gitlab.com/aksoyeren/salsanet) with three shared representation layers in the decoder.
Predictions from an unseen circuit after training on a dataset of 1080 datapoints from a different circuit.