This repository provides the SemanticRT dataset and ECM code for multispectral semantic segmentation (MSS). The repository is structured as follows.

  1. Task Introduction
  2. SemanticRT Dataset
  3. ECM Source Code

📖 Task Introduction

Introduction Figure. Visual illustration of the advantages of employing multispectral (RGB-Thermal) images for semantic segmentation. The complementary nature of RGB and thermal images is highlighted using yellow and green boxes, respectively. The RGB-only method, DeepLabV3+, is susceptible to incorrect segmentation, or even misses target objects entirely. In contrast, multispectral segmentation methods, e.g., EGFNet and our ECM method, which incorporate thermal infrared information, effectively identify the target segments in context. In particular, our results are visually closer to the ground truths than those of the state-of-the-art EGFNet.


📔 SemanticRT Dataset

The SemanticRT dataset, the largest MSS dataset to date, comprises 11,371 high-quality, pixel-level annotated RGB-thermal image pairs. It covers a wide range of challenging scenarios under adverse lighting conditions, such as low-light and pitch-black environments, as displayed in the figure below.

(Figure: sample SemanticRT scenarios under adverse lighting conditions.)

Getting Started

  • Dataset Access

Download the SemanticRT dataset (Google Drive), which is structured as follows:

```
SemanticRT_dataset/
├─ train.txt
├─ val.txt
├─ test.txt
├─ test_day.txt
├─ test_night.txt
├─ test_mo.txt
├─ test_xxx.txt
│ ···
├─ rgb/
│  ├─ ···
├─ thermal/
│  ├─ ···
├─ labels/
│  ├─ ···
···
```

The training, validation, and testing splits are defined in train.txt, val.txt, and the test_xxx.txt files, respectively.
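As a minimal sketch, the split files can be used to assemble the (rgb, thermal, label) path triples for a data loader. The exact line format of the split files and the `.png` extension are assumptions here, not guaranteed by the dataset layout above:

```python
from pathlib import Path


def read_split(root, split="train"):
    """Build (rgb, thermal, label) path triples for one split.

    Assumes each line of <split>.txt holds one sample identifier, and that
    the rgb/, thermal/, and labels/ folders share that identifier as the
    filename stem (an assumption about the on-disk layout).
    """
    root = Path(root)
    names = (root / f"{split}.txt").read_text().split()
    return [
        (root / "rgb" / f"{n}.png",
         root / "thermal" / f"{n}.png",
         root / "labels" / f"{n}.png")
        for n in names
    ]
```

A dataset class would then open each triple and return the RGB image, the thermal image, and the index-valued label map for one sample.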

  • Dataset ColorMap

Here is the color map reference for visualizing SemanticRT annotations.

```python
[
    (0, 0, 0),          # 0: background (unlabeled)
    (72, 61, 39),       # 1: car stop
    (0, 0, 255),        # 2: bike
    (148, 0, 211),      # 3: bicyclist
    (128, 128, 0),      # 4: motorcycle
    (64, 64, 128),      # 5: motorcyclist
    (0, 139, 139),      # 6: car
    (131, 139, 139),    # 7: tricycle
    (192, 64, 0),       # 8: traffic light
    (126, 192, 238),    # 9: box
    (244, 164, 96),     # 10: pole
    (211, 211, 211),    # 11: curve
    (205, 155, 155),    # 12: person
]
```
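The palette above can be applied to an index-valued label map to produce the color visualization. The sketch below assumes labels are single-channel class-index maps, represented here as nested Python lists to stay dependency-free; in practice one would vectorize this with NumPy:

```python
# SemanticRT color palette, index -> RGB (copied from the list above).
PALETTE = [
    (0, 0, 0),          # 0: background (unlabeled)
    (72, 61, 39),       # 1: car stop
    (0, 0, 255),        # 2: bike
    (148, 0, 211),      # 3: bicyclist
    (128, 128, 0),      # 4: motorcycle
    (64, 64, 128),      # 5: motorcyclist
    (0, 139, 139),      # 6: car
    (131, 139, 139),    # 7: tricycle
    (192, 64, 0),       # 8: traffic light
    (126, 192, 238),    # 9: box
    (244, 164, 96),     # 10: pole
    (211, 211, 211),    # 11: curve
    (205, 155, 155),    # 12: person
]


def colorize(label, palette=PALETTE):
    """Map a 2-D grid of class indices to an RGB image (nested lists)."""
    return [[palette[idx] for idx in row] for row in label]
```

For example, `colorize([[0, 6]])` maps background to black and the car class to `(0, 139, 139)`.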
  • Dataset Acknowledgement

Our SemanticRT dataset is mainly based on LLVIP, as well as other RGBT sources (OSU and INO). The images were re-annotated and adjusted to better fit the MSS task. All data and annotations provided are strictly intended for non-commercial research purposes only. If you use our SemanticRT dataset, we sincerely appreciate a citation of our work and strongly encourage you to also cite the source datasets mentioned above.


📗 ECM Source Code

Installation

The code requires python>=3.7, pytorch>=1.9, and torchvision>=0.11. Please follow the instructions here to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended. We also provide the environment file rgbt.yaml used in this work for reference.

In this repo, we provide ECM code for three benchmark datasets: MFNet, PST900, and SemanticRT. In the following, we take ECM on the SemanticRT dataset as an example.

Getting Started

  • Clone this repo.

```shell
$ git clone https://github.com/jiwei0921/SemanticRT.git
$ cd SemanticRT-main/ECM_SemanticRT
```
  • Model Training

First, download the SemanticRT dataset. Then, with just a few adaptations, the model is ready for training:

  1. Set your SemanticRT dataset path in ./configs/ECM.json
  2. Start training with python train_semanticRT.py
  • Model Inference

Alternatively, segmentation maps can be generated by loading our pre-trained model checkpoint:

  1. Set your SemanticRT dataset path in ./run/ECM.json
  2. Put the pre-trained ckpt into ./run
  3. Run inference with python test_semanticRT.py

Here, we provide the per-class IoU results of our ECM on the MFNet dataset. For a more comprehensive evaluation of ECM, refer to the Model Inference section above to reproduce its results.

| Class   | Car  | Person | Bike | Curve | Car Stop | Guardrail | Color Cone | Bump |
|---------|------|--------|------|-------|----------|-----------|------------|------|
| IoU (%) | 87.5 | 73.4   | 61.7 | 46.8  | 37.5     | 9.1       | 51.1       | 56.9 |
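As an illustrative sanity check (not the authors' evaluation script), the arithmetic mean over the eight classes in the table works out to 53.0; note that published mIoU figures may average over a different class set (e.g., including the unlabeled background), so this is only a check on the table itself:

```python
# Per-class IoU values copied from the table above.
ious = {
    "Car": 87.5, "Person": 73.4, "Bike": 61.7, "Curve": 46.8,
    "Car Stop": 37.5, "Guardrail": 9.1, "Color Cone": 51.1, "Bump": 56.9,
}

# Arithmetic mean over the eight listed classes.
miou = sum(ious.values()) / len(ious)
print(f"mIoU over the 8 listed classes: {miou:.1f}")  # → 53.0
```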
  • Code Acknowledgement

This code repository was originally built from EGFNet. It was modified and extended to support our network design and dataset setup.


Citation

@InProceedings{ji2023semanticrt,
      title     = {SemanticRT: A Large-Scale Dataset and Method for Robust Semantic Segmentation in Multispectral Images},
      author    = {Ji, Wei and Li, Jingjing and Bian, Cheng and Zhang, Zhicheng and Cheng, Li},
      booktitle = {Proceedings of the 31st ACM International Conference on Multimedia},
      year      = {2023},
      pages     = {3307–3316}
}

If you have any further questions, please email us at wji3@ualberta.ca.