InternImage for Semantic Segmentation

This folder contains the implementation of InternImage for semantic segmentation.

Our segmentation code is developed on top of MMSegmentation v0.27.0.

Installation

  • Clone this repository:
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
  • Create a conda virtual environment and activate it:
conda create -n internimage python=3.9
conda activate internimage

  • Install PyTorch and torchvision. For example, to install torch==1.11 with CUDA==11.3:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113  -f https://download.pytorch.org/whl/torch_stable.html
  • Install other requirements:

    Note: the conda build of opencv breaks torchvision's GPU support, so install opencv with pip instead.

conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
  • Install timm, mmcv-full, mmsegmentation and mmdet:
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
  • Install the remaining requirements and pin numpy and pydantic:
pip install opencv-python termcolor yacs pyyaml scipy
# Please use a version of numpy lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
  • Compile CUDA operators

Before compiling, please use the nvcc -V command to check whether your nvcc version matches the CUDA version of PyTorch.

cd ./ops_dcnv3
sh ./make.sh
# unit test (every check should report True)
python test.py
  • Alternatively, install the operator from the precompiled .whl files (DCNv3-1.0-whl).
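
The nvcc check recommended before compiling can also be scripted. The following is a minimal sketch, not part of this repository: `parse_nvcc_release`, `parse_cuda_version` and `check_local_toolchain` are hypothetical helper names, and the sketch assumes that `nvcc -V` prints a line containing `release <major>.<minor>` and that `torch.version.cuda` holds a string such as `"11.3"` (or `None` for CPU-only builds).

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc -V` text, or None if no release line is found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_cuda_version(version):
    """Return (major, minor) from a string such as "11.3", or None for CPU-only builds."""
    if not version:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def check_local_toolchain():
    """Compare the local nvcc release with the CUDA version PyTorch was built for."""
    import torch  # imported here so the parsers above work without PyTorch installed

    nvcc_text = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=True
    ).stdout
    nvcc = parse_nvcc_release(nvcc_text)
    torch_cuda = parse_cuda_version(torch.version.cuda)
    return nvcc is not None and nvcc == torch_cuda
```

Calling `check_local_toolchain()` inside the activated environment returns `True` only when both versions agree on major and minor numbers; a `False` result suggests fixing the toolchain before running `make.sh`.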

Data Preparation

Prepare datasets according to the guidelines in MMSegmentation.
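
After preparing ADE20K, a quick sanity check of the directory layout can save a failed run. The sketch below is an illustration only: `missing_ade20k_dirs` is a hypothetical helper, and the expected sub-directories are assumed from the MMSegmentation dataset guide and from the `data/ade/ADEChallengeData2016/images/validation` path used in the image demo command in this README.

```python
from pathlib import Path

# Assumed ADE20K layout, following the MMSegmentation dataset guide.
EXPECTED_SUBDIRS = [
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
]


def missing_ade20k_dirs(root="data/ade/ADEChallengeData2016"):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

An empty list means every assumed sub-directory is present; other datasets use different layouts and would need their own list.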

Released Models

Dataset: ADE20K
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
UperNet InternImage-T 512x512 47.9 / 48.1 59M 944G config ckpt | log
UperNet InternImage-S 512x512 50.1 / 50.9 80M 1017G config ckpt | log
UperNet InternImage-B 512x512 50.8 / 51.3 128M 1185G config ckpt | log
UperNet InternImage-L 640x640 53.9 / 54.1 256M 2526G config ckpt | log
UperNet InternImage-XL 640x640 55.0 / 55.3 368M 3142G config ckpt | log
UperNet InternImage-H 896x896 59.9 / 60.3 1.12B 3566G config ckpt | log
Mask2Former InternImage-H 896x896 62.6 / 62.9 1.31B 4635G config ckpt | log
Dataset: Cityscapes
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
UperNet InternImage-T 512x1024 82.58 / 83.40 59M 1889G config ckpt | log
UperNet InternImage-S 512x1024 82.74 / 83.45 80M 2035G config ckpt | log
UperNet InternImage-B 512x1024 83.18 / 83.97 128M 2369G config ckpt | log
UperNet InternImage-L 512x1024 83.68 / 84.41 256M 3234G config ckpt | log
UperNet* InternImage-L 512x1024 85.94 / 86.22 256M 3234G config ckpt | log
UperNet InternImage-XL 512x1024 83.62 / 84.28 368M 4022G config ckpt | log
UperNet* InternImage-XL 512x1024 86.20 / 86.42 368M 4022G config ckpt | log
SegFormer* InternImage-L 512x1024 85.16 / 85.67 220M 1580G config ckpt | log
SegFormer* InternImage-XL 512x1024 85.41 / 85.93 330M 2364G config ckpt | log
Mask2Former* InternImage-H 1024x1024 86.37 / 86.96 1094M 7878G config ckpt | log

* indicates that the model was trained with the extra Mapillary dataset.

Dataset: COCO-Stuff-164K
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
Mask2Former InternImage-H 896x896 52.6 / 52.8 1.31B 4635G config ckpt | log
Dataset: COCO-Stuff-10K
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
Mask2Former InternImage-H 512x512 59.2 / 59.6 1.28B 1528G config ckpt | log
Dataset: Pascal-Context-59
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
Mask2Former InternImage-H 480x480 69.7 / 70.3 1.07B 867G config ckpt | log
Dataset: NYU-Depth-V2
method backbone resolution mIoU (ss/ms) #params FLOPs Config Download
Mask2Former InternImage-H 480x480 67.1 / 68.1 1.07B 867G config ckpt | log
Dataset: Mapillary
method backbone resolution #params FLOPs Config Download
UperNet InternImage-L 512x1024 256M 3234G config ckpt
UperNet InternImage-XL 512x1024 368M 4022G config ckpt
SegFormer InternImage-L 512x1024 220M 1580G config ckpt
SegFormer InternImage-XL 512x1024 330M 2364G config ckpt
Mask2Former InternImage-H 896x896 1094M 7878G config ckpt

Evaluation

To evaluate our InternImage on ADE20K val, run:

sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval mIoU

For example, to evaluate InternImage-T with a single GPU:

python test.py configs/ade20k/upernet_internimage_t_512_160k_ade20k.py pretrained/upernet_internimage_t_512_160k_ade20k.pth --eval mIoU

To evaluate InternImage-B on a single node with 8 GPUs:

sh dist_test.sh configs/ade20k/upernet_internimage_b_512_160k_ade20k.py pretrained/upernet_internimage_b_512_160k_ade20k.pth 8 --eval mIoU
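
The `--eval mIoU` option reports mean intersection-over-union. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of how it is defined. It is not MMSegmentation's implementation, which works on arrays and accumulates per-class intersections and unions over the whole validation set before dividing. The `mean_iou` name and the `ignore_index=255` default are assumptions made for illustration.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in `pred` or `target`.

    `pred` and `target` are flat sequences of integer class labels.
    Pixels whose target label equals `ignore_index` are skipped.
    """
    intersection = Counter()
    pred_area = Counter()
    target_area = Counter()
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # classes absent from both pred and target are skipped
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Two classes, each with one correct pixel out of a union of two: IoU 0.5 each.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=3))
```

In the tables above, "ss" and "ms" refer to single-scale and multi-scale testing; the metric itself is the same in both cases.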

Training

To train an InternImage on ADE20K, run:

sh dist_train.sh <config-file> <gpu-num>

For example, to train InternImage-T with 8 GPUs on 1 node (total batch size 16), run:

sh dist_train.sh configs/ade20k/upernet_internimage_t_512_160k_ade20k.py 8
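
A total batch size of 16 on 8 GPUs corresponds to 2 images per GPU. The small sketch below shows the arithmetic for other GPU counts. `samples_per_gpu` here is a hypothetical helper named after the `samples_per_gpu` key that MMSegmentation 0.x configs use under `data`; whether other hyperparameters, such as the learning rate, need adjusting when the GPU count changes is not covered here.

```python
def samples_per_gpu(total_batch_size, num_gpus):
    """Per-GPU batch size that keeps the given total batch size."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the number of GPUs")
    return total_batch_size // num_gpus


print(samples_per_gpu(16, 8))  # the setup described above: 2 images per GPU
print(samples_per_gpu(16, 4))  # same total on 4 GPUs: 4 images per GPU
```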

Manage Jobs with Slurm

For example, to train InternImage-XL with 8 GPUs on 1 node (total batch size 16), run:

GPUS=8 sh slurm_train.sh <partition> <job-name> configs/ade20k/upernet_internimage_xl_640_160k_ade20k.py

Image Demo

To run inference on one or more images, use the command below. If you pass a directory instead of a single image, every image in that directory is processed.

CUDA_VISIBLE_DEVICES=0 python image_demo.py \
  data/ade/ADEChallengeData2016/images/validation/ADE_val_00000591.jpg \
  configs/ade20k/upernet_internimage_t_512_160k_ade20k.py  \
  checkpoint_dir/seg/upernet_internimage_t_512_160k_ade20k.pth  \
  --palette ade20k
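
The file-or-directory behaviour can be pictured with the following sketch. It is not the code of `image_demo.py`: `collect_images` is a hypothetical helper, and the set of accepted file extensions is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions; image_demo.py may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_images(path):
    """Return a list of image paths for a single file or for a directory."""
    path = Path(path)
    if path.is_dir():
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return [path]
```

Each returned path would then be passed through the segmentor in turn, with the result written out per image.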

Export

First, install mmdeploy:

pip install mmdeploy==0.14.0

To export a segmentation model from PyTorch to TensorRT, run:

MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info

For example, to export upernet_internimage_t_512_160k_ade20k from PyTorch to TensorRT, run:

MODEL="upernet_internimage_t_512_160k_ade20k"
CKPT_PATH="/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info
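
When exporting several models, the same command can be assembled programmatically. The sketch below only builds the argument list shown above from a model name and checkpoint path; `build_deploy_cmd` is a hypothetical helper, and it assumes the config lives under `./configs/ade20k/` as in the examples.

```python
import shlex


def build_deploy_cmd(model, ckpt_path):
    """Build the deploy.py argument list used in the export examples above."""
    return [
        "python", "deploy.py",
        "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py",
        f"./configs/ade20k/{model}.py",  # assumes an ADE20K config, as in the examples
        ckpt_path,
        "./deploy/demo.png",
        "--work-dir", f"./work_dirs/mmseg/{model}",
        "--device", "cuda",
        "--dump-info",
    ]


cmd = build_deploy_cmd(
    "upernet_internimage_t_512_160k_ade20k",
    "/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth",
)
print(shlex.join(cmd))  # prints the command as a single shell-quoted line
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the segmentation folder to start the export.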