This folder contains the implementation of InternImage for semantic segmentation.
Our segmentation code is developed on top of MMSegmentation v0.27.0.
- Installation
- Data Preparation
- Released Models
- Evaluation
- Training
- Manage Jobs with Slurm
- Image Demo
- Export
- Clone this repository:
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
- Create a conda virtual environment and activate it:
conda create -n internimage python=3.9
conda activate internimage
- Install CUDA>=10.2 with cudnn>=7 following the official installation instructions
- Install PyTorch>=1.10.0 and torchvision>=0.9.0 with CUDA>=10.2:
For example, to install torch==1.11 with CUDA==11.3:
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
- Install other requirements:
Note: the conda build of opencv breaks torchvision's GPU support, so install opencv with pip instead.
conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
- Install timm, mmcv-full and mmsegmentation:
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
- Install other requirements:
pip install opencv-python termcolor yacs pyyaml scipy
# Please use a version of numpy lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
- Compile CUDA operators
Before compiling, use the nvcc -V command to check whether your nvcc version matches the CUDA version of PyTorch.
cd ./ops_dcnv3
sh ./make.sh
# unit test (all checks should report True)
python test.py
- You can also install the operator using precompiled .whl files: DCNv3-1.0-whl
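The nvcc/PyTorch version check mentioned in the compile step can also be scripted. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes that `nvcc -V` prints a line containing `release <major>.<minor>` and that `torch.version.cuda` holds the CUDA version PyTorch was built with (it is `None` for CPU-only builds).

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract the 'major.minor' CUDA release from `nvcc -V` output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def versions_match(nvcc_release: Optional[str], torch_cuda: Optional[str]) -> bool:
    """Compare major.minor of both version strings; None never matches."""
    if nvcc_release is None or torch_cuda is None:
        return False
    return nvcc_release.split(".")[:2] == torch_cuda.split(".")[:2]


def check_local_toolchain() -> bool:
    """Run the check on the local machine (requires nvcc and torch)."""
    import torch  # imported lazily so the helpers above work without it

    output = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=True
    ).stdout
    return versions_match(parse_nvcc_release(output), torch.version.cuda)
```

Calling `check_local_toolchain()` before `sh ./make.sh` gives an early warning when the two versions differ; the two pure helpers can be used on saved output without a GPU.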
Prepare datasets according to the guidelines in MMSegmentation.
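Before training or evaluating, it can help to confirm that the dataset folders are where the configs expect them. The sketch below assumes the ADE20K layout from the MMSegmentation dataset guide (`images` and `annotations`, each with `training` and `validation`) under `data/ade/ADEChallengeData2016`; the helper name is hypothetical, so adjust the list if your layout differs.

```python
from pathlib import Path
from typing import List

# Sub-directories of the ADE20K layout described in the MMSegmentation
# dataset guide (assumed here; adjust if your layout differs).
EXPECTED_SUBDIRS = [
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
]


def missing_ade20k_dirs(root: str = "data/ade/ADEChallengeData2016") -> List[str]:
    """Return the expected sub-directories that are absent under `root`."""
    base = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (base / sub).is_dir()]
```

An empty return value means all four folders were found.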
Dataset: ADE20K
method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Checkpoint | Log |
---|---|---|---|---|---|---|---|---|
UperNet | InternImage-T | 512x512 | 47.9 / 48.1 | 59M | 944G | config | ckpt | log |
UperNet | InternImage-S | 512x512 | 50.1 / 50.9 | 80M | 1017G | config | ckpt | log |
UperNet | InternImage-B | 512x512 | 50.8 / 51.3 | 128M | 1185G | config | ckpt | log |
UperNet | InternImage-L | 640x640 | 53.9 / 54.1 | 256M | 2526G | config | ckpt | log |
UperNet | InternImage-XL | 640x640 | 55.0 / 55.3 | 368M | 3142G | config | ckpt | log |
UperNet | InternImage-H | 896x896 | 59.9 / 60.3 | 1.12B | 3566G | config | ckpt | log |
Mask2Former | InternImage-H | 896x896 | 62.6 / 62.9 | 1.31B | 4635G | config | ckpt | log |
Dataset: Cityscapes
method | backbone | resolution | mIoU (ss/ms) | #params | FLOPs | Config | Checkpoint | Log |
---|---|---|---|---|---|---|---|---|
UperNet | InternImage-T | 512x1024 | 82.58 / 83.40 | 59M | 1889G | config | ckpt | log |
UperNet | InternImage-S | 512x1024 | 82.74 / 83.45 | 80M | 2035G | config | ckpt | log |
UperNet | InternImage-B | 512x1024 | 83.18 / 83.97 | 128M | 2369G | config | ckpt | log |
UperNet | InternImage-L | 512x1024 | 83.68 / 84.41 | 256M | 3234G | config | ckpt | log |
UperNet* | InternImage-L | 512x1024 | 85.94 / 86.22 | 256M | 3234G | config | ckpt | log |
UperNet | InternImage-XL | 512x1024 | 83.62 / 84.28 | 368M | 4022G | config | ckpt | log |
UperNet* | InternImage-XL | 512x1024 | 86.20 / 86.42 | 368M | 4022G | config | ckpt | log |
SegFormer* | InternImage-L | 512x1024 | 85.16 / 85.67 | 220M | 1580G | config | ckpt | log |
SegFormer* | InternImage-XL | 512x1024 | 85.41 / 85.93 | 330M | 2364G | config | ckpt | log |
Mask2Former* | InternImage-H | 1024x1024 | 86.37 / 86.96 | 1094M | 7878G | config | ckpt | log |
* denotes that the model was trained with the extra Mapillary dataset.
Dataset: COCO-Stuff-164K
Dataset: COCO-Stuff-10K
Dataset: Pascal-Context-59
Dataset: NYU-Depth-V2
Dataset: Mapillary
method | backbone | resolution | #params | FLOPs | Config | Download |
---|---|---|---|---|---|---|
UperNet | InternImage-L | 512x1024 | 256M | 3234G | config | ckpt |
UperNet | InternImage-XL | 512x1024 | 368M | 4022G | config | ckpt |
SegFormer | InternImage-L | 512x1024 | 220M | 1580G | config | ckpt |
SegFormer | InternImage-XL | 512x1024 | 330M | 2364G | config | ckpt |
Mask2Former | InternImage-H | 896x896 | 1094M | 7878G | config | ckpt |
To evaluate our InternImage on ADE20K val, run:
sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval mIoU
For example, to evaluate InternImage-T with a single GPU:
python test.py configs/ade20k/upernet_internimage_t_512_160k_ade20k.py pretrained/upernet_internimage_t_512_160k_ade20k.pth --eval mIoU
For example, to evaluate InternImage-B on a single node with 8 GPUs:
sh dist_test.sh configs/ade20k/upernet_internimage_b_512_160k_ade20k.py pretrained/upernet_internimage_b_512_160k_ade20k.pth 8 --eval mIoU
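For readers unfamiliar with the metric requested by `--eval mIoU`: it is the per-class intersection over union, averaged over classes. The following is a minimal pure-Python sketch of that definition, not MMSegmentation's implementation. It assumes flat sequences of integer class labels, skips classes that appear in neither prediction nor ground truth, and treats label 255 as "ignore" (an assumption; check `ignore_index` in your dataset config).

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean IoU over classes present in the prediction or the ground truth."""
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels labelled as ignore do not count at all
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1
    ious: List[float] = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        if union > 0:
            ious.append(intersect[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For instance, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1, giving 7/12.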
To train an InternImage on ADE20K, run:
sh dist_train.sh <config-file> <gpu-num>
For example, to train InternImage-T with 8 GPUs on 1 node (total batch size 16), run:
sh dist_train.sh configs/ade20k/upernet_internimage_t_512_160k_ade20k.py 8
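The batch-size arithmetic behind "8 GPUs, total batch size 16" is worth spelling out when changing the GPU count. The sketch below uses hypothetical helper names; it assumes the per-GPU value corresponds to `samples_per_gpu` in an MMSegmentation 0.x data config and that "160k" in the config name denotes 160,000 training iterations.

```python
def per_gpu_batch_size(total_batch_size: int, num_gpus: int) -> int:
    """Samples each GPU processes per iteration for a given total batch size."""
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    return total_batch_size // num_gpus


def samples_seen(total_batch_size: int, iterations: int) -> int:
    """Number of training samples drawn over a whole schedule."""
    return total_batch_size * iterations
```

With these assumptions, `per_gpu_batch_size(16, 8)` is 2 and `samples_seen(16, 160_000)` is 2,560,000. Training on fewer GPUs with the same per-GPU value lowers the total batch size, which may require adjusting the learning rate.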
For example, to train InternImage-XL with 8 GPUs on 1 node (total batch size 16) through Slurm, run:
GPUS=8 sh slurm_train.sh <partition> <job-name> configs/ade20k/upernet_internimage_xl_640_160k_ade20k.py
To run inference on a single image or on multiple images, use the command below. If you specify a directory containing images instead of a single image, all the images in the directory are processed.
CUDA_VISIBLE_DEVICES=0 python image_demo.py \
data/ade/ADEChallengeData2016/images/validation/ADE_val_00000591.jpg \
configs/ade20k/upernet_internimage_t_512_160k_ade20k.py \
checkpoint_dir/seg/upernet_internimage_t_512_160k_ade20k.pth \
--palette ade20k
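The file-or-directory behaviour described above can be pictured with a small helper. This is a sketch of the idea only, not the code in `image_demo.py`; the function name and the list of accepted extensions are assumptions.

```python
from pathlib import Path
from typing import List

# Extensions treated as images in this sketch (an assumption, not the demo's list).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_images(path: str) -> List[Path]:
    """Return [path] for a file, or the sorted image files inside a directory."""
    target = Path(path)
    if target.is_dir():
        return sorted(
            p for p in target.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    return [target]
```

Sorting keeps the processing order deterministic, which makes it easier to match outputs to inputs.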
First, install mmdeploy:
pip install mmdeploy==0.14.0
To export a segmentation model from PyTorch to TensorRT, run:
MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"
python deploy.py \
"./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
"./configs/ade20k/${MODEL}.py" \
"${CKPT_PATH}" \
"./deploy/demo.png" \
--work-dir "./work_dirs/mmseg/${MODEL}" \
--device cuda \
--dump-info
For example, to export upernet_internimage_t_512_160k_ade20k from PyTorch to TensorRT, run:
MODEL="upernet_internimage_t_512_160k_ade20k"
CKPT_PATH="/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth"
python deploy.py \
"./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
"./configs/ade20k/${MODEL}.py" \
"${CKPT_PATH}" \
"./deploy/demo.png" \
--work-dir "./work_dirs/mmseg/${MODEL}" \
--device cuda \
--dump-info
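When exporting several models, the two shell variables above can be substituted programmatically. The sketch below only assembles the same argument list as the commands shown; the function name and the `config_dir` parameter are hypothetical, and it does not run anything itself.

```python
from typing import List


def build_export_command(model: str, ckpt_path: str,
                         config_dir: str = "./configs/ade20k") -> List[str]:
    """Assemble the deploy.py argument list for one model name and checkpoint."""
    return [
        "python", "deploy.py",
        "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py",
        f"{config_dir}/{model}.py",
        ckpt_path,
        "./deploy/demo.png",
        "--work-dir", f"./work_dirs/mmseg/{model}",
        "--device", "cuda",
        "--dump-info",
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the segmentation folder; passing a list rather than a single string avoids shell quoting issues with checkpoint paths.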