[CVPR2024] Open-Vocabulary Semantic Segmentation with Image Embedding Balancing

We release the code and trained models for our CVPR 2024 paper, Open-Vocabulary Semantic Segmentation with Image Embedding Balancing.

Getting started

Environment setup

First, clone this repo:

git clone https://github.com/slonetime/EBSeg.git

Then create a new conda environment and install the required packages:

cd EBSeg
conda create --name ebseg python=3.9
conda activate ebseg
pip install -r requirements.txt
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'

Finally, build and install the MultiScaleDeformableAttention op used by Mask2Former:

cd ebseg/model/mask2former/modeling/pixel_decoder/ops/
sh make.sh
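After the steps above, a quick import check can confirm that the environment is usable before moving on. The following is a minimal sketch, not part of the repository: `check_module` is a hypothetical helper, and the import names `torch`, `detectron2` and `MultiScaleDeformableAttention` are assumptions about what the installation exposes.

```shell
# Hypothetical helper (not part of the EBSeg repository): report whether a
# Python module can be imported in the currently active environment.
check_module() {
    # Prefer `python`, fall back to `python3`, else try the literal name.
    py=$(command -v python || command -v python3 || echo python)
    if "$py" -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
    fi
}

# Assumed import names; adjust if your installation exposes different ones.
check_module torch
check_module detectron2
check_module MultiScaleDeformableAttention
```

Any line starting with MISSING suggests that the corresponding install step should be repeated inside the activated ebseg environment.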

Data preparation

Dataset preparation is the same as in SAN; please follow the instructions at https://github.com/MendelXu/SAN?tab=readme-ov-file#data-preparation.

Training

First, set the config_file, dataset_dir and output_dir paths in train.sh. Then train an EBSeg model with the following command:

bash train.sh
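A mistyped path usually surfaces only after the job has started, so it can help to check the paths first. The sketch below is an illustration, not part of the repository: `preflight` is a hypothetical helper, and its three arguments stand for whatever config file, dataset directory and output directory you set in the script.

```shell
# Hypothetical helper (not part of the EBSeg repository): verify the paths
# you put into the launch script before starting a run.
preflight() {
    # $1: config file, $2: dataset directory, $3: output directory
    preflight_status=0
    [ -f "$1" ] || { echo "config file not found: $1"; preflight_status=1; }
    [ -d "$2" ] || { echo "dataset directory not found: $2"; preflight_status=1; }
    # The output directory is created if it does not exist yet.
    mkdir -p "$3" || { echo "cannot create output directory: $3"; preflight_status=1; }
    return "$preflight_status"
}

# Example with placeholder paths (replace with your own values):
# preflight /path/to/config.yaml /path/to/datasets ./output && bash train.sh
```

The function returns a non-zero status if any check fails, so chaining it with `&&` launches the run only when all three paths look valid.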

Inference with our trained model

Download our trained models from the links in the following table (all results are mIoU):

Model	A-847	PC-459	A-150	PC-59	VOC
EBSeg-B	11.1	17.3	30.0	56.7	94.6
EBSeg-L	13.7	21.0	32.8	60.2	96.4

As with training, set the config_file, dataset_dir, checkpoint and output_dir paths in test.sh. Then test an EBSeg model with:

bash test.sh

Acknowledgments

Our code is based on SAN, CLIP, CLIP Surgery, Mask2Former and ODISE.

We thank the authors for their excellent work!
