In this project, we propose a simple hybrid panoptic segmentation method designed to cover a large number of classes (2000+) without compromising accuracy.
Open-vocabulary panoptic segmentation techniques can greatly increase the number of classes a model handles. While this offers greater flexibility, these techniques tend to achieve lower precision on novel classes. Moreover, continually adding new classes to such models is computationally expensive, because it requires retraining very large models (over 1000M parameters).
Closed-vocabulary methods, on the other hand, achieve high accuracy across all annotated classes, but they are inherently limited to those classes. Extending them to new classes typically requires manual annotation, which is labor-intensive and time-consuming.
conda create -n odise python=3.9
conda activate odise
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url
pip install git+
pip install pillow==9.5.0
python -m pip uninstall -y numpy
python -m pip install numpy==1.23.1
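Because the install steps above pin several exact versions, it can help to confirm that the environment actually ended up with them before running the demo. The following is a minimal sketch, not part of the project: the helper name `find_version_mismatches` is hypothetical, and it only compares version strings reported by the standard-library `importlib.metadata` module, ignoring a local build tag such as `+cu116`.

```python
from importlib import metadata


def find_version_mismatches(expected):
    """Return {package: (expected, installed_or_None)} for every pin that does not match."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore a local build tag such as "+cu116" when comparing.
        if installed is None or installed.split("+")[0] != wanted.split("+")[0]:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    # Pins copied from the install commands above.
    pins = {
        "torch": "1.13.1",
        "torchvision": "0.14.1",
        "Pillow": "9.5.0",
        "numpy": "1.23.1",
    }
    for name, (wanted, installed) in find_version_mismatches(pins).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

Run inside the activated `odise` environment, the script prints nothing when every pin matches and one line per package otherwise.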
conda activate odise
cd ~/ODISE
python demo/ --input demo/examples/coco.jpg --output demo/coco_pred.jpg
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=10.2 -c pytorch
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
git clone
mim install mmdet
cd ~/mmdetection
#download base model
cd ~/mmdetection
mim download mmdet --config rtmdet_tiny_8xb32-300e_coco --dest .
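# Add this to a Python file and run it from ~/mmdetection
from mmdet.apis import init_detector, inference_detector

# Set this to the path of the model config file
config_file = ''
# Checkpoint fetched by the mim download step
checkpoint_file = 'rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth'

# Build the detector on the CPU; use device='cuda:0' to run on a GPU
model = init_detector(config_file, checkpoint_file, device='cpu')
result = inference_detector(model, 'demo/demo.jpg')
print(result)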
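The detector returns every candidate detection, including low-confidence ones, so a common follow-up step is to keep only predictions above a score threshold. The sketch below illustrates that step on plain Python lists. The helper `keep_confident`, the threshold values, and the dummy inputs are all assumptions made for illustration and are not part of this project or of MMDetection.

```python
def keep_confident(scores, labels, bboxes, score_thr=0.3):
    """Keep the detections whose score is at least score_thr."""
    kept = []
    for score, label, bbox in zip(scores, labels, bboxes):
        if score >= score_thr:
            kept.append({"score": score, "label": label, "bbox": list(bbox)})
    return kept


# Dummy values standing in for a real prediction (illustrative only).
scores = [0.91, 0.12, 0.55]
labels = [0, 2, 16]
bboxes = [[10, 20, 110, 220], [5, 5, 50, 50], [300, 40, 420, 180]]

for det in keep_confident(scores, labels, bboxes, score_thr=0.5):
    print(det)

# If the return value of inference_detector is stored in a variable
# (here assumed to be called `result`) and MMDetection 3.x is installed,
# the inputs could be taken from its pred_instances field, for example:
#   pred = result.pred_instances
#   keep_confident(pred.scores.tolist(), pred.labels.tolist(), pred.bboxes.tolist())
```

Working on plain lists keeps the helper independent of the tensor library, at the cost of a conversion step for real predictions.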
git clone
#download finetuned model
cd ~/HybridPan/models
wget --content-disposition ""
cd ~/HybridPan