The official implementation of "NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning".
- [6/23] 🔥 We released the checkpoint trained on PNG and Flickr30K Entities. Please see the documentation.
NICE is a multi-task collaborative cascaded framework for Panoptic Narrative Segmentation and Panoptic Narrative Detection (Visual Grounding). It introduces the novel insight of "mask first and box next".
You need PyTorch >= 1.7.1. Set up the environment with the following commands:
conda create -n nice python=3.7.15
conda activate nice
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
pip install -r requirements.txt
After that, follow the detectron2 installation instructions to install detectron2 into the environment:
python -m pip install -e detectron2
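As a quick sanity check after installation, you can compare the installed version against the required minimum. This is a minimal sketch; the helper names below are ours, not part of the repo:

```python
def version_tuple(v):
    """Parse a version string such as '1.7.1+cu110' into (1, 7, 1),
    ignoring any local build suffix after '+'."""
    return tuple(int(p) for p in v.split("+")[0].split("."))

def meets_minimum(installed, minimum="1.7.1"):
    """True if the installed version is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

# Inside the activated environment you could then run:
#   import torch
#   assert meets_minimum(torch.__version__), torch.__version__
```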
- Download the 2017 MSCOCO Dataset from its official webpage. You will need the images and panoptic segmentation annotations of the train and validation splits.
- Download the Panoptic Narrative Grounding Benchmark from the PNG project webpage. Organize the files as follows:
NICE
|_ panoptic_narrative_grounding
|_ images
| |_ train2017
| |_ val2017
|_ annotations
| |_ png_coco_train2017.json
| |_ png_coco_val2017.json
| |_ panoptic_segmentation
| | |_ train2017
| | |_ val2017
| |_ panoptic_train2017.json
| |_ panoptic_val2017.json
|_ data
- Pre-process the Panoptic Narrative Grounding ground-truth annotations for the dataloader using utils/pre_process.py.
- At the end of this step you should have two new files in your annotations folder.
panoptic_narrative_grounding
|_ annotations
|_ png_coco_train2017.json
|_ png_coco_val2017.json
|_ png_coco_train2017_dataloader.json
|_ png_coco_val2017_dataloader.json
|_ panoptic_segmentation
| |_ train2017
| |_ val2017
|_ panoptic_train2017.json
|_ panoptic_val2017.json
The pre-trained checkpoint can be downloaded from here, and the folder should be organized as follows:
pretrained_models
|_ fpn
| |_ model_final_cafdb1.pkl
|_ bert
| |_ bert-base-uncased
| | |_ pytorch_model.bin
| | |_ bert_config.json
| |_ bert-base-uncased.txt
Modify the paths in train_net.sh according to your local setup. If you only want to test the pretrained model, add --ckpt_path ${PRETRAINED_MODEL_PATH} and --test_only.
The checkpoint trained on PNG and Flickr30K Entities has been released; you can download it and test it!
You can also train only the PND task (visual grounding) by ignoring the mask outputs. We provide a Flickr30K Entities setup for testing the PND task alone: check train_net_flickr.py and use it in place of main.py to adapt to the PND task. For the dataset, download the Flickr30K annotations from here and the images from the official website.
Part of the code is built upon K-Net and PNG. Thanks for their great work!