GRAtt-VIS

This is the official PyTorch implementation of GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation. This repository provides code for training and testing the proposed GRAtt-VIS model. GRAtt-VIS is an efficient video instance segmentation and tracking model that achieves state-of-the-art results on several benchmarks, including YTVIS-19/21/22 and OVIS.

Updates

  • Jun 14, 2023: Code is now available!

Installation

GRAtt-VIS is built upon VITA. See installation instructions.

Getting Started

We provide a training script, invoked as train_net_genvis.py in the commands below, that trains all the configs provided in GRAtt-VIS. To train a model on VIS, first set up the corresponding datasets following Preparing Datasets. Then start training from the pretrained weights for the target VIS dataset, available in VITA's Model Zoo:

python3 train_net_genvis.py --num-gpus 4 \
--config-file configs/genvis/ovis/grattvis_R50_bs8.yaml \
MODEL.WEIGHTS weights/vita_r50_ovis.pth \
MODEL.GENVIS.USE_MEM False MODEL.GENVIS.GATED_PROP True \
OUTPUT_DIR your_output_dir
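
Before launching a multi-GPU run, it can help to confirm that the config file and the pretrained weights referenced in the command above are actually on disk. The following is a minimal sketch of such a check, not part of the repository; the helper name `missing_inputs` is hypothetical, and the paths are simply the ones from the training command, assumed to be relative to the repository root.

```python
from pathlib import Path


def missing_inputs(paths):
    """Return the paths (as strings) that do not exist as regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# Paths taken from the training command above; adjust them to your setup.
required = [
    "configs/genvis/ovis/grattvis_R50_bs8.yaml",  # training config
    "weights/vita_r50_ovis.pth",                  # pretrained VITA weights
]

missing = missing_inputs(required)
if missing:
    print("Missing before training can start:", ", ".join(missing))
else:
    print("All inputs found; ready to launch training.")
```

Run it from the directory in which you intend to start training, since the paths are resolved relative to the current working directory.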

To evaluate a model's performance, run:

python3 train_net_genvis.py --num-gpus 1 \
--config-file YOUR_MODEL_PATH/config.yaml \
--eval-only MODEL.WEIGHTS YOUR_MODEL_PATH/model_checkpoint.pth \
MODEL.GENVIS.USE_MEM False MODEL.GENVIS.GATED_PROP True \
OUTPUT_DIR your_output_dir
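
The training and evaluation commands above share one shape: a few flags followed by alternating KEY VALUE config overrides. As an illustration only, the sketch below assembles that command line from a dictionary of overrides, which can be convenient when scripting several runs. The helper `build_command` is hypothetical and not part of the repository; the script name, flags, and config keys are copied from the commands above.

```python
import shlex


def build_command(script, config_file, num_gpus, overrides, eval_only=False):
    """Assemble the command line as a list of argument strings."""
    cmd = ["python3", script, "--num-gpus", str(num_gpus),
           "--config-file", config_file]
    if eval_only:
        cmd.append("--eval-only")
    # Config overrides follow the flags as alternating KEY VALUE tokens.
    for key, value in overrides.items():
        cmd += [key, str(value)]
    return cmd


train_cmd = build_command(
    script="train_net_genvis.py",
    config_file="configs/genvis/ovis/grattvis_R50_bs8.yaml",
    num_gpus=4,
    overrides={
        "MODEL.WEIGHTS": "weights/vita_r50_ovis.pth",
        "MODEL.GENVIS.USE_MEM": False,
        "MODEL.GENVIS.GATED_PROP": True,
        "OUTPUT_DIR": "your_output_dir",
    },
)

eval_cmd = build_command(
    script="train_net_genvis.py",
    config_file="YOUR_MODEL_PATH/config.yaml",
    num_gpus=1,
    overrides={
        "MODEL.WEIGHTS": "YOUR_MODEL_PATH/model_checkpoint.pth",
        "MODEL.GENVIS.USE_MEM": False,
        "MODEL.GENVIS.GATED_PROP": True,
        "OUTPUT_DIR": "your_output_dir",
    },
    eval_only=True,
)

# Print shell-ready strings; pass the lists to subprocess.run to execute them.
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

The sketch only prints the commands, so it is safe to run without the datasets or weights in place.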

Model Zoo

YouTubeVIS-2019

Backbone AP AP50 AP75 AR1 AR10 Download
R-50 50.4 70.7 55.2 48.4 58.7 model
Swin-L 63.1 85.6 67.2 55.5 67.8 model

YouTubeVIS-2021

Backbone AP AP50 AP75 AR1 AR10 Download
R-50 48.9 69.2 53.1 41.8 56.0 model
Swin-L 60.3 81.3 67.1 48.8 64.5 model

YouTubeVIS-2022

Backbone AP AP50 AP75 AR1 AR10 Download
R-50 40.8 60.1 45.9 35.7 46.9 model
Swin-L 52.6 74.0 57.9 45.0 57.1 model

OVIS

Backbone AP AP50 AP75 AR1 AR10 Download
R-50 36.2 60.8 36.8 16.8 40.0 model
Swin-L 45.7 69.1 47.8 19.2 49.4 model

License

The majority of GRAtt-VIS is licensed under an Apache-2.0 License. However, portions of the project are available under separate license terms: Detectron2 (Apache-2.0 License), Mask2Former (MIT License), Deformable-DETR (Apache-2.0 License), GENVIS (Apache-2.0 License), and VITA (Apache-2.0 License).

Citing GRAtt-VIS

If you find GRAtt-VIS useful in your research and wish to refer to the baseline results, please use the following BibTeX entry as a citation.

@article{hannan2023gratt,
  title={GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation},
  author={Hannan, Tanveer and Koner, Rajat and Bernhard, Maximilian and Shit, Suprosanna and Menze, Bjoern and Tresp, Volker and Schubert, Matthias and Seidl, Thomas},
  journal={arXiv preprint arXiv:2305.17096},
  year={2023}
}

Acknowledgement

We acknowledge the following repositories, from which we have inherited code snippets.

  1. Detectron2
  2. Mask2Former
  3. VITA
  4. GENVIS
