[NeurIPS'2023] Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models

HKUST-LongGroup/RECODE

RECODE for SGG in PyTorch

Our paper, Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models, has been accepted to NeurIPS 2023.

Installation

Check INSTALL.md for installation instructions.

Dataset

Check DATASET.md for dataset preprocessing instructions.

Extract CLIP Visual Features

bash scripts/extract_clip_obj_feature.sh

Generate Spatial Images and Offline Spatial Logits

bash scripts/draw_imgs_and_generate_spatial_logits.sh

Inference with RECODE

bash scripts/infer.sh
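
The three commands above can be chained so that a failure in one stage stops the later ones. The following is a minimal sketch, assuming the stages are meant to run in the order listed and that it is executed from the repository root; the helper name `run_stages` is hypothetical and not part of the RECODE codebase.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RECODE): run the given scripts in order
# and stop at the first one that is missing or exits with a non-zero status.
run_stages() {
    local script
    for script in "$@"; do
        if [ ! -f "$script" ]; then
            echo "missing: $script" >&2
            return 1
        fi
        echo "running: $script"
        bash "$script" || return 1
    done
}

# Usage, assuming the current directory is the repository root:
# run_stages scripts/extract_clip_obj_feature.sh \
#            scripts/draw_imgs_and_generate_spatial_logits.sh \
#            scripts/infer.sh
```

Stopping early avoids running inference on missing or partial intermediate outputs from an earlier stage that failed.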

Generated Files

We provide the extracted CLIP visual features, visual cue descriptions, and some spatial information, which you can download here.

Citations

If you find this project helpful for your research, please consider citing our paper in your publications.

@article{li2023zero,
  title={Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models},
  author={Li, Lin and Xiao, Jun and Chen, Guikun and Shao, Jian and Zhuang, Yueting and Chen, Long},
  journal={arXiv preprint arXiv:2305.12476},
  year={2023}
}

Credits

Our codebase is based on Scene-Graph-Benchmark.pytorch.
