SNARE Dataset

SNARE dataset and code for MATCH and LaGOR models.

Paper and Citation

Language Grounding with 3D Objects

@article{snare,
  title={Language Grounding with {3D} Objects},
  author={Jesse Thomason and Mohit Shridhar and Yonatan Bisk and Chris Paxton and Luke Zettlemoyer},
  journal={arXiv},
  year={2021},
  url={https://arxiv.org/abs/2107.12514}
}

Installation

Clone

$ git clone https://github.com/snaredataset/snare.git

$ virtualenv -p $(which python3) --system-site-packages snare_env # or whichever environment manager you prefer
$ source snare_env/bin/activate

$ pip install --upgrade pip
$ pip install -r requirements.txt

Edit root_dir in cfgs/train.yaml to reflect your working directory.
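
For example, the relevant entry in cfgs/train.yaml would look something like this (the path is a placeholder; the surrounding keys follow whatever layout the config already has):

root_dir: /home/<user>/snare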

Download Data and Checkpoints

Download pre-extracted image features, language features, and pre-trained checkpoints from here and put them in the data/ folder.

Usage

Zero-shot CLIP Classifier

$ python train.py train.model=zero_shot_cls train.aggregator.type=maxpool 

MATCH

$ python train.py train.model=single_cls train.aggregator.type=maxpool 

LaGOR

$ python train.py train.model=rotator train.aggregator.type=two_random_index train.lr=5e-5 train.rotator.pretrained_cls=<path_to_pretrained_single_cls_ckpt>
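
For intuition, train.aggregator.type controls how per-view CLIP image features are fused into a single object embedding before scoring. A rough sketch of maxpool aggregation (hypothetical helper, not the repository's implementation):

import torch

def maxpool_aggregate(view_feats: torch.Tensor) -> torch.Tensor:
    # view_feats: (batch, n_views, feat_dim) per-view CLIP image features.
    # An element-wise max over the view axis keeps, for each feature
    # dimension, the strongest response across all rendered views.
    return view_feats.max(dim=1).values

# e.g. a batch of 4 objects, 8 screenshots each, 512-d CLIP ViT-B/32 features:
obj_emb = maxpool_aggregate(torch.randn(4, 8, 512))  # -> (4, 512)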

Scripts

Run scripts/train_classifiers.sh and scripts/train_rotators.sh to reproduce the results from the paper.

To train the rotators, edit scripts/train_rotators.sh and set PRETRAINED_CLS to the path of the classifier checkpoint you wish to use:

PRETRAINED_CLS="<root_path>/clip-single_cls-random_index/checkpoints/<ckpt_name>.ckpt"

Preprocessing

If you want to extract CLIP vision and language features from raw images:

  1. Download models-screenshot.zip from ShapeNetSem and extract it inside ./data/.
  2. Edit and run python scripts/extract_clip_features.py to save shapenet-clipViT32-frames.json.gz and langfeat-512-clipViT32.json.gz; the sketch below shows the underlying encoding step.
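
Both feature files hold standard CLIP ViT-B/32 embeddings. A minimal sketch of the encoding step using the openai clip package (the image path and caption are placeholders, not values from the script):

import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # 512-d embeddings

# Encode one rendered ShapeNetSem screenshot (placeholder path).
image = preprocess(Image.open("data/models-screenshots/<model_id>-0.png")).unsqueeze(0).to(device)
with torch.no_grad():
    img_feat = model.encode_image(image)  # (1, 512)
    # Encode a referring expression the same way.
    tokens = clip.tokenize(["classic armchair with wooden legs"]).to(device)
    lang_feat = model.encode_text(tokens)  # (1, 512)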

Leaderboard

Please send your ...test.json prediction results to Mohit Shridhar. We will get back to you as soon as possible.

Instructions:

  • Include a name for your model, your team name, and affiliation (if not anonymous).
  • Submissions are limited to a maximum of one per week. Please do not create fake email accounts to send multiple submissions.

Rankings (All / Visual / Blind are accuracies, %, on the full test set and its visual and blind subsets):

Rank  Model                     Date         All   Visual  Blind
1     DA4LG (Anonymous)         5 Feb 2024   81.9  88.5    75.0
2     MAGiC (Mitra et al.)      8 Jun 2023   81.7  87.7    75.4
3     DA4LG (Anonymous)         27 Jan 2024  80.9  87.7    73.7
4     VLG (Corona et al.)       15 Mar 2022  79.0  86.0    71.7
5     LOCKET (Anonymous)        14 Oct 2022  79.0  86.1    71.5
6     VLG (Corona et al.)       13 Nov 2021  78.7  85.8    71.3
7     LOCKET (Anonymous)        23 Oct 2022  77.7  85.5    69.5
8     LaGOR (Thomason et al.)   15 Sep 2021  77.0  84.3    69.4
9     MATCH (Thomason et al.)   15 Sep 2021  76.4  83.7    68.7
