Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

This is the homepage for the [CVPR 2023 highlight (top 2.5%)] paper:

Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille.

In this paper, we generate the Super-CLEVR dataset to systematically study the domain robustness of visual reasoning models along four factors: visual complexity, question redundancy, concept distribution, and concept compositionality.

Dataset

Super-CLEVR contains 30k images of vehicles (from UDA-Part) randomly placed in scenes, with 10 question-answer pairs per image. The vehicles have part annotations, so the objects in the images can have distinct part attributes.

Here [link] is the list of objects and parts in Super-CLEVR scenes.

The first 20k images and their paired questions are used for training, the next 5k for validation, and the last 5k for testing.
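The split described above can be reproduced with a short script. This is a minimal sketch, not code from the repository: it assumes the question file follows the CLEVR convention of a top-level `questions` list whose entries carry an `image_index` field. Both names are assumptions, so check them against the downloaded JSON before relying on this.

```python
import json

# Split boundaries taken from the text: first 20k images for training,
# next 5k for validation, last 5k for testing (half-open index ranges).
SPLITS = {"train": (0, 20000), "val": (20000, 25000), "test": (25000, 30000)}


def load_questions(path, list_key="questions"):
    """Load the question list from a JSON file.

    `list_key` is an assumed top-level key (CLEVR convention).
    """
    with open(path) as f:
        return json.load(f)[list_key]


def split_questions(questions, index_key="image_index", splits=SPLITS):
    """Group question dicts into train/val/test by their image index.

    `index_key` is an assumed field name (CLEVR convention).
    """
    out = {name: [] for name in splits}
    for q in questions:
        idx = q[index_key]
        for name, (lo, hi) in splits.items():
            if lo <= idx < hi:
                out[name].append(q)
                break
    return out
```

For example, `split_questions(load_questions("superCLEVR_questions_30k.json"))` would return a dict with `train`, `val`, and `test` lists, provided the file uses the assumed layout.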

Data	Download Link
images	images.zip
scenes	superCLEVR_scenes.json
questions	superCLEVR_questions_30k.json
questions (- redundancy)	superCLEVR_questions_30k_NoRedundant.json
questions (+ redundancy)	superCLEVR_questions_30k_AllRedundant.json

Dataset generation

To generate images:

Install Blender 2.79b. This repo builds heavily on the CLEVR data generation code; please refer to its README for additional details.
Download the CGPart dataset.
Preprocess the 3D models. You may need to modify the input and output paths in image_generation/preprocess_cgpart.py, then run sh scripts/preprocess_cgpart.py.
Next, run sh scripts/render_images.sh to render images with GPUs.
After the images and corresponding scene files are generated, use scripts/merge_scenes.py to merge the scene files into one JSON file (output/superCLEVR_scenes.json).
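To illustrate what the merge step does conceptually, here is a minimal sketch. It is not the repository's scripts/merge_scenes.py: the function name `merge_scene_files` is hypothetical, and wrapping the result in a top-level `scenes` key is an assumption borrowed from the CLEVR scene format.

```python
import glob
import json
import os


def merge_scene_files(scene_dir, output_path):
    """Collect every per-image scene JSON in `scene_dir` into one file.

    Files are read in sorted filename order. The top-level "scenes" key
    is an assumption (CLEVR convention), not confirmed by this repo.
    Returns the number of scenes merged.
    """
    scenes = []
    for path in sorted(glob.glob(os.path.join(scene_dir, "*.json"))):
        with open(path) as f:
            scenes.append(json.load(f))
    with open(output_path, "w") as f:
        json.dump({"scenes": scenes}, f)
    return len(scenes)
```

Write the merged file outside `scene_dir` so that a later run does not pick it up as if it were a per-image scene.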

10 example generated images and scenes are in output/images and output/scenes.

To generate questions

Run sh scripts/generate_questions.sh. This bash file includes several scripts for generating questions with and without parts.
output/superCLEVR_questions_5.json and output/superCLEVR_questions_part_5.json are examples of questions generated using templates without and with parts, respectively.
The argument --remove_redundant controls the level of redundancy in the generated questions.

Acknowledgements

This repo is heavily inspired by CLEVR and render-3d-segmentation.

Citation

If you find this code useful in your research then please cite:

@inproceedings{li2023super,
  title={Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning},
  author={Li, Zhuowan and Wang, Xingrui and Stengel-Eskin, Elias and Kortylewski, Adam and Ma, Wufei and Van Durme, Benjamin and Yuille, Alan L},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14963--14973},
  year={2023}
}
