Visio-Linguistic Probing

This repository contains the Rainbow (EACL 2024) and EViL-Probe (LREC-COLING 2024) visio-linguistic probing benchmarks. It provides the code to derive these benchmarks from the existing datasets on which they are based.

Benchmark Overview

Preparation

Before you can compile the benchmarks, the required source datasets have to be placed into the respective directories as detailed below:

| subdir of source_datasets | source dataset files | corresponding images (place in subdir of images) |
|---|---|---|
| /ARO | visual_genome_attribution.json (generated using VG_Attribution(image_preprocess=preprocess, download=True, root_dir=root_dir), as described in the repo) and visual_genome_relation.json (generated using VG_Relation(image_preprocess=preprocess, download=True, root_dir=root_dir), as described in the repo) | uses VisualGenome images that are downloaded when the files are created; place image files in /ARO |
| /Compositional-Visual-Genome | ComVG.csv | uses VisualGenome images; place image files in /VisualGenome |
| /Counting-Probe | git clone of the Counting Probe repository | linked in the Counting Probe repo; place image files in /visual7w |
| /EqBen | the file linked in the EqBen repository | linked in the EqBen repo; place subdirs in /EqBen |
| /Flickr30k | dataset_flickr30k.json as linked here by this repository | sign up here; place image files in /Flickr30k |
| /FOIL-IT | foilv1.0_test_2017.json as linked here from the FOIL-IT page | uses MS COCO images; place subdirs in /MS_COCO |
| /High-level | test.jsonl from here | uses MS COCO images they link here; place subdirs in /MS_COCO |
| /MS_COCO | dataset_coco.json as linked here by this repository | here; place subdirs in /MS_COCO |
| /Predicate-Noun | eval_set.json | uses images from OpenImages; place image files in /OpenImages |
| /SVO-Probes | svo_probes.csv | image URLs are listed in svo_probes.csv; uncomment line 180 of evil-probe/prepare_SVO_probes.py to download; place image files in /SVO_Probes |
| /VALSE | all of these files | uses images from SWiG (place image files in /SWiG), VisualDialog (place validation images in /VisualDialog/val2018), MS COCO (place subdirs in /MS_COCO) and visual7w (place image files in /visual7w) |
| /Visual-Spatial-Reasoning | all_vsr_validated_data.jsonl | uses MS COCO images; place subdirs in /MS_COCO |
| /VL-Checklist | these subdirs (overview here) | uses VisualGenome (place image files in /VisualGenome), OpenImages (place image files in /OpenImages) and SWiG (place image files in /SWiG) |
| /Why-Winoground-Hard | examples_augmented.jsonl as generated per the instructions in this repository | here; place image files in /Winoground |
| /Winoground | examples.jsonl | here; place image files in /Winoground |
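If you want to verify the layout before building, the following minimal sketch (not part of the repository) checks that the single-file sources from the table are in place. Datasets that require cloned repositories or whole directories (Counting-Probe, EqBen, VALSE, VL-Checklist) are omitted; adjust the paths if your layout differs.

import sys
from pathlib import Path

# Expected single-file sources per subdirectory of source_datasets (taken from the table above).
EXPECTED_SOURCE_FILES = {
    "ARO": ["visual_genome_attribution.json", "visual_genome_relation.json"],
    "Compositional-Visual-Genome": ["ComVG.csv"],
    "Flickr30k": ["dataset_flickr30k.json"],
    "FOIL-IT": ["foilv1.0_test_2017.json"],
    "High-level": ["test.jsonl"],
    "MS_COCO": ["dataset_coco.json"],
    "Predicate-Noun": ["eval_set.json"],
    "SVO-Probes": ["svo_probes.csv"],
    "Visual-Spatial-Reasoning": ["all_vsr_validated_data.jsonl"],
    "Why-Winoground-Hard": ["examples_augmented.jsonl"],
    "Winoground": ["examples.jsonl"],
}

def check_sources(root: str = "source_datasets") -> bool:
    """Print a line per expected file and return True if everything is present."""
    all_present = True
    for subdir, files in EXPECTED_SOURCE_FILES.items():
        for name in files:
            path = Path(root) / subdir / name
            present = path.exists()
            all_present &= present
            print(f"{'ok' if present else 'MISSING':7s} {path}")
    return all_present

if __name__ == "__main__":
    sys.exit(0 if check_sources() else 1)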

EViL-Probe

To compile EViL-Probe:

  • Place the source datasets in the subdirectories of source_datasets as detailed in the above table.
  • Execute bash evil-probe/build_benchmark.sh. (If you only wish to compile part of the benchmark, uncomment the respective line(s) in the script.) Files will be written to evil-probe/benchmark.

All *.jsonl files in the benchmark have the same structure. Here is an example entry:

{"example_id":"example_440188",
"img_id":"COCO_val2014_000000050125.jpg",
"img_ds":"MS_COCO\/val2014",
"ds_aspect":"noun",
"sent_ds":"FOIL-IT",
"sent_1":"An older picture of a bus and other vehicles in a parking lot. ",
"sent_1_label":true,
"sent_2":"an older picture of a bicycle and other vehicles in a parking lot.",
"sent_2_label":false}

Rainbow

Rainbow builds on EViL-Probe. To compile Rainbow, make sure you have compiled EViL-Probe and then run python3 rainbow/extract_color_data.py. Files will be written to rainbow/benchmark.

We annotated the Flickr30k examples with the color hues that are mentioned in the image descriptions. These can be found in rainbow/Flickr_30k_hex_codes.csv.

All *.jsonl files in the benchmark have the same structure. Here is an example entry:

{"example_id": "108_swapped",
"img_ds": "Flickr30k",
"img_id": "2332986053.jpg",
"sent_1": "A man in an orange shirt and a blue hard hat smiles .",
"sent_2": "A man in a blue shirt and an orange hard hat smiles .",
"sent_1_label": true, "sent_2_label": false,
"ds_aspect": "swap_color",
"sent_ds": ["flickr30k_random.jsonl"],
"sent_1_color": "orange",
"sent_2_color": "blue"}

Citations

EViL-Probe

@inproceedings{bexte-etal-2024-evil-probe,
    title = "{EV}il-Probe - a Composite Benchmark for Extensive Visio-Linguistic Probing",
    author = "Bexte, Marie  and
      Horbach, Andrea  and
      Zesch, Torsten",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.591",
    pages = "6682--6700",
    }

Rainbow

@inproceedings{bexte-etal-2024-rainbow,
    title = "Rainbow - A Benchmark for Systematic Testing of How Sensitive Visio-Linguistic Models are to Color Naming",
    author = "Bexte, Marie  and
      Horbach, Andrea  and
      Zesch, Torsten",
    editor = "Graham, Yvette  and
      Purver, Matthew",
    booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = mar,
    year = "2024",
    address = "St. Julian{'}s, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.eacl-long.112",
    pages = "1858--1875",
    }
