EMNLP 2018. Learning to Describe Differences Between Pairs of Similar Images. Harsh Jhamtani, Taylor Berg-Kirkpatrick.

Spot-the-diff

Harsh Jhamtani, Taylor Berg-Kirkpatrick. Learning to Describe Differences Between Pairs of Similar Images. EMNLP 2018
Link: https://arxiv.org/pdf/1808.10584.pdf

Dataset

  • v0.1 of the dataset is available in data/.

Annotations:

  • data/annotations/ contains three JSON files, one each for the train, val, and test splits.
  • Each JSON file holds a list. Each item in the list is a dictionary with the keys 'img_id' and 'sentences', e.g.
    {"img_id": "400", "sentences": ["two of the three people in the front of the picture have moved", "there is a vehicle in the far back that is only in image two"]}
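A minimal sketch of reading one split, assuming the list-of-dictionaries layout described above. `load_annotations` is a hypothetical helper, not part of the released code, and it assumes each 'img_id' appears at most once per file.

```python
import json
from pathlib import Path


def load_annotations(path):
    """Read one split file and return a dict mapping img_id -> list of sentences."""
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)  # the file is a list of dictionaries

    annotations = {}
    for item in items:
        # Each item carries an 'img_id' string and a list of 'sentences'.
        # Assumption: img_id is unique within a split; a repeat would overwrite.
        annotations[item["img_id"]] = list(item["sentences"])
    return annotations
```

The README does not name the three files, so a call such as `load_annotations("data/annotations/train.json")` uses a guessed file name; check the directory for the actual names.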

Images

  • data/resized_images/ contains the relevant images.
  • Naming convention: <img_id>.png, <img_id>_2.png
  • The corresponding diff images are also provided: <img_id>_diff.jpg
  • All images have been resized to 224x224 pixels.
  • Original size images: bit.ly/spot_diff_data
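The naming convention above can be sketched as a small path helper. `image_paths` and its dictionary keys are hypothetical names chosen for illustration; only the directory and file name patterns come from this README.

```python
from pathlib import Path


def image_paths(img_id, root="data/resized_images"):
    """Return the paths of the image pair and its diff image for one img_id."""
    root = Path(root)
    return {
        "first": root / f"{img_id}.png",         # first image of the pair
        "second": root / f"{img_id}_2.png",      # second image of the pair
        "diff": root / f"{img_id}_diff.jpg",     # provided diff image
    }
```

For example, `image_paths("400")["second"]` points at `data/resized_images/400_2.png`. The helper only builds paths and does not check that the files exist.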

Cluster data

  • We provide clusters of differing pixels, computed with the suggested parameter settings and clustering algorithm.
  • For more details, check Code/usage.ipynb

Others

  • Clustering code has been added

TODO

  • Model Predictions (multi)

Reference

If you use the data or code, please consider citing:

@inproceedings{jhamtani2018learning,
  title={Learning to Describe Differences Between Pairs of Similar Images},
  author={Jhamtani, Harsh and Berg-Kirkpatrick, Taylor},
  booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2018}
}