[ECCV'22 Poster] Explicit Image Caption Editing

baaaad/ECE

Explicit Image Caption Editing

This repository contains the datasets and reference code for the paper Explicit Image Caption Editing, accepted to ECCV 2022. Refer to our full paper for detailed instructions and analysis.

Overview

The Explicit Caption Editing (ECE) task is defined as follows. Given an image and a reference caption (Ref-Cap), an ECE model explicitly predicts a sequence of edit operations (e.g., KEEP/DELETE/ADD) on the Ref-Cap that transforms it into a caption close to the ground-truth caption (GT-Cap). Typically, the Ref-Cap is slightly misaligned with the image.
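
To make the idea of an edit sequence concrete, here is a minimal sketch of applying KEEP/DELETE/ADD operations to a tokenized Ref-Cap. The tuple encoding of operations, the choice that ADD inserts a word without consuming a Ref-Cap token, and the name `apply_edits` are assumptions for illustration; they are not taken from the paper or from the TIger code. The example caption is made up.

```python
from typing import List, Tuple

# Hypothetical encoding: ("KEEP",), ("DELETE",), or ("ADD", word).
Edit = Tuple[str, ...]


def apply_edits(ref_cap: List[str], edits: List[Edit]) -> List[str]:
    """Apply a sequence of edit operations to a tokenized reference caption."""
    out: List[str] = []
    pos = 0  # index of the next unconsumed Ref-Cap token
    for edit in edits:
        op = edit[0]
        if op == "KEEP":
            # Copy the current Ref-Cap token to the output.
            out.append(ref_cap[pos])
            pos += 1
        elif op == "DELETE":
            # Skip the current Ref-Cap token.
            pos += 1
        elif op == "ADD":
            # Insert a new word; no Ref-Cap token is consumed (assumption).
            out.append(edit[1])
        else:
            raise ValueError(f"unknown edit operation: {op}")
    if pos != len(ref_cap):
        raise ValueError("edit sequence did not consume every Ref-Cap token")
    return out


ref = "a man riding a horse".split()
edits = [
    ("KEEP",), ("DELETE",), ("ADD", "woman"),
    ("KEEP",), ("KEEP",), ("DELETE",), ("ADD", "bike"),
]
print(" ".join(apply_edits(ref, edits)))  # a woman riding a bike
```

Under this encoding, every Ref-Cap token receives exactly one KEEP or DELETE, so the edit sequence stays aligned with the reference caption and is easy to inspect.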

ECE datasets

The ECE datasets comprise COCO-EE and Flickr30K-EE.

Specifically, COCO-EE was built from the MSCOCO dataset, and Flickr30K-EE was built from the e-ViL and Flickr30K datasets.

Each ECE instance contains three main fields:

  • image_id, the original ID of the given image in MSCOCO or Flickr30K.
  • Ref-Cap, the reference caption which needs to be edited.
  • GT-Cap, the ground-truth caption of the given image and also the editing target.
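
As a rough illustration of the three fields above, the sketch below models one instance as a small Python record. The class name `ECEInstance`, the helper `instance_from_record`, and the dictionary keys are hypothetical; the actual file layout and key names in the dataset folder may differ, so check the released files before relying on them. The sample values are made up.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ECEInstance:
    image_id: str  # original image ID in the source dataset
    ref_cap: str   # reference caption to be edited
    gt_cap: str    # ground-truth caption, the editing target


def instance_from_record(record: Dict[str, Any]) -> ECEInstance:
    """Build an instance from a dict; the key names are assumed, not confirmed."""
    return ECEInstance(
        image_id=str(record["image_id"]),
        ref_cap=record["Ref-Cap"].strip(),
        gt_cap=record["GT-Cap"].strip(),
    )


# Made-up record, for illustration only.
example = instance_from_record(
    {"image_id": 12345, "Ref-Cap": "a man riding a horse ", "GT-Cap": "a woman riding a bike"}
)
print(example.image_id, "|", example.ref_cap, "|", example.gt_cap)
```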

Examples from COCO-EE and Flickr30K-EE

Statistical summary of the COCO-EE and Flickr30K-EE

                                    COCO-EE                     Flickr30K-EE
                                    Train     Dev      Test     Train     Dev      Test
#Editing instances                  97,567    5,628    5,366    108,238   4,898    4,910
#Images                             52,587    3,055    2,948    29,783    1,000    1,000
Mean Reference Caption Length       10.3      10.2     10.1     7.3       7.4      7.4
Mean Ground-Truth Caption Length    9.7       9.8      9.8      6.2       6.3      6.3
Mean Edit Distance                  10.9      11.0     10.9     8.8       8.8      8.9
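
The table reports a mean edit distance between Ref-Cap and GT-Cap, but this README does not state how that distance is defined. As one plausible reading, the sketch below computes a word-level distance that counts only deletions and additions, mirroring the KEEP/DELETE/ADD operation set, using a longest-common-subsequence table. The function name `add_delete_distance`, the whitespace tokenization, and the choice of distance are all assumptions; the paper's definition may differ.

```python
from typing import List


def add_delete_distance(ref: List[str], gt: List[str]) -> int:
    """Number of word deletions plus additions needed to turn ref into gt."""
    # lcs[i][j] holds the LCS length of ref[:i] and gt[:j].
    lcs = [[0] * (len(gt) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref, start=1):
        for j, g in enumerate(gt, start=1):
            if r == g:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])
    kept = lcs[len(ref)][len(gt)]
    # Words outside the common subsequence are deleted from ref or added from gt.
    return (len(ref) - kept) + (len(gt) - kept)


# Made-up captions, for illustration only.
ref = "a man riding a horse".split()
gt = "a woman riding a bike".split()
print(add_delete_distance(ref, gt))  # 4: delete "man", "horse"; add "woman", "bike"
```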

Dataset Construction

The processed datasets are in the dataset folder and can also be downloaded directly from here. They include COCO-EE and Flickr30K-EE in train, dev, and test splits.

Alternatively, follow the instructions below to set up the environment and construct them yourself:

COCO-EE Construction

  1. Set up the coco-edit submodule and follow its instructions from this.

Flickr30K-EE Construction

  1. Set up the environment
    conda create -n flkree python=3.7
    conda activate flkree
    # json and csv are part of the Python standard library, so no separate install is needed
  2. Prepare the esnlive data and the output folder
  3. Construct Flickr30K-EE
    python construct_flickr30k_ee.py --split <split>

The ECE model: TIger

The code for our proposed ECE model, TIger, is now available here.
