Find Someone Who: Visual Commonsense Understanding in Human-Centric Grounding

Haoxuan You, Rui Sun*, Zhecan Wang*, Kai-Wei Chang, Shih-Fu Chang

[*: equal contribution]

Data:

Please download the annotation data for the train/validation/test splits.

Please also prepare the VCR image data/metadata because our annotations reuse them.

Each data sample contains the following items:

annot_id: Annotation id of the dataset
objects: Annotated objects (persons only)
boxes: Bounding-box location of each object (x1,x2,y1,y2,s)
img_fn: Image filename in VCR's raw data.
metadata_fn: Metadata filename in VCR's raw data.
statement: Commonsense description of the persons. If an element of the statement is a list containing a number, it refers to a person, and the number is that person's index in objects and boxes.
original_vcr_annot_id: Original annotation id in VCR
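
As a rough illustration of how the person references in statement can be resolved, here is a minimal Python sketch. It assumes a sample has already been loaded into a dict with the items listed above; the sample values, the helper names `render_statement` and `referenced_boxes`, and the idea that the non-list elements of statement are plain strings are all assumptions made for this example, not part of the released code.

```python
# Hypothetical sample: all values are made up for illustration only.
sample = {
    "annot_id": "example-0",
    "objects": ["person", "person"],
    "boxes": [[10, 20, 110, 220, 0.99], [150, 30, 260, 240, 0.97]],
    "img_fn": "example.jpg",
    "metadata_fn": "example.json",
    # The list element [1] refers to the person at index 1 in objects/boxes.
    "statement": [[1], "is", "holding", "a", "cup"],
    "original_vcr_annot_id": "example-vcr-0",
}


def render_statement(sample):
    """Turn a statement into text, replacing person references with labels."""
    words = []
    for element in sample["statement"]:
        if isinstance(element, list):
            # Each number in the list indexes into objects (and boxes).
            labels = [f"{sample['objects'][i]}{i}" for i in element]
            words.append(" and ".join(labels))
        else:
            words.append(str(element))
    return " ".join(words)


def referenced_boxes(sample):
    """Collect the boxes of every person the statement refers to."""
    return [
        sample["boxes"][i]
        for element in sample["statement"]
        if isinstance(element, list)
        for i in element
    ]


print(render_statement(sample))   # person1 is holding a cup
print(referenced_boxes(sample))   # [[150, 30, 260, 240, 0.97]]
```

The sketch leaves the box values untouched and only looks them up by index, so it does not depend on how the coordinates inside each box are ordered.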
