
Visual Story Post-Editing

Vist-Edit is the dataset of the ACL 2019 short paper "Visual Story Post-Editing". Please do not redistribute it without our permission. Thanks.

arXiv: https://arxiv.org/abs/1906.01764

ACL Anthology: https://www.aclweb.org/anthology/P19-1658

Crowd-AI Lab: https://crowd.ist.psu.edu/crowd-ai-lab.html


AREL & GLAC Dataset

Each JSON file contains one machine-generated story (from AREL [1] or GLAC [2]) and five human-edited versions of it.

Filename: <photo_sequence_id>.json

JSON Attributes

photo_sequence_id: The concatenation of the photo sequence ids.

photo_sequence_ids: The list of photo sequence ids.

photo_sequence_urls: The list of photo urls.

auto_story_text_normalized: The story automatically generated by the model, normalized with NLP tools.

edited_story_text: A list of the five human-edited stories. Each entry contains:

  worker_id: The de-identified id of the worker who edited the story.

  edited_story_text: The original text as edited by the worker.

  normalized_edited_story_text_sent: The normalized version of edited_story_text.

data_split: train / dev / test.

model: The model used for generation (AREL or GLAC).

note: Notes.

album_id: The album id of the photos.
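
For reference, a minimal Python sketch for reading one of these files. The file name is a hypothetical placeholder, and the nesting of edited_story_text follows the description above:

```python
import json

# Hypothetical placeholder; real files are named <photo_sequence_id>.json.
with open("some_photo_sequence_id.json") as f:
    story = json.load(f)

print("Model:", story["model"], "| Split:", story["data_split"])
print("Auto story:", story["auto_story_text_normalized"])

# Assumed nesting: each entry in edited_story_text carries the worker id,
# the raw edited text, and its normalized form.
for edit in story["edited_story_text"]:
    print(edit["worker_id"], "->", edit["normalized_edited_story_text_sent"])
```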


Human_eval

Human evaluation results for stories from the AREL and GLAC models. The six evaluation aspects are those used in the Visual Storytelling paper [3].

CSV Attributes

team: The model or team that produced the rated story.

story_id: The id of the rated story.

album_id: The album id of the photos.

worker_id: The de-identified id of the worker who rated the story.

focused: Rating for the "Focus" aspect.

coherent: Rating for the "Structure and Coherence" aspect.

share: Rating for the "I Would Share" aspect.

human: Rating for the "This story sounds like it was written by a human" aspect.

grounded: Rating for the "Visually-Grounded" aspect.

detailed: Rating for the "Detailed" aspect.
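
A minimal sketch for aggregating these ratings with pandas, assuming the CSV has a header row with the columns above and numeric scores (the file name is a hypothetical placeholder):

```python
import pandas as pd

# Hypothetical file name; the columns follow the list above.
ratings = pd.read_csv("human_eval.csv")

# Mean score per team/model on each of the six aspects.
aspects = ["focused", "coherent", "share", "human", "grounded", "detailed"]
print(ratings.groupby("team")[aspects].mean())
```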


References

[1] Wang, Xin, et al. "No metrics are perfect: Adversarial reward learning for visual storytelling." arXiv preprint arXiv:1804.09160 (2018).

[2] Kim, Taehyeong, et al. "GLAC Net: GLocal Attention Cascading Networks for Multi-image Cued Story Generation." arXiv preprint arXiv:1805.10973 (2018).

[3] Huang, Ting-Hao Kenneth, et al. "Visual storytelling." Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.