FuxiaoLiu/VisualNews-Repository

Visual News: Benchmark and Challenges in News Image Captioning

Fuxiao Liu, Yinghan Wang, Tianlu Wang, Vicente Ordonez

Abstract

We propose Visual News Captioner, an entity-aware model for the task of news image captioning. We also introduce Visual News, a large-scale benchmark consisting of more than one million news images along with associated news articles, image captions, author information, and other metadata. Unlike the standard image captioning task, news images depict situations where people, locations, and events are of paramount importance. Our proposed method can effectively combine visual and textual features to generate captions with richer information such as events and entities. More specifically, built upon the Transformer architecture, our model is further equipped with novel multi-modal feature fusion techniques and attention mechanisms, which are designed to generate named entities more accurately. Our method utilizes much fewer parameters while achieving slightly better prediction results than competing methods. Our larger and more diverse Visual News dataset further highlights the remaining challenges in captioning news images.

Dataset Examples

Examples from our VisualNews dataset

VisualNews is more diverse!

Examples from our VisualNews dataset

Getting Data

  • Our dataset is available upon request. Please contact fl3es@virginia.edu.
  • To access our dataset, please refer to this demo.
  • Some articles also include image positions, image titles, and keyphrases. We will release these soon. Stay tuned!

Requirements

  • Python 3
  • PyTorch > 1.0
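
The requirements above can be checked before running the code. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes "> 1.0" means any PyTorch release newer than 1.0.0.

```python
import re
import sys
from importlib import metadata


def version_tuple(version):
    """Parse a string such as '1.13.1+cu117' into (1, 13, 1), padded to three numbers."""
    numbers = []
    # Drop any local build suffix ("+cu117") and keep at most major.minor.patch.
    for piece in version.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def meets_requirements(python_version, torch_version):
    """Return True for Python 3 with a PyTorch release newer than 1.0.0 (assumed reading of '> 1.0')."""
    if torch_version is None:
        return False
    return python_version[0] == 3 and version_tuple(torch_version) > (1, 0, 0)


def installed_torch_version():
    """Return the installed PyTorch version string, or None if PyTorch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


def environment_ok():
    """Check the running interpreter and the installed PyTorch against the requirements."""
    return meets_requirements(sys.version_info, installed_torch_version())
```

Calling `environment_ok()` returns `True` only when the interpreter is Python 3 and an installed PyTorch satisfies the assumed version bound.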

Model

The code of our model is in ./model.

CUDA_VISIBLE_DEVICES=0 python main.py
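
The command above uses the `CUDA_VISIBLE_DEVICES` environment variable to restrict the process to GPU 0. The same launch can be sketched from Python, for example to choose a different GPU. This is only an illustration: the helpers `gpu_env` and `run_main` are hypothetical, and it assumes `main.py` is in the working directory you pass in.

```python
import os
import subprocess
import sys


def gpu_env(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set to the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_main(gpu_ids, script="main.py", cwd=None):
    """Run the script with the current interpreter, exposing only the selected GPUs to it."""
    return subprocess.run(
        [sys.executable, script],
        env=gpu_env(gpu_ids),
        cwd=cwd,
        check=True,  # raise CalledProcessError if the script exits with an error
    )
```

For example, `run_main([0])` mirrors the shell command above, while `run_main([1])` would expose only the second GPU to the script.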

If you have any questions, please email fl3es@virginia.edu.

Resources

Citing

If you find our paper/code useful, please consider citing:

@misc{liu2020visualnews,
      title={VisualNews : Benchmark and Challenges in Entity-aware Image Captioning}, 
      author={Fuxiao Liu and Yinghan Wang and Tianlu Wang and Vicente Ordonez},
      year={2020},
      eprint={2010.03743},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
