
YJ Captions 26k Dataset

We have developed a Japanese version of the MS COCO caption dataset, which we call the YJ Captions 26k Dataset. It was created to facilitate the development of image captioning in Japanese. Each Japanese caption describes an image from the MS COCO dataset, and each image has 5 captions.

Annotation Format

The annotations are stored in JSON format. The annotation scheme is the same as that of MS COCO. Please see the section on Image Caption Annotations in the MS COCO documentation.
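Since the scheme follows MS COCO caption annotations, a file can be read with any JSON parser. The following is a minimal sketch, assuming the standard MS COCO caption layout (a top-level `images` list and an `annotations` list whose entries carry `id`, `image_id`, and `caption`). The helper name `group_captions_by_image` and the sample values are hypothetical and only for illustration; they are not taken from the dataset.

```python
import json
from collections import defaultdict


def group_captions_by_image(dataset):
    """Map each image_id to the list of its caption strings."""
    captions = defaultdict(list)
    for ann in dataset["annotations"]:
        captions[ann["image_id"]].append(ann["caption"])
    return dict(captions)


# Tiny in-memory sample in the assumed MS COCO caption layout.
# The ids, file name, and captions are invented for illustration.
sample_json = """
{
  "images": [{"id": 1, "file_name": "example_0001.jpg"}],
  "annotations": [
    {"id": 10, "image_id": 1, "caption": "猫がソファの上で寝ています。"},
    {"id": 11, "image_id": 1, "caption": "ソファで眠っている猫。"}
  ]
}
"""

sample = json.loads(sample_json)
grouped = group_captions_by_image(sample)

for image_id, caps in grouped.items():
    print(image_id, len(caps), "captions")
```

To use it on the real annotations, replace the sample with `json.load(open(path, encoding="utf-8"))`, where `path` points to the downloaded annotation file. Reading with UTF-8 keeps the Japanese text intact.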


Creative Commons Attribution 4.0 License


  author = "Miyazaki, Takashi and Shimizu, Nobuyuki",
  title = "Cross-Lingual Image Caption Generation",
  booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  year = "2016",
  publisher = "Association for Computational Linguistics",
  pages = "1780--1790",
  location = "Berlin, Germany",
  doi = "10.18653/v1/P16-1168",
  url = ""