Dataset Release Page for OVEN

Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities (ICCV 2023 Oral)

H. Hu, Y. Luan, Y. Chen, U. Khandelwal, M. Joshi, K. Lee, K. Toutanova, and M.-W. Chang.

[Project Page] [Annotation] [Images] [Contributed Code] [Leaderboard (Coming Soon)]

OVEN models recognize visual entities on Wikipedia from images in the wild.

Please use the following bib entry to cite this paper if you are using any resources from the repo.

@article{hu2023open,
  title={Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities},
  author={Hu, Hexiang and Luan, Yi and Chen, Yang and Khandelwal, Urvashi and Joshi, Mandar and Lee, Kenton and Toutanova, Kristina and Chang, Ming-Wei},
  journal={arXiv preprint arXiv:2302.11154},
  year={2023}
}

Introduction

In this project, we formally present the task of Open-domain Visual Entity recognitioN (OVEN), where a model needs to link an image to a Wikipedia entity with respect to a text query. We construct OVEN-Wiki by re-purposing 14 existing datasets, with all labels grounded in a single label space: Wikipedia entities. OVEN challenges models to select among six million possible Wikipedia entities, making it a general visual recognition benchmark with the largest number of labels.

OVEN Annotation

The annotations are released as one JSON Lines (jsonl) file for each set and data split, as discussed in the paper.

Below is an example of the format for a training example:

{
	"data_id": "oven_entity_train_00000000",
	"image_id": "oven_00000000",
	"question": "what is the model of this aircraft?",
	"entity_id": "Q1141409",
	"entity_text": "Dassault Falcon 900",
	"data_split": "entity_train"
}

Here, entity_id is the Wikidata ID, which is unique and can be looked up on Wikidata at https://www.wikidata.org/wiki/{entity_id}. The entity_text is the name of the entity with the corresponding Wikidata ID, as of the 2022/10/01 Wikipedia dump. Note that Wikipedia continually updates entity names, since the name and definition of an entity may change over time.
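As a minimal sketch of how these records might be consumed, the snippet below parses annotation lines and builds the Wikidata page URL for each entity. It assumes each annotation file holds one JSON object per line; the helper names `iter_annotations` and `wikidata_url` are hypothetical and not part of this repo.

```python
import json
from typing import Iterable, Iterator

# URL pattern taken from the description above.
WIKIDATA_URL = "https://www.wikidata.org/wiki/{entity_id}"


def iter_annotations(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one annotation dict per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def wikidata_url(entity_id: str) -> str:
    """Build the Wikidata page URL for an entity id such as 'Q1141409'."""
    return WIKIDATA_URL.format(entity_id=entity_id)


# The example record shown above, written as a single line.
sample = [
    '{"data_id": "oven_entity_train_00000000", "image_id": "oven_00000000", '
    '"question": "what is the model of this aircraft?", "entity_id": "Q1141409", '
    '"entity_text": "Dassault Falcon 900", "data_split": "entity_train"}'
]

for record in iter_annotations(sample):
    print(record["entity_text"], wikidata_url(record["entity_id"]))
```

To read a downloaded split instead, pass an open file handle to `iter_annotations`, since iterating over a text file yields its lines.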

The following are links to each annotation file:

Entity Set
- Train Split Link (962M)
- Val Split Link (26M)
- Test Split Link (80M)
Query Set
- Train Split Link (6.0M)
- Val Split Link (644K)
- Test Split Link (2.3M)
Human Set Link

To facilitate the reproducibility of experiments, we also release the text information for the 6M Wikipedia entities (derived from the 2022/10/01 Wikipedia dump).

6 Million Wikipedia Text Information
- Full Info (6.9G)
- Title Only (419M)

To reproduce the dual encoder results that use Wikipedia infobox images, you need to download the images from the URL in the wikipedia_image_url field.
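A rough sketch of collecting those URLs is shown below. It assumes the Wikipedia text information file stores one JSON object per line, like the annotation files; only the field name wikipedia_image_url comes from this page. The helper `iter_image_urls` is hypothetical, and the records in the usage example are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_image_urls(
    lines: Iterable[str], field: str = "wikipedia_image_url"
) -> Iterator[str]:
    """Yield the image URL of each record that has a non-empty URL field.

    Assumption: one JSON object per line. Records without the field,
    or with an empty or null value, are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        url = record.get(field)
        if url:
            yield url


# Made-up records, for illustration only.
demo_lines = [
    '{"wikipedia_image_url": "https://example.org/a.jpg"}',
    '{"wikipedia_image_url": null}',
    '{"other_field": 1}',
]
print(list(iter_image_urls(demo_lines)))
```

The actual download step is left out on purpose; the yielded URLs can be passed to whatever HTTP client you prefer.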

OVEN Images

See this guideline for image downloading
