ActivityNet-EKG

Introduction

This repository consists of scripts, datasets, a unified schema, and RML files for generating the ActivityNet-EKG and ActivityNet-EKG* knowledge graphs. ActivityNet-EKG contains short clips from 11,839 videos and their corresponding 46,195 descriptions (captions), in which 137,777 visual objects and 114,203 textual entities are aligned with entity mentions and their corresponding classes (types) in the DBpedia and Wikidata knowledge bases. The ActivityNet-EKG knowledge graphs can open new research directions and can be used for multimedia processing, indexing, and retrieval tasks, e.g., information extraction from video-text, video question answering, video captioning, and video dialogue systems.

The figure below shows how the vision and language parts come into play to form three segments with captions describing the video content. The knowledge base (KB) part shows the recognized and linked entities, colored to match their corresponding classes (types) in the KB, and lists their actual DBpedia links in the table.

EKG_fig1

Video-Textual-Knowledge-Entity-Linking (ViTEL)

We propose a novel task called ViTEL (Video-Textual-Knowledge-Entity-Linking), in which a document is composed of textual data, visual data (in the form of video frames), and a knowledge base. The ViTEL task aims to recognize and link as much of the visual and textual parts as possible to the corresponding entity mentions (or classes) in the knowledge base, or to link them to new entities, extending the A-box of the knowledge base with the corresponding type assertion(s). The ActivityNet-EKG and ActivityNet-EKG* knowledge graphs can be used for the design, training, and evaluation of algorithms solving the ViTEL task. ViTEL thus closes the loop between the vision (videos), language (texts), and semantics (background knowledge) modalities.

Framework for ActivityNet-EKG and ActivityNet-EKG* development

The figure below shows the architecture of the framework used for the development of the ActivityNet-EKG and ActivityNet-EKG* knowledge graphs: (i) the input to the framework is visual data (e.g., images or videos), textual data, and annotated data; (ii) the textual part is processed by entity recognition and linking tools; (iii) the annotations of the visual data are used for the visual part, or alternatively a visual object detector can be applied. Declarative mapping rules (e.g., RML) map the heterogeneous data into RDF triples to generate the knowledge graph (a minimal sketch of this mapping step follows the figure). The resulting knowledge graph can be stored in a graph database (e.g., GraphDB).

EKG_fig2
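
To make the mapping step more concrete, the following is a minimal sketch, assuming Python with rdflib, of the kind of RDF triples that the declarative mapping rules produce for one linked textual entity. The namespace, predicate, and class names (e.g., ekg:hasCaption, ekg:hasTextualEntity) and the sample caption are illustrative placeholders and do not reflect the repository's actual schema defined in the RML files.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

# Placeholder namespace -- the real ActivityNet-EKG vocabulary is defined by the RML files.
EKG = Namespace("http://example.org/activitynet-ekg/")
DBR = Namespace("http://dbpedia.org/resource/")

g = Graph()
g.bind("ekg", EKG)

# One video, one segment caption, and one linked textual entity mention (illustrative data).
video = EKG["v_QOlSCBRmfWY"]
caption = EKG["v_QOlSCBRmfWY_caption_1"]
mention = EKG["v_QOlSCBRmfWY_mention_1"]

g.add((video, RDF.type, EKG.Video))                                   # hypothetical class
g.add((video, EKG.hasCaption, caption))                               # hypothetical predicate
g.add((caption, RDFS.label, Literal("A man is playing a guitar.")))   # illustrative caption text
g.add((caption, EKG.hasTextualEntity, mention))                       # hypothetical predicate
g.add((mention, EKG.linkedTo, DBR["Guitar"]))                         # entity link to DBpedia

# Print the resulting triples in Turtle syntax.
print(g.serialize(format="turtle"))
```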

Statistics of ActivityNet-EKG and ActivityNet-EKG*

Statistics of the developed ActivityNet-EKG (AN-EKG) and ActivityNet-EKG* (AN-EKG*) knowledge graphs are shown in the table below.

EKG_tab1

Example from the ActivityNet-EKG Knowledge Graph

A single entry of the ActivityNet-EKG knowledge graph is presented in detail below. A video segment consists of three video frames (each frame has one bounding box) and a textual caption describing the visual part. Figure (b) shows the SPARQL query for extracting the RDF triples of the document with ID v_QOlSCBRmfWY. A small RDF-based knowledge graph for this entry is shown, with the resource information in green and the textual and visual portions as blue and yellow RDF triples, respectively.

EKG_fig4
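
As a rough illustration of the kind of query shown in Figure (b), the sketch below, again assuming Python with rdflib and a local Turtle export of the graph, retrieves all triples whose subject is the document with ID v_QOlSCBRmfWY. The file name and the ekg: namespace are assumptions, not the published query or schema.

```python
from rdflib import Graph

g = Graph()
# Assumed local export of ActivityNet-EKG; adjust the path and format to the actual dump.
g.parse("activitynet_ekg.ttl", format="turtle")

query = """
PREFIX ekg: <http://example.org/activitynet-ekg/>   # placeholder namespace

SELECT ?p ?o
WHERE {
  ekg:v_QOlSCBRmfWY ?p ?o .
}
"""

# Print every predicate-object pair attached to the document.
for p, o in g.query(query):
    print(p, o)
```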

Cite

If you find ActivityNet-EKG, ActivityNet-EKG*, or ViTEL helpful in your work, please cite the paper:

Shahi Dost, Maria-Esther Vidal, Enrique Iglesias, and Ahmad Sakor. 2023. Knowledge Capturing from Multimodal Video-Textual Knowledge-Entity Linking. K-CAP 2023. https://www.k-cap.org/2023/

License

The source code and files are licensed under the MIT License.

Contact

If you have any questions regarding ActivityNet-EKG, ActivityNet-EKG*, ViTEL, or the source files, or if you want to contribute to this work, please contact shahi.dost[at]tib.eu.