Semantic Metadata Extraction from Generated Video Captions

This repository provides the implementation of the entity, property, and relation extraction methods introduced in our paper Event and Entity Extraction from Generated Video Captions (CD-MAKE 2023) by Johannes Scherer, Ansgar Scherp and Deepayan Bhowmik. The paper proposes a framework that combines video captioning and NLP methods to extract semantic metadata solely from automatically generated video captions. The metadata comprise entities, the entities' properties, relations between entities, and the video category.

You can test the extraction methods on custom text and on captioned events (see Usage). This repository does not contain the implementation of our video classification method based on generated captioned events, the scripts we used to evaluate the extraction methods, the trained models of the Dense Video Captioning (DVC) methods we employed (see References), or the captioned events those models generated.

Installation
Usage
References

Installation

Create a conda environment.

conda create -n Video2Metadata python=3.7
conda activate Video2Metadata

Install spaCy with NeuralCoref from source (see huggingface/neuralcoref#310).

cd src
git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
pip install -r requirements.txt
pip install -e .
cd ../../

If installing spaCy with NeuralCoref from source fails, the following installation may work instead (see huggingface/neuralcoref#209). Note that spaCy 2.1.0 is much slower.

pip install spacy==2.1.0
pip install neuralcoref

Download the spaCy language model of your choice.

python -m spacy download en_core_web_lg

Install NLTK, which provides access to WordNet. WordNet is used to validate (compound) nouns, verbs, adjectives and adverbs.

conda install -c anaconda nltk
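
To illustrate what validating a word against WordNet can look like, here is a minimal sketch. It is not the repository's validation code: the function names and the injectable `lookup` parameter are hypothetical, and it assumes the WordNet corpus data is available to NLTK (depending on your setup, you may first need to run `nltk.download("wordnet")` once in a Python session).

```python
from typing import Callable, List


def wordnet_lookup(word: str, pos: str) -> List[object]:
    """Return the WordNet synsets for `word` with part of speech `pos`.

    Requires NLTK and the WordNet corpus data. The import is done lazily
    so this module can be loaded even when NLTK is not installed.
    """
    from nltk.corpus import wordnet as wn

    # WordNet stores compound lemmas with underscores, e.g. "ice_cream".
    return wn.synsets(word.replace(" ", "_"), pos=pos)


def is_valid_word(
    word: str,
    pos: str = "n",  # "n" noun, "v" verb, "a" adjective, "r" adverb
    lookup: Callable[[str, str], List[object]] = wordnet_lookup,
) -> bool:
    """Hypothetical helper: a word counts as valid if WordNet knows it."""
    return len(lookup(word, pos)) > 0


# Example call (needs NLTK and the WordNet data):
#   is_valid_word("ice cream", pos="n")
```

The `lookup` parameter exists only so the check can be exercised without the corpus, for example by passing a small stand-in function.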

Usage

Entity, Property & Relation Extraction from Text

Apply the semantic metadata extraction methods to custom text. For example, the following command

python extract_from_text.py --text "A man is standing in front of a fridge. He opens it and takes out a red glass."

results in the output

Input: A man is standing in front of a fridge. He opens it and takes out a red glass

Detected Sentences:
A man is standing in front of a fridge.
He opens it and takes out a red glass.

Entities:
fridge
front
glass
man

Entity-Property Pairs:
glass [red]

Relations:
(man, standing, ['in'], front)
(man, takes, ['out'], glass)
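
For further processing, the printed result above can be represented as plain Python data. The following is a sketch only: the `ExtractionResult` class and its field names are hypothetical and not part of this repository; only the values are taken from the example output.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (subject, verb, prepositions, object), mirroring the printed relation tuples
Relation = Tuple[str, str, List[str], str]


@dataclass
class ExtractionResult:
    """Hypothetical container for the three kinds of extracted metadata."""
    entities: List[str] = field(default_factory=list)
    properties: Dict[str, List[str]] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)

    def relations_of(self, entity: str) -> List[Relation]:
        """Return all relations in which `entity` is the subject or the object."""
        return [r for r in self.relations if entity in (r[0], r[3])]


# Values copied from the example output above.
result = ExtractionResult(
    entities=["fridge", "front", "glass", "man"],
    properties={"glass": ["red"]},
    relations=[
        ("man", "standing", ["in"], "front"),
        ("man", "takes", ["out"], "glass"),
    ],
)

print(result.relations_of("glass"))
# prints [('man', 'takes', ['out'], 'glass')]
```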

Entity, Property & Relation Extraction from Captioned Events

To apply the semantic metadata extraction methods to captioned events (including temporal information) instead of plain text, add an example consisting of sentences and their temporal segments to the list of examples in extract_from_captioned_events.py. The examples presented in the paper are already included there. Then run:

python extract_from_captioned_events.py
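
A captioned event pairs a sentence with a temporal segment. The sketch below shows one possible way to model such events and bring them into temporal order. It is an assumption for illustration: the `CaptionedEvent` class, the `events_to_text` helper and the timestamps are hypothetical, and the actual example format expected by extract_from_captioned_events.py may differ, so check the examples in that file before adding your own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CaptionedEvent:
    """Hypothetical model of one captioned event."""
    sentence: str
    start: float  # segment start in seconds
    end: float    # segment end in seconds


def events_to_text(events: List[CaptionedEvent]) -> str:
    """Order events by start time (then end time) and join their sentences."""
    ordered = sorted(events, key=lambda e: (e.start, e.end))
    return " ".join(e.sentence.strip() for e in ordered)


# Sentences from the text example; the timestamps are invented for illustration.
events = [
    CaptionedEvent("He opens it and takes out a red glass.", 4.0, 9.5),
    CaptionedEvent("A man is standing in front of a fridge.", 0.0, 4.0),
]

print(events_to_text(events))
# prints: A man is standing in front of a fridge. He opens it and takes out a red glass.
```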

References

The DVC models that we used for testing our framework

Contributors

Johannes Scherer, Ansgar Scherp and Deepayan Bhowmik
