Live Video Captioning (LVC) involves detecting and describing dense events within video streams. Traditional dense video captioning approaches typically focus on offline solutions where the entire video is available for analysis by the captioning model. In contrast, the LVC paradigm requires models to generate captions for video streams in an online manner. This imposes significant constraints, such as working with incomplete observations of the video and the need for temporal anticipation.
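To make the online setting concrete, here is a minimal conceptual sketch of an LVC loop (not the authors' model); `stream`, `captioner`, and `delta_t` are hypothetical names for illustration:

```python
def live_video_captioning(stream, captioner, delta_t):
    """Emit captions every delta_t seconds using only frames seen so far."""
    observed, next_cast = [], delta_t
    for t, frame in stream:               # frames arrive in temporal order
        observed.append(frame)            # no access to future frames
        if t >= next_cast:                # a casting point is reached
            yield t, captioner(observed)  # caption an incomplete video
            next_cast += delta_t
```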
In this repository we release the evaluation toolkit for the LVC problem, including the scripts to compute the novel Live Score metric detailed in our [paper].
If you use any content of this repo for your work, please cite the following bib entry:
```
@article{lvc2024,
  title={Live Video Captioning},
  author={Eduardo Blanco-Fernández and Carlos Gutiérrez-Álvarez and Nadia Nasri and Saturnino Maldonado-Bascón and Roberto J. López-Sastre},
  journal={arXiv preprint arXiv:2406.14206},
  year={2024}
}
```
Dependencies:
- Python 3.9. We recommend using Anaconda to create a virtual environment with the required dependencies:
```
conda create -n lvc python=3.9
```
- Activate the environment:
```
conda activate lvc
```
- Install the Python dependencies:
```
pip install -r requirements.txt
```
- Java 1.8.0
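The METEOR scorer commonly relies on a Java backend, so it can help to verify that Java is reachable before running the toolkit. This is just a convenience check, not part of the toolkit:

```python
import shutil
import subprocess

# Sanity check: Java must be on the PATH for the Java-based scorers to run.
java = shutil.which("java")
if java is None:
    raise RuntimeError("Java not found on PATH; install Java 1.8.0 first.")
# Note: `java -version` prints its output to stderr, not stdout.
print(subprocess.run([java, "-version"], capture_output=True, text=True).stderr)
```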
Some basic instructions:
- Clone this GitHub repository.
- Unzip the file with the LVC annotations for the ActivityNet Captions dataset:
```
cd lvc/data/validation
tar -xvzf data_validation.tar.gz
```
- Unzip the file with the dense captions produced by our LVC model:
```
cd lvc/data/captions
tar -xvzf data_captions.tar.gz
```
- To obtain the Live Score metric for all the scorers used in our paper (METEOR, Bleu_4 and ROUGE_L), run the script `generate_results_lvc.py`. It computes the results of our LVC model for each scorer and saves them in the `results` folder. Then, run the script `average_scores.py` to average the per-video scores; a sketch of this averaging step is shown below.
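For intuition, here is a minimal sketch of what the averaging step could look like; the `results/*.json` layout and the key names are assumptions for illustration, not the toolkit's actual schema:

```python
import glob
import json
from collections import defaultdict

# Aggregate per-video scores (hypothetical schema) into one mean per scorer.
totals, counts = defaultdict(float), defaultdict(int)
for path in glob.glob("results/*.json"):
    with open(path) as f:
        scores = json.load(f)   # e.g. {"METEOR": 0.12, "Bleu_4": 0.03, ...}
    for scorer, value in scores.items():
        totals[scorer] += value
        counts[scorer] += 1

for scorer, total in totals.items():
    print(f"{scorer}: {total / counts[scorer]:.4f}")
```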
- To obtain the live evolution of the Live Score metric for a particular video, run the script `generate_images.py`. It generates the figures released in the paper, where one can observe the evolution of the novel metric over time for every video; a plotting sketch follows below.
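As a rough sketch of such a plot, assuming per-scorer lists of (time, score) pairs (the file name and layout below are hypothetical):

```python
import json
import matplotlib.pyplot as plt

# Plot the Live Score evolution for one video (hypothetical input layout).
with open("results/v_example_live_scores.json") as f:
    evolution = json.load(f)  # {"METEOR": [[t0, s0], [t1, s1], ...], ...}

for scorer, points in evolution.items():
    times, scores = zip(*points)
    plt.plot(times, scores, label=scorer)

plt.xlabel("Video time (s)")
plt.ylabel("Live Score")
plt.legend()
plt.savefig("live_score_evolution.png")
```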
Enjoy!
To evaluate your own LVC model:
- Save your LVC results in a JSON file, using the JSON format detailed in the ActivityNet Captions challenge (see the sketch after this list). We provide a sample JSON file in the `results` folder.
- Specify the `delta_t` parameter your LVC model uses to cast the dense video captions, and run `generate_results_lvc.py` with your JSON file.
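As a reference, here is a minimal sketch of writing results in the ActivityNet Captions challenge format; the video id, captions, and output path are placeholders:

```python
import json

# Build a minimal result file in the ActivityNet Captions challenge format.
results = {
    "version": "VERSION 1.0",
    "results": {
        "v_QOlSCBRmfWY": [  # placeholder video id
            {"sentence": "A man plays the guitar.", "timestamp": [0.0, 15.2]},
            {"sentence": "He waves at the camera.", "timestamp": [15.2, 24.8]},
        ]
    },
    "external_data": {"used": False, "details": None},
}

with open("results/my_lvc_results.json", "w") as f:
    json.dump(results, f)
```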
This repository is released under the GNU General Public License v3.0 (refer to the LICENSE file for details).