Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks
This is the source code for the conference article "SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks", published at the OAGM Workshop 2021.
If you use the code, please cite our paper (arXiv):
title={SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks},
author={Aline Sindel and Abner Hernandez and Seung Hee Yang and Vincent Christlein and Andreas Maier},
booktitle={Proceedings of the OAGM Workshop 2021},
Install the requirements using pip or conda (Python 3):
- torch >= 1.7
- torchvision
- opencv-contrib-python-headless
- numpy
- decord
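
To confirm that the environment matches the list above, a small check along the following lines may help. It is a minimal sketch, not part of the repository: the helper names `parse_major_minor` and `check_requirements` are hypothetical, and it assumes the packages were installed under the pip distribution names listed above (a conda install may register some of them under different names, in which case they would be reported as missing).

```python
import re
from importlib import metadata

# Package names as listed above; only torch has a minimum version (>= 1.7).
REQUIRED = {
    "torch": (1, 7),
    "torchvision": None,
    "opencv-contrib-python-headless": None,
    "numpy": None,
    "decord": None,
}


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.7.0+cu110'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements(required=REQUIRED):
    """Map each package name to 'ok', 'missing', or 'too old'."""
    status = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "missing"
            continue
        if minimum is not None and parse_major_minor(installed) < minimum:
            status[name] = "too old"
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    for name, state in check_requirements().items():
        print(f"{name}: {state}")
```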
The dataset needs to be in the following folder structure:
- Video files in: "/videos/PHASE/", where PHASE is "train", "val" or "test".
- Bounding box labels in: "/videos/PHASE_bounding_box_list.txt"
Bounding box labels define the rectangle of the slide area in the format: Videoname,x0,y0,x1,y1
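
Before training or testing, the folder structure described above can be verified with a short script. This is a minimal sketch under stated assumptions: `check_dataset_layout` is a hypothetical helper that is not part of the repository, and `root` stands for whichever directory contains the "videos" folder on your machine.

```python
from pathlib import Path

PHASES = ("train", "val", "test")


def check_dataset_layout(root):
    """Return the expected folders and label files that are missing under `root`."""
    videos = Path(root) / "videos"
    missing = []
    for phase in PHASES:
        phase_dir = videos / phase  # video files for this phase
        label_file = videos / f"{phase}_bounding_box_list.txt"  # slide-area labels
        if not phase_dir.is_dir():
            missing.append(str(phase_dir))
        if not label_file.is_file():
            missing.append(str(label_file))
    return missing
```

An empty list means that every expected folder and label file was found; otherwise the returned paths show what still has to be created.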
Here is an example test_bounding_box_list.txt file (the header line must be included):
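
A minimal sketch of reading such a file follows. It rests on several assumptions that are not confirmed by the repository: the header line is taken to be the format string itself (Videoname,x0,y0,x1,y1), the coordinates are taken to be integer pixel values with (x0, y0) the top-left and (x1, y1) the bottom-right corner, and `read_bounding_boxes` is a hypothetical helper. The sample row (`lecture_01`) is made up for illustration.

```python
import csv
import io


def read_bounding_boxes(lines):
    """Parse 'Videoname,x0,y0,x1,y1' rows into {videoname: (x0, y0, x1, y1)}."""
    reader = csv.reader(lines)
    header = next(reader, None)  # the first line is the header and is skipped
    if header is None:
        raise ValueError("file is empty; a header line is required")
    boxes = {}
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        if len(row) != 5:
            raise ValueError(f"expected 5 fields, got {len(row)}: {row}")
        name = row[0].strip()
        x0, y0, x1, y1 = (int(value) for value in row[1:])
        # Assumed convention: (x0, y0) is top-left, (x1, y1) is bottom-right.
        if x1 <= x0 or y1 <= y0:
            raise ValueError(f"empty or inverted box for {name}")
        boxes[name] = (x0, y0, x1, y1)
    return boxes


# Hypothetical file content, for illustration only.
sample = io.StringIO("Videoname,x0,y0,x1,y1\nlecture_01,0,0,1280,720\n")
print(read_bounding_boxes(sample))
```

For a file on disk, pass the open file object instead of the `io.StringIO` sample, for example `with open(path, newline="") as f: boxes = read_bounding_boxes(f)`.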
The pretrained weights of SliTraNet from the paper can be downloaded here. Move them into the folder: "/weights"
Some settings have to be specified, as described in the python file, such as the dataset and output folders and model paths.
Stage 1 of SliTraNet can also be applied separately (see and afterwards the results can be loaded in
@author Aline Sindel