SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks
This is the source code for the conference paper "SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks", published at the OAGM Workshop 2021.
If you use the code, please cite our paper (arXiv):
```bibtex
@InProceedings{sindel2022slitranet,
  title={SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks},
  author={Aline Sindel and Abner Hernandez and Seung Hee Yang and Vincent Christlein and Andreas Maier},
  year={2022},
  booktitle={Proceedings of the OAGM Workshop 2021},
  doi={10.3217/978-3-85125-869-1-10},
  pages={59--64}
}
```
Install the requirements using pip or conda (Python 3):
- torch >= 1.7
- torchvision
- opencv-contrib-python-headless
- numpy
- decord
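For example, the dependencies listed above can be installed in one step with pip:

```
pip install "torch>=1.7" torchvision opencv-contrib-python-headless numpy decord
```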
The dataset needs to be in the following folder structure:
- Video files in: "/videos/PHASE/", where PHASE is "train", "val" or "test".
- Bounding box labels in: "/videos/PHASE_bounding_box_list.txt"
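For illustration, the expected layout could look like the tree below (the video filenames and the .mp4 extension are placeholders, not requirements from this repository):

```
videos/
├── train/
├── val/
├── test/
│   ├── Architectures_1.mp4
│   └── Architectures_2.mp4
├── train_bounding_box_list.txt
├── val_bounding_box_list.txt
└── test_bounding_box_list.txt
```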
Bounding box labels define the rectangle of the slide area, one line per video, in the format: Videoname,x0,y0,x1,y1
Here is an example test_bounding_box_list.txt file (the header line must be included):
```
Video,x0,y0,x1,y1
Architectures_1,38,57,1306,1008
Architectures_2,38,57,1306,1008
```
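As an illustration, here is a minimal sketch of how such a file could be parsed, assuming the comma-separated format above (load_bounding_boxes is a hypothetical helper name, not code from this repository):

```python
import csv

def load_bounding_boxes(path):
    """Read a bounding box list file into a dict: video name -> (x0, y0, x1, y1)."""
    boxes = {}
    with open(path, newline="") as f:
        reader = csv.DictReader(f)  # first row is the required header: Video,x0,y0,x1,y1
        for row in reader:
            boxes[row["Video"]] = (int(row["x0"]), int(row["y0"]),
                                   int(row["x1"]), int(row["y1"]))
    return boxes

# Example usage:
# boxes = load_bounding_boxes("videos/test_bounding_box_list.txt")
# print(boxes["Architectures_1"])  # (38, 57, 1306, 1008)
```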
The pretrained weights of SliTraNet from the paper can be downloaded here. Move them into the folder: "/weights"
Run test_SliTraNet.py
Some settings, such as the dataset and output folders and the model paths, have to be specified directly in the Python file, as described there.
Stage 1 of SliTraNet can also be run separately (see test_slide_detection_2d.py); afterwards its results can be loaded in test_SliTraNet.py, as sketched below.
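Putting the two steps together, a possible invocation order (assuming all settings have already been edited inside the scripts, as described above):

```
python test_slide_detection_2d.py   # optional: run Stage 1 on its own
python test_SliTraNet.py            # full pipeline; can load the Stage 1 results
```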
@author Aline Sindel