Video Foundation Models & Data for Multimodal Understanding
Updated Jun 5, 2024 - Python
Official repository for the paper "Bitstream-corrupted Video Recovery: A Novel Benchmark Dataset and Method", accepted to the NeurIPS 2023 Datasets and Benchmarks Track
[AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work
Official This-Is-My Dataset published in CVPR 2023
Improving Transfer Learning with a Dual Image and Video Transformer for Multi-label Movie Trailer Genre Classification
Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.
Tools for loading video datasets and applying video transforms in PyTorch. You can load video files directly, without preprocessing.
Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021)
Code for extracting images and masks from a video segmentation dataset using the OpenCV library in Python.
This annotation tool is built to clean and create video datasets.
A frame extractor for video datasets with GPU acceleration
Surveillance Perspective Human Action Recognition Dataset: 7759 videos from 14 action classes, aggregated from multiple sources, all cropped spatio-temporally and filmed from a surveillance-camera-like position.