OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
FreeVA: Offline MLLM as Training-Free Video Assistant
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Official code for the CVPR 2024 paper "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
Video Foundation Models & Data for Multimodal Understanding
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap (CVPR24 - CVSports workshop)
A Large Short-video Recommendation Dataset with Raw Text/Audio/Image/Videos (Talk Invited by DeepMind).
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
Code for the Paper: Quasi-Online Detection of Take and Release Actions from Egocentric Videos. International Conference on Image Analysis and Processing 2023.
[ICCV 2021] A new codebase containing various methods for Group Activity Recognition. Paper title: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition.
The official code of "CSTA: CNN-based Spatiotemporal Attention for Video Summarization"
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Official code for MiniGPT4-video
Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Code release for "Training a Large Video Model on a Single Machine in a Day"