Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
-
Updated
May 2, 2024 - Python
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
kapture is a file format as well as a set of tools for manipulating datasets, and in particular Visual Localization and Structure from Motion data.
A collection of multimodal datasets, and visual features for VQA and captionning in pytorch. Just run "pip install multimodal"
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, object detections, in a few lines of Python code.
Add a description, image, and links to the visual-features topic page so that developers can more easily learn about it.
To associate your repository with the visual-features topic, visit your repo's landing page and select "manage topics."