A deep learning framework for classifying videos using motion vectors, pixel data, or both. Designed to distinguish between real videos and videos generated by different AI models (SVD, Pika, CogVideo, T2VZ, VideoCrafter2, etc.).
This repository bundles two tightly coupled components:
- A motion-vector extractor built on top of ffmpeg, producing dense MV tensors plus metadata.
- A PyTorch training pipeline that learns to distinguish real and AI-generated videos using motion vectors, RGB frames, or both.
Use the extractor to populate motion-vector datasets, then fine-tune or evaluate classification baselines with the provided scripts.
- Python 3.8+ (extraction tools verified on 3.8–3.10)
- ffmpeg 4.4+ in `PATH`
- CUDA-capable GPU (recommended)
```bash
git clone <repository-url>
cd Motion-Vector-Learning
pip install -r requirements.txt

# optional
wandb login
```

The project expects datasets in the following structure:
```
Data/
├── videos/
│   ├── vript/      # Real videos (class 0)
│   ├── hdvg/       # Real videos (class 0)
│   ├── cogvideo/   # CogVideo generated (class 1)
│   ├── svd/        # Stable Video Diffusion (class 2)
│   ├── pika/       # Pika generated (class 3)
│   ├── t2vz/       # Text-to-Video-Zero (class 4)
│   └── vc2/        # VideoCrafter2 (class 5)
└── motion_vectors/
    ├── vript/
    ├── hdvg/
    ├── cogvideo/
    ├── svd/
    ├── pika/
    ├── t2vz/
    └── vc2/
```
More details about the dataset, including the downloads we used, can be found here.
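As a quick sanity check before training, the layout above can be walked to count clips per class. The snippet below is a minimal sketch: the class folder names mirror the tree above, but the root path and the `.mp4`/`.npy` filters are assumptions about your local copy.

```python
from pathlib import Path

# Class folders as laid out in the Data/ tree above.
CLASS_DIRS = ["vript", "hdvg", "cogvideo", "svd", "pika", "t2vz", "vc2"]

def count_clips(data_root: str = "Data") -> None:
    """Print the number of videos and motion-vector arrays per class folder."""
    root = Path(data_root)
    for name in CLASS_DIRS:
        n_videos = len(list((root / "videos" / name).glob("*.mp4")))
        n_mvs = len(list((root / "motion_vectors" / name).glob("*.npy")))
        print(f"{name:10s} videos={n_videos:6d} motion_vectors={n_mvs:6d}")

if __name__ == "__main__":
    count_clips()
```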
- Videos: `.mp4`
- Motion vectors: `.npy` arrays (`[T, H, W, 3]` with dx, dy, sign); see the loading sketch below.
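The snippet below is a minimal sketch of how such an array might be loaded and unpacked; the file path is a placeholder, and the per-channel interpretation follows the (dx, dy, sign) convention stated above.

```python
import numpy as np

# Load one motion-vector tensor of shape [T, H, W, 3] (placeholder path).
mv = np.load("Data/motion_vectors/vript/example.npy")
assert mv.ndim == 4 and mv.shape[-1] == 3

dx, dy, sign = mv[..., 0], mv[..., 1], mv[..., 2]  # dense displacements + sign channel
print("frames:", mv.shape[0], "resolution:", mv.shape[1:3])
print("mean |motion|:", np.abs(mv[..., :2]).mean())
```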
Sample data and scripts live in `MotionVectorExtractor/`.
```bash
ffmpeg -version   # verify dependency

python MotionVectorExtractor/extract_mv.py \
    --data_root MotionVectorExtractor/Data/ \
    --out_root MotionVectorExtractor/TestOut/ \
    --override --keepFrames
```

Command help: `python MotionVectorExtractor/extract_mv.py --help`
Outputs per video include a motion-vector tensor, resolution metadata, frame types, timestamps, and optional visualization frames.
Train a model using motion vectors only:
```bash
python run.py \
    --root_dir /path/to/motion_vectors \
    --classes_config configs/multi.json \
    --data mv \
    --model resnet18 \
    --epochs 50 \
    --batch_size 8 \
    --lr 1e-4 \
    --frames 16 \
    --pretrained
```

Train with combined modalities (motion vectors + pixels):
```bash
python run.py \
    --root_dir /path/to/videos /path/to/motion_vectors \
    --classes_config configs/multi.json \
    --data combined \
    --merge_strategy mvaf \
    --model resnet18 \
    --epochs 50 \
    --batch_size 4 \
    --pretrained
```

- Class mapping files live in `configs/` (binary, multi-class, and per-generator variants); a hypothetical example is sketched below.
- Training scripts expect motion-vector arrays under `Data/motion_vectors/` and optional RGB videos under `Data/videos/`.
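The exact schema of these class configs is not documented here; the sketch below shows one plausible layout (dataset folder name mapped to class index) purely as an assumption about what a file like `configs/multi.json` could contain.

```python
import json

# Hypothetical schema for a class-mapping config: dataset folder -> class index.
# The real configs/multi.json may differ; the numbering below mirrors the
# dataset layout shown earlier (real videos = 0, each generator its own label).
example_config = {
    "vript": 0,
    "hdvg": 0,
    "cogvideo": 1,
    "svd": 2,
    "pika": 3,
    "t2vz": 4,
    "vc2": 5,
}

print(json.dumps(example_config, indent=2))  # what such a JSON file would contain
```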
```bash
# motion-vector only
python run.py \
    --root_dir /path/to/motion_vectors \
    --classes_config configs/multi.json \
    --data mv \
    --model resnet18

# combined RGB + MV
python run.py \
    --root_dir /path/to/videos /path/to/motion_vectors \
    --classes_config configs/multi.json \
    --data combined \
    --merge_strategy mvaf
```

Convenience wrappers (`Scripts/run_mv.sh`, `Scripts/run_vid.sh`, `Scripts/run_combined.sh`) provide ready-to-run presets. SLURM launchers are under `Scripts/slurm/`.
- `visualize_motion_patterns.py`: summarize motion statistics and heatmaps.
- `optimize_density_thresholds.py`: tune MVAF thresholds (see the illustrative density computation below).
- `MotionVectorExtractor/`: contains CLI tools, sample assets, and logs produced during MV extraction.
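To give a sense of the kind of quantity such thresholds operate on, the snippet below computes a simple per-frame motion-density statistic from an MV array. This is an illustrative assumption, not the actual logic of `optimize_density_thresholds.py`.

```python
import numpy as np

def motion_density(mv: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Fraction of pixels per frame whose motion magnitude exceeds `threshold`.

    `mv` is a [T, H, W, 3] array with (dx, dy, sign) channels, as produced by
    the extractor; only the dx/dy channels are used here.
    """
    magnitude = np.linalg.norm(mv[..., :2], axis=-1)    # [T, H, W]
    return (magnitude > threshold).mean(axis=(1, 2))    # [T]

if __name__ == "__main__":
    dummy = np.random.randn(16, 64, 64, 3).astype(np.float32)  # stand-in for a real .npy
    print(motion_density(dummy))
```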
