Weijia Dou1*, Wenzhao Zheng2,3*,†, Weiliang Chen2, Yu Zheng2, Jie Zhou2, Jiwen Lu2
(*Equal contribution; †Project leader.)
1Tongji University 2Tsinghua University 3University of California, Berkeley
Recent generative models can produce high-fidelity videos, yet they often exhibit 3D spatial geometric inconsistencies. These failures include geometric warping, incoherent motion, object impermanence, and perspective failures. Existing evaluation methods fail to accurately characterize these inconsistencies: fidelity-centric metrics like FVD are insensitive to geometric distortions, while consistency-focused benchmarks often penalize valid foreground dynamics.
To address this gap, we introduce SGC, a metric for evaluating 3D Spatial Geometric Consistency in dynamically generated videos. We quantify geometric consistency by measuring the divergence among multiple camera poses estimated from distinct local regions.
- Foreground-Background Disentanglement: Our approach first segments dynamic objects using motion object segmentation (MOS) to isolate the static background. Crucially, all subsequent SGC metrics and methods are applied only after this MOS step. As our evaluations show, seemingly improved scores without MOS can be misleading; metrics calculated with MOS more accurately reflect true background geometric inconsistencies.
- Depth-Aware Partitioning: After isolating the static areas, we predict depth for each pixel and partition the remaining static background into spatially coherent sub-regions.
- Composite Variance Scoring: We estimate a local camera pose for each sub-region and compute the divergence among these poses. The overall SGC score aggregates three component measures: local inter-segment pose consistency, global pose consistency, and cross-frame depth consistency error.
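The steps above can be sketched as follows. This is a minimal illustration of the composite scoring idea, not the released implementation: the pose representation (rotation matrix plus translation per sub-region), the geodesic rotation distance, and the equal component weights are all assumptions made for the example.

```python
import numpy as np

def pose_divergence(poses):
    """Mean pairwise divergence among local camera poses.

    Each pose is assumed to be an (R, t) pair: a 3x3 rotation matrix and a
    3-vector translation, one per static background sub-region.
    """
    rot_errs, trans_errs = [], []
    n = len(poses)
    for i in range(n):
        for j in range(i + 1, n):
            R_i, t_i = poses[i]
            R_j, t_j = poses[j]
            # Geodesic rotation distance: angle of the relative rotation.
            cos_angle = (np.trace(R_i.T @ R_j) - 1.0) / 2.0
            rot_errs.append(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
            trans_errs.append(np.linalg.norm(t_i - t_j))
    return float(np.mean(rot_errs)), float(np.mean(trans_errs))

def sgc_score(local_div, global_div, depth_err, weights=(1/3, 1/3, 1/3)):
    """Aggregate the three component measures into one score (lower is better).

    Equal weights are a placeholder, not the paper's calibrated values.
    """
    w1, w2, w3 = weights
    return w1 * local_div + w2 * global_div + w3 * depth_err
```

In this sketch, a perfectly consistent video (all sub-regions agreeing on a single camera pose, zero depth error) yields a score of 0, and divergence among local poses pushes the score up.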
For detailed installation instructions, please see docs/Install.md, which covers setting up the environment, dependencies, and any required submodules.
Our SGC metric can process either video files directly or pre-extracted frames. Organize your custom data in a directory structure like this:
your_dataset/
├── video1.mp4 # Option A: Direct video files
├── experiment_A_video/ # Option B: Pre-extracted frames
│ ├── 00000.jpg
│ ├── 00001.jpg
│ └── ...
└── ...
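A small helper can distinguish the two input options above when walking a dataset directory. This is an illustrative sketch (the function name and the set of recognized video extensions are assumptions), not part of the released code:

```python
from pathlib import Path

# Extensions treated as direct video files (Option A); anything that is a
# directory is assumed to hold pre-extracted frames (Option B).
VIDEO_EXTS = {".mp4", ".avi", ".mov"}

def list_entries(dataset_dir):
    """Yield (name, kind) pairs, where kind is 'video' or 'frames'."""
    for entry in sorted(Path(dataset_dir).iterdir()):
        if entry.is_file() and entry.suffix.lower() in VIDEO_EXTS:
            yield entry.name, "video"
        elif entry.is_dir():
            yield entry.name, "frames"
```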
Because our metrics are calculated strictly on the static background, you must perform motion object segmentation (MOS) before running the SGC calculation.
Step 1: Extract motion masks to isolate the static background
bash scripts/run_seganymo.sh
Step 2: Compute the SGC score on the segmented sub-areas
bash scripts/run_sgc.sh
python sgc/calculatescore.py
The output will be saved as a JSON file containing the overall SGC score and the breakdown of the three component metrics.
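A result file produced this way could be consumed as shown below. The key names used here are illustrative assumptions; check the JSON actually emitted by sgc/calculatescore.py for the exact schema.

```python
import json

def load_sgc_results(path):
    """Load the SGC output JSON and return the overall score plus the
    three component metrics.

    The key names below are assumed for illustration and may differ from
    the real output schema.
    """
    with open(path) as f:
        results = json.load(f)
    return {k: results[k] for k in
            ("sgc_score", "local_consistency",
             "global_consistency", "depth_consistency")}
```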
We curate a comprehensive benchmark of 1,296 videos, comprising 996 generated videos and 300 high-motion real videos. Experiments on real and generative videos demonstrate that SGC robustly quantifies geometric inconsistencies, effectively identifying critical failures missed by existing metrics.
| Method | SGC Score (↓) |
|---|---|
| Cosmos | 0.0722 |
| Hotshot | 0.1172 |
| Latte | 0.3226 |
| Lavie | 0.1241 |
| Modelscope | 0.3129 |
| opensora-i | 0.1631 |
| opensora-t | 0.0831 |
| Seine | 0.2837 |
| Videocrafter | 0.0973 |
| Zeroscope | 0.0912 |
| RT-1 (Real) | 0.0639 |
| Nuscenes (Real) | 0.0613 |
| OpenVid (Real) | 0.0530 |
(For full quantitative comparisons across all 10 state-of-the-art models, please refer to Table 1 in our paper.)
This implementation is made possible by several excellent open-source foundational estimators. We sincerely thank the authors of:
