VABench — VideoAdAgent Bench

Coming soon.

VABench is a public, reproducible benchmark for AI agents that produce video advertisements end-to-end. We will release the full harness (briefs, evaluation scorers, baseline runners, and aggregation scripts) as open source under the Apache 2.0 license in a future update.

What's in this benchmark

VABench grades AI video-ad systems on three independent evaluation arms:

Arm 1 — Capability. Pairwise head-to-head against general-purpose video agents on the brief.
Arm 2 — Ad Quality. Pairwise against frontier text-to-video models (single-scene and multi-scene configurations) on eight ad-rubric dimensions.
Arm 3 — Video Production Quality. Brief-blind grading on production polish, with a frame-level hallucination rubric folded in at 3× weight.
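The harness is not released yet, so the exact Arm 3 aggregation is not public. As a rough illustration of what "folded in at 3× weight" could mean, here is a minimal sketch that assumes a weighted mean in which each production-polish dimension counts once and the hallucination score counts three times. The function name, the dimension names, and the score scale are hypothetical and are not taken from VABench.

```python
from typing import Mapping

# The "3x weight" from the Arm 3 description; how it is applied is an assumption.
HALLUCINATION_WEIGHT = 3.0


def arm3_score(polish: Mapping[str, float], hallucination: float) -> float:
    """Weighted mean: each polish dimension has weight 1, hallucination has weight 3."""
    if not polish:
        raise ValueError("need at least one polish dimension")
    weighted_sum = sum(polish.values()) + HALLUCINATION_WEIGHT * hallucination
    total_weight = len(polish) + HALLUCINATION_WEIGHT
    return weighted_sum / total_weight


# Hypothetical dimension names and scores, for illustration only.
example = arm3_score({"editing": 4.0, "audio_mix": 3.0}, hallucination=5.0)
print(round(example, 2))
```

Under this assumed scheme the hallucination score alone carries more weight than the two polish dimensions together, so a video with visible frame-level errors cannot recover much through polish.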

The brief set covers every major performance-ad pattern (problem-solution, before-after, demo, testimonial, lifestyle) plus targeted stress tests against the hardest axes of multi-scene production (anchor storytelling, persona consistency, brand-asset reuse, structured CTA text, persuasion-arc compliance).

Current results

The headline leaderboard and methodology are public now:

Leaderboard: https://creatify.ai/research/agent-benchmarks
Technical report: https://creatify.ai/research/agent

What's coming

Brief set + JSON Schema
Pairwise judge implementation (Claude Opus 4.7 + GPT-5, position-swap consensus)
Frame-level hallucination scorer
Structural-compliance checks (duration, aspect, audio, OCR)
Baseline runners for the raw text-to-video frontier (Veo 3.1, Kling 3.0, Seedance 2.0) and general-purpose video agents (HeyGen V3)
Aggregation scripts that emit the published leaderboard tables
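Until the judge implementation lands, the sketch below shows one common way to run a position-swap pairwise comparison: each judge sees the pair in both orders, and its preference counts only if it is the same after swapping. The `Judge` signature, the verdict labels, and the rule that all judges must agree are assumptions made for illustration, not the VABench implementation. The two stub judges stand in for real model calls.

```python
from typing import Callable, Sequence

# Hypothetical judge interface: (brief, first_video, second_video) -> "first" | "second" | "tie"
Judge = Callable[[str, str, str], str]


def position_swap_verdict(judge: Judge, brief: str, a: str, b: str) -> str:
    """Ask the judge in both orders; return "A" or "B" only if both orders agree."""
    forward = judge(brief, a, b)   # A shown first
    backward = judge(brief, b, a)  # B shown first
    fwd = {"first": "A", "second": "B"}.get(forward, "tie")
    bwd = {"first": "B", "second": "A"}.get(backward, "tie")
    return fwd if fwd == bwd else "tie"


def consensus(judges: Sequence[Judge], brief: str, a: str, b: str) -> str:
    """Assumed rule: every judge must reach the same swap-consistent verdict, else "tie"."""
    verdicts = {position_swap_verdict(j, brief, a, b) for j in judges}
    return verdicts.pop() if len(verdicts) == 1 else "tie"


# Stub judges for demonstration only.
def prefers_longer(brief: str, first: str, second: str) -> str:
    if len(first) == len(second):
        return "tie"
    return "first" if len(first) > len(second) else "second"


def always_first(brief: str, first: str, second: str) -> str:
    return "first"  # a purely position-biased judge
```

With these stubs, `position_swap_verdict(always_first, ...)` collapses to `"tie"` because the preference flips when the order is swapped, which is the bias the swap is meant to catch.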

License

Released under the Apache 2.0 license (see LICENSE).

Stay tuned

Watch this repository to be notified when the harness lands.
