Automated pipeline for inserting brand assets into video scenes — eliminating manual frame-by-frame editing for sponsored content production.
Built for Movicorn, a platform democratizing film investment by connecting independent producers with early-stage sponsors.
Given a video scene and a brand logo, this pipeline:
- Segments the target object using SAM 2 (Segment Anything Model 2) for precise instance segmentation
- Selects the best-matching asset from a brand library using CLIP zero-shot similarity scoring — no manual asset tagging required
- Tracks the object across frames using Farneback optical flow, maintaining consistent placement without per-frame annotation
- Warps the logo geometry using Delaunay triangulation to preserve 3D geometric consistency as the object moves
- Enforces motion coherence via FFmpeg/PyAV to ensure temporal smoothness across the full sequence
- Serves the pipeline as a REST API via Flask for integration into production workflows
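The asset-selection step ranks the brand library by embedding similarity. A minimal sketch of that scoring logic, with small mock vectors standing in for CLIP outputs (the real pipeline encodes the scene crop and each asset with a CLIP model; real embeddings are e.g. 512-d):

```python
import numpy as np

def best_asset(scene_emb: np.ndarray, asset_embs: dict) -> str:
    """Return the asset whose embedding has the highest cosine
    similarity to the scene embedding (CLIP-style zero-shot scoring)."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(asset_embs, key=lambda name: cos(scene_emb, asset_embs[name]))

# Mock 4-d embeddings for illustration only.
scene = np.array([1.0, 0.0, 0.0, 0.0])
assets = {
    "logo_red":  np.array([0.9, 0.1, 0.0, 0.0]),   # nearly parallel to scene
    "logo_blue": np.array([0.0, 1.0, 0.0, 0.0]),   # orthogonal to scene
}
print(best_asset(scene, assets))  # → logo_red
```

Because the ranking is purely similarity-based, adding a new logo to the library requires no tagging, only encoding it once.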
| Component | Technology |
|---|---|
| Instance segmentation | SAM 2 (Meta) |
| Asset selection | CLIP (OpenAI), zero-shot |
| Motion tracking | Farneback optical flow (OpenCV) |
| Geometry warping | Delaunay triangulation |
| Video processing | FFmpeg / PyAV |
| API layer | Flask REST |
| Containerization | Docker |
| Testing | pytest |
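The Delaunay warping step reduces to one affine map per triangle: three source points and their three destination points fully determine the transform. A plain-numpy sketch of solving for that map (point values are illustrative, not taken from the project):

```python
import numpy as np

def triangle_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Solve for the 2x3 affine matrix M mapping three src points to
    three dst points, i.e. dst_i = M @ [x_i, y_i, 1] for each vertex."""
    A = np.hstack([src, np.ones((3, 1))])   # 3x3 rows of [x, y, 1]
    return np.linalg.solve(A, dst).T        # 2x3 affine matrix

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])  # uniform 2x scale
M = triangle_affine(src, dst)
print(M @ np.array([0.5, 0.5, 1.0]))  # maps (0.5, 0.5) -> [1. 1.]
```

Applying one such map per triangle is what lets the logo bend with the underlying surface instead of moving as a rigid rectangle.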
```
aibranding/
├── src/            # Core pipeline modules
├── deploy/         # Flask API + deployment config
├── scripts/        # Preprocessing and utility scripts
├── tests/          # pytest test suite
├── docs/           # Architecture documentation
├── templates/      # HTML templates for API UI
├── data/           # Sample assets and test scenes
├── Dockerfile
└── requirements.txt
```

The core segmentation, asset selection, and optical flow pipeline is implemented and functional. Logo warping geometry is working on controlled scenes. Active work is focused on improving geometric consistency and blending quality across complex, high-motion sequences — the core challenge in making placements look production-ready rather than composited.
This is an open engineering problem: maintaining photorealistic brand integration across arbitrary video content without per-scene manual tuning.
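One concrete piece of that problem: hard mask edges read as "pasted on". A feathered alpha composite is the usual mitigation, sketched here in plain numpy on a single-channel image (a cheap box-average stands in for a proper blur; this is not the project's actual blending code):

```python
import numpy as np

def feathered_blend(scene: np.ndarray, logo: np.ndarray,
                    mask: np.ndarray, feather: int = 3) -> np.ndarray:
    """Composite logo over scene, softening the binary mask edge by
    averaging each alpha pixel with its 4 neighbours `feather` times."""
    alpha = mask.astype(np.float64)
    for _ in range(feather):
        p = np.pad(alpha, 1, mode="edge")   # replicate borders
        alpha = (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
                 + p[1:-1, :-2] + p[1:-1, 2:]) / 5.0
    return alpha * logo + (1.0 - alpha) * scene

scene = np.zeros((5, 5))
logo = np.ones((5, 5))
mask = np.zeros((5, 5)); mask[2, 2] = 1.0
out = feathered_blend(scene, logo, mask, feather=1)
# the lit pixel is now partially transparent; far pixels are untouched
```

The hard part the project is tackling is doing this consistently under motion, where per-frame feathering alone produces temporal shimmer.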
Built using the Claude Agent SDK harness pattern — initializer agent for environment setup and context seeding, coding agents for iterative implementation across sessions. The CLAUDE.md file in the repo root documents the architecture decisions and agent workflow used throughout development.
```bash
# Install dependencies
pip install -r requirements.txt

# Run tests
pytest

# Start Flask API
python deploy/app.py
```

Sponsors increasingly join film productions after principal photography is complete. Re-shooting scenes to add branded products is expensive and often impossible. This pipeline makes it viable to integrate brand placements into existing footage programmatically, turning a post-production problem into a scalable software workflow.
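The API layer is a thin Flask wrapper around the pipeline. A minimal sketch with a hypothetical route name and payload shape (the real `deploy/app.py` may expose different endpoints):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/place", methods=["POST"])   # hypothetical endpoint name
def place():
    """Accept a scene/brand pair and acknowledge the job. In the real
    service this would trigger the segmentation -> asset selection ->
    tracking -> warping -> blending pipeline."""
    payload = request.get_json(force=True)
    scene = payload.get("scene")   # e.g. path or URL of the video scene
    brand = payload.get("brand")   # e.g. brand-library identifier
    if not scene or not brand:
        return jsonify(error="scene and brand are required"), 400
    return jsonify(status="queued", scene=scene, brand=brand), 202

# To serve locally: app.run(host="0.0.0.0", port=5000)
```

Returning `202 Accepted` rather than blocking fits the workload: full-sequence warping and blending is far too slow for a synchronous request cycle.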