FynCut

FynCut is an AI-powered, serverless video editing platform that automatically converts long-form video (podcasts, interviews, talk shows) into short, engaging vertical (9:16) clips ready for TikTok, YouTube Shorts, and Instagram Reels.

The system uses Google Gemini to identify key moments, WhisperX for fast word-level audio transcription, and face tracking with active speaker detection (Columbia ASD) to automatically crop the frame around whoever is speaking.

Product Screenshots

Landing Page

Dashboard & Clip Editor

📖 The Story Behind FynCut

FynCut was originally built and launched almost a year ago as a commercial SaaS product. The platform was designed to automate vertical clip editing for creators. However, handling serverless GPU infrastructure, API costs, and marketing as a solo founder, on top of burnout, proved challenging.

Instead of letting the code sit idle, I have open-sourced the entire project for the developer community to study, self-host, and extend!

Read the full story & post-mortem blog post: Why I'm Open-Sourcing My AI Video SaaS (FynCut)

Core Features

Intelligent Reframing (Face Tracking): Automatically crops horizontal video (16:9) to vertical (9:16) by dynamically panning/tracking the active speaker using a face-detection pipeline.
AI Moment Identification: Analyzes transcripts with the Google Gemini API to detect high-impact moments (viral hooks, stories, emotional segments, key questions) and assigns each one a viral score.
GPU-Accelerated Transcription: Runs WhisperX on serverless GPUs to produce word-level timestamps quickly.
TikTok-Style Captions: Generates synchronized, word-by-word highlighted captions.
Custom Caption Burner: Allows users to customize caption fonts (Poppins, Anton, Montserrat, etc.), sizes, layout positions, and burn them directly into the video.
Scale-to-Zero Serverless Backend: Powered by Modal, which runs GPU-heavy workloads only when needed and so minimizes infrastructure costs.
Robust Background Queues: Built with Next.js and Inngest to handle long-running video processing asynchronously with automatic retries and exponential backoff.
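
To make the Intelligent Reframing feature above more concrete, here is a minimal sketch of the crop-window arithmetic, assuming a landscape source frame and a per-frame horizontal face center supplied by some face tracker. The names `CropWindow`, `vertical_crop`, and `smooth_centers` are hypothetical and do not come from the FynCut codebase; the actual pipeline in `server/main.py` may work differently.

```python
from dataclasses import dataclass


@dataclass
class CropWindow:
    x: int       # left edge of the crop, in source pixels
    y: int       # top edge of the crop, in source pixels
    width: int
    height: int


def vertical_crop(frame_w: int, frame_h: int, face_cx: float) -> CropWindow:
    """Return a full-height 9:16 window centered on face_cx, clamped to the frame.

    Assumes a landscape frame (frame_w >= frame_h * 9 / 16).
    """
    crop_w = int(round(frame_h * 9 / 16))
    crop_w -= crop_w % 2  # keep the width even, which many encoders expect
    x = int(round(face_cx - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))  # do not let the window leave the frame
    return CropWindow(x=x, y=0, width=crop_w, height=frame_h)


def smooth_centers(centers: list[float], alpha: float = 0.2) -> list[float]:
    """Exponential moving average so the virtual camera pans instead of jumping."""
    smoothed: list[float] = []
    for c in centers:
        prev = smoothed[-1] if smoothed else c
        smoothed.append(prev + alpha * (c - prev))
    return smoothed
```

For a 1920x1080 frame with the speaker at the horizontal center, `vertical_crop(1920, 1080, 960)` yields a 608x1080 window starting at x=656. Smoothing the face centers before cropping is one simple way to avoid jitter when detections fluctuate from frame to frame.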

System Architecture

FynCut uses a decoupled, full-stack architecture divided into a Next.js client-facing web application and a serverless Modal Python backend.

High-Level Component Design

graph TD
    User(["User Browser"]) <--> |"HTTP / WebSockets"| NextJS["Next.js Frontend"]
    NextJS <--> |"Prisma ORM"| PostgreSQL[("PostgreSQL Database")]
    NextJS <--> |"Presigned Upload / Download"| S3[("AWS S3 Storage")]
    NextJS <--> |"Trigger Workflows"| Inngest["Inngest Event Engine"]
    
    subgraph SB ["Serverless Backend (Modal Cloud)"]
        Inngest <--> |"POST Request + Bearer Auth"| ModalGPU["Modal GPU: FynCutAi"]
        NextJS <--> |"POST Request + Bearer Auth"| ModalCPU["Modal CPU: CaptionBurner"]
        
        ModalGPU <--> |"Download Video / Upload Clips"| S3
        ModalCPU <--> |"Download Clips / Upload Captioned Clips"| S3
        ModalGPU --> |"moment analysis"| Gemini["Google Gemini AI"]
    end
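
The "POST Request + Bearer Auth" edges in the diagram above can be illustrated with a small sketch. It only builds the request and does not send it. The endpoint URL, the token source, and the exact payload shape are assumptions made for illustration (the sequence diagram mentions an `s3_key` being passed, but the real request body may contain more fields), and `build_modal_request` is a hypothetical helper. In FynCut the caller is the Next.js/Inngest side; Python is used here only to show the shape of the request.

```python
import json
import urllib.request


def build_modal_request(endpoint_url: str, token: str, s3_key: str) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated JSON POST request."""
    # Assumed payload: only the S3 key of the uploaded video.
    body = json.dumps({"s3_key": s3_key}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; a real deployment would read these from its environment.
request = build_modal_request(
    "https://example.invalid/process-video", "my-secret-token", "uploads/video.mp4"
)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Keeping the token in a secret store rather than in source code matches the `.env.example` templates listed in the repository structure.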

Video Processing Sequence Diagram

The following sequence diagram details what happens when a user uploads a video for processing:

sequenceDiagram
    autonumber
    actor User as "User Browser"
    participant FE as "Next.js Web App"
    participant DB as "PostgreSQL"
    participant S3 as "AWS S3 Bucket"
    participant IG as "Inngest Runner"
    participant M_GPU as "Modal GPU (FynCutAi)"
    participant Gemini as "Gemini AI API"

    User->>FE: Selects video & clicks "Upload"
    FE->>DB: Create upload record (status: pending)
    FE->>S3: Upload video directly via S3 Presigned URL
    S3-->>FE: Upload complete
    FE->>IG: Dispatch "process-video-events" Event
    FE-->>User: Redirect to dashboard (status: queued)

    Note over IG: Inngest worker picks up the job
    IG->>DB: Check user credits & set status to "processing"
    IG->>M_GPU: Trigger Video Processing (s3_key)
    activate M_GPU

    M_GPU->>S3: Download input video
    M_GPU->>M_GPU: Generate video thumbnail
    M_GPU->>S3: Upload thumbnail
    
    Note over M_GPU: Run WhisperX model (GPU)
    M_GPU->>M_GPU: Extract audio & transcribe (word-level timestamps)
    
    M_GPU->>Gemini: Send transcript (Get interesting moments & metadata)
    Gemini-->>M_GPU: Return moments JSON (title, start, end, viral_score, keywords)

    Note over M_GPU: Parallel Clip Generation
    loop For each detected moment
        M_GPU->>M_GPU: Track active speaker faces (Columbia ASD + OpenCV)
        M_GPU->>M_GPU: Reframe & crop video to 9:16 centered on speaker
        M_GPU->>M_GPU: Generate subtitle files (SRT, VTT, TXT)
        M_GPU->>S3: Upload raw cropped clip & subtitles to S3
    end

    M_GPU-->>IG: Return completion response (clip keys, metadata)
    deactivate M_GPU

    IG->>DB: Save clip URLs and metadata, deduct user credits
    IG->>DB: Set video status to "processed"
    IG->>FE: Send email notification to user via Resend API
    FE-->>User: Update dashboard UI (clips ready)
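
The "Generate subtitle files" step in the sequence above turns word-level timestamps into caption cues. The following is a minimal sketch of one way to render SRT text from such timestamps, assuming each word carries a start and end time in seconds. `Word`, `srt_timestamp`, and `words_to_srt` are hypothetical names, and grouping a fixed number of words per cue is an illustrative choice rather than FynCut's confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[Word], max_words: int = 3) -> str:
    """Group consecutive words into short cues and render them as SRT text."""
    cues = []
    for i in range(0, len(words), max_words):
        group = words[i:i + max_words]
        index = i // max_words + 1  # SRT cue numbers start at 1
        text = " ".join(w.text for w in group)
        start = srt_timestamp(group[0].start)
        end = srt_timestamp(group[-1].end)
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


words = [
    Word("So", 0.0, 0.2),
    Word("here's", 0.2, 0.5),
    Word("the", 0.5, 0.6),
    Word("thing", 0.6, 1.0),
]
print(words_to_srt(words))
```

Short cues of a few words each suit the fast-paced caption style described under Core Features; a VTT writer would differ mainly in its header line and in using a period instead of a comma before the milliseconds.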

📂 Repository Structure

The project is split into two folders:

FynCut/
├── frontend/             # Next.js App Router (T3 Stack)
│   ├── src/
│   │   ├── app/          # Next.js Pages and API Routes
│   │   ├── actions/      # Next.js Server Actions (Auth, Captions, S3)
│   │   ├── components/   # React Components (Dashboard, Player, Editor)
│   │   ├── inngest/      # Background event workflow definitions
│   │   └── env.js        # Environment validation schema (zod)
│   ├── prisma/           # Database Schema (PostgreSQL)
│   └── .env.example      # Frontend environment template
│
└── server/               # Serverless Python Backend (Modal)
    ├── main.py           # Modal entrypoint & GPU pipeline endpoints
    ├── requirements.txt  # Python package dependencies
    ├── Makefile          # Setup, test, and deployment automation
    └── .env.example      # Backend environment template

🚀 Getting Started

Follow the dedicated setup and deployment guides inside each folder to run the project locally or deploy it to production:

Backend Setup: Go to the Backend README to set up Python, configure secrets, and deploy endpoints to Modal.
Frontend Setup: Go to the Frontend README to launch the Next.js app, configure database migrations, and connect the background worker.

🤝 Contributing

Contributions are welcome! If you'd like to improve FynCut, optimize face-tracking performance, or add support for new caption styles:

Fork the Repository.
Create your Feature Branch (git checkout -b feature/AmazingFeature).
Commit your changes (git commit -m 'Add some AmazingFeature').
Push to the Branch (git push origin feature/AmazingFeature).
Open a Pull Request.

License

This project is open source and licensed under the MIT License.
