iyashjayesh/fyncut

FynCut

FynCut is an AI-powered, serverless video editing platform that automatically converts long-form video (podcasts, interviews, talk shows) into short vertical (9:16) clips ready for TikTok, YouTube Shorts, and Instagram Reels.

The system uses Google Gemini to identify key moments, WhisperX for fast word-level audio transcription, and face tracking with active speaker detection (Columbia ASD) to automatically crop the frame around whoever is speaking.

Product Screenshots

Landing Page

[Screenshot: landing page]

Dashboard & Clip Editor

[Screenshot: dashboard and clip editor]


📖 The Story Behind FynCut

FynCut was originally built and launched almost a year ago as a commercial SaaS product, designed to automate vertical clip editing for creators. However, handling serverless GPU infrastructure, API costs, and marketing as a solo founder, on top of burnout, proved challenging.

Instead of letting the code sit idle, I have open-sourced the entire project for the developer community to study, self-host, and extend!


Core Features

  • Intelligent Reframing (Face Tracking): Automatically crops horizontal video (16:9) to vertical (9:16) by dynamically panning/tracking the active speaker using a face-detection pipeline.
  • AI Moment Identification: Analyzes transcripts with the Google Gemini API to detect high-impact moments (viral hooks, stories, emotional segments, key questions) and assigns each one a viral score.
  • GPU-Accelerated Transcription: Runs WhisperX on serverless GPUs for fast transcription with word-level timestamps.
  • TikTok-Style Captions: Generates synchronized, word-by-word highlighted captions.
  • Custom Caption Burner: Lets users customize caption fonts (Poppins, Anton, Montserrat, etc.), sizes, and layout positions, then burn the captions directly into the video.
  • Scale-to-Zero Serverless Backend: Powered by Modal, which runs GPU-heavy workloads only when needed, minimizing infrastructure costs.
  • Robust Background Queues: Built with Next.js and Inngest to handle long-running video processing asynchronously with automatic retries and exponential backoff.
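
To make the reframing feature concrete, here is a minimal sketch of the geometry only: given a frame size and the horizontal center of the active speaker's face, it computes a full-height 9:16 crop window and smooths the center over time to avoid jitter. The function names (`crop_window`, `smooth_centers`) and the smoothing factor are illustrative assumptions, not code from the repository, which relies on Columbia ASD and OpenCV for the actual tracking.

```python
def crop_window(frame_w: int, frame_h: int, face_cx: float) -> tuple[int, int, int, int]:
    """Return (x, y, w, h) for a full-height 9:16 crop centered on face_cx.

    Hypothetical helper: the crop is clamped so it never leaves the frame.
    """
    # Width of a 9:16 window at full frame height, rounded down to an even
    # number because many video encoders expect even dimensions.
    crop_w = (frame_h * 9 // 16) // 2 * 2
    crop_w = min(crop_w, frame_w)
    # Center the window on the face, then clamp it to the frame bounds.
    x = int(round(face_cx - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, frame_h


def smooth_centers(centers: list[float], alpha: float = 0.2) -> list[float]:
    """Exponential moving average over per-frame face centers (assumed input)."""
    smoothed: list[float] = []
    prev = None
    for c in centers:
        prev = c if prev is None else alpha * c + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    # A 1920x1080 frame with the speaker near the horizontal middle.
    print(crop_window(1920, 1080, 960.0))
```

A lower `alpha` gives a steadier virtual camera at the cost of slower reaction when the active speaker changes.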

System Architecture

FynCut uses a decoupled, full-stack architecture divided into a Next.js client-facing web application and a serverless Modal Python backend.

High-Level Component Design

graph TD
    User(["User Browser"]) <--> |"HTTP / WebSockets"| NextJS["Next.js Frontend"]
    NextJS <--> |"Prisma ORM"| PostgreSQL[("PostgreSQL Database")]
    NextJS <--> |"Presigned Upload / Download"| S3[("AWS S3 Storage")]
    NextJS <--> |"Trigger Workflows"| Inngest["Inngest Event Engine"]
    
    subgraph SB ["Serverless Backend (Modal Cloud)"]
        Inngest <--> |"POST Request + Bearer Auth"| ModalGPU["Modal GPU: FynCutAi"]
        NextJS <--> |"POST Request + Bearer Auth"| ModalCPU["Modal CPU: CaptionBurner"]
        
        ModalGPU <--> |"Download Video / Upload Clips"| S3
        ModalCPU <--> |"Download Clips / Upload Captioned Clips"| S3
        ModalGPU --> |"moment analysis"| Gemini["Google Gemini AI"]
    end
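
The diagram shows the Modal endpoints being called with a POST request and Bearer authentication. As a rough illustration of that pattern, the sketch below builds such a request with Python's standard library. The endpoint URL, the token, and the JSON payload shape (a single `s3_key` field) are assumptions for illustration; the real endpoint contract is defined in `server/main.py`.

```python
import json
import urllib.request


def build_processing_request(endpoint_url: str, token: str, s3_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a Bearer token.

    Hypothetical helper: the payload field name is assumed, not taken
    from the repository.
    """
    body = json.dumps({"s3_key": s3_key}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    req = build_processing_request(
        "https://example.invalid/process", "secret-token", "uploads/demo.mp4"
    )
    print(req.get_method(), req.full_url)
```

Sending the request would be a call such as `urllib.request.urlopen(req, timeout=30)`. For long GPU jobs, a generous timeout or an asynchronous callback is usually needed.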

Video Processing Sequence Diagram

The following sequence diagram details what happens when a user uploads a video for processing:

sequenceDiagram
    autonumber
    actor User as "User Browser"
    participant FE as "Next.js Web App"
    participant DB as "PostgreSQL"
    participant S3 as "AWS S3 Bucket"
    participant IG as "Inngest Runner"
    participant M_GPU as "Modal GPU (FynCutAi)"
    participant Gemini as "Gemini AI API"

    User->>FE: Selects video & clicks "Upload"
    FE->>DB: Create upload record (status: pending)
    FE->>S3: Upload video directly via S3 Presigned URL
    S3-->>FE: Upload complete
    FE->>IG: Dispatch "process-video-events" Event
    FE-->>User: Redirect to dashboard (status: queued)

    Note over IG: Inngest worker picks up the job
    IG->>DB: Check user credits & set status to "processing"
    IG->>M_GPU: Trigger Video Processing (s3_key)
    activate M_GPU

    M_GPU->>S3: Download input video
    M_GPU->>M_GPU: Generate video thumbnail
    M_GPU->>S3: Upload thumbnail
    
    Note over M_GPU: Run WhisperX model (GPU)
    M_GPU->>M_GPU: Extract audio & transcribe (word-level timestamps)
    
    M_GPU->>Gemini: Send transcript (Get interesting moments & metadata)
    Gemini-->>M_GPU: Return moments JSON (title, start, end, viral_score, keywords)

    Note over M_GPU: Parallel Clip Generation
    loop For each detected moment
        M_GPU->>M_GPU: Track active speaker faces (Columbia ASD + OpenCV)
        M_GPU->>M_GPU: Reframe & crop video to 9:16 centered on speaker
        M_GPU->>M_GPU: Generate subtitle files (SRT, VTT, TXT)
        M_GPU->>S3: Upload raw cropped clip & subtitles to S3
    end

    M_GPU-->>IG: Return completion response (clip keys, metadata)
    deactivate M_GPU

    IG->>DB: Save clip URLs and metadata, deduct user credits
    IG->>DB: Set video status to "processed"
    IG->>FE: Send email notification to user via Resend API
    FE-->>User: Update dashboard UI (clips ready)
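
The subtitle-generation step in the loop above can be sketched as follows: word-level timestamps are grouped into short cues and written in SRT format, with times shifted so they are relative to the start of the clip. The input shape (dicts with `word`, `start`, and `end` in seconds), the function names, and the three-words-per-cue default are assumptions for illustration, not the repository's actual implementation.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms_total = int(round(max(0.0, seconds) * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], clip_start: float = 0.0, max_words: int = 3) -> str:
    """Group word-level timestamps into SRT cues of at most max_words words.

    `words` is assumed to be a list of {"word", "start", "end"} dicts with
    times in seconds relative to the source video; `clip_start` shifts them
    so the output is relative to the clip.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        start = format_timestamp(chunk[0]["start"] - clip_start)
        end = format_timestamp(chunk[-1]["end"] - clip_start)
        # SRT cue: index line, time range line, text line, trailing newline.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # Cues are separated by one blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"word": "Hello", "start": 0.0, "end": 0.4},
        {"word": "world", "start": 0.5, "end": 0.9},
        {"word": "again", "start": 1.0, "end": 1.5},
        {"word": "today", "start": 1.6, "end": 2.0},
    ]
    print(words_to_srt(sample))
```

Short cues of a few words each approximate the fast-paced caption style described in the feature list. Per-word highlighting would need a richer format than plain SRT.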

📂 Repository Structure

The project is split into two folders:

FynCut/
├── frontend/             # Next.js App Router (T3 Stack)
│   ├── src/
│   │   ├── app/          # Next.js Pages and API Routes
│   │   ├── actions/      # Next.js Server Actions (Auth, Captions, S3)
│   │   ├── components/   # React Components (Dashboard, Player, Editor)
│   │   ├── inngest/      # Background event workflow definitions
│   │   └── env.js        # Environment validation schema (zod)
│   ├── prisma/           # Database Schema (PostgreSQL)
│   └── .env.example      # Frontend environment template
│
└── server/               # Serverless Python Backend (Modal)
    ├── main.py           # Modal entrypoint & GPU pipeline endpoints
    ├── requirements.txt  # Python package dependencies
    ├── Makefile          # Setup, test, and deployment automation
    └── .env.example      # Backend environment template

🚀 Getting Started

Follow the dedicated setup and deployment guides inside each folder to run the project locally or deploy it to production:

  1. Backend Setup: Go to the Backend README to set up Python, configure secrets, and deploy endpoints to Modal.
  2. Frontend Setup: Go to the Frontend README to launch the Next.js app, configure database migrations, and connect the background worker.

🤝 Contributing

Contributions are welcome! If you'd like to improve FynCut, optimize face-tracking performance, or add support for new caption styles:

  1. Fork the Repository.
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the Branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

License

This project is open-source and licensed under the MIT License.
