BitByBit-B3/frame-os

FrameOS

chat-first AI video editor. drop raw footage, talk to it, get an editable timeline in 60 seconds. desktop-native, powered entirely by gemini's multimodal stack.

20-hour hackathon build. spec lives in docs/DESIGN.md.


quick demo (the 5-line pitch)

  1. open the tauri app — cd apps/desktop && pnpm tauri dev
  2. drop samples/sintel-30s.mp4 into the assets panel
  3. wait ~45s — watch the three pipeline badges go green (transcript + frames + timeline build in parallel, automatically, no button click)
  4. click the asset → real editable timeline materializes → chat "add a caption at 5 seconds that says the magic" → it just does it
  5. hit Export MP4 → server-side ffmpeg burns captions in → file lands in your downloads
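step 5 can be sketched roughly like this. it's a minimal sketch, not the actual export code: it assumes captions have already been written to an SRT file and that the server's ffmpeg build includes libass (which the `subtitles` filter needs). `build_burn_in_command` and the file names are hypothetical.

```python
from pathlib import Path


def build_burn_in_command(video: Path, captions: Path, output: Path) -> list[str]:
    """Return an ffmpeg argv that hard-burns an SRT file into the video.

    Hypothetical helper: the real export path may use different flags.
    Paths containing characters like ':' or ',' would need filter escaping.
    """
    return [
        "ffmpeg",
        "-y",                                        # overwrite the output if it exists
        "-i", str(video),
        "-vf", f"subtitles={captions.as_posix()}",   # draw captions onto the frames
        "-c:a", "copy",                              # audio is unchanged, so skip re-encoding it
        str(output),
    ]


cmd = build_burn_in_command(Path("in.mp4"), Path("captions.srt"), Path("out.mp4"))
```

the argv list can be handed to `subprocess.run(cmd, check=True)`. building a list instead of a shell string avoids quoting problems with user-supplied file names.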

full recorded-walkthrough script + judging criteria hooks + "do not click X" warnings live in docs/DEMO.md.


where to look

| if you're doing... | read this first |
| --- | --- |
| architecture / engineering decisions | docs/DESIGN.md |
| frontend, UI, UX, design generation | docs/PRODUCT-DESIGN.md |
| picking what to build next | docs/PLAN.md (after design sign-off) |
| understanding what we're NOT doing | docs/NOT-DOING.md |
| writing code (standards, setup, commands) | AGENTS.md |
| claude code-specific bits | CLAUDE.md |

the stack (one line)

tauri 2 rust desktop · python fastapi backend in k8s (kind locally) · seaweedfs in-cluster storage · gemini 2.5/3.x + veo 3.1 + imagen 4 + nano banana + lyria 3 · groq whisper · pexels.


the vision (raw brainstorm — keep adding to it)

the core concept is the project. it's the main component we have, and there's a separate project for each separate thing you're editing.

within a project we have assets. mostly these are raw video content (the raw video and audio), and we can also provide custom sources for context.

we have a few features:

  • thumbnail generation
  • QnA with the content
  • transcript generation
  • search
  • b roll generation (or any content generation)
  • transitions generation (again through veo 3)
  • background audio generation (through gemini)

we need to provide some tools to our agents, some of the ideas i have are:

  • zoom
  • changing color tones
  • cropping
  • camera stabilizations
  • cut clip
  • remove clip
  • move clip
  • apply transitions
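three of these tools (cut clip, remove clip, move clip) could look something like the sketch below. the `Clip` shape and the function names are assumptions made up for illustration, not the real timeline model. each tool returns a new timeline instead of mutating, which keeps agent edits easy to undo.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clip:
    # Hypothetical clip model; all times are in seconds.
    clip_id: str
    start: float            # position on the timeline
    duration: float         # length on the timeline
    source_in: float = 0.0  # offset into the source asset


def cut_clip(timeline: list[Clip], clip_id: str, at: float) -> list[Clip]:
    """Split one clip in two at timeline time `at` (no-op if `at` is outside it)."""
    out: list[Clip] = []
    for clip in timeline:
        if clip.clip_id == clip_id and clip.start < at < clip.start + clip.duration:
            left_len = at - clip.start
            out.append(replace(clip, duration=left_len))
            out.append(Clip(
                clip_id=f"{clip.clip_id}-b",
                start=at,
                duration=clip.duration - left_len,
                source_in=clip.source_in + left_len,  # right half starts later in the source
            ))
        else:
            out.append(clip)
    return out


def remove_clip(timeline: list[Clip], clip_id: str) -> list[Clip]:
    """Drop a clip; the remaining clips keep their positions."""
    return [c for c in timeline if c.clip_id != clip_id]


def move_clip(timeline: list[Clip], clip_id: str, new_start: float) -> list[Clip]:
    """Move a clip to a new start time and keep the timeline sorted by start."""
    moved = [replace(c, start=new_start) if c.clip_id == clip_id else c for c in timeline]
    return sorted(moved, key=lambda c: c.start)
```

an agent could then chain calls, e.g. `remove_clip(cut_clip(timeline, "a", 4.0), "a-b")` to trim everything after the 4 second mark of clip `a`.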

we need a big chat that is the main thing. the view should be AI-native: mainly the timeline, the preview, and the chat.

we need to be able to change audio levels, shape the audio, and add sound effects.

text prep as well: text alignment and that sort of thing.

we have a big feature called auto edit. that is our killer feature.

a retention graph for the timeline needs to be provided, as an estimate of sorts.

short-form repurposing scenarios for the content.

highlight generation for the intro, for higher retention.

AI captions added to the content itself, including deciding what needs to be added. we need styles for this as well. we already have transcriptions, so this is built on top of the transcripts.
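a rough sketch of the "built on top of the transcripts" part: turning transcript segments into an SRT caption file. the segment shape (`start`, `end`, `text`, with times in seconds) is an assumption about what the transcript step emits, and `to_srt` is a hypothetical helper. styling is out of scope here since plain SRT carries no style info.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render transcript segments as SRT: index, time range, text, blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


srt = to_srt([{"start": 5, "end": 7.5, "text": " the magic "}])
```

here `srt` is a single cue running from `00:00:05,000` to `00:00:07,500` with the text `the magic`.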

review video functionality: sort of a User Acceptance Editors

section labelling for the video, e.g. for youtube
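for the youtube case, the output could be a timestamp list pasted into the video description. a minimal sketch, assuming sections arrive as `(start_seconds, label)` pairs from some upstream labelling step. `format_chapters` is a hypothetical name, and the `M:SS label` layout follows the usual youtube chapter convention of starting the list at 0:00.

```python
def format_chapters(sections: list[tuple[float, str]]) -> str:
    """Render (start_seconds, label) pairs as one 'M:SS label' line per section."""
    lines = []
    for start, label in sorted(sections):
        total = int(start)  # chapter stamps are whole seconds
        hours, rem = divmod(total, 3600)
        minutes, secs = divmod(rem, 60)
        if hours:
            stamp = f"{hours}:{minutes:02d}:{secs:02d}"
        else:
            stamp = f"{minutes}:{secs:02d}"
        lines.append(f"{stamp} {label}")
    return "\n".join(lines)


chapters = format_chapters([(75, "setup"), (0, "intro"), (3725, "outro")])
```

for that input the three lines are `0:00 intro`, `1:15 setup`, and `1:02:05 outro`.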

timeline heatmap: strong moments, weak moments, etc.
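one way the heatmap could be derived, as a sketch only: assume some model gives a per-second engagement score between 0 and 1, then average the scores over fixed windows and label each window. the score source, the 5 second window, and the 0.7 / 0.3 thresholds are all made-up assumptions.

```python
from statistics import mean


def heatmap(
    scores: list[float],
    window: int = 5,
    strong: float = 0.7,
    weak: float = 0.3,
) -> list[tuple[int, int, str]]:
    """Label each window of per-second scores as strong, weak, or neutral.

    Returns (start_second, end_second, label) cells; the last cell may be shorter.
    """
    cells = []
    for start in range(0, len(scores), window):
        chunk = scores[start:start + window]
        avg = mean(chunk)
        if avg >= strong:
            label = "strong"
        elif avg <= weak:
            label = "weak"
        else:
            label = "neutral"
        cells.append((start, start + len(chunk), label))
    return cells


cells = heatmap([0.9] * 5 + [0.1] * 5 + [0.5] * 3)
```

with that input the cells come out as strong for seconds 0 to 5, weak for 5 to 10, and neutral for 10 to 13.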

auto voice over


quick start (local dev with kind)

# prerequisites (macOS — adapt for linux)
brew install kind kubectl skaffold docker rustup uv pnpm

# spin up local k8s cluster
kind create cluster --name frameos --config infra/kind-config.yaml
kubectl create namespace frameos

# load API keys into a k8s secret
kubectl -n frameos create secret generic api-keys \
  --from-literal=GEMINI_API_KEY="$GEMINI_API_KEY" \
  --from-literal=GROQ_API_KEY="$GROQ_API_KEY" \
  --from-literal=PEXELS_API_KEY="$PEXELS_API_KEY"

# deploy backend + seaweedfs
kubectl apply -f infra/k8s/

# install deps
cd backend && uv sync && cd ..
cd desktop && pnpm install && cd ..

# run (two terminals)
cd infra && skaffold dev          # backend live-reload
cd desktop && pnpm tauri dev      # desktop app

full setup, day-to-day commands, coding standards, testing, and git workflow are in AGENTS.md.

About

the operating system for ai native video editing.
