feat: Phase 1 video upload support (Blossom-compliant-ish) by tlongwell-block · Pull Request #285 · block/sprout

tlongwell-block · 2026-04-09T19:26:40Z

Video Upload Support — Full Stack

What

End-to-end video support for Sprout: Blossom-compliant relay upload/validation/serving + desktop client upload with ffmpeg transcode + inline video playback with poster frame thumbnails.

Relay — Upload, Validation, Serving

Upload: Streaming upload to S3 via temp file — incremental SHA-256, size enforcement mid-stream, magic-byte validation (4 KiB sniff buffer)
Validation: Codec allowlist (H.264), duration cap (600s), resolution cap (3840×2160), moov-before-mdat check with bounded atom scanner (MAX_ATOMS=1024), container brand check, zero-duration rejection, timescale=0 guard (prevents div-by-zero panic in mp4 crate)
Serving: S3-native range GET for 206 Partial Content (video seeking), streaming full downloads, suffix range support (bytes=-N) per RFC 9110, multi-range fallback to 200 (RFC 9110 §14.2)
Security: Content-Type spoofing prevention (each path rejects the other via magic bytes), auth-before-body, no internal details leaked in errors, moov scanner fails closed on budget exceeded
imeta: Full NIP-71 validation — duration/bitrate/image fields gated to video/mp4 only, hash cross-checks (thumb keyed by parent, poster frame independent), poster frame blob+sidecar verification with MIME/extension checks, duration cross-check against sidecar, defense-in-depth on poster hash extraction

Desktop Client — Upload + Playback

ffmpeg transcode: All video files (including MP4) transcoded to H.264/AAC/MP4/fast-start before upload. Handles HEVC, VP9, ProRes, non-faststart MP4, 10-bit, wrong pixel format, MOV containers
Poster frame extraction: After transcode, ffmpeg extracts a JPEG poster frame (-ss 1 with fallback to first frame for <1s videos, scale=640:-2, q:v 2). Poster is uploaded as a separate image blob (best-effort — failure does not block video upload). Poster URL returned in BlobDescriptor.image and emitted as NIP-71 image field in imeta tags
All upload paths: 📎 button (file dialog), drag-and-drop, paste — all support video with transcode + poster extraction
File picker: Accepts mp4, mov, mkv, webm, avi (+ existing image formats)
Inline playback: Videos render as <video poster={url} preload="metadata"> — browser fetches only moov atom initially, shows poster thumbnail, uses range requests for playback
Poster rendering on received messages: MessageRow parses imeta tags via parseImetaTags(), threads imetaByUrl map to Markdown component for poster lookup (falls back to thumb for compatibility)
Proxy: sprout-media:// custom protocol forwards Range headers for video seeking, propagates Content-Range/Accept-Ranges
Duration: End-to-end — BlobDescriptor → imeta tag builder → relay cross-check
Safety: spawn_blocking on all sync I/O paths, UUID temp files, RAII temp cleanup (closure guard ensures cleanup on all exit paths), OsStr path handling, find_ffmpeg() with platform-specific install instructions, ffmpeg stderr captured and logged for debugging
Requires: ffmpeg on PATH (not bundled)

Architecture

Desktop:
  📎/Drop/Paste → is_video_file()?
    yes → ffmpeg transcode → extract poster frame
        → upload video → upload poster (best-effort)
        → BlobDescriptor { image: poster_url }
    no  → direct upload (images, no poster)

Relay:
  PUT /media/upload → Content-Type branch →
    video/ → process_video_upload() → stream to temp → validate → put_file() to S3
    image/ → process_upload() → validate_content() → put() to S3

  GET /media/{hash}.mp4 →
    Range header? → HEAD for size → get_range() → 206 Partial Content
    No range?    → get_stream()   → Body::from_stream() → 200 OK

  Message send → validate_imeta_tags() → verify_imeta_blobs() →
    Checks: video blob exists, poster blob exists, poster is image MIME,
    poster extension matches sidecar, duration cross-check

  sprout-media:// proxy → forwards Range → propagates 200/206/416

Files Changed (25 files, ~3500 lines)

sprout-media: storage.rs, upload.rs, validation.rs, error.rs, config.rs, types.rs, auth.rs, lib.rs, Cargo.toml
sprout-relay: media.rs, messages.rs, config.rs, router.rs
sprout-test-client: e2e_media_video.rs (7 E2E tests including poster imeta accept/reject)
desktop/src-tauri: commands/media.rs (ffmpeg transcode + poster extraction), lib.rs (proxy Range forwarding)
desktop/src: useMediaUpload.ts, markdown.tsx, parseImeta.ts, MessageComposer.tsx, MessageRow.tsx, tauri.ts, mediaUrl.ts
desktop/scripts: check-file-sizes.mjs (media.rs limit bump for poster helpers)
scripts: test-video-upload.sh (15-case live test script including poster tests)

Test Coverage

761 unit tests passing across workspace (sprout-media + sprout-relay + all crates)
7 E2E integration tests (upload roundtrip, Content-Type spoofing, range 206/416, auth, video+poster imeta accepted via WS, video-as-poster rejected via WS)
15-case live test script verified against running relay (including poster upload, blob coexistence, sidecar metadata, Content-Type checks)
Zero clippy warnings, all pre-push hooks green (biome, rustfmt, cargo check, desktop build)

Review Scores (after crossfire fix iterations)

Opus (Claude 4.6 Opus): 9/10 APPROVE_WITH_NOTES — all prior findings addressed, remaining notes are low-severity (proxy streaming deferred to v2)
Codex (GPT-5.4): 9/10 APPROVE — iterated through 4 review passes total; all findings addressed including scoped auth, TOCTOU fd-pinning, proxy OOM cap, ffmpeg timeout + stderr deadlock prevention
No merge blockers

Fix Commits (crossfire follow-ups)

c06e4f2 — scoped auth window (600s images / 3600s video), TOCTOU fd-pinning restored, proxy 20 MiB OOM cap, body-limit error robustness, client auth expiry scoped
34a6f7b — ffmpeg wall-clock timeout (10min transcode, 30s poster extraction)
3745637 — -loglevel error on all ffmpeg calls to prevent stderr pipe deadlock in timeout wrapper

No Database Changes

Zero migrations, schema changes, or new tables. Video metadata stored in S3 sidecars (same pattern as images). Poster frames are independent content-addressed blobs linked only through the imeta tag.

V2 Roadmap

Smart skip/remux via ffprobe (~100x faster for compliant files)
Progress bar for ffmpeg transcode
Streaming upload from desktop (avoid RAM buffering for large files)
Streaming proxy response (avoid desktop RAM buffering on playback)
Bundled ffmpeg (zero-friction install)
Per-pubkey upload rate limiting + storage quotas
Cancellation for long transcodes
Blurhash generation for poster frames

…r tests

messages.rs:
- Add image_value tracking and hash cross-check against x field (same pattern as thumb; NIP-71 poster frame must reference same blob)
- image field already rejects .mp4 URLs (image-only extensions)
- 2 new tests: hash mismatch rejection, matching hash acceptance

validation.rs:
- Add MAX_ATOMS=1024 iteration limit to check_moov_before_mdat() (prevents DoS from crafted files with millions of tiny atoms)
- Handle extended atom size (compact_size==1): read 64-bit size and continue scanning instead of silently stopping
- Handle atom_size==0 (extends-to-EOF): check mdat before breaking
- 4 new tests: iteration limit, extended size, extended mdat-before-moov, EOF atom mdat-before-moov

upload.rs:
- build_descriptor already correctly filters empty strings to None (no code change needed; added 3 tests proving it)
- Tests verify JSON serialization omits empty thumb/blurhash for video

Duration validation now rejects d <= 0.0 instead of d < 0.0. Zero-duration videos are semantically invalid — server-side validate_video_file() also catches this via mvhd timescale, but belt-and-suspenders at the imeta layer is cheap and safe. Addresses Clove's re-review item #3 (low severity).

- get_range(key, start, end): S3-native range GET via bucket.get_object_range(), inclusive byte offsets, only transfers requested slice (never loads full blob)
- put_file(key, path, content_type): streaming upload from disk via 8 MiB BufReader, full file never held in RAM simultaneously
- duration_secs field on BlobMeta for video sidecar metadata
- Improved doc comments on put() method

config.rs:
- Add SPROUT_MAX_VIDEO_BYTES env var parsing (default 500 MB)
- Wires sprout-media's max_video_bytes into the relay Config

router.rs:
- Change media body limit from max_image_bytes to max(max_image_bytes, max_video_bytes)
- Ensures video uploads aren't rejected at the transport layer
- Per-MIME app-level limits still enforced in sprout-media validation

Replace full-blob load + in-memory slice with:
- HEAD to get total size (no blob data loaded)
- get_range(key, start, end) for the 206 path only
- get(key) preserved for 200 full-download path

Eliminates O(blob_size) RAM allocation per range request. A 500 MB video range request now allocates at most 16 MiB. Also includes rustfmt cleanup on pre-existing lines. Closes C3 from code review.
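A minimal sketch of the bounded 206 path described above, assuming the relay caps each served slice at 16 MiB. The names MAX_RANGE_BYTES, clamp_range, and content_range are hypothetical and are not taken from the PR; only the 16 MiB figure and the inclusive byte offsets come from the commit message.

```rust
/// Hypothetical per-request cap; the commit message only states that a range
/// request "allocates at most 16 MiB".
pub const MAX_RANGE_BYTES: u64 = 16 * 1024 * 1024;

/// Clamp an inclusive byte range to the blob size (learned from a HEAD
/// request) and to the per-request cap. Returns None when the range cannot
/// be satisfied.
pub fn clamp_range(start: u64, end: u64, total: u64) -> Option<(u64, u64)> {
    if total == 0 || start >= total || end < start {
        return None;
    }
    let capped = start.saturating_add(MAX_RANGE_BYTES - 1);
    Some((start, end.min(total - 1).min(capped)))
}

/// Format the Content-Range header value for a 206 response.
pub fn content_range(start: u64, end: u64, total: u64) -> String {
    format!("bytes {start}-{end}/{total}")
}
```

With this shape, a request for an entire 500 MB blob via Range would be answered with the first 16 MiB, and the player is expected to ask for the next slice.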

Before: check_moov_before_mdat() returned Ok(()) when MAX_ATOMS was exceeded, silently passing files with 1025+ junk atoms hiding mdat. After: returns Err(MoovNotAtFront) — fail closed. A file with too many top-level atoms is abnormal and cannot be verified as fast-start. Updated test to assert the error instead of Ok.
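The fail-closed scan can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in validation.rs: the ScanError type is hypothetical, and treating "no moov found before EOF" as MoovNotAtFront is a choice made for this sketch. MAX_ATOMS, the extended-size case (compact size 1), and the extends-to-EOF case (size 0) follow the commit messages.

```rust
use std::io::{self, Read, Seek, SeekFrom};

pub const MAX_ATOMS: usize = 1024;

#[derive(Debug, PartialEq)]
pub enum ScanError {
    MoovNotAtFront,
    Io(String),
}

fn io_err(e: io::Error) -> ScanError {
    ScanError::Io(e.to_string())
}

/// Walk top-level atoms and succeed only if `moov` appears before `mdat`.
pub fn check_moov_before_mdat<R: Read + Seek>(r: &mut R) -> Result<(), ScanError> {
    let len = r.seek(SeekFrom::End(0)).map_err(io_err)?;
    let mut pos: u64 = 0;
    for _ in 0..MAX_ATOMS {
        // No room for another 8-byte header: moov was never seen.
        if pos.checked_add(8).map_or(true, |end| end > len) {
            return Err(ScanError::MoovNotAtFront);
        }
        r.seek(SeekFrom::Start(pos)).map_err(io_err)?;
        let mut hdr = [0u8; 8];
        r.read_exact(&mut hdr).map_err(io_err)?;
        let compact = u32::from_be_bytes([hdr[0], hdr[1], hdr[2], hdr[3]]);
        let kind = &hdr[4..8];
        // Check the type first so an mdat with size 0 or 1 is still caught.
        if kind == b"moov" {
            return Ok(());
        }
        if kind == b"mdat" {
            return Err(ScanError::MoovNotAtFront);
        }
        let (size, header_len) = match compact {
            // Size 0 means the atom runs to EOF, so no moov can follow.
            0 => return Err(ScanError::MoovNotAtFront),
            // Size 1 means a 64-bit size follows the type field.
            1 => {
                let mut ext = [0u8; 8];
                r.read_exact(&mut ext).map_err(io_err)?;
                (u64::from_be_bytes(ext), 16)
            }
            n => (u64::from(n), 8),
        };
        if size < header_len {
            return Err(ScanError::MoovNotAtFront); // malformed atom size
        }
        pos = pos.checked_add(size).ok_or(ScanError::MoovNotAtFront)?;
    }
    // Atom budget exhausted without a verdict: fail closed.
    Err(ScanError::MoovNotAtFront)
}
```

The last line is the behavior change this commit describes: running out of budget is an error rather than an implicit pass.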

The poster frame (image field) is an independent blob with its own content hash — it cannot match the video's x hash. The cross-check rejected all legitimate poster frames by construction. Fix: remove the cross-check entirely. Keep URL format validation and image-extension allowlist (jpg/png/gif/webp). The poster frame is validated as a local media URL with an image extension only. Also: update thumb cross-check comment to clarify it checks URL key consistency (thumbnails are keyed by parent hash), not content identity. Removed: image_value variable, hash cross-check block, 2 obsolete tests. Updated: poster frame test now uses different hash to prove independence.

Remove video/mp4 from ALLOWED_MIME_TYPES in validate_content(). This closes the Content-Type spoofing attack: an MP4 uploaded as image/jpeg now hits the image path, infer::get() detects video/mp4, and validate_content() rejects it as DisallowedContentType. Video uploads use process_video_upload() which has its own independent magic-byte check. Each path rejects the other's content — defense in depth. Also removes dead video/mp4 branches in validate_content() (size cap, image bomb skip) since video/mp4 can no longer reach that code.
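To illustrate the "each path rejects the other's content" property, here is a small sketch that uses hand-written magic-byte checks as a stand-in for infer::get(). The Sniffed and UploadPath types and the route_upload function are hypothetical, and the image allowlist is reduced to JPEG and PNG for brevity.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Sniffed {
    Jpeg,
    Png,
    Mp4,
    Unknown,
}

/// Simplified stand-in for a magic-byte sniffer.
pub fn sniff(bytes: &[u8]) -> Sniffed {
    if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Sniffed::Jpeg
    } else if bytes.starts_with(&[0x89, b'P', b'N', b'G']) {
        Sniffed::Png
    } else if bytes.len() >= 8 && &bytes[4..8] == b"ftyp" {
        Sniffed::Mp4
    } else {
        Sniffed::Unknown
    }
}

#[derive(Debug, PartialEq)]
pub enum UploadPath {
    Image,
    Video,
}

/// Branch on the declared Content-Type, then require the sniffed type to
/// belong to the same family. A mismatch in either direction is rejected.
pub fn route_upload(content_type: &str, body_prefix: &[u8]) -> Result<UploadPath, &'static str> {
    let sniffed = sniff(body_prefix);
    if content_type.starts_with("video/") {
        match sniffed {
            Sniffed::Mp4 => Ok(UploadPath::Video),
            _ => Err("DisallowedContentType"),
        }
    } else if content_type.starts_with("image/") {
        match sniffed {
            Sniffed::Jpeg | Sniffed::Png => Ok(UploadPath::Image),
            _ => Err("DisallowedContentType"),
        }
    } else {
        Err("DisallowedContentType")
    }
}
```

Because neither branch has a code path that accepts the other family's bytes, a spoofed header can only select which rejection it gets.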

Closes the contract mismatch between validate_video_file(), which accepted a duration of 0.0, and validate_imeta_tags(), which rejected duration <= 0.0. A zero-duration video would pass upload validation but later fail imeta validation, which is inconsistent behavior. Now both paths agree: duration must be > 0.0 and <= 600.0.

- get_stream(key): returns ByteStream (Pin<Box<Stream<Item=Result<Bytes, MediaError>>>>)
- Wraps bucket.get_object_stream(), checks status_code for 404, maps S3 errors
- Full object never buffered in RAM; intended for Body::from_stream() responses
- ByteStream type alias exported from lib.rs for downstream use

When axum's RequestBodyLimitLayer rejects an oversized stream, the error propagates as a 'length limit' error through the body stream. Previously this was mapped to MediaError::Io → 500 Internal Server Error. Now: detect 'length limit' / 'body limit' in the stream error message, map to io::ErrorKind::WriteZero, catch in the read loop, and return MediaError::FileTooLarge → 413 Payload Too Large. This gives clients a proper 413 response instead of a confusing 500.
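A simplified sketch of the string-matching classification described above. It skips the intermediate io::ErrorKind::WriteZero step and maps the message straight to an error variant. The MediaError shape shown here (including the max field) and the classify_stream_error helper are assumptions for illustration, and matching on error text is inherently fragile.

```rust
#[derive(Debug, PartialEq)]
pub enum MediaError {
    FileTooLarge { size: u64, max: u64 },
    Io(String),
}

impl MediaError {
    /// HTTP status the relay would answer with for this error.
    pub fn status(&self) -> u16 {
        match self {
            MediaError::FileTooLarge { .. } => 413,
            MediaError::Io(_) => 500,
        }
    }
}

/// Classify a body-stream error by its message. `received` is the number of
/// bytes read before the stream was cut off.
pub fn classify_stream_error(msg: &str, received: u64, max: u64) -> MediaError {
    let lower = msg.to_ascii_lowercase();
    if lower.contains("length limit") || lower.contains("body limit") {
        MediaError::FileTooLarge { size: received, max }
    } else {
        MediaError::Io(msg.to_string())
    }
}
```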

New test file: e2e_media_video.rs with 5 integration tests:
1. test_video_upload_and_get: upload MP4, verify descriptor + GET
2. test_video_content_type_spoofing_rejected: MP4 as image/jpeg → rejected
3. test_video_range_request_206: Range header → 206 + correct bytes
4. test_video_range_request_416: out-of-range → 416
5. test_video_upload_no_auth_returns_401: no auth → 401

Includes self-contained minimal MP4 builder (hand-crafted H.264 boxes). Tests are #[ignore] because they require a running relay + MinIO.

Upload: swap Bytes extractor for axum::body::Body. Video path streams directly to disk via into_data_stream() and is never fully buffered in RAM. Image path collects to bytes with explicit limit. Removes futures_util::stream::once() workaround.

Download: 200 path uses get_stream() + Body::from_stream() instead of get(), so it streams from S3 without loading the full blob into RAM. HEAD first for Content-Length (same pattern as 206 path).

Cleanup: remove stale streaming TODOs from media.rs, update router.rs comment to reflect streaming reality.

Add image field HEAD check to verify_imeta_blobs, same pattern as thumb. Key difference: poster frames are independent blobs, so the hash is extracted from the image URL itself (via extract_hash_from_media_url), not from x_value. This closes the gap where clients could reference nonexistent poster images in imeta tags and the relay would accept them. Note: unit testing requires MediaStorage (S3 HEAD). Covered by E2E tests in e2e_media_extended.rs (WebSocket imeta validation).

Add suffix range parsing to parse_byte_range(): bytes=-N returns the last N bytes. Clamps to file start if N > total. Rejects bytes=-0 and suffix on empty files. 4 new/updated tests. Removes known-deviation comment.
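A rough sketch of single-range parsing with suffix support and the multi-range fallback to 200. The RangeOutcome type is hypothetical and the real parse_byte_range() signature is not shown in this PR text. Mapping the rejected cases (bytes=-0, suffix on an empty file) to 416 is an assumption of this sketch.

```rust
/// Outcome of parsing a Range header against a blob of `total` bytes.
#[derive(Debug, PartialEq)]
pub enum RangeOutcome {
    /// Ignore the header and serve the whole blob with 200.
    Full,
    /// Serve bytes start..=end with 206.
    Partial { start: u64, end: u64 },
    /// Answer 416 Range Not Satisfiable.
    Unsatisfiable,
}

pub fn parse_byte_range(header: &str, total: u64) -> RangeOutcome {
    let Some(spec) = header.trim().strip_prefix("bytes=") else {
        return RangeOutcome::Full; // unknown range unit
    };
    if spec.contains(',') {
        return RangeOutcome::Full; // multi-range is unsupported: fall back to 200
    }
    let Some((first, last)) = spec.split_once('-') else {
        return RangeOutcome::Full; // malformed
    };
    let (first, last) = (first.trim(), last.trim());

    if first.is_empty() {
        // Suffix form bytes=-N: the last N bytes, clamped to the file start.
        return match last.parse::<u64>() {
            Ok(n) if n > 0 && total > 0 => RangeOutcome::Partial {
                start: total.saturating_sub(n),
                end: total - 1,
            },
            Ok(_) => RangeOutcome::Unsatisfiable,
            Err(_) => RangeOutcome::Full,
        };
    }

    let Ok(start) = first.parse::<u64>() else {
        return RangeOutcome::Full;
    };
    if start >= total {
        return RangeOutcome::Unsatisfiable;
    }
    let end = if last.is_empty() {
        total - 1 // open-ended form bytes=N-
    } else {
        match last.parse::<u64>() {
            Ok(e) if e >= start => e.min(total - 1),
            _ => return RangeOutcome::Full,
        }
    };
    RangeOutcome::Partial { start, end }
}
```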

The first network read could be as small as 1 byte (proxy fragmentation), which is too small for infer::get() to detect MP4 magic bytes (needs 12+ bytes for ftyp header). Previously we captured only the first chunk. Now: accumulate up to 64 bytes across reads into a sniff buffer before passing to infer::get(). This handles tiny initial chunks from proxies, slow clients, or chunked transfer encoding.

4 KiB is the standard sniff buffer size — infer checks signatures at various offsets, not just the first few bytes. 64 was sufficient for MP4 ftyp but too small for robust format detection in general. Per Hana's architecture recommendation.
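The accumulation step can be sketched like this. SniffBuffer is a hypothetical helper, and the ftyp check is a simplified stand-in for passing the buffer to infer::get(); only the 4 KiB size and the "accumulate across reads" behavior come from the commit messages.

```rust
pub const SNIFF_LEN: usize = 4096;

/// Collects the first SNIFF_LEN bytes of a stream, however the network
/// happens to chunk them.
pub struct SniffBuffer {
    buf: Vec<u8>,
}

impl SniffBuffer {
    pub fn new() -> Self {
        Self { buf: Vec::with_capacity(SNIFF_LEN) }
    }

    /// Copy as much of `chunk` as still fits; later bytes are ignored here
    /// (the caller still writes the whole chunk to the temp file).
    pub fn push(&mut self, chunk: &[u8]) {
        let room = SNIFF_LEN - self.buf.len();
        let take = chunk.len().min(room);
        self.buf.extend_from_slice(&chunk[..take]);
    }

    pub fn is_full(&self) -> bool {
        self.buf.len() == SNIFF_LEN
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.buf
    }

    /// Simplified check: an ISO BMFF file carries `ftyp` at offset 4.
    pub fn looks_like_mp4(&self) -> bool {
        self.buf.len() >= 8 && &self.buf[4..8] == b"ftyp"
    }
}
```

Detection would run once the buffer is full or the stream ends, whichever comes first, so a 1-byte first read no longer decides the outcome.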

Covers the full Blossom video upload flow:
- Upload MP4 with kind:24242 auth (nak + ffmpeg-generated test file)
- GET full blob (200, size match)
- HEAD with Accept-Ranges: bytes
- Range GET (206 Partial Content, exact byte count)
- Range GET past EOF (416 Range Not Satisfiable)
- Content-Type spoofing rejection (video/mp4 header, PNG body)
- Idempotent re-upload (same hash returns 200)

Requires: ffmpeg, nak, curl, jq, shasum. Works in dev mode.

Add dependencies: mp4, tempfile, tokio-util, futures-util, futures-core
Add video error variants: WrongCodec, DurationTooLong, ResolutionTooHigh, MoovNotAtFront, UnsupportedContainer, InvalidVideo, Io
Add max_video_bytes config field (default 500 MB)
Add duration field to BlobDescriptor
Bump Blossom auth window from 10min to 1hr for large uploads

messages.rs:
- Poster frame verification now loads sidecar (proves upload completed)
- Verifies sidecar MIME is image type (not video/other)
- HEADs canonical blob key using sidecar extension (matches serving path)
- Without sidecar check, a poster URL could pass verification but 404 on serve

e2e_media_video.rs:
- Add X-SHA-256 header to all 4 authenticated upload requests (BUD-11)
- Without this header, uploads would get 401 instead of testing the feature

…_blobs

1. Poster frame extension: extract ext from image URL, compare against sidecar's canonical extension. Mismatch means the URL would 404 on serve (GET resolves via sidecar ext, not URL ext).
2. Duration cross-check: if sidecar has duration_secs and client claims a duration in imeta, compare within 0.1s tolerance (float rounding from mvhd timescale). Prevents clients from lying about duration.
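A minimal sketch of the duration rules stated in this PR: the claimed value must be > 0.0 and <= 600.0, and when the sidecar carries duration_secs the two must agree within 0.1 s. The function name, constant names, and String error type are hypothetical.

```rust
pub const MAX_DURATION_SECS: f64 = 600.0;
pub const DURATION_TOLERANCE_SECS: f64 = 0.1;

/// Validate a client-claimed duration, optionally against the sidecar value
/// recorded at upload time.
pub fn check_claimed_duration(claimed: f64, sidecar: Option<f64>) -> Result<(), String> {
    // Written as a negated conjunction so that NaN is rejected as well.
    if !(claimed > 0.0 && claimed <= MAX_DURATION_SECS) {
        return Err(format!("duration {claimed} out of range"));
    }
    if let Some(actual) = sidecar {
        if (claimed - actual).abs() > DURATION_TOLERANCE_SECS {
            return Err(format!("duration {claimed} does not match stored {actual}"));
        }
    }
    Ok(())
}
```

When no sidecar duration is available, only the range check applies, which matches the "if sidecar has duration_secs" wording above.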

…e standalone blobs)

… poster defense-in-depth

- Reject video-only NIP-71 fields (duration, bitrate, image) on non-video imeta tags (previously accepted silently for image blobs)
- Fall back to 200 full-body response for unsupported multi-range requests instead of returning 416 (per RFC 9110 §14.2: server MAY ignore Range)
- Return error instead of silently skipping poster frame verification when hash extraction fails (defense-in-depth; syntactic validation catches this upstream, but fail-closed is safer)
- Drop temp file immediately after S3 upload to free disk space eagerly instead of waiting for function return (matters for 500MB uploads)
- Add FRAGILE marker on body-limit string-matching error detection
- Add tests: .thumb.jpg rejected as poster frame, duration on image rejected

…line playback

Desktop client now supports uploading video files through all entry points:

📎 Button (pick_and_upload_media):
- File picker accepts mp4, mov, mkv, webm, avi (+ existing image formats)
- All video files transcoded to H.264/AAC/MP4/fast-start via ffmpeg
- Sniff magic bytes → transcode if video → upload
- All sync I/O in spawn_blocking to avoid async runtime starvation

Drag-and-drop / Paste (upload_media_bytes):
- Video bytes written to temp file → ffmpeg transcode → upload
- Accepts video/mp4, video/quicktime, video/x-matroska, video/webm, video/x-msvideo
- Temp files cleaned up after upload (or on failure)

Rendering:
- Videos use ![video](url) markdown syntax
- Markdown renderer detects .mp4 URLs → renders <video> with controls
- Images continue to render as <img> as before

Infrastructure:
- find_ffmpeg() distinguishes NotFound vs broken install vs other errors
- UUID-based temp file names (no collision under concurrent uploads)
- OsStr path passing to ffmpeg (handles non-UTF-8 paths on Unix)
- BlobDescriptor gains duration field (Tauri + TS types)
- imeta tag builder includes duration for video
- parseImeta parses duration from incoming tags

Requires ffmpeg on PATH. Clear error message with install instructions if missing.

TODO(v2): smart skip/remux via ffprobe, progress bar, streaming upload, cancellation, bundled ffmpeg.

- Extract JPEG poster frame from transcoded MP4 via ffmpeg (-ss 1, fallback to first frame for <1s videos, scale=640:-2, q:v 2)
- Upload video first, then poster as separate image blob (best-effort: poster failure does not block video upload)
- Return poster URL in BlobDescriptor.image field
- Emit NIP-71 `image` field in imeta tags (server already validates it)
- Render with <video poster={url} preload="metadata">: browser fetches only moov atom initially, uses range requests for playback
- Thread imetaByUrl from MessageRow through Markdown for received messages
- Parse `image` field in parseImetaTags for poster lookup on render
- Guard timescale=0 in MP4 validation (prevents div-by-zero panic in mp4 crate)
- Add E2E tests: video+poster imeta accepted via WS, video-as-poster rejected
- Extend test-video-upload.sh with poster upload, blob coexistence, sidecar checks
- Bump media.rs file size limit (550→650) for poster extraction helpers
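For illustration, the poster extraction call might assemble its ffmpeg arguments along these lines. The poster_args helper is hypothetical. The -ss 1, scale=640:-2, -q:v 2, and -loglevel error values come from this PR, while -y, -frames:v 1, and the use of -vf are assumptions about how the remaining arguments would be spelled.

```rust
use std::ffi::{OsStr, OsString};

/// Build ffmpeg arguments that write a single JPEG frame from `input` to
/// `output`. Pass `seek_secs = None` for the first-frame fallback used when
/// the clip is shorter than the seek offset.
pub fn poster_args(input: &OsStr, output: &OsStr, seek_secs: Option<u32>) -> Vec<OsString> {
    let mut args: Vec<OsString> = vec!["-loglevel".into(), "error".into(), "-y".into()];
    if let Some(secs) = seek_secs {
        // Placing -ss before -i seeks in the input rather than the output.
        args.push("-ss".into());
        args.push(secs.to_string().into());
    }
    args.push("-i".into());
    args.push(input.to_os_string());
    for a in ["-frames:v", "1", "-vf", "scale=640:-2", "-q:v", "2"] {
        args.push(a.into());
    }
    // Paths stay as OsStr so non-UTF-8 file names survive on Unix.
    args.push(output.to_os_string());
    args
}
```

A caller would try `poster_args(input, output, Some(1))` first and retry with `None` if no frame was produced.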

… cap, error robustness

Server-side:
- auth: verify_blossom_auth_event takes max_age_secs parameter; images use 600s (10 min), video uses 3600s (1 hr). Previously all uploads shared the 1-hour window.
- upload: body-limit error detection adds LengthLimitError pattern for belt-and-suspenders robustness. FileTooLarge.size reports honest bytes-received-before-cutoff instead of nonsensical total+max sum.
- relay: AuthenticatedUpload extractor uses permissive 3600s window (content type unknown at extraction time); upload functions re-verify with the correct per-type window after body consumption.

Desktop:
- media: pick_and_upload_media restores TOCTOU-safe fd-pinning. File opened before spawn_blocking to pin inode; sniff header read from pinned fd; video path resolves fd_real_path for ffmpeg. Fd kept alive through entire ffmpeg transcode (drop only after completion).
- media: sign_blossom_upload_auth takes expiry_secs; do_upload derives it from MIME (3600s video, 300s images). Previously all uploads used 300s, so video uploads >5 min would fail with expired auth.
- lib: proxy adds 20 MiB OOM defense cap for non-range GETs. Range requests (≤16 MiB from server) unaffected.
- media: enhanced TODO on do_upload with streaming fix guidance for v2.
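The two-phase auth window on the server side can be sketched as below. The helper names are hypothetical, and measuring age as now minus created_at is an assumption; this PR text does not show how verify_blossom_auth_event computes it.

```rust
/// Widest window, used before the content type is known.
pub const PERMISSIVE_WINDOW_SECS: u64 = 3600;

/// Per-type window: 3600 s for video, 600 s for everything else (images).
pub fn max_auth_age_secs(content_type: &str) -> u64 {
    if content_type.starts_with("video/") {
        3600
    } else {
        600
    }
}

/// True when an auth event created at `created_at` is still inside the
/// given window at time `now` (both in Unix seconds).
pub fn within_window(created_at: u64, now: u64, max_age_secs: u64) -> bool {
    now.saturating_sub(created_at) <= max_age_secs
}

/// Phase 1 (extractor): permissive check. Phase 2 (upload function):
/// re-check with the window for the actual content type.
pub fn auth_is_fresh(created_at: u64, now: u64, content_type: &str) -> bool {
    within_window(created_at, now, PERMISSIVE_WINDOW_SECS)
        && within_window(created_at, now, max_auth_age_secs(content_type))
}
```

Under these rules an auth event that is 15 minutes old still covers a video upload but no longer covers an image upload.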

wesbillman

SICK!

- New run_ffmpeg_with_timeout() helper: spawns child, polls try_wait() every 500ms, kills the process if the deadline is exceeded.
- Transcode: 10-minute timeout (FFMPEG_TIMEOUT). Generous for any reasonable video; pathological inputs get killed instead of blocking a Tokio worker thread indefinitely.
- Poster extraction: 30-second timeout. Single-frame decode should complete in seconds.
- All three ffmpeg invocations (transcode, poster seek-to-1s, poster fallback) now use the timeout wrapper.

Add -loglevel error to all three ffmpeg invocations (transcode, poster seek-to-1s, poster fallback). Without this, ffmpeg's progress and diagnostic output can fill the OS pipe buffer (~64 KiB), causing the child to block on write() and never exit. The timeout wrapper only reads stderr after exit, so a full pipe creates a deadlock that manifests as a false timeout after 10 minutes. -loglevel error suppresses progress spam while preserving actual error messages (which are small and won't fill the buffer). Added a doc comment on run_ffmpeg_with_timeout explaining the constraint.
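A minimal sketch of a poll-and-kill wrapper of the kind described above, written against std::process only. The run_with_timeout name, its String error type, and the decision to discard stdout are assumptions; the 500 ms poll interval and the "read stderr only after exit" constraint follow the commit messages.

```rust
use std::io::Read;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Run `cmd`, polling for exit every 500 ms and killing it at the deadline.
/// Returns captured stderr on success.
///
/// Constraint: stderr is only drained after the child exits, so the child
/// must not write more than the OS pipe buffer holds (hence -loglevel error
/// for ffmpeg), or it will block and look like a timeout.
pub fn run_with_timeout(mut cmd: Command, timeout: Duration) -> Result<String, String> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::piped())
        .spawn()
        .map_err(|e| format!("spawn failed: {e}"))?;
    let deadline = Instant::now() + timeout;

    loop {
        match child.try_wait().map_err(|e| e.to_string())? {
            Some(status) => {
                let mut stderr = String::new();
                if let Some(mut pipe) = child.stderr.take() {
                    let _ = pipe.read_to_string(&mut stderr);
                }
                return if status.success() {
                    Ok(stderr)
                } else {
                    Err(format!("exited with {status}: {stderr}"))
                };
            }
            None if Instant::now() >= deadline => {
                let _ = child.kill();
                let _ = child.wait(); // reap the process so it is not left as a zombie
                return Err("timed out".to_string());
            }
            None => std::thread::sleep(Duration::from_millis(500)),
        }
    }
}
```

Because this blocks the calling thread while polling, it would be called from spawn_blocking in an async context, in line with the rest of the desktop upload path.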

…ona-migration * origin/main: feat(desktop): add Pulse social notes surface (#296) Fix flaky desktop smoke tests (#294) Add agent lifecycle controls to channel members sidebar (#291) Update nest_agents.md tagging info (#292) feat: add Sprout nest — persistent agent workspace at ~/.sprout (#290) Fix auth and SSRF vulns (#261) Add per-agent MCP toolset configuration to agent setup (#279) feat(desktop): team & persona import/edit flows (#288) Remove menu item subtitles and fix persona card overflow (#289) feat: Phase 1 video upload support (Blossom-compliant-ish) (#285) Add inline subtitles to menu items and field descriptions (#276) Improve ephemeral channel affordances and hide archived sidebar rows (#286) Fix @mention search to use word-boundary prefix matching (#278) Allow bot owners to remove their agents from any channel (#284) [codex] Polish agent selectors and settings layout (#283) # Conflicts: # desktop/scripts/check-file-sizes.mjs

tlongwell-block added 20 commits April 9, 2026 11:43

fix(media): support suffix byte ranges (bytes=-N) per RFC 9110

04e0071


fix: use std::io::Error::other() to satisfy clippy io_other_error lint

0b354bf

tlongwell-block requested a review from wesbillman as a code owner April 9, 2026 19:26

tlongwell-block added 7 commits April 9, 2026 16:19

fix: remove needless Ok wrapper to satisfy clippy

3ff104e

fix: add Authorization header and remove stray tokens in E2E video tests

a5b8901

fix: reject thumbnail URLs in imeta image field (poster frames must b…

0d0274a

…e standalone blobs)

style: rustfmt formatting fixes

aefeb5c

tlongwell-block force-pushed the feat/video-support branch from 5cded13 to 861a395 Compare April 10, 2026 01:40

tlongwell-block force-pushed the feat/video-support branch from 861a395 to f2fdec4 Compare April 10, 2026 02:24

tlongwell-block added 3 commits April 9, 2026 22:50

fix(desktop): show video thumbnail before play (preload=auto)

12a0557

wesbillman approved these changes Apr 10, 2026


tlongwell-block added 2 commits April 10, 2026 11:08

tlongwell-block changed the title from "feat: Phase 1 video upload support (Blossom-compliant)" to "feat: Phase 1 video upload support (Blossom-compliant-ish)" Apr 10, 2026

tlongwell-block merged commit 61efb88 into main Apr 10, 2026
12 of 13 checks passed

tlongwell-block deleted the feat/video-support branch April 10, 2026 16:23

tlongwell-block merged 33 commits into main from feat/video-support