Migrate attachment bytes to a Storage bucket (stage 1) #218
Merged
Write up the in-progress plan to move message_attachments file bytes off base64-in-Postgres into a private Storage bucket (the documents pattern), unifying the app on one file-storage mechanism. Captures the four resolved design decisions: Venice vision reads via signed URL (the API accepts a public URL, so no client byte download); server-side expiry sweep via pg_cron + edge function (bucket objects are storage cost that can't depend on an open tab); recipe_images left out of scope; and no backfill - liveness keys off storage_path so every pre-migration row is "expired" by definition, with a one-time idempotent reclaim of the legacy base64 and the data-column drop deferred to a collapse follow-up. Linked from the dev README's In-progress section.
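The "no backfill" decision can be illustrated with a minimal sketch, assuming liveness reduces to a null check on `storage_path`. The `AttachmentRow` shape and both function names are hypothetical; only `storage_path` and `extracted_text` are named in the plan.

```typescript
// Hypothetical row shape: only storage_path and extracted_text come from the plan.
interface AttachmentRow {
  id: string;
  storage_path: string | null;
  extracted_text: string | null;
}

// Assumption: a row is live when it records a bucket object path.
// Pre-migration rows never had one, so they count as expired without a backfill.
function isLive(row: AttachmentRow): boolean {
  return row.storage_path !== null;
}

// Split rows into live and expired groups in one pass.
function partitionByLiveness(rows: readonly AttachmentRow[]): {
  live: AttachmentRow[];
  expired: AttachmentRow[];
} {
  const live: AttachmentRow[] = [];
  const expired: AttachmentRow[] = [];
  for (const row of rows) {
    (isLive(row) ? live : expired).push(row);
  }
  return { live, expired };
}
```

Under this assumption an expired row still carries `extracted_text`, which is what would let it render as a chip.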
Move message_attachments file bytes off base64-in-Postgres into a private `attachments` Storage bucket, mirroring the documents bucket - one file-storage mechanism. This stage wires the full read/write path; the server-side expiry sweep and the `data`-column drop are follow-ups.

Schema: add `storage_path` to message_attachments + the `attachments` bucket and its three storage.objects RLS policies (self-prefix scoped). Liveness now keys on storage_path (live = object present); the live index and the listAttachmentSummaries expired-proxy follow. A one-time idempotent reclaim nulls the legacy base64 so every pre-bucket row is treated as expired (extracted_text preserved, so they still render as chips and doc_create can still promote them from text). The existing expire_old_attachments RPC + browser worker are left inert this stage.

Data layer: Attachment drops data_base64 for storage_path; addAttachments mints client-side UUIDs, uploads each file to the bucket, and stores the path (no bytes on the wire). New helpers: createAttachmentSignedUrls (batched) and downloadAttachmentBlob. listAttachments* stop projecting bytes, so thread load no longer drags megabytes of base64 into memory.

Reads via signed URL: the chat-loop pre-resolves signed URLs for live image attachments and threads them through toVeniceMessage -> buildUserVeniceContent (now URL-driven, still a pure transform); Venice fetches the URL server-side (its vision input accepts a public URL). MessageAttachments resolves signed URLs in an effect for previews + download links. analyze_image hands Venice a signed URL; doc_create and recipe_photos_attach download the bytes from the bucket. Generated images ride the same addAttachments upload path.

Tests updated for the storage_path shape and the imageUrls argument.

NOT YET VERIFIED (no browser / no live Venice from here): that Venice actually fetches our signed URLs, the image-preview render path, and the upload/download round-trips. These need a live pass before merge.
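A minimal sketch of what a URL-driven pure transform could look like, for orientation only. It is not the repository's `buildUserVeniceContent`: the function name `buildUserContentSketch` is hypothetical, and the content-part shape assumes Venice accepts the OpenAI-style `image_url` part, which this commit does not itself confirm.

```typescript
// Assumed OpenAI-style content parts; the real types in the repo may differ.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

// Hypothetical stand-in for the transform: no I/O, no signing, no byte access.
// Signed URLs are resolved by the caller beforehand and passed in as plain strings.
function buildUserContentSketch(
  text: string,
  imageUrls: readonly string[],
): string | ContentPart[] {
  // Without images, keep the plain-string form of the message.
  if (imageUrls.length === 0) return text;
  const imageParts = imageUrls.map(
    (url): ContentPart => ({ type: "image_url", image_url: { url } }),
  );
  return [{ type: "text", text }, ...imageParts];
}
```

Keeping the signing step outside the transform is what lets it stay pure: the same text and URL list always produce the same content, so it can be unit-tested without a bucket.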
Mark the bucket migration's data path as landed in the in-progress plan, record Stage 2 (server-side expiry) and the data-column collapse as still TODO, and list the live verification owed (Venice signed-URL fetch, preview rendering, upload/download round-trips). Add a status banner + storage_path/signed-URL note to docs/dev/attachments.md so the feature doc reflects the new reality rather than the base64-in-data design.
SYNOPSIS

Stage 1 of unifying file storage: move `message_attachments` bytes off base64-in-Postgres into a private `attachments` Storage bucket, read via signed URLs.

PURPOSE

Attachments stored file bytes as base64 in `message_attachments.data`, a second file-storage mechanism alongside the Library's `documents` bucket, and dragged megabytes of base64 into memory on every thread load. This unifies on the bucket pattern.

DESCRIPTION

Schema. Adds `storage_path` to `message_attachments` and the private `attachments` bucket + three self-prefix `storage.objects` RLS policies (mirroring `documents`). Liveness keys on `storage_path` (live = object present), so every pre-bucket row is "expired" by definition. A one-time idempotent reclaim nulls the legacy base64 (`extracted_text` preserved, so old rows still render as expired chips and `doc_create` can still promote them from text). No backfill.

Write. `addAttachments` mints attachment UUIDs client-side, uploads each file to the bucket, and stores `storage_path` (never bytes). Generated images ride the same path.

Read via signed URL. `listAttachments*` stop projecting bytes. The chat-loop pre-resolves batched signed URLs for live image attachments and threads them through `toVeniceMessage` -> `buildUserVeniceContent` (now URL-driven, still a pure transform); Venice fetches the URL server-side (its vision input accepts a public URL, which a signed URL is for its TTL). `MessageAttachments` resolves signed URLs in an effect for previews + downloads; `analyze_image` hands Venice a signed URL; `doc_create` / `recipe_photos_attach` download bytes from the bucket.

Deliberately deferred (NOT in this PR):

- `attachment_expiry` worker + `expire_old_attachments` RPC are left INERT, so uploaded objects do not yet expire (they accumulate). Stage 2.
- `data`-column drop (collapse follow-up).

Notes for reviewers:

- The one-time reclaim UPDATE references `data`, so the column stays in this PR; dropping it must happen together with removing that UPDATE (a later apply can't reference a dropped column).
- `recipe_images` is a separate base64 store, intentionally untouched.

Not verified from the cloud env (needs a live pass): that Venice fetches the signed URL on a vision turn (load-bearing; base64 data-URI remains as fallback if not), the preview render path, and the upload/download round-trips. Requires `mise run sync` to create the bucket + column.

Gate: svelte-check 0 errors, lint, knip, 1770 tests, build clean, markdownlint.
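One way to picture the write path's self-prefix scoping is a path builder that puts the owner's id in the first segment, so a policy comparing that segment to the caller's id can gate access. The layout `<userId>/<attachmentId>/<fileName>`, the function name, and the filename sanitizing are all assumptions for illustration; the PR does not spell out the actual path format.

```typescript
// Hypothetical path layout: <userId>/<attachmentId>/<sanitized file name>.
// The attachment id is passed in; a client could mint it with crypto.randomUUID().
function attachmentStoragePath(
  userId: string,
  attachmentId: string,
  fileName: string,
): string {
  // Replace anything outside a conservative character set so the
  // object key stays a single, predictable path segment.
  const safeName = fileName.replace(/[^A-Za-z0-9._-]/g, "_");
  return `${userId}/${attachmentId}/${safeName}`;
}

// True when the first path segment is the given user's id.
function isOwnPrefix(path: string, userId: string): boolean {
  return path.split("/")[0] === userId;
}
```

Because the id is supplied by the caller, the row and the object can share one identifier that is known before the upload starts.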
Generated by Claude Code