Attachments stage 2: server-side expiry sweep #219

Merged: sysread merged 3 commits into main from claude/affectionate-ritchie-1jcnn on May 30, 2026

@sysread sysread commented May 30, 2026

SYNOPSIS

Attachments Stage 2: a server-side expiry sweep that deletes bucket objects on a schedule. Plus a doc capturing the text-parser CORS bug for the edge-function project.

PURPOSE

Stage 1 moved attachment bytes into the attachments bucket but left expiry inert (the old browser worker's RPC now matches zero rows), so uploaded objects never get reclaimed. Storage objects are real cost; cleanup must run server-side, not depend on an open tab.

DESCRIPTION

expire-attachments edge function - standalone, not a route on venice (expiry only touches Storage, never calls Venice; also keeps it clear of the in-flight edge-function migration). Service-role gated, mirroring the backfill's JWT-role auth. Deployed via its own line in deploy.yml (the workflow previously deployed only venice).
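A minimal sketch of what a service-role gate of this kind might look like. It is not the PR's handler code: the helper names `decodeJwtPayload` and `isServiceRole` are hypothetical, and it assumes the JWT signature has already been verified upstream, so only the decoded `role` claim is inspected here.

```typescript
// Hypothetical sketch: helper names are illustrative, not taken from the PR.
// Assumption: the JWT signature was already verified upstream (for example by
// the platform gateway), so this only inspects the decoded `role` claim.

function decodeJwtPayload(token: string): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  try {
    // JWT segments are base64url; convert to standard base64 before atob.
    const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
    const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
    const parsed: unknown = JSON.parse(atob(padded));
    return typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    // Malformed base64 or JSON: treat as unauthenticated.
    return null;
  }
}

function isServiceRole(authHeader: string | null): boolean {
  if (!authHeader || !authHeader.startsWith("Bearer ")) return false;
  const payload = decodeJwtPayload(authHeader.slice("Bearer ".length).trim());
  return payload?.role === "service_role";
}
```

A handler would typically call `isServiceRole(req.headers.get("Authorization"))` first and return a 401 or 403 response when it is false, before touching Storage or the database.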

Schema - two security-definer, service-role-only RPCs: list_expirable_attachments (returns live attachments whose owning thread has been dormant for p_days; bounded, FOR UPDATE SKIP LOCKED) and mark_attachments_expired (nulls storage_path and stamps expired_at). The function calls storage.remove between them. Plus nak_trigger_attachment_expiry and an hourly pg_cron job, with the same Vault-secret custody and local-stack guards as the embed backfill.
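The list, remove, mark sequence described above could be wired roughly as follows. This is a sketch under stated assumptions: the `ServiceClient` interface stands in for a supabase-js-style client so the example is self-contained, and `p_limit`, `p_ids`, and the row shape `{ id, storage_path }` are hypothetical (only `p_days`, the two RPC names, and the `attachments` bucket come from the PR).

```typescript
// Sketch only. `ServiceClient` is a stand-in for a supabase-js-style client.
// `p_limit`, `p_ids`, and the row shape are assumptions; only `p_days` and the
// RPC names appear in the PR description.

interface ExpirableRow { id: string; storage_path: string }

interface RpcResult<T> { data: T | null; error: { message: string } | null }

interface ServiceClient {
  rpc<T>(fn: string, args: Record<string, unknown>): Promise<RpcResult<T>>;
  // With supabase-js this would typically wrap storage.from(bucket).remove(paths).
  removeObjects(bucket: string, paths: string[]): Promise<RpcResult<unknown>>;
}

function makeExpiryDeps(client: ServiceClient, days: number) {
  return {
    // Step 1: ask the database which live attachments are eligible.
    async listBatch(limit: number): Promise<ExpirableRow[]> {
      const { data, error } = await client.rpc<ExpirableRow[]>(
        "list_expirable_attachments",
        { p_days: days, p_limit: limit },
      );
      if (error) throw new Error(`list failed: ${error.message}`);
      return data ?? [];
    },
    // Step 2: delete the bucket objects, which SQL alone cannot do.
    async deleteObjects(paths: string[]): Promise<void> {
      const { error } = await client.removeObjects("attachments", paths);
      if (error) throw new Error(`remove failed: ${error.message}`);
    },
    // Step 3: record the expiry so the rows stop matching the list RPC.
    async markExpired(ids: string[]): Promise<void> {
      const { error } = await client.rpc<null>(
        "mark_attachments_expired",
        { p_ids: ids },
      );
      if (error) throw new Error(`mark failed: ${error.message}`);
    },
  };
}
```

Keeping the three steps behind plain functions like this is one way to let the orchestration be tested offline with an in-memory fake in place of the real client.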

_shared/expire-attachments.ts - I/O-free runExpiry drain loop (batch -> delete -> mark, until short batch / row cap / time budget). No per-row claim: delete + mark are idempotent, so overlapping ticks can't corrupt.
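The drain loop described above can be sketched as follows. The PR names only `runExpiry` and its three stop conditions; the `ExpiryDeps` and `ExpiryLimits` interfaces, the injected clock, and the result shape are hypothetical, and the sketch assumes `batchSize` is positive.

```typescript
// Sketch of an I/O-free drain loop. `ExpiryDeps`, `ExpiryLimits`, and the
// result shape are hypothetical; the PR only names `runExpiry` and the stop
// conditions (short batch, row cap, time budget). Assumes batchSize > 0.

interface Row { id: string; storage_path: string }

interface ExpiryDeps {
  listBatch(limit: number): Promise<Row[]>;
  deleteObjects(paths: string[]): Promise<void>;
  markExpired(ids: string[]): Promise<void>;
  now(): number; // injected clock, in milliseconds
}

interface ExpiryLimits { batchSize: number; maxRows: number; budgetMs: number }

type StopReason = "short_batch" | "row_cap" | "time_budget";

async function runExpiry(
  deps: ExpiryDeps,
  limits: ExpiryLimits,
): Promise<{ expired: number; stop: StopReason }> {
  const start = deps.now();
  let expired = 0;
  while (true) {
    if (expired >= limits.maxRows) return { expired, stop: "row_cap" };
    if (deps.now() - start >= limits.budgetMs) {
      return { expired, stop: "time_budget" };
    }

    // Never ask for more rows than the remaining cap allows.
    const want = Math.min(limits.batchSize, limits.maxRows - expired);
    const rows = await deps.listBatch(want);
    if (rows.length > 0) {
      // Delete first, then mark. If marking fails, the next tick lists the
      // same rows again; the PR relies on both steps being idempotent.
      await deps.deleteObjects(rows.map((r) => r.storage_path));
      await deps.markExpired(rows.map((r) => r.id));
      expired += rows.length;
    }
    // A batch smaller than requested means the backlog is drained.
    if (rows.length < want) return { expired, stop: "short_batch" };
  }
}
```

Because every side effect goes through `deps`, a test can pass in-memory fakes and a fake clock, which is presumably what lets this kind of orchestration be covered by offline Deno tests.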

Deferred (next commits): retiring the now-inert browser attachment_expiry supervisor unit + expire_old_attachments RPC (cleanup; harmless meanwhile), then the data-column collapse.

Also in this branch: a doc-only commit fleshing out text-parser.md with the CORS diagnosis for the other edge-function session (text extraction broken from the browser - pre-existing, unrelated to storage). Rides along since it's a migration note.

Verified: 5 new Deno tests (33 total green), handler deno check passes. Not verifiable from the cloud env: the cron + Storage round-trip - confirm post-deploy that an object disappears ~30 days after its thread goes dormant.


Generated by Claude Code

claude added 3 commits May 30, 2026 02:17
Text extraction (PDF/txt/md uploads) fails from the browser with
"Network error contacting Venice: Failed to fetch" - a CORS/network-
layer rejection of the direct call to /augment/text-parser, even though
the chat/image endpoints work from the same host + key. Pre-existing
(extraction runs before any storage write; the attachments-storage
migration is downstream), surfaced now because non-image upload was
never exercised live.

Flesh out the text-parser edge-function sub-plan with the full
diagnosis, the two call sites that must move server-side (chat
attachments in Chat.svelte; Library uploads in documents.ts), the
current extractText shape, the target /text-parser route, and the
large-file escape-hatch wrinkle. The server-side proxy is the fix
regardless of the exact browser-side cause.

Replace the (now-inert) browser attachment_expiry worker with a server-side
sweep that actually deletes bucket objects - the thing SQL can't do.

- schema.sql: two security-definer, service-role-only RPCs -
  list_expirable_attachments (live + thread dormant p_days, bounded, FOR
  UPDATE SKIP LOCKED) and mark_attachments_expired (null storage_path +
  stamp expired_at). Plus nak_trigger_attachment_expiry + an hourly pg_cron
  job, same Vault-secret custody + local-stack guards as the embed backfill.
- expire-attachments edge function: a standalone function (NOT a venice
  route - expiry never calls Venice, only Storage), service-role gated. Its
  deps wire the RPCs + storage.remove into the pure runExpiry orchestration.
- _shared/expire-attachments.ts: I/O-free drain loop (batch -> delete ->
  mark, until short batch / row cap / time budget). No per-row claim -
  delete + mark are idempotent, so overlapping ticks can't corrupt.
- deploy.yml: deploy the new function alongside venice.

Browser-worker removal lands next. Deno: 5 new offline tests pass, handler
type-checks. Cron/Storage round-trip can't be exercised here; verify after
deploy (uploaded objects should disappear 30 days after a thread goes quiet).

Update the migration plan + attachments banner to reflect the server-side
expiry sweep landing (expire-attachments function + cron + RPCs), and record
the browser-worker retirement as the remaining cleanup (Stage 2b) - left in
place because it's inert post-Stage-1.
@sysread sysread merged commit a9f4412 into main May 30, 2026
1 check passed
@sysread sysread deleted the claude/affectionate-ritchie-1jcnn branch May 30, 2026 02:44