feat(session-ingest): stream large payloads via R2 + Queue #884

Merged
iscekic merged 29 commits into main from worktree-session-ingest-streaming
Mar 9, 2026

Conversation

@iscekic
Contributor

@iscekic iscekic commented Mar 6, 2026

Summary

Support ingest payloads up to 5GB by streaming request bodies to R2 (zero worker memory) and processing asynchronously via Queue consumer with streaming JSON parser.

  • Ingest route: streams body directly to R2, enqueues processing message with ingestedAt timestamp
  • Queue consumer (src/queue-consumer.ts): reads R2 object, parses items one-at-a-time via @streamparser/json, sends each to DO individually
    • Items > 1.94MB: data stored in R2 with reference in DO SQLite
    • Items > 50MB: skipped with warning (never fully materialized)
    • Single item failure doesn't stop processing the rest
    • Timestamp guard prevents stale retries from overwriting newer data
  • DO changes: ingest() accepts ingestedAt + r2References; getAll() resolves R2-backed items; new getAllStream() for row-by-row export
  • Schema migration: adds item_data_r2_key and ingested_at columns
  • Bindings: R2 bucket (SESSION_INGEST_R2), Queue (INGEST_QUEUE with DLQ)
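The ingest route described above can be sketched as follows. The binding names match the PR (`SESSION_INGEST_R2`, `INGEST_QUEUE`), but the handler shape and R2 key scheme are illustrative assumptions, not the PR's exact code:

```typescript
// Hedged sketch of the ingest route: stream the request body straight into
// R2 (no buffering in worker memory), then enqueue a processing message
// carrying the ingest timestamp.
interface StagingBucket {
  put(key: string, body: ReadableStream<Uint8Array> | null): Promise<unknown>;
  delete(key: string): Promise<void>;
}
interface IngestQueue {
  send(msg: { sessionId: string; r2Key: string; ingestedAt: number }): Promise<void>;
}
interface Env {
  SESSION_INGEST_R2: StagingBucket;
  INGEST_QUEUE: IngestQueue;
}

export async function handleIngest(request: Request, sessionId: string, env: Env): Promise<Response> {
  const ingestedAt = Date.now();
  const r2Key = `staging/${sessionId}/${ingestedAt}`; // key scheme assumed
  await env.SESSION_INGEST_R2.put(r2Key, request.body);
  try {
    await env.INGEST_QUEUE.send({ sessionId, r2Key, ingestedAt });
  } catch (err) {
    // A later commit in this PR adds exactly this rollback: drop the staging
    // object if the queue send fails, so no orphaned blob is left behind.
    await env.SESSION_INGEST_R2.delete(r2Key);
    throw err;
  }
  return new Response(JSON.stringify({ queued: true }), { status: 202 });
}
```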

Infrastructure prerequisites

cd cloudflare-session-ingest
npx wrangler r2 bucket create session-ingest-staging
npx wrangler queues create session-ingest-processing
npx wrangler queues create session-ingest-dlq

Test plan

  • Typecheck passes
  • Lint passes
  • Unit tests pass (101 tests)
  • Integration tests pass (15 tests via @cloudflare/vitest-pool-workers)
  • Manual test: send large payload via POST /ingest, verify R2 → Queue → DO pipeline
  • Manual test: export session with R2-backed items produces valid JSON

@iscekic iscekic self-assigned this Mar 6, 2026
Remove getAll(), rewrite getAllStream() to stream R2 object bodies
chunk-by-chunk instead of buffering via .text(). Build snapshot JSON
incrementally in 3 phases: scan refs from SQLite, group into
message/part structure, stream JSON with R2 body piping.

Replace RPC exportSession (32MB DO limit) with secret-protected HTTP
route /internal/session/:sessionId/export using service binding fetch.
Uses crypto.subtle.timingSafeEqual for secret comparison.

Keep createSessionForCloudAgent and deleteSessionForCloudAgent as RPC
(small payloads, no streaming needed).
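The PR uses the Workers-specific `crypto.subtle.timingSafeEqual` for the secret check; a portable sketch of the same constant-time idea (helper names are mine, not the PR's):

```typescript
// Constant-time byte comparison: XOR every byte pair and fold into one
// accumulator, so run time does not depend on where the first mismatch is.
function timingSafeEqualBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a[i] ^ b[i];
  return diff === 0;
}

// Guard for the /internal/session/:sessionId/export route (shape assumed).
export function checkExportSecret(provided: string, expected: string): boolean {
  const enc = new TextEncoder();
  return timingSafeEqualBytes(enc.encode(provided), enc.encode(expected));
}
```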
@kilo-code-bot
Contributor

kilo-code-bot Bot commented Mar 6, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 0 |


Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| cloudflare-session-ingest/src/queue-consumer.ts | 215 | Non-array ingest payloads are still acknowledged as success |
| cloudflare-session-ingest/src/dos/SessionIngestDO.ts | 500 | `deleted=true` tombstone makes the DO unreusable for recreated sessions |
Other Observations (not in diff)

None.

Files Reviewed (15 files)
  • cloud-agent-next/src/session-service.ts - 0 issues
  • cloud-agent-next/src/session-ingest-binding.ts - 0 issues
  • cloud-agent-next/wrangler.jsonc - 0 issues
  • cloudflare-session-ingest/src/dos/SessionIngestDO.ts - 1 issue
  • cloudflare-session-ingest/src/index.ts - 0 issues
  • cloudflare-session-ingest/src/index.test.ts - 0 issues
  • cloudflare-session-ingest/src/queue-consumer.ts - 1 issue
  • cloudflare-session-ingest/src/routes/api.test.ts - 0 issues
  • cloudflare-session-ingest/src/routes/api.ts - 0 issues
  • cloudflare-session-ingest/src/services/session-export.ts - 0 issues
  • cloudflare-session-ingest/src/session-ingest-rpc.ts - 0 issues
  • cloudflare-session-ingest/src/types/session-sync.ts - 0 issues
  • cloudflare-session-ingest/test/integration/session-ingest-do.test.ts - 0 issues
  • cloudflare-session-ingest/wrangler.jsonc - 0 issues
  • cloudflare-session-ingest/wrangler.test.jsonc - 0 issues

iscekic added 7 commits March 6, 2026 14:32
Replace 3-phase approach (scan all refs into memory, group, stream)
with single-pass that queries SQLite incrementally and streams JSON
directly: session row first, then cursor-iterate messages one at a
time, querying parts per message via LIKE prefix. Only one row + one
R2 stream in flight at any time.
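The single-pass shape can be sketched as a generator that yields JSON fragments, with only the current row in memory; all names here are illustrative, and the cursor is stood in by a callback:

```typescript
// Emit the snapshot incrementally: session header first, then messages one
// at a time from a cursor-like callback, closing the array at the end.
export function* streamSnapshot(
  session: { id: string },
  nextMessage: () => { id: string; parts: unknown[] } | null,
): Generator<string> {
  yield `{"session":${JSON.stringify(session)},"messages":[`;
  let first = true;
  for (let msg = nextMessage(); msg !== null; msg = nextMessage()) {
    if (!first) yield ',';
    first = false;
    yield JSON.stringify(msg); // an R2-backed item would be piped here instead
  }
  yield ']}';
}
```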
Replace buffered JSONParser with Tokenizer + per-item TokenParser.
Items from $.data[] are now processed one at a time between chunk
reads instead of being collected into an array first.

Per-item byte budget via offset tracking: if an item exceeds
MAX_SINGLE_ITEM_BYTES, its tokens are discarded without ever building
a JS object, preventing OOM from oversized items. A fresh TokenParser
is created per item so budget violations don't corrupt parser state.
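The per-item byte budget can be sketched as a small state holder; the real consumer wires this into `@streamparser/json` token callbacks, and all names here are assumptions:

```typescript
// Track each item's starting byte offset; once tokens for the item pass the
// budget, flip into skipping mode so remaining tokens are discarded without
// ever building a JS object. A fresh beginItem() resets the state, mirroring
// the fresh-TokenParser-per-item design in the commit above.
export class ItemBudget {
  private start = 0;
  private skipping = false;
  constructor(private readonly maxBytes: number) {}

  beginItem(offset: number): void {
    this.start = offset;
    this.skipping = false; // per-item state, so one violation cannot leak
  }

  // Called per token with the tokenizer's byte offset; returns false once
  // the item has exceeded its budget.
  onToken(offset: number): boolean {
    if (!this.skipping && offset - this.start > this.maxBytes) {
      this.skipping = true;
    }
    return !this.skipping;
  }

  get skipped(): boolean {
    return this.skipping;
  }
}
```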
Delete share-output.ts, ingest-batching.ts and their tests.
Remove MAX_DO_INGEST_CHUNK_BYTES, byteLengthUtf8, SessionSyncInputSchema.
…st-streaming

# Conflicts:
#	cloudflare-session-ingest/package.json
- Use TextEncoder byte length for DO SQLite row limit check (non-ASCII safety)
- Escape LIKE wildcards in msgId for part queries
- Clean up R2 item blobs in SessionIngestDO.clear()
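The byte-length fix matters because `String.prototype.length` counts UTF-16 code units, not bytes, while SQLite row limits are byte limits. A minimal sketch (helper name assumed):

```typescript
// Byte-accurate length for a UTF-8 encoded string; a .length-based check
// under-counts non-ASCII data ('é' is 1 code unit but 2 bytes).
export const utf8ByteLength = (s: string): number => new TextEncoder().encode(s).length;
```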
…cs comment

- Use sql template with ESCAPE '\' for LIKE wildcard escaping (SQLite requires it)
- Add comment explaining why R2-backed items show as {} in metrics
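SQLite treats `%` and `_` as wildcards in `LIKE` and only honors a custom escape character when an explicit `ESCAPE` clause is given, which is what the commit above fixes. A sketch of the escaping helper (name assumed):

```typescript
// Escape LIKE wildcards (and the escape character itself) in a prefix so an
// id containing % or _ cannot match unintended rows.
export function escapeLikePrefix(prefix: string): string {
  return prefix.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}
// Query shape: ... WHERE id LIKE ? ESCAPE '\'
// bound to `${escapeLikePrefix(msgId)}/%`
```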
iscekic added 3 commits March 6, 2026 16:37
…ard, queue rollback

- Add ingestedAt to oversized-item R2 keys to prevent stale overwrites
- Clean up orphaned R2 blobs when items are replaced or skipped by timestamp guard
- Wrap queue send in try/catch to delete staging R2 object on failure
- Check session existence before processing queued messages to prevent
  deleted sessions from being repopulated
- Add secrets_store_secrets to cloud-agent-next dev env (missing binding)
- Surface tokenizer parse errors to trigger queue retry/DLQ instead of
  silently acking malformed payloads
iscekic added 3 commits March 6, 2026 16:54
Cloudflare Workers requires queue() to be a property on the default
export, not a named export. Move queue to the default export object
alongside fetch.
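The required shape can be sketched as follows; the message type is trimmed to what the sketch needs, and the handler bodies are placeholders:

```typescript
// Workers resolves the queue consumer from the default export, so queue()
// must sit next to fetch() rather than being a named export.
type QueueMessage = { body: unknown; ack(): void };

const worker = {
  async fetch(_request: Request): Promise<Response> {
    return new Response('ok');
  },
  async queue(batch: { messages: QueueMessage[] }): Promise<void> {
    for (const msg of batch.messages) {
      // ...process msg.body (read R2 object, stream-parse, forward to DO)...
      msg.ack();
    }
  },
};

export default worker;
```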
…destructured value

Destructuring the getter property captured the initial null, so parse
errors were silently ignored. Replace with getParseError() method that
is called after streaming completes.
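A minimal reproduction of the bug: destructuring a getter invokes it once and copies the value at that instant, so later updates are invisible, while a method call reads the current value:

```typescript
// parseError is a getter; destructuring it captures the initial null forever.
// getParseError() — the fix — reads the live value when called.
function makeParserState() {
  let error: Error | null = null;
  return {
    get parseError() { return error; },
    getParseError() { return error; },
    recordError(e: Error) { error = e; },
  };
}
```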
iscekic added 2 commits March 6, 2026 17:42
Resolve merge conflicts between streaming HTTP export (this branch) and
combined exportSessionWithDiff RPC (main). Keep streaming approach for
session export, move diff extraction client-side into cloud-agent-next
via new extractDiffsFromMessages helper. Remove server-side RPC export
methods and related DO code (readItems, getAll, collectDiffs, etc.).
Adapt tests to use fetch mocks with snapshot-embedded diffs.
iscekic added 4 commits March 6, 2026 17:57
…on trigger token

When the closing `}` token triggers the byte budget overflow, `skippingItem` was
set to `true` and `depth` decremented to `2`, but the flag was never cleared. This
caused the next valid item to be silently discarded.
Session deletion (clear()) can race with queued ingests that repopulate
the DO after it's been wiped. Add a 'deleted' key to ingestMeta that is
set after clear() and checked at the top of ingest() and alarm() to
silently reject work on deleted sessions.
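The tombstone guard can be sketched as follows; storage is simplified to a `Map` (the real DO keeps `ingestMeta` in SQLite), and the class name is illustrative:

```typescript
// clear() marks the session deleted; ingest() (and, in the DO, alarm())
// checks the mark first, so queued work racing a deletion cannot repopulate
// the session.
export class SessionLifecycle {
  private meta = new Map<string, string>();

  clear(): void {
    // ...wipe rows and R2 blobs...
    this.meta.set('deleted', '1');
  }

  ingest(_items: unknown[]): boolean {
    if (this.meta.get('deleted') === '1') return false; // silently reject
    // ...apply items...
    return true;
  }
}
```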
Previously, a missing R2 blob during getAllStream() export silently fell
back to `{}`, producing semantically corrupted snapshot data with no
diagnostic signal. Add console.error before the fallback.
@iscekic iscekic requested a review from eshurakov March 6, 2026 17:36
iscekic added 2 commits March 6, 2026 18:47
…batch export queries

- ingest() now deletes caller-uploaded R2 blobs when session is deleted,
  preventing orphaned oversized item blobs after clear()
- Export cursor queries batched with LIMIT 30 instead of LIMIT 1,
  reducing query count ~30x for large sessions
- Fix R2-backed message test: r2References key was 'message:msg_r2'
  but getItemIdentity() produces 'message/msg_r2', so the R2 export
  path was never exercised
- Fix misleading comment in cloud-agent-next about "avoids buffering"
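The batched cursor can be sketched as a generator keyed by the last id seen; the `query` callback stands in for the DO's SQLite access, and all names are illustrative:

```typescript
// Pull rows 30 at a time instead of one LIMIT 1 query per row, cutting the
// query count ~30x for large sessions while still holding only one batch
// in memory.
const EXPORT_BATCH_SIZE = 30;

export function* batchedRows<T extends { id: string }>(
  query: (afterId: string | null, limit: number) => T[],
): Generator<T> {
  let after: string | null = null;
  for (;;) {
    const rows = query(after, EXPORT_BATCH_SIZE);
    if (rows.length === 0) return;
    yield* rows;
    if (rows.length < EXPORT_BATCH_SIZE) return; // short batch means done
    after = rows[rows.length - 1].id;
  }
}
```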
@iscekic iscekic merged commit 31e449f into main Mar 9, 2026
13 checks passed
@iscekic iscekic deleted the worktree-session-ingest-streaming branch March 9, 2026 10:23
eshurakov added a commit that referenced this pull request Mar 9, 2026
…908)

## Summary

- Move snapshot download from worker memory to sandbox via `curl` — the
full JSON never materializes in the 128MB worker
- Replace worker-side diff extraction (`extractDiffsFromMessages`) and
application (`applySessionDiff`) with a sandbox-side Node script that
reads, deduplicates, and applies diffs directly on disk
- Refactor `restoreSessionSnapshot` to accept a file path (curl writes
directly) instead of a string payload
- Remove `resolve` and `zod` imports that were only used by deleted
methods

Depends on PR #884 which adds the streaming `/api/session/:id/export`
endpoint.

## Test plan

- [x] All 601 existing tests pass (updated 4 cold-start tests for new
exec-based flow)
- [x] Typecheck passes
- [ ] Integration test: cold-start resume with real session to verify
curl download + kilo import + diff application