feat(ai-partner): Cloudflare Worker AI proxy (#1450) #1478
Merged
New top-level ai-proxy/ directory with the Worker that sits between the
mobile client and Anthropic/OpenAI. Co-located with the rest of the repo
so the client and proxy evolve together.
Endpoints:
- GET /ai/health — liveness; returns {status, version}
- POST /ai/embed — OpenAI text-embedding-3-small, 1536-dim vector
- POST /ai/chat — streaming Anthropic SSE with gap-signal detection
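A minimal sketch of how the three endpoints above could be dispatched. The names `ROUTES` and `matchRoute` are hypothetical, and the 404/405 fallbacks are assumptions rather than behavior confirmed by this PR.

```typescript
// Hypothetical route table for the three endpoints listed above.
type RouteName = "health" | "embed" | "chat";

const ROUTES: Record<string, { method: string; name: RouteName }> = {
  "/ai/health": { method: "GET", name: "health" },
  "/ai/embed": { method: "POST", name: "embed" },
  "/ai/chat": { method: "POST", name: "chat" },
};

type RouteResult =
  | { ok: true; name: RouteName }
  | { ok: false; status: 404 | 405 };

// Resolve a method + pathname pair to a route name.
// Assumption: unknown paths yield 404 and wrong methods yield 405.
function matchRoute(method: string, pathname: string): RouteResult {
  const route = ROUTES[pathname];
  if (!route) return { ok: false, status: 404 };
  if (route.method !== method.toUpperCase()) return { ok: false, status: 405 };
  return { ok: true, name: route.name };
}
```

In a Worker, the `fetch` handler would call something like `matchRoute(request.method, new URL(request.url).pathname)` before running auth and rate limiting.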
Auth: RevenueCat receipt validated with 5-min KV cache keyed on the
SHA-256 of the receipt; 401 / 402 / 503 classified by failure kind.
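The auth flow could be sketched roughly as follows. The failure-kind names, their mapping to status codes, and the `auth:` key prefix are all assumptions for illustration; the commit message only states that 401 / 402 / 503 are chosen by failure kind and that the cache lives for 5 minutes. Hashing the receipt is left out; the sketch takes the hex digest as input.

```typescript
// Hypothetical failure kinds; the real classification may differ.
type AuthFailure = "invalid_receipt" | "no_entitlement" | "upstream_unavailable";

// Assumed mapping: bad receipt -> 401, no paid entitlement -> 402,
// RevenueCat unreachable -> 503.
function statusForFailure(kind: AuthFailure): 401 | 402 | 503 {
  switch (kind) {
    case "invalid_receipt":
      return 401;
    case "no_entitlement":
      return 402;
    case "upstream_unavailable":
      return 503;
  }
}

const CACHE_TTL_SECONDS = 5 * 60; // 5-minute cache, per the commit message

interface CachedValidation {
  entitlement: "premium" | "partner_plus";
  storedAtMs: number;
}

// Cache key derived from the receipt's SHA-256 hex digest, so the raw
// receipt never appears in KV. The "auth:" prefix is hypothetical.
function cacheKey(receiptSha256Hex: string): string {
  return `auth:${receiptSha256Hex}`;
}

// True while a cached validation is still inside the TTL window.
function isFresh(entry: CachedValidation, nowMs: number): boolean {
  return nowMs - entry.storedAtMs < CACHE_TTL_SECONDS * 1000;
}
```

With Workers KV, the freshness check would more likely be delegated to the `expirationTtl` option on `put`, so an expired entry simply reads back as `null`.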
Rate limit: per-user monthly + 10-minute burst counters in KV.
- premium: 300/month + 10/burst
- partner_plus: 1,500/month + 30/burst
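One way the two counters could be evaluated, as a sketch only. It assumes fixed (not sliding) windows aligned to UTC, a hypothetical `rl:` key layout, and that the monthly check runs before the burst check; none of these details are stated in the commit message. The limits themselves are taken from the list above.

```typescript
type Tier = "premium" | "partner_plus";

const LIMITS: Record<Tier, { monthly: number; burst: number }> = {
  premium: { monthly: 300, burst: 10 },
  partner_plus: { monthly: 1500, burst: 30 },
};

const BURST_WINDOW_MS = 10 * 60 * 1000; // 10-minute burst window

// Hypothetical KV key layout: one counter per user per month, one per window.
function counterKeys(userHash: string, nowMs: number): { monthly: string; burst: string } {
  const d = new Date(nowMs);
  const month = `${d.getUTCFullYear()}-${String(d.getUTCMonth() + 1).padStart(2, "0")}`;
  const windowIndex = Math.floor(nowMs / BURST_WINDOW_MS);
  return {
    monthly: `rl:${userHash}:m:${month}`,
    burst: `rl:${userHash}:b:${windowIndex}`,
  };
}

type Decision =
  | { allowed: true }
  | { allowed: false; status: 429; retry_after: number; scope: "monthly" | "burst" };

// Decide from already-read counter values; retry_after is in seconds
// until the exhausted window rolls over.
function decide(tier: Tier, used: { monthly: number; burst: number }, nowMs: number): Decision {
  const limit = LIMITS[tier];
  if (used.monthly >= limit.monthly) {
    const d = new Date(nowMs);
    const nextMonthMs = Date.UTC(d.getUTCFullYear(), d.getUTCMonth() + 1, 1);
    return { allowed: false, status: 429, retry_after: Math.ceil((nextMonthMs - nowMs) / 1000), scope: "monthly" };
  }
  if (used.burst >= limit.burst) {
    const windowEndMs = (Math.floor(nowMs / BURST_WINDOW_MS) + 1) * BURST_WINDOW_MS;
    return { allowed: false, status: 429, retry_after: Math.ceil((windowEndMs - nowMs) / 1000), scope: "burst" };
  }
  return { allowed: true };
}
```

Because KV is eventually consistent, counters kept this way are approximate; the sketch deliberately separates the pure decision from the KV reads and writes.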
Anthropic client: zero-retention header, server-side system prompt
assembly, partner_plus always routes to Sonnet regardless of client
hint.
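The routing rule and server-side prompt assembly might look like the sketch below. The family names are placeholders rather than real model IDs, the `"haiku"` default for premium users is an assumption, and the prompt layout (`BASE_PROMPT`, section labels, chunk numbering) is purely illustrative.

```typescript
type Entitlement = "premium" | "partner_plus";
// Placeholder family names; real model IDs are not given in this PR.
type ModelFamily = "sonnet" | "haiku";

// partner_plus always gets Sonnet, whatever the client asked for.
// Assumption: premium honors the hint and otherwise defaults to "haiku".
function selectModel(entitlement: Entitlement, clientHint?: ModelFamily): ModelFamily {
  if (entitlement === "partner_plus") return "sonnet";
  return clientHint ?? "haiku";
}

// Hypothetical base instruction; the real prompt lives only on the server.
const BASE_PROMPT = "You are the in-app AI partner.";

// Assemble the system prompt on the server so the client never supplies it.
function buildSystemPrompt(profileSummary: string, chunks: string[]): string {
  const context = chunks.map((chunk, i) => `[${i + 1}] ${chunk}`).join("\n");
  return [
    BASE_PROMPT,
    `User profile:\n${profileSummary}`,
    `Reference material:\n${context}`,
  ].join("\n\n");
}
```

Keeping both functions pure makes the "client hint is ignored for partner_plus" rule easy to pin down in a unit test.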
Gap detection: parses the trailing {"gap": ...} JSON envelope from the
streamed response; writes a stub D1 row + structured log today. #1471
replaces the stub with full D1 schema + semantic dedup + GitHub sync.
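A sketch of how the trailing envelope could be split off once the stream has been fully accumulated. It assumes the envelope is the last thing in the text and begins exactly with `{"gap"`; the function name and return shape are hypothetical.

```typescript
interface GapEnvelope {
  gap: unknown;
  [key: string]: unknown;
}

// Split accumulated response text into the visible body and the trailing
// gap envelope, if one parses cleanly. Otherwise the text is left untouched.
function splitGapEnvelope(fullText: string): { body: string; envelope: GapEnvelope | null } {
  const start = fullText.lastIndexOf('{"gap"');
  if (start === -1) return { body: fullText, envelope: null };
  try {
    const parsed: unknown = JSON.parse(fullText.slice(start).trim());
    if (typeof parsed === "object" && parsed !== null && "gap" in parsed) {
      return {
        body: fullText.slice(0, start).replace(/\s+$/, ""),
        envelope: parsed as GapEnvelope,
      };
    }
  } catch {
    // Malformed or non-trailing JSON: treat as ordinary response text.
  }
  return { body: fullText, envelope: null };
}
```

For example, `splitGapEnvelope('Answer.\n{"gap": true}')` yields the body `"Answer."` plus the parsed envelope, while text with no envelope comes back unchanged with `envelope: null`.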
Logging: metadata-only — user_hash, endpoint, status, latency,
entitlement. Never logs request bodies, response text, retrieved
chunks, or profile summaries.
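The metadata-only rule can be enforced structurally by copying an explicit allow-list of fields, as in this sketch. The `latency_ms` field name and the shape of the input context are assumptions; the five logged fields come from the commit message.

```typescript
interface LogEntry {
  user_hash: string;
  endpoint: string;
  status: number;
  latency_ms: number;
  entitlement: string;
}

interface RequestContext {
  userHash: string;
  endpoint: string;
  status: number;
  startedAtMs: number;
  finishedAtMs: number;
  entitlement: string;
}

// Build the log entry field by field so nothing else on the context
// (bodies, response text, chunks, profile summaries) can leak into logs.
function buildLogEntry(ctx: RequestContext): LogEntry {
  return {
    user_hash: ctx.userHash,
    endpoint: ctx.endpoint,
    status: ctx.status,
    latency_ms: ctx.finishedAtMs - ctx.startedAtMs,
    entitlement: ctx.entitlement,
  };
}
```

The entry would then be emitted with something like `console.log(JSON.stringify(buildLogEntry(ctx)))`.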
Tests: 37 Vitest tests across auth, rate-limit, gap-detection, and
Anthropic client. All run offline via fetch interception + in-memory
KV stub.
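An in-memory KV stub of the kind mentioned above might look like this. It is a sketch covering only string `get`, `put` with `expirationTtl`, and `delete`; the class name and the injectable clock are assumptions, and fetch interception is not shown.

```typescript
// Minimal in-memory stand-in for a Workers KV namespace (string values only).
class MemoryKV {
  private store = new Map<string, { value: string; expiresAtMs: number | null }>();

  // The clock is injectable so tests can advance time without sleeping.
  constructor(private now: () => number = Date.now) {}

  async get(key: string): Promise<string | null> {
    const hit = this.store.get(key);
    if (!hit) return null;
    if (hit.expiresAtMs !== null && this.now() >= hit.expiresAtMs) {
      this.store.delete(key); // lazily evict expired entries
      return null;
    }
    return hit.value;
  }

  // expirationTtl is in seconds, mirroring the KV put option of the same name.
  async put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void> {
    const ttl = opts?.expirationTtl;
    const expiresAtMs = ttl === undefined ? null : this.now() + ttl * 1000;
    this.store.set(key, { value, expiresAtMs });
  }

  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }
}
```

A test can construct `new MemoryKV(() => fakeNow)` and move `fakeNow` forward to exercise the 5-minute auth cache and the rate-limit windows offline.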
https://claude.ai/code/session_01Pht3kzgdvkn81DDfL9SnFe
Test results: ✅ All tests passed (duration: 77.8s)
Closes #1450. Phase 1 of epic #1446.
Summary
New ai-proxy/ directory containing the Cloudflare Worker that handles RevenueCat auth, rate limiting, zero-retention enforcement, streaming pass-through to Anthropic, and gap-signal detection for the downstream corpus-gap pipeline (#1471).
Endpoints
- /ai/health
- /ai/embed: text-embedding-3-small → 1536 floats
- /ai/chat
Key behaviors
- Rate limit: premium 300 + 10, partner_plus 1500 + 30. 429 with retry_after and optional upgrade_url for premium users.
- Zero retention: anthropic-no-retention: true header set on every outbound call.
- Anthropic client: partner_plus always routes to Sonnet regardless of client hint. System prompt assembled server-side from profile + chunks.
- Gap detection: parses the {"gap": true, ...} envelope, writes a stub D1 row + structured log. #1471 (ai-partner: corpus gap capture (D1 + proxy detection + GitHub sync)) replaces the stub with full persistence, semantic dedup, and GitHub sync.
Test plan
- npm install and npx tsc --noEmit: strict TypeScript passes
- npm test: 37 Vitest tests pass
- wrangler dev smoke: reviewer to verify once a dev KV namespace is wired up; tests cover the code paths wrangler would exercise
Out of scope
- wrangler deploy --env production waits until the rest of Phase 1 lands
https://claude.ai/code/session_01Pht3kzgdvkn81DDfL9SnFe