feat: auth gate, rate limiting, and cost attribution #26

Open
stackbilt-admin wants to merge 1 commit into main from feat/auth-rate-limit

@stackbilt-admin
Member

Summary

Implements the three security-critical features from issue #18 (AgentRelay reference spec):

  • Rate limiting: Sliding window per-tenant rate limiter using RATELIMIT_KV. Tier-based limits per minute: free=20, hobby=60, pro=300, enterprise=1000. Requests over the limit get a 429 with a standard Retry-After header; X-RateLimit-* headers are sent on all responses.

  • Cost attribution: Per-tool credit costs, with quality multipliers for image_generate. Reserves quota via the edge-auth consumeQuota RPC before each tool call, then commits or refunds the reservation based on the outcome. Zero-cost tools (read-only lookups) skip quota enforcement entirely.

  • Scope enforcement: Mutation tools (LOCAL_MUTATION, EXTERNAL_MUTATION, DESTRUCTIVE) require the generate scope. tools/list filters the catalog to show only the tools the session's scopes allow, so API keys with read-only scopes cannot invoke mutations.
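
For reviewers, the scope rule above can be sketched roughly as follows. This is an illustration, not the code in src/gateway.ts: the three mutation level names and the generate scope come from the summary, while `READ_ONLY`, `ToolEntry`, `canInvoke`, and `listTools` are hypothetical names, and the sketch assumes non-mutation tools need no particular scope.

```typescript
// Risk levels: the three mutation levels are named in the PR summary;
// READ_ONLY is an assumed name for everything else.
type RiskLevel = "READ_ONLY" | "LOCAL_MUTATION" | "EXTERNAL_MUTATION" | "DESTRUCTIVE";

// Hypothetical catalog entry shape.
interface ToolEntry {
  name: string;
  risk: RiskLevel;
}

// Which risk levels require the 'generate' scope.
const REQUIRES_GENERATE: Record<RiskLevel, boolean> = {
  READ_ONLY: false,
  LOCAL_MUTATION: true,
  EXTERNAL_MUTATION: true,
  DESTRUCTIVE: true,
};

// Check applied on tools/call: mutation tools need the 'generate' scope.
function canInvoke(tool: ToolEntry, scopes: readonly string[]): boolean {
  if (!REQUIRES_GENERATE[tool.risk]) return true;
  return scopes.some((scope) => scope === "generate");
}

// Filter applied on tools/list: hide tools the session could not call anyway.
function listTools(catalog: readonly ToolEntry[], scopes: readonly string[]): ToolEntry[] {
  return catalog.filter((tool) => canInvoke(tool, scopes));
}
```

Using the same predicate for both tools/list and tools/call keeps the visible catalog and the enforced permissions from drifting apart.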

New files

  • src/rate-limiter.ts — Fixed-window rate limiter with KV TTL for auto-cleanup
  • src/cost-attribution.ts — Tool cost registry, quota reservation/settlement
  • test/rate-limiter.test.ts — 8 tests
  • test/cost-attribution.test.ts — 17 tests
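
As a rough picture of what a fixed-window counter with KV TTL cleanup looks like, here is a minimal sketch. It is not the contents of src/rate-limiter.ts. The tier limits and the Retry-After and X-RateLimit-Remaining header names come from this PR; `KvLike` (a stand-in for the KV binding's `get`/`put`), `checkRateLimit`, the key layout, and the X-RateLimit-Limit and X-RateLimit-Reset header names are assumptions.

```typescript
// Minimal stand-in for the subset of a KV namespace this sketch uses.
interface KvLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

type Tier = "free" | "hobby" | "pro" | "enterprise";

// Per-minute limits quoted in the PR summary.
const TIER_LIMITS: Record<Tier, number> = { free: 20, hobby: 60, pro: 300, enterprise: 1000 };
const WINDOW_SECONDS = 60;

interface RateLimitResult {
  allowed: boolean;
  headers: Record<string, string>;
}

async function checkRateLimit(
  kv: KvLike,
  tenantId: string,
  tier: Tier,
  nowMs: number,
): Promise<RateLimitResult> {
  const limit = TIER_LIMITS[tier];
  const nowSec = Math.floor(nowMs / 1000);
  const windowStart = nowSec - (nowSec % WINDOW_SECONDS);
  const resetAt = windowStart + WINDOW_SECONDS;
  const key = `rl:${tenantId}:${windowStart}`; // hypothetical key layout

  const used = Number((await kv.get(key)) ?? "0");
  const allowed = used < limit;
  if (allowed) {
    // Read-then-write is not atomic, so concurrent requests can undercount.
    // The TTL lets expired window keys disappear without a cleanup job.
    await kv.put(key, String(used + 1), { expirationTtl: WINDOW_SECONDS * 2 });
  }

  const remaining = Math.max(0, limit - used - (allowed ? 1 : 0));
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(remaining),
    "X-RateLimit-Reset": String(resetAt),
  };
  if (!allowed) {
    headers["Retry-After"] = String(Math.max(1, resetAt - nowSec));
  }
  return { allowed, headers };
}
```

Passing `nowMs` in rather than reading the clock inside the function keeps the window arithmetic easy to unit test.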

Modified files

  • src/gateway.ts — Wire rate limiting after auth, quota reserve before tool call, scope check on tools/call
  • src/types.ts — Add checkQuota, consumeQuota, commitOrRefundQuota to AuthServiceRpc; add RATELIMIT_KV to GatewayEnv
  • wrangler.toml — Add RATELIMIT_KV namespace binding
  • All test files — Updated mocks to include quota methods and RATELIMIT_KV
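
The reserve-then-settle wiring in src/gateway.ts can be pictured with the sketch below. Only the method names consumeQuota and commitOrRefundQuota come from this PR; their signatures here, the `QuotaRpc` interface, and the `callWithQuota` helper are assumptions made for illustration.

```typescript
// Hypothetical signatures; only the two method names come from the PR.
interface QuotaRpc {
  consumeQuota(tenantId: string, credits: number): Promise<{ ok: boolean; reservationId: string }>;
  commitOrRefundQuota(reservationId: string, outcome: "commit" | "refund"): Promise<void>;
}

// Reserve credits, run the tool, then settle the reservation by outcome.
async function callWithQuota<T>(
  rpc: QuotaRpc,
  tenantId: string,
  cost: number,
  run: () => Promise<T>,
): Promise<T> {
  // Zero-cost tools skip quota enforcement entirely.
  if (cost === 0) return run();

  const reservation = await rpc.consumeQuota(tenantId, cost);
  if (!reservation.ok) throw new Error("quota exceeded");

  let result: T;
  try {
    result = await run();
  } catch (err) {
    // The tool failed, so hand the reserved credits back.
    await rpc.commitOrRefundQuota(reservation.reservationId, "refund");
    throw err;
  }
  // The tool succeeded, so make the reservation permanent.
  await rpc.commitOrRefundQuota(reservation.reservationId, "commit");
  return result;
}
```

The commit sits outside the `try` block so that a failed commit is not mistaken for a tool failure and followed by a refund.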

Test plan

  • 25 new tests pass (rate-limiter: 8, cost-attribution: 17)
  • All 116 non-oauth tests pass (0 regressions)
  • TypeScript compiles cleanly (tsc --noEmit)
  • Pre-existing oauth-handler.test.ts failures are unchanged (6 tests, same on main)
  • Integration test: deploy to staging, send 21 requests with free-tier key, verify 429 on 21st
  • Integration test: verify X-RateLimit-Remaining header decrements
  • Integration test: call image_generate with ultra_plus quality, verify 40-credit cost
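
To make the 40-credit expectation in the last item concrete, here is a sketch of how a base cost and a quality multiplier might combine. The numbers are placeholders: only the 40-credit result for image_generate at ultra_plus quality comes from this PR. The base cost of 10, the multiplier of 4, the `standard` quality, the `catalog_lookup` tool, and the `toolCost` function are all assumptions, and the actual registry in src/cost-attribution.ts may use different values.

```typescript
// Placeholder values chosen so that image_generate at ultra_plus costs 40.
// Only that 40-credit total is taken from the PR's test plan.
const BASE_COST: Record<string, number> = {
  image_generate: 10, // assumed base cost
  catalog_lookup: 0,  // hypothetical zero-cost, read-only tool
};

const QUALITY_MULTIPLIER: Record<string, number> = {
  standard: 1,   // assumed
  ultra_plus: 4, // assumed
};

// Credit cost = base cost of the tool x multiplier of the requested quality.
function toolCost(tool: string, quality?: string): number {
  const base = BASE_COST[tool];
  if (base === undefined) throw new Error(`unknown tool: ${tool}`);
  const multiplier = quality === undefined ? 1 : QUALITY_MULTIPLIER[quality];
  if (multiplier === undefined) throw new Error(`unknown quality: ${quality}`);
  return base * multiplier;
}
```

With these placeholder values, `toolCost("image_generate", "ultra_plus")` returns 40 and `toolCost("catalog_lookup")` returns 0, which is the case where quota enforcement would be skipped.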

Closes #18

🤖 Generated with Claude Code

- Rate limiter: sliding window per-tenant using RATELIMIT_KV with
  tier-based limits (free=20/min, hobby=60, pro=300, enterprise=1000).
  Returns 429 with Retry-After and X-RateLimit-* headers.

- Cost attribution: per-tool credit costs with quality multipliers
  for image_generate. Reserves quota via edge-auth consumeQuota RPC
  before tool call, settles (commit/refund) after based on outcome.
  Free tools (read-only, zero cost) skip quota enforcement.

- Scope enforcement: mutation tools require 'generate' scope.
  tools/list filters catalog to match session scopes.

- AuthServiceRpc extended with checkQuota, consumeQuota, and
  commitOrRefundQuota methods matching edge-auth's entrypoint.

- All existing tests updated with new mocks; 25 new tests added.

Closes #18

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

feat: implement auth, rate limiting, and cost attribution from AgentRelay reference spec