Closed
Description
Summary
Implement distributed rate limiting using Redis to ensure rate limits are enforced globally across all proxy instances, not just per-instance.
Priority: 🔴 Critical (Priority 1)
Impact: MEDIUM-HIGH - Rate limiting is per-instance, not global
Effort Estimate: 1-2 weeks
Current State
- Rate limiting is per-token, per-instance
- Each proxy instance has its own counters
- Multi-instance deployments can exceed intended rate limits
- No shared state across instances
Problem
In multi-instance deployments:
- Each instance tracks rate limits independently
- Total allowed requests per token can reach N × limit (where N = number of instances)
- No coordination between instances
- Rate limits are effectively meaningless in scaled deployments
Workaround (Current)
- Deploy single instance only
- Use sticky sessions to route same token to same instance
- Set conservative rate limits accounting for multiple instances
Proposed Solution
- Implement Redis-backed rate limit counters
- Add distributed locking for atomic increments
- Update tests for distributed scenario
- Document configuration and deployment
Acceptance Criteria
- Redis-backed rate limit counters implemented
- Distributed locking for atomic operations
- Rate limits enforced globally across all instances
- Graceful fallback if Redis is unavailable
- Configuration options for Redis connection
- Tests for distributed rate limiting
- Documentation updated
- Performance benchmarks showing minimal overhead
Related Issues
- Related to [db] Phase 5: PostgreSQL support (opt-in over SQLite) #57 (PostgreSQL support - production deployments need distributed rate limiting)
- Related to brownfield architecture documentation
References
- Technical Debt Register: docs/technical-debt.md lines 75-101
- Brownfield Architecture: docs/brownfield-architecture.md "Rate Limiting" section
- PLAN.md line 347
Location
- internal/token/ratelimit.go - Current rate limiting implementation
- internal/eventbus/ - Redis connection management (can be reused)
Note: This is a parent issue. Sub-issues will be created for specific implementation tasks.