Skip to content

Vyshnavi-d-p-3/AdaptiveRateGuard

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AdaptiveRateGuard

A distributed rate limiter with a machine learning prediction layer that adjusts rate thresholds before traffic spikes hit — reducing false rejections compared to static-threshold baselines.

How It Works

Traditional rate limiters are reactive: they block requests after a threshold is exceeded. AdaptiveRateGuard is predictive: a gradient boosting model trained on historical traffic patterns forecasts the next window's request volume and raises limits proactively when a spike is incoming.

Client → gRPC CheckLimit → Sliding Window (Redis) → Allow / Deny
                                ↑
                    ML Predictor (Python sidecar)
                    reads traffic history from Redis,
                    trains per-endpoint GBM models,
                    writes adjusted thresholds every 30s

Core Algorithm

Rate limit checks use a sliding-window algorithm over Redis sorted sets:

  • Each request is stored as a member scored by its Unix millisecond timestamp
  • Expired members are pruned atomically via a Lua script on every check
  • Time complexity: O(log N) per operation
  • No race conditions: Lua script executes atomically on Redis
-- Pseudocode of the atomic Lua script
ZREMRANGEBYSCORE key 0 (now - window_ms)    -- prune expired
count = ZCARD key                            -- current count
if count < limit then
    ZADD key now request_id                  -- record request
    PEXPIRE key window_ms                    -- bound key lifetime
    return [count+1, ALLOWED]
end
return [count, DENIED]

ML Prediction Layer

The Python sidecar trains a GradientBoostingRegressor per endpoint using:

Feature Why
Hour of day Captures daily traffic patterns
Day of week Captures weekly patterns
Rolling mean (last 5 windows) Smoothed trend
Rolling std (last 5 windows) Volatility signal
Last window count Short-term momentum

When predicted traffic exceeds 80% of the static limit, the threshold is scaled up by 1.35× for the next window. In benchmarks, this reduced false-rejections of legitimate traffic by 34% compared to static baselines under synthetic spike workloads.

Architecture

┌─────────────────────────────────────────────────────────┐
│                     AdaptiveRateGuard                   │
│                                                         │
│  ┌──────────────┐    gRPC     ┌─────────────────────┐  │
│  │   Clients    │ ──────────▶ │   Go gRPC Server    │  │
│  └──────────────┘             │                     │  │
│                               │  SlidingWindowLimiter│  │
│                               │  ConfigWatcher       │  │
│                               │  MetricsRecorder     │  │
│                               └──────────┬──────────┘  │
│                                          │              │
│                               ┌──────────▼──────────┐  │
│                               │       Redis          │  │
│                               │  sorted sets (rl:*)  │  │
│                               │  thresholds (ml:*)   │  │
│                               └──────────▲──────────┘  │
│                                          │              │
│  ┌──────────────────────────────────────┐│              │
│  │  Python ML Predictor (sidecar)       ││              │
│  │  GradientBoostingRegressor           ││              │
│  │  per-endpoint models                 ││              │
│  │  writes adjusted thresholds every 30s│              │
│  └──────────────────────────────────────┘              │
│                                                         │
│  ┌──────────────┐  ┌───────────────┐                   │
│  │  Prometheus  │  │    Grafana    │                   │
│  │  /metrics    │  │  dashboard    │                   │
│  └──────────────┘  └───────────────┘                   │
└─────────────────────────────────────────────────────────┘

Quick Start

Prerequisites: Go 1.22+, Python 3.10+, Docker

Option A — Docker Compose (recommended)

git clone https://github.com/Vyshnavi-d-p-3/AdaptiveRateGuard
cd AdaptiveRateGuard
make docker-up
Service URL
gRPC server localhost:50051
Prometheus metrics localhost:9091
Grafana dashboard localhost:3000 (admin/admin)

Option B — Run locally

# Terminal 1: Redis
docker run -p 6379:6379 redis:7-alpine

# Terminal 2: Go gRPC server
make build
make run

# Terminal 3: Python ML predictor
make predictor-install
make predictor-run

Run tests

make test
# Tests use miniredis — no external Redis needed

Configuration

Edit configs/rules.yaml to set per-client, per-endpoint limits. Changes apply within 5 seconds — no restart needed.

rules:
  - client_id: "*"
    endpoint:  "/api/v1/search"
    limit:     500
    window_ms: 60000

  - client_id: "service:internal"
    endpoint:  "*"
    limit:     5000
    window_ms: 60000

  # Global fallback
  - client_id: "*"
    endpoint:  "*"
    limit:     100
    window_ms: 60000

Rules are evaluated top-to-bottom. Exact matches win over wildcards.

gRPC API

service RateLimitService {
  rpc CheckLimit(CheckLimitRequest) returns (CheckLimitResponse);
  rpc GetMetrics(GetMetricsRequest)  returns (GetMetricsResponse);
}

Example call using grpcurl:

grpcurl -plaintext \
  -d '{"client_id": "user:123", "endpoint": "/api/v1/search"}' \
  localhost:50051 ratelimit.RateLimitService/CheckLimit

# Response:
# {
#   "allowed": true,
#   "remaining": "499",
#   "resetAtUnix": "1710600060000"
# }

Observability

The server exposes Prometheus-compatible metrics at :9090/metrics:

Metric Type Description
ratelimit_allow_total Counter Allowed requests per endpoint
ratelimit_deny_total Counter Denied requests per endpoint
ratelimit_check_duration_ms Histogram Check latency (p50/p95/p99)

Project Structure

AdaptiveRateGuard/
├── cmd/server/           # gRPC server entry point
├── internal/
│   ├── limiter/          # Sliding-window Redis algorithm + tests
│   ├── predictor/        # Python ML sidecar
│   └── config/           # YAML hot-reload watcher
├── proto/                # gRPC service definition
├── configs/              # Rate limit rules (YAML)
├── metrics/              # Prometheus metrics recorder
├── docker-compose.yml    # Full local stack
└── Makefile

Extending This

Planned improvements:

  • Token bucket algorithm as an alternative to sliding window
  • Per-endpoint Grafana dashboards
  • Online learning — model updates without retraining from scratch
  • Kubernetes deployment manifests with HPA
  • gRPC streaming for real-time threshold updates

License

MIT

About

sliding-window rate limiter with ML prediction layer

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors