rate-limiter

A pluggable, async-native rate-limiting library for Python. Supports token bucket, fixed window, and sliding window log algorithms across in-memory and Redis backends, with a drop-in ASGI middleware for FastAPI/Starlette.

Features

3 algorithms — fixed window, sliding window log, token bucket — selectable at construction time
2 backends — in-memory (single-process, ~70k ops/s) and Redis (distributed, ~29k ops/s via atomic Lua scripts)
ASGI middleware — drop into any FastAPI/Starlette app with X-RateLimit-* headers, 429 responses, and Retry-After
Async-native — built on asyncio and redis-py async; no blocking calls
Bring-your-own Redis — the caller owns the redis.asyncio.Redis instance and its lifecycle
Configurable failure policy — fail-open (favour availability) or fail-closed (favour safety)
Fully typed — Pydantic models, mypy-clean

Algorithm trade-offs

Algorithm	Accuracy	Redis memory	Redis ops	Best for
Fixed window	Can allow 2x burst at window boundaries	O(1) — 64 B/key	`INCR` + `EXPIRE`	Simple rate caps where boundary bursts are acceptable
Sliding window log	Exact — no boundary bursts	O(N) — grows with `max_requests`	`ZADD` + `ZREMRANGEBYSCORE` + `ZCARD`	Strict per-client fairness
Token bucket	Smooth — allows controlled bursts up to bucket capacity	O(1) — 141 B/key	`HGET` + `HSET` + `TIME`	APIs that want to allow short bursts while enforcing an average rate

Quick start

Installation

pip install async-ratelimit

In-memory rate limiting

import asyncio
from rate_limiter import RateLimiterOrchestrator, AlgorithmType, RateLimiterType

limiter = RateLimiterOrchestrator(
    rate_limiter_type=RateLimiterType.IN_MEMORY,
    algorithm_type=AlgorithmType.TOKEN_BUCKET,
    max_requests=100,       # 100 requests
    time_window=60,         # per 60-second window
)

async def main():
    response = await limiter.get_response(uId="user-123")
    print(response.allowed)             # True
    print(response.remaining_requests)  # 99
    print(response.reset_time)          # seconds until the bucket refills

asyncio.run(main())

Distributed rate limiting with Redis

from redis.asyncio import Redis

redis_client = Redis(host="localhost", port=6379, decode_responses=True)

limiter = RateLimiterOrchestrator(
    rate_limiter_type=RateLimiterType.REDIS,
    algorithm_type=AlgorithmType.SLIDING_WINDOW,
    max_requests=1000,
    time_window=3600,
    redis_client=redis_client,      # bring your own client
)

# Use it the same way — Redis backend is transparent
response = await limiter.get_response(uId="user-123")

# Clean up when done
await redis_client.aclose()

FastAPI middleware

from fastapi import FastAPI
from redis.asyncio import Redis
from rate_limiter import RateLimiterOrchestrator, RateLimiterMiddleware, AlgorithmType, RateLimiterType

app = FastAPI()

redis_client = Redis(host="localhost", port=6379, decode_responses=True)
limiter = RateLimiterOrchestrator(
    rate_limiter_type=RateLimiterType.REDIS,
    algorithm_type=AlgorithmType.TOKEN_BUCKET,
    max_requests=10,
    time_window=60,
    redis_client=redis_client,
)

app.add_middleware(
    RateLimiterMiddleware,
    limiter=limiter,
    key_func=lambda r: r.headers.get("X-API-Key") or r.client.host,
    exclude_routes=["/health", "/docs"],
    fail_open=True,     # let requests through if the limiter errors
)

@app.get("/ping")
async def ping():
    return {"message": "pong"}

Every response includes standard rate-limit headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 45

When the limit is exceeded, the middleware returns 429 Too Many Requests with a Retry-After header — the downstream handler is never invoked.

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                     RateLimiterOrchestrator                      │
│  (public API — validates config, delegates to backend)           │
├──────────────┬───────────────────────────────────────────────────┤
│              │                                                   │
│  ┌───────────▼──────────┐       ┌───────────────────────────┐   │
│  │  InMemoryRateLimiter │       │    RedisRateLimiter       │   │
│  │  (asyncio.Lock)      │       │    (BYO redis client)     │   │
│  └───────────┬──────────┘       └───────────┬───────────────┘   │
│              │                              │                    │
│              ▼                              ▼                    │
│     AlgorithmFactory.create()       AlgorithmFactory.create()    │
│              │                              │                    │
│   ┌──────────┼──────────┐       ┌───────────┼──────────┐        │
│   │          │          │       │           │          │         │
│   ▼          ▼          ▼       ▼           ▼          ▼         │
│ FixedW   SlidingW   TokenB   FixedW    SlidingW    TokenB       │
│ InMem    InMem      InMem    InRedis   InRedis     InRedis      │
│ (dict)   (dict)     (dict)   (Lua)     (Lua)       (Lua)        │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│                    RateLimiterMiddleware                          │
│  (ASGI — key_func, exclude_routes, fail_open/closed)             │
│  Wraps RateLimiterOrchestrator, adds X-RateLimit-* headers       │
└──────────────────────────────────────────────────────────────────┘

Design decisions:

Strategy + Factory pattern — algorithms are interchangeable behind a common Algorithm ABC. The factory dispatches on (backend, algorithm) to the correct implementation.
Bring-your-own Redis — the library never creates or configures a Redis connection. The caller passes in a redis.asyncio.Redis instance and owns its lifecycle (aclose()). This avoids env-var coupling and lets the caller control connection pooling, TLS, sentinels, etc.
Atomic Lua scripts — all Redis algorithms use EVAL/EVALSHA to run their logic server-side in a single atomic operation. No distributed locks needed.

Redis Lua scripts

Each Redis algorithm runs as a single Lua script executed atomically by Redis:

Algorithm	Script	Operations	Atomicity guarantee
Fixed window	`INCR` → conditional `EXPIRE` → `TTL`	Increment counter, set TTL on first request	Counter + expiry can never diverge
Sliding window	`ZREMRANGEBYSCORE` → `ZCARD` → `ZADD` → `EXPIRE`	Prune expired entries, count, add if under limit	No request can slip between prune and count
Token bucket	`TIME` → `HGET` → refill math → `HSET` → `EXPIRE`	Server-authoritative clock, lazy token refill	Refill + consume is one atomic step

Benchmarks

Microbenchmark results on Apple Silicon (M-series), Python 3.14.2, Redis 7 on localhost. Full methodology and commands in BENCHMARKS.md.

Throughput (ops/s) — hot key, 1 client

Backend	Algorithm	Conc=1	Conc=10	Conc=50	Conc=100	Conc=250
in_memory	fixed_window	68,404	69,823	67,906	66,705	59,387
in_memory	sliding_window	57,341	59,153	60,474	60,604	58,528
in_memory	token_bucket	70,934	69,237	71,112	69,173	69,721
redis	fixed_window	8,533	25,317	29,456	27,889	27,315
redis	sliding_window	7,631	21,407	24,718	25,011	23,390
redis	token_bucket	7,721	23,101	26,864	27,676	25,393

Latency at peak load (concurrency=250)

Backend	Algorithm	p50	p99	p99.9
in_memory	fixed_window	7.2 µs	24.2 µs	85.5 µs
in_memory	sliding_window	8.8 µs	9.4 µs	14.6 µs
in_memory	token_bucket	6.7 µs	7.1 µs	11.5 µs
redis	fixed_window	6.7 ms	25.0 ms	58.6 ms
redis	sliding_window	7.9 ms	27.7 ms	50.5 ms
redis	token_bucket	7.2 ms	28.1 ms	53.7 ms

Redis memory per client key

Algorithm	Memory	Data structure
Fixed window	64 B	String (counter) — O(1)
Token bucket	141 B	Hash (3 fields) — O(1)
Sliding window	6.4 MB	Sorted set (1 member/request) — O(N)

Key observations

In-memory throughput is ~8x higher than Redis (~70k vs ~8k ops/s at concurrency=1) because Redis operations pay a network round-trip (~100 µs) even on localhost.
Redis throughput scales 3x with concurrency (8k → 29k ops/s from c=1 → c=50) because asyncio coroutines overlap I/O waits. It plateaus as Redis's single-threaded command processing saturates.
In-memory throughput is flat across concurrency because asyncio is single-threaded — all coroutines serialize through asyncio.Lock.
Sliding window is the most memory-expensive at O(N) per client. With 50k max_requests, a single key uses 6.4 MB. Fixed window and token bucket use O(1) space regardless of request volume.

For the full benchmark suite, methodology, multi-client results, and chart generation, see BENCHMARKS.md.

Testing

# Run all tests (Redis required — via local instance, testcontainers, or REDIS_HOST env)
pytest -v

# In-memory tests only (no Redis needed)
pytest -v tests/test_fixed_window.py tests/test_sliding_window.py tests/test_token_bucket.py

# With Docker Compose (spins up Redis automatically)
docker compose up --build

The test suite includes:

Behavioral tests — allow/deny, window expiry, token refill, weight decay, reset
Redis integration tests — same behavioral coverage against a live Redis with Lua scripts
Concurrency tests — 50-way asyncio.gather across all algorithms
Middleware tests — 429 responses, X-RateLimit-* headers, Retry-After, fail-open/closed
Edge-case tests — zero/negative config validation

64 tests total, all passing in CI with a Redis service container.

Development

# Clone and install dev dependencies
git clone https://github.com/freq31/rate_limiter.git
cd rate_limiter
pip install -r requirements.txt

# Lint and type-check
ruff check .
black --check rate_limiter tests
mypy rate_limiter

# Run benchmarks
python -m benchmarks.run --ops 50000 --concurrency 1,10,50,100,250

# Generate charts
python -m benchmarks.plot benchmarks/results/<csv_file>.csv

Project structure

rate_limiter/
├── __init__.py                 # Public API re-exports
├── main.py                     # RateLimiterOrchestrator (public API) + Factory
├── app.py                      # FastAPI demo application
├── settings.py                 # Pydantic settings for the demo app
├── algorithms/
│   ├── base.py                 # Algorithm ABC + AlgorithmFactory
│   ├── fixed_window.py         # FixedWindowInMemory, FixedWindowInRedis
│   ├── sliding_window.py       # SlidingWindowInMemory, SlidingWindowInRedis
│   └── token_bucket.py         # TokenBucketInMemory, TokenBucketInRedis
├── backend/
│   ├── base.py                 # RateLimiter ABC
│   ├── memory.py               # InMemoryRateLimiter
│   ├── redis.py                # RedisRateLimiter (BYO client)
│   ├── middleware.py           # RateLimiterMiddleware (ASGI)
│   ├── request.py              # Rules, Client, AlgorithmType, RateLimiterType
│   └── response.py             # Response model
└── scripts/
    ├── fixed_window.py         # Lua: INCR + EXPIRE
    ├── sliding_window.py       # Lua: sorted set log
    └── token_bucket.py         # Lua: hash + TIME + refill math
tests/                          # 64 tests — behavioral, integration, concurrency, middleware
benchmarks/                     # Async microbenchmark harness + chart generation

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
rate_limiter		rate_limiter
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
BENCHMARKS.md		BENCHMARKS.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
requirements-test.txt		requirements-test.txt
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

rate-limiter

Features

Algorithm trade-offs

Quick start

Installation

In-memory rate limiting

Distributed rate limiting with Redis

FastAPI middleware

Architecture

Redis Lua scripts

Benchmarks

Throughput (ops/s) — hot key, 1 client

Latency at peak load (concurrency=250)

Redis memory per client key

Key observations

Testing

Development

Project structure

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

rate-limiter

Features

Algorithm trade-offs

Quick start

Installation

In-memory rate limiting

Distributed rate limiting with Redis

FastAPI middleware

Architecture

Redis Lua scripts

Benchmarks

Throughput (ops/s) — hot key, 1 client

Latency at peak load (concurrency=250)

Redis memory per client key

Key observations

Testing

Development

Project structure

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages