The fastest Python web framework: engineered for raw speed and concurrency, with Cython hot paths, an io_uring event loop, multi-process workers, and built-in TTL + single-flight caching.
Documentation · GitHub · Contributing
This repository is a uv workspace publishing three independent PyPI packages:
| Package | Purpose | Install |
|---|---|---|
| ember-api | Full framework (router, middleware, AI, sessions, CLI, workers) | `pip install ember-api` |
| ember-cache | TTL + single-flight cache, plus Redis & Memcached backends | `pip install ember-cache` |
| emberloop | io_uring event loop + Cython HTTP/1.1 protocol layer | `pip install emberloop` |
ember-api depends on the other two; installing it pulls them in automatically. All three live under the shared ember.* import namespace.
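Both imports below appear in the examples later in this README; they resolve from different PyPI packages but compose under the one namespace (which package ships which module is inferred from the table above):

```python
from ember import Ember           # provided by ember-api
from ember.cache import TTLCache  # provided by ember-cache (inferred)
# emberloop is not imported directly anywhere in these docs; it supplies
# the io_uring event loop and protocol layer that Ember runs on.
```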
| | Ember | FastAPI | Express | NestJS |
|---|---|---|---|---|
| Protocol | llhttp + Cython + io_uring | ASGI / uvicorn | Node.js http | Node.js http |
| Workers | Fork + SO_REUSEPORT | Single process | cluster | cluster |
| SSE streaming | Native, zero-copy | via Starlette | manual | manual |
| AI primitives | Built-in | none | none | none |
All numbers come from a single Intel i7-14700 box with a 0% error rate. Each framework was benchmarked in isolation with k6. Reproduce them yourself: Hello-world bench · CRUD bench.
Single worker, 200 virtual users, 20 seconds, no database.
| Framework | RPS | p50 (ms) | p99 (ms) | peak RSS |
|---|---|---|---|---|
| Fiber (Go) | 140,993 | 1.21 | 3.96 | 9 MB |
| Ember | 112,177 | 1.68 | 4.35 | 25 MB |
| Express (Node) | 26,357 | 7.09 | 13.57 | 131 MB |
| NestJS (Node) | 23,528 | 8.08 | 13.75 | 158 MB |
| FastAPI (Python) | 17,517 | 9.45 | 30.86 | 49 MB |
Ember is the only Python framework in this league: 6.4× FastAPI, 4.3× Express, 4.8× NestJS, and within 80% of Go Fiber's throughput. Idle RSS is ~22 MB, peak 25 MB, about half of FastAPI's and 5× lighter than the Node frameworks.
A more realistic test: each request hits a real PostgreSQL 16 database. The workload mix is 65% paginated list / 25% get-by-id / 10% create, sustained at 200 virtual users for 40 seconds on a single worker (`workers=1`), with 0% errors across all three frameworks. Ember's read routes use the built-in `TTLCache(ttl=1.0)` primitive (TTL caching + single-flight request coalescing); Express and FastAPI run their stock pool/handler code with no app-side caching.
| Framework | RPS | avg (ms) | p50 (ms) | p95 (ms) | p99 (ms) |
|---|---|---|---|---|---|
| Ember | 20,961 | 8.31 | 7 | 19 | 26 |
| Express | 7,233 | 24.12 | 26 | 38 | 51 |
| FastAPI | 1,932 | 90.35 | 80 | 195 | 275 |
Ember serves 2.9× the throughput of Express and 10.8× that of FastAPI on the same hardware, with a p99 tail 2× tighter than Express and 10× tighter than FastAPI. The decisive win is one line of code:
```python
from ember.cache import TTLCache

@app.get("/tasks", cache=TTLCache(ttl=1.0))
async def list_tasks(request):
    ...
```

That single argument enables TTL caching and single-flight request coalescing: when N concurrent users hit the same URL, exactly one request runs the handler and the rest receive the same response, collapsing thundering-herd reads onto a single PostgreSQL round trip.
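For intuition, here is a minimal, framework-independent sketch of the single-flight idea. It is not Ember's internal code, just the pattern the `cache=` argument applies per URL; `fetch_tasks` stands in for any coroutine function.

```python
import asyncio

# In-flight tasks, keyed by cache key (e.g. the request URL).
_inflight: dict[str, asyncio.Task] = {}

async def single_flight(key: str, fn):
    """Run fn() once per key; concurrent callers await the same task."""
    task = _inflight.get(key)
    if task is None:
        task = asyncio.ensure_future(fn())
        _inflight[key] = task
        # Forget the task once it settles so the next miss recomputes.
        task.add_done_callback(lambda _: _inflight.pop(key, None))
    return await task
```

Two hundred concurrent `await single_flight("/tasks", fetch_tasks)` calls execute `fetch_tasks()` exactly once; every caller receives the same result, which is what collapses the thundering herd onto one database round trip.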
- Cython hot paths: headers, router, request, response, and protocol all compiled to C
- Multi-process workers: fork-based with `SO_REUSEPORT`, kernel load-balancing
- `workers=1` in-process: no supervisor overhead when you don't need it (~22 MB saved)
- Tunable io_uring buffer pool: `Ember.run(io_uring_num_bufs=, io_uring_buf_size=)` to scale the pool's RAM down to the 2 MB default or up to 32 MB for high-concurrency workloads (see the sizing sketch after this list)
- Lazy AI / cache / middleware imports: `import ember` no longer pulls numpy/redis/memcached for plain HTTP apps
- AI-first routing: `@app.ai_route()` with streaming, tool calling, and conversation context
- SSE streaming: `SSEResponse`, `sse_stream()`, `TokenStreamResponse` for LLM token output
- Pluggable caching: `StaticCache`, `RedisCache`, `MemcachedCache` via a single decorator argument
- Token rate limiting: `TokenBucket`, `GlobalTokenBucket`, `RateLimitMiddleware`
- Model routing: `ModelRouter` with fallback, cost, and latency strategies
- Semantic cache: vector-search cache for AI responses
- Built-in middleware: CORS, Bearer auth, API key
- Blueprints: modular route groups with URL prefixes
- Cross-platform: Linux/macOS multi-process, Windows single-process fallback
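The buffer-pool knobs deserve a concrete example: pool RAM is roughly `num_bufs * buf_size`. The parameter names come from the list above; the particular counts and sizes below are illustrative assumptions, since only the 2 MB default and the 32 MB upper figure are documented.

```python
from ember import Ember

app = Ember()

# Pool RAM ≈ io_uring_num_bufs * io_uring_buf_size.
# Assumed split for the 32 MiB high-concurrency end: 2048 * 16 KiB.
# The 2 MB default would correspond to e.g. 512 * 4 KiB.
app.run(
    host="0.0.0.0",
    port=8000,
    io_uring_num_bufs=2048,
    io_uring_buf_size=16 * 1024,
)
```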
What's coming next: see the full roadmap for design notes and throughput targets.
- WebSockets + OpenTelemetry + Prometheus + access logs: `v0.2.0`, Q3 2026
- OpenAPI auto-gen + Pydantic v2 bodies + type stubs: `v0.3.0`, Q4 2026
- HTTP/2 multiplexing: `v0.4.0`, Q1 2027
- Free-threaded Python (PEP 703): `v0.5.0`, Q2 2027; target ~500k RPS / worker
- Prepare-chained `io_uring` fast path: `v0.6.0`, Q3 2027; 3× cached-route RPS, p99 < 1 ms
- High-performance Cython ORM: `v0.7.0`, Q4 2027; Postgres / MySQL / SQLite / MSSQL / Oracle / CockroachDB / Redis; target 80k RPS on `SELECT WHERE id=$1` (vs ~25k with `asyncpg`)
- Stable 1.0: Q1 2028; semver lock, signed releases, SBOM, multi-arch wheels (x86_64 / ARM64 / macOS, GIL + free-threaded `t` ABIs)
Industry adoption track: auth (OAuth2 / OIDC / JWT / mTLS), OpenTelemetry, Kubernetes probes & graceful drain, Helm chart, FastAPI/Flask migration guides, signed releases, SBOM, and a third-party security audit. Lands across v0.2 through v1.0; see the Industry Adoption Track.
```bash
pip install ember-api

# With all performance extras:
pip install "ember-api[fast]"   # uvloop + orjson
pip install "ember-api[cache]"  # Redis + Memcached backends
pip install "ember-api[all]"    # everything
```

Build Cython extensions from source (optional, for maximum speed):

```bash
pip install cython
python setup.py build_ext --inplace
```
```python
from ember import Ember

app = Ember()

@app.get("/")
async def index():
    return {"hello": "world"}

app.run(host="0.0.0.0", port=8000, workers=4)
```
```python
import uuid

from ember import Ember, Request, JSONResponse, RedisCache

app = Ember()
tasks: dict = {}

@app.get("/tasks", cache=RedisCache(ttl=30))
async def list_tasks(request: Request) -> JSONResponse:
    page = int(request.args.get("page", 1))
    limit = int(request.args.get("limit", 10))
    items = list(tasks.values())[(page - 1) * limit : page * limit]
    return JSONResponse({"tasks": items, "total": len(tasks)})

@app.get("/tasks/{task_id:str}")
async def get_task(request: Request, task_id: str) -> JSONResponse:
    task = tasks.get(task_id)
    if not task:
        return JSONResponse({"error": "not found"}, status_code=404)
    return JSONResponse(task)

@app.post("/tasks")
async def create_task(request: Request) -> JSONResponse:
    data = await request.json()
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"id": task_id, **data}
    return JSONResponse(tasks[task_id], status_code=201)

app.run(host="0.0.0.0", port=8000, workers=4)
```
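With the server above running on port 8000, a stdlib-only client session might look like this (the request values are hypothetical; the endpoints and response shapes are the ones defined above):

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8000"

# Create a task via POST /tasks
req = urllib.request.Request(
    f"{BASE}/tasks",
    data=json.dumps({"title": "write docs"}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    task = json.load(resp)  # 201 body: {"id": ..., "title": "write docs"}

# Read it back via GET /tasks/{task_id}
with urllib.request.urlopen(f"{BASE}/tasks/{task['id']}") as resp:
    print(json.load(resp))
```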
```python
import asyncio

from ember import Ember, Request, SSEResponse, ConversationContext, sse_stream

app = Ember()

async def token_stream(prompt: str):
    for word in f"You asked: {prompt}".split():
        await asyncio.sleep(0.05)
        yield word + " "

@app.ai_route("/v1/chat", methods=["POST"], streaming=True)
async def chat(request: Request, context: ConversationContext) -> SSEResponse:
    body = await request.json()
    prompt = body.get("message", "")
    context.add_message("user", prompt)
    return sse_stream(token_stream(prompt))

app.run(host="0.0.0.0", port=8000)
```
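To consume that stream from Python, assuming `SSEResponse` emits the standard text/event-stream framing (`data:` lines separated by blank lines), a minimal stdlib client could be:

```python
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8000/v1/chat",
    data=json.dumps({"message": "hello"}).encode(),
    headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    for raw in resp:  # the response object is iterable line by line
        line = raw.decode()
        if line.startswith("data: "):
            # Print each streamed token as it arrives
            print(line[len("data: "):].rstrip("\n"), end="", flush=True)
```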
```python
from ember import Ember, RedisCache, MemcachedCache, StaticCache

app = Ember()

# Static in-memory cache (no TTL, perfect for health checks)
@app.get("/health", cache=StaticCache())
async def health():
    return {"status": "ok"}

# Redis cache: shared across all workers
task_cache = RedisCache(url="redis://localhost:6379", ttl=30)

@app.get("/tasks", cache=task_cache)
async def list_tasks(request):
    ...

# Memcached cache
@app.get("/posts", cache=MemcachedCache(host="localhost", port=11211, ttl=60))
async def list_posts(request):
    ...
```

Ember auto-connects on server start and auto-disconnects on stop; no lifecycle code needed.
```python
from ember import Ember, CORSMiddleware, BearerAuthMiddleware, APIKeyMiddleware

app = Ember()
app.add_middleware(CORSMiddleware(allow_origins=["https://myapp.com"]))
app.add_middleware(BearerAuthMiddleware(verify_fn=lambda token: token == "secret"))
app.add_middleware(APIKeyMiddleware(api_keys=["key-1", "key-2"], header="x-api-key"))
```
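A literal string comparison is fine for a demo, but real deployments should load the secret from the environment and compare in constant time. A sketch, assuming (as the example above implies) that `verify_fn` receives the raw bearer token and returns a bool:

```python
import hmac
import os

from ember import Ember, BearerAuthMiddleware

app = Ember()

# Secret loaded once at startup; hmac.compare_digest avoids leaking
# information about the token through response timing.
EXPECTED_TOKEN = os.environ.get("API_TOKEN", "")

def verify(token: str) -> bool:
    return bool(EXPECTED_TOKEN) and hmac.compare_digest(token, EXPECTED_TOKEN)

app.add_middleware(BearerAuthMiddleware(verify_fn=verify))
```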
```python
from ember import Ember, Blueprint, JSONResponse

app = Ember()
admin = Blueprint()

@admin.get("/users")
async def list_users():
    return JSONResponse({"users": []})

@admin.post("/users")
async def create_user(request):
    data = await request.json()
    return JSONResponse(data, status_code=201)

app.add_blueprint(admin, prefixes={"*": "/admin"})
app.run(port=8000)
```
```python
from ember import Ember, RouteLimits, TokenLimits, ServerLimits

app = Ember(
    server_limits=ServerLimits(keep_alive_timeout=30),
)

@app.post(
    "/upload",
    limits=RouteLimits(max_body_size=50 * 1024 * 1024),  # 50 MB
)
async def upload(request):
    body = await request.body()
    return {"size": len(body)}

@app.ai_route(
    "/v1/chat",
    methods=["POST"],
    token_limits=TokenLimits(tokens_per_minute=10_000, max_prompt_tokens=4_096),
)
async def chat(request, context):
    ...
```

```bash
pip install "ember-api[dev]"
pytest
```

- Documentation
- Source code on GitHub
- Contributing guide
- Report a bug
- Request a feature
MIT. See LICENSE.