The fastest Python web framework: engineered for raw speed and concurrency, with Cython hot paths, an io_uring event loop, multi-process workers, and built-in TTL + single-flight caching.
Documentation · GitHub · Contributing
This repository is a uv workspace publishing three independent PyPI packages:
| Package | Purpose | Install |
|---|---|---|
| ember-api | Full framework (router, middleware, AI, sessions, CLI, workers) | `pip install ember-api` |
| ember-cache | TTL + single-flight cache, plus Redis & Memcached backends | `pip install ember-cache` |
| emberloop | io_uring event loop + Cython HTTP/1.1 protocol layer | `pip install emberloop` |
ember-api depends on the other two; installing it pulls them in automatically. All three live under the shared ember.* import namespace.
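Both imports below appear in the examples later in this README; they resolve from different PyPI packages but compose under the one namespace (which package ships which module is inferred from the table above):

```python
from ember import Ember           # provided by ember-api
from ember.cache import TTLCache  # provided by ember-cache (inferred)
# emberloop is not imported directly anywhere in these docs; it supplies
# the io_uring event loop and protocol layer that Ember runs on.
```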
| | Ember | FastAPI | Express | NestJS |
|---|---|---|---|---|
| Protocol | llhttp + Cython + io_uring | ASGI / uvicorn | Node.js http | Node.js http |
| Workers | Fork + SO_REUSEPORT | Single process | cluster | cluster |
| SSE streaming | Native, zero-copy | via Starlette | manual | manual |
| AI primitives | Built-in | none | none | none |
All numbers come from a single Intel i7-14700 box with a 0% error rate. Each framework was benchmarked in isolation with k6. Reproduce them yourself: Hello-world bench · CRUD bench.
Single worker, 200 virtual users, 20 seconds, no database.
| Framework | RPS | p50 (ms) | p99 (ms) | peak RSS |
|---|---|---|---|---|
| Fiber (Go) | 140,993 | 1.21 | 3.96 | 9 MB |
| Ember | 112,177 | 1.68 | 4.35 | 25 MB |
| Express (Node) | 26,357 | 7.09 | 13.57 | 131 MB |
| NestJS (Node) | 23,528 | 8.08 | 13.75 | 158 MB |
| FastAPI (Python) | 17,517 | 9.45 | 30.86 | 49 MB |
Ember is the only Python framework in this league: 6.4× FastAPI, 4.3× Express, 4.8× NestJS, and within 80% of Go Fiber's throughput. Idle RSS is ~22 MB, peak 25 MB, about half of FastAPI's and 5× lighter than the Node frameworks.
A more realistic test: each request hits a real PostgreSQL 16 database. The workload mix is 65% paginated list / 25% get-by-id / 10% create, sustained at 200 virtual users for 40 seconds on a single worker (`workers=1`), with 0% errors across all three frameworks. Ember's read routes use the built-in `TTLCache(ttl=1.0)` primitive (TTL caching + single-flight request coalescing); Express and FastAPI run their stock pool/handler code with no app-side caching.
| Framework | RPS | avg (ms) | p50 (ms) | p95 (ms) | p99 (ms) |
|---|---|---|---|---|---|
| Ember | 20,961 | 8.31 | 7 | 19 | 26 |
| Express | 7,233 | 24.12 | 26 | 38 | 51 |
| FastAPI | 1,932 | 90.35 | 80 | 195 | 275 |
Ember serves 2.9× the throughput of Express and 10.8× that of FastAPI on the same hardware, with a p99 tail 2× tighter than Express and 10× tighter than FastAPI. The decisive win is one line of code:
```python
from ember.cache import TTLCache

@app.get("/tasks", cache=TTLCache(ttl=1.0))
async def list_tasks(request):
    ...
```

That single argument enables TTL caching and single-flight request coalescing: when N concurrent users hit the same URL, exactly one request runs the handler and the rest receive the same response, collapsing thundering-herd reads onto a single PostgreSQL round trip.
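For intuition, here is a minimal, framework-independent sketch of the single-flight idea. It is not Ember's internal code, just the pattern the `cache=` argument applies per URL; `fetch_tasks` stands in for any coroutine function.

```python
import asyncio

# In-flight tasks, keyed by cache key (e.g. the request URL).
_inflight: dict[str, asyncio.Task] = {}

async def single_flight(key: str, fn):
    """Run fn() once per key; concurrent callers await the same task."""
    task = _inflight.get(key)
    if task is None:
        task = asyncio.ensure_future(fn())
        _inflight[key] = task
        # Forget the task once it settles so the next miss recomputes.
        task.add_done_callback(lambda _: _inflight.pop(key, None))
    return await task
```

Two hundred concurrent `await single_flight("/tasks", fetch_tasks)` calls execute `fetch_tasks()` exactly once; every caller receives the same result, which is what collapses the thundering herd onto one database round trip.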
- Cython hot paths: headers, router, request, response, and protocol all compiled to C
- Multi-process workers: fork-based with `SO_REUSEPORT`, kernel load-balancing
- `workers=1` in-process: no supervisor overhead when you don't need it (~22 MB saved)
- Tunable io_uring buffer pool: `Ember.run(io_uring_num_bufs=, io_uring_buf_size=)` to scale the pool's RAM down to the 2 MB default or up to 32 MB for high-concurrency workloads (see the sizing sketch after this list)
- Lazy AI / cache / middleware imports: `import ember` no longer pulls numpy/redis/memcached for plain HTTP apps
- AI-first routing: `@app.ai_route()` with streaming, tool calling, and conversation context
- SSE streaming: `SSEResponse`, `sse_stream()`, `TokenStreamResponse` for LLM token output
- Pluggable caching: `StaticCache`, `RedisCache`, `MemcachedCache` via a single decorator argument
- Token rate limiting: `TokenBucket`, `GlobalTokenBucket`, `RateLimitMiddleware`
- Model routing: `ModelRouter` with fallback, cost, and latency strategies
- Semantic cache: vector-search cache for AI responses
- Built-in middleware: CORS, Bearer auth, API key
- Blueprints: modular route groups with URL prefixes
- Cross-platform: Linux/macOS multi-process, Windows single-process fallback
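The buffer-pool knobs deserve a concrete example: pool RAM is roughly `num_bufs * buf_size`. The parameter names come from the list above; the particular counts and sizes below are illustrative assumptions, since only the 2 MB default and the 32 MB upper figure are documented.

```python
from ember import Ember

app = Ember()

# Pool RAM ≈ io_uring_num_bufs * io_uring_buf_size.
# Assumed split for the 32 MiB high-concurrency end: 2048 * 16 KiB.
# The 2 MB default would correspond to e.g. 512 * 4 KiB.
app.run(
    host="0.0.0.0",
    port=8000,
    io_uring_num_bufs=2048,
    io_uring_buf_size=16 * 1024,
)
```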
What's coming next: see the full roadmap for design notes and throughput targets.
- WebSockets + OpenTelemetry + Prometheus + access logs: `v0.2.0`, Q3 2026
- OpenAPI auto-gen + Pydantic v2 bodies + type stubs: `v0.3.0`, Q4 2026
- HTTP/2 multiplexing: `v0.4.0`, Q1 2027
- Free-threaded Python (PEP 703): `v0.5.0`, Q2 2027; target ~500k RPS / worker
- Prepare-chained `io_uring` fast path: `v0.6.0`, Q3 2027; 3× cached-route RPS, p99 < 1 ms
- High-performance Cython ORM: `v0.7.0`, Q4 2027; Postgres / MySQL / SQLite / MSSQL / Oracle / CockroachDB / Redis; target 80k RPS on `SELECT WHERE id=$1` (vs ~25k with `asyncpg`)
- Stable 1.0: Q1 2028; semver lock, signed releases, SBOM, multi-arch wheels (x86_64 / ARM64 / macOS, GIL + free-threaded `t` ABIs)
Industry adoption track: auth (OAuth2 / OIDC / JWT / mTLS), OpenTelemetry, Kubernetes probes & graceful drain, Helm chart, FastAPI/Flask migration guides, signed releases, SBOM, and a third-party security audit. Lands across v0.2 through v1.0; see the Industry Adoption Track.
```bash
pip install ember-api

# With all performance extras:
pip install "ember-api[fast]"   # uvloop + orjson
pip install "ember-api[cache]"  # Redis + Memcached backends
pip install "ember-api[all]"    # everything
```

Build Cython extensions from source (optional, for maximum speed):

```bash
pip install cython
python setup.py build_ext --inplace
```
```python
from ember import Ember

app = Ember()

@app.get("/")
async def index():
    return {"hello": "world"}

app.run(host="0.0.0.0", port=8000, workers=4)
```
```python
import uuid

from ember import Ember, Request, JSONResponse, RedisCache

app = Ember()
tasks: dict = {}

@app.get("/tasks", cache=RedisCache(ttl=30))
async def list_tasks(request: Request) -> JSONResponse:
    page = int(request.args.get("page", 1))
    limit = int(request.args.get("limit", 10))
    items = list(tasks.values())[(page - 1) * limit : page * limit]
    return JSONResponse({"tasks": items, "total": len(tasks)})

@app.get("/tasks/{task_id:str}")
async def get_task(request: Request, task_id: str) -> JSONResponse:
    task = tasks.get(task_id)
    if not task:
        return JSONResponse({"error": "not found"}, status_code=404)
    return JSONResponse(task)

@app.post("/tasks")
async def create_task(request: Request) -> JSONResponse:
    data = await request.json()
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"id": task_id, **data}
    return JSONResponse(tasks[task_id], status_code=201)

app.run(host="0.0.0.0", port=8000, workers=4)
```
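With the server above running on port 8000, a stdlib-only client session might look like this (the request values are hypothetical; the endpoints and response shapes are the ones defined above):

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8000"

# Create a task via POST /tasks
req = urllib.request.Request(
    f"{BASE}/tasks",
    data=json.dumps({"title": "write docs"}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    task = json.load(resp)  # 201 body: {"id": ..., "title": "write docs"}

# Read it back via GET /tasks/{task_id}
with urllib.request.urlopen(f"{BASE}/tasks/{task['id']}") as resp:
    print(json.load(resp))
```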
```python
import asyncio

from ember import Ember, Request, SSEResponse, ConversationContext, sse_stream

app = Ember()

async def token_stream(prompt: str):
    for word in f"You asked: {prompt}".split():
        await asyncio.sleep(0.05)
        yield word + " "

@app.ai_route("/v1/chat", methods=["POST"], streaming=True)
async def chat(request: Request, context: ConversationContext) -> SSEResponse:
    body = await request.json()
    prompt = body.get("message", "")
    context.add_message("user", prompt)
    return sse_stream(token_stream(prompt))

app.run(host="0.0.0.0", port=8000)
```
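To consume that stream from Python, assuming `SSEResponse` emits the standard text/event-stream framing (`data:` lines separated by blank lines), a minimal stdlib client could be:

```python
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8000/v1/chat",
    data=json.dumps({"message": "hello"}).encode(),
    headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    for raw in resp:  # the response object is iterable line by line
        line = raw.decode()
        if line.startswith("data: "):
            # Print each streamed token as it arrives
            print(line[len("data: "):].rstrip("\n"), end="", flush=True)
```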
```python
from ember import Ember, RedisCache, MemcachedCache, StaticCache

app = Ember()

# Static in-memory cache (no TTL, perfect for health checks)
@app.get("/health", cache=StaticCache())
async def health():
    return {"status": "ok"}

# Redis cache: shared across all workers
task_cache = RedisCache(url="redis://localhost:6379", ttl=30)

@app.get("/tasks", cache=task_cache)
async def list_tasks(request):
    ...

# Memcached cache
@app.get("/posts", cache=MemcachedCache(host="localhost", port=11211, ttl=60))
async def list_posts(request):
    ...
```

Ember auto-connects on server start and auto-disconnects on stop; no lifecycle code needed.
```python
from ember import Ember, CORSMiddleware, BearerAuthMiddleware, APIKeyMiddleware

app = Ember()
app.add_middleware(CORSMiddleware(allow_origins=["https://myapp.com"]))
app.add_middleware(BearerAuthMiddleware(verify_fn=lambda token: token == "secret"))
app.add_middleware(APIKeyMiddleware(api_keys=["key-1", "key-2"], header="x-api-key"))
```
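A literal string comparison is fine for a demo, but real deployments should load the secret from the environment and compare in constant time. A sketch, assuming (as the example above implies) that `verify_fn` receives the raw bearer token and returns a bool:

```python
import hmac
import os

from ember import Ember, BearerAuthMiddleware

app = Ember()

# Secret loaded once at startup; hmac.compare_digest avoids leaking
# information about the token through response timing.
EXPECTED_TOKEN = os.environ.get("API_TOKEN", "")

def verify(token: str) -> bool:
    return bool(EXPECTED_TOKEN) and hmac.compare_digest(token, EXPECTED_TOKEN)

app.add_middleware(BearerAuthMiddleware(verify_fn=verify))
```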
```python
from ember import Ember, Blueprint, JSONResponse

app = Ember()
admin = Blueprint()

@admin.get("/users")
async def list_users():
    return JSONResponse({"users": []})

@admin.post("/users")
async def create_user(request):
    data = await request.json()
    return JSONResponse(data, status_code=201)

app.add_blueprint(admin, prefixes={"*": "/admin"})
app.run(port=8000)
```
```python
from ember import Ember, RouteLimits, TokenLimits, ServerLimits

app = Ember(
    server_limits=ServerLimits(keep_alive_timeout=30),
)

@app.post(
    "/upload",
    limits=RouteLimits(max_body_size=50 * 1024 * 1024),  # 50 MB
)
async def upload(request):
    body = await request.body()
    return {"size": len(body)}

@app.ai_route(
    "/v1/chat",
    methods=["POST"],
    token_limits=TokenLimits(tokens_per_minute=10_000, max_prompt_tokens=4_096),
)
async def chat(request, context):
    ...
```

```bash
pip install "ember-api[dev]"
pytest
```

- Documentation
- Source code on GitHub
- Contributing guide
- Report a bug
- Request a feature
MIT. See LICENSE.