Add Starlette: the ASGI framework that powers FastAPI (~12k ⭐) #78

Merged
MDA2AV merged 1 commit into MDA2AV:main from BennyFranciscus:add-starlette
Mar 23, 2026
Conversation

@BennyFranciscus
Collaborator

Starlette — The little ASGI framework that shines 🌟

Adds Starlette (~12k ⭐) to HttpArena.

Why Starlette?

FastAPI is already in HttpArena — and FastAPI is literally built on Starlette. Having both answers the question every Python developer asks: how much overhead does FastAPI's Pydantic validation, OpenAPI generation, and dependency injection actually add?

Same ASGI stack (Uvicorn + uvloop + Gunicorn), same runtime, same language — but Starlette is the raw foundation without the convenience layer. This is the purest "framework vs framework-on-framework" comparison possible.

Setup

  • Starlette 0.46.1 on Uvicorn (uvloop) + Gunicorn multi-worker
  • Pre-computed JSON + gzip caches for /json and /compression
  • Thread-local SQLite connections with mmap
  • orjson for fast JSON serialization
  • Identical worker config to the FastAPI entry for fair comparison
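
The pre-computed cache idea from the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: it uses the standard-library `json` and `gzip` modules in place of orjson and Starlette response objects, and `PAYLOAD`, `CACHE`, and `cached_body` are hypothetical names.

```python
import gzip
import json

# Hypothetical payload; the real /json dataset is not shown in this PR.
PAYLOAD = {"items": [{"id": i, "name": f"item-{i}"} for i in range(100)]}

# Serialize and compress once at import time so request handlers only
# look up ready-made bytes instead of re-encoding on every request.
_RAW = json.dumps(PAYLOAD, separators=(",", ":")).encode("utf-8")
CACHE = {
    "identity": _RAW,
    "gzip": gzip.compress(_RAW, compresslevel=6),
}


def cached_body(accept_encoding: str = "") -> tuple[bytes, dict[str, str]]:
    """Return (body, headers) from the cache based on Accept-Encoding."""
    headers = {"content-type": "application/json"}
    if "gzip" in accept_encoding.lower():
        headers["content-encoding"] = "gzip"
        return CACHE["gzip"], headers
    return CACHE["identity"], headers
```

A handler would then return the cached bytes directly, so the per-request cost is a dictionary lookup rather than serialization plus compression.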

The Python lineup is now:

| Entry | Stack | Model |
|---|---|---|
| Flask | WSGI/Gunicorn sync | Traditional sync |
| Starlette | ASGI/Uvicorn async | Raw async foundation |
| FastAPI | ASGI/Uvicorn async | Starlette + Pydantic + OpenAPI |
| Django | WSGI/Gunicorn sync | Full-stack ORM framework |

cc @tomchristie @alex-grover @aminalaee — would love to see how Starlette stacks up against FastAPI in the benchmarks! The overhead question is one I've been curious about for a while.

Starlette 0.46.1 on Uvicorn (uvloop) + Gunicorn multi-worker.
Same ASGI stack as FastAPI but without Pydantic/OpenAPI overhead.
Pre-computed JSON + gzip caches, thread-local SQLite, orjson.
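
A thread-local SQLite connection with memory-mapped I/O, as mentioned above, might look roughly like this. It is a sketch only: `DB_PATH`, `get_conn`, `lookup`, the `items` table, and the 256 MiB `mmap_size` are all assumptions for illustration, since the PR's actual database code is not shown here.

```python
import os
import sqlite3
import tempfile
import threading

# Hypothetical database location; the real path is not shown in this PR.
DB_PATH = os.path.join(tempfile.mkdtemp(), "bench.db")

_local = threading.local()


def get_conn() -> sqlite3.Connection:
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(DB_PATH)
        # Ask SQLite to memory-map up to 256 MiB of the database file.
        # The size is an assumed value, not taken from the PR.
        conn.execute("PRAGMA mmap_size = 268435456")
        _local.conn = conn
    return conn


# Seed a small table once so the lookup below has something to read.
with get_conn() as seed:
    seed.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)")
    seed.execute("INSERT OR REPLACE INTO items VALUES (1, 'example')")


def lookup(item_id: int):
    """Read one row using the calling thread's own connection."""
    row = get_conn().execute(
        "SELECT name FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None
```

Keeping one connection per thread avoids sharing a `sqlite3.Connection` across threads, which the standard-library module rejects by default.
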
BennyFranciscus requested a review from MDA2AV as a code owner on March 18, 2026, 17:03
@BennyFranciscus
Collaborator Author

CI failure is the same port 8080 conflict on the self-hosted runner (stale Docker container) — same issue as Rocket #75 and Koa #77. The build compiles fine and all dependencies install correctly. Should pass on a clean runner.
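
A pre-flight check for this kind of conflict could be sketched as below. It is not part of the HttpArena CI; `port_in_use` is a hypothetical helper that only tests whether something already accepts TCP connections on a port such as 8080.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the port named in the comment above.
    if port_in_use(8080):
        print("port 8080 is busy; a stale container may still be running")
    else:
        print("port 8080 is free")
```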

@BennyFranciscus
Collaborator Author

CI is green now ✅ — looks like the port conflict on the runner got resolved. Starlette is ready for benchmarking whenever you want to kick it off!

@MDA2AV
Owner

MDA2AV commented Mar 23, 2026

/benchmark

@github-actions
Contributor

🚀 Benchmark run triggered for starlette (all profiles). Results will be posted here when done.

@github-actions
Contributor

Benchmark Results

Framework: starlette | Profile: all profiles

starlette / baseline / 512c (p=1, r=0, cpu=unlimited)
  Best: 572933 req/s (CPU: 10610.6%, Mem: 4.9GiB) ===

starlette / baseline / 4096c (p=1, r=0, cpu=unlimited)
  Best: 628030 req/s (CPU: 10497.8%, Mem: 4.9GiB) ===

starlette / baseline / 16384c (p=1, r=0, cpu=unlimited)
  Best: 517806 req/s (CPU: 5797.0%, Mem: 831.5MiB) ===

starlette / pipelined / 512c (p=16, r=0, cpu=unlimited)
  Best: 760652 req/s (CPU: 10316.1%, Mem: 4.8GiB) ===

starlette / pipelined / 4096c (p=16, r=0, cpu=unlimited)
  Best: 799555 req/s (CPU: 10255.6%, Mem: 5.1GiB) ===

starlette / pipelined / 16384c (p=16, r=0, cpu=unlimited)
  Best: 668705 req/s (CPU: 9981.9%, Mem: 5.8GiB) ===

starlette / limited-conn / 512c (p=1, r=10, cpu=unlimited)
  Best: 411518 req/s (CPU: 5745.7%, Mem: 941.6MiB) ===

starlette / limited-conn / 4096c (p=1, r=10, cpu=unlimited)
  Best: 428041 req/s (CPU: 5603.0%, Mem: 950.8MiB) ===

starlette / json / 4096c (p=1, r=0, cpu=unlimited)
  Best: 743130 req/s (CPU: 9989.7%, Mem: 4.9GiB) ===

starlette / json / 16384c (p=1, r=0, cpu=unlimited)
  Best: 604394 req/s (CPU: 9629.4%, Mem: 5.2GiB) ===

starlette / upload / 64c (p=1, r=0, cpu=unlimited)
  Best: 470 req/s (CPU: 5304.4%, Mem: 8.6GiB) ===

starlette / upload / 256c (p=1, r=0, cpu=unlimited)
  Best: 430 req/s (CPU: 8767.8%, Mem: 15.6GiB) ===

starlette / upload / 512c (p=1, r=0, cpu=unlimited)
  Best: 404 req/s (CPU: 2024.3%, Mem: 11.3GiB) ===

starlette / compression / 4096c (p=1, r=0, cpu=unlimited)
  Best: 85395 req/s (CPU: 4304.1%, Mem: 848.2MiB) ===

starlette / compression / 16384c (p=1, r=0, cpu=unlimited)
  Best: 79593 req/s (CPU: 5025.7%, Mem: 758.6MiB) ===

starlette / noisy / 512c (p=1, r=0, cpu=unlimited)
  Best: 474245 req/s (CPU: 10564.8%, Mem: 5.2GiB) ===

starlette / noisy / 4096c (p=1, r=0, cpu=unlimited)
  Best: 534602 req/s (CPU: 10634.7%, Mem: 5.3GiB) ===

starlette / noisy / 16384c (p=1, r=0, cpu=unlimited)
  Best: 455997 req/s (CPU: 6406.1%, Mem: 2.9GiB) ===

starlette / mixed / 4096c (p=1, r=5, cpu=unlimited)
  Best: 70399 req/s (CPU: 5484.1%, Mem: 962.9MiB) ===

starlette / mixed / 16384c (p=1, r=5, cpu=unlimited)
  Best: 62747 req/s (CPU: 4853.6%, Mem: 4.0GiB) ===
Full log
  Status codes: 2xx=1684229, 3xx=0, 4xx=214689, 5xx=0
  Latency samples: 1898911 / 1898918 responses (100.0%)
  Reconnects: 3600
  Errors: connect 0, read 1, timeout 0
  Per-template: 1078586,606051,211005,0,3276
  Per-template-ok: 1078230,605999,0,0,0

  WARNING: 214689/1898918 responses (11.3%) had unexpected status (expected 2xx)
  CPU: 10096.8% | Mem: 6.5GiB

=== Best: 455997 req/s (CPU: 6406.1%, Mem: 2.9GiB) ===
  Input BW: 46.10MB/s (avg template: 106 bytes)
[dry-run] Results not saved (use --save to persist)
httparena-bench-starlette
httparena-bench-starlette

==============================================
=== starlette / mixed / 4096c (p=1, r=5, cpu=unlimited) ===
==============================================
ce21ba9a1b22359f09981bb5540bdac5b63f1b1575b5b5832964d9d2e788b9ae
[wait] Waiting for server...
[ready] Server is up

[run 1/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   38.60ms   7.53ms   140.40ms   175.10ms   953.20ms

  384042 requests in 5.01s, 352703 responses
  Throughput: 70.44K req/s
  Bandwidth:  2.79GB/s
  Status codes: 2xx=352703, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 352702 / 352703 responses (100.0%)
  Reconnects: 76271
  Per-template: 38030,38075,38302,38488,38681,30949,31040,38306,30475,30356
  Per-template-ok: 38030,38075,38302,38488,38681,30949,31040,38306,30475,30356
  CPU: 5484.1% | Mem: 962.9MiB

[run 2/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   53.21ms   37.80ms   109.60ms   261.60ms   424.70ms

  326075 requests in 5.03s, 296933 responses
  Throughput: 59.02K req/s
  Bandwidth:  2.31GB/s
  Status codes: 2xx=296933, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 296933 / 296933 responses (100.0%)
  Reconnects: 63155
  Per-template: 31294,31634,31985,32469,32875,26783,26994,32566,25317,25016
  Per-template-ok: 31294,31634,31985,32469,32875,26783,26994,32566,25317,25016
  CPU: 8664.1% | Mem: 6.3GiB

[run 3/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   69.83ms   60.00ms   127.40ms   223.10ms   310.20ms

  220186 requests in 5.03s, 200044 responses
  Throughput: 39.76K req/s
  Bandwidth:  1.56GB/s
  Status codes: 2xx=200044, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 200044 / 200044 responses (100.0%)
  Reconnects: 42122
  Per-template: 21191,21707,21960,21993,22013,17778,17763,21818,17034,16787
  Per-template-ok: 21191,21707,21960,21993,22013,17778,17763,21818,17034,16787
  CPU: 9585.5% | Mem: 8.0GiB

=== Best: 70399 req/s (CPU: 5484.1%, Mem: 962.9MiB) ===
  Input BW: 6.88GB/s (avg template: 104924 bytes)
[dry-run] Results not saved (use --save to persist)
httparena-bench-starlette
httparena-bench-starlette

==============================================
=== starlette / mixed / 16384c (p=1, r=5, cpu=unlimited) ===
==============================================
14c8bd99681a460ef58b2f2ae2dd9d2a78fd3a7dc1d4f765f7dc09ef2e3bf7e2
[wait] Waiting for server...
[ready] Server is up

[run 1/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     16384 (256/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   159.69ms   78.60ms   209.20ms    3.50s    3.64s

  355776 requests in 5.03s, 315619 responses
  Throughput: 62.73K req/s
  Bandwidth:  2.48GB/s
  Status codes: 2xx=315619, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 315618 / 315619 responses (100.0%)
  Reconnects: 62500
  Errors: connect 0, read 756, timeout 0
  Per-template: 32990,34045,34930,35381,35329,29029,29246,31493,27360,25816
  Per-template-ok: 32990,34045,34930,35381,35329,29029,29246,31493,27360,25816
  CPU: 4853.6% | Mem: 4.0GiB

[run 2/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     16384 (256/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   245.29ms   219.90ms   422.40ms   832.00ms    1.16s

  243400 requests in 5.04s, 214686 responses
  Throughput: 42.60K req/s
  Bandwidth:  1.42GB/s
  Status codes: 2xx=214686, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 214686 / 214686 responses (100.0%)
  Reconnects: 40369
  Per-template: 16961,21107,24467,26659,26843,21942,21903,25689,16914,12201
  Per-template-ok: 16961,21107,24467,26659,26843,21942,21903,25689,16914,12201
  CPU: 7444.4% | Mem: 6.8GiB

[run 3/3]
gcannon — io_uring HTTP load generator
  Target:    localhost:8080/
  Threads:   64
  Conns:     16384 (256/thread)
  Pipeline:  1
  Req/conn:  5
  Templates: 10
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   300.85ms   274.40ms   522.60ms    1.19s    1.45s

  162573 requests in 5.05s, 140500 responses
  Throughput: 27.82K req/s
  Bandwidth:  842.44MB/s
  Status codes: 2xx=140500, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 140500 / 140500 responses (100.0%)
  Reconnects: 25196
  Per-template: 10725,16035,17234,17296,17185,14308,14187,17105,9096,7329
  Per-template-ok: 10725,16035,17234,17296,17185,14308,14187,17105,9096,7329
  CPU: 8301.5% | Mem: 8.4GiB

=== Best: 62747 req/s (CPU: 4853.6%, Mem: 4.0GiB) ===
  Input BW: 6.13GB/s (avg template: 104924 bytes)
[dry-run] Results not saved (use --save to persist)
httparena-bench-starlette
httparena-bench-starlette
[skip] starlette does not subscribe to baseline-h2
[skip] starlette does not subscribe to static-h2
[skip] starlette does not subscribe to baseline-h3
[skip] starlette does not subscribe to static-h3
[skip] starlette does not subscribe to unary-grpc
[skip] starlette does not subscribe to unary-grpc-tls
[skip] starlette does not subscribe to echo-ws
[restore] Restoring CPU governor to powersave...
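
The `Best` figures in the log appear to be the highest of the three runs, computed as responses divided by run duration and truncated to an integer. That rule is inferred from the numbers above, not from the harness source, and `best_req_per_sec` is a hypothetical helper that checks it against the mixed-profile runs.

```python
# (responses, duration in seconds) copied from the three runs of each
# mixed-profile section in the log above.
RUNS_4096C = [(352703, 5.01), (296933, 5.03), (200044, 5.03)]
RUNS_16384C = [(315619, 5.03), (214686, 5.04), (140500, 5.05)]


def best_req_per_sec(runs):
    """Highest responses-per-second across runs, truncated to an int."""
    return max(int(responses / seconds) for responses, seconds in runs)


print(best_req_per_sec(RUNS_4096C))   # 70399, matching "Best: 70399 req/s"
print(best_req_per_sec(RUNS_16384C))  # 62747, matching "Best: 62747 req/s"
```

In both sections the best run is the first one, and throughput drops on runs 2 and 3 while memory use climbs.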

MDA2AV merged commit d34d375 into MDA2AV:main on Mar 23, 2026
3 of 4 checks passed
@BennyFranciscus
Collaborator Author

Starlette benchmarks are in! Really nice numbers for a pure ASGI framework:

Highlights:

  • 628K req/s baseline at 4096c: significantly faster than Django (113K) and even beating FastAPI in raw throughput, which makes sense since Starlette is the engine under FastAPI without the extra validation layer
  • ~800K req/s pipelined at 4096c: roughly 1.27x the baseline at the same connection count
  • 743K req/s JSON at 4096c: Uvicorn + uvloop serving the pre-computed JSON cache
  • 85K req/s compression at 4096c: reasonable for Python serving gzip responses (the setup uses a pre-computed gzip cache for /compression rather than compressing per request)
  • 470 req/s upload at 64c: similar ballpark to other Python frameworks

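One logged run of the noisy profile at 16384c shows 11.3% unexpected status codes (214,689 4xx out of 1,898,918 responses). Nearly all of them come from two request templates that returned no 2xx at all (211,005 and 3,276 responses), so these may be the profile's deliberately invalid requests rather than Uvicorn connection handling under extreme load. Worth keeping an eye on until that is confirmed.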
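
The 11.3% figure can be reproduced from the `Per-template` and `Per-template-ok` lines in the full log. The sketch below only does arithmetic on those logged counts; the variable names are illustrative.

```python
# Counts copied from the noisy / 16384c run in the full log.
per_template = [1078586, 606051, 211005, 0, 3276]
per_template_ok = [1078230, 605999, 0, 0, 0]

# Responses per template that did not get the expected 2xx status.
non_ok = [total - ok for total, ok in zip(per_template, per_template_ok)]
print(non_ok)  # [356, 52, 211005, 0, 3276]

total_responses = sum(per_template)  # 1898918
total_non_ok = sum(non_ok)           # 214689
print(round(100 * total_non_ok / total_responses, 1))  # 11.3

# Share of the non-2xx responses that come from the two templates
# with zero successful responses.
all_fail = sum(n for n, ok in zip(non_ok, per_template_ok) if ok == 0)
print(round(100 * all_fail / total_non_ok, 1))  # 99.8
```

Only 408 of the non-2xx responses come from templates that otherwise succeed.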

Starlette slots in nicely as the fastest pure-Python framework in the arena — ~5.5x Django, and a clear demonstration of why so many Python frameworks build on ASGI. 🐍
