Context
The MS-R1 matrix comparing sync vs async dispatch across every handler shape revealed a clean split:
| Workload | Sync (AsyncHandlers=false) | Async (=true) |
|---|---|---|
| Pure-CPU (plaintext, JSON, params, body) | +30-33% RPS | baseline |
| Chain middleware | +23-26% RPS | baseline |
| DB-integrated (celerisredis/pg/mc) | baseline | +30-60% RPS |
| 3rd-party drivers (goredis, pgx, gomc) | baseline | +30-200% RPS |
Numbers (celeris-iouring, 12c ARM64, 256 conns, 8s per cell):
- /plaintext: 428,582 (sync) vs 287,838 (async) — sync +48.9%
- celerismc: 65,265 (sync) vs 101,902 (async) — async +56.1%
- goredis: 24,623 (sync) vs 75,416 (async) — async +206%
Real apps mix workloads: some routes hit a DB, others return canned responses or small computed results. Today the choice is all-or-nothing per server via Config.AsyncHandlers. That forces a user with 80% plaintext routes and 20% DB routes to either give up ~30% on the plaintext routes or ~2x throughput on the DB routes.
Proposal
Per-route opt-in / opt-out of async dispatch. Rough API surface:
- Server default stays at Config.AsyncHandlers (whatever the user configured).
- Per-route: Route.Async(true/false) overrides the default for that specific handler chain.
- Per-group: RouteGroup.Async(true/false) inherits to its children.
- Defaults: all routes inherit the server default unless explicitly overridden.
Usage sketch:
```go
srv := celeris.New(celeris.Config{Engine: celeris.Epoll})
srv.GET("/ping", pingHandler) // sync, inherits server default
srv.GET("/users/:id", userHandler).Async() // async, this route only
api := srv.Group("/api").Async() // async for all /api/*
api.GET("/products", productHandler) // async
api.GET("/cached", cachedHandler).Async(false) // opt out of async
```
Engine implementation questions (spike scope)
- Dispatch decision point. The engine currently decides sync vs async per-connection at drainRead time based on Loop.async. Per-route means the decision moves to per-request, after routing. Options:
  - Always spawn a goroutine for the HTTP1 path, and have the dispatch goroutine check the route's async flag: (a) hand off to a worker goroutine if the route is async, or (b) run the handler inline if sync. Adds one extra goroutine spawn per request.
  - Peek the request path early in drainRead (before ProcessH1), look up the route, and pick the dispatch path there. Breaks encapsulation.
  - Always-async mode: dispatch every request to a goroutine, and have that goroutine check the route flag to decide call-and-return vs chain. The "sync" case becomes goroutine-does-everything, close to net/http's model.
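Option (a) can be sketched as a per-request branch taken right after routing. This is a minimal sketch, not Celeris internals: `route`, `dispatch`, and the channel-based completion signal are all invented names standing in for whatever the engine actually uses.

```go
package main

import "fmt"

// route is a hypothetical routed entry carrying its resolved async flag.
type route struct {
	path    string
	async   bool // resolved at registration: route override, else group, else server default
	handler func() string
}

// dispatch runs sync handlers inline on the calling goroutine (cheapest
// path for pure-CPU routes) and hands async handlers to a worker
// goroutine so a blocking call (DB, RPC) can't stall the event loop.
func dispatch(r route, done chan string) {
	if r.async {
		go func() { done <- r.handler() }() // worker goroutine: may block
		return
	}
	done <- r.handler() // inline: no extra spawn, no wake
}

func main() {
	done := make(chan string, 2)
	dispatch(route{path: "/ping", async: false, handler: func() string { return "pong" }}, done)
	dispatch(route{path: "/users/1", async: true, handler: func() string { return "user-1" }}, done)
	fmt.Println(<-done, <-done) // prints: pong user-1
}
```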
- Middleware inheritance. If a route is async but inherits middleware from a sync group, which mode wins? Proposal: most-specific wins (route over group over server default).
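One way to make "most-specific wins" concrete is a tri-state flag, so an explicit .Async(false) is distinguishable from "never called". A hedged sketch with invented names; the real resolution would happen once at route registration, not per request:

```go
package main

import "fmt"

// asyncFlag is a hypothetical tri-state: unset routes/groups fall through
// to the next level rather than forcing a value.
type asyncFlag int

const (
	asyncInherit asyncFlag = iota // no explicit .Async() call at this level
	asyncOn                       // .Async() / .Async(true)
	asyncOff                      // .Async(false)
)

// resolve walks route -> group -> server default and returns the first
// explicit setting it finds, implementing most-specific-wins.
func resolve(routeFlag, groupFlag asyncFlag, serverDefault bool) bool {
	for _, f := range []asyncFlag{routeFlag, groupFlag} {
		switch f {
		case asyncOn:
			return true
		case asyncOff:
			return false
		}
	}
	return serverDefault
}

func main() {
	// /api group is async, but /api/cached opted out with .Async(false)
	fmt.Println(resolve(asyncOff, asyncOn, false)) // false
	// /api/products inherits the group's async setting
	fmt.Println(resolve(asyncInherit, asyncOn, false)) // true
}
```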
- WebSocket / detached handlers. Detach implies async by construction, so the per-route flag can't affect detached flows.
- H2 conns. Today H2 always runs inline. Per-route async on H2 needs additional design and may be out of scope for v1.5.
Open questions for spike
- Is the 1.5-2µs async-dispatch cost reducible to a point where we can just make everything async? pprof points at sync.Cond + goroutine wake — can any of this be optimized to <500ns per request?
- Does routing-before-dispatch introduce enough overhead that the sync path loses its advantage anyway?
- What's the right default for .Async()? Opt-in or opt-out?
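For the first question, a crude upper-bound probe: time an unbuffered-channel ping-pong as a stand-in for the sync.Cond + goroutine-wake path. This assumes the two wake mechanisms cost roughly the same; it measures the Go runtime on the machine it runs on, not Celeris itself.

```go
package main

import (
	"fmt"
	"time"
)

// wakeCost estimates the round-trip cost of waking a parked goroutine and
// getting control back, using an unbuffered-channel handoff per iteration.
func wakeCost(n int) time.Duration {
	req := make(chan struct{})
	done := make(chan struct{})
	go func() {
		for range req {
			done <- struct{}{} // hand control back to the caller
		}
	}()
	start := time.Now()
	for i := 0; i < n; i++ {
		req <- struct{}{} // wake the worker goroutine
		<-done            // park until it responds
	}
	close(req)
	return time.Since(start) / time.Duration(n)
}

func main() {
	// If this lands well under 500ns, the "just make everything async"
	// answer gets more plausible; at 1-2µs it stays a real tax.
	fmt.Printf("per-request wake cost: ~%v\n", wakeCost(100000))
}
```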
Exit criteria
- Design doc proposing an API and dispatch model.
- Prototype showing a mixed-workload benchmark (pure-CPU + DB routes on same server) outperforming both "all-sync" and "all-async" configs by at least 10% aggregate.
- Decision on whether to ship in v1.5.0 or punt to v1.6.0 based on prototype results.