Middleware Integration

Middleware integration for `unifiedAdmission` + `adaptiveConcurrency`

Added in 0.9.2 (2026-05-29).

Both unifiedAdmission (0.9.0) and adaptiveConcurrency (pre-0.8) expose a release() lifecycle callback that must be invoked exactly once when the request lifecycle ends. Miss any one of the hooks (finish, close, or the error path) and concurrency slots leak silently until the adaptive limit collapses to zero and your server stops admitting anything.

Before 0.9.2 you had to wire release() to your framework's request lifecycle by hand. As of 0.9.2 the library does it for you.
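
To make the failure mode concrete, here is a minimal sketch of a slot leak. `ToyGuard` and `buggyHandle` are hypothetical stand-ins, not part of the ThrottleKit API: the guard only counts in-flight slots against a fixed limit, which is enough to show what one missed `release()` path does.

```typescript
// Hypothetical toy guard: NOT the throttlekit API, just a fixed-limit slot counter.
class ToyGuard {
  inFlight = 0;
  readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns a release function when a slot is free, or null when the guard is full.
  acquire(): (() => void) | null {
    if (this.inFlight >= this.limit) return null;
    this.inFlight++;
    let released = false;
    return () => {
      if (released) return; // idempotent: a second call must not free a second slot
      released = true;
      this.inFlight--;
    };
  }
}

type Outcome = "finish" | "close" | "error";

// A hand-wired handler that forgets the "close" path (client hangup).
function buggyHandle(guard: ToyGuard, outcome: Outcome): boolean {
  const release = guard.acquire();
  if (release === null) return false; // rejected: no slot available
  if (outcome === "finish" || outcome === "error") release();
  // outcome === "close": release() is never called, so the slot leaks
  return true;
}

const toyGuard = new ToyGuard(2);
const admitted = [
  buggyHandle(toyGuard, "finish"), // admitted, slot returned
  buggyHandle(toyGuard, "close"), // admitted, slot leaked
  buggyHandle(toyGuard, "close"), // admitted, slot leaked
  buggyHandle(toyGuard, "finish"), // rejected: both slots are stuck
];
console.log(admitted, toyGuard.inFlight); // [ true, true, true, false ] 2
```

Two hangups are enough to wedge a guard with two slots: nothing is actually in flight, yet every later request is rejected.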

What's in the box

22 new exports across 11 frameworks. You pass a prebuilt UnifiedAdmitter (from unifiedAdmission(...)) or ConcurrencyGuard (from adaptiveConcurrency(...)) to the adapter, and the adapter owns the release.

| Framework | unifiedAdmission | adaptiveConcurrency |
| --- | --- | --- |
| express | `expressUnifiedAdmission` | `expressAdaptiveConcurrency` |
| fastify | `fastifyUnifiedAdmission` | `fastifyAdaptiveConcurrency` |
| koa | `koaUnifiedAdmission` | `koaAdaptiveConcurrency` |
| nest | `nestUnifiedAdmissionMiddleware` | `nestAdaptiveConcurrencyMiddleware` |
| hono | `honoUnifiedAdmission` | `honoAdaptiveConcurrency` |
| fetch | `withUnifiedAdmission` | `withAdaptiveConcurrency` |
| next | `nextUnifiedAdmission` | `nextAdaptiveConcurrency` |
| remix | `remixUnifiedAdmission` | `remixAdaptiveConcurrency` |
| sveltekit | `sveltekitUnifiedAdmission` | `sveltekitAdaptiveConcurrency` |
| elysia | `elysiaUnifiedAdmission` | `elysiaAdaptiveConcurrency` |
| trpc | `trpcUnifiedAdmission` | `trpcAdaptiveConcurrency` |

The express pattern (canonical)

```typescript
import express from "express";
import {
  expressUnifiedAdmission,
  unifiedAdmission,
  adaptiveConcurrency,
  rateLimit,
  gcra,
  tokenBucket,
} from "throttlekit";

const admitter = unifiedAdmission({
  rate: rateLimit({ strategy: gcra({ limit: 60, periodMs: 60_000 }) }),
  concurrency: adaptiveConcurrency({ minLimit: 4, maxLimit: 128 }),
  cost: rateLimit({ strategy: tokenBucket({ capacity: 100_000, refillPerSec: 1667 }) }),
});

const app = express();
app.use(expressUnifiedAdmission({ admitter, dropOn5xx: false }));
app.post("/completions", (req, res) => res.json({ ok: true }));
```

The middleware wires res.on("finish") and res.on("close") in a first-fire-wins pattern:

- close before finish ⇒ release({dropped: true}): client hangup, a handler that threw with no error middleware to write a response, or a server-side timeout.
- finish first ⇒ release({dropped: false}) on normal completion, or release({dropped: true}) when dropOn5xx: true and the status is 5xx.

Whichever event fires second is a no-op, because release is idempotent.
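
The first-fire-wins wiring described above can be sketched as follows. This is an illustration under assumptions, not the adapter's source: `FakeResponse` stands in for the small part of Node's `ServerResponse` the pattern needs, and `wireRelease` is a hypothetical helper name.

```typescript
// Minimal stand-in for the part of Node's ServerResponse this sketch needs.
type LifecycleEvent = "finish" | "close";

class FakeResponse {
  statusCode = 200;
  private listeners: Record<LifecycleEvent, Array<() => void>> = { finish: [], close: [] };

  on(event: LifecycleEvent, listener: () => void): void {
    this.listeners[event].push(listener);
  }

  emit(event: LifecycleEvent): void {
    for (const listener of this.listeners[event]) listener();
  }
}

type Release = (result: { dropped: boolean }) => void;

// Hypothetical helper: whichever of finish/close fires first decides `dropped`.
function wireRelease(res: FakeResponse, release: Release, dropOn5xx: boolean): void {
  let settled = false;
  const settle = (dropped: boolean): void => {
    if (settled) return; // the second event is a no-op
    settled = true;
    release({ dropped });
  };
  res.on("finish", () => settle(dropOn5xx && res.statusCode >= 500));
  res.on("close", () => settle(true)); // only wins when finish never fired
}

const releaseCalls: boolean[] = [];
const record: Release = ({ dropped }) => {
  releaseCalls.push(dropped);
};

// Normal completion: finish, then close.
const completed = new FakeResponse();
wireRelease(completed, record, false);
completed.emit("finish");
completed.emit("close");

// Client hangup: close without finish.
const hungUp = new FakeResponse();
wireRelease(hungUp, record, false);
hungUp.emit("close");

console.log(releaseCalls); // one release per request: [ false, true ]
```

Keeping the `settled` flag next to the listeners means correctness does not depend on the guard's own release being idempotent.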

The web-platform pattern (fetch, next, remix, sveltekit)

These frameworks return a Response, so the request lifecycle is that of the body stream. The adapter wraps the body so release fires when the stream drains, errors, or is cancelled.

```typescript
import { withUnifiedAdmission, unifiedAdmission, adaptiveConcurrency } from "throttlekit";

const admitter = unifiedAdmission({
  concurrency: adaptiveConcurrency({ minLimit: 4, maxLimit: 128 }),
});

export default {
  fetch: withUnifiedAdmission(
    async (request) => new Response("ok"),
    { admitter },
  ),
};
```
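
For intuition, body wrapping of this kind could look roughly like the sketch below. It is not the library's implementation: `releaseOnBodyEnd` is a hypothetical helper, it relies only on the standard `Response` and `ReadableStream` globals (Node 18+ or any web runtime), and it leaves out the dropOn5xx rule for brevity.

```typescript
type Release = (result: { dropped: boolean }) => void;

// Hypothetical helper: re-streams the body and calls `release` exactly once
// when the stream drains, errors, or is cancelled.
function releaseOnBodyEnd(response: Response, release: Release): Response {
  let settled = false;
  const settle = (dropped: boolean): void => {
    if (settled) return;
    settled = true;
    release({ dropped });
  };

  if (response.body === null) {
    settle(false); // nothing to stream, so the lifecycle is already over
    return response;
  }

  const reader = response.body.getReader();
  const body = new ReadableStream<Uint8Array>({
    async pull(controller) {
      try {
        const { done, value } = await reader.read();
        if (done) {
          settle(false); // drained normally
          controller.close();
        } else {
          controller.enqueue(value);
        }
      } catch (err) {
        settle(true); // upstream errored mid-flight
        controller.error(err);
      }
    },
    async cancel(reason) {
      settle(true); // the consumer cancelled the body
      await reader.cancel(reason);
    },
  });

  return new Response(body, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}

async function demoBody(): Promise<boolean[]> {
  const seen: boolean[] = [];
  const record: Release = ({ dropped }) => {
    seen.push(dropped);
  };
  // Fully consumed body: the stream drains, so dropped is false.
  await releaseOnBodyEnd(new Response("ok"), record).text();
  // Cancelled before reading: dropped is true.
  await releaseOnBodyEnd(new Response("ok"), record).body?.cancel();
  return seen;
}

demoBody().then((seen) => console.log(seen));
```

The point of the design is that returning the Response does not release the slot. A slow or streaming body keeps it held until the consumer is actually done.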

The wrap pattern (hono, trpc, elysia)

These adapters wrap the user's next() or handler body in try/finally. release fires with dropped set to true when the call threw; for normal returns the dropOn5xx rule applies.

```typescript
// Each snippet below is independent. `admitter` is built as in the examples above;
// `app`, `t`, and `MyCtx` come from your own Hono, tRPC, or Elysia setup.

// Hono
import { honoUnifiedAdmission } from "throttlekit/hono";
app.use("*", honoUnifiedAdmission({ admitter }));

// tRPC
import { trpcUnifiedAdmission } from "throttlekit/trpc";
const ratelimited = t.procedure.use(
  t.middleware(trpcUnifiedAdmission<MyCtx>({
    admitter,
    key: ({ ctx }) => ctx.user.id,
  })),
);

// Elysia (wrap function)
import { elysiaUnifiedAdmission } from "throttlekit/elysia";
const admit = elysiaUnifiedAdmission({ admitter });
app.get("/", (ctx) => admit(ctx, async () => "ok"));
```
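
The try/finally shape these adapters share can be sketched generically. `withRelease` and its `statusOf` option are hypothetical names introduced for illustration, not ThrottleKit exports; the sketch only shows how `dropped` follows from whether the wrapped call threw.

```typescript
type Release = (result: { dropped: boolean }) => void;

interface WrapOptions<T> {
  dropOn5xx?: boolean;
  // Extracts an HTTP-like status from a normal return value, if there is one.
  statusOf?: (result: T) => number | undefined;
}

// Hypothetical wrapper: releases exactly once, on return and on throw.
async function withRelease<T>(
  release: Release,
  next: () => Promise<T>,
  options: WrapOptions<T> = {},
): Promise<T> {
  let dropped = true; // assume a throw until next() returns normally
  try {
    const result = await next();
    const code = options.statusOf?.(result);
    dropped = options.dropOn5xx === true && code !== undefined && code >= 500;
    return result;
  } finally {
    release({ dropped });
  }
}

async function demoWrap(): Promise<boolean[]> {
  const seen: boolean[] = [];
  const record: Release = ({ dropped }) => {
    seen.push(dropped);
  };

  // Normal return: dropped is false.
  await withRelease(record, async () => "ok");

  // Thrown error: dropped is true, and the error still propagates to the caller.
  await withRelease(record, async () => {
    throw new Error("boom");
  }).catch(() => undefined);

  // Normal return with a 5xx status and dropOn5xx enabled: dropped is true.
  await withRelease(record, async () => ({ status: 503 }), {
    dropOn5xx: true,
    statusOf: (result) => result.status,
  });

  return seen;
}

demoWrap().then((seen) => console.log(seen)); // [ false, true, true ]
```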

`dropped` decision matrix

dropped is a property of the response state, not the handler outcome:

| Event | `dropped` | Why |
| --- | --- | --- |
| Response finished normally (any status) | `false` (default) | The runtime delivered a response; lifecycle completed. |
| Response finished, status >= 500, `dropOn5xx: true` | `true` | User opted in to treating 5xx as overload. |
| Handler threw, error middleware wrote response | `false` | `finish` fired, so the response was delivered. |
| Handler threw, no error middleware wrote anything | `true` | `close` fires when the socket times out; no `finish`. |
| Client hung up mid-stream | `true` | `close` fires without `finish`. |
| Server-side timeout fires | `true` | Triggers `close`. |
| Consumer cancelled response body (fetch-style) | `true` | Stream `cancel` callback. |
| Stream errored mid-flight | `true` | Stream pump's `catch`. |
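
Read as code, the matrix above reduces to a small pure function. The `Ending` type and `isDropped` name are illustrative assumptions rather than library exports; the sketch only restates the table's rows.

```typescript
// Hypothetical encoding of the decision matrix; names are illustrative only.
type Ending =
  | { kind: "finished"; statusCode: number } // response fully written, including by error middleware
  | { kind: "closed" } // close without finish: hangup, timeout, unhandled throw
  | { kind: "bodyCancelled" } // fetch-style consumer cancelled the stream
  | { kind: "bodyErrored" }; // stream errored mid-flight

function isDropped(ending: Ending, dropOn5xx = false): boolean {
  switch (ending.kind) {
    case "finished":
      // Delivered responses count as dropped only when the user opted in to dropOn5xx.
      return dropOn5xx && ending.statusCode >= 500;
    case "closed":
    case "bodyCancelled":
    case "bodyErrored":
      return true;
  }
}

const matrixRows = [
  isDropped({ kind: "finished", statusCode: 200 }), // false
  isDropped({ kind: "finished", statusCode: 500 }), // false: dropOn5xx is off by default
  isDropped({ kind: "finished", statusCode: 503 }, true), // true
  isDropped({ kind: "closed" }), // true
  isDropped({ kind: "bodyCancelled" }), // true
  isDropped({ kind: "bodyErrored" }), // true
];
console.log(matrixRows);
```

Note that the handler's outcome never appears as an input: a handler that threw but had its error rendered by error middleware arrives here as `finished`.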

Forward-compat

The adapters accept any ConcurrencyGuard, not just the in-process implementation. distributedAdaptiveConcurrency (0.10.0) drops in behind the same middleware. Because its DistributedConcurrencyGuard keeps acquire() synchronous, passing it as the guard gives every route a fleet-shared ceiling with no call-site change:

```typescript
import { expressAdaptiveConcurrency, distributedAdaptiveConcurrency, RedisConcurrencyCoordinator } from "throttlekit";

// `client` (presumably a connected Redis client) and `app` (the Express app) are assumed to exist already.
const guard = distributedAdaptiveConcurrency({
  coordinator: new RedisConcurrencyCoordinator({ client, aggregate: "median" }),
  nodeId: process.env.HOSTNAME!,
  key: "inference-cluster",
});
app.use(expressAdaptiveConcurrency({ guard })); // same adapter, now a fleet-wide bound
```

See Distributed adaptive concurrency. Behind a ThrottleKit server the same fleet-shared ceiling is reachable over the existing Admit RPC with no client change — see Scaling & the Fleet.
