Changelog
All notable changes to pounce will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[Unreleased]
Added
- (Reserved for future changes)
0.2.0 — 2026-02-13
Security hardening, production features, observability, and developer experience.
Added
Security Hardening
- Proxy header validation: _proxy.py validates and applies X-Forwarded-For,
  X-Forwarded-Proto, and X-Forwarded-Host headers only from trusted peers
  (ServerConfig.trusted_hosts). Untrusted proxy headers are silently stripped to
  prevent IP spoofing. Supports H1 and H2 bridges
- CRLF response header sanitization: _sanitize_headers() in the ASGI bridge
  strips \r and \n characters from all response header names and values before
  serialization. Prevents header injection attacks from ASGI apps. Active on both
  HTTP/1.1 and HTTP/2
- Slowloris protection: header_timeout (default: 10s) limits the time to receive
  complete request headers. Uses a separate timeout from keep_alive_timeout for the
  initial header read vs inter-request idle period. CLI: --header-timeout
- Narrowed exception handling: Replaced broad except Exception and
  contextlib.suppress(Exception) blocks in worker with specific exception types
  (OSError, ConnectionError, h11.LocalProtocolError). Prevents silent swallowing
  of unexpected errors
- HEAD compression guard: Compression is disabled for HEAD responses to preserve
  the Content-Length header (compressor would mismatch sizes)
- Bodyless response guard: Compression is disabled for 204 and 304 responses
  (RFC 9110 §6.4.1) to prevent compressor flush bytes from producing a body
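The CRLF sanitization entry above can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not pounce's _sanitize_headers(); it assumes headers arrive as ASGI-style (name, value) byte pairs.

```python
# Hypothetical sketch: strip CR and LF from ASGI-style response headers.
# The name and signature are assumptions, not pounce's actual helper.
def sanitize_headers(headers: list[tuple[bytes, bytes]]) -> list[tuple[bytes, bytes]]:
    def strip_crlf(raw: bytes) -> bytes:
        # Removing both bytes prevents a value from starting a new header line.
        return raw.replace(b"\r", b"").replace(b"\n", b"")

    return [(strip_crlf(name), strip_crlf(value)) for name, value in headers]


# An app that echoes user input into a header cannot inject a second header.
cleaned = sanitize_headers([(b"x-user", b"alice\r\nset-cookie: admin=1")])
```

After sanitization the injected text stays inside the single x-user value instead of becoming its own header line.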
Network Completeness
- Unix domain socket support: ServerConfig.uds for UDS binding, with stale
  socket cleanup on startup and shutdown. All workers share a single UDS fd.
  CLI: --uds /run/pounce.sock. net/listener.py implements _bind_unix_socket()
  and cleanup_unix_socket()
- Streaming body size enforcement: max_request_size is now enforced for chunked
  and streaming request bodies (not just Content-Length). Applies to both H1 (via
  _run_with_body_reader) and H2 (per-stream byte tracking)
- UDS peername handling: Worker correctly handles Unix socket peername (string path
  or empty) instead of assuming a (host, port) tuple
- 503 backpressure response: When max_connections is reached, new connections
  receive 503 Service Unavailable with Retry-After: 5 instead of silent close
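Streaming body size enforcement amounts to counting bytes as chunks arrive rather than trusting Content-Length. The sketch below shows the idea under stated assumptions: BodySizeGuard and RequestTooLarge are hypothetical names, and pounce's real per-stream tracking may differ.

```python
# Hypothetical sketch of per-request byte tracking for streamed bodies.
class RequestTooLarge(Exception):
    """Raised when the running total exceeds the configured limit."""


class BodySizeGuard:
    def __init__(self, max_request_size: int) -> None:
        self.max_request_size = max_request_size
        self.received = 0

    def feed(self, chunk: bytes) -> bytes:
        # Count every chunk, so chunked and streaming uploads are covered too.
        self.received += len(chunk)
        if self.received > self.max_request_size:
            raise RequestTooLarge(f"body exceeds {self.max_request_size} bytes")
        return chunk


guard = BodySizeGuard(max_request_size=10)
guard.feed(b"12345")
guard.feed(b"12345")  # exactly at the limit, still accepted
try:
    guard.feed(b"x")  # one byte over the limit
    rejected = False
except RequestTooLarge:
    rejected = True
```

A server would typically translate the exception into a 413 response and stop reading the body.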
Observability
- Request ID generation: _request_id.py generates UUID4 hex IDs for every
  request. Trusted proxies' X-Request-ID headers are honoured. IDs are injected into
  the ASGI scope (scope["extensions"]["request_id"]), response headers (X-Request-ID),
  and access logs (text and JSON). Works across H1 and H2
- Built-in health endpoint: _health.py responds to GET at
  ServerConfig.health_check_path (e.g. /health) before ASGI dispatch. Returns JSON
  with status, uptime, worker ID, and active connections. Excluded from access logs.
  CLI: --health-check-path /health
- Prometheus metrics: metrics.py provides PrometheusCollector implementing
  LifecycleCollector. Tracks http_requests_total, http_request_duration_seconds
  (histogram), http_connections_active, http_requests_in_flight, and
  http_bytes_sent_total. Thread-safe via threading.Lock. Export in Prometheus text
  exposition format via collector.export()
- Built-in /metrics endpoint: Configurable Prometheus scrape endpoint
  (ServerConfig.metrics_path, default /metrics) with zero external dependencies
- Access log request IDs: Text format appends [<12-char-id>]; JSON format
  includes full request_id field
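The request ID rule (honour X-Request-ID only from trusted proxies, otherwise generate a UUID4 hex ID) can be sketched as follows. The function name and the peer_trusted flag are hypothetical; how pounce decides that a peer is trusted is not shown here.

```python
import uuid


# Hypothetical sketch: pick the request ID for one request.
def resolve_request_id(headers: list[tuple[bytes, bytes]], peer_trusted: bool) -> str:
    if peer_trusted:
        for name, value in headers:
            # Header names are compared case-insensitively.
            if name.lower() == b"x-request-id" and value:
                return value.decode("latin-1")
    # Untrusted peer or no header: generate a fresh 32-character hex ID.
    return uuid.uuid4().hex


incoming = [(b"X-Request-ID", b"abc123")]
from_proxy = resolve_request_id(incoming, peer_trusted=True)
from_stranger = resolve_request_id(incoming, peer_trusted=False)
```

Ignoring the header from untrusted peers keeps clients from forging IDs that later appear in logs and traces.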
Static File Serving
- _static.py: Zero-copy sendfile, pre-compressed files (.gz, .br, .zst),
  ETags, and range requests. Configurable via ServerConfig.static_files,
  static_precompressed, static_cache_control
Middleware & Extensibility
- Server-level middleware: ServerConfig.middleware accepts a list of ASGI3
  middleware callables applied before the app
- ASGI lifespan state sharing: Lifespan state propagated to worker scopes for
  spec-compliant shared app state
Graceful Operations
- Zero-downtime graceful reload: SIGHUP triggers rolling worker restart with
  connection draining. reload_timeout configurable
- Connection draining: Enhanced graceful shutdown with shutdown_timeout for
  Kubernetes and orchestration platforms
WebSocket & Protocol
- WebSocket permessage-deflate: RFC 7692 compression for WebSocket connections.
  ServerConfig.websocket_compression (default: True)
Developer Experience
- Development error pages: _debug.py provides rich HTML tracebacks with syntax
  highlighting (Rosettes), local variables, and request context. Production-safe
  (debug=False returns plain 500)
- Hot reload utilities: _hot_reload.py for in-process module reimport without
  full process restart. ServerConfig.reload_include, reload_dirs for configurable
  file watching
Production Integrations
- OpenTelemetry: _otel.py native distributed tracing with OTLP export.
  ServerConfig.otel_endpoint, otel_service_name
- Sentry: _sentry.py optional error tracking. sentry_dsn, sentry_environment,
  sentry_release
- Per-IP rate limiting: _rate_limiter.py token bucket algorithm.
  rate_limit_enabled, rate_limit_requests_per_second, rate_limit_burst
- Request queueing: _request_queue.py bounded queue with load shedding (503).
  request_queue_enabled, request_queue_max_depth
Lifecycle & Logging
- Structured lifecycle logging: lifecycle_logging config for connection/request
  events with correlation IDs. log_slow_requests_threshold for slow request detection
H1/H2 Feature Parity
- All security and observability features wired for both HTTP/1.1 and HTTP/2 handlers
Tests
- New test modules: test_request_id, test_health, test_proxy, test_security,
  test_metrics, test_metrics_endpoint, test_h2_bridge, test_listener_uds,
  test_bridge, test_static, test_middleware, test_graceful_reload, test_hot_reload,
  test_connection_draining, test_debug_error_pages, test_lifecycle_logging,
  test_lifespan_state, test_otel, test_rate_limiter, test_request_queue,
  test_sentry, test_websocket_compression
- Integration tests for static files, WebSocket compression, lifespan state
0.1.0 — 2026-02-09
Initial release of Pounce — a free-threading-native ASGI server for Python 3.14t.
Added
Configurable Reload Watch
- ServerConfig.reload_include: extra file extensions to watch beyond the built-in set
  (.py, .yaml, .toml, etc.). Pass a tuple of extensions like (".html", ".css", ".md")
  to trigger reloads on non-Python file changes
- ServerConfig.reload_dirs: extra directories to watch alongside the current working
  directory. Useful when templates or static assets live outside the project root
- CLI flags: --reload-include ".html,.css,.md" and --reload-dir ./templates (repeatable)
- Extensions without a leading dot are auto-prefixed (e.g. "html" becomes ".html")
- _reload.py functions (_should_watch, _snapshot, detect_changes, watch_for_changes)
  accept an extensions/extra_extensions parameter for runtime customization
- parse_extensions() and parse_dirs() helpers extracted in _cli.py for testability
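The auto-prefix rule for --reload-include can be sketched as below. The changelog names a parse_extensions() helper but gives no signature, so the argument and return type here are assumptions.

```python
# Hypothetical sketch of parsing a comma-separated extension list.
# The real parse_extensions() in _cli.py may have a different signature.
def parse_extensions(raw: str) -> tuple[str, ...]:
    extensions = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        # Add the leading dot when the user omitted it.
        extensions.append(part if part.startswith(".") else "." + part)
    return tuple(extensions)


parsed = parse_extensions(".html,css, md,")
```

Returning a tuple matches the tuple form that ServerConfig.reload_include is described as accepting.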
Hot Reload with Module Reimport
- reimport_app() in _importer.py clears project-local modules from sys.modules,
  deletes stale .pyc bytecode caches, and calls importlib.invalidate_caches() before
  reimporting, so code changes on disk take effect without a full process restart
- Single-worker and multi-worker reload paths both reimport when app_path is provided
- Server and Supervisor accept app_path: str | None to enable reimport on reload
- _clear_local_modules() resolves paths with os.path.realpath() for macOS symlink safety
Connection Lifecycle Events
- Structured, immutable event types for every stage of a connection's lifecycle:
  ConnectionOpened, RequestStarted, ResponseCompleted, RequestFailed,
  ConnectionClosed. All are frozen dataclasses with nanosecond monotonic timestamps
- LifecycleCollector protocol: any object with a record(event) method can receive
  lifecycle events. NoopCollector (default) discards events with zero overhead.
  BufferedCollector stores events in a thread-safe deque for inspection
- Server and Supervisor now accept an optional lifecycle_collector parameter and
  forward it to every Worker they spawn. This enables external systems (e.g. Purr's
  StackCollector) to receive connection-level telemetry from all workers through a
  single collector instance
- Events are designed for aggregation and observability, not logging. Use them to build
  latency distributions, connection counts, error rate dashboards, or full-stack event
  traces
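The collector pattern described above can be sketched in a few lines. The class names follow the changelog, but the event fields, the maxlen argument, and the snapshot() method are assumptions made for illustration.

```python
import threading
import time
from collections import deque
from dataclasses import FrozenInstanceError, dataclass
from typing import Protocol


@dataclass(frozen=True)
class ConnectionOpened:
    # Field names are hypothetical; the changelog only promises frozen
    # dataclasses with nanosecond monotonic timestamps.
    connection_id: int
    timestamp_ns: int


class LifecycleCollector(Protocol):
    def record(self, event: object) -> None: ...


class BufferedCollector:
    """Stores recent events in a bounded deque guarded by a lock."""

    def __init__(self, maxlen: int = 1000) -> None:
        self._events: deque = deque(maxlen=maxlen)
        self._lock = threading.Lock()

    def record(self, event: object) -> None:
        with self._lock:
            self._events.append(event)

    def snapshot(self) -> list:
        with self._lock:
            return list(self._events)


collector: LifecycleCollector = BufferedCollector()
event = ConnectionOpened(connection_id=1, timestamp_ns=time.monotonic_ns())
collector.record(event)

try:
    event.connection_id = 2  # frozen dataclasses reject mutation
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the protocol is structural, any object with a compatible record() method can be passed as the collector without inheriting from a base class.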
Per-Worker Lifecycle Scopes
- Worker sends pounce.worker.startup scope to the ASGI app before accepting connections,
  and pounce.worker.shutdown after closing. Both run on the worker's own event loop so
  async resources (httpx clients, DB pools) bind to the correct loop
- Timeout protection: 30s startup, 10s shutdown. Apps that don't recognise the scope type
  time out gracefully instead of hanging
- _worker_lifecycle_receive returns http.disconnect immediately so apps that route
  unknown scopes to their HTTP handler unblock quickly
- If startup hook fails, the worker does not accept connections (prevents serving with
  uninitialised state); shutdown hook failure is non-fatal
- tests/unit/test_worker_lifecycle.py: 6 tests covering startup/shutdown delivery,
  ordering, startup failure, shutdown failure, and unknown-scope handling
ASGI 3.0 Compliance Suite
- tests/integration/test_asgi_compliance.py: 41 tests validating pounce against the
  ASGI 3.0 HTTP Connection Scope and Lifespan specs, covering scope completeness, all
  HTTP methods, header lowercasing, path decoding, query strings, request body protocol,
  response streaming, keep-alive, Connection: close, error handling, lifespan lifecycle
Phase 4: It's Fast — performance optimization, correctness fixes, benchmark infrastructure.
POST Request Body Reading (Correctness Fix)
- Worker now reads POST/PUT/PATCH request bodies correctly. Restructured _handle_request
  to collect body events from the initial h11 parse batch and, for bodies spanning multiple
  socket reads, to run a concurrent body reader task alongside the ASGI app
- Removed xfail markers from test_post_body_echo and test_large_body
- Added tests for PUT body, streaming multi-chunk body
App Factory Support
- pounce "myapp:create_app()" works end-to-end. The importer already supported factory
  detection; CLI, integration tests, and example app now verify the full pipeline
- Added examples/factory_app.py demonstrating the factory pattern
Optional httptools Backend (pounce[fast])
- protocols/h1_httptools.py: C-accelerated HTTP/1.1 parser implementing the same
  ProtocolHandler interface as H1Protocol (h11). Uses httptools callbacks for parsing
  and hand-crafted response serialization for speed
- Worker auto-detects httptools at import time; pip install pounce[fast] is the opt-in
- Full unit test suite for the httptools backend (skips when not installed)
- pyproject.toml adds fast optional extra: httptools>=0.6
Benchmark Suite
- benchmarks/run_benchmark.py: reproducible benchmark runner that starts pounce, drives
  load with wrk or hey, captures results as structured JSON, prints markdown summary table
- Comparison mode: --compare runs the same workload against uvicorn
- Workloads: hello-world (overhead), JSON (serialize), POST echo (body reading)
- Dedicated benchmark apps in benchmarks/apps/
Profiling Infrastructure
- benchmarks/profile_hotpath.sh: wraps py-spy for flame graph generation under load
- benchmarks/profile_memory.py: RSS tracking with optional tracemalloc integration
Hot-Path Optimizations
- Pre-computed ASGI spec dict constant (avoid per-request dict allocation)
- Bodyless fast-path receive: skip asyncio.Queue for GET/HEAD requests
- Write coalescing: head + first body chunk combined into single write for responses < 16KB
- Single-pass header lookup for compression negotiation
- Skip empty body writes (avoid zero-length syscalls)
CI
- .github/workflows/ci.yml: GitHub Actions pipeline with lint (ruff check + format), type
  check (ty), and tests on a 2x2 matrix (ubuntu/macos x Python 3.14/3.14t). Includes GIL
  status verification on free-threaded builds. 15-minute timeout per the py-free-threading
  CI guide
Changed
- Removed from __future__ import annotations from all 43 source, test, example, and
  benchmark files. Not needed on Python 3.14, where annotations are evaluated lazily by
  default (PEP 649), which makes the PEP 563 future import redundant
- Registered timeout pytest marker in pyproject.toml (silences 6 warnings)
Phase 3: It's Complete — full protocol support, TLS, WebSocket, HTTP/2, modern HTTP features.
TLS Termination
- net/tls.py: create_tls_context() for stdlib ssl.SSLContext with secure defaults
  (TLSv1.2+, no compression), ALPN protocol advertisement (h2, http/1.1), optional
  truststore integration for system certificate stores
- is_tls_configured() helper for conditional context creation
- CLI flags: --ssl-certfile, --ssl-keyfile
- TLSError added to error hierarchy
- Startup banner shows tls: enabled when active
WebSocket Protocol
- protocols/ws.py: WSProtocol sans-I/O wrapper around wsproto for server-side
  WebSocket framing. Manual 101 Switching Protocols HTTP response construction
  (wsproto 1.x expects HTTP upgrade handled externally)
- build_ws_accept_key() for RFC 6455 Sec-WebSocket-Accept computation
- build_101_response() for raw HTTP upgrade response bytes
- asgi/ws_bridge.py: build_ws_scope(), create_ws_receive(), create_ws_send()
  for full ASGI WebSocket lifecycle (websocket.connect, websocket.accept,
  websocket.send, websocket.close)
- New event types: WebSocketConnected, WebSocketDataReceived, WebSocketDisconnected
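The Sec-WebSocket-Accept computation is fixed by RFC 6455: append a well-known GUID to the client's Sec-WebSocket-Key, hash with SHA-1, and base64-encode the digest. The sketch below reuses the name build_ws_accept_key() from the entry above, but the signature is an assumption and the body is not taken from pounce.

```python
import base64
import hashlib

# GUID defined by RFC 6455 for the opening handshake.
_WS_GUID = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


# Sketch of the RFC 6455 accept-key derivation; signature is assumed.
def build_ws_accept_key(client_key: str) -> str:
    digest = hashlib.sha1(client_key.encode("ascii") + _WS_GUID).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from the RFC 6455 handshake example.
accept = build_ws_accept_key("dGhlIHNhbXBsZSBub25jZQ==")
```

For the RFC's sample key the result is s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which is the value the RFC lists for the server's response.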
HTTP/2 Protocol
- protocols/h2.py: H2Connection sans-I/O wrapper around the h2 library. Stream
  multiplexing, per-stream event types (H2RequestReceived, H2BodyReceived,
  H2StreamReset, H2GoAway, H2WindowUpdated, H2WebSocketRequest), flow control,
  GOAWAY handling
- asgi/h2_bridge.py: build_h2_scope(), create_h2_receive(), create_h2_send()
  for per-stream ASGI dispatch with concurrent stream tasks
- ALPN negotiation in worker: selected_alpn_protocol() == "h2" → H2 connection handler
- SETTINGS_ENABLE_CONNECT_PROTOCOL for RFC 8441 WebSocket over HTTP/2
Protocol Negotiation
- Worker dynamically branches connections based on ALPN result (H2) or HTTP/1.1 upgrade
  headers (WebSocket), falling through to standard HTTP/1.1 keep-alive loop
- _is_websocket_upgrade() helper: detects Connection: Upgrade + Upgrade: websocket
WebSocket over HTTP/2 (RFC 8441)
- Extended CONNECT detection in H2Connection.receive_data(): :method = CONNECT +
  :protocol = websocket emits H2WebSocketRequest event
- _handle_h2_websocket_stream() in worker manages WS framing within H2 streams
Priority Signals (RFC 9218)
- _priority.py: parse_priority() for Priority header parsing (urgency 0-7,
  incremental boolean), StreamPriority dataclass, PriorityScheduler min-heap for
  urgency-based DATA frame scheduling
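A rough sketch of both pieces follows: parsing the RFC 9218 Priority header (u for urgency 0-7 with default 3, i for the incremental flag with default false) and ordering streams with a min-heap. The parser is deliberately simplified and is not a full Structured Fields implementation; the signatures and the tuple return value are assumptions, not pounce's API.

```python
import heapq
import itertools


# Simplified sketch of Priority header parsing (not a full RFC 8941 parser).
def parse_priority(value: str) -> tuple[int, bool]:
    urgency, incremental = 3, False  # RFC 9218 defaults
    for item in value.split(","):
        key, _, raw = item.strip().partition("=")
        raw = raw.split(";", 1)[0].strip()  # ignore any parameters
        if key == "u" and len(raw) == 1 and raw in "01234567":
            urgency = int(raw)
        elif key == "i":
            # A bare "i" means true; "?1" and "?0" are explicit booleans.
            incremental = raw in ("", "?1")
    return urgency, incremental


class PriorityScheduler:
    """Min-heap keyed on urgency; lower urgency values are served first."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, int]] = []
        self._order = itertools.count()  # FIFO tie-break within one urgency

    def push(self, stream_id: int, urgency: int) -> None:
        heapq.heappush(self._heap, (urgency, next(self._order), stream_id))

    def pop(self) -> int:
        return heapq.heappop(self._heap)[2]


scheduler = PriorityScheduler()
scheduler.push(stream_id=1, urgency=parse_priority("")[0])       # default urgency 3
scheduler.push(stream_id=3, urgency=parse_priority("u=0, i")[0])  # most urgent
scheduler.push(stream_id=5, urgency=parse_priority("u=3")[0])
send_order = [scheduler.pop(), scheduler.pop(), scheduler.pop()]
```

The monotonically increasing counter keeps streams with equal urgency in arrival order, which avoids comparing stream IDs and prevents starvation among peers.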
103 Early Hints
- H2 ASGI bridge: status == 103 in http.response.start sends informational headers
  without marking response as started (allows multiple early hints before final response)
- H1 ASGI bridge: silently skips status == 103 (browser support inconsistent over H1)
Dev Reload
- _reload.py: file watcher with polling (_snapshot(), detect_changes(),
  watch_for_changes()) with configurable interval and stop event
- Excludes __pycache__, .git, .venv, node_modules, etc.
- Watches .py, .yaml, .toml, .json, .cfg, .ini extensions
- Single-worker mode: restart loop (shutdown → recreate socket → restart asyncio)
- Multi-worker mode: Supervisor.restart_workers() drains all workers, clears shutdown
  event, respawns fresh workers
- CLI flag: --reload
- ReloadError added to error hierarchy
- Startup banner shows reload: enabled when active
Keep-Alive Tuning
- max_requests_per_connection config field (0 = unlimited): enforced in the HTTP/1.1
  keep-alive loop, closes connection after N requests
- CLI flags: --keep-alive-timeout, --max-requests-per-connection
- Config validation: keep_alive_timeout > 0, max_requests_per_connection >= 0
- Startup banner shows non-default keep-alive and max-requests values
Package Wiring
- protocols/__init__.py: re-exports WSProtocol, H2Connection, all H2 event types
- asgi/__init__.py: re-exports WS and H2 bridge functions
- net/__init__.py: re-exports create_tls_context, is_tls_configured
Tests (408 passing — unit + integration + compliance)
- TLS: context creation, secure defaults, ALPN, missing cert handling, truststore
- WebSocket: WSProtocol framing, build_ws_accept_key, build_101_response,
  build_ws_scope, _is_websocket_upgrade header detection
- HTTP/2: H2Connection init, request/response lifecycle, multiplexed streams,
  stream reset, GOAWAY
- Priority Signals: parse_priority, PriorityScheduler urgency ordering
- Dev Reload: _snapshot, detect_changes, file creation/modification/deletion,
  exclude patterns
- Compression: updated for Brotli exclusion (GIL-incompatible on 3.14t)
- Config: validation for keep_alive_timeout and max_requests_per_connection
- Supervisor: restart_workers() event clearing and worker joining
- CLI: Phase 3 flag parsing (TLS, reload, keep-alive, max-requests)
- Package exports: Phase 3 protocol, ASGI, net, and error exports
- Error hierarchy: TLSError and ReloadError
Phase 2: It Scales — multi-worker mode with automatic GIL detection.
Runtime Detection
- _runtime.py: is_gil_enabled() wrapping sys._is_gil_enabled() with safe fallback
  for Python < 3.13; detect_worker_mode() returning "thread" (nogil) or "process"
  (GIL); default_worker_count() from os.cpu_count()
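The runtime detection helpers can be approximated as follows. sys._is_gil_enabled() exists on Python 3.13 and later; the fallback to True on older interpreters and the fallback of 1 when os.cpu_count() returns None are assumptions about what "safe fallback" means here.

```python
import os
import sys


# Sketch of GIL detection; the function names follow the changelog but the
# bodies are assumptions, not pounce's _runtime.py.
def is_gil_enabled() -> bool:
    probe = getattr(sys, "_is_gil_enabled", None)
    # Interpreters older than 3.13 lack the probe and always have a GIL.
    return True if probe is None else bool(probe())


def detect_worker_mode() -> str:
    # Threads only scale across cores when the GIL is disabled.
    return "process" if is_gil_enabled() else "thread"


def default_worker_count() -> int:
    # os.cpu_count() may return None on unusual platforms.
    return os.cpu_count() or 1


mode = detect_worker_mode()
workers = default_worker_count()
```

On a standard CPython build this selects process workers, and on a free-threaded build running with the GIL disabled it selects threads.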
Supervisor
- supervisor.py: Supervisor class that spawns N workers as threading.Thread (on
  nogil / 3.14t) or multiprocessing.Process (on GIL builds). Health monitoring via
  watchdog loop (1s interval), crash detection and automatic restart with budget (max 5
  restarts per 60s window), graceful shutdown coordination via threading.Event, per-worker
  connection limit calculation, SIGINT/SIGTERM signal forwarding
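The restart budget (at most 5 restarts per 60-second window) is a sliding-window counter. A minimal sketch, assuming a hypothetical RestartBudget class and explicit timestamps so the behaviour is deterministic:

```python
from collections import deque


# Hypothetical sketch of a sliding-window restart budget.
class RestartBudget:
    def __init__(self, max_restarts: int = 5, window: float = 60.0) -> None:
        self.max_restarts = max_restarts
        self.window = window
        self._stamps: deque[float] = deque()

    def try_consume(self, now: float) -> bool:
        # Prune restarts that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_restarts:
            return False  # budget exhausted: stop respawning
        self._stamps.append(now)
        return True


budget = RestartBudget()
first_five = [budget.try_consume(float(t)) for t in range(5)]  # t = 0..4
sixth = budget.try_consume(5.0)      # still inside the window of the first five
after_window = budget.try_consume(60.0)  # the restart at t=0 has expired
```

A supervisor would typically raise an error such as SupervisorError when the budget is exhausted rather than restart a crash-looping worker indefinitely.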
Worker Enhancements
- External threading.Event shutdown bridge: supervisor sets a threading event, the
  worker's _bridge_shutdown task polls it every 250ms and bridges to asyncio via
  loop.call_soon_threadsafe
- Per-worker connection backpressure: rejects connections when at capacity
- Worker ID for log differentiation (pounce.worker.0, pounce.worker.1, etc.)
- Thread-safe shutdown() method using call_soon_threadsafe
Network
- create_listeners(config, count): multi-socket creation strategy. Per-worker
  independent sockets with SO_REUSEPORT on Linux (kernel-level distribution), shared
  socket fallback on macOS (single fd, all workers accept)
Server Orchestration
- Single-worker fast path (workers=1): skips supervisor entirely, no overhead
- Multi-worker path delegates to Supervisor for lifecycle management
- ASGI lifespan runs once in main thread before workers spawn
- Startup banner now shows GIL status (nogil/GIL) and worker mode
- Socket deduplication on cleanup for shared-fd safety
Configuration
- workers=0 auto-detect semantics via resolve_workers() (defaults to os.cpu_count())
- __post_init__ validation for workers (>= 0) and port (0-65535)
- CLI --workers 0 for auto-detect with updated help text
Error Hierarchy
- SupervisorError: worker spawn failures, crash-restart exhaustion
- WorkerError: worker-level failures reported to supervisor
Benchmarks
- benchmarks/hello_app.py: minimal ASGI app for throughput benchmarking
- benchmarks/sse_app.py: SSE streaming app for stress testing
- benchmarks/test_throughput.py: automated throughput scaling benchmark (single-worker
  baseline ~6-7k req/s, multi-worker validated via shared-socket workers)
- benchmarks/test_memory.py: thread vs process RSS comparison (thread workers use
  shared interpreter, ~3MB delta for 4 workers)
- benchmarks/test_sse_stress.py: SSE stress test with 100 concurrent streams held 10s,
  ~20k events delivered, RSS growth < 3MB (no memory leak)
- benchmarks/test_chirp_compat.py: chirp App compatibility verification (chirp hello-world
  served through pounce Worker without modification)
- benchmarks/README.md: instructions for wrk/hey benchmarking
Tests (253 + 7 benchmark tests, all passing)
- Unit tests for runtime detection: GIL state, worker mode, CPU count fallback
- Unit tests for supervisor: init, mode detection, socket validation, shutdown, spawn/stop,
  respawn budget, restart window pruning, per-worker connection limits
- Unit tests for listener multi-socket: create_listeners, strategy detection, SO_REUSEPORT
  vs shared, count validation
- Unit tests for worker: external shutdown bridge, internal shutdown, worker ID, backpressure
- Integration tests for multi-worker: concurrent requests across workers, graceful shutdown,
  worker liveness, supervisor mode reporting
- Integration tests for server: _close_sockets deduplication, shared-fd handling
- Updated conftest and test_server to use explicit worker_id=0
- Updated package export tests for Phase 2 modules
Phase 1: It Runs — the minimal viable ASGI server.
Primitives
- _errors.py: PounceError hierarchy with HTTP status code mapping for ParseError
  (400), TimeoutError (408), LimitError (413/431), AppError (500), LifespanError
  (500)
- _timing.py: monotonic_ns(), elapsed_ms() clock utilities; ServerTiming builder
  for the Server-Timing HTTP header
- _importer.py: resolve "module:attribute" and "module:factory()" strings to ASGI
  callables with clear error messages
- _compression.py: Accept-Encoding negotiation (zstd > gzip > identity, respects
  q-values), per-request ZstdCompressor (stdlib compression.zstd) and GzipCompressor
  (stdlib zlib) instances
- _types.py: ASGI 3.0 type aliases (Scope, Receive, Send, ASGIApp)
- config.py: ServerConfig frozen dataclass with bind address, timeouts, limits,
  compression, root_path, server_timing, access log, and h11 tuning fields
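The Accept-Encoding negotiation rule (zstd > gzip > identity, respecting q-values) could work roughly as sketched below. This is one plausible reading under stated assumptions: the client's q-value wins, and the server preference order only breaks ties. The function name and the exact tie-breaking are hypothetical.

```python
# Hypothetical sketch of Accept-Encoding negotiation; not pounce's code.
def negotiate_encoding(
    accept_encoding: str, supported: tuple[str, ...] = ("zstd", "gzip")
) -> str:
    weights: dict[str, float] = {}
    for item in accept_encoding.split(","):
        token, *params = item.strip().split(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0  # codings without an explicit q-value default to 1
        for param in params:
            name, _, raw = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        weights[token] = q

    best, best_q = "identity", 0.0
    for encoding in supported:  # iterate in server preference order
        q = weights.get(encoding, weights.get("*", 0.0))
        if q > best_q:  # strict comparison keeps the earlier entry on ties
            best, best_q = encoding, q
    return best


browser = negotiate_encoding("gzip, deflate, br, zstd")
prefers_gzip = negotiate_encoding("gzip;q=1.0, zstd;q=0.5")
refused = negotiate_encoding("gzip;q=0")
```

Falling back to identity whenever no supported coding has a positive weight means a response is always deliverable, even to clients that send no Accept-Encoding header.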
Protocol Layer
- protocols/_base.py: ProtocolHandler runtime-checkable Protocol; typed event
  dataclasses RequestReceived, BodyReceived, ConnectionClosed, Upgraded;
  ProtocolEvent union type
- protocols/h1.py: sans-I/O HTTP/1.1 handler wrapping h11 (request parsing, response
  serialization, keep-alive cycling, malformed-input detection)
ASGI Bridge
- asgi/bridge.py: build_scope() (HTTP scope from protocol events + config),
  create_receive() (async body stream from queue), create_send() (streaming-first
  writes with optional compression and Server-Timing injection)
- asgi/lifespan.py: run_lifespan() async context manager with startup/shutdown events,
  failure handling, timeout, graceful no-lifespan fallback
Network and Worker
- net/listener.py: socket creation with SO_REUSEADDR/SO_REUSEPORT, non-blocking
  bind, clear error messages for EADDRINUSE/EACCES
- logging.py: stdlib logging configuration and structured access log format
  {client} - "{method} {path} HTTP/1.1" {status} {bytes} {duration}ms
- worker.py: asyncio event loop accepting connections through the full pipeline
  (parse → scope → negotiate compression → ASGI app → response → access log). Keep-alive
  cycling, error responses (400/500), configurable timeouts
Server and CLI
- server.py: full lifecycle orchestration (CONFIG → BIND → LIFESPAN → SERVE → SHUTDOWN).
  Signal handling (SIGINT/SIGTERM), startup banner with version/URL/workers/features
- _cli.py: pounce myapp:app CLI via argparse with --host, --port, --workers,
  --log-level, --root-path, --no-compression, --server-timing, --no-access-log
- __init__.py: public API (pounce.run(), ServerConfig, ASGI type re-exports)
Package Wiring
- protocols/__init__.py: re-exports H1Protocol, ProtocolHandler, all event types
- asgi/__init__.py: re-exports build_scope, create_receive, create_send,
  run_lifespan
- net/__init__.py: re-exports create_listener
- Top-level __init__.py: re-exports ASGIApp, Scope, Receive, Send
Tests (188 passing)
- Unit tests for all primitives: errors, timing, importer, protocol events, config
- Unit tests for H1 protocol: parsing, serialization, keep-alive, malformed input
- Unit tests for compression: negotiation, roundtrip, browser Accept-Encoding strings
- Unit tests for ASGI bridge: scope construction, streaming send, compression/timing injection
- Unit tests for lifespan: happy path, failure, no-lifespan apps, shutdown timeout
- Unit tests for listener: socket properties, non-blocking, reuseaddr
- Unit tests for logging: format correctness
- Unit tests for package exports: all __init__.py re-exports verified
- Integration tests for worker: hello world, echo, streaming, error handling, malformed input
- Integration tests for server: start/respond lifecycle, lifespan events
- Integration tests for CLI: parser defaults/overrides, invalid app handling, public API imports
- Shared conftest.py with lifespan-aware test apps and start_worker/send_raw_request helpers
Infrastructure
- Project scaffolding: pyproject.toml with ruff, ty, pytest, poe task runner
- py.typed PEP 561 marker
- _Py_mod_gil = 0 free-threading declaration