
Speed up multipart header parsing and callback dispatch #295

Merged: Kludex merged 1 commit into main from perf/hot-path-optimizations on Jun 4, 2026

@Kludex (Owner) commented Jun 4, 2026

Summary

Two hot-path optimizations to MultipartParser, both behaviour-preserving for valid input:

  1. find-based header parsing. HEADER_FIELD and HEADER_VALUE now jump straight to the delimiter with data.find() and validate the whole field-name span at once via bytes.translate(None, TOKEN_CHARS), instead of scanning byte by byte. This mirrors how PART_DATA and the querystring parser already use data.find.
  2. Dropped per-callback logger.debug. BaseParser.callback was issuing two logger.debug(...) calls on every single callback (per part-data chunk, per header, per field), which dominated dispatch cost even when logging was disabled.
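For readers who want to see the shape of the first optimization, here is a minimal sketch of find-plus-translate header-name parsing. It is not the code from this PR: `parse_header_name` is a hypothetical helper, and the `TOKEN_CHARS` value below assumes the RFC 7230 token alphabet, which may differ from the constant the parser actually uses.

```python
import string

# Assumption: RFC 7230 "token" characters; the parser's real TOKEN_CHARS may differ.
TOKEN_CHARS = (string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~").encode("ascii")


def parse_header_name(data: bytes, start: int = 0):
    """Return (name, index_after_colon), or None if no ':' is in this chunk yet."""
    # Jump straight to the delimiter instead of inspecting each byte in Python.
    end = data.find(b":", start)
    if end == -1:
        return None  # a real parser would buffer and wait for more data

    name = data[start:end]
    # translate(None, delete) drops every allowed byte; anything left is invalid.
    leftover = name.translate(None, TOKEN_CHARS)
    if leftover:
        raise ValueError(f"invalid character in header name: {leftover[:1]!r}")
    return name, end + 1
```

Both `bytes.find` and `bytes.translate` run in C, so the per-byte Python loop disappears; the cost is one temporary slice per header name.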

A couple of incidental cleanups: boundary_length = len(boundary) is hoisted out of the per-iteration PART_DATA path, and TOKEN_CHARS is exposed as bytes (with TOKEN_CHARS_SET derived from it) for the bulk validation.

Benchmarks

Measured against main on CPython 3.13 and 3.14 (best-of-N, ns/op), using the existing tests/test_benchmarks.py workloads:

| Benchmark | Python 3.13 | Python 3.14 |
|---|---|---|
| large_form | ~3.1x faster | ~2.6x faster |
| simple_form | ~2.2x faster | ~1.95x faster |
| querystring | ~2.4x faster | ~1.4x faster |
| file_upload | ~1.1x faster | ~1.04x faster |
| worstcase | ~1.1x faster | ~1.03x faster |

Header-heavy and small forms benefit most; body-dominated cases (file_upload, worstcase) already used find for boundary scanning, so they only pick up the callback-dispatch savings.
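To get a feel for the callback-dispatch savings in isolation, a rough best-of-N micro-benchmark can be sketched as below. This is an illustration only, not the harness in tests/test_benchmarks.py: the logger name, `on_event`, and both `dispatch_*` functions are hypothetical stand-ins for a callback dispatcher with and without a disabled `logger.debug` call.

```python
import logging
import timeit

logger = logging.getLogger("dispatch_sketch")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG is disabled, as in most deployments


def on_event(data: bytes, start: int, end: int) -> None:
    """Stand-in for a user callback; does nothing."""


def dispatch_with_debug(name: str, data: bytes, start: int, end: int) -> None:
    # Even when disabled, this call still pays for the level check and arguments.
    logger.debug("Calling %s with data[%d:%d]", name, start, end)
    on_event(data, start, end)


def dispatch_plain(name: str, data: bytes, start: int, end: int) -> None:
    on_event(data, start, end)


def best_of_ns(func, repeat: int = 5, number: int = 100_000) -> float:
    """Best-of-N wall time per call, in nanoseconds."""
    payload = b"x" * 64
    timings = timeit.repeat(
        lambda: func("part_data", payload, 0, len(payload)),
        repeat=repeat,
        number=number,
    )
    return min(timings) / number * 1e9


if __name__ == "__main__":
    print(f"with debug: {best_of_ns(dispatch_with_debug):.0f} ns/op")
    print(f"plain:      {best_of_ns(dispatch_plain):.0f} ns/op")
```

Absolute numbers depend on the machine and interpreter, so only the relative gap between the two lines is meaningful.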

Correctness

  • All existing tests pass.
  • Differentially fuzzed the new parser against main's implementation: identical callback event streams and identical errors across randomized bodies x every chunk-split strategy (including byte-by-byte), at both small and default header limits.
  • Verified downstream: ran Starlette 1.2.1's full tests/test_formparsers.py (40 tests) against this branch - all pass.
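The differential fuzzing described above can be sketched roughly as follows. This is an assumed reconstruction of the approach, not the script that was actually run: `split_chunks`, `run`, and `differential_check` are hypothetical names, and each parser factory is assumed to accept a single event-recording callable and return an object with a `write(bytes)` method.

```python
import random


def split_chunks(body: bytes, strategy: str, rng: random.Random) -> list[bytes]:
    """Split body into chunks whose concatenation equals body."""
    if strategy == "whole":
        return [body]
    if strategy == "bytewise":
        return [body[i:i + 1] for i in range(len(body))]
    # "random": cut at up to three random interior offsets
    cuts = []
    if len(body) > 1:
        cuts = sorted(rng.sample(range(1, len(body)), k=min(3, len(body) - 1)))
    return [body[a:b] for a, b in zip([0, *cuts], [*cuts, len(body)])]


def run(parser_factory, chunks):
    """Feed chunks to a fresh parser; return (events, error type name or None)."""
    events = []
    parser = parser_factory(events.append)
    try:
        for chunk in chunks:
            parser.write(chunk)
    except Exception as exc:  # record the error kind instead of aborting the fuzz run
        return events, type(exc).__name__
    return events, None


def differential_check(old_factory, new_factory, bodies, seed: int = 0) -> None:
    """Assert both parsers agree on events and errors for every split strategy."""
    rng = random.Random(seed)
    for body in bodies:
        for strategy in ("whole", "bytewise", "random"):
            chunks = split_chunks(body, strategy, rng)
            old = run(old_factory, chunks)
            new = run(new_factory, chunks)
            assert old == new, (body, strategy)
```

Both parsers receive the same chunk list, so event streams that legitimately vary with chunk boundaries still compare equal when the two implementations agree.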

The one observable difference is on a malformed-input error path: a header name that is both over max_header_size and contains an invalid character may report the invalid-character error rather than the size-limit error (both still raise MultipartParseError). This was deemed acceptable.

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

Parse header field names and values with bytes.find/translate to jump to
the delimiter instead of scanning byte by byte, and drop the per-callback
logger.debug calls from the hot path. This roughly halves parse time for
header-heavy and small forms (large_form ~3x, simple_form ~2x, querystring
~2.4x on CPython 3.13/3.14), with no behaviour change for valid input.

codspeed-hq Bot commented Jun 4, 2026

Merging this PR will improve performance by 34.1%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 4 untouched benchmarks

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | test_querystring_large_form | 1,229.4 µs | 916.8 µs | +34.1% |

Comparing perf/hot-path-optimizations (0a64a18) with main (9d3ead5)

@Kludex Kludex merged commit 6732164 into main Jun 4, 2026
15 checks passed
@Kludex Kludex deleted the perf/hot-path-optimizations branch June 4, 2026 07:48

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a64a18255

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1252 to +1253
field = data[i:end]
if field.translate(None, TOKEN_CHARS):

P2: Enforce header size before bulk-validating field names

For a malformed multipart part whose header field never contains : and is delivered in a large write() chunk, this slices and translates the entire remaining chunk before advance_header_size(end - i) runs below. That means the default max_header_size no longer bounds the work or temporary allocation for oversized header names; a request with megabytes of token characters in a header line will be scanned/copied in full instead of failing once the 4 KiB limit is crossed.
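One way to address this, sketched under assumptions, is to cap the delimiter search at the remaining header budget so that an oversized name fails before any large slice or translate happens. `find_header_colon` and its `consumed` bookkeeping are hypothetical and not part of the parser's API; the 4096 default mirrors the 4 KiB limit mentioned above.

```python
MAX_HEADER_SIZE = 4096  # assumption: the 4 KiB default referred to in this review


def find_header_colon(
    data: bytes, start: int, consumed: int, limit: int = MAX_HEADER_SIZE
) -> int:
    """Return the index of ':' or -1 if more data is needed.

    `consumed` is the number of header bytes already counted for this part
    (hypothetical bookkeeping). Raises once the name cannot fit in the limit.
    """
    remaining = limit - consumed
    # Search only as far as the budget allows; +1 so a colon sitting exactly
    # at the limit is still found.
    window_end = min(len(data), start + remaining + 1)
    end = data.find(b":", start, window_end)
    if end == -1 and len(data) - start > remaining:
        # More name bytes than the budget permits and no delimiter in range.
        raise ValueError("header size limit exceeded")
    return end
```

With this ordering, the later `data[start:end]` slice and the `translate` call are bounded by `limit` regardless of how large the incoming chunk is.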
