Make the long_boundary benchmark dominated by the patched code path#280
Merged
Make the long_boundary benchmark dominated by the patched code path#280
Conversation
Two changes so the partial-boundary fallback overhead dominates total parser work, making the speedup visible under instrumentation mode: - body pattern switches from string.printable (contains \r every ~100 bytes) to b'abcdefgh' (no \r). With no boundary[0] candidates in the body, the inner state machine never engages and the parser's per-chunk cost is dominated by the find-miss fallback path. - body size grows from 2 MiB to 8 MiB so the absolute work in that fallback is large enough to clear CodSpeed's instruction-count noise floor. Local wall-clock comparison of unpatched vs patched: 661 ms -> 7.5 ms (87x) at 16 MiB, 333 ms -> 6.6 ms at 8 MiB.
Merging this PR will degrade performance by 72.91%
Performance Changes
Comparing Footnotes
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
LONG_BOUNDARY_BODYcontent fromstring.printable(contains\r) tob"abcdefgh"(no\r)Why
The existing
test_multipart_long_boundarydoes measure the partial-boundary fallback path, but in instrumentation mode CodSpeed reports the change from PR #275 as essentially flat. After tracing it through valgrind-equivalent instruction counting, two issues:string.printablebody,\r(the internalboundary[0]) appears every ~100 bytes. Each occurrence triggers the inner state-machine block which advancesindexpast 0 then resets - that loop runs identically in patched and unpatched versions. It dominates the parser's instruction count and washes out the fallback delta.A body without
\rlets the parser stay in the find-miss fallback path on every chunk: the unpatched Pythonwhileloop scans the whole 16 KiB look-back region per chunk, while the patchedbytes.finddoes the same work in C. Bumping the body to 8 MiB gives 515 chunks of that work.Local wall-clock (unpatched main vs PR #275):
\r\r\rThis should give CodSpeed instrumentation a clear, dominant signal on PR #275.
Test plan
pytest tests/test_benchmarks.pypasses (6 tests)scripts/checkpassesbytes.find#275 and verify the speedup is detectedAI Disclaimer
This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.