[SamWilsn] performance improvements by SamWilsn · Pull Request #5 · CodSpeedHQ/pyconus-2026-tutorial

SamWilsn · 2026-05-13T16:28:54Z

No description provided.

codspeed-hq · 2026-05-13T16:35:30Z

Merging this PR will degrade performance by 26.88%

❌ 1 regressed benchmark
✅ 5 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
❌	`test_bench_find_matches[solution]`	463.1 ms	633.3 ms	-26.88%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing SamWilsn:sam (c649e01) with main (3cf627a)}

SamWilsn · 2026-05-13T21:26:30Z

@codspeedbot fix this regression

codspeed-hq · 2026-05-13T21:26:35Z

@codspeedbot fix this regression

The fix has been pushed to the PR. Here's a summary of what I did:

Fixed both performance regressions by addressing the root causes:

Histogram (`test_bench_compute_histogram[solution]` — was 19% slower)

Root cause: The previous implementation used mmap for byte-by-byte access (slower than bytes indexing in CPython), a slow list initialization ([0 for _ in range(2**16)]), and a Python-level loop over every byte in the file.

Fix: Replaced the entire approach with numpy vectorized operations. Since numpy is already a project dependency, np.frombuffer + np.bincount eliminates the Python-level loop entirely — the heavy lifting happens in C inside numpy. This should be significantly faster than both the base and the previous PR implementation.

DNA matcher (`test_bench_find_matches[solution]` — was 28% slower)

Root causes:

mmap slicing copied to bytes() redundantly — expensive on large records
Backward scanning with rfind was less cache-friendly
One Future per record instead of batched dispatch — massive threading overhead
d.result() called twice per future
Non-overlapping search (loc + plen) differed from the expected overlapping behavior (pos + 1)

Fix: Replaced with direct file read, forward scanning, batched thread dispatch (grouping multiple records per future), and deferred record_id decoding to only matching records.

All 12 correctness tests pass. The CI benchmarks will confirm the performance improvements.

Summary

Warming up the performance wizard...
Analyzing regressions — Optimized both histogram and DNA solutions to eliminate the regressions
Handling comment — Comment handling finished successfully

skip an array access dna attempt

pull request

08413d3

SamWilsn force-pushed the sam branch from 7ac3a5c to 1094bc5 Compare May 13, 2026 21:09

switch to array, use mmap

5ac9c0a

skip an array access dna attempt

SamWilsn force-pushed the sam branch from 1094bc5 to 5ac9c0a Compare May 13, 2026 22:28

foo

c649e01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SamWilsn] performance improvements#5

[SamWilsn] performance improvements#5
SamWilsn wants to merge 3 commits into
CodSpeedHQ:mainfrom
SamWilsn:sam

SamWilsn commented May 13, 2026

Uh oh!

codspeed-hq Bot commented May 13, 2026 •

edited

Loading

Uh oh!

SamWilsn commented May 13, 2026

Uh oh!

codspeed-hq Bot commented May 13, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SamWilsn commented May 13, 2026

Uh oh!

codspeed-hq Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will degrade performance by 26.88%

Performance Changes

Uh oh!

SamWilsn commented May 13, 2026

Uh oh!

codspeed-hq Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Histogram (test_bench_compute_histogram[solution] — was 19% slower)

DNA matcher (test_bench_find_matches[solution] — was 28% slower)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codspeed-hq Bot commented May 13, 2026 •

edited

Loading

codspeed-hq Bot commented May 13, 2026 •

edited

Loading

Histogram (`test_bench_compute_histogram[solution]` — was 19% slower)

DNA matcher (`test_bench_find_matches[solution]` — was 28% slower)