Scan memory benchmarks: RSS sampling, mid-scan census, parallel-chains workload #3083
Merged
liquidsec merged 1 commit into blasthttp-integration-clean from May 8, 2026
Augments the existing peak-tracemalloc subprocess harness with three additional measurement angles that catch what tracemalloc alone misses: RSS time-series sampling (peak / end / retention via psutil), live-event census and lineage retention, and per-checkpoint queue residence breakdown. Also adds two new workloads exercising chain shapes the existing wide-and-shallow web_crawl doesn't reach: a strict deep_chain and a parallel_chains workload that mirrors real-scale concurrent target scanning. The retention metric (median of last 25% of RSS samples) is the one most sensitive to "stuck for the rest of the scan" pathologies. LiveEventTracker uses a WeakSet hooked into BaseEvent.__init__ so mid-scan census is O(live events) rather than O(every Python object).
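The WeakSet approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the `BaseEvent` class here is a hypothetical stand-in with assumed `type` and `body` attributes, and the `install`, `uninstall`, and `census` method names are invented for the example.

```python
import weakref
from collections import Counter


class BaseEvent:
    """Hypothetical stand-in for the scanner's event base class."""

    def __init__(self, event_type, body=b""):
        self.type = event_type
        self.body = body


class LiveEventTracker:
    """Tracks live events through weak references (illustrative sketch)."""

    def __init__(self):
        # Entries vanish automatically once an event is garbage collected.
        self._live = weakref.WeakSet()
        self._patched = None  # (class, original __init__) while installed

    def install(self, event_cls):
        """Wrap event_cls.__init__ so every new instance is registered."""
        original_init = event_cls.__init__
        live = self._live

        def tracked_init(event, *args, **kwargs):
            original_init(event, *args, **kwargs)
            live.add(event)

        event_cls.__init__ = tracked_init
        self._patched = (event_cls, original_init)

    def uninstall(self):
        """Restore the original __init__ if the hook is installed."""
        if self._patched is not None:
            event_cls, original_init = self._patched
            event_cls.__init__ = original_init
            self._patched = None

    def census(self):
        """Count live events by type and total HTTP_RESPONSE body bytes."""
        events = list(self._live)  # snapshot; cost scales with live events only
        by_type = Counter(event.type for event in events)
        body_bytes = sum(
            len(event.body) for event in events if event.type == "HTTP_RESPONSE"
        )
        return {"by_type": dict(by_type), "http_response_body_bytes": body_bytes}
```

Because the set holds only weak references, the tracker itself does not keep events alive, so a census taken mid-scan reflects what the rest of the program is still retaining.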
📊 Performance Benchmark Report
🎯 Performance Summary: 2 regressions ⚠️, 22 unchanged ✅
🐍 Python Version 3.11.15
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## blasthttp-integration-clean #3083 +/- ##
============================================================
- Coverage 91% 90% -0%
============================================================
Files 439 442 +3
Lines 37860 38231 +371
============================================================
+ Hits 34159 34170 +11
- Misses 3701 4061 +360
Summary
Adds three measurement layers on top of the existing peak-tracemalloc subprocess harness, covering what tracemalloc alone misses, plus two new workloads exercising chain shapes the wide-and-shallow web_crawl doesn't reach.

Measurement
- RSS time-series sampling via psutil: peak, end, and retention (median of the last 25% of samples; the metric most sensitive to "stuck for the rest of the scan" pathologies).
- LiveEventTracker: a WeakSet hooked into BaseEvent.__init__ so mid-scan census is O(live events) instead of O(every Python object). Counts events by type and totals HTTP_RESPONSE body bytes.

New workloads
- _scan_memory_deep_chain.py: strict 200-page linear chain (page N links only to page N+1). Produces a 600-event-deep parent chain, exposing structural retention that the wide workload masks.
- _scan_memory_parallel_chains.py: N independent seeds running concurrently, mirroring real-scale "many domains in flight" patterns where bodies pile up across chains even when each chain is naturally serial.

Why
Tracemalloc undercounts dramatically: it sees only Python-allocated memory, missing Rust extensions (blasthttp, blastdns, asndb, radixtarget), regex/lxml C buffers, and yara. RSS measurements show 2-18x gaps between tracemalloc and actual process memory, depending on workload. The existing peak_tracemalloc_mb metric was off by an order of magnitude in some cases.

The new metrics also distinguish peak (transient burst) from retention (stuck baseline). Different pathologies need different fixes, and the existing peak-only output didn't separate them.
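To make the peak / end / retention distinction concrete, here is a rough sketch of how the three numbers could be derived from an RSS time series. The function names, the MB units, and the exact tail-slicing rule (integer quarter of the sample count, at least one sample) are assumptions for illustration; the harness may compute them differently. The sampler relies on psutil's `Process().memory_info().rss`, imported lazily so the summary function works without it.

```python
import statistics


def summarize_rss(samples_mb):
    """Reduce an RSS time series (in MB) to peak, end, and retention.

    Retention is the median of the last 25% of samples, so a short burst
    early in the scan raises the peak without raising retention.
    """
    if not samples_mb:
        raise ValueError("need at least one RSS sample")
    tail_len = max(1, len(samples_mb) // 4)  # assumed rounding rule
    tail = samples_mb[-tail_len:]
    return {
        "peak_mb": max(samples_mb),
        "end_mb": samples_mb[-1],
        "retention_mb": statistics.median(tail),
    }


def sample_rss(stop_event, interval_s=0.1):
    """Collect RSS samples for this process until stop_event is set.

    stop_event is expected to be a threading.Event; run this function in a
    background thread while the workload executes.
    """
    import psutil  # third-party dependency, only needed for live sampling

    proc = psutil.Process()
    samples = [proc.memory_info().rss / (1024 * 1024)]
    # Event.wait returns False on timeout, True once the event is set.
    while not stop_event.wait(interval_s):
        samples.append(proc.memory_info().rss / (1024 * 1024))
    return samples
```

A series that spikes and then settles gives a high peak with low retention, while a series that climbs and stays high gives a retention close to the peak, which is the "stuck baseline" case described above.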