Intern repeated strings in resolved_hosts and dns_children by liquidsec · Pull Request #3006 · blacklanternsecurity/bbot

liquidsec · 2026-03-31T20:52:14Z

Summary

IP addresses and DNS record type strings (A, AAAA, CNAME, etc.) repeat heavily across events in a scan. In a typical subdomain enumeration, thousands of events resolve to the same handful of CDN/cloud IPs, and every event carries its own copy of those strings.

sys.intern() deduplicates them so that all events sharing the same IPs or rdtype keys reference a single string object. This reduces memory use on those fields by roughly 10-30%, depending on how much IP overlap exists in the scan.
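
As a minimal illustration of the mechanism (not code from this PR), the sketch below builds two equal IP strings at runtime and shows that sys.intern() maps both to one shared object. The build_ip helper and the sample address are hypothetical, and the "separate objects before interning" behaviour is a CPython implementation detail rather than a language guarantee.

```python
import sys


def build_ip(octets):
    # Hypothetical helper: build the string at runtime so it is not a
    # compile-time constant that the interpreter might already share.
    return ".".join(str(o) for o in octets)


# Two events resolving to the same (made-up) CDN address.
a = build_ip([104, 16, 0, 1])
b = build_ip([104, 16, 0, 1])

# Equal by value; in CPython these are normally two separate objects.
same_value = a == b

# sys.intern() returns the canonical copy from the interpreter's intern
# table, so both names now point at one string object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
shared_object = a_interned is b_interned

print(same_value, shared_object)
```

The saving comes from dropping the per-event copies: once every event holds the interned reference, only one copy of each distinct IP or rdtype string stays alive.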

Changes

dnsresolve.py — intern rdtype keys and host values in dns_children and resolved_hosts
httpx.py — intern IPs added to _resolved_hosts
gowitness.py — intern IPs added to _resolved_hosts
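
The diff itself is not shown on this page, so the following is only a sketch of what interning these fields could look like. It assumes dns_children is a mapping of rdtype to a collection of host strings and that resolved_hosts is a set of strings; the function names intern_dns_children and collect_resolved_hosts are hypothetical and are not taken from bbot.

```python
import sys


def intern_dns_children(raw_children):
    """Return a copy of {rdtype: hosts} with interned keys and host strings.

    Assumed shape: rdtype -> iterable of host strings (hypothetical).
    """
    interned = {}
    for rdtype, hosts in raw_children.items():
        key = sys.intern(str(rdtype))
        # Each host string is replaced by its canonical interned copy.
        interned[key] = {sys.intern(str(h)) for h in hosts}
    return interned


def collect_resolved_hosts(dns_children, ip_rdtypes=("A", "AAAA")):
    """Gather the already-interned IP strings from the A/AAAA records."""
    resolved = set()
    for rdtype in ip_rdtypes:
        resolved.update(dns_children.get(rdtype, ()))
    return resolved


# Two separate "events" whose strings are built independently at runtime.
event1 = intern_dns_children({"".join(["A"]): [".".join(["1", "2", "3", "4"])]})
event2 = intern_dns_children({"".join(["A"]): [".".join(["1", "2", "3", "4"])]})

host1 = next(iter(event1["A"]))
host2 = next(iter(event2["A"]))
print(host1 is host2)  # both events reference the same string object
```

Interning at the point where values are stored, rather than where they are read, means each module that writes to these fields needs its own call, which matches the three files listed above.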

github-actions · 2026-03-31T21:25:24Z

📊 Performance Benchmark Report

Comparing additional-memory-benchmarks (baseline) vs additional-string-interning (current)

📈 Detailed Results (All Benchmarks)

📋 Complete results for all benchmarks - includes both significant and insignificant changes

🧪 Test Name	📏 Base	📏 Current	📈 Change	🎯 Status
Bloom Filter Dns Mutation Tracking Performance	`3.93ms`	`3.93ms`	-0.0% ⚪	✅
Bloom Filter Large Scale Dns Brute Force	`17.57ms`	`17.44ms`	-0.7% ⚪	✅
Large Closest Match Lookup	`334.43ms`	`324.15ms`	-3.1% ⚪	✅
Realistic Closest Match Workload	`176.03ms`	`173.53ms`	-1.4% ⚪	✅
Event Memory Medium Scan	`1776 B/event`	`1776 B/event`	+0.0% ⚪	✅
Event Memory Large Scan	`1760 B/event`	`1760 B/event`	+0.0% ⚪	✅
Event Validation Full Scan Startup Small Batch	`378.94ms`	`369.96ms`	-2.4% ⚪	✅
Event Validation Full Scan Startup Large Batch	`521.73ms`	`526.21ms`	+0.9% ⚪	✅
Make Event Autodetection Small	`25.87ms`	`25.95ms`	+0.3% ⚪	✅
Make Event Autodetection Large	`264.65ms`	`264.51ms`	-0.1% ⚪	✅
Make Event Explicit Types	`11.47ms`	`11.43ms`	-0.3% ⚪	✅
Excavate Single Thread Small	`3.939s`	`3.354s`	-14.8% 🟢🟢	🚀
Excavate Single Thread Large	`9.975s`	`9.255s`	-7.2% ⚪	✅
Excavate Parallel Tasks Small	`4.301s`	`3.591s`	-16.5% 🟢🟢	🚀
Excavate Parallel Tasks Large	`7.774s`	`6.999s`	-10.0% ⚪	✅
Is Ip Performance	`2.91ms`	`2.94ms`	+0.8% ⚪	✅
Make Ip Type Performance	`10.69ms`	`10.67ms`	-0.2% ⚪	✅
Mixed Ip Operations	`4.17ms`	`4.19ms`	+0.4% ⚪	✅
Memory Use Web Crawl	`257.3 MB`	`257.4 MB`	+0.0% ⚪	✅
Memory Use Subdomain Enum	`19.3 MB`	`19.3 MB`	+0.0% ⚪	✅
Typical Queue Shuffle	`54.18µs`	`55.68µs`	+2.8% ⚪	✅
Priority Queue Shuffle	`599.29µs`	`608.09µs`	+1.5% ⚪	✅

🎯 Performance Summary

+ 2 improvements 🚀
  20 unchanged ✅

🔍 Significant Changes (>10%)

Excavate Single Thread Small: 14.8% 🚀 faster
Excavate Parallel Tasks Small: 16.5% 🚀 faster

🐍 Python Version 3.11.15

codecov · 2026-03-31T21:46:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91%. Comparing base (590e979) to head (c6d6ac4).
⚠️ Report is 2 commits behind head on additional-memory-benchmarks.

Additional details and impacted files

@@                     Coverage Diff                      @@
##           additional-memory-benchmarks   #3006   +/-   ##
============================================================
- Coverage                            91%     91%   -0%     
============================================================
  Files                               437     437           
  Lines                             37102   37108    +6     
============================================================
+ Hits                              33694   33696    +2     
- Misses                             3408    3412    +4

TheTechromancer approved these changes Apr 1, 2026

TheTechromancer merged commit b5816e4 into additional-memory-benchmarks Apr 1, 2026
18 checks passed

TheTechromancer merged 1 commit into additional-memory-benchmarks from additional-string-interning
