Preserve in-scope shared-infra graph edges #3141

Merged
liquidsec merged 6 commits into dns-children-dedup-key from dns-children-graph-fidelity on Jun 1, 2026

liquidsec (Collaborator) commented May 31, 2026

Builds on #3126. That PR dedups DNS children by (rdtype, child), which kills the cross-parent flood from out-of-scope shared infrastructure (e.g. one Cloudflare nameserver emitted 1,518x across in-scope zones) and wins back queue throughput.

The tradeoff it accepts: the dedup is scan-global, so it also collapses in-scope shared infrastructure (a mail server / nameserver / CNAME target that several in-scope domains point at). The graph output then links that shared host to only the first-seen parent, dropping the rest of the real (parent -> child) edges. In the motivating scan ~160 such in-scope edges were lost (vs ~16,155 out-of-scope edges correctly collapsed).

Fix

  • dnsresolve.emit_dns_children: on a dedup hit, re-emit the cross-parent edge as _graph_important only when the child is in-scope. In-scope shared infra keeps every edge; out-of-scope affiliate dups fall through and stay collapsed (the flood is never re-emitted).
  • dnsresolve.handle_event: skip child re-walking for graph-important re-entries so the re-emit doesn't cascade. Resolution still runs (cache hit), so node dns_children stays intact and neo4j node properties aren't clobbered.
  • ScanEgress.forward_event: route graph-important events only to modules that consume them (preserve_graph or accept_dupes), skipping modules that would just drop them at postcheck. Normal scan modules see zero extra events, so the churn win from #3126 (dedup DNS children by (rdtype, child), not parent host) is preserved; this also makes the existing orphan-resurrection path cheaper.
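
The first and third bullets can be illustrated with a toy model. This is only a sketch under stated assumptions, not the BBOT implementation: `Event`, `Module`, the function bodies, and the set-membership scope check are all hypothetical simplifications that borrow the names from the bullets above.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical stand-in for a scan event; not BBOT's event class.
    parent: str
    rdtype: str
    child: str
    graph_important: bool = False


@dataclass
class Module:
    # Hypothetical stand-in for a module with the two flags named in the PR.
    name: str
    preserve_graph: bool = False
    accept_dupes: bool = False
    received: list = field(default_factory=list)


def emit_dns_children(records, in_scope, children_emitted):
    """Yield events for (parent, rdtype, child) records, deduping scan-wide."""
    for parent, rdtype, child in records:
        key = (rdtype, child)  # parent-less dedup key, as described for #3126
        if key not in children_emitted:
            children_emitted.add(key)
            yield Event(parent, rdtype, child)
        elif child in in_scope:  # assumption: scope is plain set membership
            # Dedup hit on in-scope shared infra: keep the edge for the graph.
            yield Event(parent, rdtype, child, graph_important=True)
        # Dedup hit on an out-of-scope child: stay collapsed, emit nothing.


def forward_event(event, modules):
    """Deliver graph-important events only to modules that consume them."""
    for module in modules:
        consumes = module.preserve_graph or module.accept_dupes
        if event.graph_important and not consumes:
            continue
        module.received.append(event)


records = [
    ("a.example.com", "NS", "ns1.example.com"),
    ("b.example.com", "NS", "ns1.example.com"),
    ("a.example.com", "NS", "ns.cdn.test"),
    ("b.example.com", "NS", "ns.cdn.test"),
]
in_scope = {"a.example.com", "b.example.com", "ns1.example.com"}
scanner = Module("scanner")
graph_output = Module("graph_output", preserve_graph=True)

for event in emit_dns_children(records, in_scope, set()):
    forward_event(event, [scanner, graph_output])

edges = [(e.parent, e.child) for e in graph_output.received]
```

In this model the graph-preserving module ends up with both edges to the in-scope nameserver, while the ordinary module sees each child once and the out-of-scope duplicate is never emitted.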

Tests

  • TestDNSResolveInScopeSharedInfraGraphFidelity (new): three in-scope parents share two in-scope nameservers; asserts every parent->child edge survives in the real output.json graph artifact (the same edges neo4j builds).
  • TestDNSResolveSharedNameserverDedup (out-of-scope flood collapses to one) stays green. Together they pin the intended nuance: collapse the affiliate noise, preserve the in-scope edges.
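
For readers who want to check edge fidelity on their own output, the shape of the assertion might look like the sketch below. It is not the test from this PR: the field names `parent_host` and `host` are hypothetical and may not match the real event schema, and the sample output is constructed in-line rather than produced by a scan.

```python
import json


def edges_from_ndjson(text):
    """Collect (parent, child) pairs from newline-delimited JSON events."""
    edges = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        # "parent_host" and "host" are hypothetical field names for this sketch.
        edges.add((event["parent_host"], event["host"]))
    return edges


parents = ["a.example.com", "b.example.com", "c.example.com"]
nameservers = ["ns1.example.com", "ns2.example.com"]

# Stand-in for the scan's output file: one event per preserved edge.
sample_output = "\n".join(
    json.dumps({"type": "DNS_NAME", "parent_host": p, "host": ns})
    for p in parents
    for ns in nameservers
)

# Three parents sharing two nameservers should yield six distinct edges.
expected = {(p, ns) for p in parents for ns in nameservers}
missing = expected - edges_from_ndjson(sample_output)
assert not missing, f"lost edges: {sorted(missing)}"
```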

github-actions (Bot, Contributor) commented May 31, 2026

📊 Performance Benchmark Report

Comparing dns-children-dedup-key (baseline) vs dns-children-graph-fidelity (current)

📈 Detailed Results (All Benchmarks)

📋 Complete results for all benchmarks - includes both significant and insignificant changes

| 🧪 Test Name | 📏 Base | 📏 Current | 📈 Change | 🎯 Status |
|---|---|---|---|---|
| Bloom Filter Dns Mutation Tracking Performance | 3.96ms | 3.98ms | +0.6% | |
| Bloom Filter Large Scale Dns Brute Force | 17.70ms | 18.36ms | +3.7% | |
| Large Closest Match Lookup | 329.49ms | 324.57ms | -1.5% | |
| Realistic Closest Match Workload | 175.46ms | 174.24ms | -0.7% | |
| Event Memory Medium Scan | 1394 B/event | 1394 B/event | +0.0% | |
| Event Memory Large Scan | 1517 B/event | 1519 B/event | +0.1% | |
| Event Validation Full Scan Startup Small Batch | 370.08ms | 369.74ms | -0.1% | |
| Event Validation Full Scan Startup Large Batch | 506.71ms | 506.02ms | -0.1% | |
| Make Event Autodetection Small | 20.85ms | 20.95ms | +0.5% | |
| Make Event Autodetection Large | 215.19ms | 215.77ms | +0.3% | |
| Make Event Explicit Types | 9.18ms | 9.19ms | +0.1% | |
| Excavate Single Thread Small | 3.080s | 3.179s | +3.2% | |
| Excavate Single Thread Large | 9.170s | 9.177s | +0.1% | |
| Excavate Parallel Tasks Small | 3.290s | 3.343s | +1.6% | |
| Excavate Parallel Tasks Large | 5.980s | 6.062s | +1.4% | |
| Intercept Throughput Small | 983.84ms | 985.77ms | +0.2% | |
| Intercept Throughput Medium | 974.32ms | 964.69ms | -1.0% | |
| Is Ip Performance | 2.01ms | 2.02ms | +0.4% | |
| Make Ip Type Performance | 170.86µs | 171.78µs | +0.5% | |
| Mixed Ip Operations | 2.05ms | 2.07ms | +1.1% | |
| Memory Use Web Crawl | 368.2 MB | 354.7 MB | -3.7% | |
| Memory Use Subdomain Enum | 29.2 MB | 29.2 MB | +0.0% | |
| Memory Use Deep Chain | 8.5 MB | 8.5 MB | +0.0% | |
| Memory Use Parallel Chains | 21.4 MB | 22.3 MB | +4.1% | |
| Scan Throughput 100 | 2.893s | 2.949s | +1.9% | |
| Scan Throughput 1000 | 20.459s | 21.146s | +3.4% | |
| Typical Queue Shuffle | 4.96µs | 4.93µs | -0.6% | |
| Priority Queue Shuffle | 23.37µs | 23.58µs | +0.9% | |

🎯 Performance Summary

No significant performance changes detected (all changes <10%)
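
The Change column appears to be the relative difference between Base and Current. As a rough sketch (the helper names and the exact comparison against the 10% threshold are assumptions, not taken from the benchmark tooling), two rows from the report can be recomputed like this:

```python
def percent_change(base, current):
    """Relative change of current versus base, in percent."""
    return (current - base) / base * 100.0


def is_significant(base, current, threshold=10.0):
    # Assumption: a change counts as significant at or above the threshold.
    return abs(percent_change(base, current)) >= threshold


# (base, current) pairs copied from the report above.
rows = {
    "Scan Throughput 1000": (20.459, 21.146),  # seconds
    "Memory Use Web Crawl": (368.2, 354.7),    # MB
}

changes = {name: round(percent_change(b, c), 1) for name, (b, c) in rows.items()}
flagged = [name for name, (b, c) in rows.items() if is_significant(b, c)]
```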


🐍 Python Version 3.11.15

liquidsec added 4 commits May 31, 2026 15:01

Re-emit in-scope shared-infra child events as graph-important so neo4j/json
keep every parent->child edge. Route graph-important events only to modules
that consume them (preserve_graph or accept_dupes), so normal scan modules
see no extra churn.

children_emitted is parent-less (#3126), so the in-scope re-emit also fired on
same-host re-processing (SRV/wildcard chains), over-emitting graph-important
events. Track (parent, rdtype, child) for in-scope edges so only genuinely new
cross-parent edges re-emit.

asdf.blacklanternsecurity.com is in-scope shared infra (SRV target of two
in-scope _ldap records); both cross-parent edges are now preserved, so it
emits two DNS_NAME events instead of one.
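
The parent-aware edge tracking described in the commit messages above could look roughly like the following. This is a minimal sketch, assuming two plain sets and a set-membership scope check; the function name, return values, and hostnames are hypothetical.

```python
def decide(parent, rdtype, child, in_scope, children_emitted, edges_emitted):
    """Return "new", "graph", or None for one resolved (parent, rdtype, child)."""
    child_key = (rdtype, child)         # parent-less key: dedups the child itself
    edge_key = (parent, rdtype, child)  # parent-aware key: dedups the graph edge
    if child_key not in children_emitted:
        children_emitted.add(child_key)
        if child in in_scope:
            edges_emitted.add(edge_key)  # remember the first-seen edge too
        return "new"
    if child in in_scope and edge_key not in edges_emitted:
        edges_emitted.add(edge_key)
        return "graph"  # genuinely new cross-parent edge
    return None         # out-of-scope dup, or same-host re-processing


in_scope = {"asdf.example.com"}
children_emitted, edges_emitted = set(), set()
sequence = [
    ("_ldap._tcp.a.example.com", "SRV", "asdf.example.com"),
    ("_ldap._tcp.b.example.com", "SRV", "asdf.example.com"),
    ("_ldap._tcp.a.example.com", "SRV", "asdf.example.com"),  # re-processed
    ("_ldap._tcp.b.example.com", "SRV", "asdf.example.com"),  # re-processed
]
results = [decide(*rec, in_scope, children_emitted, edges_emitted) for rec in sequence]
```

Here the shared SRV target is emitted twice in total, once per distinct parent, and the re-processed records produce nothing further.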
codecov (Bot) commented May 31, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90%. Comparing base (71dca05) to head (de2d039).
⚠️ Report is 7 commits behind head on dns-children-dedup-key.

Additional details and impacted files
@@                  Coverage Diff                   @@
##           dns-children-dedup-key   #3141   +/-   ##
======================================================
+ Coverage                      90%     90%   +1%     
======================================================
  Files                         441     441           
  Lines                       38855   38920   +65     
======================================================
+ Hits                        34768   34833   +65     
  Misses                       4087    4087           


emit_dns_children re-emits cross-parent edges for in-scope children as graph-important so neo4j/json keep every edge. Skip that when the child type is omitted: it would be dropped at output anyway, and flagging it breaks the rule that a graph-important event is never omitted.

Also only compute the scope check and parent-aware edge hash for genuinely new children, so already-seen out-of-scope dups don't pay for a scope lookup on every occurrence.
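
One possible reading of these two changes, sketched with hypothetical names: the scope lookup runs once, when a child is first seen, and its result is cached so later duplicates only pay for set lookups; the re-emit is also skipped when the child's event type is omitted from output. `ChildTracker`, `EXAMPLE_TYPE`, and the suffix-based scope check are illustrative assumptions, and the parent-aware edge set is left out for brevity.

```python
class ChildTracker:
    """Toy model of dedup with a cached scope check and an omitted-type skip."""

    def __init__(self, scope_check, omitted_types=()):
        self.scope_check = scope_check          # potentially expensive callable
        self.omitted_types = set(omitted_types)
        self.children_emitted = set()
        self.in_scope_children = set()          # cache of first-sight scope results
        self.scope_lookups = 0

    def decide(self, parent, rdtype, child, event_type="EXAMPLE_TYPE"):
        key = (rdtype, child)
        if key not in self.children_emitted:
            self.children_emitted.add(key)
            self.scope_lookups += 1             # only new children pay for this
            if self.scope_check(child):
                self.in_scope_children.add(key)
            return "new"
        if key not in self.in_scope_children:
            return None                         # out-of-scope dup: set lookups only
        if event_type in self.omitted_types:
            return None                         # would be dropped at output anyway
        return "graph"


def in_scope(host):
    # Hypothetical scope rule for the sketch.
    return host.endswith(".example.com")


flood = ChildTracker(in_scope)
for i in range(1000):
    flood.decide(f"zone{i}.example.com", "NS", "ns.cdn.test")

omitting = ChildTracker(in_scope, omitted_types={"EXAMPLE_TYPE"})
first = omitting.decide("a.example.com", "NS", "ns1.example.com")
second = omitting.decide("b.example.com", "NS", "ns1.example.com")

keeping = ChildTracker(in_scope)
keeping.decide("a.example.com", "NS", "ns1.example.com")
kept = keeping.decide("b.example.com", "NS", "ns1.example.com")
```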
@liquidsec liquidsec marked this pull request as ready for review May 31, 2026 22:58
@ausmaster ausmaster self-requested a review June 1, 2026 16:04
@liquidsec liquidsec merged commit 354aab1 into dns-children-dedup-key Jun 1, 2026
18 checks passed
@ausmaster ausmaster added this to the BBOT 3.0 - blazed_elijah milestone Jun 1, 2026
@liquidsec liquidsec mentioned this pull request Jun 9, 2026
@ausmaster ausmaster deleted the dns-children-graph-fidelity branch June 11, 2026 01:27