perf: poison non-HTTP flows to avoid repeated parse-fail-clear cycles #42

Open
Zious11 wants to merge 2 commits into develop from worktree-poison-non-http-flows

Zious11 (Owner) commented Apr 7, 2026

Summary

  • Add per-direction poisoning to HttpFlowState to skip non-HTTP TCP flows after repeated parse failures
  • Use POISON_THRESHOLD (3 consecutive errors) to tolerate mid-stream joins where initial segments are body data
  • Track non_http_flows per flow (not per direction) and poisoned_bytes_skipped for observability
  • Reduces parse_errors from 14 to 3 on http-full.cap fixture (the 3 remaining are legitimate first-attempt failures before threshold is reached)
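
For readers without the diff open, the behaviour in the bullets above can be sketched roughly as follows. This is a minimal, standalone illustration and not the PR's actual code: `FlowSketch`, `DirectionState`, `Counters`, `Dir`, and `on_segment` are hypothetical names, and the `parsed_ok` flag stands in for the real header parser. Handling of a successful parse is deliberately elided.

```rust
/// Hypothetical threshold mirroring the POISON_THRESHOLD described in the summary.
const POISON_THRESHOLD: u8 = 3;

/// Per-direction parse state (hypothetical stand-in for fields on HttpFlowState).
#[derive(Default)]
struct DirectionState {
    error_count: u8,
    poisoned: bool,
}

/// One TCP flow with independent request and response directions.
#[derive(Default)]
struct FlowSketch {
    request: DirectionState,
    response: DirectionState,
    /// Ensures the flow is counted once, even if both directions get poisoned.
    counted_as_non_http: bool,
}

/// Analyzer-wide counters named after the ones in the summary.
#[derive(Default)]
struct Counters {
    parse_errors: u64,
    non_http_flows: u64,
    poisoned_bytes_skipped: u64,
}

#[derive(Clone, Copy)]
enum Dir {
    ClientToServer,
    ServerToClient,
}

/// Feed one segment of `len` bytes. `parsed_ok` is an assumption of this
/// sketch: it replaces the result of the real header parser.
fn on_segment(flow: &mut FlowSketch, c: &mut Counters, dir: Dir, len: usize, parsed_ok: bool) {
    let state = match dir {
        Dir::ClientToServer => &mut flow.request,
        Dir::ServerToClient => &mut flow.response,
    };
    if state.poisoned {
        // Poisoned direction: do not buffer or parse, only account for the bytes.
        c.poisoned_bytes_skipped += len as u64;
        return;
    }
    if parsed_ok {
        // Success handling is elided in this sketch.
        return;
    }
    c.parse_errors += 1;
    state.error_count = state.error_count.saturating_add(1);
    if state.error_count >= POISON_THRESHOLD {
        state.poisoned = true;
        if !flow.counted_as_non_http {
            flow.counted_as_non_http = true;
            c.non_http_flows += 1;
        }
    }
}
```

Under these assumptions, three failed segments in one direction poison only that direction, later bytes in that direction are added to `poisoned_bytes_skipped` without raising `parse_errors`, and poisoning the other direction afterwards leaves `non_http_flows` at 1.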

Fixes #18

Test plan

  • test_parse_error_poisons_direction_after_threshold — 3 errors poison, 4th data skipped
  • test_single_error_does_not_poison — 1 error below threshold, next valid request parses
  • test_poison_request_does_not_affect_response — direction independence
  • test_non_http_flows_counts_per_flow_not_direction — counter accuracy
  • test_poison_cleared_after_flow_close — poison doesn't persist across flow reuse
  • All 135 tests pass
  • cargo fmt clean
  • Code reviewer: fixed double-counting, added threshold, added tests
  • Silent-failure-hunter: added poisoned_bytes_skipped counter, threshold for mid-stream tolerance

Zious11 added 2 commits April 7, 2026 16:20
Add per-direction poisoned flag to HttpFlowState. After the first
parse error on a direction with no prior successful parse, mark it
poisoned and skip all future buffering/parsing for that direction.

- request_poisoned / response_poisoned bools on HttpFlowState
- non_http_flows counter in HttpAnalyzer, surfaced in summarize()
- Updated test to verify poisoned direction skips subsequent data

Reduces parse_errors from 14 to 2 on http-full.cap fixture.

Fixes #18

- Add POISON_THRESHOLD (3 errors) before poisoning to tolerate
  mid-stream joins where first segments are body data
- Fix non_http_flows double-counting: use per-flow counted_as_non_http
  flag so counter increments once per flow, not once per direction
- Add poisoned_bytes_skipped counter for observability of discarded data
- Add tests: threshold behavior, direction independence, flow counter
  accuracy, poison cleared after flow close

Copilot AI left a comment

Pull request overview

This PR improves HTTP analyzer performance on non-HTTP TCP streams by “poisoning” a flow direction after repeated header-parse failures, preventing repeated parse-fail-clear cycles and adding summary counters for observability.

Changes:

  • Add per-direction poisoning state and counters to HttpFlowState (with a POISON_THRESHOLD of 3).
  • Skip future data for poisoned directions while tracking non_http_flows and poisoned_bytes_skipped.
  • Expand HTTP analyzer tests to cover poisoning threshold behavior, per-direction independence, per-flow counting, and reset on flow close.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/analyzer/http.rs | Introduces per-direction poisoning logic and new summary metrics to avoid repeated parsing on non-HTTP traffic. |
| tests/http_analyzer_tests.rs | Adds/updates unit tests validating poisoning threshold, direction independence, per-flow counting, and cleanup on close. |

Comment on lines 332 to +344
Some(Err(e)) => {
    if !had_success {
        self.parse_errors += 1;
        if let Some(state) = self.flows.get_mut(flow_key) {
            state.request_error_count = state.request_error_count.saturating_add(1);
            if state.request_error_count >= POISON_THRESHOLD {
                state.request_poisoned = true;
                if !state.counted_as_non_http {
                    state.counted_as_non_http = true;
                    self.non_http_flows += 1;
                }
            }
        }

Copilot AI Apr 7, 2026

The code increments request_error_count toward POISON_THRESHOLD, but it never resets the counter after a successful request parse. That means poisoning is based on “total errors so far” rather than consecutive errors as described, and a flow could be poisoned even if valid HTTP was parsed in-between. Reset request_error_count to 0 on the success path (and similarly for responses).
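
A minimal sketch of the reset being asked for, assuming a standalone per-direction tracker. `ErrorStreak` and `record` are hypothetical names, not identifiers from the PR; the real fix would presumably zero `request_error_count` (and `response_error_count`) on the existing success path.

```rust
/// Hypothetical threshold mirroring the PR's POISON_THRESHOLD.
const POISON_THRESHOLD: u8 = 3;

/// Tracks consecutive parse failures for one direction (hypothetical type).
#[derive(Default, Debug)]
struct ErrorStreak {
    consecutive_errors: u8,
    poisoned: bool,
}

impl ErrorStreak {
    /// Record one parse outcome and return whether the direction is poisoned.
    fn record(&mut self, parsed_ok: bool) -> bool {
        if self.poisoned {
            // Poisoning is sticky until the flow state is dropped.
            return true;
        }
        if parsed_ok {
            // A successful parse breaks the streak, so only back-to-back
            // failures can reach the threshold.
            self.consecutive_errors = 0;
        } else {
            self.consecutive_errors = self.consecutive_errors.saturating_add(1);
            if self.consecutive_errors >= POISON_THRESHOLD {
                self.poisoned = true;
            }
        }
        self.poisoned
    }
}
```

With the reset in place, a sequence such as error, error, success, error, error stays unpoisoned; only a third failure in a row flips the flag.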

Comment on lines 388 to +401
Some(Err(e)) => {
    if !had_success {
        self.parse_errors += 1;
        if let Some(state) = self.flows.get_mut(flow_key) {
            state.response_error_count =
                state.response_error_count.saturating_add(1);
            if state.response_error_count >= POISON_THRESHOLD {
                state.response_poisoned = true;
                if !state.counted_as_non_http {
                    state.counted_as_non_http = true;
                    self.non_http_flows += 1;
                }
            }
        }

Copilot AI Apr 7, 2026

Same issue as requests: response_error_count is incremented toward POISON_THRESHOLD but never reset after a successful response parse, so poisoning is not actually based on consecutive errors. Reset response_error_count to 0 when a response is successfully parsed/drained.

Comment on lines +337 to +343
if state.request_error_count >= POISON_THRESHOLD {
    state.request_poisoned = true;
    if !state.counted_as_non_http {
        state.counted_as_non_http = true;
        self.non_http_flows += 1;
    }
}

Copilot AI Apr 7, 2026

non_http_flows is incremented whenever a direction becomes poisoned, but poisoning can happen for parse errors that still indicate HTTP (e.g., TooManyHeaders). This makes the metric name misleading and can inflate “non-HTTP” counts with malformed/abusive HTTP. Consider renaming the metric to something like poisoned_flows, or only incrementing non_http_flows for error kinds that strongly indicate non-HTTP traffic (and keeping a separate counter for poisoned flows).
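
One way the second option could look, as a rough sketch only. `ParseErrorKind`, `PoisonCounters`, `suggests_non_http`, and `record_poisoned_flow` are hypothetical names, and the choice of which error kinds count as "strongly non-HTTP" is an assumption made for illustration; the real split would depend on the error type the analyzer's parser returns.

```rust
/// Hypothetical error kinds; not the parser's actual error type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ParseErrorKind {
    /// The first token is not a plausible method or status line.
    Token,
    /// No recognizable HTTP version string.
    Version,
    /// Looks like HTTP, but the header count exceeded the parser's limit.
    TooManyHeaders,
    /// Looks like HTTP, but a header value is malformed.
    HeaderValue,
}

/// Separate counters so malformed HTTP does not inflate the non-HTTP metric.
#[derive(Default)]
struct PoisonCounters {
    poisoned_flows: u64,
    non_http_flows: u64,
}

/// Assumption of this sketch: only failures on the start line suggest the
/// stream is not HTTP at all.
fn suggests_non_http(kind: ParseErrorKind) -> bool {
    matches!(kind, ParseErrorKind::Token | ParseErrorKind::Version)
}

/// Called once per flow, at the moment the flow is first poisoned.
fn record_poisoned_flow(c: &mut PoisonCounters, last_error: ParseErrorKind) {
    c.poisoned_flows += 1;
    if suggests_non_http(last_error) {
        c.non_http_flows += 1;
    }
}
```

Classifying on the last error alone is a simplification; a flow that alternates between error kinds would need a policy decision this sketch does not make.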

Comment on lines +305 to +309
let valid = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n";
analyzer.on_data(&fk, Direction::ClientToServer, valid, 0);

assert_eq!(analyzer.parse_error_count(), 3); // no new errors (poisoned, not retried)
assert!(analyzer.method_counts().get("GET").is_none()); // never parsed

Copilot AI Apr 7, 2026

These tests validate poisoning behavior, but they don’t assert the new poisoned_bytes_skipped observability counter. Adding an assertion here (e.g., that it increases by the length of the skipped valid request) would prevent regressions where data is skipped without being accounted for (or vice versa).

Suggested change
- let valid = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n";
- analyzer.on_data(&fk, Direction::ClientToServer, valid, 0);
- assert_eq!(analyzer.parse_error_count(), 3); // no new errors (poisoned, not retried)
- assert!(analyzer.method_counts().get("GET").is_none()); // never parsed
+ let valid = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n";
+ let poisoned_bytes_skipped_before = analyzer.poisoned_bytes_skipped();
+ analyzer.on_data(&fk, Direction::ClientToServer, valid, 0);
+ assert_eq!(analyzer.parse_error_count(), 3); // no new errors (poisoned, not retried)
+ assert!(analyzer.method_counts().get("GET").is_none()); // never parsed
+ assert_eq!(
+     analyzer.poisoned_bytes_skipped(),
+     poisoned_bytes_skipped_before + valid.len()
+ );
