refactor(dpi/http): case-fold header name without allocating#301

Merged
domcyrus merged 1 commit into
domcyrus:mainfrom
obchain:refactor/http-header-casefold-no-alloc
May 21, 2026

Conversation

@obchain
Contributor

@obchain obchain commented May 21, 2026

Summary

The header loop in analyze_http (src/network/dpi/http.rs) lowercased every header line into a fresh String only to compare the result against the two ASCII literals "host" and "user-agent". For an HTTP request with N header lines this allocates N Strings per packet, on the DPI hot path.

RFC 7230 §3.2 makes HTTP/1.1 field names case-insensitive, so the loop can compare in place with str::eq_ignore_ascii_case, which allocates nothing and preserves behavior:

if let Some((key, value)) = line.split_once(':') {
    let key = key.trim();
    let value = value.trim();
    if key.eq_ignore_ascii_case("host") {
        info.host = Some(value.to_string());
    } else if key.eq_ignore_ascii_case("user-agent") {
        info.user_agent = Some(value.to_string());
    }
}

Same shape as the SSDP fix in #290 — the HTTP path is hotter (every detected packet on a TCP HTTP flow), and realistic header counts are 8–20+ per request, so the savings are larger.
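
For readers without the repository at hand, the in-place comparison can be tried in isolation. The following is a minimal sketch, not the actual `analyze_http`: the `HttpInfo` struct and the `parse_headers` function are hypothetical stand-ins, and the request is assumed to arrive as already-decoded text.

```rust
/// Hypothetical stand-in for the crate's HTTP info type (an assumption,
/// not the real struct from src/network/dpi/http.rs).
#[derive(Debug, Default, PartialEq)]
struct HttpInfo {
    host: Option<String>,
    user_agent: Option<String>,
}

/// Hypothetical helper: scan the header block of a request and pick out
/// Host and User-Agent without allocating a lowercased copy of each name.
fn parse_headers(request: &str) -> HttpInfo {
    let mut info = HttpInfo::default();
    // Skip the request line; `lines()` strips both "\n" and "\r\n".
    for line in request.lines().skip(1) {
        if line.is_empty() {
            break; // an empty line ends the header block
        }
        // Split on the first ':' only, so "Host: example.com:8080" keeps its port.
        if let Some((key, value)) = line.split_once(':') {
            let key = key.trim();
            let value = value.trim();
            if key.eq_ignore_ascii_case("host") {
                info.host = Some(value.to_string());
            } else if key.eq_ignore_ascii_case("user-agent") {
                info.user_agent = Some(value.to_string());
            }
        }
    }
    info
}
```

Calling `parse_headers` on a request whose header names are spelled `hOsT` and `USER-AGENT` fills both fields; the only allocations left are the two `to_string()` calls for the stored values.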

Closes #300.

Why a regression test

The existing tests only exercised the canonical Host: capitalisation, so dropping the case-insensitive compare would not have failed any test. The new test_http_mixed_case_host_and_user_agent_headers exercises four case mixes per header name (Host, host, HOST, hOsT and User-Agent, user-agent, USER-AGENT, User-AGENT) so a future refactor that drops the case fold will fail this test instead of silently regressing — same shape as the regression test added in #290 (SSDP) and #278 (NetBIOS).
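
The shape of such a regression test can be sketched as a small table-driven check. This is an illustration only, not the test added in this PR: `extract_header` and `build_request` are hypothetical helpers, and the header values are made up for the example.

```rust
/// The four case mixes per header name listed in the description above.
const HOST_VARIANTS: [&str; 4] = ["Host", "host", "HOST", "hOsT"];
const USER_AGENT_VARIANTS: [&str; 4] = ["User-Agent", "user-agent", "USER-AGENT", "User-AGENT"];

/// Hypothetical helper: return the trimmed value of the first header whose
/// name matches `name`, ignoring ASCII case.
fn extract_header<'a>(request: &'a str, name: &str) -> Option<&'a str> {
    request
        .lines()
        .skip(1) // request line
        .take_while(|line| !line.is_empty()) // stop at the end of the header block
        .find_map(|line| {
            let (key, value) = line.split_once(':')?;
            key.trim().eq_ignore_ascii_case(name).then_some(value.trim())
        })
}

/// Hypothetical helper: build a request using the given header spellings.
fn build_request(host_key: &str, ua_key: &str) -> String {
    format!("GET / HTTP/1.1\r\n{host_key}: example.com\r\n{ua_key}: test-agent/1.0\r\n\r\n")
}

#[test]
fn mixed_case_host_and_user_agent_headers() {
    for (host_key, ua_key) in HOST_VARIANTS.iter().zip(USER_AGENT_VARIANTS.iter()) {
        let request = build_request(host_key, ua_key);
        assert_eq!(extract_header(&request, "host"), Some("example.com"));
        assert_eq!(extract_header(&request, "user-agent"), Some("test-agent/1.0"));
    }
}
```

If a later refactor replaced `eq_ignore_ascii_case` with a plain `==`, three of the four variants per header would stop matching and the loop would fail on the first of them.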

Verification

  • cargo test --lib: 362 passed, 0 failed (5 in network::dpi::http, incl. the new test).
  • cargo clippy --all-targets -- -D warnings: clean.
  • cargo fmt --check: clean.

Notes

Real-world HTTP traffic uses varied capitalisation for header names: most clients send Host / User-Agent, but curl --header lets the user send any casing, HTTP/2 → HTTP/1.1 gateways often lower-case the names, and RFC 7230 §3.2 explicitly makes the field-name case-insensitive. The previous to_lowercase() already accepted any casing — this change keeps that behavior without the per-header allocation.

The header loop in `analyze_http` lowercased every header line via
`to_lowercase()` only to compare the result against the two ASCII
literals "host" and "user-agent". That allocates a fresh `String` per
header line per packet, on the DPI hot path.

HTTP/1.1 §3.2 makes field-names case-insensitive, so compare in place
with `str::eq_ignore_ascii_case` — no allocation, behavior preserved.

Add `test_http_mixed_case_host_and_user_agent_headers` covering four
case mixes per header name so a future refactor that drops the
case-insensitive compare fails this test instead of silently
regressing — same shape as the regression tests added in domcyrus#290 (SSDP)
and domcyrus#278 (NetBIOS).

Refs domcyrus#300
@laundmo

laundmo commented May 21, 2026

This PR lacks any evidence that the change is a performance gain, or that the compiler does not already optimize the current code into the same thing as the proposed code.
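
One way to gather that kind of evidence would be a small timing harness. The sketch below is an assumption about how one might measure it, not something run as part of this PR; the function names are hypothetical, and a proper benchmark framework would give more trustworthy numbers than `Instant`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Old shape: lowercase each header name into a fresh String, then compare.
fn count_matches_alloc(keys: &[&str]) -> usize {
    keys.iter()
        .filter(|k| {
            let lower = k.to_lowercase();
            lower == "host" || lower == "user-agent"
        })
        .count()
}

/// New shape: compare in place, ignoring ASCII case.
fn count_matches_in_place(keys: &[&str]) -> usize {
    keys.iter()
        .filter(|k| k.eq_ignore_ascii_case("host") || k.eq_ignore_ascii_case("user-agent"))
        .count()
}

/// Run `f` `iters` times and return the elapsed wall-clock time.
/// `black_box` keeps the optimizer from discarding the result.
fn time_it<F: Fn() -> usize>(iters: u32, f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}
```

A caller would build a slice of realistic header names, pass it through `black_box`, and compare `time_it(n, || count_matches_alloc(keys))` against `time_it(n, || count_matches_in_place(keys))` in a release build. Inspecting the generated assembly for both functions would address the second half of the question.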

@domcyrus
Owner

@laundmo That's a fair point, but I think this code is slightly cleaner and more correct (ASCII-only matching vs. full Unicode case-folding), so I think this is a good change.
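
The difference between the two kinds of matching can be shown with a character outside ASCII. The sketch below uses two hypothetical helper names and the header name `keep-alive` purely as an illustration: the Kelvin sign (U+212A) lowercases to an ASCII `k` under Unicode rules, so `to_lowercase()` treats it as a match while `eq_ignore_ascii_case` does not. The two literals in this PR contain no `k`, so as far as I can tell they are unaffected.

```rust
/// Old-style match: full Unicode lowercasing into a new String, then compare.
fn matches_unicode_fold(key: &str, name: &str) -> bool {
    key.to_lowercase() == name
}

/// New-style match: only ASCII letters A-Z / a-z are treated as equal.
fn matches_ascii_fold(key: &str, name: &str) -> bool {
    key.eq_ignore_ascii_case(name)
}

/// A header name that starts with KELVIN SIGN (U+212A) instead of ASCII 'K'.
const KELVIN_KEEP_ALIVE: &str = "\u{212A}eep-Alive";
```

For plain ASCII names such as `HOST`, both helpers agree. For `KELVIN_KEEP_ALIVE` compared against `"keep-alive"`, `matches_unicode_fold` returns `true` and `matches_ascii_fold` returns `false`, which is the stricter behavior the comment above describes.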

@domcyrus
Owner

@obchain Thanks for your PR, this LGTM!

@domcyrus domcyrus merged commit 5d4500c into domcyrus:main May 21, 2026
16 checks passed
Development

Successfully merging this pull request may close these issues.

refactor(dpi/http): per-header to_lowercase() allocation in analyze_http

3 participants