Skip to content

perf(dpi): drop redundant lines().collect::<Vec<&str>>() in HTTP + SSH parsers#339

Open
obchain wants to merge 1 commit into
domcyrus:mainfrom
obchain:perf/dpi-drop-lines-collect
Open

perf(dpi): drop redundant lines().collect::<Vec<&str>>() in HTTP + SSH parsers#339
obchain wants to merge 1 commit into
domcyrus:mainfrom
obchain:perf/dpi-drop-lines-collect

Conversation

@obchain
Copy link
Copy Markdown
Contributor

@obchain obchain commented May 27, 2026

Summary

analyze_http and analyze_ssh both collected text.lines() into a Vec<&str> and then walked it sequentially. Neither path needed random access — http.rs only indexed [0] and skip(1)'d the rest, and ssh.rs just iterated.

Drive the Lines iterator directly instead:

  • src/network/dpi/http.rslines.next()? for the request/status line (which also subsumes the old is_empty() guard); the remaining iterator drives the header loop.
  • src/network/dpi/ssh.rs — iterate text.lines() straight into the banner loop. The is_empty() guard was dead: is_likely_ssh already requires payload.len() >= 4 plus a banner/packet-structure check upstream.

Per-packet on the DPI hot path, this drops one heap allocation per parse. An HTTP request with N headers used to allocate a Vec of N+1 entries; now zero. SSH banner parsing is a smaller win but identical shape.

Closes #338.

Behavior

No change. The first-line next()? returns identically to indexing [0] after the old is_empty() guard, and the header/banner iterators yield the same &str items in the same order. Verified by the existing tests:

cargo test --lib network::dpi::http    # 5 passed
cargo test --lib network::dpi::ssh     # 17 passed
cargo test --lib                       # 371 passed
cargo fmt --check                      # clean
cargo clippy --all-targets -- -D warnings  # clean

Test plan

  • All HTTP DPI unit tests pass
  • All SSH DPI unit tests pass
  • Full cargo test --lib
  • cargo fmt --check
  • cargo clippy --all-targets -- -D warnings

…SSH parsers

Both `analyze_http` and `analyze_ssh` were materializing the full line
iterator into a `Vec<&str>` and then walking it sequentially:

  let lines: Vec<&str> = text.lines().collect();
  if lines.is_empty() { return None; }
  // …then index `lines[0]` and iterate `lines.iter().skip(1)`,
  // or just `for line in lines { … }`.

Neither path needs random access. Drive the `Lines` iterator directly:

- `http.rs` pulls the request/status line via `next()?` (which also
  subsumes the old `is_empty()` guard) and iterates the remainder for
  headers.
- `ssh.rs` iterates `text.lines()` straight into the banner loop; the
  `is_empty()` guard is dead because `is_likely_ssh` already requires
  `payload.len() >= 4` plus a banner / packet-structure check upstream.

Per-packet on the DPI hot path, this saves one heap allocation per
parse (one `Vec<&str>` plus the slice it points into). An HTTP request
with N headers used to allocate a Vec of N+1 entries; now it allocates
zero. SSH banner parsing was a smaller win but the same shape of fix.

No behavior change — verified by the existing HTTP and SSH unit tests
(5 + 17, all passing) plus the full lib sweep (371 passed).

Closes domcyrus#338
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf(dpi): drop redundant lines().collect::<Vec<&str>>() in HTTP + SSH banner parsers

1 participant