refactor(dpi/bittorrent): single-allocation hex_encode for info-hash render #307

Merged
domcyrus merged 1 commit into domcyrus:main from 0xghost42:refactor/304-bittorrent-hex-encode
May 21, 2026
@0xghost42
Contributor

Summary

hex_encode was building one heap-allocated String per byte via format!("{b:02x}") and then collecting them into the final String. A 20-byte BitTorrent info-hash therefore went through 20 intermediate heap allocations on every peer-handshake analyze.

Replace the iterator with a single String::with_capacity(bytes.len() * 2) and a per-byte write!. Writing into a pre-sized String does not reallocate, so the helper now performs at most one heap allocation per call regardless of the input length (none for an empty slice, since a zero-capacity String does not allocate). The lowercase {:02x} output contract is unchanged.
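For readers without the diff open, here is a minimal sketch of the two shapes described above. It is an illustration under assumptions, not the exact code in this PR: the name `hex_encode_per_byte` is hypothetical and only labels the old version, and the real helper's signature and error handling may differ.

```rust
use std::fmt::Write;

// Old shape (hypothetical name): `format!` produces a temporary String
// for every byte, and `collect` concatenates them into the result.
fn hex_encode_per_byte(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

// New shape: reserve two hex digits per byte up front, then format each
// byte directly into the same buffer.
fn hex_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // `fmt::Write` for `String` never returns an error.
        write!(out, "{b:02x}").expect("writing to a String cannot fail");
    }
    out
}
```

Both versions return the same lowercase, zero-padded text; only the allocation pattern differs. Whether the difference is measurable in practice is not shown by this sketch.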

Tests

Three regression tests added:

  • The exact-size 20-byte info-hash case (the only size BitTorrent uses in the wild) to lock the lowercase, fixed-width output.
  • The empty-slice base case.
  • A mix of single-digit bytes (0x00..=0x0f) to lock the zero-padding contract that {:02x} provides.
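The three cases above could look roughly like the following. This is a sketch only: the test names and input values are assumptions chosen for illustration, and `hex_encode` is repeated here so the block stands on its own rather than copied from the diff.

```rust
use std::fmt::Write;

// Assumed shape of the helper under test (see the summary above).
fn hex_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        write!(out, "{b:02x}").expect("writing to a String cannot fail");
    }
    out
}

#[cfg(test)]
mod tests {
    use super::hex_encode;

    #[test]
    fn info_hash_is_lowercase_and_fixed_width() {
        // 20 bytes in, 40 lowercase hex digits out.
        let hash = [0xABu8; 20];
        assert_eq!(hex_encode(&hash), "ab".repeat(20));
    }

    #[test]
    fn empty_slice_yields_empty_string() {
        assert_eq!(hex_encode(&[]), "");
    }

    #[test]
    fn single_digit_bytes_are_zero_padded() {
        // 0x00..=0x0f each need a leading zero to stay two digits wide.
        let bytes: Vec<u8> = (0x00..=0x0f).collect();
        assert_eq!(hex_encode(&bytes), "000102030405060708090a0b0c0d0e0f");
    }
}
```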

Local checks:

  • cargo test --lib — 364 passed.
  • cargo clippy --all-targets -- -D warnings — clean.
  • cargo fmt --check — clean.

Closes #304

@laundmo

laundmo commented May 21, 2026

This PR lacks any proof that this is a performance gain, or that the compiler does not already optimize the current code into a single string write.

@domcyrus
Owner

@laundmo Thanks for your comment. I also think that without a benchmark the perf-gain claim does not stand. On the other hand, this does look a bit more correct to me.

@domcyrus
Owner

@0xghost42 LGTM!

Linked issue: dpi(bittorrent): hex_encode allocates one String per byte for info-hash rendering

3 participants