Add preliminary harmony-crypto & harmony-identity crates#1
Conversation
|
bugbot run |
|
@greptile |
Greptile SummaryThis PR introduces the Overall the implementation is sound — correct encrypt-then-MAC ordering, constant-time HMAC comparison via
Confidence Score: 2/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Sender
participant Identity
participant HKDF
participant Fernet
Note over Sender,Fernet: Identity encrypt (Reticulum-compatible)
Sender->>Identity: encrypt(rng, plaintext)
Identity->>Identity: generate ephemeral X25519 keypair
Identity->>Identity: X25519 ECDH → shared secret
Identity->>HKDF: derive_key_256(shared_secret, salt=address_hash)
HKDF-->>Identity: 64-byte Fernet key
Identity->>Fernet: encrypt(rng, fernet_key, plaintext)
Fernet->>Fernet: random IV (16 bytes)
Fernet->>Fernet: AES-256-CBC encrypt (PKCS7)
Fernet->>Fernet: HMAC-SHA256 over IV + ciphertext
Fernet-->>Identity: token = IV + ciphertext + HMAC
Identity->>Identity: zeroize fernet_key
Identity-->>Sender: ephemeral_pub (32B) + token
Note over Sender,Fernet: PrivateIdentity decrypt
Sender->>Identity: decrypt(ciphertext)
Identity->>Identity: extract ephemeral_pub (first 32 bytes)
Identity->>Identity: X25519 ECDH → shared secret
Identity->>HKDF: derive_key_256(shared_secret, salt=address_hash)
HKDF-->>Identity: 64-byte Fernet key
Identity->>Fernet: decrypt(fernet_key, token)
Fernet->>Fernet: constant-time HMAC verify
Fernet->>Fernet: AES-256-CBC decrypt + unpad
Fernet-->>Identity: plaintext
Identity->>Identity: zeroize fernet_key
Identity-->>Sender: plaintext
Last reviewed commit: 5a63244 |
|
bugbot run |
| if ciphertext.len() < 33 { | ||
| return Err(IdentityError::DecryptionFailed); | ||
| } |
There was a problem hiding this comment.
minimum ciphertext check should be 96 bytes (32-byte ephemeral key + 64-byte minimum Fernet token), not 33
| if ciphertext.len() < 33 { | |
| return Err(IdentityError::DecryptionFailed); | |
| } | |
| if ciphertext.len() < 96 { | |
| return Err(IdentityError::DecryptionFailed); | |
| } |
| /// Encrypt plaintext with ChaCha20-Poly1305. | ||
| /// | ||
| /// - `key`: 32-byte symmetric key | ||
| /// - `nonce`: 12-byte nonce (must never be reused with the same key) | ||
| /// - `plaintext`: data to encrypt | ||
| /// - `aad`: additional authenticated data (authenticated but not encrypted) | ||
| /// | ||
| /// Returns ciphertext with appended 16-byte Poly1305 tag. |
There was a problem hiding this comment.
add documentation warning that nonce MUST NEVER be reused with the same key in ChaCha20-Poly1305, as reuse completely breaks security
|
bugbot run |
|
As much as I could run tests/reviews one more time to be extra sure, I'm excited to move on, there's a lot more to build out! SO MERGING CHOO CHOO HERE WE GO |
|
bugbot run |
| fn eq(&self, other: &Self) -> bool { | ||
| self.address_hash == other.address_hash | ||
| } | ||
| } |
There was a problem hiding this comment.
Identity equality ignores actual cryptographic key material
Low Severity
The PartialEq impl for Identity compares only the 16-byte address_hash, not the actual encryption_key or verifying_key. This means two Identity values with different public keys but a colliding truncated hash would be considered equal, even though they behave differently for verify() and encrypt(). Comparing the full 64-byte public key material would be equally efficient and semantically correct, avoiding a potential foot-gun where code assumes a == b implies cryptographic interchangeability.
|
bugbot run |
|
@greptile |
| /// Serialize the private key to 64 bytes: `[32B X25519 secret][32B Ed25519 secret]`. | ||
| pub fn to_private_bytes(&self) -> [u8; PRIVATE_KEY_LENGTH] { | ||
| let mut bytes = [0u8; PRIVATE_KEY_LENGTH]; | ||
| bytes[..32].copy_from_slice(self.encryption_secret.as_bytes()); | ||
| bytes[32..].copy_from_slice(self.signing_key.as_bytes()); | ||
| bytes | ||
| } |
There was a problem hiding this comment.
to_private_bytes returns unprotected key material
to_private_bytes returns a plain [u8; 64] containing raw private key bytes. Unlike PrivateIdentity itself (which is ZeroizeOnDrop), the returned array carries no automatic zeroization guarantee. Any caller that stores the returned bytes, passes them to another function, or simply lets the binding go out of scope without explicitly calling zeroize() will leave live private key material on the stack/heap until it is overwritten by unrelated data.
The function should return a zeroizing wrapper (e.g., zeroize::Zeroizing<[u8; PRIVATE_KEY_LENGTH]>) so that the secret bytes are automatically erased when the returned value is dropped:
use zeroize::Zeroizing;
pub fn to_private_bytes(&self) -> Zeroizing<[u8; PRIVATE_KEY_LENGTH]> {
let mut bytes = Zeroizing::new([0u8; PRIVATE_KEY_LENGTH]);
bytes[..32].copy_from_slice(self.encryption_secret.as_bytes());
bytes[32..].copy_from_slice(self.signing_key.as_bytes());
bytes
}Zeroizing<T> implements Deref<Target = T>, so callers that read the bytes via as_slice() or indexing continue to work without change, while the secret material is guaranteed to be wiped on drop.
| impl PartialEq for Identity { | ||
| fn eq(&self, other: &Self) -> bool { | ||
| self.address_hash == other.address_hash | ||
| } | ||
| } |
There was a problem hiding this comment.
PartialEq uses truncated 128-bit hash instead of full public key
The PartialEq implementation compares identities only by address_hash, which is a 16-byte (128-bit) truncation of SHA-256. This is weaker than comparing the full 64-byte public key for two reasons:
-
Cryptographic strength: A birthday collision in 128-bit space requires roughly 2^64 attempts — feasible for a nation-state-level adversary targeting a specific identity. If equality is used in any security-sensitive context (access control, trust checks, deduplication of allowed peers), a crafted identity that collides in the address hash would pass these checks despite having different — and attacker-controlled — public keys.
-
Semantic mismatch: The address hash is a routing address (Reticulum's network-layer notion of location); cryptographic identity equality should be based on the actual public key material. Using address hashes for equality silently conflates two different concepts, which can confuse future users of the API.
The comparison should use the full combined public key bytes:
impl PartialEq for Identity {
fn eq(&self, other: &Self) -> bool {
self.to_public_bytes() == other.to_public_bytes()
}
}
impl Eq for Identity {}If two Identity values are equal under this definition, their address_hash fields will also match (since the hash is derived deterministically from the keys). The converse is not guaranteed.
| let mut derived = hkdf::derive_key_256( | ||
| shared_secret.as_bytes(), | ||
| Some(&self.address_hash), | ||
| ); |
There was a problem hiding this comment.
Missing HKDF info context — no domain separation
derive_key_256 is called with an empty info slice (&[] is the default in the implementation). The HKDF info parameter is the standard mechanism for domain separation: it binds the derived key to a specific protocol, version, and usage. Without it, any other code that performs an ECDH exchange over the same X25519 keys and then applies HKDF-SHA256 with the same salt would derive the identical Fernet key, even if that code is meant for a completely different purpose.
Consider adding a well-known, constant context string:
let mut derived = hkdf::derive_key_256(
shared_secret.as_bytes(),
Some(&self.address_hash),
b"harmony-identity-v1-fernet", // add an info parameter to derive_key_256
);Note: if Reticulum Python uses empty info, and the interoperability test vectors rely on that, changing the info would break wire compatibility. In that case, keep the empty info but document this explicitly so that future protocol versions (with a different info string) remain clearly separated. The same pattern applies to the decrypt path at line 202.
| mod vectors { | ||
| pub const X25519_PRIVATE: &str = | ||
| "20373f976f707a256b450d7bd63b7230978fa623395a9bc91edd890a5a38654b"; | ||
| pub const ED25519_PRIVATE: &str = | ||
| "f25ade2a1327293ccb885f4bfdaabdd4a64708ba40c7a4410bdcf1ddaccd7dfa"; | ||
| pub const X25519_PUBLIC: &str = | ||
| "e080f2b82f9a08581008c8391e0a5e1e29da4b49d5364e7dbb1a20caf4a1ea57"; | ||
| pub const ED25519_PUBLIC: &str = | ||
| "685cf6f3f7887d5b10b6efcbdcecaf5efe908e7b002c57c53547d6df1f1a962c"; | ||
| pub const ADDRESS_HASH: &str = "af0de81878d6685b62d3459968c2d9f2"; | ||
| pub const COMBINED_PUBLIC: &str = | ||
| "e080f2b82f9a08581008c8391e0a5e1e29da4b49d5364e7dbb1a20caf4a1ea57\ | ||
| 685cf6f3f7887d5b10b6efcbdcecaf5efe908e7b002c57c53547d6df1f1a962c"; | ||
|
|
||
| pub const SIGN_MESSAGE: &str = "6861726d6f6e7920696e7465726f702074657374206d657373616765"; | ||
| pub const SIGNATURE: &str = | ||
| "f4074a5045d06b779004c08f4dfd0daf6923c8746e2493ddc70bf882e98abcea\ | ||
| 3a3fe75da5e4990c53dd0c293f501447312dba6f900710d9435e34a66294aa09"; | ||
|
|
||
| pub const ENCRYPT_PLAINTEXT: &str = | ||
| "68656c6c6f2066726f6d207265746963756c756d20707974686f6e"; | ||
| pub const ENCRYPT_CIPHERTEXT: &str = | ||
| "2fbe26fc3ce9f604e09d26881e1a805c21ea379cfd40b8741254bc92e270a20a\ | ||
| ed8fb66b9d56b90d2483989aba56b5b27007c59738505076bf4169abe0492260\ | ||
| 1d1f98d358d3e4d70707320c594ae1d0e58b297164d662ca4d7c0fa92cfe50ad\ | ||
| 9ddf653538bd70f1d0d8a8aabff15f2e"; | ||
|
|
||
| pub const ID2_X25519_PRIVATE: &str = | ||
| "400b83cab205e3383838868344ed4e52c94da309d1a6d6098289768104877353"; | ||
| pub const ID2_ED25519_PRIVATE: &str = | ||
| "e6bdab8f004a72904d8c62c692a5e8cf1db023f7d6f973072fcf56265bac7175"; | ||
| pub const ID2_ADDRESS_HASH: &str = "d9b3e6139565245d1a434975430956d2"; | ||
|
|
||
| pub const CROSS_ENCRYPT_PLAINTEXT: &str = | ||
| "63726f73732d6964656e7469747920656e6372797074696f6e2074657374"; | ||
| pub const CROSS_ENCRYPT_CIPHERTEXT: &str = | ||
| "6bee97688961561418bb283b163d9037307c42579f92728f45e5fddc44c78114\ | ||
| 6bfa9f0820e7e28f9ef3754cd19a7d9b66feb5c632cb6bf64123a1272053c2dc\ | ||
| e1eef7ff4a61e18b0b2d7eadf069410bf35bb286afa2eb55903bdf458b250df7\ | ||
| 9f6060a48ce828f850e46d95f5aca45a"; | ||
| } |
There was a problem hiding this comment.
Hardcoded private key bytes in source — add prominent "test-only" warning
vectors::X25519_PRIVATE, vectors::ED25519_PRIVATE, vectors::ID2_X25519_PRIVATE, and vectors::ID2_ED25519_PRIVATE are real X25519 and Ed25519 private keys committed to the repository. Anyone who can read this file now possesses those private keys. If these values are ever accidentally used in a production deployment (e.g., as "default" seed keys, for smoke-testing against a live node, or via copy-paste), all traffic encrypted to those identities would be trivially decryptable.
Add a clear comment block at the top of the vectors module:
/// ⚠ WARNING: These are TEST-ONLY private keys.
/// They are committed to a public repository and MUST NEVER be used
/// outside of unit/integration tests. Any identity or data protected
/// by these keys should be considered fully compromised.
mod vectors {There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| let token = harmony_crypto::fernet::encrypt(rng, &derived, plaintext)?; | ||
|
|
||
| // Zeroize derived key | ||
| derived.zeroize(); |
There was a problem hiding this comment.
Derived key not zeroized on error path in encrypt
Low Severity
In Identity::encrypt, the ? operator on the fernet::encrypt call causes an early return that skips derived.zeroize(), leaving 64 bytes of derived key material on the stack. The sibling PrivateIdentity::decrypt method correctly avoids this by capturing the result into a variable, calling derived.zeroize(), and only then returning — the encrypt path is inconsistent with this safer pattern.
Additional Locations (1)
8-task TDD plan for Bead #1 (harmony-641): PQ tunnel state machine. Covers handshake, frame encryption, session lifecycle, keepalive, and adversarial tests. Includes signing_key() accessor prerequisite. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three coupled API contract changes from PR #267 review: 1. Rename `RuntimeEvent::SendUnicastToDevice.target` → `destination_hash`. The runtime resolves the field via `path_table().get(...)`, but the path table is keyed by Reticulum destination hashes (SHA256(name_hash || identity_address_hash)[:16]) populated from announces, NOT raw device identity hashes. The previous name was load-bearing-wrong: real announced peers would miss the lookup and surface as "target unknown" drops even when reachable. The runtime is generic plumbing; computing the destination hash from identity-hash + destination-name is a harmony-client concern (Phase 3b, ZEB-227 — clients know their own destination naming: DM inbox, voice channel, file sync, etc.). 2. Type `RuntimeAction::UnicastReceived.source` and `NodeAction::DeliverLocally.source` as `Option<[u8; 16]>` instead of `[u8; 16]` with `[0u8; 16]` as a sentinel for "unknown until Phase 3b wires terminal-link tracking." Sentinels can be misinterpreted as a real authenticated identity. The Option makes the unknown state explicit in the type and forces consumers to handle it. Construction site at `node.rs:route_packet` sets `None` (was `[0u8; 16]`); the runtime dispatch arm forwards the Option as-is. 3. Defer `SendUnicastToDevice` resolution from `push_event` to `tick`. The previous code did `path_table().get(...)` synchronously inside `push_event`, racing against any inbound announces queued in the same batch (announces sit in `router_queue` and aren't applied to the path table until `tick` drains them). Now `push_event` enqueues into `pending_unicast_sends`, and `tick` drains that queue AFTER the router queue inside the Tier 0 (Router) block — so any same-batch announces are applied first. Tests updated to use the new field names + Option<source>: - `runtime_event_send_unicast_to_device_constructs_and_compares` - `runtime_action_unicast_received_constructs_and_compares` - `dispatch_router_actions_surfaces_deliver_locally_as_unicast_received` - `send_unicast_to_device_emits_send_on_interface_for_known_target` - `send_unicast_to_device_drops_for_unknown_target` - `unicast_round_trip_a_to_b_surfaces_as_unicast_received` Round-trip test placeholder assertion is now `received.0 == None` instead of `[0u8; 16]`. The seeded path-table value in the outbound test is unchanged in semantics (the seed key was always a destination hash by contract — only the field name shifts). Refs: PR #267 review (Qodo #1, CodeRabbit, CodeAnt).
…26) (#267) * docs(zeb-226): runtime unicast IPC implementation plan (Phase 3a) ZEB-216 Sub-B Phase 3a — adds RuntimeEvent::SendUnicastToDevice + RuntimeAction::UnicastReceived to harmony-runtime, wiring to the existing Reticulum unicast plumbing (NodeAction::DeliverLocally inbound, NodeAction::SendOnInterface outbound). 5 implementation tasks + 1 process task. Each implementation task is TDD-shaped (failing test → impl → gates → commit) with bite-sized steps. Two open implementation questions (target→interface routing, source identity resolution) are surfaced as explicit investigation steps with file:line pointers — the implementer reads the actual code to choose the cleanest approach. * feat(zeb-226): add RuntimeEvent::SendUnicastToDevice variant Phase 3a, step 1 of 5. Adds the inbound side of the unicast IPC contract that harmony-client's DM outbox (ZEB-216 Phase 3b) will consume. Variant is wired in subsequent steps; this commit is just the type addition + constructor smoke test + a placeholder no-op arm in the event dispatcher (translation to SendOnInterface lands in Task 4). Scope is intentionally minimal — the surrounding tree's pre-existing fmt drift (covered by chore-cargo-fmt-tree-cleanup PR #266) and clippy warnings (separate concern, not in this PR's diff) are NOT addressed here. Each task in the plan ships as its own commit; only crates/harmony-runtime/src/runtime.rs changes in this one. Test: runtime::tests::runtime_event_send_unicast_to_device_constructs_and_compares * feat(zeb-226): add RuntimeAction::UnicastReceived variant Phase 3a, step 2 of 5. Adds the outbound side of the unicast IPC contract — the runtime emits this from the next tick() when a unicast packet arrives at a locally-registered destination. Variant is wired in step 3 (currently DeliverLocally is silently dropped at runtime.rs:2350-2351 with the comment "Other router actions are diagnostics — drop for now" — that's the call site Task 3 replaces). Scope is intentionally minimal — only crates/harmony-runtime/src/runtime.rs changes in this commit. Pre-existing fmt drift (PR #266) and pre-existing clippy warnings are NOT addressed here. Test: runtime::tests::runtime_action_unicast_received_constructs_and_compares * feat(zeb-226): surface DeliverLocally as UnicastReceived to client Phase 3a, step 3 of 5. Replaces the silent drop at runtime.rs:2350 ("Other router actions are diagnostics — drop for now") with an explicit DeliverLocally handler that emits RuntimeAction::UnicastReceived in the next tick(). Source identity resolution: Option A — added a `source: [u8; 16]` field to NodeAction::DeliverLocally so the identity travels with the action. Reticulum data packets don't carry sender identity in their wire header (identity is bound at the link handshake layer, in Link::remote_identity), and the Node does not currently track terminal-link state for delivered packets. The construction site at process_data_packet populates [0u8; 16] as a placeholder; real link → identity binding is deferred to ZEB-227 (harmony-client Phase 3b, which owns the link-state side). This commit makes the type contract honest end-to-end so the placeholder can be replaced without rippling through the surface. This activates the inbound half of harmony-client's ZEB-216 Sub-B DM transport. The drop comment was a deliberate placeholder waiting for the client-facing consumer — Phase 3a is that consumer. Scope is intentionally minimal: only crates/harmony-runtime/src/runtime.rs and crates/harmony-reticulum/src/node.rs. All existing DeliverLocally matchers in the reticulum test suite use shape-matching (`..`), so adding a field is non-breaking. Pre-existing fmt drift (PR #266) and pre-existing clippy warnings are NOT addressed here. * feat(zeb-226): translate SendUnicastToDevice to SendOnInterface Phase 3a, step 4 of 5. Wires the outbound half of the unicast IPC contract: RuntimeEvent::SendUnicastToDevice (client request) is processed during push_event() and translated to RuntimeAction::SendOnInterface via the inner Reticulum router's path-table lookup (Node::path_table().get(&target) + Node::route_packet on hit). On hit, route_packet() runs IFAC masking and the existing send_on_interface plumbing unchanged; the resulting SendOnInterface NodeAction is forwarded through the existing dispatch_router_actions translator into pending_direct_actions so it surfaces at the next tick(). Unknown targets are logged at WARN and dropped — retry/expiration semantics live in harmony-client's outbox (ZEB-216 Sub-B Phase 3b, ZEB-227). The two new tests cover both paths: a seeded-path-table happy path asserting one SendOnInterface to the resolved interface with packet bytes intact (weight=None, directed), and an empty path table asserting drop-with-no-emission. Adds a small symmetric API on Node: `path_table_mut()` alongside the existing read-only `path_table()` accessor, used by the new runtime test to seed a route without driving a full announce flow. The doc comment flags it as test/bootstrap-only — production code should still go through process_inbound so cooperation scoring, reverse-table bookkeeping, and replay detection stay in sync. Scope is intentionally minimal: the dispatch arm in runtime.rs and one accessor in harmony-reticulum's node.rs. Pre-existing fmt drift (PR #266) and pre-existing clippy warnings are NOT addressed here. * test(zeb-226): end-to-end unicast round-trip A → B Phase 3a, step 5 of 5 (final implementation task). Validates the full plumbing wired across Tasks 1-4 by running two independent NodeRuntime instances and hand-bridging A's outbound SendOnInterface action into B's InboundPacket event: A: SendUnicastToDevice event (Task 1) → SendOnInterface action (Task 4) B: InboundPacket event → DeliverLocally NodeAction → UnicastReceived action (Task 3) Shape α (the recommended shape from the task brief): the test exercises the full chain with B's local destination registered directly on its router and A's path table seeded via the path_table_mut() test seam from Task 4. Wire bytes are constructed once with Packet::to_bytes(), shipped through A's outbound dispatch, then fed verbatim into B's inbound queue. Known limitation documented inline: source resolves to the [0u8; 16] placeholder because Reticulum's process_data_packet does not yet bind link → identity. Real link-identity binding lands in ZEB-227 / Phase 3b — flip the source assertion to a real-identity check then. The round-trip still proves payload bytes flow end-to-end and the four wiring touchpoints from Tasks 1-4 stay connected. Test name: unicast_round_trip_a_to_b_surfaces_as_unicast_received Scope: test-only addition to runtime.rs. No production code touched. Pre-existing fmt drift (PR #266) and clippy warnings unchanged. * fix(zeb-226): tighten unicast IPC contracts (dest_hash + Option<source>) Three coupled API contract changes from PR #267 review: 1. Rename `RuntimeEvent::SendUnicastToDevice.target` → `destination_hash`. The runtime resolves the field via `path_table().get(...)`, but the path table is keyed by Reticulum destination hashes (SHA256(name_hash || identity_address_hash)[:16]) populated from announces, NOT raw device identity hashes. The previous name was load-bearing-wrong: real announced peers would miss the lookup and surface as "target unknown" drops even when reachable. The runtime is generic plumbing; computing the destination hash from identity-hash + destination-name is a harmony-client concern (Phase 3b, ZEB-227 — clients know their own destination naming: DM inbox, voice channel, file sync, etc.). 2. Type `RuntimeAction::UnicastReceived.source` and `NodeAction::DeliverLocally.source` as `Option<[u8; 16]>` instead of `[u8; 16]` with `[0u8; 16]` as a sentinel for "unknown until Phase 3b wires terminal-link tracking." Sentinels can be misinterpreted as a real authenticated identity. The Option makes the unknown state explicit in the type and forces consumers to handle it. Construction site at `node.rs:route_packet` sets `None` (was `[0u8; 16]`); the runtime dispatch arm forwards the Option as-is. 3. Defer `SendUnicastToDevice` resolution from `push_event` to `tick`. The previous code did `path_table().get(...)` synchronously inside `push_event`, racing against any inbound announces queued in the same batch (announces sit in `router_queue` and aren't applied to the path table until `tick` drains them). Now `push_event` enqueues into `pending_unicast_sends`, and `tick` drains that queue AFTER the router queue inside the Tier 0 (Router) block — so any same-batch announces are applied first. Tests updated to use the new field names + Option<source>: - `runtime_event_send_unicast_to_device_constructs_and_compares` - `runtime_action_unicast_received_constructs_and_compares` - `dispatch_router_actions_surfaces_deliver_locally_as_unicast_received` - `send_unicast_to_device_emits_send_on_interface_for_known_target` - `send_unicast_to_device_drops_for_unknown_target` - `unicast_round_trip_a_to_b_surfaces_as_unicast_received` Round-trip test placeholder assertion is now `received.0 == None` instead of `[0u8; 16]`. The seeded path-table value in the outbound test is unchanged in semantics (the seed key was always a destination hash by contract — only the field name shifts). Refs: PR #267 review (Qodo #1, CodeRabbit, CodeAnt). * fix(zeb-226): rename path_table_mut → path_table_mut_for_tests + doc(hidden) Task 4 added `pub fn path_table_mut(&mut self) -> &mut PathTable` to `Node` so harmony-runtime tests can seed routing entries directly. The doc said "primarily intended for tests / bootstrap" but the accessor was publicly available in production builds with no visibility signal beyond the doc comment. Renamed to `path_table_mut_for_tests` and added `#[doc(hidden)]` so the test-only intent is loud at the call site and excluded from rustdoc. The `_for_tests` suffix surfaces the constraint without a feature flag (which would have required Cargo.toml changes outside the agreed scope for this branch). Production code that touches the path table directly (skipping cooperation scoring, reverse-table bookkeeping, and replay detection) will now read as obviously wrong during review. Updated three call sites in harmony-runtime tests: - `send_unicast_to_device_emits_send_on_interface_for_known_target` - `unicast_round_trip_a_to_b_surfaces_as_unicast_received` - (plus their inline doc-comment references) Refs: PR #267 review (Qodo #4, CodeRabbit). * docs(zeb-226): plan doc lint cleanups + scope correction Three minor cleanups from PR #267 review: 1. File Structure section claimed "Single-crate change. All work in `crates/harmony-runtime/`" but Task 3's investigation explicitly anticipated harmony-reticulum changes (Option C: add a `source` field to `NodeAction::DeliverLocally`). The implementer chose exactly that path. Reword the claim to match what actually happened: primary work in harmony-runtime, companion change in harmony-reticulum for the DeliverLocally.source field and the path_table_mut_for_tests accessor. 2. MD031 (markdownlint): five fenced code blocks were missing a blank line between the closing fence and the following prose ("Expected: all three exit 0.") or between preceding prose and the opening fence. Added the blank lines. 3. ENGLISH_WORD_REPEAT_BEGINNING_RULE: the Step 3 "If this fails… If the issue…" pair triggered the linter. Reworded to "When this fails…" / "Should the second cause apply…". No content changes — only lint and scope-claim corrections. Refs: PR #267 review (CodeRabbit minor). * fix(zeb-226): bound unicast drain + defer-on-miss when router has backlog Round-10 review (CodeRabbit Major + Cursor Low). Two related issues in the pending_unicast_sends drain inside tick(): 1. Race (correctness): if router_max_per_tick caps the router-queue drain, an announce that would create the route for a queued unicast may not have applied to path_table yet. The unicast was dropped as "unknown" even though next tick's announce drain would populate the route. 2. Budget (latency): every other Tier 1 queue respects router_max_per_tick; the unicast drain was unbounded. A burst of 1000 unicasts blocked the tick loop. Fix combines both. After the router queue drain, capture router_queue_drained, then bound the unicast drain by the same limit AND defer-on-miss when the router still has work. Drops also count against the budget so a flood of unknown targets can't bypass the cap. Deferred entries are re-queued at the front to preserve arrival-order priority over later-arriving sends. New tests: - unicast_drain_respects_router_max_per_tick: 5 known-target unicasts with cap=2 → 2/2/1 across three ticks. - unicast_defers_when_router_queue_has_backlog: 2 TimerTicks + cap=1 forces router_queue non-empty at unicast-drain time; unknown-target unicast is deferred (NOT dropped) on tick 1, then drops cleanly on tick 2 after router fully drained. Adds a #[doc(hidden)] pending_unicast_sends_len() accessor for test introspection. * fix(zeb-226): forward destination_hash through UnicastReceived Round-10 review (Cursor Medium). The dispatch arm in dispatch_router_actions previously discarded NodeAction::DeliverLocally::destination_hash with `_`. But the client (harmony-client / Phase 3b) registers MULTIPLE destination types per its spec (DM inbox, voice channel, file sync, ...) and needs the destination hash to dispatch each inbound packet to the right handler. Without the hash, the client can't tell which destination kind a packet was addressed to. Fix: add `destination_hash: [u8; 16]` to RuntimeAction::UnicastReceived and forward the field from the DeliverLocally action. Field order: destination_hash, source, packet — destination first so the client can match-bind it without destructuring the whole variant. Tests: - runtime_action_unicast_received_constructs_and_compares: pin a distinct destination_hash so a future shape change trips it. - dispatch_router_actions_surfaces_deliver_locally_as_unicast_received: assert destination_hash flows through (was previously discarded). - unicast_round_trip_a_to_b_surfaces_as_unicast_received: assert the round-trip carries B's registered destination hash. * docs(zeb-226): align plan terminology with landed API Round-10 review (CodeRabbit Minor). The plan still used pre-execution naming that the landed API has since refined: - "target" → "destination_hash" everywhere it referred to the SendUnicastToDevice field (the field is a Reticulum DESTINATION hash, not a raw device identity hash; the doc now flags this distinction explicitly). - "source: [u8; 16]" → "source: Option<[u8; 16]>" everywhere it referred to the UnicastReceived field. The Option type was tightened in round 9 to make the unknown-source case unforgeable — sentinels can be misinterpreted as real identities. Phase 3a emits None until ZEB-227 wires link-identity binding. - Adds destination_hash to UnicastReceived per round 10 — the client registers MULTIPLE destination kinds (DM inbox, voice, file sync) and needs the dest hash to dispatch by kind. Open Questions section now records that source-resolution Option C ("add field to DeliverLocally") was chosen during execution, and the type was tightened to Option<[u8; 16]> per round-9 review. * fix(zeb-226): handle RuntimeAction::UnicastReceived in harmony-node dispatch Round 10 follow-up. The variant added in commit e079c6a (Task 2) broke harmony-node's exhaustive match in event_loop.rs::dispatch_action. The harmony-runtime test gate caught nothing because it only runs against harmony-runtime; the workspace check was needed to surface it. Per "test drift is our fault" — broken builds in the workspace caused by our changes belong in this PR, not a follow-up. Added a no-op arm that logs at debug and drops, with a comment explaining harmony-node has no client-facing IPC channel for unicast delivery. The real consumer is harmony-client (Phase 3b, ZEB-227) which embeds NodeRuntime directly and consumes RuntimeAction::UnicastReceived through its own tick loop, bypassing harmony-node entirely. `cargo check --workspace` is now clean (one pre-existing dead-code warning in SpeculationConfig remains — unrelated, present on origin/main). * fix(zeb-226): defer branch in unicast drain must consume budget CodeRabbit Major / Greptile P1 (round 11): the unicast drain loop in NodeRuntime::tick() was incrementing processed_unicasts ONLY in the route-success and drop-with-warn branches, NOT in the defer-on-backlog branch. So when router_queue had backlog (router_queue_drained=false), deferred unicasts re-queued without consuming budget — the loop could walk the entire pending_unicast_sends in a single tick regardless of router_max_per_tick. The fix moves processed_unicasts += 1 to the TOP of the loop so all three branches (route, defer, drop) consume one budget slot each. The existing test unicast_drain_respects_router_max_per_tick exercised only the route branch (known target, empty router queue) and unicast_defers_when_router_queue_has_backlog used cap=1 with 1 unicast — neither caught the defer-bypass. New regression test unicast_drain_defers_count_against_router_max_per_tick exercises the specific bypass: cap=2, 3 router events queued (forces drained=false), 3 unknown unicasts followed by 1 known unicast. With the fix, the 2 unknown defers consume budget and the known target is never reached (0 SendOnInterface). Without the fix, the loop walks past all 3 unknowns (defers don't count) and routes the known target (1 SendOnInterface) — bypassing the cap. Verified by temporarily reverting just the increment move while keeping the new test: test fails with "left: 1 right: 0" matching the predicted bug signature exactly. * fix(zeb-226): tighten test-seam visibility (Greptile P2) Two test-only methods were `pub fn` + `#[doc(hidden)]`, which only hides from rustdoc. Downstream crates could still call them in production builds. Round-11 Greptile P2: - `NodeRuntime::pending_unicast_sends_len` (same-crate consumers only): downgraded to `pub(crate)`. Simplest fix; the seam never needed to escape harmony-runtime. - `Node::path_table_mut_for_tests` (cross-crate — harmony-runtime's tests call it): gated behind `cfg(any(test, feature = "test-seams"))`. Added a new internal `test-seams` feature on harmony-reticulum and activated it via harmony-runtime's `[dev-dependencies]` block. Cargo's resolver-2 isolation (verified in workspace Cargo.toml) keeps the dev-only feature activation out of normal builds: `cargo tree -e features` for harmony-node (which depends on harmony-reticulum but not harmony-runtime tests) shows no `test-seams`. Workspace `cargo check`, `cargo build --lib`, and full test suites all pass. Approach chosen for path_table_mut_for_tests: feature-flag (option 1 in the implementer brief). Considered: - pub(crate): rejected — cross-crate consumer needs access - redesign as `seed_path_for_tests` taking owned args: less elegant and would force a wider API rewrite for the same outcome - keep pub + loud `_for_tests` suffix (fallback): not needed; the feature-flag approach cleanly compiles and runs in the workspace * docs(zeb-226): describe defer-then-drop semantics on SendUnicastToDevice Round-10's race+budget fix changed SendUnicastToDevice from log+drop-on-miss to defer-then-drop, but the variant doc still claimed "the request is logged and dropped" on lookup miss. Update to describe: - Defer when the router queue still has backlog (announces that would create the route may be queued one event behind). - Final WARN+drop only after the router queue fully drains. - Both branches consume router_max_per_tick budget (round-11 fix), so a defer-only tick cannot walk the entire pending queue. - Retry/expiration lives in harmony-client's outbox; callers MUST NOT assume immediate drop on miss. Doc-only — no behavior change. Addresses CodeRabbit Minor finding from round-12 review. * fix(zeb-226): move PacketDropped logging back to inline unicast call site Round-9 added a top-level NodeAction::PacketDropped arm in dispatch_router_actions to surface SendUnicastToDevice failures from route_packet. Cursor's round-12 review pointed out that arm also fires on routine inbound process_data_packet drops — broadcast traffic addressed to unregistered local destinations returns PacketDropped { NoLocalDestination }, which was silently dropped pre-round-9 and now spams WARN logs on non-transport nodes. Switch to Qodo's original alternative (option B): handle PacketDropped specifically in the SendUnicastToDevice route-success branch where we have full context (destination_hash), and let the top-level arm fall through to the silent diagnostic catch-all. - Iterate node_actions before dispatch_router_actions, log any PacketDropped at WARN with destination_hash_first_4 + reason + interface_name, then move into dispatch. - Remove the broad NodeAction::PacketDropped match arm in dispatch_router_actions; comment notes why. - Inner PacketDropped arm inside the AnnounceNeeded branch is unchanged — it's announce-specific and was never the noise source. 127 harmony-runtime + 328 harmony-reticulum tests still pass; no existing tests asserted the removed log message. Addresses Cursor Low finding from round-12 review. * fix(zeb-226): cfg(test) gate on pending_unicast_sends_len Round 12 cleanup. The accessor added in round 10 (Greptile P2 → round 11 made it pub(crate)) has no production callers by design — it exists solely for in-crate test introspection of the unicast drain. Without cfg(test), non-test builds fire a dead_code warning. Adding #[cfg(test)] is the accurate gate (matches the actual usage shape) and removes the warning without needing #[allow(dead_code)] on a function that genuinely IS used (just only in tests). `cargo clippy --manifest-path crates/harmony-runtime/Cargo.toml --no-deps --lib --tests` is now clean of the dead_code warning. Pre-existing PR-#266-territory warnings (ReplicaStore, single_match, unused mut, unused stranger_hash, field_reassign_with_default in test helpers) remain — outside this PR's scope. * docs(zeb-226): document per-destination FIFO + cross-destination best-effort ordering CodeRabbit Major (round 13) flagged that the unicast drain doesn't preserve strict FIFO across destinations and suggested break-on-first-defer to enforce it. Pushing back: that change would be a regression for the actual use case. Scenario the change would break: - harmony-client sends DM 1 to recipient A (offline, route unknown) - harmony-client sends DM 2 to recipient B (online, route exists) - Current behavior: DM 2 sends immediately; DM 1 defers until A's route resolves or drops - Proposed strict-FIFO behavior: DM 2 blocks behind DM 1 until A resolves OR A drops — deliverable message waits for undeliverable What we DO preserve and what we DON'T: - Per-destination FIFO IS preserved by the existing code: same-dest entries that all defer get re-queued at the front in arrival order; on the next tick they hit the same path-exists state and process in order. - Cross-destination ordering is intentionally routability-driven, not arrival-driven. The runtime is plumbing; ordering above the runtime is the client's concern. Updated the SendUnicastToDevice doc to make this explicit so future readers (and bot reviewers) don't re-flag it. Added an "Ordering contract" section that: 1. States per-destination FIFO is preserved 2. States cross-destination is best-effort 3. Explains WHY (the offline-blocks-online regression) 4. Tells callers needing cross-destination order where to enforce it (harmony-client outbox layer) 5. Notes the CodeRabbit suggestion was rejected for this reason Doc-only change. No behavior change.
…le unregister test Two bot-review findings on PR #268: 1. Cache-before-rate-limit (Qodo correctness bug). process_announce was inserting into identity_table BEFORE the transport-mode rate- limiting check, so an announce that drops with DropReason::AnnounceRateLimited still mutated the cache. Dropped traffic should have NO observable side effects. Moved the insert to after the rate-limit check (and any other early-drop branches), on the function's happy-path acceptance branch. Added regression test process_announce_does_not_cache_identity_for_rate_limited_announces to lock the ordering. 2. Stronger unregister test (CodeRabbit nitpick). Extended node_runtime_unregister_local_destination_returns_bool with an observable-behavior assertion: after unregister, inbound packets addressed to the dest must NOT surface as RuntimeAction::UnicastReceived. Verifies that the boolean return matches reality at the action layer. Bug #1 from the same review (unbounded identity_table growth) is the known issue tracked as ZEB-235 (Medium, Backlog). The bot-suggested fix (path_table.retain coupling) would break the DM use case — DM identities are stable for years; paths expire in 7d/1d/6h, so coupling would force contacts to lose verifiability every time a route aged out. Independent identity-cache expiry (longer TTL) plus cardinality cap is the right design and stays in ZEB-235.
…B-156 R1) Round-1 bot finding from Qodo on PR #269: `StorageTier::remove` delegates to `ContentStore`'s `BookStore::remove` impl, which clears every cache metadata slot — including the pinned set. So an unguarded burn of a pinned CID would silently drop user-protected data and unpin it in the same call, violating the pin-retention contract if the caller forgot to unpin first. The doc comment said callers must unpin first, but nothing enforced it. The harmony-client burn path always unpins before removing, but future callers could trip the foot-gun. Fix: add an `is_pinned` guard at the top of `StorageTier::remove`. If the CID is still pinned, return `None` (safe no-op) without touching any cache state. Matches Qodo's suggested approach #1 ("safe no-op on pinned CID"). Doc comment updated to describe the guard and the "pinned-as-refusal" semantic. `BookStore::remove` on `ContentStore` is unchanged — it still clears the pinned set when invoked directly. The guard lives at the `StorageTier::remove` layer (which is the only public surface anyway, via `NodeRuntime::remove_content`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(runtime): narrow remove_content delegate for ZEB-156 burn-with-expunge ZEB-156 in harmony-client distinguishes Burn from Unpin by adding immediate cache eviction (W-TinyLFU bypass) for the burned root's descendants. The existing NodeRuntime API exposes pin_content / unpin_content as narrow delegates but does not expose a direct cache.remove primitive — &mut StorageTier access is deliberately gated to prevent callers from reaching the eviction-buffer + broadcast machinery. Add two matching narrow delegates following the exact pin_content / unpin_content pattern: - StorageTier::remove(cid) — calls ContentStore::remove (cleans up every cache metadata slot via the existing BookStore impl). No tier-side accounting (disk_index/archive_index) needs touching — the burn path is for the in-RAM cache layer that client builds use. - NodeRuntime::remove_content(cid) — narrow delegate to StorageTier::remove, mirrors pin_content / unpin_content's visibility and doc style. Both return Option<Vec<u8>>: the bytes that were stored, or None if the CID was not present (no-op safe). Doc comments explicitly call out the W-TinyLFU bypass — this method is for the "no really, destroy this" burn verb specifically, NOT for routine eviction (which runs through the admission/probation/protected dance and emits EvictionPush actions; this delegate is silent on purpose). Consumed by harmony-client PR for ZEB-156 (root-pin-set cascade correctness for Burn after Unpin's keep-set fix lands). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(content): refuse to remove pinned CIDs in StorageTier::remove (ZEB-156 R1) Round-1 bot finding from Qodo on PR #269: `StorageTier::remove` delegates to `ContentStore`'s `BookStore::remove` impl, which clears every cache metadata slot — including the pinned set. So an unguarded burn of a pinned CID would silently drop user-protected data and unpin it in the same call, violating the pin-retention contract if the caller forgot to unpin first. The doc comment said callers must unpin first, but nothing enforced it. The harmony-client burn path always unpins before removing, but future callers could trip the foot-gun. Fix: add an `is_pinned` guard at the top of `StorageTier::remove`. If the CID is still pinned, return `None` (safe no-op) without touching any cache state. Matches Qodo's suggested approach #1 ("safe no-op on pinned CID"). Doc comment updated to describe the guard and the "pinned-as-refusal" semantic. `BookStore::remove` on `ContentStore` is unchanged — it still clears the pinned set when invoked directly. The guard lives at the `StorageTier::remove` layer (which is the only public surface anyway, via `NodeRuntime::remove_content`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…clarification, resolver style) - relay.rs PUT: 5xx/non-429 errors now mark cooldown + continue rotating instead of aborting the pool; last HTTP error status is surfaced only after all relays are exhausted (finding #3) - relay.rs GET: empty available_relays() (all on cooldown) now returns Err(NoRelaysAvailable) immediately instead of Ok(None), preventing false negative-caching of keys that were simply unreachable (finding #4) - resolver.rs: alloc::vec::Vec::new() → Vec::new() style cleanup (finding #5) - derive.rs: inline comment on Case B salt arm pointing to security rationale in doc comment + spec §5.3 (finding #6 clarification only, no logic change) Round-1 fixes confirmed landed: publisher collect-then-publish schedule, wrap_bep44_envelope u64 seq, parse_and_verify 8-byte LE read all verified correct in b58952b (findings #1 and #2 already resolved). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ne DHT HTTP-relay (#270) * docs(zeb-322): impl plan for harmony-pkarr crate (Phase 2a of cross-WAN connectivity initiative) * docs(zeb-322): scope gates to -p harmony-pkarr (pre-existing main breakage tracked in ZEB-324) * feat(zeb-322): harmony-pkarr crate skeleton + PkarrError * fix(zeb-322): address code-quality review (extern crate alloc cfg + unused workspace pkarr dep) * feat(zeb-322): week-aligned epoch math + tolerance window Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(zeb-322): HKDF-SHA256 per-case key derivation + reference vectors Add derive.rs with PkarrCase enum (Invite/Identity/Community), derive_ephemeral_key via HKDF-SHA256 with domain-separated salts, and 5 unit tests including 3 pinned reference-vector tests that lock the v1 keying scheme against future drift. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(zeb-322): PkarrRoutingRecord wire format + inner-sig + skew verify + pin test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(zeb-322): mock pkarr-relay axum server behind test-fixtures feature Adds MockPkarrRelay (PUT/{key} + GET/{key} in-memory store) as an axum server in src/testing.rs, gated on #[cfg(any(test, feature = "test-fixtures"))]. Updates Cargo.toml with reqwest added to test-fixtures feature, optional axum/tokio/hyper/reqwest deps with explicit features, and dev-dependencies for tokio + reqwest so tests run without the feature flag. Two unit tests pass: round-trip PUT-then-GET and GET-missing-returns-404. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(zeb-322): HTTP relay client + pool rotation + cooldown * feat(zeb-322): PkarrPublisher background task + BEP44 envelope sign Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(zeb-322): PkarrResolver with LRU cache + parallel epoch-window queries Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(zeb-322): add axum+hyper as dev-deps so workspace nextest builds without test-fixtures testing.rs is compiled under cfg(test), but axum was only available via the optional test-fixtures feature. cargo nextest run --workspace builds tests without that feature active, so axum was unresolved. Adding both as dev-dependencies makes the test build self-contained regardless of features. * chore(zeb-322): update Cargo.lock for axum+hyper dev-deps in harmony-pkarr * fix(zeb-322): round-1 review fixes (relay all_404, resolver silent-drop, publisher schedule, seq u64, sign_new key match) - relay.rs: add polled_any flag so empty available_relays() returns Err(NoRelaysAvailable) instead of Ok(None) - resolver.rs: RPK1/RPK2 silent-drop discipline — parse_and_verify and verify_inner_sig failures now cache negative and return Ok(None); add verify_inner_sig call; update seq parsing to u64/8-byte - publisher.rs: defer next_publish_at update until after publish attempt (success→epoch slot, failure→60s backoff); change seq from u32 to u64 with 8-byte LE envelope format - record.rs: sign_new key-match guard — returns Err(IdentityMismatch) if signing key doesn't match harmony_identity_pub[32..64]; add test - derive.rs: doc comment clarifying case-B intentional public-key derivation (CodeRabbit misread) - Fix 6 (InvalidRecord() parens) was already correct — no change needed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(zeb-322): round-2 review fixes (PUT 5xx rotation, relay cooldown clarification, resolver style) - relay.rs PUT: 5xx/non-429 errors now mark cooldown + continue rotating instead of aborting the pool; last HTTP error status is surfaced only after all relays are exhausted (finding #3) - relay.rs GET: empty available_relays() (all on cooldown) now returns Err(NoRelaysAvailable) immediately instead of Ok(None), preventing false negative-caching of keys that were simply unreachable (finding #4) - resolver.rs: alloc::vec::Vec::new() → Vec::new() style cleanup (finding #5) - derive.rs: inline comment on Case B salt arm pointing to security rationale in doc comment + spec §5.3 (finding #6 clarification only, no logic change) Round-1 fixes confirmed landed: publisher collect-then-publish schedule, wrap_bep44_envelope u64 seq, parse_and_verify 8-byte LE read all verified correct in b58952b (findings #1 and #2 already resolved). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(zeb-322): round-3 review fixes — RPK4 skew in resolve + GET 5xx cooldown Two P1 findings from Greptile on a2f251a: 1. RPK4 (±30min skew) was claimed enforced in the PR description but never actually called inside `PkarrResolver::resolve`. Stale records were being accepted, positively cached for 15min, and returned to callers. Add the `verify_skew(now_ms)` check between RPK2 and the positive cache-put; negative-cache on skew failure so we don't spam the relay during the 60s window. 2. `RelayClient::get` did not call `mark_cooldown` on non-success non-429 responses (5xx), while `put` did. A misbehaving relay returning persistent 5xx errors would be retried on every GET with no backoff. Mirror the PUT path: mark cooldown + continue. 32/32 tests pass. * fix(zeb-322): round-4 review fixes (resolve_window partial err + publisher unregister/register race) - resolver: track any_ok_none separately from any_err in resolve_window; prefer Ok(None) when at least one key confirmed absence, even if sibling keys errored transiently — only Err when every key errored - publisher: add Arc<AtomicBool> cancelled field to ActivePublication; register() cancels prior entry before replacing; unregister() cancels removed entry; publish_one() short-circuits before PUT if cancelled; post-publish next_publish_at write-back skips if cancelled (fixes unregister in-flight race + mid-flight register overwrite) - relay: verified CodeRabbit finding already addressed (polled_any && all_404 guard in place since round 2) — no change needed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(zeb-322): round-5 review fixes — cache skew re-check + MockPkarrRelay Drop Two findings from round-4 Cursor review of 1e1edc3: 1. resolver.rs cache_get (MEDIUM): A record cached near the +30min skew edge could be served from the 15min positive cache up to 14min after it falls outside the RPK4 skew window. Re-check verify_skew on cache hit; on failure, evict + treat as miss so the next resolve refetches. 2. testing.rs MockPkarrRelay (LOW): The struct documented "drop to stop the server" but had no Drop impl. Add one that aborts the spawned axum task for prompt cleanup across test boundaries. (oneshot::Sender drop would trigger graceful shutdown via RecvError, but abort() is belt-and- suspenders.) 32/32 tests pass. * fix(zeb-322): round-6 — rotate ephemeral key on every publish (Greptile P1) Change `PkarrPublisher::register` to take an `EphemeralKeyBuilder` (`Arc<dyn Fn(u64) -> SigningKey>`) instead of a fixed `SigningKey`. The key is now re-derived on every publish cycle, so it tracks the current epoch as time advances. Before this fix, callers computed `current_epoch_id(now_ms())` once at registration time and passed the resulting key to `register`. After the first epoch boundary, the publisher kept writing under the previous epoch's DHT key while resolvers (which derive against the current epoch ± tolerance) stopped finding the record — silently breaking discovery for all three pkarr cases (invite, identity, community fallback) after one or two epoch boundaries. Greptile flagged this as P1 on harmony-client PR #158; the architectural fix lives in this crate because `publish_one` is where the key is actually consumed. Also fixes Cursor LOW: removed dead `polled_any` variable in `RelayClient::get` (the empty-relays guard above the loop already ensures we polled at least one relay, so the variable was logically dead). Replaced with a comment explaining the remaining `all_404` guard's purpose. New test `ephemeral_key_builder_invoked_each_publish` registers a builder that returns two distinct keys on successive calls and verifies both surfaces land on the mock relay across forced publish cycles. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>


Note
High Risk
Introduces new cryptographic and identity primitives (AEAD, Fernet, HKDF, key handling) where subtle correctness issues can have security implications; although largely additive, it’s security-sensitive code that will be reused broadly.
Overview
Adds a new Rust workspace (
Cargo.toml,Cargo.lock) with two new crates:harmony-crypto(hashing, HKDF, Reticulum-compatiblefernettoken encryption, and Harmony-native ChaCha20-Poly1305aead, plus sharedCryptoErrortypes) andharmony-identity(Ed25519/X25519 identity generation, serialization, signing/verification, and encrypt/decrypt using ephemeral ECDH + HKDF + Fernet).Includes extensive unit tests and a
reticulum_interoptest suite with Python-derived vectors to validate byte-level compatibility. Also updates.gitignoreto ignore Dolt.beads/*runtime files and removes those tracked artifacts.Written by Cursor Bugbot for commit 5a63244. This will update automatically on new commits. Configure here.