feat(dkim): hand-rolled RFC 6376 verifier (PR 2 of email-recovery stack)#3839
Closed
sea-snake wants to merge 21 commits into
Closed
feat(dkim): hand-rolled RFC 6376 verifier (PR 2 of email-recovery stack)#3839sea-snake wants to merge 21 commits into
sea-snake wants to merge 21 commits into
Conversation
…iring First PR in the email-recovery stack (docs/ongoing/email-recovery.md §10 Phase 0). Lands the structural pieces of the DNSSEC verifier so PR dfinity#2 (DKIM verifier) and PR dfinity#4-dfinity#9 (storage + recovery methods) can build against the types. Cryptographic verification logic and real DoH-captured test vectors arrive in PR #1b. What's in this PR: - New workspace crate internet_identity_email_test_vectors with a placeholder loader and a README explaining what arrives in PR #1b (DNSSEC chains, DKIM happy-path + tampering vectors, DMARC alignment matrix). - New DnssecConfig and DnssecRootAnchor types in internet_identity_interface, exposed at the top of InternetIdentityInit as 'dnssec_config'. Not specific to email recovery — any feature that verifies DNS records against the IANA-rooted DNSSEC chain consumes the same anchors. - New dnssec/ module under src/internet_identity/src/ with type definitions (DnsProofBundle, SignedRRset, DelegationLink, Rrsig, DnsName, DnssecError, VerifiedRecord) and a stub 'verify' that returns Err(DnssecError::NotImplemented). Step-by-step TODOs reference §7.3 of the design doc. - Trust-anchor list plumbed through init/post_upgrade into PersistentState.dnssec_config (and through StorablePersistentState for cross-upgrade persistence). - Two unit tests for the stub verifier (NoTrustAnchors path, NotImplemented path) — flip to positive assertions in PR #1b. - internet_identity.did updated with DnssecConfig / DnssecRootAnchor and the new init field. What's deferred to PR #1b: - RRSIG / DS / DNSKEY canonicalization and signature verification per RFC 4034 §6. - Crypto deps for ECDSA-P256-SHA256 (alg 13) and Ed25519 (alg 15). RSA- SHA256 (alg 8) deps already in the workspace. - Real DoH-captured DNSSEC chains for gmail.com, icloud.com, outlook.com, fastmail.com, proton.me, plus deliberately-tampered negatives. Build: cargo check --target wasm32-unknown-unknown clean; 227 internet_identity bin tests pass; 42 internet_identity_interface unit tests pass.
…scripts The Dockerfile pre-builds workspace dependencies by COPY-ing each crate's Cargo.toml and stubbing its lib.rs. Adding internet_identity_email_test_vectors to Cargo.toml's workspace members without updating these hardcoded lists causes 'docker-build-base' to fail at the cargo manifest discovery step. Also add the new crate to the BACKEND_PATHS lists in .github/actions/release/run.sh and scripts/make-upgrade-proposal so it is included in release tarballs alongside canister_tests (also test-only).
- Drop the internet_identity_email_test_vectors crate. Test vectors will live as plain files at the repo root in PR 1b (still TBD which path); no separate crate needed for what is just data-on-disk shared between internet_identity unit tests and canister_tests integration tests via include_bytes! / fs::read. - Revert the Dockerfile + release-script entries that registered the now-removed crate. - Switch InternetIdentityInit.dnssec_config from Option<DnssecConfig> to Option<Option<DnssecConfig>> to match the same set/clear pattern as analytics_config and dummy_auth (per Copilot review). Outer None keeps the previously stored value; Some(None) clears; Some(Some(c)) sets. Avoids a future breaking Candid change if operators ever need to detach trust anchors. Updates the consumer in apply_install_arg and the round-trip in config(), plus the .did declaration. - Narrow the dnssec/mod.rs allow from `#![allow(dead_code, unused_imports)]` to just `#![allow(dead_code)]`, with per-item `#[allow(unused_imports)]` on the two re-exports that need it. This keeps unused_imports active so a real issue isn't masked when the verifier implementation lands in PR 1b.
Replaces the stub verify() with a working four-step DNSSEC validator
per docs/ongoing/email-recovery.md §7.3:
1. Validate the bundle's root DNSKEY RRset against a configured trust
anchor (matches digest of one DNSKEY KSK, then verifies the RRSIG
covering the entire root DNSKEY RRset under that key).
2. Walk the delegation chain top-down: each link's DS RRset verifies
under the parent's DNSKEY, the child's DNSKEY RRset is self-signed
by a KSK whose digest matches one of the parent's DS records.
3. Verify the leaf RRset's RRSIG under the deepest zone's DNSKEY.
4. Freshness check: every RRSIG's [inception, expiration] window must
contain now ± 60s.
Algorithm coverage (RFC 8624 MUST):
* 8 — RSA-SHA256, RFC 5702 (root, com, most legacy zones)
* 13 — ECDSA-P256-SHA256, RFC 6605 (most TLDs, Cloudflare)
* 15 — Ed25519, RFC 8080
Anything else is rejected with UnsupportedAlgorithm.
New deps: domain (NLnet Labs primitives — currently used for the
docstring/RFC reference frame; signature verification is hand-rolled
on RustCrypto), p256, ed25519-dalek. All wasm32-compatible.
New files:
- src/internet_identity/src/dnssec/canonical.rs — owner-name
canonicalization, RR canonical form, RRSIG signed-data
construction (RFC 4034 §3.1.8.1, §6.2, §6.3), DS digest input.
- src/internet_identity/src/dnssec/signature.rs — algorithm
dispatch + DS digest matching.
- src/internet_identity/src/dnssec/test_vectors.rs — cfg(test)
JSON loader.
- test_vectors/dnssec/cloudflare-com-2026-05.json — captured DoH
chain for cloudflare.com TXT (root → com → cloudflare.com).
Exercises both alg 8 and alg 13.
- test_vectors/dnssec/iana-root-anchors-2026-05.json — IANA root
KSK trust anchors (Klajeyz/2017 + Kmyv6jo/2024).
- scripts/capture-dnssec-chain.py — reproducible DoH capture script.
Modified:
- dnssec/types.rs: drop NotImplemented variant (now obsolete).
- dnssec/verify.rs: replace stub with the real implementation; add
7 unit tests covering happy path + 6 negative cases.
- Cargo.toml + internet_identity/Cargo.toml: new deps.
Tests: 238 internet_identity bin tests pass (227 prior + 11 new dnssec
tests). Wasm32 build clean; full workspace cargo check clean (modulo the
unrelated frontend dist/ requirement).
Note on test data lifetime: the captured RRSIGs' validity windows expire
~2-4 weeks after capture. The tests use a 'frozen now' read from the
capture's _meta.captured_at_unix so freshness checks remain stable.
Re-run scripts/capture-dnssec-chain.py to refresh the captures when the
expiration approaches.
The clippy job flagged `#![cfg(test)]` on test_vectors.rs as a duplicated attribute (the module declaration in dnssec/mod.rs already gates it with `#[cfg(test)]`). Removed the redundant inner gate. Also applied rustfmt to the four new dnssec module files so cargo-fmt stays green.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR is part of the email-recovery stack and introduces a hand-rolled DKIM (RFC 6376) verifier that operates on the SMTP gateway’s parsed SmtpRequest plus an already-trusted DKIM TXT record, alongside the supporting DNSSEC verifier/config plumbing and committed fixtures used by tests.
Changes:
- Add DKIM verification implementation (parsing, canonicalization, body-hash, signature verification, orchestration) plus end-to-end fixture-based tests.
- Introduce DNSSEC verification types/verifier and test-vector loaders, plus trust-anchor configuration persisted across upgrades and exposed via Candid/init args.
- Add SMTP gateway protocol wire types + input-bound validation, and commit DNSSEC/DKIM test vectors and capture tooling.
Reviewed changes
Copilot reviewed 31 out of 32 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
test_vectors/dnssec/iana-root-anchors-2026-05.json |
Adds root trust-anchor fixture used by DNSSEC verifier tests. |
test_vectors/dnssec/cloudflare-com-2026-05.json |
Adds a captured DNSSEC chain fixture for end-to-end DNSSEC verification tests. |
test_vectors/dkim/synth-rsa-test1._domainkey.test.example.com.txt |
Adds DKIM TXT public key fixture for DKIM end-to-end tests. |
test_vectors/dkim/synth-rsa-simple-simple.eml |
Adds a DKIM-signed email fixture exercising c=simple/simple rejection. |
test_vectors/dkim/synth-rsa-relaxed-simple.eml |
Adds a DKIM-signed email fixture exercising relaxed header + simple body. |
test_vectors/dkim/synth-rsa-relaxed-relaxed.eml |
Adds a DKIM-signed email fixture exercising relaxed/relaxed canonicalization. |
test_vectors/dkim/README.md |
Documents DKIM fixture set and provenance/regeneration approach. |
src/internet_identity/src/storage/storable/storable_persistent_state.rs |
Persists dnssec_config through stable storage and updates defaults/tests. |
src/internet_identity/src/state.rs |
Adds dnssec_config to PersistentState with upgrade-survival semantics. |
src/internet_identity/src/main.rs |
Wires in dkim/dnssec modules and plumbs dnssec_config through init/apply logic. |
src/internet_identity/src/dnssec/verify.rs |
Implements DNSSEC bundle verification (root anchor → chain walk → leaf + freshness). |
src/internet_identity/src/dnssec/types.rs |
Defines DNSSEC bundle/rrset types and verifier error taxonomy. |
src/internet_identity/src/dnssec/test_vectors.rs |
Adds JSON test-vector loader for DNSSEC bundles/anchors. |
src/internet_identity/src/dnssec/signature.rs |
Implements DNSSEC signature verification for alg 8/13/15 and DS matching. |
src/internet_identity/src/dnssec/mod.rs |
Declares DNSSEC module structure and exports verifier entry points/types. |
src/internet_identity/src/dnssec/canonical.rs |
Implements canonical encoding helpers for signed-data and DS digest input. |
src/internet_identity/src/dkim/verify.rs |
Orchestrates DKIM verification across multiple signatures and emits per-step checks. |
src/internet_identity/src/dkim/types.rs |
Defines DKIM algorithms/canonicalization enums, checks, and result types. |
src/internet_identity/src/dkim/test_vectors.rs |
Adds DKIM .eml loader + end-to-end tests against committed fixtures. |
src/internet_identity/src/dkim/signature.rs |
Implements RSA-SHA256 and Ed25519-SHA256 DKIM signature verification + body hashing. |
src/internet_identity/src/dkim/parse.rs |
Parses DKIM-Signature tag lists with duplicate rejection and robust splitting. |
src/internet_identity/src/dkim/mod.rs |
Declares DKIM module layout and exports public API/types. |
src/internet_identity/src/dkim/dns_record.rs |
Parses DKIM DNS TXT records (tags, flags, key material) with tolerances. |
src/internet_identity/src/dkim/canonicalize.rs |
Implements relaxed header/body canonicalization routines used by DKIM verification. |
src/internet_identity/internet_identity.did |
Extends Candid with DnssecConfig/DnssecRootAnchor and dnssec_config init field. |
src/internet_identity/Cargo.toml |
Adds crypto/DNSSEC-related dependencies to the canister crate. |
src/internet_identity_interface/src/internet_identity/types/smtp.rs |
Introduces SMTP gateway request/response types and bounds-based validation helpers. |
src/internet_identity_interface/src/internet_identity/types/dnssec.rs |
Adds Candid types for DNSSEC trust-anchor configuration. |
src/internet_identity_interface/src/internet_identity/types.rs |
Re-exports new dnssec/smtp modules and adds dnssec_config to init args. |
scripts/capture-dnssec-chain.py |
Adds tooling to capture DNSSEC proof bundles from DoH into repo fixtures. |
Cargo.toml |
Adds workspace dependencies for DNSSEC verifier crypto crates. |
Cargo.lock |
Locks newly introduced dependencies and transitive updates. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
6 tasks
- Drop unused domain workspace dep. The dep was added during planning when we considered using NLnet Labs' domain crate for canonicalization, but the verifier ended up hand-rolling everything (canonicalize_name, rrsig_rdata_for_signing, rr_canonical, ds_digest_input) on top of rsa/sha2/p256/ed25519-dalek alone. Removing it saves a 66-line Cargo.lock entry and shrinks dependency surface to no actual cost. - scripts/capture-dnssec-chain.py docstring example: rename name/rdata/ signature to name_hex/rdata_hex/signature_hex to match what the script actually emits (and what the canister-side test loader expects). - scripts/capture-dnssec-chain.py --resolver help: was 'cloudflare-dns.com' but the actual default is https://1.1.1.1/dns-query. Make the help text match the default.
e658f44 to
ec371aa
Compare
This was referenced May 5, 2026
Closed
- Replace magic numbers across the dnssec module with named
constants, co-located inside their using function when single-use
and hoisted to wire.rs only when shared across modules. Every
RDATA field offset and length now traces back to a named RFC
reference.
- Beef up RFC citations on every public item: doc comments reference
specific RFC 4033 / 4034 / 4035 / 5702 / 6605 / 8080 / 8624 sections
rather than handwaving at "DNSSEC". Algorithm numbers in narrative
text use the algorithm name (RSA-SHA256, ECDSA-P256-SHA256,
Ed25519) instead of the bare IANA number.
- Add real-data test coverage for the four email-recovery target
zones and the Ed25519 algorithm:
* proton.me — RSA-SHA256 end-to-end
* protonmail.com — ECDSA-P256-SHA256 leaf
* tutanota.com — ECDSA-P256-SHA256 leaf
* ed25519.nl — Ed25519 leaf (closes the alg-15 real-data gap)
Brings real-data algorithm coverage to the full RFC 8624 MUST set.
Tests count: 243 (was 238).
- Remove scripts/capture-dnssec-chain.py — the captures committed in
this change cover every algorithm and every email-recovery target
the design doc names, so the script's job is done. Re-capture on
root KSK rollover will be a one-off ad-hoc effort if it ever
becomes necessary.
ec371aa to
96b2e42
Compare
Three small fixes that lived on feat/doh-fallback as part of a combined "Copilot review" commit (516c48d) properly belong in PR 1 since they all concern the dnssec scaffold: - verify.rs: handle the multi-anchor case correctly. During a KSK rollover the operator configures both the rolling-out and rolling-in KSKs (RFC 5011 §2). The old "return on first digest match" strategy could short-circuit onto the inactive anchor and never try the active one. Track digest-match state, try every candidate, and surface the cryptographic failure (not the generic RootAnchorMismatch) when at least one anchor matched but no signature verified. - mod.rs: doc-comment correctness. The verifier itself makes no outcalls, but the email-recovery stack uses DoH for unsigned domains (PR 4). The previous wording overclaimed. - iana-root-anchors-2026-05.json: corrected stale "_comment" that said the historical 19036 KSK was included. It isn't, and shouldn't be — production deployments should configure 20326 + 38696 only.
96b2e42 to
0a87f73
Compare
Three improvements that lived as commits on later branches in the email-recovery stack are scaffold-level — they belong in PR 1 with the rest of the verifier, not buried in feature PRs that happen to need them. Back-port them here so PR 1 ships a complete scaffold that downstream feature PRs can build on without extending the verifier surface in passing. - Two-phase API split (originally on feat/email-recovery-storage-and-smtp as cfcc99e): expose `verify_root_dnskey_with_clock`, `verify_chain_with_clock`, `verify_extra_chains_with_clock`, and `verify_hops_with_clock` as standalone entry points alongside the top-level `verify_bundle_with_clock`. Two-phase callers (prepare → submit-leaf) can now cache a validated `ZoneKeysMap` across calls and only validate the new chains a follow-up submission crossed into, avoiding 3-5 RSA verifies per call. - Multi-zone + CNAME-aware bundle shape (originally feat/email-recovery-flow as 43a466e): real DKIM resolution often crosses zone boundaries via CNAME (proton.me → proton.ch, tutanota.com → tutanota.de, M365 custom domains). Replace the old single-chain `DnsProofBundle { root_dnskey, chain, leaf }` with `DnsProofBundle { root_dnskey, chains, hops }` where each chain pins a signing zone in the new `ZoneKeysMap` and each hop is verified under whichever zone its `RRSIG.signer_name` identifies. Adds CNAME-chain coherence checks (first owner matches requested name, intermediates are CNAMEs whose target equals the next owner, no loops, ≤ MAX_CNAME_HOPS = 4) and four new error variants (DuplicateZone, UnknownSigningZone, HopOwnerOutsideZone, BadCnameChain, TooManyHops). - DNSKEY-RRset RRSIG fix (originally feat/email-recovery-flow as 6788020): loosen step 2c of `verify_link` to verify the child DNSKEY rrset against any key in itself, not just the DS-pinned KSK. Real-world zones (proton.me, proton.ch, …) publish a DNSKEY rrset signed by both the KSK and the ZSK, and resolvers return whichever RRSIG comes first; resolving this on the FE side isn't generally possible. Step 2b still pins the DS-referenced KSK as part of the rrset, so the chain of trust is intact regardless of which RRSIG we verify. Test-vector JSONs migrated to the new `{root_dnskey, chains, hops}` shape. Five new tests cover the new failure modes (duplicate zone, too many hops, hop owner mismatch, wrong requested type, subdomain boundary). All five real-data chains still verify end-to-end. Counts: 248 tests pass (was 243); cargo clippy clean with -D warnings; wasm32 build clean.
0a87f73 to
46bb312
Compare
Two-phase consumers (prepare → submit-leaf in the email-recovery flow, landing in PR 5+6) need to inspect the validated zones after the chain walk — typically to extract the single zone DNSKEY when the skeleton bundle is single-zone, or to enumerate the map when caching it across calls. Adding a read-only iterator is the minimum surface change to support both. Insertion order matches the order delegation chains were verified in, which is the order callers supplied them.
46bb312 to
5415f58
Compare
Minor-version bumps inside the 0.9 line — semver-compatible, no API changes. Picks up the defensive validations added in 0.9.7 (`RsaPrivateKey::from_components` always validates keys, PKCS#1 v1.5 no longer panics on tiny keys) and 0.9.8. Does not address RUSTSEC-2023-0071 (Marvin Attack timing sidechannel) — there's still no patched version on the 0.9.x line and 0.10.0 is in -rc. Note that the Marvin Attack is a private-key recovery attack on RSA decryption/signing; this canister only does `RsaPublicKey::verify(...)` of caller-supplied bytes, so the threat doesn't apply. We can document that with a `cargo audit` ignore-with-justification when we add the audit step. ed25519-dalek 2.2.0 and p256 0.13.2 are already on their latest 0.x.y stable.
Carries forward the wire-format Candid surface the off-chain SMTP gateway already targets in the PoC PR, slimmed to just what PR 2's DKIM verifier and PR 8's smtp_request dispatch will need: - SmtpRequest / SmtpResponse / SmtpHeader / SmtpMessage / SmtpAddress / SmtpEnvelope, plus the size bounds and SMTP error codes. - Validation helpers: validate_envelope, validate_message, validate_smtp_request. - format_address (lowercases both halves so envelope casing can't bypass per-anchor lookups) and truncate_at_char_boundary (fallback to previous UTF-8 boundary, avoiding the multi-byte panic the PoC review flagged). Drops the postbox-specific bits (PostboxEmail, ValidatedSmtpRequest, the to.user → anchor parser) — they don't fit the email-recovery design, see docs/ongoing/email-recovery.md §2 non-goals. 10 unit tests for bounds, address normalisation, and char-boundary truncation. Wasm32 build unaffected.
First piece of the hand-rolled DKIM verifier. Going hand-rolled (rather than depending on stalwartlabs/mail-auth) because mail-auth pulls a non-optional hickory-resolver dep that fails to compile for wasm32-unknown-unknown — we'd need to fork+patch with perpetual rebase burden to use it. This commit lands: - src/internet_identity/src/dkim/types.rs: Algorithm (RsaSha256, Ed25519Sha256), HeaderCanon / BodyCanon (Relaxed, Simple), DkimCheck / DkimCheckName / DkimCheckStatus per-step diagnostics, and the public EmailVerificationStatus / VerificationFailReason result shape. - src/internet_identity/src/dkim/parse.rs: structural tag-list parser for the DKIM-Signature header value. Splits on ';' first, then on the first '=' per element, so a literal 'b=' substring inside another tag's base64 doesn't get misread as the start of a new tag — that was the concrete bug the PoC PR review flagged. Folded whitespace inside base64 values is stripped before decoding. Tag names are case-insensitive; duplicate tags are rejected. - mod.rs scaffolding wired into main.rs. dnssec module remains in place. 14 unit tests cover: minimal required tags, case-insensitive tag names, folded whitespace, the b=-inside-bh= antipattern, v != 1, duplicate tags, unsupported algorithms (rsa-sha1), ed25519-sha256, l/t/x tags, explicit i= override, default canon, empty h=, missing required tags, malformed base64. Wasm32 build clean; full II suite still passes.
Implements the relaxed canonicalisation algorithms — the only ones our DKIM verifier accepts on the header side (the parsed-pair gateway contract precludes byte-exact 'simple/*' header canonicalization, see design doc §5.2). Header (§3.4.2): lowercase the name; unfold continuation lines (CRLF + WSP -> WSP); collapse runs of WSP within the value to a single SP; strip trailing WSP from the value; strip WSP around the colon. Single- pass implementation that handles all five steps in one walk. Body (§3.4.4): per-line WSP cleanup (collapse runs, strip trailing); drop empty lines from the end of the body; ensure non-empty output ends in exactly one CRLF; empty input maps to empty output (no synthetic trailing CRLF). 18 unit tests cover both algorithms, including the edge cases that trip up naïve implementations: empty bodies, bodies with only whitespace, bodies without a trailing CRLF, internal empty lines that must NOT be stripped, header values that are entirely whitespace, and folded continuations with both space and tab continuations.
Parses the published DKIM key record into a DkimDnsRecord: - key_type: rsa (default) or ed25519 - public_key: base64-decoded p= value (empty = revoked) - testing: t=y flag - strict_auid: t=s flag Tag handling per §3.6.2.1 / §3.6.2.2: - v=DKIM1 if present must be first; absent is acceptable. - Tag names are case-insensitive (the PoC PR review specifically flagged P= versus p= mismatch as a real-world bug). - Whitespace inside p= is tolerated — DNS TXT records can be split across multiple <character-string>s and may carry stray WSP at chunk boundaries. - Duplicate tags are rejected (defence-in-depth even though the RFC technically allows shadowing). - t= flag list is colon-separated; we honour 'y' and 's', ignore others. - Unknown tags are silently dropped per RFC. 16 unit tests cover defaulting, both key types, both t= flags individually and combined, malformed inputs (missing p, v not first, unsupported v/k), whitespace inside p, empty p, and key-type/algorithm matching. Wasm32 build remains clean.
Both algorithms RFC 8624 specifies as MUST for DKIM (RFC 8301 + RFC 8463), no others. Reuses rsa, sha2, and ed25519-dalek that are already in the workspace from PR 1. RSA path (RFC 5702 / RFC 6376 §3.3.1): - DKIM publishes RSA keys in SubjectPublicKeyInfo (DER), so we use RustCrypto's DecodePublicKey::from_public_key_der which handles the X.509 + PKCS#1 wrapping. - Enforces RSA_MIN_KEY_BITS = 1024 per design §5.6 (planned lift to 2048 once telemetry confirms zero impact). - Sanity-checks signature length == modulus length before invoking RSASSA-PKCS1-v1_5 verify, so a structural mismatch surfaces as a clear MalformedSignature instead of an opaque crypto error. Ed25519 path (RFC 8463): - Public key is 32 raw bytes (no SPKI wrapping), signature is 64 bytes. - Wraps the input in SHA-256 ourselves before calling ed25519-dalek's pure-Ed25519 verify, since RFC 8463 specifies signing the SHA-256 of the canonical header hash input rather than the input directly. Plus body_hash_sha256 (the bh= side): SHA-256 of the canonical body, optionally truncated to l= bytes per RFC 6376 §3.4.5. Truncation caps at the body length so an l= larger than the body just uses the whole thing. 7 unit tests cover algorithm/key-type mismatches, malformed RSA keys (non-SPKI bytes), wrong-length Ed25519 keys / signatures, and the body hash with and without l=. Wasm32 build remains clean.
Ties parse + canonicalize + dns_record + signature into the public `verify(email, dkim_txt, now_secs)` API. Implements: - Multi-signature loop per RFC 6376 §5.5: try every DKIM-Signature header in order, accept on first pass, return Unverified with the most recent reason if all fail. - Tag enforcement per design §5.4: - c=simple/* on header side rejected with UnsupportedCanonicalization - x= expiration check - i= alignment with d= (subdomain-permissive unless DNS t=s set) - DNS k= must match signature a= - DNS t=y testing-mode -> Unverified(TestingMode) - Header bottom-up selection per RFC 6376 §5.4 — when h= lists a name multiple times we pick distinct occurrences walking from the bottom. - DKIM-Signature header itself canonicalised with b=value blanked, no trailing CRLF, per §3.7. The b= blanker is structural — only blanks at top-level tag positions, so a literal 'b=' substring inside another tag's base64 is never mistargeted (the bug class the PoC PR review flagged). - Per-signature DkimCheck breakdown so a UI can render which step failed. 12 unit tests cover: AUID alignment (exact, subdomain, evil-suffix, local-only), b= blanking (simple, bh-not-blanked, b-at-start, no- internal-substring-misblank), no signature, c=simple rejection, expired signature, misaligned i=. The cryptographic happy-path tests land alongside the captured/synthetic test vectors in the next commit. 67 DKIM tests pass total (parse 14 + canonicalize 18 + dns_record 16 + signature 7 + verify 12). Wasm32 build clean.
Lands the synthetic .eml fixtures plus the loader that turns them into SmtpRequest-shaped data the verifier actually consumes. End-to-end tests exercise the full pipeline: parse_eml -> SmtpRequest -> verify. Synthetic vectors were generated offline using dkimpy (Python's reference DKIM implementation) against a freshly-generated throwaway 2048-bit RSA key. The private key is *not* committed — only the .eml output and the matching public DKIM TXT record. README in test_vectors/dkim/ documents provenance and regeneration. Three .eml files cover: - c=relaxed/relaxed (the common case for mainstream senders) - c=relaxed/simple (header relaxed, body simple — accepted) - c=simple/simple (rejected with UnsupportedCanonicalization) Eight tests in dkim/test_vectors.rs cover: - happy-path verification of relaxed/relaxed and relaxed/simple - simple/simple rejection - flipped body byte -> BodyHashMismatch - flipped signature byte -> SignatureInvalid (or related) - wrong public key -> SignatureInvalid - no DKIM-Signature header -> NoSignature - the parse_eml helper itself (sanity check) The .eml parser unfolds continuation lines per RFC 5322 §2.2.3, drops the conventional single SP after the colon (matching what the gateway does in production), and preserves the rest of the header value bytes verbatim — so relaxed canonicalisation downstream produces the same form the signer hashed. Verifier totals: 75 DKIM tests (parse 14, canonicalize 18, dns_record 16, signature 7, verify 12, end-to-end 8). Full II suite: 313 tests pass (was 238 pre-PR). Wasm32 build clean.
- Use matches!() instead of explicit match arms for KeyType::matches_signature_alg - rustfmt across all dkim/ files and types/smtp.rs - Stale test_vectors.rs doc-comment said the throwaway private key 'is committed' — corrected to 'is *not* committed' (the fixtures' provenance README is the source of truth)
- blank_b_tag_value: preserve original bytes around the b= tag instead
of always emitting literal 'b='. RFC 6376 §3.7 says only the tag
value (and surrounding WSP) is replaced; the tag name and the bytes
between the name and '=' must come through verbatim. Concretely,
signers that emit 'B=' (uppercase) or 'b\t=' / 'b =' (FWS between
the name and '=') are valid per §3.2 and were previously mis-handled
— relaxed canonicalisation collapses both forms to 'b:' downstream,
so the bug only surfaces on signatures from those few senders, but
by-construction is more honest. New tests cover uppercase B,
tab-between-name-and-equals, and space-between-name-and-equals.
- test_vectors.rs module-level docstring: corrected to say the private
key is *not* committed (the README and the actual tree both already
said this; the docstring was the only contradiction). Dropped the
reference to scripts/sign-dkim.py since that file isn't checked in.
- smtp.rs validate_smtp_request_envelope_only_ok: tightened the
assertion from matches!(..., Ok(()) | Err(SmtpResponse::Ok{..}))
to is_ok(). The Err arm was unreachable (SmtpResponse::Err always
wraps SmtpRequestError) and would have masked a regression where
validation mistakenly returned the Ok variant inside Err.
5415f58 to
606482a
Compare
Contributor
Author
|
Replaced with a new PR from an upstream branch (enables direct collaboration). Same content, new PR number. |
This was referenced May 12, 2026
Merged
pull Bot
pushed a commit
to mikeyhodl/internet-identity
that referenced
this pull request
May 12, 2026
…iring (dfinity#3838) ## Summary First PR in the email-recovery stack (`docs/ongoing/email-recovery.md` §10 Phase 0). Lands a working RFC-4035-compliant DNSSEC verifier for caller-supplied DNS proof bundles, plus the trust-anchor wiring that drives it. PR 2 (DKIM verifier) and PRs 4–9 (storage + recovery methods) build on this. ## What's in this PR ### Verifier core - New `dnssec/` module under `src/internet_identity/src/`: - `types.rs` — `DnsProofBundle`, `SignedRRset`, `DelegationLink`, `Rrsig`, `DnsName`, `DnssecError`, `VerifiedRecord`. - `canonical.rs` — owner-name canonicalization, RR canonical form, RRSIG signed-data construction (RFC 4034 §3.1.8.1, §6.2, §6.3), DS digest input. - `signature.rs` — algorithm dispatch + DS digest matching (SHA-256). - `verify.rs` — four-step algorithm (root anchor match → chain walk → leaf RRSIG → freshness). - Algorithm coverage (RFC 8624 MUST set): - **alg 8** — RSA-SHA256 (RFC 5702): root, com., most legacy zones. - **alg 13** — ECDSA-P256-SHA256 (RFC 6605): most TLDs, Cloudflare, modern zones. - **alg 15** — Ed25519 (RFC 8080): rare in production but mandatory. - Anything else returns `UnsupportedAlgorithm`. ### Wiring - New `DnssecConfig` and `DnssecRootAnchor` types in `internet_identity_interface`, exposed at the top of `InternetIdentityInit` as `dnssec_config: opt opt DnssecConfig` (set/clear semantics matching `analytics_config` and `dummy_auth`). - Trust-anchor list plumbed through `init`/`post_upgrade` into `PersistentState.dnssec_config` (and `StorablePersistentState` for cross-upgrade persistence). - `internet_identity.did` updated. ### Tests 13 unit tests in `dnssec/` covering: - Real cloudflare.com chain verifies end-to-end (exercises alg 8 at root and alg 13 at com → cloudflare.com → leaf). - Empty trust-anchor list rejected with `NoTrustAnchors`. - Wrong trust anchor rejected with `RootAnchorMismatch`. - Flipped byte in root DNSKEY → `RootAnchorMismatch` or `BadSignature`. - Flipped byte in leaf signature → `BadSignature`. - Stale signature (clock advanced past expiration) → `StaleOrFutureSignature`. - Unsupported algorithm (alg 5 / RSA-SHA1) → `UnsupportedAlgorithm(5)`. - Plus canonical-encoding + RFC 3110 RSA key parsing unit tests. ### Test infrastructure - `test_vectors/dnssec/cloudflare-com-2026-05.json` — real DoH-captured chain (root DNSKEY + 2 delegation links + leaf TXT). - `test_vectors/dnssec/iana-root-anchors-2026-05.json` — IANA root KSK trust anchors (Klajeyz/2017 + Kmyv6jo/2024). - `scripts/capture-dnssec-chain.py` — reproducible capture script using dnspython + DoH wire format. Tests use a frozen now from the capture's metadata so freshness checks stay stable indefinitely. ### New deps - `domain` (NLnet Labs, pure Rust) — referenced in docstrings for canonicalisation primitives; signature verification is hand-rolled on top of RustCrypto. - `p256` — ECDSA P-256 verification. - `ed25519-dalek` — Ed25519 verification. All three build cleanly for wasm32-unknown-unknown. ## What's deferred to later PRs in the stack - Captures for additional providers (proton.me, protonmail.com, tutanota.com — gmail.com / icloud.com / outlook.com / fastmail.com don't sign with DNSSEC; this is acknowledged in design doc §7.6). - Synthetic Ed25519 (alg 15) test vector — most production zones are alg 8 or 13; alg 15 is structurally exercised by the dispatch logic but doesn't have a real captured chain in this PR. ## Test plan - [x] `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean (no warnings). - [x] `cargo test -p internet_identity --bin internet_identity` — 238 tests pass (was 227 pre-PR). - [x] `cargo test -p internet_identity_interface --lib` — 42 tests pass. - [x] `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - [x] `cargo fmt --check` — clean (modulo a pre-existing diff in attributes.rs unrelated to this PR). - [x] CI on dfinity#3838 — fully green. ## Design doc https://github.com/sea-snake/internet-identity/blob/design/email-recovery/docs/ongoing/email-recovery.md (PR pending review on dfinity/internet-identity) ## Stack This is PR 1 of a 12-PR series. Subsequent PRs: - **PR 2** — mail-auth-backed DKIM verifier, consuming DnsProofBundle from this PR. - **PR 3** — DMARC alignment. - **PRs 4–9** — storage + Candid + behavior for email recovery. - **PRs 10–12** — frontend (DoH walker, Manage UI, recovery wizard). ## PR Stack | # | PR | Description | Status | |---|---|---|---| | 0 | [dfinity#3836](dfinity#3836) | Design doc | Open | | 1 | [dfinity#3838](dfinity#3838) | DNSSEC verifier scaffold | Open | | 2 | [dfinity#3839](dfinity#3839) | DKIM verifier (RFC 6376) | Open | | 3 | [dfinity#3840](dfinity#3840) | DMARC alignment (RFC 7489) | Open | | 4 | [dfinity#3841](dfinity#3841) | DoH fallback | Open | | 5+6 | [dfinity#3842](dfinity#3842) | Setup flow (storage + smtp_request) | Open | | 7 | [dfinity#3843](dfinity#3843) | Recovery flow (delegation) | Open | | 8 | [dfinity#3844](dfinity#3844) | Frontend + feature flag | Open | | 9 | [dfinity#3855](dfinity#3855) | Deploy/upgrade scripts: dnssec_config + doh_config | Open | | 10 | [dfinity#3857](dfinity#3857) | Email-recovery UX overhaul | Open |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR 2 of the email-recovery stack (
docs/ongoing/email-recovery.md§10 Phase 0). Stacks on top of PR 3838 (DNSSEC verifier). Lands a hand-rolled RFC 6376 DKIM verifier that consumes a parsedSmtpRequestplus an already-trusted DKIM TXT record and returns a per-stepEmailVerificationStatus.Note: This PR targets
mainbut includes PR 3838's commits (DNSSEC verifier) as its base. Review the DKIM-specific changes by looking at commits after9bbd8717(the last PR 3838 commit). Once PR 3838 merges, this PR's diff will shrink to just the DKIM additions.Why hand-rolled
The design originally specified
mail-auth(Stalwart's well-tested DKIM library), but mail-auth pulls a non-optionalhickory-resolverdep that fails to compile forwasm32-unknown-unknown(transitive: tokio + mio). Forking + patching mail-auth would be possible but creates perpetual rebase burden. We hand-roll instead — "the right way, no shortcuts" was the explicit guidance.What's in this PR
src/internet_identity_interface/src/internet_identity/types/smtp.rsBrings forward the SMTP gateway protocol types from PoC PR 3760:
SmtpRequest/SmtpResponse/SmtpHeader/SmtpMessage/SmtpAddress/SmtpEnvelope, the size bounds, and the input-bound validation (format_addresslowercases both halves;truncate_at_char_boundaryclamps to the previous UTF-8 boundary so a multi-byte subject can't trap the canister). Drops postbox-specific bits (PostboxEmail, ValidatedSmtpRequest, anchor-number parser).src/internet_identity/src/dkim/types.rs— Algorithm (RsaSha256, Ed25519Sha256), HeaderCanon/BodyCanon (Relaxed, Simple), DkimCheck/DkimCheckName/DkimCheckStatus per-step diagnostics, EmailVerificationStatus / VerificationFailReason result shape.parse.rs(RFC 6376 §3.5) — DKIM-Signature header tag-list parser. Splits structurally on;first then on the first=per element, so a literalb=substring inside another tag's base64 doesn't get misread as a new tag start (the bug class the PoC PR review specifically flagged). Folded whitespace inside base64 values is stripped before decoding. Tag names case-insensitive; duplicates rejected.canonicalize.rs(§3.4.2 / §3.4.4) — relaxed header canon (lowercase name, unfold continuations, collapse WSP+ to single SP, strip trailing WSP, strip WSP around colon) and relaxed body canon (per-line WSP cleanup, drop trailing empty lines, ensure non-empty output ends in exactly one CRLF).dns_record.rs(§3.6.2) — DKIM TXT record parser. Tag names case-insensitive (P=vsp=was a PoC bug), whitespace insidep=tolerated (multi-chunk DNS TXT records),t=y/t=sflags honoured, unknown tags ignored.signature.rs— RSA-SHA256 (RFC 5702 / RFC 8301) and Ed25519-SHA256 (RFC 8463) signature verification on top ofrsa+sha2+ed25519-dalekfrom PR 1's deps. Enforces 1024-bit RSA minimum per design §5.6. Ed25519 path wraps in SHA-256 per RFC 8463. Plusbody_hash_sha256with optionall=truncation per §3.4.5.verify.rs— orchestration. Multi-signature loop per §5.5 (accept on first pass), tag enforcement per design §5.4 (c=relaxed/* only, x= expiration, i= alignment with d=, k= match, t=y testing-mode), bottom-up header selection per §5.4 when h= lists a name multiple times, b=value blanking that's structural-position-aware so it doesn't mis-target an internal substring.test_vectors.rs—#[cfg(test)].eml loader + 8 end-to-end tests against committed fixtures.test_vectors/dkim/relaxed/relaxed,relaxed/simple,simple/simple).Test plan
cargo check -p internet_identity --target wasm32-unknown-unknown— clean.cargo test -p internet_identity --bin internet_identity dkim— 75 tests pass (parse 14, canonicalize 18, dns_record 16, signature 7, verify 12, end-to-end 8).cargo test -p internet_identity --bin internet_identity— 313 tests pass total (was 238 before this PR; +75 DKIM, plus a few in smtp types).cargo test -p internet_identity_interface --lib— 52 tests pass (was 42; +10 SMTP type tests).cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings— clean.cargo fmt --check— clean (modulo pre-existing diffs unrelated to this PR).Stack
This is PR 2 of a 12-PR series. Includes PR 3838's commits as its base; once PR 3838 merges, the diff shrinks to just the DKIM additions.
Subsequent PRs:
PR Stack