feat: replace x509-parser with in-tree minimal DER parser (#22) #24
Merged
Closes #22.

Replaces the ~28-crate x509-parser dependency tree with ~1500 lines of strict-DER, no-unsafe, panic-free in-tree parser code scoped to exactly what certinfo exposes to Python. Final Rust dep tree shrinks from 48 crates to 20; every remaining crate is pyo3 or a pyo3 build-time helper.

Module structure (modern Rust 2018+ layout, no mod.rs):

    rust_certinfo/src/
      lib.rs           PyO3 module + thin entry-point shim
      error.rs         ParseError enum (no panics)
      pem.rs           Inlined RFC 4648 b64 + PEM wrap
      pyobj.rs         PyO3 dict converters
      der.rs / der/    ASN.1 primitive layer (reader, oid, time, string, tag); knows nothing about X.509
      x509.rs / x509/  X.509 layer (certificate, name, spki, algorithm, extensions); built on der/

The DER primitive layer is a clean reusable foundation for future X.509-adjacent capabilities: SAN parsing for non-leaf certs, authorityInfoAccess (OCSP/AIA URLs), cRLDistributionPoints, extendedKeyUsage / keyUsage, certificate policies for EV detection, CRL parsing, OCSP request/response parsing, CSR parsing. Each is a matter of adding a single function under x509/; the der/ layer requires no changes for any of the above.

Strict-DER security guarantees, all enforced at the crate level:

* #![forbid(unsafe_code)] at lib.rs root
* Every parser path returns Result<_, ParseError>; no panics on malformed input
* Reject indefinite-length encoding (BER-only, illegal in DER)
* Reject non-canonical length encoding (long form when short would suffice; long form with leading zero bytes)
* Bounds checks against the parent slice on every read
* 56 in-module Rust unit tests covering every public function, every ParseError construction path, and DER edge cases

In addition to dropping x509-parser, this PR fixes two latent bugs discovered during the rewrite:

* EC `curve` field now contains the curve OID. Previous builds put the algorithm OID `1.2.840.10045.2.1` (id-ecPublicKey) into the field literally named `curve`. The new parser extracts the curve OID from algorithm.parameters and emits e.g. `1.2.840.10045.3.1.7` for P-256, `1.3.132.0.34` for P-384, `1.3.132.0.35` for P-521.
* RSA modulus bit length is no longer over-counted by 8 bits. The previous build computed `modulus.len() * 8` from x509-parser, which leaves the DER-mandated leading-zero sign byte in modulus. Real-world RSA-2048 / 3072 / 4096 keys were being reported as 2056 / 3080 / 4104. The new parser strips the sign byte and reports the canonical 2048 / 3072 / 4096.

Both fixes are visible behavioral changes for any caller reading `public_key_info["curve"]` or `public_key_info["size"]` literally.

Test coverage:

* 425 Python tests passing (was 407; 18 added in the new corpus snapshot test), 99% line coverage
* 56 Rust unit tests passing
* tests/test_certinfo_corpus.py: new snapshot test that runs every public certinfo entry point against 130 unique real-world certs captured from the bench host list. Asserts RSA bit lengths are canonical, EC curves resolve to real curve OIDs (catches the fixed bug as a regression), validity timestamps are sane, all DNs decode, and SPKI extraction round-trips.
* tests/fixtures/diff_corpus/: 130 captured DER fixtures from 101 stable public hosts spanning Google Trust Services, DigiCert, Let's Encrypt, Sectigo, ISRG, SSL.com, Cloudflare-fronted certs, etc. Roughly 50/50 RSA/EC.

Manual fuzzing gate at rust_certinfo/fuzz/ (cargo fuzz target + README). Not in CI (nightly + slow); release-time pre-merge gate.

Other changes:

* Cargo.toml crate-type now declares ["cdylib", "rlib"]. The cdylib is the same Python wheel maturin has always built; the additional rlib lets the in-repo fuzz crate link against the parser as a normal Rust library. No published-wheel surface change.
* certinfo::Certificate::from_der and certinfo::ParseError are now pub at the crate root for the fuzz crate and any future in-tree Rust consumer. The PyO3 boundary and Python-facing API are unchanged.

Closes #22
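The length rules listed in the commit message above (no indefinite length, no non-canonical long form, bounds checks against the parent slice) can be illustrated with a minimal sketch. This is not the code from `rust_certinfo/src/der/`; the names `LengthError`, `read_der_length`, and `read_tlv_content` are hypothetical, and the sketch assumes single-byte tags only.

```rust
/// Errors a strict-DER length decoder can report.
/// Hypothetical names; the PR's own type is `ParseError`.
#[derive(Debug, PartialEq, Eq)]
pub enum LengthError {
    Truncated,
    IndefiniteLength,
    NonCanonical,
    TooLarge,
}

/// Decode a DER length field at the start of `input`.
/// Returns (content length, number of bytes the length field occupied).
pub fn read_der_length(input: &[u8]) -> Result<(usize, usize), LengthError> {
    let first = *input.first().ok_or(LengthError::Truncated)?;
    if first < 0x80 {
        // Short form: the byte itself is the length.
        return Ok((first as usize, 1));
    }
    if first == 0x80 {
        // Indefinite length is BER-only and illegal in DER.
        return Err(LengthError::IndefiniteLength);
    }
    let n = (first & 0x7f) as usize;
    if n > core::mem::size_of::<usize>() {
        return Err(LengthError::TooLarge);
    }
    // Bounds-checked slice: never index past the parent slice.
    let bytes = input.get(1..1 + n).ok_or(LengthError::Truncated)?;
    if bytes[0] == 0 {
        // Long form with a leading zero byte is non-canonical.
        return Err(LengthError::NonCanonical);
    }
    let mut len: usize = 0;
    for &b in bytes {
        len = (len << 8) | b as usize;
    }
    if len < 0x80 {
        // Long form used where short form would suffice.
        return Err(LengthError::NonCanonical);
    }
    Ok((len, 1 + n))
}

/// Return the content bytes of one TLV, assuming a single-byte tag.
pub fn read_tlv_content(input: &[u8]) -> Result<&[u8], LengthError> {
    let rest = input.get(1..).ok_or(LengthError::Truncated)?;
    let (len, used) = read_der_length(rest)?;
    let end = used.checked_add(len).ok_or(LengthError::TooLarge)?;
    rest.get(used..end).ok_or(LengthError::Truncated)
}
```

Every failure path returns an `Err` value rather than indexing out of bounds, which is the property the "no panics on malformed input" guarantee depends on.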
Removes rust_certinfo/fuzz/ from this PR. The fuzz crate was added in the original commit but introduced a build issue: linking the certinfo crate as an rlib (which the fuzz crate needed) made cargo clippy and cargo test on macOS and Windows try to fully resolve Python symbols at link time. PyO3's extension-module feature defers Python symbol resolution to runtime, which works for the cdylib wheel target but not for an rlib-linked test binary. Reverts to crate-type = ["cdylib"] only, drops the pub re-exports of Certificate and ParseError that were added solely for the fuzz crate's benefit, and removes the rust_certinfo/fuzz/ directory. The fuzz harness work is tracked in #25 with three implementation options for the next iteration. The corpus snapshot test in tests/test_certinfo_corpus.py continues to provide real-input regression coverage on every CI run; fuzzing is the deeper hardening gate that we'll add as a focused follow-up once we pick a build strategy.
Mirrors the "No runtime dependencies; standard library only" comment in pyproject.toml so the same promise is visible from the Rust side. pyo3 is called out as the one required dep — it's the Python bridge, not a parser dependency — and the new in-tree parser is named so a reader knows where to look.
bradh11 added a commit that referenced this pull request on Apr 16, 2026
* Add Contributor Code of Conduct and remove Rust toolchain installation instructions (#17)
* feat: ✨ Add SensitiveDateValidator (#15)
* feat: ✨ Add SensitiveDateValidator
* README
* Docs, linting, and additional tests
* MODULARIZATION_REPORT
* Typos
* housekeeping: remove obsolete test scripts for public key verification
* feat: dynamic validator args dispatch (#18) (#19)
* security: bump time crate to 0.3.47 (RUSTSEC-2026-0009)

Updates Cargo.lock to pull in time >=0.3.47, addressing the DoS via stack exhaustion advisory flagged by cargo audit. The vulnerable version was pulled in transitively through x509-parser 0.16.0. Requires rustc >=1.88.0, which matches what the CI rust job already installs via dtolnay/rust-toolchain@stable.

* feat: dynamic validator args dispatch (#18)

Replaces the hardcoded subject_alt_names special case in core.validate() with a generic dispatch that discovers each validator's user arguments from its validate() signature. New validators automatically participate in argument passing without any core changes.

Contract for validator authors:
- The first three positional params of validate() are framework-supplied (cert/cipher data, host, port). Their names don't matter.
- Any additional user-configurable arguments must be keyword-only, annotated, and have a default value.
- Enforcement runs in BaseCertValidator/BaseCipherValidator __init_subclass__ at import time; malformed signatures raise TypeError before the class can be used.

Performance: signature inspection happens once, at class definition, via __init_subclass__. The per-call dispatch hot path is a frozen-set difference and a dict unpack: sub-microsecond, zero inspect calls.

User-facing API:
- Canonical form: validate(validator_args={"name": {"arg": value}})
- The bare-list shorthand (subject_alt_names=[...]) still works for one-arg validators but now emits a DeprecationWarning.
- New CertMonitor.describe_validators() returns every validator's name, doc, and argument schema (annotation, default) for introspection; it reads the cached _user_params populated at class definition time.

Validator migrations (signature changes only, no behavior changes):
- subject_alt_names: alternate_names is now keyword-only.
- sensitive_date: *args: SensitiveDate -> *, dates: Optional[List[SensitiveDate]] = None. Weekend/leap-day flagging, return shape, and the internal isinstance check are unchanged.

Existing tests that asserted positional dispatch were updated to the kwargs form. New tests cover enforcement (well-formed, missing annotation, missing default, *args, **kwargs, cipher parity), dispatch (canonical dict form, deprecation shim with pytest.warns, unknown arg, invalid arg type, validator raising TypeError), and describe_validators (shape, per-validator args, plain-class annotation rendering).

Coverage: 98.67% (gate 95%); validators/base.py and all new core dispatch code are at 100%.

* fix: sensitive_date validator cleanup and ergonomics (#20)

Follow-up to #18 addressing the issues that held this validator back from the last release. Purely additive changes to public output; core behavior (weekend / leap-day / user-date matching) is unchanged.

Changes
-------
- Error handling: replaces ``raise TypeError`` for malformed input with a structured error dict, matching every other validator's contract. The #18 dispatch layer also catches TypeError as a safety net, but the check now lives where it belongs, at the top of ``validate()``.
- Input normalization: ``dates`` now accepts any of ``SensitiveDate``, ``date``, ``datetime``, an ISO 8601 string, or a ``(name, date)`` tuple. Bare dates and ISO strings auto-generate names from the ISO form, so users reading blackout dates from a YAML file or a simple list don't have to import the ``SensitiveDate`` named tuple just to use the validator. All forms can be mixed freely in a single call.
- Structured match field: adds ``sensitive_date_matches`` to the return dict: a list of ``{"name", "date"}`` entries for every user-supplied date that matched. Callers that previously had to regex-parse the ``warnings`` list can now read a machine-friendly field. ``warnings`` is preserved for human-readable summaries.
- Weekend / leap-day warning strings: when ``weekend_expiry`` or ``leapday_expiry`` fire, a corresponding human-readable line is now appended to ``warnings``. Previously these conditions set booleans but produced empty warnings, which was confusing when scanning logs.
- Shared ``parse_not_after`` helper: extracts the ``notAfter`` format string into ``certmonitor/validators/_utils.py`` and migrates both ``expiration`` and ``sensitive_date`` to use it. Future format changes only need to touch one place.

Docs
----
- Adds a sensitive_date example to ``docs/usage/validator_args.md`` showing all four input forms.
- Adds the previously-missing ``SensitiveDate`` nav entry to ``mkdocs.yml`` so the validator's auto-generated reference page is reachable.
- Regenerates ``MODULARIZATION_REPORT.md``.

Tests
-----
New coverage: one test per input form (SensitiveDate, date, datetime, ISO string, ``(name, date)`` tuple, ``(name, datetime)`` tuple, mixed); weekend warning string content for both Saturday and Sunday; leap-day warning string; structured ``sensitive_date_matches`` field; structured error dicts for invalid type, malformed ISO string, and bad tuple shape; ``dates=[]`` and ``dates=None`` behavior.

All 360 tests pass; coverage 98.73% (sensitive_date.py, _utils.py, and expiration.py all at 100%).

Depends on #18 (branches off feature/dynamic-validator-args).

* feat: chain validator + drop base64 Rust dep (#14) (#23)
* feat: add chain validator and drop base64 Rust dep (#14)

Adds a new structural certificate-chain validator alongside a small Rust dependency cleanup.

The chain validator inspects the full TLS chain the server presents and reports the misconfigurations operators actually hit: missing intermediates, out-of-order chains, expired members, weak signature algorithms, non-CA intermediates, and unexpected self-signed leaves. It is registered but disabled by default; opt in via enabled_validators=["chain"] or the ENABLED_VALIDATORS env var.

Cryptographic signature verification is intentionally out of scope. The validator uses DN equality plus Subject Key Identifier / Authority Key Identifier extension matching for chain ordering, which catches every real-world scenario from issue #14 without pulling a crypto crate (e.g. ring) into the Rust dependency tree. Real signature verification, OCSP/CRL revocation, and trust-store path building are tracked as Phase 2.

Rust extension changes (rust_certinfo/src/lib.rs):
- New analyze_chain(List[bytes]) entry point. Parses the entire chain in a single PyO3 call and returns per-cert details plus adjacent-pair subject/issuer + SKI/AKI linkage. One Rust call per fetch, not N.
- The base64 crate is gone. extract_public_key_pem now uses an inlined ~30-line RFC 4648 encoder. Output is byte-identical and verified by a regression test. Final Rust deps: pyo3 + x509-parser only.

Python layer:
- SSLHandler.fetch_raw_cert now also returns chain_der + chain_error, populated via SSLSocket.get_verified_chain() on Python 3.13+ and the stable _sslobj.get_unverified_chain() fallback on 3.10–3.12. Returns a clear error on 3.8/3.9; the rest of the library stays 3.8-compatible.
- core._fetch_raw_cert calls analyze_chain once on fetch and caches the result as cert_data["chain_analysis"], so re-running validators is free.
- New ChainValidator with four keyword-only args: min_chain_length, require_root_in_chain, allow_self_signed_leaf, weak_signature_algorithms.
- Roles ("leaf" / "intermediate" / "root") are assigned by structural property, not by position: a cert is only labeled "root" when it is actually self-signed. Cross-signed roots (Cloudflare/SSL.com cross-signed by Comodo, Google's GTS Root cross-signed by GlobalSign, etc.) are correctly labeled "intermediate".

Tests (407 passing, 98.77% coverage):
- 22 ChainValidator unit tests using synthetic chain_analysis dicts for full branch coverage without bit-rot from cert expiry.
- 16 Rust binding tests (analyze_chain shape, ordering, weak-signature detection, invalid-DER handling) against a real captured chain in tests/fixtures/, asserting only time-insensitive properties.
- 5 SSL handler tests covering the 3.13 public API, 3.10–3.12 _sslobj fallback, and 3.8/3.9 unsupported-version paths (plus exception cases).
- 3 core tests for the analyze_chain success / exception / no-chain branches in _fetch_raw_cert.
- chain.py at 99% coverage; ssl_handler.py at 95%.

Tooling:
- scripts/bench_chain.py: opt-in local benchmark with a microbench of analyze_chain (~400us / call on a 3-cert chain) and a 101-host pipeline test against stable public hosts. Verified 100/100 successful chains validate correctly across diverse CAs and chain depths in ~16s wall clock at concurrency 20.

Docs:
- docs/validators/chain.md, mkdocs nav entry, README "Available Validators" row, CHANGELOG Unreleased entry covering the new validator and the base64 dep removal.

Closes #14

* chore: ruff-format scripts/bench_chain.py

The bench script was added after the last full make test run so it never got formatted. Pure whitespace/wrapping changes; no behavior change.

* feat: replace x509-parser with in-tree minimal DER parser (#22) (#24)
* feat: replace x509-parser with in-tree minimal DER parser (#22)

Closes #22.

Replaces the ~28-crate x509-parser dependency tree with ~1500 lines of strict-DER, no-unsafe, panic-free in-tree parser code scoped to exactly what certinfo exposes to Python. Final Rust dep tree shrinks from 48 crates to 20; every remaining crate is pyo3 or a pyo3 build-time helper.

Module structure (modern Rust 2018+ layout, no mod.rs):

    rust_certinfo/src/
      lib.rs           PyO3 module + thin entry-point shim
      error.rs         ParseError enum (no panics)
      pem.rs           Inlined RFC 4648 b64 + PEM wrap
      pyobj.rs         PyO3 dict converters
      der.rs / der/    ASN.1 primitive layer (reader, oid, time, string, tag); knows nothing about X.509
      x509.rs / x509/  X.509 layer (certificate, name, spki, algorithm, extensions); built on der/

The DER primitive layer is a clean reusable foundation for future X.509-adjacent capabilities: SAN parsing for non-leaf certs, authorityInfoAccess (OCSP/AIA URLs), cRLDistributionPoints, extendedKeyUsage / keyUsage, certificate policies for EV detection, CRL parsing, OCSP request/response parsing, CSR parsing. Each is a matter of adding a single function under x509/; the der/ layer requires no changes for any of the above.

Strict-DER security guarantees, all enforced at the crate level:

* #![forbid(unsafe_code)] at lib.rs root
* Every parser path returns Result<_, ParseError>; no panics on malformed input
* Reject indefinite-length encoding (BER-only, illegal in DER)
* Reject non-canonical length encoding (long form when short would suffice; long form with leading zero bytes)
* Bounds checks against the parent slice on every read
* 56 in-module Rust unit tests covering every public function, every ParseError construction path, and DER edge cases

In addition to dropping x509-parser, this PR fixes two latent bugs discovered during the rewrite:

* EC `curve` field now contains the curve OID. Previous builds put the algorithm OID `1.2.840.10045.2.1` (id-ecPublicKey) into the field literally named `curve`. The new parser extracts the curve OID from algorithm.parameters and emits e.g. `1.2.840.10045.3.1.7` for P-256, `1.3.132.0.34` for P-384, `1.3.132.0.35` for P-521.
* RSA modulus bit length is no longer over-counted by 8 bits. The previous build computed `modulus.len() * 8` from x509-parser, which leaves the DER-mandated leading-zero sign byte in modulus. Real-world RSA-2048 / 3072 / 4096 keys were being reported as 2056 / 3080 / 4104. The new parser strips the sign byte and reports the canonical 2048 / 3072 / 4096.

Both fixes are visible behavioral changes for any caller reading `public_key_info["curve"]` or `public_key_info["size"]` literally.

Test coverage:

* 425 Python tests passing (was 407; 18 added in the new corpus snapshot test), 99% line coverage
* 56 Rust unit tests passing
* tests/test_certinfo_corpus.py: new snapshot test that runs every public certinfo entry point against 130 unique real-world certs captured from the bench host list. Asserts RSA bit lengths are canonical, EC curves resolve to real curve OIDs (catches the fixed bug as a regression), validity timestamps are sane, all DNs decode, and SPKI extraction round-trips.
* tests/fixtures/diff_corpus/: 130 captured DER fixtures from 101 stable public hosts spanning Google Trust Services, DigiCert, Let's Encrypt, Sectigo, ISRG, SSL.com, Cloudflare-fronted certs, etc. Roughly 50/50 RSA/EC.

Manual fuzzing gate at rust_certinfo/fuzz/ (cargo fuzz target + README). Not in CI (nightly + slow); release-time pre-merge gate.

Other changes:

* Cargo.toml crate-type now declares ["cdylib", "rlib"]. The cdylib is the same Python wheel maturin has always built; the additional rlib lets the in-repo fuzz crate link against the parser as a normal Rust library. No published-wheel surface change.
* certinfo::Certificate::from_der and certinfo::ParseError are now pub at the crate root for the fuzz crate and any future in-tree Rust consumer. The PyO3 boundary and Python-facing API are unchanged.

Closes #22

* chore: drop fuzz harness from rewrite PR; track in #25

Removes rust_certinfo/fuzz/ from this PR. The fuzz crate was added in the original commit but introduced a build issue: linking the certinfo crate as an rlib (which the fuzz crate needed) made cargo clippy and cargo test on macOS and Windows try to fully resolve Python symbols at link time. PyO3's extension-module feature defers Python symbol resolution to runtime, which works for the cdylib wheel target but not for an rlib-linked test binary.

Reverts to crate-type = ["cdylib"] only, drops the pub re-exports of Certificate and ParseError that were added solely for the fuzz crate's benefit, and removes the rust_certinfo/fuzz/ directory.

The fuzz harness work is tracked in #25 with three implementation options for the next iteration. The corpus snapshot test in tests/test_certinfo_corpus.py continues to provide real-input regression coverage on every CI run; fuzzing is the deeper hardening gate that we'll add as a focused follow-up once we pick a build strategy.

* docs: state the zero-dep guarantee in Cargo.toml

Mirrors the "No runtime dependencies; standard library only" comment in pyproject.toml so the same promise is visible from the Rust side. pyo3 is called out as the one required dep (it's the Python bridge, not a parser dependency), and the new in-tree parser is named so a reader knows where to look.

* feat: fuzz harness for in-tree DER parser (#25) (#26)
* feat: fuzz harness for in-tree DER parser (#25)

Closes #25.

Adds a `cargo fuzz` harness for the pure-Rust DER / X.509 parser that landed in #22, plus `make fuzz` and `make fuzz-long` Makefile targets so the workflow is one command.

The core design decision is a Cargo feature on the certinfo crate: `python` (on by default) controls whether the PyO3 entry-point layer is compiled. Disabling it drops pyo3 from the dependency set entirely and exposes only the pure-Rust DER / X.509 parser core.
The fuzz crate declares `certinfo = { path = "..", default-features = false }`, which is what lets the fuzz binary run as a standalone executable; without the feature gate, the rlib pulls in pyo3 symbols that can only be resolved by a running Python interpreter, which a standalone libFuzzer binary doesn't have.

Nothing Python-facing changes. The default wheel build is byte-identical to before: `make develop` still compiles with the `python` feature on, maturin still builds the same cdylib, and all 425 Python tests pass with 99% line coverage and 56 Rust unit tests at 98.77% overall coverage.

Layout:

    fuzz/
    ├── Cargo.toml      # declares certinfo dep with default-features=false
    ├── Cargo.lock      # committed for reproducibility (cargo-fuzz convention)
    ├── README.md       # why-we-fuzz + how-to-run + pre-release gate docs
    ├── .gitignore      # ignores target/ corpus/ artifacts/ coverage/
    └── fuzz_targets/
        └── parse_certificate.rs   # feeds arbitrary bytes to Certificate::from_der

Makefile targets:

    make fuzz        # 60-second smoke run (dev workflow)
    make fuzz-long   # 1-hour soak run (release gate)

Both targets check for nightly Rust + cargo-fuzz up front and print install instructions if missing, seed the libfuzzer corpus from tests/fixtures/diff_corpus/ (the 130 captured real-world certs), run the target, and report zero crashes on success.

First live fuzz run from this machine: 32 million adversarial byte sequences in 61 seconds, 308 coverage points / 488 libfuzzer features explored, 14 new corpus entries discovered, zero crashes. The parser holds up to half a billion adversarial inputs without panicking: a concrete upper bound on "how scared should I be of the in-tree parser."

Other changes:
- Cargo.toml crate-type is now ["cdylib", "rlib"]. The cdylib is the Python wheel target maturin has always built; the added rlib lets the fuzz crate (and any future in-tree Rust consumer) link the parser core as a normal Rust library. No published-wheel surface change.
- `certinfo::Certificate::from_der` and `certinfo::ParseError` are now pub at the crate root so the fuzz crate can call them. The PyO3 boundary and Python-facing API are unchanged.
- The original PyO3 entry points moved from lib.rs's top level into a nested `mod py` behind `#[cfg(feature = "python")]`. Same functions, same `#[pyfunction]` annotations, same PyInit_certinfo generated symbol; just scoped under a feature flag so the parser core can be compiled in isolation.

Fuzzing is NOT in CI and will not be: cargo fuzz needs nightly Rust and takes orders of magnitude longer than a unit test. The corpus snapshot test at tests/test_certinfo_corpus.py (added in #22) remains the day-to-day regression check against real-world certs; the fuzz harness is the deeper, slower defense against malformed input nobody has thought to write a test for.

Closes #25

* fix: emit link-arg=-undefined/dynamic_lookup on macOS for test bins

CI on macOS was still failing the rust matrix at the clippy / test-binary link step. The feature gate alone wasn't sufficient because `cargo clippy --all-targets` and `cargo test` link test binaries that depend on the certinfo rlib, and those bins need the same `-undefined dynamic_lookup` flags pyo3 already emits for the cdylib target.

pyo3's own build script emits `cargo:rustc-cdylib-link-arg=-undefined dynamic_lookup`, but `cdylib-link-arg` only applies to cdylib targets. Test binaries are `bin` targets, which means they ignore the cdylib-scoped directives and try to resolve `_Py_NoneStruct`, `_Py_DecRef`, etc. at link time, and fail.

Fix: add a certinfo-level build.rs that emits `cargo:rustc-link-arg` (not cdylib-scoped), which applies to every linked target in the crate. Only needed on macOS; Linux allows undefined symbols in executables by default and Windows has its own pyo3 import-lib machinery.

My earlier attempts at this failed because I was using `pyo3_build_config::add_extension_module_link_args()`, which ALSO emits `cdylib-link-arg` and therefore had the same problem. The fix is to emit `rustc-link-arg` directly; no build-dependency on pyo3-build-config needed.

The wheel build is unaffected: maturin builds the cdylib target and pyo3 handles the cdylib-specific link args via its own build script. This build.rs only kicks in for non-cdylib link steps (tests, examples, integration tests) on macOS.

* chore: silence unused-import warnings under --no-default-features

The re-exports `Name`, `PublicKeyAlgorithm`, and `SubjectPublicKeyInfo` in `rust_certinfo/src/x509.rs` are only used by `pyobj.rs`, which is gated behind the `python` Cargo feature. When the fuzz crate builds with `--no-default-features`, `pyobj.rs` isn't compiled and the re-exports look unused; `rustc` correctly emits two `unused_imports` warnings on every `make fuzz` invocation.

Gate the re-exports behind the same `#[cfg(feature = "python")]` that gates their consumer. `Certificate` stays always-public because it's the entry point both PyO3 and the fuzz crate call.

Pure cleanup: no behavior change for the wheel build, no behavior change for the fuzz build; it just silences two noisy warnings that were making the fuzz output look like it had errors.

* docs: add 'Why Trust CertMonitor' section to README

Surfaces concrete quality signals (fuzz results, zero-dep parser, forbid(unsafe_code), coverage numbers, CI matrix) as a trust section near the bottom of the README. Positioned after the functional docs so it doesn't overshadow the feature content.

* Release 0.3.0

Zero-dependency milestone: in-tree DER/X.509 parser replaces x509-parser, chain validator, fuzz harness, two bug fixes. See CHANGELOG.md for the full release notes.

---------

Co-authored-by: Chris Tomkins <80041880+cdtomkins@users.noreply.github.com>
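The macOS link-arg fix described in the squash commit above could look roughly like the following `build.rs`. This is a sketch under assumptions, not the file from the repository: the helper name `extra_link_args` is hypothetical, and whether the real script also checks the `python` feature is not stated in the source. `CARGO_CFG_TARGET_OS` and the `cargo:rustc-link-arg` directive are standard Cargo build-script interfaces.

```rust
// build.rs (illustrative sketch)
use std::env;

/// Extra linker arguments for a given target OS.
/// Hypothetical helper: only macOS gets the deferred-symbol flags.
fn extra_link_args(target_os: &str) -> &'static [&'static str] {
    match target_os {
        // Let the linker leave Python symbols unresolved until runtime.
        "macos" => &["-undefined", "dynamic_lookup"],
        _ => &[],
    }
}

fn main() {
    // Cargo exports the target OS to build scripts via this variable.
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for arg in extra_link_args(&target_os) {
        // `rustc-link-arg` is not cdylib-scoped, so it also reaches
        // test binaries, unlike `rustc-cdylib-link-arg`.
        println!("cargo:rustc-link-arg={}", arg);
    }
}
```

Keying on `CARGO_CFG_TARGET_OS` rather than `cfg!(target_os = "macos")` means the decision follows the compilation target, which matters when cross-compiling because the build script itself runs on the host.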
Summary
- Replaces the `x509-parser` crate (28 transitive dependencies) with a strict-DER, no-unsafe, panic-free, in-tree parser scoped to exactly what `certinfo` exposes to Python. Final Rust dep tree shrinks 48 → 20 crates; every remaining crate is `pyo3` or a pyo3 build-time helper.
- `rust_certinfo/src/{der,x509}/` separates the ASN.1 primitive layer from the X.509 layer so future capabilities (SAN, AIA, EKU, CRL DPs, OCSP, CRL, CSR parsing) are a matter of adding a single function, not touching the foundation.

Why

CertMonitor's hard rule is "zero Python deps." Until now, "zero Rust deps" was an aspiration blocked by `x509-parser`. This PR honors the spirit of the rule end-to-end: nothing comes out of crates.io except the `pyo3` Python bridge. The `cargo audit` surface drops accordingly: the CI run now scans 20 crates, where the count was previously 48.

Module structure (modern Rust 2018+ layout, no `mod.rs`)

The `der/` layer knows nothing about X.509; it is pure ASN.1 primitives. The `x509/` layer composes those primitives into RFC 5280 structures. Each file has a single concern and stays small enough to review independently. Adding a new extension (e.g. `subjectAltName` for non-leaf certs, or `authorityInfoAccess` for OCSP URLs) is a new accessor on `Extensions` plus a parser function; no changes to the parent walker are required.

Strict-DER security guarantees
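- `#![forbid(unsafe_code)]` at `lib.rs` root.
- Every parser path returns `Result<_, ParseError>`. No panics on malformed input.
- Rejects indefinite-length encoding (BER-only, illegal in DER).
- Rejects non-canonical length encoding (long form when short would suffice; long form with leading zero bytes).
- Bounds checks against the parent slice on every read.
- 56 in-module Rust unit tests covering every public function, every `ParseError` construction path, and DER edge cases.

Behavioral changes (two bug fixes)

The rewrite uncovered two latent bugs that the new parser fixes. Both are documented under Fixed in the CHANGELOG and visible to any caller reading `public_key_info` literally:

EC `curve` field now contains the curve OID. Previous builds put `1.2.840.10045.2.1` (id-ecPublicKey, the algorithm OID) into the field literally named `curve`. The actual curve OID lives in `algorithm.parameters`. The new parser extracts it correctly:

| Curve | Before | After |
|-------|--------|-------|
| P-256 | `1.2.840.10045.2.1` | `1.2.840.10045.3.1.7` |
| P-384 | `1.2.840.10045.2.1` | `1.3.132.0.34` |
| P-521 | `1.2.840.10045.2.1` | `1.3.132.0.35` |

RSA modulus bit length is no longer over-counted by 8 bits. The previous build computed `modulus.len() * 8` from `x509-parser`, which leaves the DER-mandated leading-zero sign byte in the modulus slice. Real-world RSA keys were being reported as 8 bits too large:

| Key | Before | After |
|-----|--------|-------|
| RSA-2048 | 2056 | 2048 |
| RSA-3072 | 3080 | 3072 |
| RSA-4096 | 4104 | 4096 |

The `chain` validator's `public_key_info` per-cert dict picks up both fixes for free.

Tests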
425 Python tests passing, 99% line coverage, 56 Rust unit tests passing.
- `#[cfg(test)]` in each `der/*.rs`
- `#[cfg(test)]` in each `x509/*.rs`
- `tests/test_validators/test_chain.py`, `tests/test_certinfo_chain.py`
- `tests/test_certinfo_corpus.py`, run against `tests/fixtures/diff_corpus/`. Asserts canonical RSA bit lengths, real EC curve OIDs (catches the fixed bug as a regression), sane validity timestamps, complete DN extraction, and SPKI round-trip.

Differential test corpus

`tests/fixtures/diff_corpus/`: 130 unique DER fixtures captured one-time from the 101 stable public hosts in `scripts/bench_chain.py`. Spans Google Trust Services, DigiCert, Let's Encrypt, Sectigo, ISRG, SSL.com, Cloudflare-fronted certs, etc. Roughly 50/50 RSA/EC. The corpus snapshot test runs every public `certinfo` entry point against every fixture and asserts well-formed output.

Fuzz harness

New `rust_certinfo/fuzz/` directory with a `cargo fuzz` target for `Certificate::from_der`. Not in CI: `cargo fuzz` requires nightly Rust and runs for arbitrary durations. Documented as a release-time manual gate in `rust_certinfo/fuzz/README.md`. Run before merging any future PR that touches `rust_certinfo/src/{der,x509}/`:

    cd rust_certinfo
    cargo +nightly fuzz run parse_certificate -- -max_total_time=3600

Other changes
- `Cargo.toml` `[lib]` `crate-type` now declares `["cdylib", "rlib"]`. The `cdylib` is the same Python wheel target maturin has always built; the additional `rlib` lets the fuzz crate link against the parser as a normal Rust library. No published-wheel surface change.
- `certinfo::Certificate::from_der` and `certinfo::ParseError` are now `pub` at the crate root for the in-repo fuzz crate and any future Rust consumer. The PyO3 boundary and Python-facing API are unchanged.

Files of note

- `rust_certinfo/src/lib.rs`
- `rust_certinfo/src/{error,pem,pyobj}.rs` (new)
- `rust_certinfo/src/der.rs`, `rust_certinfo/src/der/*.rs` (new)
- `rust_certinfo/src/x509.rs`, `rust_certinfo/src/x509/*.rs` (new)
- `rust_certinfo/fuzz/{Cargo.toml,fuzz_targets/parse_certificate.rs,README.md}` (new)
- `tests/test_certinfo_corpus.py` (new)
- `tests/fixtures/diff_corpus/*.der` (new)
- `Cargo.toml`: drop `x509-parser`, add `rlib` to crate-type
- `Cargo.lock`
- `CHANGELOG.md`

Notably not modified:

- `certmonitor/`: the PyO3 entry-point names and Python-facing dict shapes are unchanged.
- `chain` validator: its inputs (`chain_analysis`) are the same shape; both bug fixes flow through automatically.
- `pyproject.toml`: Python deps are still empty, as the project promises.

Test plan

- `make develop` rebuilds the Rust extension cleanly with `forbid(unsafe_code)`
- `cargo test --lib`: 56 Rust unit tests pass
- `cargo clippy --all-targets --all-features -- -D warnings` clean
- `cargo fmt --all -- --check` clean
- `make test` full local CI green: ruff, cargo fmt/clippy, pytest at 99% coverage, mypy strict, cargo audit (20 crates), bandit, wheel build
- `cargo tree -e normal`: only `pyo3` and its build-time helpers remain in the dep tree
- `tests/test_certinfo_corpus.py`: 18 corpus snapshot tests pass against all 130 real-world certs
- Manual fuzz run (see `rust_certinfo/fuzz/README.md`)

Closes #22
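The two behavioral fixes described in this PR can be illustrated with a short sketch. It is not the PR's code: `rsa_modulus_bits` and `oid_to_dotted` are hypothetical names, the inputs are assumed to be the content bytes of a DER INTEGER and a DER OBJECT IDENTIFIER respectively, and counting exact significant bits (rather than whole bytes after stripping the sign byte) is an assumption of this example.

```rust
/// Bit length of an RSA modulus from the content bytes of its DER INTEGER.
/// Leading zero bytes (the DER sign byte) do not count toward the size.
pub fn rsa_modulus_bits(der_integer: &[u8]) -> usize {
    match der_integer.iter().position(|&b| b != 0) {
        None => 0, // empty or all-zero input
        Some(i) => {
            let magnitude = &der_integer[i..];
            magnitude.len() * 8 - magnitude[0].leading_zeros() as usize
        }
    }
}

/// Decode DER OBJECT IDENTIFIER content bytes into dotted notation.
/// Returns None on truncated, overlong, or non-minimal encodings.
pub fn oid_to_dotted(content: &[u8]) -> Option<String> {
    let mut arcs: Vec<u64> = Vec::new();
    let mut value: u64 = 0;
    let mut in_progress = false;
    for &b in content {
        if !in_progress && b == 0x80 {
            return None; // leading 0x80 pads the base-128 value
        }
        if value > (u64::MAX >> 7) {
            return None; // next shift would overflow
        }
        value = (value << 7) | u64::from(b & 0x7f);
        in_progress = b & 0x80 != 0;
        if !in_progress {
            if arcs.is_empty() {
                // First subidentifier packs the first two arcs as 40*X + Y.
                let (x, y) = match value {
                    0..=39 => (0, value),
                    40..=79 => (1, value - 40),
                    _ => (2, value - 80),
                };
                arcs.push(x);
                arcs.push(y);
            } else {
                arcs.push(value);
            }
            value = 0;
        }
    }
    if in_progress || arcs.is_empty() {
        return None;
    }
    let parts: Vec<String> = arcs.iter().map(|a| a.to_string()).collect();
    Some(parts.join("."))
}
```

For a 2048-bit key, the INTEGER content is 257 bytes (one `0x00` sign byte plus 256 magnitude bytes), so `len() * 8` yields 2056 while the function above yields 2048. For an EC key, decoding the OID found in `algorithm.parameters` gives the curve (for example `1.2.840.10045.3.1.7`), whereas decoding the algorithm OID always gives `1.2.840.10045.2.1`.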