feat: fuzz harness for in-tree DER parser (#25) #26
Merged
Closes #25.

Adds a `cargo fuzz` harness for the pure-Rust DER / X.509 parser that landed in #22, plus `make fuzz` and `make fuzz-long` Makefile targets so the workflow is one command.

The core design decision is a Cargo feature on the certinfo crate: `python` (on by default) controls whether the PyO3 entry-point layer is compiled. Disabling it drops pyo3 from the dependency set entirely and exposes only the pure-Rust DER / X.509 parser core. The fuzz crate declares `certinfo = { path = "..", default-features = false }`, which is what lets the fuzz binary run as a standalone executable: without the feature gate, the rlib pulls in pyo3 symbols that can only be resolved by a running Python interpreter, which a standalone libFuzzer binary doesn't have.

Nothing Python-facing changes. The default wheel build is byte-identical to before: `make develop` still compiles with the `python` feature on, maturin still builds the same cdylib, and all 425 Python tests (99% line coverage) and 56 Rust unit tests pass, at 98.77% overall coverage.

Layout:

  fuzz/
  ├── Cargo.toml                 # declares certinfo dep with default-features=false
  ├── Cargo.lock                 # committed for reproducibility (cargo-fuzz convention)
  ├── README.md                  # why-we-fuzz + how-to-run + pre-release gate docs
  ├── .gitignore                 # ignores target/ corpus/ artifacts/ coverage/
  └── fuzz_targets/
      └── parse_certificate.rs   # feeds arbitrary bytes to Certificate::from_der

Makefile targets:

  make fuzz        # 60-second smoke run (dev workflow)
  make fuzz-long   # 1-hour soak run (release gate)

Both targets check for nightly Rust + cargo-fuzz up front and print install instructions if missing, seed the libfuzzer corpus from tests/fixtures/diff_corpus/ (the 130 captured real-world certs), run the target, and report zero crashes on success.

First live fuzz run from this machine: 32 million adversarial byte sequences in 61 seconds, 308 coverage points / 488 libfuzzer features explored, 14 new corpus entries discovered, zero crashes. The parser holds up against roughly half a million adversarial inputs per second without panicking, a concrete upper bound on "how scared should I be of the in-tree parser."

Other changes:

- Cargo.toml crate-type is now ["cdylib", "rlib"]. The cdylib is the Python wheel target maturin has always built; the added rlib lets the fuzz crate (and any future in-tree Rust consumer) link the parser core as a normal Rust library. No published-wheel surface change.
- `certinfo::Certificate::from_der` and `certinfo::ParseError` are now pub at the crate root so the fuzz crate can call them. The PyO3 boundary and Python-facing API are unchanged.
- The original PyO3 entry points moved from lib.rs's top level into a nested `mod py` behind `#[cfg(feature = "python")]`. Same functions, same `#[pyfunction]` annotations, same PyInit_certinfo generated symbol, just scoped under a feature flag so the parser core can be compiled in isolation.

Fuzzing is NOT in CI and will not be: cargo fuzz needs nightly Rust and takes orders of magnitude longer than a unit test. The corpus snapshot test at tests/test_certinfo_corpus.py (added in #22) remains the day-to-day regression check against real-world certs; the fuzz harness is the deeper, slower defense against malformed input nobody has thought to write a test for.

Closes #25
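The shape of the feature gate and the fuzz target described in this commit message can be sketched roughly as follows. This is a minimal, std-only illustration, not the crate's actual code: only the names `Certificate::from_der`, `ParseError`, `mod py`, and the `python` feature come from the commit message. The parser body, the `ParseError` variants, `fuzz_one`, and `smoke_run` are hypothetical stand-ins, and the real target would presumably wrap the call in `libfuzzer_sys::fuzz_target!` rather than use a hand-rolled driver.

```rust
/// Hypothetical error type; the real `ParseError` variants are not shown in the PR.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Truncated,
    UnexpectedTag(u8),
    LengthMismatch,
}

/// Hypothetical stand-in for the parsed certificate.
#[derive(Debug, PartialEq)]
pub struct Certificate {
    pub body_len: usize,
}

impl Certificate {
    /// Placeholder parser: accepts only an outer SEQUENCE (tag 0x30) with a
    /// short-form length that exactly covers the remaining bytes.
    pub fn from_der(der: &[u8]) -> Result<Certificate, ParseError> {
        // `first` / `get` return None on short input instead of panicking.
        let tag = *der.first().ok_or(ParseError::Truncated)?;
        if tag != 0x30 {
            return Err(ParseError::UnexpectedTag(tag));
        }
        let len = *der.get(1).ok_or(ParseError::Truncated)? as usize;
        // Two header bytes are known to exist here, so the subtraction cannot underflow.
        if len >= 0x80 || der.len() - 2 != len {
            return Err(ParseError::LengthMismatch);
        }
        Ok(Certificate { body_len: len })
    }
}

// Compiled only when the `python` feature is on. In the real crate this is
// where the #[pyfunction] / #[pymodule] items would live.
#[cfg(feature = "python")]
mod py {}

/// Body of the fuzz target: parse and discard the result. Any panic is a finding.
/// With cargo-fuzz this call would sit inside `fuzz_target!(|data: &[u8]| { ... })`.
pub fn fuzz_one(data: &[u8]) {
    let _ = Certificate::from_der(data);
}

/// Std-only stand-in for libFuzzer: feeds xorshift-generated inputs to `fuzz_one`
/// and returns how many inputs were tried.
pub fn smoke_run(iterations: u32) -> u32 {
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    let mut buf: Vec<u8> = Vec::new();
    for _ in 0..iterations {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        buf.clear();
        let len = (state % 16) as usize;
        for i in 0..len {
            buf.push((state >> ((i % 8) * 8)) as u8);
        }
        fuzz_one(&buf);
    }
    iterations
}
```

Because `fuzz_one` touches nothing behind the `python` gate, a binary built from it needs no Python symbols at link time, which is the property the feature flag is meant to provide.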
CI on macOS was still failing the rust matrix at the clippy / test-binary link step. The feature gate alone wasn't sufficient because `cargo clippy --all-targets` and `cargo test` link test binaries that depend on the certinfo rlib, and those bins need the same `-undefined dynamic_lookup` flags pyo3 already emits for the cdylib target.

pyo3's own build script emits `cargo:rustc-cdylib-link-arg=-undefined dynamic_lookup`, but `cdylib-link-arg` only applies to cdylib targets. Test binaries are `bin` targets, which means they ignore the cdylib-scoped directives, try to resolve `_Py_NoneStruct`, `_Py_DecRef`, etc. at link time, and fail.

Fix: add a certinfo-level build.rs that emits `cargo:rustc-link-arg` (not cdylib-scoped), which applies to every linked target in the crate. Only needed on macOS; Linux allows undefined symbols in executables by default and Windows has its own pyo3 import-lib machinery.

My earlier attempts at this failed because I was using `pyo3_build_config::add_extension_module_link_args()`, which ALSO emits `cdylib-link-arg` and therefore had the same problem. The fix is to emit `rustc-link-arg` directly, with no build-dependency on pyo3-build-config needed.

The wheel build is unaffected: maturin builds the cdylib target and pyo3 handles the cdylib-specific link args via its own build script. This build.rs only kicks in for non-cdylib link steps (tests, examples, integration tests) on macOS.
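A build script along the lines described above might look like the sketch below. The actual build.rs is not shown in this PR, so the function names `link_directives` and `run_build_script` are hypothetical, and splitting `-undefined dynamic_lookup` into two `cargo:rustc-link-arg` lines is an assumption about how the two linker arguments are passed. `CARGO_CFG_TARGET_OS` is the environment variable Cargo sets for build scripts to describe the target platform.

```rust
/// Returns the Cargo directives to print for a given target OS.
/// Only macOS gets the extra linker arguments; other platforms get none.
pub fn link_directives(target_os: &str) -> Vec<String> {
    if target_os != "macos" {
        return Vec::new();
    }
    vec![
        // `rustc-link-arg` (unlike `rustc-cdylib-link-arg`) applies to every
        // linked target, including test binaries.
        "cargo:rustc-link-arg=-undefined".to_string(),
        "cargo:rustc-link-arg=dynamic_lookup".to_string(),
    ]
}

/// What `fn main()` in build.rs would do: read the target OS that Cargo
/// exposes to build scripts and print the matching directives.
pub fn run_build_script() {
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for line in link_directives(&target_os) {
        println!("{line}");
    }
}
```

Keeping the decision in a pure function makes the platform check easy to exercise without running Cargo; in an actual build.rs, `main` would simply call `run_build_script()`.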
The re-exports `Name`, `PublicKeyAlgorithm`, and `SubjectPublicKeyInfo` in `rust_certinfo/src/x509.rs` are only used by `pyobj.rs`, which is gated behind the `python` Cargo feature. When the fuzz crate builds with `--no-default-features`, `pyobj.rs` isn't compiled and the re-exports look unused — `rustc` correctly emits two `unused_imports` warnings on every `make fuzz` invocation. Gate the re-exports behind the same `#[cfg(feature = "python")]` that gates their consumer. `Certificate` stays always-public because it's the entry point both PyO3 and the fuzz crate call. Pure cleanup: no behavior change for the wheel build, no behavior change for the fuzz build — just silences two noisy warnings that were making the fuzz output look like it had errors.
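The gating pattern can be illustrated with a miniature module layout. This is a sketch under assumptions: only the item names `Certificate` and `Name` and the `python` feature come from the commit message, while the module names, fields, and the helper `python_layer_enabled` are hypothetical. When the snippet is compiled without the feature, the gated `pub use` simply does not exist, so there is nothing for the `unused_imports` lint to flag.

```rust
pub mod x509 {
    mod name {
        /// Hypothetical stand-in for the distinguished-name type.
        #[derive(Debug, Default)]
        pub struct Name {
            pub common_name: String,
        }
    }

    mod certificate {
        use super::name::Name;

        /// Hypothetical stand-in; it uses `Name` internally, so the type is
        /// still compiled even when its re-export is gated away.
        #[derive(Debug, Default)]
        pub struct Certificate {
            pub subject: Name,
        }
    }

    // Always re-exported: both the PyO3 layer and the fuzz crate call into it.
    pub use self::certificate::Certificate;

    // Re-exported only when its sole consumer (the PyO3 converter module)
    // is compiled.
    #[cfg(feature = "python")]
    pub use self::name::Name;
}

/// Reports whether this build was compiled with the `python` feature.
pub fn python_layer_enabled() -> bool {
    cfg!(feature = "python")
}
```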
Surfaces concrete quality signals (fuzz results, zero-dep parser, forbid(unsafe_code), coverage numbers, CI matrix) as a trust section near the bottom of the README. Positioned after the functional docs so it doesn't overshadow the feature content.
bradh11 added a commit that referenced this pull request Apr 16, 2026
* Add Contributor Code of Conduct and remove Rust toolchain installation instructions (#17) * feat: ✨ Add SensitiveDateValidator (#15) * feat: ✨ Add SensitiveDateValidator * README * Docs, linting, and additional tests * MODULARIZATION_REPORT * Typos * housekeeping: remove obsolete test scripts for public key verification * feat: dynamic validator args dispatch (#18) (#19) * security: bump time crate to 0.3.47 (RUSTSEC-2026-0009) Updates Cargo.lock to pull in time >=0.3.47, addressing the DoS via stack exhaustion advisory flagged by cargo audit. The vulnerable version was pulled in transitively through x509-parser 0.16.0. Requires rustc >=1.88.0, which matches what the CI rust job already installs via dtolnay/rust-toolchain@stable. * feat: dynamic validator args dispatch (#18) Replaces the hardcoded subject_alt_names special case in core.validate() with a generic dispatch that discovers each validator's user arguments from its validate() signature. New validators automatically participate in argument passing without any core changes. Contract for validator authors: - The first three positional params of validate() are framework- supplied (cert/cipher data, host, port). Their names don't matter. - Any additional user-configurable arguments must be keyword-only, annotated, and have a default value. - Enforcement runs in BaseCertValidator/BaseCipherValidator __init_subclass__ at import time; malformed signatures raise TypeError before the class can be used. Performance: signature inspection happens once, at class definition, via __init_subclass__. The per-call dispatch hot path is a frozen-set difference and a dict unpack — sub-microsecond, zero inspect calls. User-facing API: - Canonical form: validate(validator_args={"name": {"arg": value}}) - The bare-list shorthand (subject_alt_names=[...]) still works for one-arg validators but now emits a DeprecationWarning. 
- New CertMonitor.describe_validators() returns every validator's name, doc, and argument schema (annotation, default) for introspection — reads the cached _user_params populated at class definition time. Validator migrations (signature changes only, no behavior changes): - subject_alt_names: alternate_names is now keyword-only. - sensitive_date: *args: SensitiveDate -> *, dates: Optional[List[ SensitiveDate]] = None. Weekend/leap-day flagging, return shape, and the internal isinstance check are unchanged. Existing tests that asserted positional dispatch were updated to the kwargs form. New tests cover enforcement (well-formed, missing annotation, missing default, *args, **kwargs, cipher parity), dispatch (canonical dict form, deprecation shim with pytest.warns, unknown arg, invalid arg type, validator raising TypeError), and describe_validators (shape, per-validator args, plain-class annotation rendering). Coverage: 98.67% (gate 95%); validators/base.py and all new core dispatch code are at 100%. * fix: sensitive_date validator cleanup and ergonomics (#20) Follow-up to #18 addressing the issues that held this validator back from the last release. Purely additive changes to public output; core behavior (weekend / leap-day / user-date matching) is unchanged. Changes ------- - Error handling: replaces ``raise TypeError`` for malformed input with a structured error dict, matching every other validator's contract. The #18 dispatch layer also catches TypeError as a safety net, but the check now lives where it belongs — at the top of ``validate()``. - Input normalization: ``dates`` now accepts any of ``SensitiveDate``, ``date``, ``datetime``, an ISO 8601 string, or a ``(name, date)`` tuple. Bare dates and ISO strings auto-generate names from the ISO form, so users reading blackout dates from a YAML file or a simple list don't have to import the ``SensitiveDate`` named tuple just to use the validator. All forms can be mixed freely in a single call. 
- Structured match field: adds ``sensitive_date_matches`` to the return dict — a list of ``{"name", "date"}`` entries for every user-supplied date that matched. Callers that previously had to regex-parse the ``warnings`` list can now read a machine-friendly field. ``warnings`` is preserved for human-readable summaries. - Weekend / leap-day warning strings: when ``weekend_expiry`` or ``leapday_expiry`` fire, a corresponding human-readable line is now appended to ``warnings``. Previously these conditions set booleans but produced empty warnings, which was confusing when scanning logs. - Shared ``parse_not_after`` helper: extracts the ``notAfter`` format string into ``certmonitor/validators/_utils.py`` and migrates both ``expiration`` and ``sensitive_date`` to use it. Future format changes only need to touch one place. Docs ---- - Adds a sensitive_date example to ``docs/usage/validator_args.md`` showing all four input forms. - Adds the previously-missing ``SensitiveDate`` nav entry to ``mkdocs.yml`` so the validator's auto-generated reference page is reachable. - Regenerates ``MODULARIZATION_REPORT.md``. Tests ----- New coverage: one test per input form (SensitiveDate, date, datetime, ISO string, ``(name, date)`` tuple, ``(name, datetime)`` tuple, mixed); weekend warning string content for both Saturday and Sunday; leap-day warning string; structured ``sensitive_date_matches`` field; structured error dicts for invalid type, malformed ISO string, and bad tuple shape; ``dates=[]`` and ``dates=None`` behavior. All 360 tests pass; coverage 98.73% (sensitive_date.py, _utils.py, and expiration.py all at 100%). Depends on #18 (branches off feature/dynamic-validator-args). * feat: chain validator + drop base64 Rust dep (#14) (#23) * feat: add chain validator and drop base64 Rust dep (#14) Adds a new structural certificate-chain validator alongside a small Rust dependency cleanup. 
The chain validator inspects the full TLS chain the server presents and reports the misconfigurations operators actually hit: missing intermediates, out-of-order chains, expired members, weak signature algorithms, non-CA intermediates, and unexpected self-signed leaves. It is registered but disabled by default — opt in via enabled_validators=["chain"] or the ENABLED_VALIDATORS env var. Cryptographic signature verification is intentionally out of scope. The validator uses DN equality plus Subject Key Identifier / Authority Key Identifier extension matching for chain ordering, which catches every real-world scenario from issue #14 without pulling a crypto crate (e.g. ring) into the Rust dependency tree. Real signature verification, OCSP/ CRL revocation, and trust-store path building are tracked as Phase 2. Rust extension changes (rust_certinfo/src/lib.rs): - New analyze_chain(List[bytes]) entry point. Parses the entire chain in a single PyO3 call and returns per-cert details plus adjacent-pair subject/issuer + SKI/AKI linkage. One Rust call per fetch, not N. - The base64 crate is gone. extract_public_key_pem now uses an inlined ~30-line RFC 4648 encoder. Output is byte-identical and verified by a regression test. Final Rust deps: pyo3 + x509-parser only. Python layer: - SSLHandler.fetch_raw_cert now also returns chain_der + chain_error, populated via SSLSocket.get_verified_chain() on Python 3.13+ and the stable _sslobj.get_unverified_chain() fallback on 3.10–3.12. Returns a clear error on 3.8/3.9 — the rest of the library stays 3.8-compatible. - core._fetch_raw_cert calls analyze_chain once on fetch and caches the result as cert_data["chain_analysis"], so re-running validators is free. - New ChainValidator with four keyword-only args: min_chain_length, require_root_in_chain, allow_self_signed_leaf, weak_signature_algorithms. 
- Roles ("leaf" / "intermediate" / "root") are assigned by structural property, not by position: a cert is only labeled "root" when it is actually self-signed. Cross-signed roots (Cloudflare/SSL.com cross- signed by Comodo, Google's GTS Root cross-signed by GlobalSign, etc.) are correctly labeled "intermediate". Tests (407 passing, 98.77% coverage): - 22 ChainValidator unit tests using synthetic chain_analysis dicts for full branch coverage without bit-rot from cert expiry. - 16 Rust binding tests (analyze_chain shape, ordering, weak-signature detection, invalid-DER handling) against a real captured chain in tests/fixtures/, asserting only time-insensitive properties. - 5 SSL handler tests covering the 3.13 public API, 3.10–3.12 _sslobj fallback, and 3.8/3.9 unsupported-version paths (plus exception cases). - 3 core tests for the analyze_chain success / exception / no-chain branches in _fetch_raw_cert. - chain.py at 99% coverage; ssl_handler.py at 95%. Tooling: - scripts/bench_chain.py — opt-in local benchmark with a microbench of analyze_chain (~400us / call on a 3-cert chain) and a 101-host pipeline test against stable public hosts. Verified 100/100 successful chains validate correctly across diverse CAs and chain depths in ~16s wall clock at concurrency 20. Docs: - docs/validators/chain.md, mkdocs nav entry, README "Available Validators" row, CHANGELOG Unreleased entry covering the new validator and the base64 dep removal. Closes #14 * chore: ruff-format scripts/bench_chain.py The bench script was added after the last full make test run so it never got formatted. Pure whitespace/wrapping changes — no behavior change. * feat: replace x509-parser with in-tree minimal DER parser (#22) (#24) * feat: replace x509-parser with in-tree minimal DER parser (#22) Closes #22. Replaces the ~28-crate x509-parser dependency tree with ~1500 lines of strict-DER, no-unsafe, panic-free in-tree parser code scoped to exactly what certinfo exposes to Python. 
Final Rust dep tree shrinks from 48 crates to 20 — every remaining crate is pyo3 or a pyo3 build-time helper. Module structure (modern Rust 2018+ layout, no mod.rs): rust_certinfo/src/ lib.rs PyO3 module + thin entry-point shim error.rs ParseError enum (no panics) pem.rs Inlined RFC 4648 b64 + PEM wrap pyobj.rs PyO3 dict converters der.rs / der/ ASN.1 primitive layer (reader, oid, time, string, tag) — knows nothing about X.509 x509.rs / x509/ X.509 layer (certificate, name, spki, algorithm, extensions) — built on der/ The DER primitive layer is a clean reusable foundation for future X.509-adjacent capabilities: SAN parsing for non-leaf certs, authorityInfoAccess (OCSP/AIA URLs), cRLDistributionPoints, extendedKeyUsage / keyUsage, certificate policies for EV detection, CRL parsing, OCSP request/response parsing, CSR parsing. Each is a matter of adding a single function under x509/ — the der/ layer requires no changes for any of the above. Strict-DER security guarantees, all enforced at the crate level: * #![forbid(unsafe_code)] at lib.rs root * Every parser path returns Result<_, ParseError> — no panics on malformed input * Reject indefinite-length encoding (BER-only, illegal in DER) * Reject non-canonical length encoding (long form when short would suffice; long form with leading zero bytes) * Bounds checks against the parent slice on every read * 56 in-module Rust unit tests covering every public function, every ParseError construction path, and DER edge cases In addition to dropping x509-parser, this PR fixes two latent bugs discovered during the rewrite: * EC `curve` field now contains the curve OID. Previous builds put the algorithm OID `1.2.840.10045.2.1` (id-ecPublicKey) into the field literally named `curve`. The new parser extracts the curve OID from algorithm.parameters and emits e.g. `1.2.840.10045.3.1.7` for P-256, `1.3.132.0.34` for P-384, `1.3.132.0.35` for P-521. * RSA modulus bit length is no longer over-counted by 8 bits. 
The previous build computed `modulus.len() * 8` from x509-parser, which leaves the DER-mandated leading-zero sign byte in modulus. Real-world RSA-2048 / 3072 / 4096 keys were being reported as 2056 / 3080 / 4104. The new parser strips the sign byte and reports the canonical 2048 / 3072 / 4096. Both fixes are visible behavioral changes for any caller reading `public_key_info["curve"]` or `public_key_info["size"]` literally. Test coverage: * 425 Python tests passing (was 407 — 18 added in the new corpus snapshot test), 99% line coverage * 56 Rust unit tests passing * tests/test_certinfo_corpus.py — new snapshot test that runs every public certinfo entry point against 130 unique real-world certs captured from the bench host list. Asserts RSA bit lengths are canonical, EC curves resolve to real curve OIDs (catches the fixed bug as a regression), validity timestamps are sane, all DNs decode, and SPKI extraction round-trips. * tests/fixtures/diff_corpus/ — 130 captured DER fixtures from 101 stable public hosts spanning Google Trust Services, DigiCert, Let's Encrypt, Sectigo, ISRG, SSL.com, Cloudflare-fronted certs, etc. Roughly 50/50 RSA/EC. Manual fuzzing gate at rust_certinfo/fuzz/ (cargo fuzz target + README). Not in CI (nightly + slow); release-time pre-merge gate. Other changes: * Cargo.toml crate-type now declares ["cdylib", "rlib"]. The cdylib is the same Python wheel maturin has always built; the additional rlib lets the in-repo fuzz crate link against the parser as a normal Rust library. No published-wheel surface change. * certinfo::Certificate::from_der and certinfo::ParseError are now pub at the crate root for the fuzz crate and any future in-tree Rust consumer. The PyO3 boundary and Python-facing API are unchanged. Closes #22 * chore: drop fuzz harness from rewrite PR; track in #25 Removes rust_certinfo/fuzz/ from this PR. 
The fuzz crate was added in the original commit but introduced a build issue: linking the certinfo crate as an rlib (which the fuzz crate needed) made cargo clippy and cargo test on macOS and Windows try to fully resolve Python symbols at link time. PyO3's extension-module feature defers Python symbol resolution to runtime, which works for the cdylib wheel target but not for an rlib-linked test binary. Reverts to crate-type = ["cdylib"] only, drops the pub re-exports of Certificate and ParseError that were added solely for the fuzz crate's benefit, and removes the rust_certinfo/fuzz/ directory. The fuzz harness work is tracked in #25 with three implementation options for the next iteration. The corpus snapshot test in tests/test_certinfo_corpus.py continues to provide real-input regression coverage on every CI run; fuzzing is the deeper hardening gate that we'll add as a focused follow-up once we pick a build strategy. * docs: state the zero-dep guarantee in Cargo.toml Mirrors the "No runtime dependencies; standard library only" comment in pyproject.toml so the same promise is visible from the Rust side. pyo3 is called out as the one required dep (it's the Python bridge, not a parser dependency) and the new in-tree parser is named so a reader knows where to look. * feat: fuzz harness for in-tree DER parser (#25) (#26) * feat: fuzz harness for in-tree DER parser (#25) * fix: emit link-arg=-undefined/dynamic_lookup on macOS for test bins * chore: silence unused-import warnings under --no-default-features * docs: add 'Why Trust CertMonitor' section to README * Release 0.3.0 Zero-dependency milestone: in-tree DER/X.509 parser replaces x509-parser, chain validator, fuzz harness, two bug fixes. See CHANGELOG.md for the full release notes. --------- Co-authored-by: Chris Tomkins <80041880+cdtomkins@users.noreply.github.com>
Summary

- `cargo fuzz` harness for the pure-Rust DER / X.509 parser that landed in #22 (Replace x509-parser crate with in-tree minimal DER parser), plus `make fuzz` (60s smoke) and `make fuzz-long` (1hr soak) Makefile targets so the workflow is one command.
- `python` Cargo feature on the `certinfo` crate (default = on) that gates the PyO3 entry-point layer. The fuzz crate declares `default-features = false`, which drops pyo3 entirely and exposes only the pure-Rust parser core: the mechanism that lets the fuzz binary run as a standalone executable.
- `pyo3` + its build helpers) are in the runtime dependency tree.

Live fuzz run from this machine

From the commit message, actual `make fuzz` output:

32 million adversarial byte sequences in 61 seconds, 308 code-coverage points / 487 libfuzzer features explored, 14 new corpus entries discovered by the fuzzer on top of the 130 real-world seeds, zero crashes. The parser holds up against roughly half a million adversarial inputs per second without panicking.
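As a quick sanity check on the throughput implied by the two reported figures (32 million runs in 61 seconds), the arithmetic can be written out as below. The function names are hypothetical, and the one-hour projection assumes the rate stays constant for the whole `make fuzz-long` run, which real fuzzing throughput generally does not guarantee.

```rust
/// Whole executions per second, rounded down.
pub fn execs_per_second(total_runs: u64, elapsed_secs: u64) -> u64 {
    total_runs / elapsed_secs
}

/// Projected number of executions at a constant rate (an idealisation).
pub fn projected_runs(rate_per_sec: u64, duration_secs: u64) -> u64 {
    rate_per_sec * duration_secs
}

/// The figures from the smoke run: 32 million inputs in 61 seconds.
pub fn smoke_run_rate() -> u64 {
    execs_per_second(32_000_000, 61) // 524_590 per second
}
```

At that rate a minute covers about 31 million inputs, and a one-hour soak would cover on the order of 1.9 billion.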
Why fuzz a DER parser

The new in-tree parser takes untrusted bytes from the network on every TLS handshake CertMonitor performs. Three risk classes fuzzing defends against:

- `#![forbid(unsafe_code)]` at the crate root prevents memory-safety bugs, but Rust panics still abort the process. Every bounds-check in the parser is a potential panic if I got the math wrong. Fuzzing tries millions of adversarial byte sequences and reports any that crash.

The corpus snapshot test at `tests/test_certinfo_corpus.py` (added in #22) covers the day-to-day regression check against real-world certs. This fuzz target is the deeper, slower defense against malformed input nobody has thought to write a test for: worth running before tagging a release, not worth running on every commit.

The `python` feature mechanism

The key insight that makes this work is feature-gating the PyO3 surface.

With the feature on (the default), the crate compiles the full PyO3 entry-point surface (`parse_public_key_info`, `extract_public_key_der`, `extract_public_key_pem`, `analyze_chain`) exactly as before. With it off, the `#[pyfunction]` / `#[pymodule]` code, which lives in a nested `mod py` under `#[cfg(feature = "python")]`, is excluded from the build entirely, and the `pyo3` dependency isn't even compiled. What's left is the pure-Rust DER parser core in `rust_certinfo/src/{der,x509}/`.

The fuzz crate uses this by declaring `certinfo = { path = "..", default-features = false }`, which gives the fuzz binary the parser code without the Python runtime dependency. No linker workarounds, no `.cargo/config.toml`, no `build.rs`: just a clean feature flag.

What I tried first and why I moved on

The previous attempt (rolled out of #22 before merge and tracked in this issue) tried to add the fuzz crate while keeping PyO3 as a mandatory dependency. That approach required `.cargo/config.toml` with platform-specific `-undefined dynamic_lookup` linker flags AND a `build.rs` emitting `cargo:rustc-link-arg` directives, and even then the fuzz binary died at runtime because standalone Rust binaries don't have a Python interpreter to resolve `_PyBaseObject_Type`, `_Py_NoneStruct`, etc. Feature-gating is the cleaner answer: it removes the PyO3 code from the build instead of trying to link around it.

Layout
New `pub` surface at the crate root (for fuzz consumption):

- `certinfo::Certificate::from_der`
- `certinfo::ParseError`

Makefile

Both targets check for nightly Rust + `cargo-fuzz` up front and print install instructions if missing, seed the libfuzzer corpus from `tests/fixtures/diff_corpus/` (the 130 captured real-world certs), run the target, and report zero crashes on success.

Preflight check output when prereqs are missing:

Not in CI (deliberate)

`cargo fuzz` requires nightly Rust, takes orders of magnitude longer than a unit test, and brings in `libfuzzer-sys`. None of those belong in the PR CI matrix. The fuzz harness is documented as a manual pre-release gate in `fuzz/README.md`.

Files of note
- `Cargo.toml`: `python` feature (default on), makes `pyo3` optional, `crate-type = ["cdylib", "rlib"]`
- `rust_certinfo/src/lib.rs`: `mod pem` / `mod pyobj` moved under `#[cfg(feature = "python")]`; `pub use Certificate` / `pub use ParseError` added at crate root
- `fuzz/`: (new)
- `Makefile`: `fuzz` and `fuzz-long` targets + `clean` entries for fuzz artifacts + `help` text
- `CHANGELOG.md`: `python` feature + `rlib` crate-type + new `pub` surface

Test plan

- `make develop` rebuilds clean with the `python` feature on (default)
- `uv run pytest -q`: 425 Python tests passing, 99% line coverage
- `cargo test --all-targets`: 56 Rust unit tests passing
- `cargo clippy --all-targets --all-features -- -D warnings`: clean
- `cargo fmt --all -- --check`: clean
- `make test`: full local CI green (ruff, cargo fmt/clippy, pytest `--cov-fail-under=95`, mypy strict, cargo audit (20 crates), bandit, wheel build)
- `make fuzz`: 60s smoke run, 32M runs, 308 cov, 487 ft, zero crashes
- `cargo tree -e normal`: still 20 dependency crates total, all `pyo3` or build-time helpers
- `rlib` crate-type was specifically because `cargo clippy --all-targets` failed at link time on macOS / Windows. Feature-gating pyo3 should fix that because the `cargo clippy` / `cargo test` of the main certinfo crate compiles with `default-features` = true and pyo3's own build script handles the link flags for those configurations. If CI fails on macOS or Windows again we'll know the feature gate wasn't sufficient. This is the key risk to watch on this PR.
- `make develop` and `make test` work in their local environment

Closes #25