feat: fuzz harness for in-tree DER parser (#25) #26

Merged: bradh11 merged 4 commits into develop from feat/fuzz-harness on Apr 16, 2026
@bradh11 bradh11 commented Apr 15, 2026

Summary

  • Adds a cargo fuzz harness for the pure-Rust DER / X.509 parser that landed in #22 (Replace x509-parser crate with in-tree minimal DER parser), plus make fuzz (60s smoke) and make fuzz-long (1hr soak) Makefile targets so the workflow is one command.
  • Introduces a python Cargo feature on the certinfo crate (default = on) that gates the PyO3 entry-point layer. The fuzz crate declares default-features = false, which drops pyo3 entirely and exposes only the pure-Rust parser core — the mechanism that lets the fuzz binary run as a standalone executable.
  • Nothing Python-facing changes. The wheel build is byte-identical, all 425 Python tests still pass at 99% coverage, all 56 Rust unit tests still pass, and the same 20 crates (pyo3 + its build helpers) remain in the runtime dependency tree.

Live fuzz run from this machine

From the commit message, actual make fuzz output:

🐛 Seeding fuzz corpus from tests/fixtures/diff_corpus/...
   130 seed files in corpus
🐛 Running parse_certificate fuzz target for 60s...
    Finished `release` profile [optimized + debuginfo] target(s) in 2.07s
#32326085	DONE   cov: 308 ft: 487 corp: 144/138Kb lim: 6404 exec/s: 529935 rss: 455Mb
Done 32326085 runs in 61 second(s)
✅ Fuzz run complete (no crashes)

32 million adversarial byte sequences in 61 seconds, 308 code-coverage points / 487 libfuzzer features explored, 14 new corpus entries discovered by the fuzzer on top of the 130 real-world seeds, zero crashes. The parser sustained roughly 530,000 adversarial inputs per second (about 32 million per minute) without panicking.

Why fuzz a DER parser

The new in-tree parser takes untrusted bytes from the network on every TLS handshake CertMonitor performs. Three risk classes fuzzing defends against:

  1. Panic on malformed input. #![forbid(unsafe_code)] at the crate root prevents memory-safety bugs, but Rust panics still abort the process. Every bounds-check in the parser is a potential panic if I got the math wrong. Fuzzing tries millions of adversarial byte sequences and reports any that crash.
  2. Pathological CPU on malformed input (denial of service). A length-parsing bug that loops on a particular byte sequence is the kind of thing only a coverage-guided fuzzer reliably finds.
  3. Bounds bugs in length fields. Historically the number-one source of CVEs in DER/ASN.1 parsers. Unit tests cover the cases I thought of; libfuzzer covers the cases I didn't.
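
The first and third risk classes both come down to slicing with attacker-controlled lengths. A minimal sketch of the panic-free pattern follows. `Reader` and `ReadError` are hypothetical names chosen for illustration, not taken from the certinfo source (the real crate exposes `ParseError`).

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ReadError {
    Truncated,
}

/// Hypothetical cursor over untrusted bytes.
struct Reader<'a> {
    buf: &'a [u8],
}

impl<'a> Reader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        Reader { buf }
    }

    /// Take exactly `n` bytes from the front, or fail without panicking.
    fn take(&mut self, n: usize) -> Result<&'a [u8], ReadError> {
        // Indexing with `buf[..n]` would panic when n > buf.len();
        // checking first turns that case into an ordinary Err.
        if n > self.buf.len() {
            return Err(ReadError::Truncated);
        }
        let (head, tail) = self.buf.split_at(n);
        self.buf = tail;
        Ok(head)
    }
}
```

A failed `take` leaves the cursor where it was, so a caller can report the error or try a different interpretation. A fuzzer's job is to find the code path where a check like this was forgotten.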

The corpus snapshot test at tests/test_certinfo_corpus.py (added in #22) covers the day-to-day regression check against real-world certs. This fuzz target is the deeper, slower defense against malformed input nobody has thought to write a test for — worth running before tagging a release, not worth running on every commit.

The python feature mechanism

The key insight that makes this work is feature-gating the PyO3 surface:

[features]
default = ["python"]
python = ["dep:pyo3"]

[dependencies]
pyo3 = { version = "0.24.1", features = [...], optional = true }

With the feature on (the default), the crate compiles the full PyO3 entry-point surface (parse_public_key_info, extract_public_key_der, extract_public_key_pem, analyze_chain) exactly as before. The #[pyfunction] / #[pymodule] code now lives in a nested mod py under #[cfg(feature = "python")]; with the feature off, that module is excluded from the build entirely and the pyo3 dependency isn't even compiled. What's left is the pure-Rust DER parser core in rust_certinfo/src/{der,x509}/.

The fuzz crate uses this:

certinfo = { path = "..", default-features = false }

Which gives the fuzz binary the parser code without the Python runtime dependency. No linker workarounds, no .cargo/config.toml, no build.rs — just a clean feature flag.
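
As a rough picture of how that gate might look in lib.rs, here is a sketch under stated assumptions. The `Certificate` and `ParseError` bodies are stand-ins, and the `der_len` entry point is hypothetical; only the `#[cfg(feature = "python")]` gate and the nested `mod py` are taken from the description above.

```rust
// Parser core: always compiled, with no pyo3 types in any signature.
// These definitions are stand-ins, not the real certinfo types.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
}

#[derive(Debug, PartialEq)]
pub struct Certificate {
    pub der_len: usize,
}

impl Certificate {
    pub fn from_der(der: &[u8]) -> Result<Certificate, ParseError> {
        if der.is_empty() {
            return Err(ParseError::Empty);
        }
        Ok(Certificate { der_len: der.len() })
    }
}

// Python entry points: compiled only when the `python` feature is enabled.
// With `default-features = false`, this module and its pyo3 import are
// removed before name resolution, so pyo3 never needs to be built.
#[cfg(feature = "python")]
mod py {
    use pyo3::exceptions::PyValueError;
    use pyo3::prelude::*;

    /// Hypothetical entry point used only in this sketch.
    #[pyfunction]
    fn der_len(der: Vec<u8>) -> PyResult<usize> {
        super::Certificate::from_der(&der)
            .map(|c| c.der_len)
            .map_err(|e| PyValueError::new_err(format!("{e:?}")))
    }

    #[pymodule]
    fn certinfo(m: &Bound<'_, PyModule>) -> PyResult<()> {
        m.add_function(wrap_pyfunction!(der_len, m)?)?;
        Ok(())
    }
}
```

The design point is that the dependency arrow only goes one way: `mod py` calls into the core, and the core never mentions `py`.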

What I tried first and why I moved on

The previous attempt (pulled out of #22 before merge and tracked in #25) tried to add the fuzz crate while keeping PyO3 as a mandatory dependency. That approach required .cargo/config.toml with platform-specific -undefined dynamic_lookup linker flags and a build.rs emitting cargo:rustc-link-arg directives, and even then the fuzz binary died at runtime because standalone Rust binaries don't have a Python interpreter to resolve _PyBaseObject_Type, _Py_NoneStruct, etc. Feature-gating is the cleaner answer: it removes the PyO3 code from the build instead of trying to link around it.

Layout

fuzz/
├── Cargo.toml               # certinfo dep with default-features=false
├── Cargo.lock               # committed, per cargo-fuzz convention
├── README.md                # why fuzz + how to run + pre-release gate docs
├── .gitignore               # ignores target/ corpus/ artifacts/ coverage/
└── fuzz_targets/
    └── parse_certificate.rs # feeds arbitrary bytes to Certificate::from_der

New pub surface at the crate root (for fuzz consumption):

  • certinfo::Certificate::from_der
  • certinfo::ParseError
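
A cargo-fuzz target is normally only a few lines: it wraps a closure in `libfuzzer_sys::fuzz_target!` and passes each generated input to the code under test. The sketch below shows the same contract without libFuzzer so it runs anywhere. `parse` is a hypothetical stand-in for `Certificate::from_der`, and a toy xorshift generator plays the role of the fuzzer. It is not the harness from this PR, and unlike libFuzzer it is not coverage-guided.

```rust
/// Stand-in for `Certificate::from_der`: accepts only a SEQUENCE tag (0x30)
/// followed by a one-byte short-form length that matches the remaining input.
/// Hypothetical, for illustration only.
fn parse(data: &[u8]) -> Result<usize, &'static str> {
    let (&tag, rest) = data.split_first().ok_or("empty input")?;
    if tag != 0x30 {
        return Err("not a SEQUENCE");
    }
    let (&len, body) = rest.split_first().ok_or("missing length")?;
    if len & 0x80 != 0 {
        return Err("long-form length not handled in this sketch");
    }
    if body.len() != len as usize {
        return Err("length mismatch");
    }
    Ok(body.len())
}

/// What the body of a fuzz target does: call the parser and discard the result.
/// Returning `Err` is fine; only a panic or a hang counts as a finding.
fn fuzz_one(data: &[u8]) {
    let _ = parse(data);
}

/// Toy driver: feeds `runs` pseudo-random inputs of 0 to 16 bytes and returns
/// how many were fed. A panic inside `fuzz_one` would abort the loop.
/// `seed` must be nonzero for xorshift to produce anything useful.
fn smoke_run(runs: u32, mut seed: u64) -> u32 {
    let mut fed = 0;
    for _ in 0..runs {
        let mut buf = [0u8; 16];
        for b in buf.iter_mut() {
            // xorshift64: cheap deterministic noise, not a mutation engine
            seed ^= seed << 13;
            seed ^= seed >> 7;
            seed ^= seed << 17;
            *b = seed as u8;
        }
        let len = (seed % 17) as usize;
        fuzz_one(&buf[..len]);
        fed += 1;
    }
    fed
}
```

In the real target, the `fuzz_one` body would sit inside `fuzz_target!(|data: &[u8]| { ... })` and libFuzzer would supply `data`, mutating the seed corpus toward inputs that reach new code.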

Makefile

make fuzz        # 60-second smoke run (dev workflow)
make fuzz-long   # 1-hour soak run (release gate)

Both targets check for nightly Rust + cargo-fuzz up front and print install instructions if missing, seed the libfuzzer corpus from tests/fixtures/diff_corpus/ (the 130 captured real-world certs), run the target, and report zero crashes on success.

Preflight check output when prereqs are missing:

❌ cargo-fuzz not installed.
   Install with: cargo install cargo-fuzz

Not in CI (deliberate)

cargo fuzz requires nightly Rust, takes orders of magnitude longer than a unit test, and brings in libfuzzer-sys. None of those belong in the PR CI matrix. The fuzz harness is documented as a manual pre-release gate in fuzz/README.md.

Files of note

| Path | Change |
|---|---|
| Cargo.toml | Adds python feature (default on), makes pyo3 optional, crate-type = ["cdylib", "rlib"] |
| rust_certinfo/src/lib.rs | PyO3 entry points and mod pem/mod pyobj moved under #[cfg(feature = "python")]; pub use Certificate / pub use ParseError added at crate root |
| fuzz/ (new) | Standalone fuzz crate, README, gitignore, libFuzzer target |
| Makefile | fuzz and fuzz-long targets + clean entries for fuzz artifacts + help text |
| CHANGELOG.md | Unreleased: fuzz harness + python feature + rlib crate-type + new pub surface |

Test plan

  • make develop rebuilds clean with the python feature on (default)
  • uv run pytest -q — 425 Python tests passing, 99% line coverage
  • cargo test --all-targets — 56 Rust unit tests passing
  • cargo clippy --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • make test — full local CI green: ruff, cargo fmt/clippy, pytest --cov-fail-under=95, mypy strict, cargo audit (20 crates), bandit, wheel build
  • make fuzz — 60s smoke run, 32M runs, 308 cov, 487 ft, zero crashes
  • cargo tree -e normal — still 20 dependency crates total, all pyo3 or build-time helpers
  • CI matrix (Python 3.8-3.13 × macOS / Ubuntu / Windows). Note: the rlib crate-type was reverted in #22 (Replace x509-parser crate with in-tree minimal DER parser) specifically because cargo clippy --all-targets failed at link time on macOS / Windows. Feature-gating pyo3 should fix that, because cargo clippy / cargo test on the main certinfo crate compile with default-features = true and pyo3's own build script handles the link flags for those configurations. If CI fails on macOS or Windows again, we'll know the feature gate wasn't sufficient. This is the key risk to watch on this PR.
  • Reviewer sanity-checks that make develop and make test work in their local environment

Closes #25

bradh11 added 4 commits April 15, 2026 08:12
feat: fuzz harness for in-tree DER parser (#25)

Closes #25. Adds a `cargo fuzz` harness for the pure-Rust DER / X.509
parser that landed in #22, plus `make fuzz` and `make fuzz-long`
Makefile targets so the workflow is one command.

The core design decision is a Cargo feature on the certinfo crate:
`python` (on by default) controls whether the PyO3 entry-point layer
is compiled. Disabling it drops pyo3 from the dependency set entirely
and exposes only the pure-Rust DER / X.509 parser core. The fuzz crate
declares `certinfo = { path = "..", default-features = false }`, which
is what lets the fuzz binary run as a standalone executable — without
the feature gate, the rlib pulls in pyo3 symbols that can only be
resolved by a running Python interpreter, which a standalone
libFuzzer binary doesn't have.

Nothing Python-facing changes. The default wheel build is byte-
identical to before: `make develop` still compiles with the `python`
feature on, maturin still builds the same cdylib, and all 425 Python
tests pass with 99% line coverage and 56 Rust unit tests at 98.77%
overall coverage.

Layout:

  fuzz/
  ├── Cargo.toml               # declares certinfo dep with default-features=false
  ├── Cargo.lock               # committed for reproducibility (cargo-fuzz convention)
  ├── README.md                # why-we-fuzz + how-to-run + pre-release gate docs
  ├── .gitignore               # ignores target/ corpus/ artifacts/ coverage/
  └── fuzz_targets/
      └── parse_certificate.rs # feeds arbitrary bytes to Certificate::from_der

Makefile targets:

  make fuzz        # 60-second smoke run (dev workflow)
  make fuzz-long   # 1-hour soak run (release gate)

Both targets check for nightly Rust + cargo-fuzz up front and print
install instructions if missing, seed the libfuzzer corpus from
tests/fixtures/diff_corpus/ (the 130 captured real-world certs), run
the target, and report zero crashes on success.

First live fuzz run from this machine: 32 million adversarial byte
sequences in 61 seconds, 308 coverage points / 487 libfuzzer features
explored, 14 new corpus entries discovered, zero crashes. The parser
sustained roughly 530,000 adversarial inputs per second without
panicking, which gives a concrete answer to "how scared should I be
of the in-tree parser."

Other changes:

- Cargo.toml crate-type is now ["cdylib", "rlib"]. The cdylib is the
  Python wheel target maturin has always built; the added rlib lets
  the fuzz crate (and any future in-tree Rust consumer) link the
  parser core as a normal Rust library. No published-wheel surface
  change.
- `certinfo::Certificate::from_der` and `certinfo::ParseError` are
  now pub at the crate root so the fuzz crate can call them. The
  PyO3 boundary and Python-facing API are unchanged.
- The original PyO3 entry points moved from lib.rs's top level into
  a nested `mod py` behind `#[cfg(feature = "python")]`. Same
  functions, same `#[pyfunction]` annotations, same PyInit_certinfo
  generated symbol — just scoped under a feature flag so the parser
  core can be compiled in isolation.

Fuzzing is NOT in CI and will not be — cargo fuzz needs nightly Rust
and takes orders of magnitude longer than a unit test. The corpus
snapshot test at tests/test_certinfo_corpus.py (added in #22) remains
the day-to-day regression check against real-world certs; the fuzz
harness is the deeper, slower defense against malformed input nobody
has thought to write a test for.

Closes #25

fix: emit link-arg=-undefined/dynamic_lookup on macOS for test bins

CI on macOS was still failing the rust matrix at the clippy /
test-binary link step. The feature gate alone wasn't sufficient
because `cargo clippy --all-targets` and `cargo test` link test
binaries that depend on the certinfo rlib, and those bins need the
same `-undefined dynamic_lookup` flags pyo3 already emits for the
cdylib target.

pyo3's own build script emits `cargo:rustc-cdylib-link-arg=-undefined
dynamic_lookup`, but `cdylib-link-arg` only applies to cdylib targets.
Test binaries are `bin` targets, which means they ignore the
cdylib-scoped directives and try to resolve `_Py_NoneStruct`,
`_Py_DecRef`, etc. at link time — and fail.

Fix: add a certinfo-level build.rs that emits `cargo:rustc-link-arg`
(not cdylib-scoped), which applies to every linked target in the
crate. Only needed on macOS; Linux allows undefined symbols in
executables by default and Windows has its own pyo3 import-lib
machinery.
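
A build script along these lines could look like the sketch below. The helper names `link_directives` and `emit_link_directives` are hypothetical, and the actual build.rs in this commit may be organized differently. `CARGO_CFG_TARGET_OS` is the variable Cargo sets for build scripts to describe the compilation target.

```rust
/// Directives a build script would print for `target_os`.
/// Hypothetical helper, split out so the logic can be checked without Cargo.
fn link_directives(target_os: &str) -> Vec<String> {
    let args: &[&str] = match target_os {
        // macOS: defer undefined Python symbols to load time.
        "macos" => &["-undefined", "dynamic_lookup"],
        // Other targets: emit nothing in this sketch.
        _ => &[],
    };
    args.iter()
        // `rustc-link-arg` applies to every linked target of the package,
        // unlike `rustc-cdylib-link-arg`, which only reaches the cdylib.
        .map(|a| format!("cargo:rustc-link-arg={a}"))
        .collect()
}

/// What the build script's `main` would do: read the target OS that Cargo
/// provides and print one directive per line.
fn emit_link_directives() {
    let target_os = std::env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for line in link_directives(&target_os) {
        println!("{line}");
    }
}
```

In build.rs, `fn main` would simply call `emit_link_directives()`. Reading `CARGO_CFG_TARGET_OS` rather than using `#[cfg(target_os = "macos")]` matters when cross-compiling, because a build script is compiled for the host, not the target.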

My earlier attempts at this failed because I was using
`pyo3_build_config::add_extension_module_link_args()`, which ALSO
emits `cdylib-link-arg` and therefore had the same problem. The fix
is to emit `rustc-link-arg` directly — no build-dependency on
pyo3-build-config needed.

The wheel build is unaffected: maturin builds the cdylib target and
pyo3 handles the cdylib-specific link args via its own build script.
This build.rs only kicks in for non-cdylib link steps (tests,
examples, integration tests) on macOS.

chore: silence unused-import warnings under --no-default-features

The re-exports `Name`, `PublicKeyAlgorithm`, and `SubjectPublicKeyInfo`
in `rust_certinfo/src/x509.rs` are only used by `pyobj.rs`, which is
gated behind the `python` Cargo feature. When the fuzz crate builds
with `--no-default-features`, `pyobj.rs` isn't compiled and the
re-exports look unused — `rustc` correctly emits two `unused_imports`
warnings on every `make fuzz` invocation.

Gate the re-exports behind the same `#[cfg(feature = "python")]` that
gates their consumer. `Certificate` stays always-public because it's
the entry point both PyO3 and the fuzz crate call.

Pure cleanup: no behavior change for the wheel build, no behavior
change for the fuzz build — just silences two noisy warnings that
were making the fuzz output look like it had errors.

docs: add 'Why Trust CertMonitor' section to README

Surfaces concrete quality signals (fuzz results, zero-dep parser,
forbid(unsafe_code), coverage numbers, CI matrix) as a trust section
near the bottom of the README. Positioned after the functional docs
so it doesn't overshadow the feature content.
@bradh11 bradh11 merged commit 701ad09 into develop Apr 16, 2026
15 checks passed
@bradh11 bradh11 deleted the feat/fuzz-harness branch April 16, 2026 02:49
bradh11 added a commit that referenced this pull request Apr 16, 2026
* Add Contributor Code of Conduct and remove Rust toolchain installation instructions (#17)

* feat: ✨ Add SensitiveDateValidator (#15)

* feat: ✨ Add SensitiveDateValidator

* README

* Docs, linting, and additional tests

* MODULARIZATION_REPORT

* Typos

* housekeeping: remove obsolete test scripts for public key verification

* feat: dynamic validator args dispatch (#18) (#19)

* security: bump time crate to 0.3.47 (RUSTSEC-2026-0009)

Updates Cargo.lock to pull in time >=0.3.47, addressing the DoS via
stack exhaustion advisory flagged by cargo audit. The vulnerable
version was pulled in transitively through x509-parser 0.16.0.

Requires rustc >=1.88.0, which matches what the CI rust job already
installs via dtolnay/rust-toolchain@stable.

* feat: dynamic validator args dispatch (#18)

Replaces the hardcoded subject_alt_names special case in core.validate()
with a generic dispatch that discovers each validator's user arguments
from its validate() signature. New validators automatically participate
in argument passing without any core changes.

Contract for validator authors:

  - The first three positional params of validate() are framework-
    supplied (cert/cipher data, host, port). Their names don't matter.
  - Any additional user-configurable arguments must be keyword-only,
    annotated, and have a default value.
  - Enforcement runs in BaseCertValidator/BaseCipherValidator
    __init_subclass__ at import time; malformed signatures raise
    TypeError before the class can be used.

Performance: signature inspection happens once, at class definition,
via __init_subclass__. The per-call dispatch hot path is a frozen-set
difference and a dict unpack — sub-microsecond, zero inspect calls.

User-facing API:

  - Canonical form: validate(validator_args={"name": {"arg": value}})
  - The bare-list shorthand (subject_alt_names=[...]) still works for
    one-arg validators but now emits a DeprecationWarning.
  - New CertMonitor.describe_validators() returns every validator's
    name, doc, and argument schema (annotation, default) for
    introspection — reads the cached _user_params populated at class
    definition time.

Validator migrations (signature changes only, no behavior changes):

  - subject_alt_names: alternate_names is now keyword-only.
  - sensitive_date: *args: SensitiveDate -> *, dates: Optional[List[
    SensitiveDate]] = None. Weekend/leap-day flagging, return shape,
    and the internal isinstance check are unchanged.

Existing tests that asserted positional dispatch were updated to the
kwargs form. New tests cover enforcement (well-formed, missing
annotation, missing default, *args, **kwargs, cipher parity), dispatch
(canonical dict form, deprecation shim with pytest.warns, unknown arg,
invalid arg type, validator raising TypeError), and describe_validators
(shape, per-validator args, plain-class annotation rendering).

Coverage: 98.67% (gate 95%); validators/base.py and all new core
dispatch code are at 100%.

* fix: sensitive_date validator cleanup and ergonomics (#20)

Follow-up to #18 addressing the issues that held this validator back
from the last release. Purely additive changes to public output; core
behavior (weekend / leap-day / user-date matching) is unchanged.

Changes
-------

- Error handling: replaces ``raise TypeError`` for malformed input with
  a structured error dict, matching every other validator's contract.
  The #18 dispatch layer also catches TypeError as a safety net, but
  the check now lives where it belongs — at the top of ``validate()``.

- Input normalization: ``dates`` now accepts any of ``SensitiveDate``,
  ``date``, ``datetime``, an ISO 8601 string, or a ``(name, date)``
  tuple. Bare dates and ISO strings auto-generate names from the ISO
  form, so users reading blackout dates from a YAML file or a simple
  list don't have to import the ``SensitiveDate`` named tuple just to
  use the validator. All forms can be mixed freely in a single call.

- Structured match field: adds ``sensitive_date_matches`` to the return
  dict — a list of ``{"name", "date"}`` entries for every user-supplied
  date that matched. Callers that previously had to regex-parse the
  ``warnings`` list can now read a machine-friendly field. ``warnings``
  is preserved for human-readable summaries.

- Weekend / leap-day warning strings: when ``weekend_expiry`` or
  ``leapday_expiry`` fire, a corresponding human-readable line is now
  appended to ``warnings``. Previously these conditions set booleans but
  produced empty warnings, which was confusing when scanning logs.

- Shared ``parse_not_after`` helper: extracts the ``notAfter`` format
  string into ``certmonitor/validators/_utils.py`` and migrates both
  ``expiration`` and ``sensitive_date`` to use it. Future format changes
  only need to touch one place.

Docs
----

- Adds a sensitive_date example to ``docs/usage/validator_args.md``
  showing all four input forms.
- Adds the previously-missing ``SensitiveDate`` nav entry to
  ``mkdocs.yml`` so the validator's auto-generated reference page is
  reachable.
- Regenerates ``MODULARIZATION_REPORT.md``.

Tests
-----

New coverage: one test per input form (SensitiveDate, date, datetime,
ISO string, ``(name, date)`` tuple, ``(name, datetime)`` tuple, mixed);
weekend warning string content for both Saturday and Sunday; leap-day
warning string; structured ``sensitive_date_matches`` field; structured
error dicts for invalid type, malformed ISO string, and bad tuple shape;
``dates=[]`` and ``dates=None`` behavior.

All 360 tests pass; coverage 98.73% (sensitive_date.py, _utils.py, and
expiration.py all at 100%).

Depends on #18 (branches off feature/dynamic-validator-args).

* feat: chain validator + drop base64 Rust dep (#14) (#23)

* feat: add chain validator and drop base64 Rust dep (#14)

Adds a new structural certificate-chain validator alongside a small Rust
dependency cleanup. The chain validator inspects the full TLS chain the
server presents and reports the misconfigurations operators actually hit:
missing intermediates, out-of-order chains, expired members, weak
signature algorithms, non-CA intermediates, and unexpected self-signed
leaves. It is registered but disabled by default — opt in via
enabled_validators=["chain"] or the ENABLED_VALIDATORS env var.

Cryptographic signature verification is intentionally out of scope. The
validator uses DN equality plus Subject Key Identifier / Authority Key
Identifier extension matching for chain ordering, which catches every
real-world scenario from issue #14 without pulling a crypto crate (e.g.
ring) into the Rust dependency tree. Real signature verification, OCSP/
CRL revocation, and trust-store path building are tracked as Phase 2.

Rust extension changes (rust_certinfo/src/lib.rs):
- New analyze_chain(List[bytes]) entry point. Parses the entire chain in
  a single PyO3 call and returns per-cert details plus adjacent-pair
  subject/issuer + SKI/AKI linkage. One Rust call per fetch, not N.
- The base64 crate is gone. extract_public_key_pem now uses an inlined
  ~30-line RFC 4648 encoder. Output is byte-identical and verified by a
  regression test. Final Rust deps: pyo3 + x509-parser only.
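
For a sense of scale, a padded RFC 4648 encoder fits in well under 30 lines. The sketch below is an illustration of the technique, not the code from pem.rs; the name `base64_encode` is an assumption, and PEM line wrapping is left out.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard padded base64 (RFC 4648, section 4).
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32;
        // missing bytes are treated as zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Emit four 6-bit groups, replacing groups that carry no input with '='.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}
```

The test vectors in RFC 4648 section 10 ("f", "fo", "foo", and so on) are a convenient regression check for any inlined encoder of this kind.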

Python layer:
- SSLHandler.fetch_raw_cert now also returns chain_der + chain_error,
  populated via SSLSocket.get_verified_chain() on Python 3.13+ and the
  stable _sslobj.get_unverified_chain() fallback on 3.10–3.12. Returns
  a clear error on 3.8/3.9 — the rest of the library stays 3.8-compatible.
- core._fetch_raw_cert calls analyze_chain once on fetch and caches the
  result as cert_data["chain_analysis"], so re-running validators is free.
- New ChainValidator with four keyword-only args: min_chain_length,
  require_root_in_chain, allow_self_signed_leaf, weak_signature_algorithms.
- Roles ("leaf" / "intermediate" / "root") are assigned by structural
  property, not by position: a cert is only labeled "root" when it is
  actually self-signed. Cross-signed roots (Cloudflare/SSL.com cross-
  signed by Comodo, Google's GTS Root cross-signed by GlobalSign, etc.)
  are correctly labeled "intermediate".

Tests (407 passing, 98.77% coverage):
- 22 ChainValidator unit tests using synthetic chain_analysis dicts for
  full branch coverage without bit-rot from cert expiry.
- 16 Rust binding tests (analyze_chain shape, ordering, weak-signature
  detection, invalid-DER handling) against a real captured chain in
  tests/fixtures/, asserting only time-insensitive properties.
- 5 SSL handler tests covering the 3.13 public API, 3.10–3.12 _sslobj
  fallback, and 3.8/3.9 unsupported-version paths (plus exception cases).
- 3 core tests for the analyze_chain success / exception / no-chain
  branches in _fetch_raw_cert.
- chain.py at 99% coverage; ssl_handler.py at 95%.

Tooling:
- scripts/bench_chain.py — opt-in local benchmark with a microbench of
  analyze_chain (~400us / call on a 3-cert chain) and a 101-host pipeline
  test against stable public hosts. Verified 100/100 successful chains
  validate correctly across diverse CAs and chain depths in ~16s wall
  clock at concurrency 20.

Docs:
- docs/validators/chain.md, mkdocs nav entry, README "Available
  Validators" row, CHANGELOG Unreleased entry covering the new validator
  and the base64 dep removal.

Closes #14

* chore: ruff-format scripts/bench_chain.py

The bench script was added after the last full make test run so it never
got formatted. Pure whitespace/wrapping changes — no behavior change.

* feat: replace x509-parser with in-tree minimal DER parser (#22) (#24)

* feat: replace x509-parser with in-tree minimal DER parser (#22)

Closes #22. Replaces the ~28-crate x509-parser dependency tree with
~1500 lines of strict-DER, no-unsafe, panic-free in-tree parser code
scoped to exactly what certinfo exposes to Python. Final Rust dep tree
shrinks from 48 crates to 20 — every remaining crate is pyo3 or a pyo3
build-time helper.

Module structure (modern Rust 2018+ layout, no mod.rs):

  rust_certinfo/src/
    lib.rs           PyO3 module + thin entry-point shim
    error.rs         ParseError enum (no panics)
    pem.rs           Inlined RFC 4648 b64 + PEM wrap
    pyobj.rs         PyO3 dict converters
    der.rs / der/    ASN.1 primitive layer (reader, oid, time,
                     string, tag) — knows nothing about X.509
    x509.rs / x509/  X.509 layer (certificate, name, spki,
                     algorithm, extensions) — built on der/

The DER primitive layer is a clean reusable foundation for future
X.509-adjacent capabilities: SAN parsing for non-leaf certs,
authorityInfoAccess (OCSP/AIA URLs), cRLDistributionPoints,
extendedKeyUsage / keyUsage, certificate policies for EV detection,
CRL parsing, OCSP request/response parsing, CSR parsing. Each is a
matter of adding a single function under x509/ — the der/ layer
requires no changes for any of the above.

Strict-DER security guarantees, all enforced at the crate level:

  * #![forbid(unsafe_code)] at lib.rs root
  * Every parser path returns Result<_, ParseError> — no panics on
    malformed input
  * Reject indefinite-length encoding (BER-only, illegal in DER)
  * Reject non-canonical length encoding (long form when short would
    suffice; long form with leading zero bytes)
  * Bounds checks against the parent slice on every read
  * 56 in-module Rust unit tests covering every public function,
    every ParseError construction path, and DER edge cases
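
The length rules in that list can be made concrete with a small sketch. `parse_der_length` and `LengthError` are hypothetical names, and the real parser reports failures through `ParseError`; the rejection rules themselves follow the DER restrictions named above.

```rust
#[derive(Debug, PartialEq)]
enum LengthError {
    Truncated,
    Indefinite,
    NonCanonical,
    TooLarge,
}

/// Parse a DER length at the start of `input`.
/// Returns (content length, number of length octets consumed).
fn parse_der_length(input: &[u8]) -> Result<(usize, usize), LengthError> {
    let first = *input.first().ok_or(LengthError::Truncated)?;
    if first < 0x80 {
        return Ok((first as usize, 1)); // short form
    }
    if first == 0x80 {
        return Err(LengthError::Indefinite); // BER only, illegal in DER
    }
    let n = (first & 0x7f) as usize; // count of length octets that follow
    if n > std::mem::size_of::<usize>() {
        return Err(LengthError::TooLarge);
    }
    // `get` returns None instead of panicking when the octets are missing.
    let octets = input.get(1..1 + n).ok_or(LengthError::Truncated)?;
    if octets[0] == 0 {
        return Err(LengthError::NonCanonical); // leading zero octet
    }
    let len = octets.iter().fold(0usize, |acc, &b| (acc << 8) | b as usize);
    if len < 0x80 {
        return Err(LengthError::NonCanonical); // short form would suffice
    }
    Ok((len, 1 + n))
}
```

This only decodes the length. A caller would still compare the returned length against the bytes remaining in the parent slice before reading the content, which is the bounds check the list refers to.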

In addition to dropping x509-parser, this PR fixes two latent bugs
discovered during the rewrite:

  * EC `curve` field now contains the curve OID. Previous builds put
    the algorithm OID `1.2.840.10045.2.1` (id-ecPublicKey) into the
    field literally named `curve`. The new parser extracts the curve
    OID from algorithm.parameters and emits e.g. `1.2.840.10045.3.1.7`
    for P-256, `1.3.132.0.34` for P-384, `1.3.132.0.35` for P-521.
  * RSA modulus bit length is no longer over-counted by 8 bits. The
    previous build computed `modulus.len() * 8` from x509-parser,
    which leaves the DER-mandated leading-zero sign byte in modulus.
    Real-world RSA-2048 / 3072 / 4096 keys were being reported as
    2056 / 3080 / 4104. The new parser strips the sign byte and
    reports the canonical 2048 / 3072 / 4096.

Both fixes are visible behavioral changes for any caller reading
`public_key_info["curve"]` or `public_key_info["size"]` literally.
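
The modulus fix comes down to one sign octet. A minimal sketch, assuming the parser works on the content octets of the DER INTEGER; `modulus_bits` is a hypothetical name, and the real implementation may count bits differently.

```rust
/// Bit length of an RSA modulus, given the content octets of its DER INTEGER.
fn modulus_bits(integer_content: &[u8]) -> usize {
    // DER INTEGERs are signed, so a positive value whose top bit is set
    // carries one leading 0x00 octet. That octet is not part of the modulus.
    let magnitude = match integer_content {
        [0x00, rest @ ..] if !rest.is_empty() => rest,
        other => other,
    };
    magnitude.len() * 8
}
```

For a 2048-bit key the INTEGER content is 257 octets, so the naive `len() * 8` gives 2056 while the stripped form gives 2048, matching the figures quoted above.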

Test coverage:

  * 425 Python tests passing (was 407 — 18 added in the new corpus
    snapshot test), 99% line coverage
  * 56 Rust unit tests passing
  * tests/test_certinfo_corpus.py — new snapshot test that runs every
    public certinfo entry point against 130 unique real-world certs
    captured from the bench host list. Asserts RSA bit lengths are
    canonical, EC curves resolve to real curve OIDs (catches the
    fixed bug as a regression), validity timestamps are sane, all
    DNs decode, and SPKI extraction round-trips.
  * tests/fixtures/diff_corpus/ — 130 captured DER fixtures from
    101 stable public hosts spanning Google Trust Services, DigiCert,
    Let's Encrypt, Sectigo, ISRG, SSL.com, Cloudflare-fronted certs,
    etc. Roughly 50/50 RSA/EC.

Manual fuzzing gate at rust_certinfo/fuzz/ (cargo fuzz target +
README). Not in CI (nightly + slow); release-time pre-merge gate.

Other changes:

  * Cargo.toml crate-type now declares ["cdylib", "rlib"]. The cdylib
    is the same Python wheel maturin has always built; the additional
    rlib lets the in-repo fuzz crate link against the parser as a
    normal Rust library. No published-wheel surface change.
  * certinfo::Certificate::from_der and certinfo::ParseError are now
    pub at the crate root for the fuzz crate and any future in-tree
    Rust consumer. The PyO3 boundary and Python-facing API are
    unchanged.

Closes #22

* chore: drop fuzz harness from rewrite PR; track in #25

Removes rust_certinfo/fuzz/ from this PR. The fuzz crate was added in
the original commit but introduced a build issue: linking the certinfo
crate as an rlib (which the fuzz crate needed) made cargo clippy and
cargo test on macOS and Windows try to fully resolve Python symbols at
link time. PyO3's extension-module feature defers Python symbol
resolution to runtime, which works for the cdylib wheel target but not
for an rlib-linked test binary.

Reverts to crate-type = ["cdylib"] only, drops the pub re-exports of
Certificate and ParseError that were added solely for the fuzz crate's
benefit, and removes the rust_certinfo/fuzz/ directory.

The fuzz harness work is tracked in #25 with three implementation
options for the next iteration. The corpus snapshot test in
tests/test_certinfo_corpus.py continues to provide real-input
regression coverage on every CI run; fuzzing is the deeper hardening
gate that we'll add as a focused follow-up once we pick a build
strategy.

* docs: state the zero-dep guarantee in Cargo.toml

Mirrors the "No runtime dependencies; standard library only" comment
in pyproject.toml so the same promise is visible from the Rust side.
pyo3 is called out as the one required dep — it's the Python bridge,
not a parser dependency — and the new in-tree parser is named so a
reader knows where to look.

* feat: fuzz harness for in-tree DER parser (#25) (#26)

* feat: fuzz harness for in-tree DER parser (#25)

* fix: emit link-arg=-undefined/dynamic_lookup on macOS for test bins

* chore: silence unused-import warnings under --no-default-features

* docs: add 'Why Trust CertMonitor' section to README

* Release 0.3.0

Zero-dependency milestone: in-tree DER/X.509 parser replaces
x509-parser, chain validator, fuzz harness, two bug fixes.

See CHANGELOG.md for the full release notes.

---------

Co-authored-by: Chris Tomkins <80041880+cdtomkins@users.noreply.github.com>