Status: v0.1.0a3 — Pre-Alpha. Engineering prototype, not a formal verifier and not intended for production use. APIs may break without notice until v0.2.
Scope: streaming machine-learning inference verification only. No LLM agent oversight, no formal proof of correctness, no security guarantee.
conformlock is a small CPU-only Python library that wraps a streaming ML predictor
with three runtime checks and a tamper-evident audit log:
- Split conformal prediction (and an adaptive variant in the style of Gibbs & Candès 2021) for per-decision prediction intervals / sets.
- Finite-trace temporal-logic property automata (LTLf, evaluated incrementally) for "the system did X before Y" rules over the recent decision stream.
- Online drift detectors (ADWIN, CUSUM, Page-Hinkley, sliding KS, PSI) to flag when calibration is no longer trustworthy.
- Append-only audit ledger using BLAKE3 hash chaining + ULID identifiers, so a verifier can later show that a recorded decision was not edited after the fact. The ledger is tamper-evident only against in-place edits under a single-writer assumption: an attacker who can replace the whole file undetected, or two concurrent writers across processes, can rewrite history. External-verifiability anchoring (Sigstore/Rekor or a public chain) is on the v0.2 roadmap. Within a single process, `Ledger.append` is thread-safe (added in v0.1.0a3).
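The hash-chaining idea in the ledger bullet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not conformlock's `Ledger`: it uses `hashlib.blake2b` from the standard library as a stand-in for BLAKE3, omits ULIDs, and the names `GENESIS`, `append_entry`, and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical sentinel used as the first entry's predecessor


def _digest(prev_hash: str, payload: dict) -> str:
    # Stand-in hash: blake2b from the stdlib, NOT the BLAKE3 the library uses.
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.blake2b(prev_hash.encode("ascii") + body, digest_size=32).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    # Each entry commits to its payload and to the hash of the entry before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)})


def verify_chain(chain: list) -> bool:
    # Recompute every link; any in-place edit breaks the stored hash.
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"record_id": "r1", "action": "accept"})
append_entry(ledger, {"record_id": "r2", "action": "abstain"})
intact = verify_chain(ledger)

ledger[0]["payload"]["action"] = "abstain"  # simulate an in-place edit
tampered_detected = not verify_chain(ledger)
```

An attacker who rewrites the payload and then recomputes every hash would still pass `verify_chain`, which is the whole-file replacement caveat described in the ledger bullet.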
If any check rejects a decision, conformlock returns an abstain verdict instead of the model's prediction; the caller decides what to do (escalate to a human, return a default, retry, etc.).
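As a rough illustration of how a conformal check can lead to an abstain verdict, the sketch below computes the textbook split-conformal threshold and compares a new score against it. It assumes nonconformity scores where larger means stranger; the function names and the "abstain when the score exceeds the threshold" rule are assumptions for illustration, not conformlock's documented behaviour.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested coverage level.
        return math.inf
    return sorted(cal_scores)[rank - 1]


def verdict_for(score, threshold):
    # Assumed rule: abstain when the new score is stranger than the threshold.
    return "abstain" if score > threshold else "accept"


cal_scores = [0.05, 0.08, 0.12, 0.22, 0.31, 0.09, 0.15, 0.27, 0.11, 0.19]
threshold = conformal_threshold(cal_scores, alpha=0.2)
```

Here n = 10 and alpha = 0.2 give rank 9, so the threshold is the 9th smallest score, 0.27. Under this textbook formula, alpha = 0.1 needs at least 9 calibration scores for a finite threshold; with only five, the rank is 6 > 5 and the threshold is infinite, so nothing would be rejected. Whether `ConformalCalibrator` treats that case the same way is not stated in this README.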
| Is | Is not |
|---|---|
| An engineering convenience layer over well-known statistical methods | A formal verification system |
| CPU-only, deterministic given seeds | A guarantee of safety, correctness, or compliance |
| Useful for streaming tabular / time-series inference | An LLM agent oversight tool (see Subjunctor for that scope) |
| A starting point for runtime monitoring policies | A drop-in replacement for human review |
The word "lock" in the name is metaphorical; this library does not lock anything down.
```shell
# Not on PyPI yet — install from the GitHub release tag.
pip install "conformlock @ git+https://github.com/hinanohart/conformlock@v0.1.0a3"
# or with notebook ML extras:
pip install "conformlock[ml] @ git+https://github.com/hinanohart/conformlock@v0.1.0a3"
```

Extras: `[ml]` adds scikit-learn and torch (notebook examples only); the core has no ML framework dependency.
```python
import numpy as np

from conformlock import (
    ConformalCalibrator,
    Decision,
    LTLfSpec,
    Verifier,
)

# 1. Calibrate split conformal on a held-out set.
cal = ConformalCalibrator(alpha=0.1)  # target 90% coverage
cal.fit(scores=np.array([0.12, 0.08, 0.31, 0.05, 0.22]))

# 2. Write the LTLf rule:
#    "if a 'risky' decision is made, an 'audit' decision must follow within 5 steps".
spec = LTLfSpec.parse("G (risky -> F[0,5] audit)")

# 3. Build a verifier. The calibrator already carries the target ``alpha``;
#    the verifier just orchestrates conformal → property → drift.
v = Verifier(calibrator=cal, spec=spec)

# 4. Use it on the streaming predictor. ``stream`` is whatever your caller
#    yields; one ``Decision`` per inference is the contract.
for record_id, score, atom in stream:  # e.g. atom in {"risky", "audit", ""}
    decision = Decision(
        record_id=record_id,
        score=score,
        atoms=frozenset({atom}) if atom else frozenset(),
    )
    verdict = v.observe(decision)
    if verdict.action == "abstain":
        handle_escalation(verdict)  # e.g. route to a human reviewer
    else:
        act_on(verdict)  # the caller decides what acting means
```

Numbers in the snippet above are illustrative inputs, not benchmark claims; `stream`, `handle_escalation`, and `act_on` are placeholders the caller supplies. See `examples/` for runnable scripts with deterministic seeds and printable output, and `tests/test_readme_example.py` for the CI-enforced regression test that runs this exact snippet end-to-end.
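To make the rule `G (risky -> F[0,5] audit)` concrete, here is a minimal pure-Python sketch that re-checks a finite trace against that one bounded-response pattern. It is not conformlock's LTLf evaluator: the function name `check_bounded_response`, the three string verdicts, and the reading of `F[0,5]` as "within steps i to i + 5, inclusive" are assumptions made for illustration.

```python
def check_bounded_response(trace, trigger="risky", response="audit", bound=5):
    """Classify a finite trace against G (trigger -> F[0,bound] response).

    Returns "violated" if some trigger saw no response within `bound` steps,
    "pending" if a trigger is still inside its window when the trace ends,
    and "ok" otherwise.
    """
    pending = False
    for i, atoms in enumerate(trace):
        if trigger not in atoms:
            continue
        window = trace[i : i + bound + 1]  # steps i .. i + bound, inclusive
        if any(response in step for step in window):
            continue
        if len(window) == bound + 1:
            return "violated"  # full window elapsed without a response
        pending = True  # trace ended before the deadline
    return "pending" if pending else "ok"


empty = frozenset()
on_time = [frozenset({"risky"}), empty, empty, frozenset({"audit"})]
too_late = [frozenset({"risky"})] + [empty] * 6 + [frozenset({"audit"})]
still_open = [frozenset({"risky"}), empty]

results = [check_bounded_response(t) for t in (on_time, too_late, still_open)]
```

The three traces evaluate to "ok", "violated", and "pending" respectively. Under strict finite-trace semantics a trace that ends with an unanswered trigger does not satisfy the formula; reporting it as "pending" is a runtime-monitoring convention for a stream that has not finished yet.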
- It does not prove that the underlying model is calibrated, fair, or correct.
- It does not claim coverage guarantees in the strict statistical sense once the data-generating distribution drifts; drift detection only tells you that the exchangeability assumption behind those guarantees has likely broken.
- It does not verify LLM agents, function-call traces, or tool use — see Subjunctor.
- It is not evaluated against any regulatory certification scheme.
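The second point above depends on a drift detector raising a flag. As a sketch of what such a detector does, here is a textbook Page-Hinkley test for an upward shift in the mean of a stream, applied to a synthetic stream of scores. The class below, its default `delta` and `threshold`, and the choice to monitor raw scores are assumptions for illustration and are not taken from conformlock's implementation.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an increase in the running mean."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude per step
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, x):
        """Consume one observation; return True if drift is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n           # running mean
        self.cumulative += x - self.mean - self.delta   # cumulative deviation
        self.minimum = min(self.minimum, self.cumulative)
        return self.cumulative - self.minimum > self.threshold


# Synthetic stream: 50 stable scores, then a jump in level.
stream = [0.1] * 50 + [1.0] * 50
detector = PageHinkley()
first_alarm = next((i for i, x in enumerate(stream) if detector.update(x)), None)

# A stream with no change should never raise an alarm.
stable = PageHinkley()
stable_alarms = [stable.update(x) for x in [0.1] * 100]
```

The alarm fires a few steps after the jump rather than at it, which is the usual trade-off between detection delay and false alarms controlled by `threshold`.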
| Library | Latest release | Streaming online conformal | LTLf/MTL property layer | Tamper-evident ledger | License |
|---|---|---|---|---|---|
| MAPIE | v1.4.0 (2026-04-30) | Partial (batch time-series only, no ACI) | No | No | BSD-3-Clause |
| crepes | active | No streaming hook | No | No | BSD-3-Clause |
| nonconformist | maintenance only | No | No | No | MIT |
| conformlock | v0.1.0a3 (2026-05-24) | Yes (split CP + ACI; ACI advances through `Verifier.record_outcome`) | Yes (self-implemented LTLf, finite-trace re-evaluator; no DFA pre-compile) | Yes (BLAKE3 chain; single-writer threat model, see the ledger note above) | Apache-2.0 |
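
The table mentions that ACI advances through `Verifier.record_outcome`. The sketch below shows only the update rule published by Gibbs & Candès (2021), not that method's signature or behaviour: after each observed outcome, the working miscoverage level moves up when the interval covered the truth and down when it missed. The function name `aci_step` and the step size `gamma=0.01` are hypothetical choices for illustration.

```python
def aci_step(alpha_t, target_alpha, miss, gamma=0.01):
    """One ACI update: alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t)."""
    err_t = 1.0 if miss else 0.0  # 1 when the true outcome fell outside the set
    return alpha_t + gamma * (target_alpha - err_t)


alpha_t = 0.1
history = []
# Two covered outcomes followed by three misses.
for miss in [False, False, True, True, True]:
    alpha_t = aci_step(alpha_t, target_alpha=0.1, miss=miss)
    history.append(alpha_t)
```

Covered outcomes nudge the level above 0.1 (narrower intervals), while each miss pulls it down by `gamma * 0.9` (wider intervals); after the sequence above the level sits near 0.075. The level can drift outside [0, 1] under long runs of misses or hits, so a real implementation has to decide how to handle those values.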
The combination split conformal + temporal-logic monitor + drift + ledger is the unit of value conformlock claims to add; on any single one of those four axes, several mature OSS projects already exist and conformlock is not trying to replace them:
- alibi-detect — drift and outlier detection (no conformal, no temporal logic, no ledger).
- Frouros — 31 drift-detection methods (no conformal, no temporal logic, no ledger).
- NannyML — drift + performance estimation under absent labels (no conformal interval, no temporal logic, no ledger).
- Evidently, deepchecks, whylogs — ML-monitoring dashboards (no per-decision conformal interval, no temporal logic).
- river — online learning primitives (no conformal, no ledger).
If you only need the drift axis you should look at those first; conformlock exists for the case where you specifically need the per-decision conformal abstain + finite-trace temporal-property monitor + tamper-evident log together. No equivalent OSS combination was found at scaffold time (2026-05-24); please file an issue if one exists.
The EU AI Act (Article 15 — Accuracy, Robustness and Cybersecurity, enforceable for high-risk AI systems from 2 August 2026) requires high-risk systems to "achieve an appropriate level of accuracy, robustness and cybersecurity, and perform consistently … throughout their lifecycle," and to disclose accuracy metrics in the instructions for use.
conformlock is designed with reference to that text: it gives operators a programmatic way to produce per-decision uncertainty bounds, detect distributional drift, and retain an audit log. It does not by itself make any AI system "Article 15 compliant"; compliance is an organisational and process determination that an operator's notified body or supervisory authority makes.
Footnote: the phrase "designed with reference to" has not been reviewed by EU legal counsel. We use it as a documentation citation, not as a compliance attestation. If you need a legal opinion on whether shipping `conformlock` inside your product changes your Article 15 posture, ask a regulatory lawyer, not the README.
Similarly, the library is not certified against ISO/IEC 23894:2023, NIST AI RMF, FDA SaMD Good Machine Learning Practice, or any other framework. We deliberately avoid the marketing register that would imply such a posture (see docs/honest-marketing-policy.md for the exact CI-enforced exclusion list).
conformlock v0.1.0a1 and v0.1.0a3 were assembled by an LLM-driven autonomous workflow under the project author's account (hinanohart). The only human in the loop is the author; no third party had independently reviewed the implementation or the marketing claims at release time. The git tag v0.1.0a3 supersedes v0.1.0a1, which is retained on GitHub solely so the audit trail remains intact; new users should install v0.1.0a3 and treat the whole v0.1.0 alpha line as unreviewed until an external reviewer signs off. Issues filed against either tag are welcome.
- Vovk, Gammerman, Shafer — Algorithmic Learning in a Random World (Springer 2005) — split / inductive conformal prediction.
- Gibbs & Candès (2021) — Adaptive Conformal Inference Under Distribution Shift (NeurIPS 2021) — ACI update rule.
- De Giacomo & Vardi (2013) — Linear Temporal Logic and Linear Dynamic Logic on Finite Traces (IJCAI 2013) — LTLf semantics.
- Bauer, Leucker, Schallhart (2011) — Runtime Verification for LTL and TLTL (ACM TOSEM 20(4)) — three-valued LTL₃ semantics; this library's permanent-verdict heuristic follows the same spirit.
- Lindemann, Qin, Fan, Pappas, Bastani (2022) — Conformal Prediction for STL Runtime Verification (arXiv:2211.01539) — academic predecessor targeting STL (continuous-time temporal logic); `conformlock` targets LTLf (discrete-step finite-trace) and ships a public Apache-2.0 implementation.
- Bifet & Gavaldà (2007) — Learning from Time-Changing Data with Adaptive Windowing — ADWIN.
- O'Connor, Aumasson, Neves, Wilcox-O'Hearn — BLAKE3 (2020) — hash function used for the ledger.
```text
conformlock/
├─ src/conformlock/   # core library (numpy + scipy + blake3 + python-ulid)
├─ tests/             # unit + property + ledger-tamper tests
├─ examples/          # runnable CPU-only examples
├─ notebooks/         # optional [ml] extra: scikit-learn / torch
├─ docs/              # background and design notes
├─ CHANGELOG.md
├─ ROADMAP.md
└─ LICENSE            # Apache-2.0
```
See ROADMAP.md. Highlights: a more general MTL fragment, an offline conformlock-replay tool to re-verify an existing ledger, and HuggingFace dataset bindings (deferred to v0.1.1).
The names verielle and decisionscope appear in the roadmap as v0.3 backlog items, not promises.
This is a pre-alpha personal project. Issues with reproducible steps and small, focused PRs are welcome; please open an issue first for anything larger than a typo.
Apache License, Version 2.0 — see LICENSE.