v0.3 closed: conformance task — DFG fitness × precision → F#11
Closed
protosphinx wants to merge 2 commits into
Conversation
- score_conformance — pure CPython, no pm4py dep. F = 2fp/(f+p)
where f and p are computed from set overlap of the submitted DFG
and the test-partition DFG
- pm_bench/conformance.py — extract_dfg, write/read_model_json.
Submission format: {"transitions": [["a","b"], ...]}
- New CLI verb `pm-bench discover <name> --baseline dfg --out
model.json` — discovers a DFG from training cases. Score path
takes --dataset and --split (instead of --prefixes) since the
model is global, not per-prefix
- leaderboard/conformance/synthetic-toy.json with dfg-ref entry
(F=0.857, fitness 1.0, precision 0.75); pm-bench leaderboard
--all now walks 4 boards
- leaderboard.py + CLI standings printer + STANDINGS.md learn the
conformance column set
- 11 new tests (test_conformance.py); 108 total, ruff clean
- v0.3 (5-task scoring) closed: every task has a baseline + entry
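For reviewers who want to sanity-check the numbers, the scoring rule described above can be sketched in a few lines. This is an illustration only, not the code in pm_bench/conformance.py: the names `sketch_extract_dfg` and `sketch_score` are hypothetical, the fitness and precision definitions (overlap over test-DFG size, overlap over submitted-DFG size) are an assumption read off the description, and the traces are made-up toy data chosen so the ratios match the dfg-ref entry.

```python
import json


def sketch_extract_dfg(traces):
    """Collect the set of directly-follows pairs (a, b) over all traces."""
    dfg = set()
    for trace in traces:
        dfg.update(zip(trace, trace[1:]))
    return dfg


def sketch_score(submitted, reference):
    """Set-overlap fitness/precision and their harmonic mean.

    Assumed definitions (not confirmed against the real implementation):
      fitness   = |submitted & reference| / |reference|
      precision = |submitted & reference| / |submitted|
    Empty sets and zero overlap score 0.0 (also an assumption).
    """
    overlap = len(submitted & reference)
    fitness = overlap / len(reference) if reference else 0.0
    precision = overlap / len(submitted) if submitted else 0.0
    total = fitness + precision
    f_score = 2 * fitness * precision / total if total else 0.0
    return {"fitness": fitness, "precision": precision, "f_score": f_score}


# Made-up toy logs, NOT the synthetic-toy dataset.
train_traces = [["a", "b", "c", "d"], ["a", "c", "d"]]
test_traces = [["a", "b", "c", "d"]]

model_dfg = sketch_extract_dfg(train_traces)   # 4 transitions
test_dfg = sketch_extract_dfg(test_traces)     # 3 transitions

# Submission format from the description: {"transitions": [["a","b"], ...]}
payload = json.dumps({"transitions": sorted(list(p) for p in model_dfg)})
loaded_dfg = {tuple(p) for p in json.loads(payload)["transitions"]}

result = sketch_score(loaded_dfg, test_dfg)
print(result)
```

Here the model covers every test transition (fitness 1.0) but carries one transition the test never uses (precision 3/4), which gives F = 2 * 1.0 * 0.75 / 1.75, about 0.857.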
No semantic change; ASCII-only punctuation across READMEs, GOALS, source comments, doctests, and config. Verified by running the existing test suite (no test asserts on em-dash text).
This was referenced May 1, 2026
Merged into main as part of the audit-cleanup stack (commit 9c00b47). The full content of this PR is now on main.
Stacked on top of #10 (CSV ingest). Merge order: #2 → #3 → #4 → #5 → #6 → #7 → #8 → #9 → #10 → this.
Summary
[discovery] extra.
What's new
- `score_conformance`: F = 2fp / (f+p), where f and p are computed from set overlap of the submitted DFG vs the test-partition DFG. Catches both the "model too small" (loses fitness) and "model too big" (loses precision) failure modes.
- `pm_bench/conformance.py`: DFG extraction, model JSON r/w. Submission format is a JSON file with a `transitions` list of `[a, b]` pairs.
- `pm-bench discover`: `discover <name> --baseline dfg --out model.json` discovers a DFG from training cases. Score path takes `--dataset` + `--split` (not `--prefixes`) since the model is global, not per-prefix.
- `leaderboard/conformance/synthetic-toy.json` with the `dfg-ref` entry: F=0.857 (fitness 1.0, precision 0.75; the model covers all test transitions but carries 2 extras the test never uses).
- `leaderboard.py`, the CLI standings printer, and the `STANDINGS.md` markdown all learn the conformance column set. `pm-bench leaderboard --all --verify` now walks 4 boards.
- 11 new tests (`test_conformance.py`): extract_dfg, every score corner case (perfect/tiny/big/disjoint), JSON round-trip, board verification, e2e CLI. 108 total.
Smoke
Test plan
- `pytest -q`: 108 passed (was 97 on #10, CSV ingest)
- `ruff check pm_bench tests`: clean
Roadmap impact