[Lean Squad] feat(fv): Task 3+9 — Lean 4 formal spec for TreeNodeFilter.MatchFilterPattern + CI fix#8111
Conversation
…ask 3) 🔬 Lean Squad — formal verification artifact Target: TreeNodeFilter.MatchFilterPattern (Target #7, Phase 2→3) File: formal-verification/lean/FVSquad/TreeNodeFilter.lean ## What's included - FilterExpr inductive type modelling the C# FilterExpression hierarchy (leaf/nop/and/or/not/withProps constructors) - Mutual recursive evaluators: evalFilter, evalFilterAll, evalFilterAny mirroring the MatchFilterPattern switch statement - 12 definitional @[simp] equation lemmas (proved via rw [*.eq_def]) - 2 De Morgan helper lemmas (proved by induction on List FilterExpr) - 12 Boolean-algebra theorems B1-B12 from the informal spec (all proved, no sorry) - 7 additional structural properties (proved) ## Proof status All 21 theorems are proved. No sorry. lake build passes clean. ## Approximations - matchesGlob is an opaque Bool-valued axiom (real impl uses Regex) - Property matching abstracted as String -> Bool inside withProps - MatchesFilter (public entry) not modelled in this PR ## Infrastructure changes - lakefile.toml: removed Mathlib dependency (firewall blocks cache) - lean-toolchain: updated v4.14.0 -> v4.29.1 (stable) - FVSquad.lean: root module file created Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… sorry-checker step - Change elan install from v4.2.1 (non-existent) to v3.1.0 (verified) - Remove sha256sum check (no .sha256 file published for v3.1.0); add file-size sanity check instead - Include lake-manifest.json in cache key for better cache invalidation - Add 'Check for sorry' step that lists any unfinished proof stubs - Improve proof summary: add proof-status row and fix sorry grep pattern - Update README to reflect current toolchain (v4.29.1, no Mathlib) and current FVSquad targets table Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR advances the formal-verification (Lean 4) effort by adding a Lean specification/proof module for TreeNodeFilter.MatchFilterPattern and updating the Lean CI workflow/tooling docs to build and report proof status in GitHub Actions.
Changes:
- Add
FVSquad.TreeNodeFilterLean module modeling filter-expression evaluation plus accompanying theorems. - Update Lean project plumbing (root module import, toolchain pin, and manifest) and refresh the Lean README.
- Adjust
.github/workflows/lean-proofs.ymlto install a pinnedelanversion, update caching, and add a “sorry” scan + proof summary output.
Show a summary per file
| File | Description |
|---|---|
formal-verification/lean/README.md |
Updates Lean project structure/build instructions and adds a targets/status table. |
formal-verification/lean/lean-toolchain |
Pins the Lean toolchain to leanprover/lean4:v4.29.1. |
formal-verification/lean/lake-manifest.json |
Adds the Lake dependency manifest to support reproducibility/caching. |
formal-verification/lean/FVSquad/TreeNodeFilter.lean |
New Lean formal spec + theorems for filter evaluation semantics. |
formal-verification/lean/FVSquad.lean |
Adds root module importing the new TreeNodeFilter module. |
.github/workflows/lean-proofs.yml |
Updates CI cache key, pinned elan install logic, and adds sorry/proof summary reporting. |
Copilot's findings
Comments suppressed due to low confidence (1)
.github/workflows/lean-proofs.yml:127
- In the proof summary,
SORRY_COUNTuses\bsorry\b, which won’t act as a word-boundary with defaultgrepand can undercount to 0 even whensorryexists. Also,THEOREM_COUNTonly matches lines starting withtheorem|lemma, so it won’t count declarations preceded by attributes like@[simp] theorem, making the reported theorem total misleading once such declarations exist (as in the new TreeNodeFilter.lean).
LEAN_DIR="formal-verification/lean/FVSquad"
THEOREM_COUNT=$(grep -rEc '^(theorem|lemma) ' "${LEAN_DIR}" --include='*.lean' 2>/dev/null || echo 0)
SORRY_COUNT=$(grep -rc '\bsorry\b' "${LEAN_DIR}" --include='*.lean' 2>/dev/null \
| awk -F: '{sum += $2} END {print sum+0}')
- Files reviewed: 6/6 changed files
- Comments generated: 2
| SORRY_FILES=$(grep -rl '\bsorry\b' "${LEAN_DIR}" --include='*.lean' 2>/dev/null || true) | ||
| if [ -n "${SORRY_FILES}" ]; then | ||
| echo "⚠️ Files containing sorry (unfinished proofs):" | ||
| echo "${SORRY_FILES}" | ||
| grep -rn '\bsorry\b' "${LEAN_DIR}" --include='*.lean' |
| | File | Target | Phase | Theorems | sorry | | ||
| |------|--------|-------|----------|-------| | ||
| | `FVSquad/TreeNodeFilter.lean` | `TreeNodeFilter.MatchFilterPattern` | 3 | 21 | 0 | |
Evangelink
left a comment
There was a problem hiding this comment.
Summary
Workflow: PR Nitpick Reviewer
Date: 2026-05-11
Repository: microsoft/testfx
Overview
This is a well-structured PR adding a Lean 4 formal specification for TreeNodeFilter.MatchFilterPattern alongside CI improvements. The Lean file is thorough — 21 theorems covering B1–B12 Boolean-algebra invariants plus structural properties, all with zero sorry. The CI changes are pragmatic (downgrading to elan v3.1.0 with a file-size sanity check instead of the missing SHA-256).
Key Findings
- Terminology mismatch (
TreeNodeFilter.lean:16) — Comments say "Regex matching" but the abstraction is namedmatchesGlob, creating ambiguity about what is being modelled. - Naming inconsistency (
TreeNodeFilter.lean:38, 48) — Constructor parameterpropPredin the type definition becomesfin the pattern match;valuefor a sub-expression is also ambiguous. - Magic number (
lean-proofs.yml:83) —1000000byte threshold lacks a named constant or comment quantifying expected archive size. - Unused
GITHUB_OUTPUT(lean-proofs.yml:115) —sorry_foundis written to step outputs but never consumed downstream; either wire it to a conditional failure or drop the output infrastructure. - Misleading "auto-generated" label (
FVSquad.lean:2) — The root module appears to be hand-maintained, making the label potentially confusing for future contributors.
Positive Highlights
- Clean mutual-recursion design for structural termination — a subtle Lean 4 requirement handled correctly.
- Comprehensive
@[simp]definitional lemmas make the main theorems very readable. - The precedence pitfall (
=vs&&/||) is well-documented in the file header and per-theorem notes. - The CI sorry-checker is a nice addition for automated proof-completeness tracking.
🔍 Meticulously inspected by PR Nitpick Reviewer
🔍 Meticulously inspected by PR Nitpick Reviewer 🔍
| - Proved theorems for Boolean-algebra invariants B1-B12 from informal spec | ||
|
|
||
| ## Approximations / limitations | ||
| 1. Regex matching is abstracted as opaque Bool function `matchesGlob`. |
There was a problem hiding this comment.
Nitpick (Minor): The comment says "Regex matching is abstracted as opaque Bool function matchesGlob", but the function is named matchesGlob — suggesting a glob/wildcard abstraction, not regex. The C# implementation uses regex under the hood, but the Lean model uses glob semantics. This naming mismatch could confuse readers about what is actually being abstracted. Consider aligning the language: either call it "glob/wildcard matching" or rename the opaque function to matchesRegex to match the C# reality.
| | and (subExprs : List FilterExpr) : FilterExpr | ||
| | or (subExprs : List FilterExpr) : FilterExpr | ||
| | not (inner : FilterExpr) : FilterExpr | ||
| | withProps (value : FilterExpr) (propPred : String -> Bool) : FilterExpr |
There was a problem hiding this comment.
Nitpick (Minor): The constructor parameter is named value but represents a sub-filter expression. The name value is generic and ambiguous — it could mean the evaluated result rather than a sub-FilterExpr. Consider a more descriptive name like subExpr or filterExpr to match the naming used in and/or constructors (subExprs).
| | .and es, s => evalFilterAll es s | ||
| | .or es, s => evalFilterAny es s | ||
| | .not inner, s => ! evalFilter inner s | ||
| | .withProps v f, s => evalFilter v s && f s |
There was a problem hiding this comment.
Nitpick (Minor): The pattern-match variable f doesn't match the parameter name propPred used in the FilterExpr.withProps constructor declaration (line 38). Consistent naming between the type definition and match arms reduces the cognitive overhead of cross-referencing the two. Consider using propPred here as well:
| .withProps v propPred, s => evalFilter v s && propPred s| curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}" | ||
| curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}.sha256" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}.sha256" | ||
| ARCHIVE_SIZE=$(stat -c%s "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}" 2>/dev/null || stat -f%z "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}") | ||
| if [ "${ARCHIVE_SIZE:-0}" -lt 1000000 ]; then |
There was a problem hiding this comment.
Nitpick (Minor): 1000000 (1 MB) is a magic number. While the intent — catching a suspiciously small or empty archive — is clear from the surrounding comment, a named variable would make the threshold explicit and easier to adjust:
MIN_ARCHIVE_BYTES=1000000 # elan archives are typically ~5 MB; 1 MB signals corruption
if [ "\$\{ARCHIVE_SIZE:-0}" -lt "\$\{MIN_ARCHIVE_BYTES}" ]; thenThis is especially helpful since the threshold is the only integrity check replacing the now-absent SHA-256 verification.
| echo "⚠️ Files containing sorry (unfinished proofs):" | ||
| echo "${SORRY_FILES}" | ||
| grep -rn '\bsorry\b' "${LEAN_DIR}" --include='*.lean' | ||
| echo "sorry_found=true" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
Nitpick (Minor): sorry_found is written to $GITHUB_OUTPUT on both branches, but no subsequent step references steps.sorry-check.outputs.sorry_found. If it is only used for human-readable console output, the id: sorry-check and the >> "$GITHUB_OUTPUT" lines are dead infrastructure — misleading future maintainers into thinking some downstream step depends on this value. Either wire it up to a conditional step (e.g. fail the job when sorry_found=true) or remove the id and output lines and keep the step purely informational.
| @@ -0,0 +1,4 @@ | |||
| -- FVSquad: Lean 4 formal verification artifacts for microsoft/testfx | |||
| -- 🔬 Lean Squad — auto-generated | |||
There was a problem hiding this comment.
Nitpick (Minor): The comment -- 🔬 Lean Squad — auto-generated is slightly misleading. This file appears to be hand-crafted (it selectively imports only FVSquad.TreeNodeFilter). If it will be regenerated automatically as targets grow, add a note about what generates it. If it's maintained by hand, drop "auto-generated" to avoid confusion and unintentional overwriting by future tooling.
Evangelink
left a comment
There was a problem hiding this comment.
Summary
Workflow: Expert Code Reviewer
Date: 2026-05-11
Repository: microsoft/testfx
Key Findings
[Security] Weak integrity check on elan binary download (.github/workflows/lean-proofs.yml line 83)
The previous flow fetched a .sha256 companion file from GitHub Releases and verified the archive with sha256sum -c. The new flow replaces this with a file-size sanity check (≥ 1 MB), which provides no cryptographic guarantee. The elan binary is executed directly after download (./elan-init -y), making this a supply-chain risk. Suggest hardcoding the SHA-256 digest of the v3.1.0 binary.
[Correctness] sorry-checker step is informational only — build will not fail on incomplete proofs (.github/workflows/lean-proofs.yml line 115)
The step detects sorry placeholders and sets sorry_found=true output, but neither calls exit 1 nor is its output gated on by any subsequent step. Lean 4's lake build does not fail on sorry by default either, so a PR with incomplete proofs will pass CI. This defeats the purpose of the checker.
Positive Observations
- The Lean 4 formal spec (
FVSquad/TreeNodeFilter.lean) is well-structured with 21 machine-verified theorems covering B1–B12 Boolean algebra invariants and additional structural properties. All proofs use only Lean 4 core tactics (no Mathlib dependency), which is appropriate given CI network constraints. - The mutual recursion block (
evalFilter/evalFilterAll/evalFilterAny) correctly mirrors the structural termination argument from the C# implementation. - Adding
lake-manifest.jsonto the cache key is a correct fix — this ensures cache invalidation when the dependency manifest changes. - The
\bsorry\bword-boundary regex ingrepis an improvement over the previous\<sorry\>ERE boundary, as it works consistently across both GNU and BSD grep.
Recommendations
- Hardcode the SHA-256 hash of the
elan-x86_64-unknown-linux-gnu.tar.gzv3.1.0 archive and restoresha256sum -cverification. - Add
exit 1in the sorry-checker'sif [ -n "${SORRY_FILES}" ]branch (or a downstream gate step) to enforce proof completeness in CI.
Generated by Expert Code Reviewer
🧠 Reviewed by Expert Code Reviewer 🧠
| curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}" | ||
| curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}.sha256" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}.sha256" | ||
| ARCHIVE_SIZE=$(stat -c%s "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}" 2>/dev/null || stat -f%z "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}") | ||
| if [ "${ARCHIVE_SIZE:-0}" -lt 1000000 ]; then |
There was a problem hiding this comment.
[Security] The file-size check (≥ 1 MB) is not a meaningful integrity guarantee — any malicious or corrupted archive that exceeds 1 MB will pass silently.
Mechanism: The original flow fetched a .sha256 companion file from the same GitHub Release and called sha256sum -c to cryptographically verify the binary. That guarantee is now gone. A MITM or a compromised CDN serving a large-but-malicious tarball would not be caught.
Impact: A tampered elan binary is executed directly via ./elan-init -y, which installs toolchain binaries into the runner's ~/.elan path and is subsequently used to build Lean proofs. Supply-chain risk in CI.
Suggestion: Hardcode the known-good SHA-256 digest of elan-x86_64-unknown-linux-gnu.tar.gz for v3.1.0 and verify against it, even if GitHub doesn't publish a companion file:
EXPECTED_SHA256="<precomputed-sha256-of-v3.1.0-archive>"
echo "\$\{EXPECTED_SHA256} \$\{ELAN_TMP_DIR}/\$\{ELAN_ARCHIVE}" | sha256sum -cThis is a one-time cost per elan version bump and provides real supply-chain protection.
| echo "⚠️ Files containing sorry (unfinished proofs):" | ||
| echo "${SORRY_FILES}" | ||
| grep -rn '\bsorry\b' "${LEAN_DIR}" --include='*.lean' | ||
| echo "sorry_found=true" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
[Correctness] The sorry-check step detects sorry stubs but never fails the build — the CI will report green even if proofs are incomplete.
Mechanism: When sorry is found, the step prints a warning and sets sorry_found=true as an output, but there is no exit 1 and no downstream step gates on steps.sorry-check.outputs.sorry_found. The lake build step itself also does not fail on sorry by default in Lean 4 (it emits a warning, not an error).
Impact: A future PR that introduces sorry placeholders (intentional or accidental) will pass CI, undermining the purpose of the checker.
Suggestion: Either add exit 1 inside the if [ -n "\$\{SORRY_FILES}" ] block, or add an explicit gate step after the sorry-check:
- name: Fail on sorry
if: steps.sorry-check.outputs.sorry_found == 'true'
run: |
echo "CI failed: unfinished sorry proofs found." >&2
exit 1If in-progress proofs with sorry should be allowed on feature branches but blocked on main, the condition could check github.ref == 'refs/heads/main'.
…er.MatchFilterPattern + CI fix (#8111) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes #8054