Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 35 additions & 9 deletions .github/workflows/lean-proofs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -46,14 +46,17 @@ jobs:
path: |
formal-verification/lean/.lake
~/.elan
key: lean-${{ runner.os }}-${{ hashFiles('formal-verification/lean/lean-toolchain', 'formal-verification/lean/lakefile.toml') }}
key: lean-${{ runner.os }}-${{ hashFiles('formal-verification/lean/lean-toolchain', 'formal-verification/lean/lakefile.toml', 'formal-verification/lean/lake-manifest.json') }}
restore-keys: |
lean-${{ runner.os }}-

- name: Install elan (Lean version manager)
if: steps.check-lean-files.outputs.lean_count != '0' && steps.restore-cache.outputs.cache-hit != 'true'
run: |
ELAN_VERSION="v4.2.1"
# elan v3.1.0 is the latest stable release with verified binaries.
# Note: the elan GitHub release page does NOT ship *.sha256 files for
# v3.1.0, so we do a file-size sanity check instead of sha256sum.
ELAN_VERSION="v3.1.0"
ELAN_TARGET="x86_64-unknown-linux-gnu"
ELAN_ARCHIVE="elan-${ELAN_TARGET}.tar.gz"
ELAN_BASE_URL="https://github.com/leanprover/elan/releases/download/${ELAN_VERSION}"
Expand All @@ -73,13 +76,15 @@ jobs:
exit 1
fi

# Download elan archive and its SHA-256 checksum from GitHub Releases.
# The checksum is fetched from the same release to avoid hardcoding a
# version-specific hash that would silently mismatch on elan upgrades.
# Download elan archive. No sha256 companion file is published for
# this release, so we verify the archive is non-empty instead.
curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}"
curl -sSfL "${ELAN_BASE_URL}/${ELAN_ARCHIVE}.sha256" -o "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}.sha256"
ARCHIVE_SIZE=$(stat -c%s "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}" 2>/dev/null || stat -f%z "${ELAN_TMP_DIR}/${ELAN_ARCHIVE}")
if [ "${ARCHIVE_SIZE:-0}" -lt 1000000 ]; then
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): 1000000 (1 MB) is a magic number. While the intent — catching a suspiciously small or empty archive — is clear from the surrounding comment, a named variable would make the threshold explicit and easier to adjust:

MIN_ARCHIVE_BYTES=1000000  # elan archives are typically ~5 MB; 1 MB signals corruption
if [ "\$\{ARCHIVE_SIZE:-0}" -lt "\$\{MIN_ARCHIVE_BYTES}" ]; then

This is especially helpful since the threshold is the only integrity check replacing the now-absent SHA-256 verification.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Security] The file-size check (≥ 1 MB) is not a meaningful integrity guarantee — any malicious or corrupted archive that exceeds 1 MB will pass silently.

Mechanism: The original flow fetched a .sha256 companion file from the same GitHub Release and called sha256sum -c to cryptographically verify the binary. That guarantee is now gone. A MITM or a compromised CDN serving a large-but-malicious tarball would not be caught.

Impact: A tampered elan binary is executed directly via ./elan-init -y, which installs toolchain binaries into the runner's ~/.elan path and is subsequently used to build Lean proofs. Supply-chain risk in CI.

Suggestion: Hardcode the known-good SHA-256 digest of elan-x86_64-unknown-linux-gnu.tar.gz for v3.1.0 and verify against it, even if GitHub doesn't publish a companion file:

EXPECTED_SHA256="<precomputed-sha256-of-v3.1.0-archive>"
echo "\$\{EXPECTED_SHA256}  \$\{ELAN_TMP_DIR}/\$\{ELAN_ARCHIVE}" | sha256sum -c

This is a one-time cost per elan version bump and provides real supply-chain protection.

echo "elan archive appears corrupt or too small (${ARCHIVE_SIZE} bytes)" >&2
exit 1
fi
cd "${ELAN_TMP_DIR}"
sha256sum -c "${ELAN_ARCHIVE}.sha256"
tar -xzf "${ELAN_ARCHIVE}"
./elan-init -y --default-toolchain "${LEAN_TOOLCHAIN}"
rm -rf "${ELAN_TMP_DIR}"
Expand All @@ -97,18 +102,39 @@ jobs:
working-directory: formal-verification/lean
run: lake build

- name: Check for sorry (unfinished proofs)
if: steps.check-lean-files.outputs.lean_count != '0'
id: sorry-check
run: |
LEAN_DIR="formal-verification/lean/FVSquad"
SORRY_FILES=$(grep -rl '\bsorry\b' "${LEAN_DIR}" --include='*.lean' 2>/dev/null || true)
if [ -n "${SORRY_FILES}" ]; then
echo "⚠️ Files containing sorry (unfinished proofs):"
echo "${SORRY_FILES}"
grep -rn '\bsorry\b' "${LEAN_DIR}" --include='*.lean'
Comment on lines +110 to +114
echo "sorry_found=true" >> "$GITHUB_OUTPUT"
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): sorry_found is written to $GITHUB_OUTPUT on both branches, but no subsequent step references steps.sorry-check.outputs.sorry_found. If it is only used for human-readable console output, the id: sorry-check and the >> "$GITHUB_OUTPUT" lines are dead infrastructure — misleading future maintainers into thinking some downstream step depends on this value. Either wire it up to a conditional step (e.g. fail the job when sorry_found=true) or remove the id and output lines and keep the step purely informational.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Correctness] The sorry-check step detects sorry stubs but never fails the build — the CI will report green even if proofs are incomplete.

Mechanism: When sorry is found, the step prints a warning and sets sorry_found=true as an output, but there is no exit 1 and no downstream step gates on steps.sorry-check.outputs.sorry_found. The lake build step itself also does not fail on sorry by default in Lean 4 (it emits a warning, not an error).

Impact: A future PR that introduces sorry placeholders (intentional or accidental) will pass CI, undermining the purpose of the checker.

Suggestion: Either add exit 1 inside the if [ -n "\$\{SORRY_FILES}" ] block, or add an explicit gate step after the sorry-check:

- name: Fail on sorry
  if: steps.sorry-check.outputs.sorry_found == 'true'
  run: |
    echo "CI failed: unfinished sorry proofs found." >&2
    exit 1

If in-progress proofs with sorry should be allowed on feature branches but blocked on main, the condition could check github.ref == 'refs/heads/main'.

else
echo "✅ No sorry found — all proofs are complete."
echo "sorry_found=false" >> "$GITHUB_OUTPUT"
fi

- name: Proof summary
if: steps.check-lean-files.outputs.lean_count != '0'
run: |
LEAN_DIR="formal-verification/lean/FVSquad"
THEOREM_COUNT=$(grep -rEc '^(theorem|lemma) ' "${LEAN_DIR}" --include='*.lean' 2>/dev/null || echo 0)
SORRY_COUNT=$(grep -rc '\<sorry\>' "${LEAN_DIR}" --include='*.lean' 2>/dev/null \
SORRY_COUNT=$(grep -rc '\bsorry\b' "${LEAN_DIR}" --include='*.lean' 2>/dev/null \
| awk -F: '{sum += $2} END {print sum+0}')
SORRY_STATUS="✅ All proofs complete"
if [ "${SORRY_COUNT}" -gt 0 ]; then
SORRY_STATUS="⚠️ ${SORRY_COUNT} sorry (stub) lines remain"
fi
{
echo "## 🔬 Lean Proof Summary"
echo ""
echo "| Metric | Count |"
echo "| Metric | Value |"
echo "|--------|-------|"
echo "| Theorems / Lemmas declared | ${THEOREM_COUNT} |"
echo "| Lines containing \`sorry\` (stubs) | ${SORRY_COUNT} |"
echo "| Proof status | ${SORRY_STATUS} |"
} >> "$GITHUB_STEP_SUMMARY"
4 changes: 4 additions & 0 deletions formal-verification/lean/FVSquad.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
-- FVSquad: Lean 4 formal verification artifacts for microsoft/testfx
-- 🔬 Lean Squad — auto-generated
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): The comment -- 🔬 Lean Squad — auto-generated is slightly misleading. This file appears to be hand-crafted (it selectively imports only FVSquad.TreeNodeFilter). If it will be regenerated automatically as targets grow, add a note about what generates it. If it's maintained by hand, drop "auto-generated" to avoid confusion and unintentional overwriting by future tooling.


import FVSquad.TreeNodeFilter
207 changes: 207 additions & 0 deletions formal-verification/lean/FVSquad/TreeNodeFilter.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,207 @@
/-
FVSquad.TreeNodeFilter
Formal specification of TreeNodeFilter.MatchFilterPattern from
src/Platform/Microsoft.Testing.Platform/Requests/TreeNodeFilter/TreeNodeFilter.cs

🔬 Lean Squad — auto-generated formal-verification artifact.
Target ID: 7 | Phase: 3 (Lean 4 formal spec) | Date: 2026-05-07

## What this file contains
- A Lean 4 model of the FilterExpression abstract syntax tree
- An opaque Bool-valued abstract glob predicate
- Recursive evaluators (mutual block) mirroring MatchFilterPattern
- Proved theorems for Boolean-algebra invariants B1-B12 from informal spec

## Approximations / limitations
1. Regex matching is abstracted as opaque Bool function `matchesGlob`.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): The comment says "Regex matching is abstracted as opaque Bool function matchesGlob", but the function is named matchesGlob — suggesting a glob/wildcard abstraction, not regex. The C# implementation uses regex under the hood, but the Lean model uses glob semantics. This naming mismatch could confuse readers about what is actually being abstracted. Consider aligning the language: either call it "glob/wildcard matching" or rename the opaque function to matchesRegex to match the C# reality.

2. Property matching is abstracted as String -> Bool inside `withProps`.
3. MatchesFilter (the public entry point) is not modelled here.
4. Mutual recursion (evalFilter / evalFilterAll / evalFilterAny) is used
for structural termination.

## Lean 4 notation note
In Lean 4, `=` has precedence 50 and `&&`/`||` have precedence 35, so
`a = b && c` parses as `(a = b) && c`. All Bool equalities involving `&&`
or `||` on the right side are explicitly parenthesised in this file.
-/

-- §1 Abstract glob predicate
opaque matchesGlob : String -> String -> Bool

-- §2 FilterExpression abstract syntax tree
inductive FilterExpr : Type where
| leaf (pattern : String) : FilterExpr
| nop : FilterExpr
| and (subExprs : List FilterExpr) : FilterExpr
| or (subExprs : List FilterExpr) : FilterExpr
| not (inner : FilterExpr) : FilterExpr
| withProps (value : FilterExpr) (propPred : String -> Bool) : FilterExpr
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): The constructor parameter is named value but represents a sub-filter expression. The name value is generic and ambiguous — it could mean the evaluated result rather than a sub-FilterExpr. Consider a more descriptive name like subExpr or filterExpr to match the naming used in and/or constructors (subExprs).


-- §3 Core evaluator (mutual block for structural termination)
mutual
def evalFilter : FilterExpr -> String -> Bool
| .leaf p, s => matchesGlob p s
| .nop, _ => true
| .and es, s => evalFilterAll es s
| .or es, s => evalFilterAny es s
| .not inner, s => ! evalFilter inner s
| .withProps v f, s => evalFilter v s && f s
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick (Minor): The pattern-match variable f doesn't match the parameter name propPred used in the FilterExpr.withProps constructor declaration (line 38). Consistent naming between the type definition and match arms reduces the cognitive overhead of cross-referencing the two. Consider using propPred here as well:

| .withProps v propPred, s => evalFilter v s && propPred s


def evalFilterAll : List FilterExpr -> String -> Bool
| [], _ => true
| e :: es, s => evalFilter e s && evalFilterAll es s

def evalFilterAny : List FilterExpr -> String -> Bool
| [], _ => false
| e :: es, s => evalFilter e s || evalFilterAny es s
end

-- §4 Definitional equation lemmas (rw [...eq_def] closes via match reduction)
@[simp] theorem evalFilter_leaf_eq (p s : String) :
evalFilter (.leaf p) s = matchesGlob p s := by rw [evalFilter.eq_def]

@[simp] theorem evalFilter_nop_eq (s : String) :
evalFilter .nop s = true := by rw [evalFilter.eq_def]

@[simp] theorem evalFilter_and_eq (es : List FilterExpr) (s : String) :
evalFilter (.and es) s = evalFilterAll es s := by rw [evalFilter.eq_def]

@[simp] theorem evalFilter_or_eq (es : List FilterExpr) (s : String) :
evalFilter (.or es) s = evalFilterAny es s := by rw [evalFilter.eq_def]

@[simp] theorem evalFilter_not_eq (e : FilterExpr) (s : String) :
evalFilter (.not e) s = ! evalFilter e s := by rw [evalFilter.eq_def]

-- Note: RHS parenthesised to avoid `=` binding tighter than `&&` (prec 50 > 35)
@[simp] theorem evalFilter_withProps_eq (v : FilterExpr) (f : String -> Bool) (s : String) :
evalFilter (.withProps v f) s = (evalFilter v s && f s) := by rw [evalFilter.eq_def]

@[simp] theorem evalFilterAll_nil (s : String) :
evalFilterAll [] s = true := by rw [evalFilterAll.eq_def]

-- Note: RHS parenthesised
@[simp] theorem evalFilterAll_cons (e : FilterExpr) (es : List FilterExpr) (s : String) :
evalFilterAll (e :: es) s = (evalFilter e s && evalFilterAll es s) := by
rw [evalFilterAll.eq_def]

@[simp] theorem evalFilterAny_nil (s : String) :
evalFilterAny [] s = false := by rw [evalFilterAny.eq_def]

-- Note: RHS parenthesised
@[simp] theorem evalFilterAny_cons (e : FilterExpr) (es : List FilterExpr) (s : String) :
evalFilterAny (e :: es) s = (evalFilter e s || evalFilterAny es s) := by
rw [evalFilterAny.eq_def]

-- §5 De Morgan helpers
-- Note: LHS `(!evalFilterAny es s)` is parenthesised to force Bool.not reading
-- Note: RHS is Bool conjunction, so parenthesised

theorem evalFilterAny_not_eq_all (es : List FilterExpr) (s : String) :
(!evalFilterAny es s) = evalFilterAll (es.map .not) s := by
induction es with
| nil => simp
| cons h t ih =>
simp only [evalFilterAny_cons, evalFilterAll_cons, List.map_cons]
rw [Bool.not_or, ← evalFilter_not_eq]
exact congrArg (evalFilter (.not h) s && ·) ih

theorem evalFilterAll_not_eq_any (es : List FilterExpr) (s : String) :
(!evalFilterAll es s) = evalFilterAny (es.map .not) s := by
induction es with
| nil => simp
| cons h t ih =>
simp only [evalFilterAll_cons, evalFilterAny_cons, List.map_cons]
rw [Bool.not_and, ← evalFilter_not_eq]
exact congrArg (evalFilter (.not h) s || ·) ih

-- §6 Boolean-algebra theorems B1-B12

-- B1: NopExpression always returns true
theorem evalFilter_nop (s : String) : evalFilter .nop s = true := by simp

-- B2: Double negation elimination
theorem evalFilter_not_not (e : FilterExpr) (s : String) :
evalFilter (.not (.not e)) s = evalFilter e s := by
simp [Bool.not_not]

-- B3: De Morgan — not(or es) = and(map not es)
theorem evalFilter_not_or_eq_and_map_not (es : List FilterExpr) (s : String) :
evalFilter (.not (.or es)) s = evalFilter (.and (es.map .not)) s := by
simp only [evalFilter_not_eq, evalFilter_or_eq, evalFilter_and_eq]
exact evalFilterAny_not_eq_all es s

-- B4: De Morgan — not(and es) = or(map not es)
theorem evalFilter_not_and_eq_or_map_not (es : List FilterExpr) (s : String) :
evalFilter (.not (.and es)) s = evalFilter (.or (es.map .not)) s := by
simp only [evalFilter_not_eq, evalFilter_and_eq, evalFilter_or_eq]
exact evalFilterAll_not_eq_any es s

-- B5: And commutativity (two-element list)
theorem evalFilter_and_comm (a b : FilterExpr) (s : String) :
evalFilter (.and [a, b]) s = evalFilter (.and [b, a]) s := by
simp [Bool.and_comm]

-- B6: Or commutativity (two-element list)
theorem evalFilter_or_comm (a b : FilterExpr) (s : String) :
evalFilter (.or [a, b]) s = evalFilter (.or [b, a]) s := by
simp [Bool.or_comm]

-- B7: Singleton And
theorem evalFilter_and_singleton (e : FilterExpr) (s : String) :
evalFilter (.and [e]) s = evalFilter e s := by simp

-- B8: Singleton Or
theorem evalFilter_or_singleton (e : FilterExpr) (s : String) :
evalFilter (.or [e]) s = evalFilter e s := by simp

-- B9: Nop is the And-identity
theorem evalFilter_and_nop_left (e : FilterExpr) (s : String) :
evalFilter (.and [.nop, e]) s = evalFilter e s := by simp

-- B10: Nop absorbs Or
theorem evalFilter_or_nop_absorb (e : FilterExpr) (s : String) :
evalFilter (.or [.nop, e]) s = true := by simp

-- B11: Vacuous And (empty conjunction is true)
theorem evalFilter_and_empty (s : String) : evalFilter (.and []) s = true := by simp

-- B12: Vacuous Or (empty disjunction is false)
theorem evalFilter_or_empty (s : String) : evalFilter (.or []) s = false := by simp

-- §7 Additional structural properties

-- Double negation of the Bool result (parenthesise `!!` to force Bool reading)
theorem evalFilter_result_double_neg (e : FilterExpr) (s : String) :
(!! evalFilter e s) = evalFilter e s := Bool.not_not _

-- withProps fails if the glob fails
theorem evalFilter_withProps_false_glob
(e : FilterExpr) (f : String -> Bool) (s : String)
(h : evalFilter e s = false) :
evalFilter (.withProps e f) s = false := by simp [h]

-- withProps fails if the property predicate fails
theorem evalFilter_withProps_false_prop
(e : FilterExpr) (f : String -> Bool) (s : String)
(h : f s = false) :
evalFilter (.withProps e f) s = false := by simp [h]

-- If and [a, b] matches then a alone matches
theorem evalFilter_and_left_implies (a b : FilterExpr) (s : String)
(h : evalFilter (.and [a, b]) s = true) : evalFilter a s = true := by
simp at h; exact h.1

-- If and [a, b] matches then b alone matches
theorem evalFilter_and_right_implies (a b : FilterExpr) (s : String)
(h : evalFilter (.and [a, b]) s = true) : evalFilter b s = true := by
simp at h; exact h.2

-- If or [a, b] does not match then a alone does not match
theorem evalFilter_or_left_false (a b : FilterExpr) (s : String)
(h : evalFilter (.or [a, b]) s = false) : evalFilter a s = false := by
simp [Bool.or_eq_false_iff] at h; exact h.1

-- Triple negation reduces to single negation
theorem evalFilter_not_not_not (e : FilterExpr) (s : String) :
evalFilter (.not (.not (.not e))) s = evalFilter (.not e) s := by
simp [Bool.not_not]
44 changes: 26 additions & 18 deletions formal-verification/lean/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,41 +8,49 @@ This directory contains the Lean 4 formal verification project for `microsoft/te

```
lean/
lakefile.toml # Lake build configuration
lakefile.toml # Lake build configuration (no Mathlib; standalone)
lean-toolchain # Pins exact Lean version (leanprover/lean4:v4.29.1)
lake-manifest.json # Resolved Lake dependency manifest
FVSquad.lean # Root module (imports all sub-modules)
FVSquad/
<Name>.lean # One file per FV target (spec + implementation model + proofs)
<Name>.lean # One file per FV target (spec + proofs)
```

## Building

Requires [elan](https://github.com/leanprover/elan) (the Lean version manager).

```bash
# Install a pinned elan release (example for Linux x86_64)
ELAN_VERSION=v4.2.1
curl -sSfL -o elan.tar.gz "https://github.com/leanprover/elan/releases/download/${ELAN_VERSION}/elan-x86_64-unknown-linux-gnu.tar.gz"
curl -sSfL -o elan.tar.gz.sha256 "https://github.com/leanprover/elan/releases/download/${ELAN_VERSION}/elan-x86_64-unknown-linux-gnu.tar.gz.sha256"
sha256sum -c elan.tar.gz.sha256
# Install elan v3.1.0 (Linux x86_64)
ELAN_VERSION=v3.1.0
curl -sSfL -o elan.tar.gz \
"https://github.com/leanprover/elan/releases/download/${ELAN_VERSION}/elan-x86_64-unknown-linux-gnu.tar.gz"
tar -xzf elan.tar.gz
./elan-init -y
./elan-init -y --default-toolchain leanprover/lean4:v4.29.1

# Build the project
cd formal-verification/lean
lake build
```

The `lean-toolchain` file pins the exact Lean version to match the Mathlib dependency.
The `lean-toolchain` file pins the exact Lean version. `lake build` creates a
local `.lake/` directory with build artifacts (gitignored).

`lake build` creates a local `.lake/` directory with build artifacts.
The repository root `.gitignore` ignores this directory to avoid accidentally committing
generated files.
## Targets

## Status

No Lean source files yet — targets are currently in Phase 1 (Research) or Phase 2 (Informal Spec).
Lean source files will be added when targets advance to Phase 3 (Formal Spec Writing).
| File | Target | Phase | Theorems | sorry |
|------|--------|-------|----------|-------|
| `FVSquad/TreeNodeFilter.lean` | `TreeNodeFilter.MatchFilterPattern` | 3 | 21 | 0 |
Comment on lines +41 to +43

## CI

A GitHub Actions workflow (`.github/workflows/lean-proofs.yml`) runs `lake build`
on every PR that modifies files in this directory.
`.github/workflows/lean-proofs.yml` runs `lake build` on every PR that touches
files in `formal-verification/lean/`. It also reports theorem and `sorry` counts
in the job summary.

## Notes

- **No Mathlib**: The lakefile does not depend on Mathlib (CI firewalls block
the cache download). All proofs use Lean 4 core tactics only.
- **Opaque axioms**: `matchesGlob` is an opaque `Bool`-valued function abstracting
the C# regex matching.
5 changes: 5 additions & 0 deletions formal-verification/lean/lake-manifest.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{"version": "1.1.0",
"packagesDir": ".lake/packages",
"packages": [],
"name": "FVSquad",
"lakeDir": ".lake"}
2 changes: 1 addition & 1 deletion formal-verification/lean/lean-toolchain
Original file line number Diff line number Diff line change
@@ -1 +1 @@
leanprover/lean4:v4.14.0
leanprover/lean4:v4.29.1
Loading