[Lean Squad] feat(ci): Task 9 — FV docs validation workflow + updated REPORT.md#7936
Conversation
Add .github/workflows/fv-docs-validation.yml: a CI workflow that validates the structural integrity of formal-verification/ artifacts on every PR or push that touches formal-verification/**. Checks performed: - Required top-level FV docs present (RESEARCH.md, TARGETS.md, CORRESPONDENCE.md, CRITIQUE.md, REPORT.md) - Lean project files present (lakefile.toml, lean-toolchain, README.md) - lean-toolchain is non-empty and well-formed - 🔬 Lean Squad disclosure present in top-level docs - specs/ directory exists - Phase-2+ targets in TARGETS.md each have an informal spec in specs/ (keyword-based matching handles CamelCase name decomposition) Also update formal-verification/REPORT.md: add run history for the 2026-04-27 runs, expand target table with new targets, and document the Lean toolchain blocker. �� Lean Squad — automated formal verification agent Run: https://github.com/microsoft/testfx/actions/runs/24986088168 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds CI validation for the formal-verification/ artifact structure and updates FV reporting to reflect recent runs and current blockers, helping keep Lean Squad formal-verification documentation consistent over time.
Changes:
- Introduce a GitHub Actions workflow that validates required FV docs, Lean project scaffolding, and basic TARGETS/specs consistency on PRs/pushes affecting
formal-verification/**. - Update
formal-verification/REPORT.mdwith refreshed status/summary and additional run history entries.
Show a summary per file
| File | Description |
|---|---|
formal-verification/REPORT.md |
Updates FV status/summary text and adds 2026-04-27 run history entries. |
.github/workflows/fv-docs-validation.yml |
New CI workflow to validate presence/shape of FV docs, Lean project files, and Phase-2+ informal spec alignment. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 3
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| **Phase**: Early — Research expanded to 7 targets; informal specs extracted for `ArgumentArity` (merged) and `CommandLineParser.TryUnescape` (PR open). CI automation maturing. Lean toolchain blocked in CI runner; Task 3+ deferred. | ||
|
|
||
| ## Summary | ||
|
|
||
| The Lean Squad has surveyed the `microsoft/testfx` codebase and identified **seven** high-quality FV targets. Two informal specifications have been extracted. The Lean toolchain cannot currently be installed in the runner environment due to network restrictions. | ||
| The Lean Squad has surveyed the `microsoft/testfx` codebase and identified **seven** high-quality FV targets in the command-line infrastructure and filter logic of Microsoft.Testing.Platform. All targets are pure or near-pure functions with clear algebraic properties, making them suitable for Lean 4 formal verification. | ||
|
|
||
| Key findings: | ||
| - **`ArgumentArity`**: informal spec extracted (14 properties, 4 groups). Constructor does not enforce `Min ≤ Max`; `CommandLineOption` enforces this on construction, acting as the real guard. | ||
| - **`ArgumentArity`**: informal spec extracted (14 properties, 4 groups). Constructor does not enforce `Min ≤ Max`; `CommandLineOption` enforces this on construction, acting as the real guard. Correspondence documented. | ||
| - **`CommandLineParser.TryUnescape`**: informal spec extracted (24 properties, 5 groups, 2 confirmed bugs for single-char quote inputs). PR open. | ||
| - **New targets identified**: `ResponseFileHelper.SplitCommandLine` (pure tokeniser) and `TreeNodeFilter.MatchFilterPattern` (Boolean algebra — ideal for structural induction proofs of De Morgan, double negation, idempotence). |
There was a problem hiding this comment.
The status/summary bullets appear out of date with the current FV artifacts: TARGETS.md lists several Phase-2 targets beyond just ArgumentArity and TryUnescape (e.g., ParseOptionAndSeparators, SplitCommandLine, TreeNodeFilter.MatchFilterPattern), and formal-verification/specs/ already contains additional *_informal.md files. Please update this section to reflect the actual set of extracted informal specs (or adjust phases elsewhere so the report is consistent).
| - name: Check FV 🔬 disclosure in top-level docs | ||
| run: | | ||
| FAILED=0 | ||
| for doc in RESEARCH.md TARGETS.md CORRESPONDENCE.md CRITIQUE.md REPORT.md; do | ||
| if ! grep -q "🔬" "formal-verification/${doc}"; then | ||
| echo "::error file=formal-verification/${doc}::FV document missing 🔬 Lean Squad disclosure: ${doc}" | ||
| FAILED=1 | ||
| else | ||
| echo "✅ formal-verification/${doc} has 🔬 disclosure" | ||
| fi | ||
| done | ||
| if [ "$FAILED" -ne 0 ]; then | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
This step fails the workflow if any required doc is missing the 🔬 disclosure marker, but the PR description/issue states the disclosure check should be warning-only (non-blocking). Consider emitting ::warning (and not exiting nonzero) so the workflow behavior matches the documented intent.
| TOOLCHAIN="$(tr -d '\r\n' < formal-verification/lean/lean-toolchain)" | ||
| if [ -z "$TOOLCHAIN" ]; then | ||
| echo "::error file=formal-verification/lean/lean-toolchain::lean-toolchain file is empty" | ||
| exit 1 | ||
| fi | ||
|
|
||
| if [[ "$TOOLCHAIN" =~ [[:space:]] ]]; then | ||
| echo "::error file=formal-verification/lean/lean-toolchain::lean-toolchain must not contain whitespace" | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
The lean-toolchain validation strips all newlines (tr -d '\r\n'), so a malformed multi-line lean-toolchain file would be concatenated and could still pass the whitespace/regex checks. If the goal is to enforce a single well-formed toolchain spec, check that the file contains exactly one non-empty line (and fail if there are extra lines or trailing whitespace) before validating the pattern.
Evangelink
left a comment
There was a problem hiding this comment.
Summary
Workflow: Expert Code Reviewer
Date: 2026-04-30
Repository: microsoft/testfx
Key Findings
The two changed files are a new GitHub Actions workflow (.github/workflows/fv-docs-validation.yml) and a documentation update (formal-verification/REPORT.md). No production C# code is changed.
Correctness issues in the CI workflow — all in the final Python step:
-
Missing spec check emits
::warning::instead of::error::(Line 141) — This is already silently misfiring:TreeNodeFilter.MatchFilterPatternis a Phase-2 target inTARGETS.mdbut has no corresponding file informal-verification/specs/. The workflow reports success. The purpose of this step — to gate PRs that forget to add informal specs — is completely unenforced. Change to::error::+sys.exit(1). -
Spec glob uses
*.mdinstead of*_informal.md(Line 123) — The prior step counts*_informal.mdfiles, but the validation step considers all.mdfiles inspecs/. Any incidental Markdown (README, notes) would satisfy a match, hiding a genuinely missing spec. -
Vacuous
all()on empty keyword list (Line 132) — For any target whose name decomposes entirely into tokens shorter than 3 characters,all(kw in s for kw in [])returnsTruefor every spec file, so the target appears validated regardless of actual coverage.
Positive Observations
- The workflow correctly scopes permissions to
contents: read. - The shell steps for checking file existence,
lean-toolchainformat, and 🔬 disclosure are solid and correctly useexit 1to fail CI. - Hardcoded loop variables in shell steps eliminate any injection risk.
Recommendations
Fix the three Python-step correctness bugs before merging. The most critical is converting the missing-spec diagnostic from ::warning:: to a hard failure; the other two address false-positive scenarios.
Generated by Expert Code Reviewer
🧠 Reviewed by Expert Code Reviewer 🧠
| parts = re.sub(r'([a-z])([A-Z])', r'\1_\2', name) | ||
| return [w.lower() for w in re.split(r'[._\s]+', parts) if len(w) >= 3] | ||
|
|
||
| all_specs = [os.path.basename(p) for p in glob.glob(os.path.join(specs_dir, "*.md"))] |
There was a problem hiding this comment.
[Correctness] glob.glob(... "*.md") collects all Markdown files in specs/, not just *_informal.md files. Any incidental .md file (e.g., a future README.md or NOTES.md placed in that directory) would be treated as a candidate spec, producing false-positive matches.
Impact: The step directly above counts only *_informal.md files; this step silently widens the net, creating an inconsistency that could hide a genuinely missing informal spec.
Suggestion: Narrow the glob to match the established naming convention:
all_specs = [os.path.basename(p) for p in glob.glob(os.path.join(specs_dir, "*_informal.md"))]There was a problem hiding this comment.
Fixed in 39ac0ad — changed glob.glob(... "*.md") to glob.glob(... "*_informal.md") so the validation is consistent with the count step and won't match incidental markdown files.
|
@copilot address review comments |
- Narrow glob from *.md to *_informal.md to prevent false-positive matches - Add empty-keywords guard to skip targets whose name has only short tokens (avoids vacuous all() returning True on empty list) - Change ::warning:: to ::error:: + sys.exit(1) so missing specs block CI - Remove redundant keywords[0] guard (now handled by the empty-keywords check) Agent-Logs-Url: https://github.com/microsoft/testfx/sessions/950eb316-fac6-464f-8197-0054dfa30e41 Co-authored-by: Evangelink <11340282+Evangelink@users.noreply.github.com>
Add .github/workflows/fv-docs-validation.yml: a CI workflow that validates
the structural integrity of formal-verification/ artifacts on every PR or
push that touches formal-verification/**.
Checks performed:
CORRESPONDENCE.md, CRITIQUE.md, REPORT.md)
(keyword-based matching handles CamelCase name decomposition)
Also update formal-verification/REPORT.md: add run history for the
2026-04-27 runs, expand target table with new targets, and document the
Lean toolchain blocker.
�� Lean Squad — automated formal verification agent
Run: https://github.com/microsoft/testfx/actions/runs/24986088168
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com
Fixes #7861