Tests for grounding pipeline + audit scripts (+43 tests)#71
Merged
Pull request overview
Adds unit-test coverage for TraitMech’s “grounding” pipelines and audit/validation scripts to prevent regressions in idempotency, mapping ingestion, residual reporting, and writer-auditing behaviors that aren’t caught by schema-only validation.
Changes:
- Introduces focused unit tests for predicate grounding and node grounding (mapping load behavior, conflicts, idempotency, residual accounting).
- Adds unit tests for `validate_strict` classification and per-file validation behaviors (unknown fields, missing required fields, YAML parse errors, directory walking).
- Adds unit tests for `audit_writers` detection logic including self-suppression and safeguards classification.
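To make the classification bullet concrete, here is a minimal sketch of how validator messages might be bucketed into the five categories named in this PR. The function name `classify_sketch` and the substring rules are assumptions for illustration only. The example messages follow the wording jsonschema typically emits, but the real `classify` in `validate_strict` may key on different text.

```python
def classify_sketch(message: str) -> str:
    """Hypothetical stand-in for validate_strict.classify(); not the project's code."""
    if "Additional properties are not allowed" in message:
        return "unexpected_field"
    if "is a required property" in message:
        return "missing_required"
    if "is not one of" in message:
        return "enum_mismatch"
    if "does not match" in message:
        return "pattern_mismatch"
    return "other"


# Example messages (illustrative field names) mapped to the category we expect.
EXAMPLES = {
    "Additional properties are not allowed ('bogus' was unexpected)": "unexpected_field",
    "'id' is a required property": "missing_required",
    "'maybe' is not one of ['yes', 'no']": "enum_mismatch",
    "'abc' does not match '^[A-Z]+:[0-9]+$'": "pattern_mismatch",
    "3 is not of type 'string'": "other",
}
```

A parametrized test over a table like `EXAMPLES` is what keeps the classifier honest if the validator's wording ever changes.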
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `tests/test_ground_causal_predicates.py` | New unit tests for predicate mapping ingestion + edge grounding idempotency/residuals. |
| `tests/test_ground_causal_nodes.py` | New unit tests for node mapping ingestion (header validation) + node grounding contracts. |
| `tests/test_validate_strict.py` | New unit tests for error classification, closed-mode validation, and YAML file discovery. |
| `tests/test_audit_writers.py` | New unit tests for YAML-writer heuristics, safeguard detection, and auditor self-suppression. |
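The auditor self-suppression row can be illustrated with a small sketch. It assumes the auditor compares resolved paths to skip itself and treats any `yaml.safe_dump(` / `yaml.dump(` call as a writer signal. `audit_sketch` and the returned dict shape are hypothetical and do not reproduce the real `audit()` output.

```python
import tempfile
from pathlib import Path


def audit_sketch(script: Path, auditor: Path):
    """Hypothetical stand-in for audit(): None for non-writers and for the auditor itself."""
    if script.resolve() == auditor.resolve():
        return None  # path-identity check: the auditor never reports itself
    source = script.read_text()
    if "yaml.safe_dump(" not in source and "yaml.dump(" not in source:
        return None  # no serializer call, so not a YAML writer
    return {"script": script.name, "is_writer": "yes"}


def demo():
    """Build three throwaway scripts and audit each one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        auditor = root / "audit_writers.py"
        auditor.write_text("import yaml\nprint(yaml.safe_dump({'report': []}))\n")
        writer = root / "writer.py"
        writer.write_text("import yaml\nopen('t.yaml', 'w').write(yaml.safe_dump({}))\n")
        reader = root / "reader.py"
        reader.write_text("print('no serializer here')\n")
        return (
            audit_sketch(auditor, auditor),
            audit_sketch(writer, auditor),
            audit_sketch(reader, auditor),
        )
```

The first result is the interesting one: the auditor's own source contains a serializer call, yet it is suppressed because the path matches.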
realmarcin added a commit that referenced this pull request on May 24, 2026
Add explicit `assert "b.yml" not in names` to
test_iter_yaml_files_walks_directory_and_filters — the prior test
documented the .yml-skipping behavior in a comment but never
asserted it, so a regression that started picking up .yml during
directory walks would have slipped through silently.
Also add test_iter_yaml_files_accepts_yml_file_passed_directly
to lock in the asymmetry that the previous test only hinted at:
iter_yaml_files() does accept .yml when passed as a file argument
(only the rglob('*.yaml') walk is .yaml-only).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
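The asymmetry this commit locks in can be sketched as follows. This is an illustration under assumptions, not the project's `iter_yaml_files()`: the name `iter_yaml_files_sketch`, the accepted-suffix set, and the demo file names are all hypothetical. Only the `rglob('*.yaml')` walk and the "`.yml` accepted when passed directly" behavior come from the commit message.

```python
import tempfile
from pathlib import Path


def iter_yaml_files_sketch(paths):
    """Hypothetical stand-in for iter_yaml_files(); not the project's code."""
    for p in map(Path, paths):
        if p.is_dir():
            # Directory walk: only *.yaml is discovered, so *.yml is skipped here.
            yield from sorted(p.rglob("*.yaml"))
        elif p.suffix in {".yaml", ".yml"}:
            # Explicit file argument: .yml is accepted as well.
            yield p


def demo_names():
    """Return (names found by walking a directory, names found for a direct .yml path)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "nested").mkdir()
        (root / "a.yaml").write_text("x: 1\n")
        (root / "b.yml").write_text("x: 2\n")
        (root / "notes.txt").write_text("ignored\n")
        (root / "nested" / "c.yaml").write_text("x: 3\n")
        walked = {p.name for p in iter_yaml_files_sketch([root])}
        direct = {p.name for p in iter_yaml_files_sketch([root / "b.yml"])}
    return walked, direct
```

Asserting both halves separately (absent from the walk, present when passed directly) means neither behavior can change without a test failing.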
The grounding pipelines and audit scripts have been load-bearing infrastructure for the last 7 PRs (#61, #66, #67, #69, #70, all of which rewrite causal-graph fields based on these scripts' output). They had zero unit-test coverage. A silent regression in idempotency, header validation, or self-suppression would not be caught by validate-strict (which only checks per-record schema conformance, not pipeline correctness).

Test counts:

  tests/test_ground_causal_predicates.py    9 tests
  tests/test_ground_causal_nodes.py        12 tests
  tests/test_validate_strict.py            11 tests
  tests/test_audit_writers.py              11 tests
  ---
  total new                                43 tests
  total suite                              54 tests (was 11)

Coverage highlights:

ground_causal_predicates.py:
- load_mapping: basic happy path, conflict detection (same label → different CURIEs raises ValueError), incomplete-row skipping, missing-file error.
- ground_edges_in_doc: idempotency (second pass = 0 changes), existing predicate_id never overwritten, residual counting for unmapped labels, empty/missing-predicate edges skipped.

ground_causal_nodes.py:
- All of the predicate suite plus:
- (label, node_type) keyed lookup: same label, different node_types map to different CURIEs without aliasing.
- Header validation (Copilot fix from PR #66): TSV with `nodetype` / `targetcurie` typo'd headers raises ValueError naming both missing columns.
- grounded_keys-on-validation-failure separability (Copilot fix from PR #66): caller can union residual + grounded_keys to recover the corpus-state residual after rolling back an invalid file write.

validate_strict.py:
- classify: parametrized over the 5 categories (unexpected_field, missing_required, enum_mismatch, pattern_mismatch, other); the messages must match the actual jsonschema phrasings the validator emits.
- validate_one: clean record produces 0 errors; unknown field surfaces unexpected_field (the G01 gate behavior); missing required field surfaces missing_required; YAML parse error surfaces as yaml_parse_error category.
- iter_yaml_files: walks directories, filters .txt, picks up nested *.yaml.

audit_writers.py:
- looks_like_yaml_writer: yaml.safe_dump / yaml.dump positive, bare .write_text negative, .write_text near .yaml hint positive, arbitrary code negative.
- audit: full-safeguards writer flagged yes/yes/yes/yes; no-safeguards writer flagged no/no/no; non-writer returns None; wired_into_just yes when justfile mentions the script stem.
- Self-suppression (Copilot fix from PR #64): audit_writers.py itself returns None even though its own source matches yaml.safe_dump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
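The node-grounding contracts listed above (header validation, `(label, node_type)` keying, conflict detection, idempotency) can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the `label` column, the `nodes` / `node_id` document fields, and the `EX:` CURIEs are placeholders. The `node_type` and `target_curie` column names are inferred from the typo'd headers quoted in the commit message.

```python
import csv
import io

REQUIRED_COLUMNS = ("label", "node_type", "target_curie")  # assumed header names


def load_mapping_sketch(tsv_text: str) -> dict:
    """Hypothetical stand-in for ground_causal_nodes.load_mapping()."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        # Name every missing column so a typo'd header is easy to spot.
        raise ValueError(f"mapping TSV is missing columns: {', '.join(missing)}")
    mapping = {}
    for row in reader:
        label, node_type, curie = ((row[c] or "").strip() for c in REQUIRED_COLUMNS)
        if not (label and node_type and curie):
            continue  # incomplete rows are skipped
        key = (label, node_type)
        if mapping.get(key, curie) != curie:
            raise ValueError(f"conflicting CURIEs for {key}")
        mapping[key] = curie
    return mapping


def ground_nodes_sketch(doc: dict, mapping: dict):
    """Return (changes, residual); never overwrites an existing node_id."""
    changes, residual = 0, {}
    for node in doc.get("nodes", []):
        key = (node.get("label"), node.get("node_type"))
        if node.get("node_id") or not key[0]:
            continue  # already grounded, or nothing to look up
        curie = mapping.get(key)
        if curie is None:
            residual[key] = residual.get(key, 0) + 1
        else:
            node["node_id"] = curie
            changes += 1
    return changes, residual
```

Running `ground_nodes_sketch` twice on the same document is the idempotency check: the second call should report zero changes and the same residual.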
PR #75 changed `looks_like_yaml_writer` to require that the yaml-serializer call feed directly into write_text on the same line (instead of the looser "any .write_text + any .yaml token" heuristic, which produced false positives for scripts that only READ trait YAMLs).

The pre-#75 test asserted that `path.write_text(content) # .yaml` counted as a YAML writer. That returned True under the old heuristic and False under the new (correct) one.

Replace it with two tests that lock in the new contract:

test_looks_like_yaml_writer_write_text_of_yaml_dump
  Positive: write_text(yaml.safe_dump(...)) / write_text(yaml.dump(...)) both count.

test_looks_like_yaml_writer_write_text_of_json_is_false
  Negative: a script that reads *.yaml then writes JSON via write_text is NOT a YAML writer. This is the false-positive case #75 explicitly fixed for scripts/build_embedding_index.py and scripts/render_trait_pages.py.

Also rename test_looks_like_yaml_writer_write_text_without_yaml_hint_is_false to test_looks_like_yaml_writer_write_text_plain_is_false since the "yaml hint" phrasing was tied to the old heuristic.

56 tests pass (was 54; +2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
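The difference between the two heuristics is easiest to see side by side. The sketch below only approximates the `write_text` branch described in this commit message. Both function names, the regex, and the two sample scripts are assumptions, and the real `looks_like_yaml_writer` may handle other cases (such as direct `yaml.dump` calls) that are not modeled here.

```python
import re

# write_text( whose argument starts with a yaml.dump / yaml.safe_dump call, on one line.
_WRITE_TEXT_OF_DUMP = re.compile(r"\.write_text\(\s*yaml\.(?:safe_)?dump\(")


def old_heuristic_sketch(source: str) -> bool:
    """Approximation of the pre-#75 rule: any .write_text plus any .yaml token."""
    return ".write_text" in source and ".yaml" in source


def new_heuristic_sketch(source: str) -> bool:
    """Approximation of the post-#75 rule: the serializer feeds write_text on the same line."""
    return any(_WRITE_TEXT_OF_DUMP.search(line) for line in source.splitlines())


# Reads *.yaml but writes JSON: the false-positive shape described above.
READER_SCRIPT = '''
import json
from pathlib import Path
names = [p.name for p in Path("traits").rglob("*.yaml")]
Path("index.json").write_text(json.dumps(names))
'''

# Serializes YAML straight into write_text: a genuine writer.
WRITER_SCRIPT = '''
import yaml
from pathlib import Path
Path("out/trait.yaml").write_text(yaml.safe_dump({"id": "example"}))
'''
```

Under these approximations the reader script trips the old rule but not the new one, while the writer script is still detected.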
6249749 to d938511
realmarcin added a commit that referenced this pull request on May 26, 2026
#71 added 43 unit tests across 4 files locking in idempotency, header validation, and other pipeline contracts, but none of them run in CI. validate-strict.yaml only checks per-record schema validity; it can't catch a regression that, say, breaks grounded_keys propagation in ground_causal_nodes.py or removes self-suppression from audit_writers.py.

This workflow mirrors validate-strict.yaml's shape:
- Triggers on PRs and pushes-to-main that touch scripts/**, src/traitmech/**, tests/**, pyproject.toml, or this workflow.
- Uses astral-sh/setup-uv@v3 + Python 3.12 + uv sync --extra dev (pytest is in the dev extra per pyproject.toml).
- Runs uv run pytest tests/ with -v --tb=short for readable failures in the GitHub Actions log.

Verified locally:
- python3 -c "import yaml; yaml.safe_load(open(...))" → yaml ok
- uv run pytest tests/ → 56 passed

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The grounding pipelines and audit scripts have been load-bearing infrastructure for the last 7 PRs (#61, #66, #67, #69, #70, all of which rewrite causal-graph fields based on these scripts' output). They had zero unit-test coverage. A silent regression in idempotency, header validation, or self-suppression would not be caught by `validate-strict` (which only checks per-record schema conformance, not pipeline correctness).

This PR adds 43 new tests across 4 new test files; total suite grows from 11 → 54.
- `tests/test_ground_causal_predicates.py`
- `tests/test_ground_causal_nodes.py`: `(label, node_type)` keyed lookup, header validation (Copilot fix from #66), grounded_keys-on-validation-failure separability (Copilot fix from #66)
- `tests/test_validate_strict.py`: `classify` parametrized over the 5 categories (matching actual jsonschema phrasings); `validate_one` unknown-field → `unexpected_field` (the G01 gate); missing-required, YAML parse-error coverage; `iter_yaml_files` walks dirs + filters non-YAML
- `tests/test_audit_writers.py`: `looks_like_yaml_writer` positive/negative cases; `audit` full vs no safeguards; wired_into_just detection; self-suppression (Copilot fix from #64)

Why now
- `ground-predicates --apply` or `ground-nodes --apply` relies on it. Currently no test would catch a future refactor that breaks "second pass = 0 changes."
- `ground_causal_nodes.load_mapping` was added in PR #66 (Ground causal-graph nodes: 39-mapping cohort, 77 nodes grounded (38% → 45%)) in response to Copilot. Without a test, the next refactor could silently regress.
- `audit_writers.py` self-suppression was the Copilot fix in #64 (Audit backlog cleanup: G02, G03, G04, G05). Easy to break by a future refactor that drops the path-identity check.

Verified locally
Test plan
`validate-strict`); future PR could add a `pytest` workflow

🤖 Generated with Claude Code