
[plan] Create "test-quality-sentinel" agentic workflow #25320

@github-actions

Description

Context

From the Scout analysis of issue #25311, this workflow addresses Problem 4: False Comfort from Tests.

"Tests created a similar false comfort. Having 500+ tests felt reassuring... there were several times in the vibe-coding phase where a new test case revealed that the design of some component was completely wrong."

High test counts create an illusion of safety. The real signal is whether tests cover behavioral contracts and design invariants — not just happy-path implementations.

Objective

Create a new gh-aw agentic workflow called test-quality-sentinel that analyzes test quality beyond code coverage percentages on every PR.

Workflow Prompt

Create an agentic GitHub Actions workflow called "test-quality-sentinel" that
analyzes test quality beyond code coverage percentages.

The workflow must:
1. On every PR, analyze new and changed tests to detect:
   - Tests that only test implementation details (mocking internal functions
     rather than testing observable behavior)
   - Tests that lack assertions about error/edge cases (only testing the
     happy path)
   - Test files that grew proportionally faster than the code they test
     (possible "test inflation" — quantity without quality)
   - Duplicated test logic that suggests tests are generated without intent
2. Use an AI agent to review the tests in the PR diff and answer:
   a. "What design invariant does this test enforce?"
   b. "What would break in the system if this test were deleted?"
   c. "Does this test cover a behavioral contract or just an implementation detail?"
3. Post a PR comment with:
   - A "Test Quality Score" (0–100) based on the above criteria
   - Specific tests flagged for review with AI-generated improvement suggestions
   - A distinction between "design tests" (high value) vs "implementation tests"
     (low value, prone to false assurance)
4. Fail the check if >30% of new tests are classified as low-value
   implementation tests

Use AST analysis + AI review. Support pytest (Python) and #[test] blocks (Rust).

Files to Create

  • .github/workflows/test-quality-sentinel.md — the workflow markdown file
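A gh-aw workflow file of this shape might look roughly as follows. This is an illustrative sketch only: the frontmatter keys shown are assumptions about gh-aw's markdown-plus-YAML format and should be checked against the gh-aw documentation before use.

```markdown
---
on:
  pull_request:
permissions:
  contents: read
  pull-requests: write
---

# Test Quality Sentinel

Analyze the new and changed tests in this pull request. For each test,
answer: what design invariant does it enforce, what would break if it
were deleted, and does it cover a behavioral contract or an
implementation detail? Post a "Test Quality Score" (0-100) comment and
fail the check if more than 30% of new tests are low-value
implementation tests.
```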

Acceptance Criteria

  • Workflow triggers on every PR
  • Detects implementation-detail tests, happy-path-only tests, test inflation, and duplication
  • AI agent answers the 3 quality questions per test
  • Posts "Test Quality Score" (0–100) comment with per-test feedback
  • Distinguishes "design tests" vs "implementation tests"
  • Fails check if >30% of new tests are low-value
  • Supports Python (pytest) and Rust (#[test])
  • Compiled .lock.yml generated via make recompile

Related to Blog analysis #25311
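The scoring and gating criteria above can be sketched with a small helper. Only the 0–100 scale and the 30% threshold come from the spec; the function names and the "score = share of design tests" formula are illustrative assumptions:

```python
def quality_score(design_tests: int, implementation_tests: int) -> int:
    """Share of high-value 'design tests' among new tests, scaled to 0-100."""
    total = design_tests + implementation_tests
    if total == 0:
        return 100  # no new tests: nothing to penalize
    return round(100 * design_tests / total)


def check_passes(design_tests: int, implementation_tests: int,
                 max_low_value_ratio: float = 0.30) -> bool:
    """Fail if more than 30% of new tests are low-value implementation tests."""
    total = design_tests + implementation_tests
    if total == 0:
        return True
    return implementation_tests / total <= max_low_value_ratio


print(quality_score(7, 3), check_passes(7, 3))  # 70 True  (exactly at 30%)
print(quality_score(5, 5), check_passes(5, 5))  # 50 False (above threshold)
```

Note the boundary choice: 30% exactly still passes, since the spec fails the check only when the low-value share exceeds 30%.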

Generated by Plan Command for issue #25311
