spike: sample merged PRs in agentv and classify eval-case yield #1156

Part of #1155.

## Objective

Validate the premise behind the PR/issue-mining direction by sampling real merged PRs in this repo and checking how many convert to useful eval cases. If yield falls below the thresholds in "Rule of thumb", the dependent work (the `agentv-eval-writer` skill-extension sub-issue) should be rescoped or dropped.

## Scope

- Take the 20 PRs most recently merged into `main`.
- For each, classify as one of:
  - **useful**: a plausible eval case — the PR title/body is a task spec, the diff represents a behavioral change an agent could reproduce.
  - **not useful**: typo fix, version bump, dep update, pure refactor with no behavior change, doc-only, or too small to be meaningful.
- Record classification with a one-line reason per PR.
- Report yield percentage.
- Recommendation: proceed, rescope, or drop.

## Acceptance signals

- A short markdown note (in this issue, as a comment, or linked from the research repo) with a table: PR number, one-line summary, classification, reason.
- Yield percentage computed.
- Proceed/rescope/drop recommendation with rationale.
- No code changes.

## Non-goals

- Not building any tooling to mine PRs programmatically — this is manual classification.
- Not evaluating quality of generated cases (we aren't generating any here).
- Not extending beyond 20 PRs unless initial signal is borderline.

## Rule of thumb

- ≥50% useful: proceed with the skill extension as proposed.
- 30% to <50%: proceed but narrow the scope (e.g., filter by label, commit message pattern, or PR size).
- <30%: rescope or drop — the premise doesn't hold for this codebase.
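
For reference, the bands above can be written out as a small sketch. This is only an illustration of the arithmetic, not tooling for the spike (classification stays manual). The function names `yield_pct` and `recommend` are hypothetical, and the sketch assumes that exactly 50% counts as "proceed" and exactly 30% counts as "narrow the scope".

```python
def yield_pct(useful: int, total: int) -> float:
    """Share of sampled PRs classified as useful, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= useful <= total:
        raise ValueError("useful must be between 0 and total")
    return 100.0 * useful / total


def recommend(pct: float) -> str:
    """Map a yield percentage onto the rule-of-thumb bands.

    Assumption: boundary values belong to the higher band
    (50% -> proceed, 30% -> narrow the scope).
    """
    if pct >= 50:
        return "proceed"
    if pct >= 30:
        return "proceed, narrowed scope"
    return "rescope or drop"
```

With a sample of 20, each PR moves the yield by 5 percentage points, so the cut-offs fall at 10 useful PRs (proceed) and 6 useful PRs (narrow the scope); 5 or fewer means rescope or drop.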

## Blocks

Sub-issue for `agentv-eval-writer` extension (see #1155).
