Implementation Plan: Tests for Experiment-Type Registry Expansion #2183

Merged
Trecek merged 2 commits into develop from auto-research-2-4-tests-for-experiment-type-registry-expansi/836
May 7, 2026

Trecek commented May 7, 2026

Summary

Extend tests/recipe/test_experiment_type_registry.py with a parameterized per-type schema-validation test for each of the 7 new experiment types and expand the is_silent_type false-case parametrize to cover all non-silent types. Most acceptance criteria from issue #836 are already satisfied by prior work items (#833, #834, #835); the remaining gaps are (1) a parameterized per-type schema validation test and (2) reaching the ≥9 parameterized cases threshold.
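
The two remaining gaps could be closed with tests shaped roughly like the sketch below. It is illustrative only: the registry contents, the placeholder type names, the `REQUIRED_FIELDS` tuple, and the signature of `is_silent_type` are assumptions made so the example runs on its own, since the PR does not show the real module. Only the names `NEW_TYPES`, `test_new_type_full_schema_valid`, `test_is_silent_type_false_for_standard_types`, and `is_silent_type`, and the counts (12 types, 7 new, 11 non-silent), come from this PR.

```python
import pytest

# Hypothetical stand-in for the real registry; names and fields are placeholders.
EXISTING_TYPES = tuple(f"existing_type_{i}" for i in range(1, 6))
NEW_TYPES = tuple(f"new_type_{i}" for i in range(1, 8))  # the 7 new types
SILENT_TYPES = frozenset({"existing_type_5"})  # assumed: exactly one silent type

# Assumed schema: every entry must carry these keys.
REQUIRED_FIELDS = ("name", "priority", "dimension_weight_rationale")

REGISTRY = {
    name: {
        "name": name,
        "priority": rank,
        "dimension_weight_rationale": "placeholder rationale",
        "silent": name in SILENT_TYPES,
    }
    for rank, name in enumerate(EXISTING_TYPES + NEW_TYPES)
}


def is_silent_type(name: str) -> bool:
    """Stand-in for the real helper; its actual signature is an assumption."""
    return REGISTRY[name]["silent"]


# Derive the false cases from the registry instead of hardcoding their count.
NON_SILENT_TYPES = sorted(n for n in REGISTRY if not is_silent_type(n))


@pytest.mark.parametrize("type_name", NEW_TYPES)
def test_new_type_full_schema_valid(type_name):
    entry = REGISTRY[type_name]
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    assert not missing, f"{type_name} is missing fields: {missing}"
    assert entry["name"] == type_name


@pytest.mark.parametrize("type_name", NON_SILENT_TYPES)
def test_is_silent_type_false_for_standard_types(type_name):
    assert is_silent_type(type_name) is False
```

Hoisting `NEW_TYPES` to module level lets both the parametrize decorator and any other test reuse one list, and deriving `NON_SILENT_TYPES` from the registry keeps the case count in step with the registry as types are added.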

Requirements

Extend tests/recipe/test_experiment_type_registry.py and related tests to cover the expanded 12-type registry, the priority-ordering mechanism, the silent-type handler, the cache-invalidation path, user-override schema mismatches, and the new dimension_weight_rationale field. Add integration test coverage for the full classification path.

Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

Architecture Impact

Changed Files

Modified (●):

● tests/recipe/test_experiment_type_registry.py
● src/autoskillit/recipes/contracts/merge-prs.json
● src/autoskillit/recipes/implementation-groups.json
● src/autoskillit/recipes/implementation.json
● src/autoskillit/recipes/merge-prs.json
● src/autoskillit/recipes/remediation.json
● src/autoskillit/recipes/research.json

Closes #836

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260507-113945-278449/.autoskillit/temp/make-plan/tests_experiment_type_registry_expansion_plan_2026-05-07_114500.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

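| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 3.0k | 12.0k | 924.0k | 76.5k | 97 | 63.4k | 6m 36s |
| verify | claude-opus-4-6 | 1 | 39 | 8.7k | 525.8k | 62.0k | 40 | 48.8k | 3m 28s |
| implement* | MiniMax-M2.7-highspeed | 1 | 311.3k | 3.6k | 503.2k | 29.8k | 42 | 16.2k | 5m 17s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 71.4k | 3.9k | 205.5k | 29.8k | 17 | 15.1k | 1m 24s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 45.4k | 1.6k | 205.5k | 29.8k | 14 | 15.0k | 48s |
| **Total** | | | 431.2k | 29.9k | 2.4M | 76.5k | | 158.5k | 17m 34s |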

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

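| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | | | |
| verify | 0 | | | |
| implement | 311 | 1617.9 | 52.0 | 11.7 |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| **Total** | **311** | 7601.3 | 509.7 | 96.1 |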
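
The per-LoC figures appear to be each token count divided by the lines changed, with the Total row dividing the tokens summed over all five steps by the 311 lines changed in `implement`. That reading is inferred from the numbers rather than stated in the report. The short sketch below reproduces the cache_read column under that assumption; `per_loc` is a hypothetical helper name, and the inputs are the rounded values from the Token Usage Summary, so results can differ from the report in the last digit.

```python
from typing import Optional


def per_loc(tokens: float, loc_changed: int) -> Optional[float]:
    """Tokens per changed line, or None when a step changed no lines."""
    if loc_changed == 0:
        return None
    return tokens / loc_changed


# cache_read per step, in tokens, as displayed (rounded) in the summary table.
cache_read = {
    "plan": 924_000,
    "verify": 525_800,
    "implement": 503_200,
    "prepare_pr": 205_500,
    "compose_pr": 205_500,
}
loc_changed = {"plan": 0, "verify": 0, "implement": 311, "prepare_pr": 0, "compose_pr": 0}

implement_ratio = per_loc(cache_read["implement"], loc_changed["implement"])
total_ratio = per_loc(sum(cache_read.values()), sum(loc_changed.values()))

print(f"implement cache_read/LoC: {implement_ratio:.1f}")
print(f"total cache_read/LoC: {total_ratio:.1f}")
```

Under this reading the Total ratio is much larger than the `implement` ratio because the planning and verification tokens are charged against the same 311 lines.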

Model Usage Breakdown

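| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 1 | 3.0k | 12.0k | 924.0k | 63.4k | 6m 36s |
| claude-opus-4-6 | 1 | 39 | 8.7k | 525.8k | 48.8k | 3m 28s |
| MiniMax-M2.7-highspeed | 3 | 428.2k | 9.1k | 914.2k | 46.3k | 7m 29s |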

Trecek and others added 2 commits May 7, 2026 12:08
…silent-type false-case coverage

- Hoist NEW_TYPES constant to module level to eliminate duplication
- Add test_new_type_full_schema_valid: 7-parametrize schema validation test
- Expand test_is_silent_type_false_for_standard_types: 3 → 11 cases
- Total new parameterized cases: 18 (7 + 11), exceeding ≥9 threshold

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…dcoding 11

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trecek added this pull request to the merge queue May 7, 2026
Merged via the queue into develop with commit 16329ef May 7, 2026
2 checks passed
Trecek deleted the auto-research-2-4-tests-for-experiment-type-registry-expansi/836 branch May 7, 2026 20:36
Trecek added a commit that referenced this pull request May 8, 2026
)

## Summary

Extend `tests/recipe/test_experiment_type_registry.py` with a
parameterized per-type schema-validation test for each of the 7 new
experiment types and expand the `is_silent_type` false-case parametrize
to cover all non-silent types. Most acceptance criteria from issue #836
are already satisfied by prior work items (#833, #834, #835); the
remaining gaps are (1) a parameterized per-type schema validation test
and (2) reaching the ≥9 parameterized cases threshold.

## Requirements

Extend `tests/recipe/test_experiment_type_registry.py` and related tests
to cover the expanded 12-type registry, the priority-ordering mechanism,
the silent-type handler, the cache-invalidation path, user-override
schema mismatches, and the new dimension_weight_rationale field. Add
integration test coverage for the full classification path.

## Conflict Resolution Decisions

The following files had merge conflicts that were automatically
resolved.

## Architecture Impact

## Changed Files

### Modified (●):

● tests/recipe/test_experiment_type_registry.py
● src/autoskillit/recipes/contracts/merge-prs.json
● src/autoskillit/recipes/implementation-groups.json
● src/autoskillit/recipes/implementation.json
● src/autoskillit/recipes/merge-prs.json
● src/autoskillit/recipes/remediation.json
● src/autoskillit/recipes/research.json

Closes #836

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/impl-20260507-113945-278449/.autoskillit/temp/make-plan/tests_experiment_type_registry_expansion_plan_2026-05-07_114500.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 3.0k | 12.0k | 924.0k | 76.5k | 97 | 63.4k | 6m 36s |
| verify | claude-opus-4-6 | 1 | 39 | 8.7k | 525.8k | 62.0k | 40 | 48.8k | 3m 28s |
| implement* | MiniMax-M2.7-highspeed | 1 | 311.3k | 3.6k | 503.2k | 29.8k | 42 | 16.2k | 5m 17s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 71.4k | 3.9k | 205.5k | 29.8k | 17 | 15.1k | 1m 24s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 45.4k | 1.6k | 205.5k | 29.8k | 14 | 15.0k | 48s |
| **Total** | | | 431.2k | 29.9k | 2.4M | 76.5k | | 158.5k | 17m 34s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 311 | 1617.9 | 52.0 | 11.7 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **311** | 7601.3 | 509.7 | 96.1 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 1 | 3.0k | 12.0k | 924.0k | 63.4k | 6m 36s |
| claude-opus-4-6 | 1 | 39 | 8.7k | 525.8k | 48.8k | 3m 28s |
| MiniMax-M2.7-highspeed | 3 | 428.2k | 9.1k | 914.2k | 46.3k | 7m 29s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>