Skip to content

Rectify: run_python Type-Erasure Boundary — Annotation-Aware Coercion#1972

Merged
Trecek merged 5 commits into
developfrom
annotate-pr-diff-has-100-failure-rate-int-vs-str-type-mismat/1965
May 6, 2026
Merged

Rectify: run_python Type-Erasure Boundary — Annotation-Aware Coercion#1972
Trecek merged 5 commits into
developfrom
annotate-pr-diff-has-100-failure-rate-int-vs-str-type-mismat/1965

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented May 6, 2026

Summary

The run_python MCP tool dispatches callables through _import_and_call using a dict[str, object] parameter — a type-erasure boundary. Direct MCP tools get Pydantic lax-mode validation via FastMCP (which coerces str→int automatically), but run_python callables receive raw unvalidated values. The dispatcher already calls inspect.signature(func) but only uses it for None→default coercion (PR #1602), never reading param.annotation. This creates an asymmetry where str-typed callable parameters receiving JSON integers crash at subprocess boundaries, f-string operations, or Path construction.

The architectural fix: extend _import_and_call's existing coercion loop to read param.annotation and coerce primitive scalar types, closing the type-safety gap between the two dispatch paths. Defense-in-depth: add str() guards at each subprocess call site. Structural tests: a parametrized test matrix that exercises every callable × every param type combination through _import_and_call.

Closes #1965

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260505-183247-696032/.autoskillit/temp/rectify/rectify_run_python_type_coercion_2026-05-05_183800.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step count uncached output cache_read peak_ctx turns cache_write time
rectify 1 46 10.4k 358.8k 104.0k 112 79.5k 8m 27s
dry_walkthrough 1 1.0k 9.6k 1.1M 58.1k 82 45.8k 4m 21s
implement 1 146.6k 46.2k 11.5M 19.0k 251 0 30m 28s
prepare_pr 1 39 7.7k 426.6k 44.2k 20 31.2k 2m 56s
compose_pr 1 23 1.5k 134.5k 25.9k 10 12.7k 38s
review_pr 1 28 35.7k 483.6k 80.8k 29 68.2k 8m 30s
resolve_review 1 42 8.7k 1.1M 59.3k 46 46.6k 7m 32s
ci_conflict_fix 1 28 2.2k 299.7k 37.3k 17 24.0k 59s
diagnose_ci 1 25 2.0k 177.9k 35.6k 10 23.4k 58s
resolve_ci 1 40 2.9k 474.7k 46.8k 26 33.8k 4m 10s
resolve_queue_merge_conflicts 1 37 10.6k 1.0M 73.7k 44 60.7k 3m 26s
Total 147.9k 137.6k 17.1M 104.0k 425.9k 1h 12m

Token Efficiency

Step LoC Changed cache_read/LoC cache_write/LoC output/LoC
rectify 0
dry_walkthrough 0
implement 289 39802.9 0.0 159.8
prepare_pr 0
compose_pr 0
review_pr 0
resolve_review 11 101826.2 4237.2 794.9
ci_conflict_fix 1392 215.3 17.3 1.6
diagnose_ci 0
resolve_ci 1 474733.0 33787.0 2930.0
resolve_queue_merge_conflicts 1384 744.8 43.9 7.6
Total 3077 5559.7 138.4 44.7

Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit PR Review — Verdict: approved_with_comments

Comment thread src/autoskillit/server/tools/tools_execution.py Outdated
Comment thread src/autoskillit/server/tools/tools_execution.py Outdated
Comment thread src/autoskillit/server/tools/tools_execution.py
Comment thread tests/server/test_tools_run_cmd.py
Comment thread tests/server/test_tools_run_cmd.py Outdated
Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit review: warning-only findings detected. See inline comments — no blocking changes required.

@Trecek Trecek force-pushed the annotate-pr-diff-has-100-failure-rate-int-vs-str-type-mismat/1965 branch from 021061a to c35cd0c Compare May 6, 2026 03:00
@Trecek Trecek added this pull request to the merge queue May 6, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch May 6, 2026
Trecek and others added 5 commits May 5, 2026 20:27
Coerce primitive scalar types at the _import_and_call dispatch boundary,
closing the type-safety gap between direct MCP tools (Pydantic-validated)
and run_python callables (raw dict). When LLM JSON passes an int for
a str-annotated param, _coerce_scalar now converts int→str, str→int,
str→float, int→float as needed, with structlog warnings for auditability.

Add str() guards at three subprocess call sites as defense-in-depth:
- smoke_utils.annotate_pr_diff: pr_number
- recipe._cmd_rpc.wait_for_direct_merge: pr_number
- recipe._cmd_rpc.force_push_and_wait_mergeability: review_pr_number

Fix implementation.yaml note that primed LLM to pass integer PR numbers.

Add parametrized test class TestImportAndCallTypeCoercion (8 cases) and
callable-level tests for annotate_pr_diff, wait_for_direct_merge, and
force_push_and_wait_mergeability with int inputs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…esolution

ARCH-003 requires all except-Exception blocks to include a logger call.
The get_type_hints fallback now logs a warning with exc_info.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rd in _coerce_scalar

- Remove redundant `import typing` inside _coerce_scalar (already at module level)
- Remove misleading `a is not None` clause from __args__ filters (type objects, not singletons)
- Add `isinstance(val, bool)` early-return guard to prevent bool→str coercion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nconvertible_value_not_coerced

Test asserts success=False — name now accurately reflects that coercion
is skipped and the callable fails, not that the value "passes through".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Trecek Trecek force-pushed the annotate-pr-diff-has-100-failure-rate-int-vs-str-type-mismat/1965 branch from d1944b1 to af1c482 Compare May 6, 2026 03:28
@Trecek Trecek added this pull request to the merge queue May 6, 2026
Merged via the queue into develop with commit b652b08 May 6, 2026
2 checks passed
@Trecek Trecek deleted the annotate-pr-diff-has-100-failure-rate-int-vs-str-type-mismat/1965 branch May 6, 2026 03:44
Trecek added a commit that referenced this pull request May 8, 2026
…#1972)

## Summary

The `run_python` MCP tool dispatches callables through
`_import_and_call` using a `dict[str, object]` parameter — a
type-erasure boundary. Direct MCP tools get Pydantic lax-mode validation
via FastMCP (which coerces `str→int` automatically), but `run_python`
callables receive raw unvalidated values. The dispatcher already calls
`inspect.signature(func)` but only uses it for `None→default` coercion
(PR #1602), never reading `param.annotation`. This creates an asymmetry
where `str`-typed callable parameters receiving JSON integers crash at
subprocess boundaries, f-string operations, or Path construction.

The architectural fix: extend `_import_and_call`'s existing coercion
loop to read `param.annotation` and coerce primitive scalar types,
closing the type-safety gap between the two dispatch paths.
Defense-in-depth: add `str()` guards at each subprocess call site.
Structural tests: a parametrized test matrix that exercises every
callable × every param type combination through `_import_and_call`.

Closes #1965

## Implementation Plan

Plan file:
`/home/talon/projects/autoskillit-runs/remediation-20260505-183247-696032/.autoskillit/temp/rectify/rectify_run_python_type_coercion_2026-05-05_183800.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via
AutoSkillit
<!-- autoskillit:pipeline-signature
steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr
-->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant