execution-estimation is a user-facing estimation skill for sizing engineering work before implementation. It produces deterministic guidance for:
- Story-point-style level of effort
- Expected or observed file and line churn
- Relative footprint versus the codebase baseline
- Blast radius and required quality controls
- Planning recommendation
- Decomposition recommendation
This skill is useful when you want a consistent pre-implementation answer to questions like:
- How big is this task?
- Is this risky even if the diff is small?
- Should I stop and make a plan before coding?
- Should this work be split into smaller items?
The skill supports two estimation modes:
- Diff-backed estimation: Use this when the work already exists in a branch or diff. The estimator reads the changed file list and actual line churn from git.
- Proposal-backed estimation: Use this when the work is still planned. The estimator reads a newline-delimited list of proposed files and infers likely churn from file types.
In both modes, the estimator returns deterministic JSON that includes:
- estimation.storyPoints
- estimation.confidence
- estimation.decompositionRecommended
- change.filesTouched
- change.linesChanged
- comparison.trackedFilesTouchedPct
- comparison.sourceLinesChangedPct
- risk.blastRadius
- planning
The planning object has this shape:
{
  "recommended": true,
  "matchedRules": [
    "high-blast-radius"
  ],
  "rationale": [
    "planning rule matched: high-blast-radius - blast radius is high enough that sequencing and controls should be decided before coding"
  ]
}
planning.recommended is always a boolean.
The skill has known limits:
- It does not automatically switch Codex into the app's actual Planning mode.
- It does not inspect runtime behavior or business impact beyond file-path and churn signals.
- It does not replace engineering judgment for ambiguous architecture choices.
Instead, it gives a deterministic recommendation for whether you should stop and plan before implementation.
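As a hedged illustration of consuming the planning object, the sketch below parses a sample payload and decides whether to pause for planning. The `should_plan_first` helper and the sample payload are assumptions made for this example; only the `planning` field names come from the shape shown earlier.

```python
import json

# Sample payload mirroring the documented planning shape (illustrative only).
SAMPLE_OUTPUT = """
{
  "planning": {
    "recommended": true,
    "matchedRules": ["high-blast-radius"],
    "rationale": [
      "planning rule matched: high-blast-radius - blast radius is high enough that sequencing and controls should be decided before coding"
    ]
  }
}
"""


def should_plan_first(estimate: dict) -> tuple[bool, list[str]]:
    """Return (recommended, matched rule ids) from an estimator result.

    Hypothetical helper: it is not part of the skill's scripts.
    """
    planning = estimate.get("planning", {})
    recommended = planning.get("recommended", False)
    # The README states that planning.recommended is always a boolean.
    if not isinstance(recommended, bool):
        raise TypeError("planning.recommended is documented as a boolean")
    return recommended, list(planning.get("matchedRules", []))


recommended, rules = should_plan_first(json.loads(SAMPLE_OUTPUT))
print(recommended, rules)
```

A caller would typically stop and draft a plan when the first element is true, and use the rule ids to explain why.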
The estimator scripts and the rubric live in the skill folder, not in the target repository:
- scripts/estimate_execution.py
- scripts/blast_radius.py
- scripts/planning_recommendation.py
- references/estimation-rubric.md
You always provide:
- --repo-root: path to the target git repository
You then choose exactly one mode:
- Diff mode: --base-ref and optional --head-ref
- Proposal mode: --proposed-files
To estimate from an existing diff, run:
python3 /path/to/execution-estimation/scripts/estimate_execution.py \
  --repo-root /path/to/repo \
  --base-ref origin/main \
  --head-ref HEAD
For proposal mode, create a file containing one proposed path per line:
src/api/users.ts
src/db/schema.sql
src/api/users.test.ts
Then run:
python3 /path/to/execution-estimation/scripts/estimate_execution.py \
  --repo-root /path/to/repo \
  --proposed-files /path/to/proposed-files.txt
Optional:
- --proposal-lines-changed <n>: overrides the inferred line-churn estimate in proposal mode
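If you call the estimator from another tool, a small wrapper can enforce the "exactly one mode" rule before the script is launched. This is a minimal sketch: `build_estimator_command` is a hypothetical helper, and only the flag names are taken from the usage described above.

```python
from typing import List, Optional


def build_estimator_command(
    script_path: str,
    repo_root: str,
    base_ref: Optional[str] = None,
    head_ref: Optional[str] = None,
    proposed_files: Optional[str] = None,
    proposal_lines_changed: Optional[int] = None,
) -> List[str]:
    """Build the argv list for one estimator run, enforcing exactly one mode."""
    diff_mode = base_ref is not None
    proposal_mode = proposed_files is not None
    if diff_mode == proposal_mode:
        raise ValueError("choose exactly one of base_ref or proposed_files")

    cmd = ["python3", script_path, "--repo-root", repo_root]
    if diff_mode:
        if proposal_lines_changed is not None:
            raise ValueError("proposal_lines_changed only applies to proposal mode")
        cmd += ["--base-ref", base_ref]
        if head_ref is not None:  # --head-ref is optional in diff mode
            cmd += ["--head-ref", head_ref]
    else:
        if head_ref is not None:
            raise ValueError("head_ref only applies to diff mode")
        cmd += ["--proposed-files", proposed_files]
        if proposal_lines_changed is not None:
            cmd += ["--proposal-lines-changed", str(proposal_lines_changed)]
    return cmd


print(build_estimator_command(
    "/path/to/execution-estimation/scripts/estimate_execution.py",
    "/path/to/repo",
    base_ref="origin/main",
    head_ref="HEAD",
))
```

The resulting list can be passed to `subprocess.run`. Parsing the result with `json.loads` would assume the JSON is written to standard output, which this README does not state explicitly.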
The estimator starts from line churn, then adds deterministic risk steps.
Base story points by lines changed:
- <= 60 -> 1
- 61-180 -> 2
- 181-450 -> 3
- 451-900 -> 5
- 901-1700 -> 8
- > 1700 -> 13
Risk steps are added for:
- Files touched >= 8
- Files touched >= 15
- Top-level directories touched >= 3
- Max single-file churn >= 300
- Binary changes present
- Proposal mode uncertainty
The total is then mapped upward to this fixed sequence:
1, 2, 3, 5, 8, 13
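The scoring described above can be sketched as follows. The churn bands and risk thresholds come from this README, but how a risk step changes the total is an assumption here: the sketch adds one point per matched step and then rounds up to the next value in the sequence. The real rule is defined in references/estimation-rubric.md, and all function names are hypothetical.

```python
import bisect

# Upper line-churn bound of each band and the matching base points.
BASE_BOUNDS = [60, 180, 450, 900, 1700]
BASE_POINTS = [1, 2, 3, 5, 8, 13]
SEQUENCE = [1, 2, 3, 5, 8, 13]


def base_story_points(lines_changed: int) -> int:
    """Look up base points from total lines changed."""
    return BASE_POINTS[bisect.bisect_left(BASE_BOUNDS, lines_changed)]


def count_risk_steps(files_touched: int, top_level_dirs: int,
                     max_file_churn: int, has_binary: bool,
                     proposal_mode: bool) -> int:
    """Count how many of the documented risk conditions match."""
    checks = [
        files_touched >= 8,
        files_touched >= 15,
        top_level_dirs >= 3,
        max_file_churn >= 300,
        has_binary,
        proposal_mode,
    ]
    return sum(bool(check) for check in checks)


def story_points(lines_changed: int, risk_steps: int) -> int:
    """Combine base points and risk steps, then map upward to the sequence."""
    # ASSUMPTION: each risk step adds one point before rounding up.
    total = base_story_points(lines_changed) + risk_steps
    index = bisect.bisect_left(SEQUENCE, total)
    return SEQUENCE[min(index, len(SEQUENCE) - 1)]  # cap at 13


steps = count_risk_steps(files_touched=9, top_level_dirs=2,
                         max_file_churn=120, has_binary=False,
                         proposal_mode=False)
print(story_points(200, steps))
```

Under these assumptions, a 200-line diff touching 9 files has 3 base points and one risk step, which rounds up to 5.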
Blast radius is separate from story points. A small change can still have a high blast radius.
The skill adds path-based signals for areas such as:
- Auth and security
- Database and schema
- Public API contracts
- Shared or multi-consumer code
- Runtime entrypoints
- Build and deploy paths
- Runtime configuration
It also adds structural signals for:
- Wide file fanout
- Cross-boundary changes
- Deep single-file churn
- Binary artifacts
Blast-radius levels:
- low
- medium
- high
- very-high
The estimator also returns recommended controls, such as:
- Targeted automated tests
- Integration or regression coverage
- Adjacent-boundary review
- Rollback or containment review
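To make the idea of path-based signals concrete, here is a minimal sketch of how changed paths might be mapped to sensitive areas. The keyword sets and signal names are invented for illustration and are not the rules used by scripts/blast_radius.py.

```python
from pathlib import PurePosixPath

# HYPOTHETICAL keyword map, invented for this example only.
PATH_SIGNALS = {
    "auth-security": {"auth", "security"},
    "database-schema": {"db", "schema", "migrations"},
    "public-api": {"api"},
}


def path_signals(paths):
    """Return the sorted signal names whose keywords match a path segment."""
    found = set()
    for path in paths:
        # Compare whole segments, with anything after the first dot dropped,
        # so "schema.sql" matches the keyword "schema".
        tokens = {part.split(".")[0].lower() for part in PurePosixPath(path).parts}
        for signal, keywords in PATH_SIGNALS.items():
            if tokens & keywords:
                found.add(signal)
    return sorted(found)


print(path_signals(["src/api/users.ts", "src/db/schema.sql"]))
```

Matching whole segments rather than substrings keeps a path such as `src/feedback/form.ts` from triggering the `db` keyword.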
Planning recommendation is a separate deterministic output. It answers one yes/no question:
Should you stop after estimation and explicitly plan before coding?
The answer lives at:
planning.recommended
It becomes true when any explicit planning rule matches:
- decomposition-recommended: the work item should already be split before implementation
- high-blast-radius: blast radius is high or very-high
- medium-blast-radius-shaping: blast radius is medium and the work is still proposal-mode or already mid-sized
- mid-sized-cross-boundary: story points are at least 5 and the change spans at least 3 top-level directories
- wide-proposal-change: proposal mode touches at least 8 files
- deep-single-file-churn: max single-file churn is at least 300
If no planning rule matches:
- planning.recommended is false
- planning.matchedRules is empty
- planning.rationale explains that direct execution is appropriate unless the user explicitly asks for a plan
Important:
- This is a binary recommendation output, not an enum.
- It does not flip the product into true Planning mode.
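The planning rules listed above can be sketched as a single function that collects matched rule ids. This is not the code in scripts/planning_recommendation.py: the function name and parameters are hypothetical, the rationale strings are omitted, and "already mid-sized" is assumed to mean story points of at least 5.

```python
def planning_recommendation(*, story_points: int, blast_radius: str,
                            proposal_mode: bool, files_touched: int,
                            top_level_dirs: int, max_file_churn: int,
                            decomposition_recommended: bool) -> dict:
    """Evaluate the documented planning rules and return the matched ids."""
    rules = []
    if decomposition_recommended:
        rules.append("decomposition-recommended")
    if blast_radius in ("high", "very-high"):
        rules.append("high-blast-radius")
    # ASSUMPTION: "mid-sized" is read as story_points >= 5.
    if blast_radius == "medium" and (proposal_mode or story_points >= 5):
        rules.append("medium-blast-radius-shaping")
    if story_points >= 5 and top_level_dirs >= 3:
        rules.append("mid-sized-cross-boundary")
    if proposal_mode and files_touched >= 8:
        rules.append("wide-proposal-change")
    if max_file_churn >= 300:
        rules.append("deep-single-file-churn")
    # recommended is a plain boolean: true when any rule matched.
    return {"recommended": bool(rules), "matchedRules": rules}


print(planning_recommendation(
    story_points=2, blast_radius="high", proposal_mode=False,
    files_touched=2, top_level_dirs=1, max_file_churn=40,
    decomposition_recommended=False,
))
```

A small diff with high blast radius still yields a recommendation here, which mirrors the point that size and risk are scored separately.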
The skill recommends splitting work when any of these are true:
- Story points >= 8
- Files touched >= 18
- Lines changed >= 1500
When decomposition is recommended, split by workflow boundary or risk boundary, for example:
- Resolver, integration, tests
- Schema, runtime, integration
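The decomposition thresholds above reduce to a simple check. The sketch below uses a hypothetical `decomposition_reasons` helper that also reports which thresholds were crossed, which can help when deciding where to split.

```python
def decomposition_reasons(story_points: int, files_touched: int,
                          lines_changed: int) -> list:
    """Return the documented thresholds that the work item crosses."""
    reasons = []
    if story_points >= 8:
        reasons.append("story points >= 8")
    if files_touched >= 18:
        reasons.append("files touched >= 18")
    if lines_changed >= 1500:
        reasons.append("lines changed >= 1500")
    return reasons


reasons = decomposition_reasons(story_points=5, files_touched=20, lines_changed=700)
# Splitting is recommended when at least one threshold is crossed.
print(bool(reasons), reasons)
```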
Typical output sections:
- codebase: baseline repository size used for comparison
- change: the direct footprint of the diff or proposal
- comparison: percent of the repository touched
- risk.blastRadius: risk signals, level, controls, and investigation areas
- planning: boolean recommendation plus matched rules and rationale
- estimation: story points, confidence, rationale, and decomposition guidance
Example interpretation:
- High story points plus low blast radius means the work is large but locally contained.
- Low story points plus high blast radius means the change is small but touches sensitive boundaries.
- planning.recommended = true means you should stop after estimation and produce an execution plan before coding.
- planning.recommended = false means direct execution is appropriate unless the user explicitly asks for a plan.
- decompositionRecommended = true means the work should be split into smaller items.
Confidence depends on the estimation mode:
- high: diff-backed estimate
- medium: proposal-backed estimate
Proposal mode is intentionally more conservative because the actual diff does not exist yet.
- Gather the target repo and either a real diff or a proposed file list.
- Run the estimator.
- Report the JSON fields, not just a single story-point number.
- Use blast radius to choose test depth and review breadth.
- Use planning recommendation to decide whether to stop and plan.
- Use decomposition recommendation to decide whether to split the work item.
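To report more than a single story-point number, the workflow above can end with a short summary built from the JSON. This sketch uses a made-up sample result, a hypothetical `summarize_estimate` helper, and assumes the blast-radius level is stored under a `level` key inside risk.blastRadius; check the estimator's actual output before relying on that key.

```python
# Illustrative sample only; the values are not real estimator output.
SAMPLE_ESTIMATE = {
    "estimation": {"storyPoints": 3, "confidence": "high",
                   "decompositionRecommended": False},
    "change": {"filesTouched": 4, "linesChanged": 210},
    "risk": {"blastRadius": {"level": "medium"}},
    "planning": {"recommended": False, "matchedRules": []},
}


def summarize_estimate(estimate: dict) -> str:
    """Format the main estimator fields as a short multi-line report."""
    estimation = estimate["estimation"]
    change = estimate["change"]
    planning = estimate["planning"]
    blast = estimate["risk"]["blastRadius"]
    # ASSUMPTION: the level lives under a "level" key of risk.blastRadius.
    level = blast.get("level", "unknown") if isinstance(blast, dict) else blast
    lines = [
        f"Story points: {estimation['storyPoints']} (confidence: {estimation['confidence']})",
        f"Footprint: {change['filesTouched']} files, {change['linesChanged']} lines",
        f"Blast radius: {level}",
        f"Plan before coding: {'yes' if planning['recommended'] else 'no'}",
        f"Split the work item: {'yes' if estimation['decompositionRecommended'] else 'no'}",
    ]
    return "\n".join(lines)


print(summarize_estimate(SAMPLE_ESTIMATE))
```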
Files in the skill folder:
- SKILL.md: agent instructions for using the skill
- references/estimation-rubric.md: thresholds and deterministic rules
- scripts/estimate_execution.py: main estimator
- scripts/blast_radius.py: blast-radius logic
- scripts/planning_recommendation.py: planning recommendation logic