Refocus forecast output on effective-token predictions by removing yield and episode metrics #34750
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #34750 does not have the 'implementation' label and has only 2 new lines of code in business logic directories (threshold is 100).
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅
✅ PR Code Quality Reviewer completed the code quality review.
🧪 Test Quality Sentinel completed test quality analysis. No test functions were added or modified in PR #34750. The only test file change (pkg/cli/forecast_test.go) removes a single field (Yield: 1) from an existing test fixture to match the production code change that removed yield metrics from forecast output. No new tests to analyze; Test Quality Sentinel skipped.
Pull request overview
This PR narrows gh aw forecast output to focus on effective-token (ET) projections by removing throughput (yield) and episode-oriented metrics from both the JSON output contract and the console rendering.
Changes:
- Removed Yield and EpisodeAnalysis from ForecastWorkflowResult and eliminated their computation in forecastWorkflow.
- Simplified console output by removing the Yield/Period column and deleting episode-analysis rendering.
- Updated forecast rendering tests to match the new ET-focused table/output shape.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/forecast.go | Removes yield + episode summary types/logic and updates the forecast table rendering to be ET-first. |
| pkg/cli/forecast_test.go | Updates table rendering test data to no longer include Yield. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 0
Approved — clean scope reduction
All removal sites are consistent: struct fields, calculations, rendering, and test fixtures are updated together. No orphaned references, no dead code, no logic errors introduced.
Review notes
- ForecastEpisodeSummary, Yield, and EpisodeAnalysis are removed uniformly across the struct definition, forecastWorkflow computation, forecastTableRow rendering, and the test fixture.
- workflowRunToRunData and buildForecastEpisodeSummary are only used by the removed path, so their deletion is correct.
- ProjectedEffectiveTokens (based on ObservedRunsPerPeriod x AvgEffectiveTokens) is unchanged and remains the primary point estimate.
- The yield / episode_analysis removal is a documented, intentional breaking change in the JSON output contract, noted explicitly in the PR description.
- One minor observation: the only existing test removes Yield: 1 but does not assert that the Yield/Period header no longer appears in rendered output. Not blocking.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
proxy.golang.org
To allow these domains, add them to the network.allowed list in your workflow frontmatter:
network:
  allowed:
    - defaults
    - "proxy.golang.org"
See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 1.6M
Skills-Based Review 🧠
Applied /zoom-out and /improve-codebase-architecture — clean deletion PR with one notable callout.
📋 Key Themes & Highlights
Key Themes
- Breaking JSON output contract (yield, episode_analysis removed): worth a deprecation note or version bump if any consumers exist.
- Test coverage signal: the removed episode/yield paths had effectively zero dedicated test coverage, and the remaining ET-focused contract could benefit from a JSON-shape regression guard.
Positive Highlights
- ✅ Net −149/+2: surgical, focused deletion — no accidental scope creep.
- ✅ File-level comment updated accurately to reflect the new ET-only focus.
- ✅ buildForecastEpisodeSummary, workflowRunToRunData, and printEpisodeBreakdown removed cleanly with no dangling references.
- ✅ Monte Carlo / experiment-variant output paths left intact and unaffected.
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.3M
@@ -97,8 +76,6 @@ type ForecastWorkflowResult struct {
	// SuccessRate is the fraction of sampled runs that completed successfully (0–1).
	SuccessRate float64 `json:"success_rate"`
[/zoom-out] Removing yield and episode_analysis from ForecastWorkflowResult is a breaking change to the JSON output contract (gh aw forecast --json). Any scripts or dashboards that parse these fields will silently lose data after the update.
💡 Suggestion
Consider one of:
- Add a one-release deprecation notice in the PR body and the command's help text before removing the fields entirely.
- Keep the fields present but always-zero / always-nil for one minor release cycle (with a deprecation comment in the struct) so consumers have time to migrate.
- Bump a minor version in the CLI or document the removal in a CHANGELOG / migration guide.
If forecast is experimental / pre-v1 and no stability guarantee exists, a short note in the PR body (or docs/) confirming that is sufficient.
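The second option (keeping the fields for one release) might look roughly like the sketch below. It uses a trimmed-down, hypothetical copy of the result type (forecastResultCompat). Only the JSON names yield, episode_analysis, and success_rate come from this PR. The workflow_id tag, the field types, and the episodeSummary placeholder are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// episodeSummary is a hypothetical placeholder for the removed episode summary type.
type episodeSummary struct{}

// forecastResultCompat is a hypothetical, trimmed-down result type for illustration only.
type forecastResultCompat struct {
	WorkflowID  string  `json:"workflow_id"` // tag name is an assumption
	SuccessRate float64 `json:"success_rate"`

	// Deprecated: always 0; kept for one release so consumers can migrate.
	Yield float64 `json:"yield"`
	// Deprecated: always null; kept for one release so consumers can migrate.
	EpisodeAnalysis *episodeSummary `json:"episode_analysis"`
}

// marshalCompat encodes the result and returns the JSON as a string.
func marshalCompat(r forecastResultCompat) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	out, err := marshalCompat(forecastResultCompat{WorkflowID: "my-wf", SuccessRate: 0.9})
	if err != nil {
		panic(err)
	}
	// The deprecated keys stay present with zero/null values.
	fmt.Println(out)
}
```

Leaving omitempty off the deprecated fields is deliberate in this sketch: the keys stay present, so consumers that index into them keep working while they migrate.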
@@ -229,7 +229,6 @@ func TestRenderForecastTable_ZeroMonteCarloRangeRendersDash(t *testing.T) {
	WorkflowID: "smoke-copilot",
[/improve-codebase-architecture] The test change is a single one-liner removal (Yield: 1). The deleted EpisodeAnalysis path and buildForecastEpisodeSummary function had zero dedicated tests — this is noted here not as a blocker, but as a signal that the removed code was lightly validated before removal.
💡 Suggestion
This is not blocking the merge, but for the remaining forecast output contract (MonteCarlo P10/P50/P90, ProjectedEffectiveTokens) consider adding a small table-driven test that asserts the JSON shape directly — e.g.:
func TestForecastWorkflowResultJSONShape(t *testing.T) {
	r := ForecastWorkflowResult{
		WorkflowID:  "my-wf",
		SuccessRate: 0.9,
		MonteCarlo:  &ForecastMonteCarloSummary{P50ProjectedEffectiveTokens: 1000},
	}
	b, err := json.Marshal(r)
	require.NoError(t, err)
	assert.NotContains(t, string(b), "yield", "removed field must not appear")
	assert.NotContains(t, string(b), "episode_analysis", "removed field must not appear")
}
This acts as a regression guard against accidentally re-introducing removed JSON fields.
This change narrows gh aw forecast to its primary signal: effective-token projections. Forecast output no longer mixes in throughput (yield) or episode-oriented breakdowns, so results are easier to compare and consume for token planning.
Scope reduction: output contract
- Removed yield from ForecastWorkflowResult and JSON output.
- Removed episode_analysis from ForecastWorkflowResult and JSON output.
- Kept the ET-focused fields (avg_effective_tokens, projected_effective_tokens, Monte Carlo P10/P50/P90).
Console rendering: ET-first table
- Removed the Yield/Period column from the forecast table.
Forecast pipeline cleanup
- Removed yield and episode computation from forecastWorkflow.
Tests
- Updated forecast tests to match (removed the Yield field in forecast table fixtures).
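For orientation, the review notes describe the point estimate as ObservedRunsPerPeriod x AvgEffectiveTokens. Below is a minimal sketch of that arithmetic, assuming float64 inputs. The helper name projectEffectiveTokens is hypothetical and does not appear in the PR.

```go
package main

import "fmt"

// projectEffectiveTokens is a hypothetical helper: observed runs per period
// multiplied by the average effective tokens per run.
func projectEffectiveTokens(observedRunsPerPeriod, avgEffectiveTokens float64) float64 {
	return observedRunsPerPeriod * avgEffectiveTokens
}

func main() {
	// 12 runs per period at an average of 2,500 effective tokens each.
	fmt.Println(projectEffectiveTokens(12, 2500)) // prints 30000
}
```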