Bump skill turn caps (review 30→200, gen 500→1000) #789
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -548,14 +548,16 @@ jobs: | |
| # authors write at open time. GH_TOKEN for auth is already | ||
| # in the job env at the top of this workflow. | ||
| # | ||
| - # --max-turns 500: observed gen baselines are 89 turns | ||
| - # (silent) to 397 (full content rebuild). 500 gives headroom | ||
| - # over the worst legitimate run, while clipping a genuine | ||
| - # runaway before it spirals. Hitting the cap produces a | ||
| - # loud failure -- raise deliberately if a release needs more. | ||
| + # --max-turns 1000: observed gen baselines are 20 turns | ||
| + # (silent) to 152 (full content rebuild). 500 was the | ||
| + # initial cap; bumped to 1000 for extra headroom on | ||
| + # multi-feature releases and to stay well above the | ||
| + # suspected-looping 397-turn v3-test run (still clips | ||
| + # genuine runaways). Hitting the cap produces a loud | ||
| + # failure -- raise deliberately if a release needs more. | ||
| claude_args: | | ||
| --model claude-opus-4-7 | ||
| - --max-turns 500 | ||
| + --max-turns 1000 | ||
| --allowed-tools "Bash(gh:*)" | ||
| prompt: | | ||
| You are running in GitHub Actions with no interactive user. Follow | ||
|
|
@@ -738,12 +740,18 @@ jobs: | |
| display_report: true | ||
| # gh access parallels skill_gen so the review pass can | ||
| # re-verify claims against PR descriptions and linked | ||
| - # issues if needed. --max-turns 30 is 6x the 4-5-turn | ||
| - # baseline; if review ever needs more, the cap fails | ||
| - # loudly and we raise it. | ||
| + # issues if needed. | ||
| + # | ||
| + # --max-turns 200: initial cap of 30 was sized against | ||
| + # silent-release baselines (4-6 turns) and was too tight | ||
| + # for real content reviews. v0.24.0 (PR #788) hit it at | ||
| + # turn 31 mid-review and failed the run; the editorial | ||
| + # pass genuinely needs ~30-100 turns to walk a multi- | ||
| + # file content PR. 200 gives 2x-6x headroom over that | ||
| + # working range while still clipping a runaway. | ||
| claude_args: | | ||
| --model claude-opus-4-7 | ||
| - --max-turns 30 | ||
| + --max-turns 200 | ||
| --allowed-tools "Bash(gh:*)" | ||
| prompt: | | ||
| You are running in GitHub Actions with no interactive user. Follow | ||

Comment on lines +745 to +751
The updated rationale gives an “observed baseline” range of 20–152 turns while also referencing a 397-turn run. This is internally inconsistent and makes the headroom goal hard to interpret. Consider rephrasing to separate “typical/observed” runs from “anomalous” ones (or update the baseline range to include the 397-turn outlier if it is being treated as legitimate).
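
One way to make the typical-versus-anomalous split concrete is to compute the headroom from the numbers quoted in the two workflow comments. The sketch below is illustrative only: `CapRationale` and its fields are hypothetical names, not part of the workflow or any tool it invokes. The turn counts are copied from the comments in this diff, and treating the 397-turn run as anomalous is an assumption taken from the "suspected-looping" wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapRationale:
    """Bookkeeping for a --max-turns cap (hypothetical helper, not workflow code)."""

    cap: int
    typical_min: int
    typical_max: int
    anomalous: tuple[int, ...] = ()  # runs treated as outliers, not baselines

    def headroom(self) -> tuple[float, float]:
        """Return (cap / typical_max, cap / typical_min)."""
        return self.cap / self.typical_max, self.cap / self.typical_min

    def clipped(self) -> list[int]:
        """Anomalous runs whose turn count exceeds the cap."""
        return [turns for turns in self.anomalous if turns > self.cap]


# Figures quoted in the new workflow comments.
gen = CapRationale(cap=1000, typical_min=20, typical_max=152, anomalous=(397,))
review = CapRationale(cap=200, typical_min=30, typical_max=100)

for name, rationale in (("gen", gen), ("review", review)):
    low, high = rationale.headroom()
    print(
        f"{name}: cap={rationale.cap} "
        f"headroom={low:.1f}x-{high:.1f}x clipped={rationale.clipped()}"
    )
```

With these inputs the 397-turn run sits below the 1000 cap, so it would not be clipped. That is worth stating explicitly in the comment if the run really is suspected of looping.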