blog + skill: Spring AI advisor post + Phase 9 codex-loop rewrite #642
Merged
amavashev merged 7 commits on May 14, 2026
Walks through how cycles-spring-ai-starter 0.3.1 inserts the reserve-commit-release lifecycle into Spring AI's advisor chain: ChatClientCustomizer wiring at HIGHEST_PRECEDENCE+100, Flux.defer streaming with fail-closed commit in concatWith, SubjectResolver for per-tenant attribution (v0.3.0), jtokkit-backed estimator with canonical USD_MICROCENTS math, opt-in CyclesToolGate.wrap for tool callbacks, and cycles.reservation_id on Micrometer observations for trace↔reservation correlation. Complements (not duplicates) how-scalerx-wired-cycles-into-a-java-agent-runtime.md: that post covers @Cycles on raw OpenAI in plain Spring Boot; this one is Spring AI-native at the advisor-chain layer. All claims fact-checked against the runcycles/cycles-spring-ai-starter README and the v0.3.1/v0.3.0/v0.2.0 release notes.
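The reserve-commit-release lifecycle named above can be modelled in a few lines of plain Java. This is a minimal sketch, not the starter's code: `BudgetClient`, `RecordingBudgetClient`, `LifecycleSketch`, and `callWithBudget` are hypothetical names invented for illustration. Only the shape (one reserve, one delegation, then either a commit or a release) comes from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical budget client; the real starter's API may look different.
interface BudgetClient {
    String reserve(long estimatedCost);              // returns a reservation id
    void commit(String reservationId, long actualCost);
    void release(String reservationId);
}

// Records every wire call so the ordering can be inspected.
class RecordingBudgetClient implements BudgetClient {
    final List<String> calls = new ArrayList<>();

    public String reserve(long estimatedCost) { calls.add("reserve"); return "r-1"; }
    public void commit(String reservationId, long actualCost) { calls.add("commit"); }
    public void release(String reservationId) { calls.add("release"); }
}

class LifecycleSketch {
    // Reserve, delegate once, then commit on success or release on failure.
    static long callWithBudget(BudgetClient budget, long estimate, Supplier<Long> delegate) {
        String id = budget.reserve(estimate);
        long actual;
        try {
            actual = delegate.get();   // the single delegation; yields the actual cost
        } catch (RuntimeException e) {
            budget.release(id);        // the delegate failed, so nothing is committed
            throw e;
        }
        // The commit sits outside the try block on purpose: this sketch makes
        // no claim about how the starter handles a commit that itself fails.
        budget.commit(id, actual);
        return actual;
    }

    public static void main(String[] args) {
        RecordingBudgetClient ok = new RecordingBudgetClient();
        callWithBudget(ok, 100, () -> 80L);
        System.out.println(ok.calls);       // [reserve, commit]

        RecordingBudgetClient failing = new RecordingBudgetClient();
        try {
            callWithBudget(failing, 100, () -> { throw new IllegalStateException("model down"); });
        } catch (IllegalStateException expected) {
            System.out.println(failing.calls);  // [reserve, release]
        }
    }
}
```

Each run makes exactly two wire calls, which is one way to read "reserve + commit-or-release" in the review notes.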
Applied per blog-process rule (evaluate on merit, not auto-apply). 13 applied / 2 modified / 1 scope-skip / 2 deferred to round 2.

Factual:
- Lifecycle: three wire calls (reserve + commit-or-release) + one delegation, not four wire calls
- Drop "sees every retry" claim; clarify spring.ai.retry wraps the ChatModel below the advisor, so advisor sees one logical call
- Clarify chatClient.prompt(...).stream() returns a stream-spec; .chatResponse() yields Flux<ChatResponse>
- Soften "4x low" for CJK to "several times low"

Code:
- Mark streaming pseudocode as illustrative; note element-type adaptation on concatWith elided
- Note that release-on-commit-failure trades cost-side accuracy for clean reservation-state invariant
- MethodToolCallback example: explicit comment marking it illustrative and noting it needs a reflected Method and toolObject(...)

Clarity:
- Fix inverted "trade explicitness for silent under-billing"; now "choose explicit startup signal over silent under-billing"
- End-to-end: qualify "reservation IDs appear" to require explicit convention attachment + emit-reservation-id-on-trace enabled
- End-to-end: scope "no call-site code changes" to chat calls; tool gating and convention attachment touch wiring

Overclaim/tone:
- Tool fixed-price assumption: qualify "approximated with one number" and note variable-cost APIs need a future extension
- "where Spring AI itself wants it" → "at the framework's own advisor extension point"
- "since this bit a release" → factual reference to v0.3.1 patch

Skipped:
- ChatResponse.Usage typing: upstream starter README uses this shorthand; staying consistent with source-of-truth wording

Deferred to round 2:
- ALLOW_WITH_CAPS scope question (out of post scope)
- Spring AI tool auto-decoration claim verification

Title 50/51, description 154/160 unchanged.
3 findings, all applied:
- Factual: drop "retry" from the list of advisor-chain responsibilities. Round 1 already moved retry to the model layer below the advisor; leaving it in the chain list was a contradiction
- Overclaim: drop "small amount of" from the commit-failure release tradeoff. For a long streamed response, the uncommitted cost can be the whole cost
- Code: replace null placeholders in MethodToolCallback example with named parameters (Method reflectedMethod, WeatherService weatherTarget) so the snippet is illustrative, not a copy-and-break

Round 1 deferred items were accepted by codex without re-litigation: ChatResponse.Usage shorthand stands (matches upstream README wording), ALLOW_WITH_CAPS scope cut stands, Spring AI tool auto-decoration claim stands.

Title 50/51, description 154/160 unchanged.
Codex review loop completed: 3 rounds, SHIP verdict

Round 1: 16 findings + 2 open questions
Applied 13, modified 2, skipped 1, deferred 2 to round 2.

Round 2: 3 findings, all applied
Codex accepted all 3 round-1 deferred/skipped items without re-litigation.

Round 3: SHIP
No new findings.

Title 50/51, description 154/160 throughout. Ready for human review.
The skill's Phase 9 said "apply external reviewer feedback precisely." That contradicted the project's blog-process rule (feedback is input, not directive; evaluate on merit, push back when warranted) and over-softened several spec-backed claims in past reviewer rounds.

Phase 9 is now split:
- Lead: explicit evaluate / apply / modify / skip rule with a one-line reason required before touching the file, preserved in both the conversation reply and the commit message.
- Phase 9a: codex-cli (0.130.0) as automated external reviewer via `codex exec --sandbox read-only` + `codex exec resume --last`. Captures the 0.130.0 gotcha that resume does NOT accept --sandbox or --cd; those inherit from the original session, and passing them errors out. Loop until SHIP or stylistic-only; cap at 4 rounds.
- Phase 9b: human external reviewer fallback (the original user-relay flow) with the same evaluate-on-merit rule.

Validated against #642 (Spring AI advisor post): 3 codex rounds, 19 findings, 18 applied/modified, 1 scope-skipped with reason, codex converged to SHIP in round 3.

Both skill copies (.claude/skills/blog and .agents/skills/blog) updated identically.
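The loop rule in Phase 9a (stop on SHIP or stylistic-only, cap at 4 rounds) can be sketched as a small control loop. This is an illustrative model under stated assumptions: `ReviewLoopSketch`, `Verdict`, and the `reviewer` function are hypothetical stand-ins for whatever invokes the external reviewer. Nothing here calls the codex CLI.

```java
import java.util.function.IntFunction;

class ReviewLoopSketch {
    // Hypothetical verdict categories; the skill's real output format may differ.
    enum Verdict { SHIP, STYLISTIC_ONLY, FINDINGS }

    static final int MAX_ROUNDS = 4;  // the cap stated in the commit message

    // Runs review rounds until the reviewer converges or the cap is reached.
    // `reviewer` maps a 1-based round number to that round's verdict.
    // Returns the number of rounds actually run.
    static int run(IntFunction<Verdict> reviewer) {
        for (int round = 1; round <= MAX_ROUNDS; round++) {
            Verdict verdict = reviewer.apply(round);
            if (verdict == Verdict.SHIP || verdict == Verdict.STYLISTIC_ONLY) {
                return round;  // converged: stop looping
            }
            // Otherwise each finding would be evaluated on merit
            // (apply / modify / skip, with a one-line reason) before the next round.
        }
        return MAX_ROUNDS;  // cap reached without convergence
    }

    public static void main(String[] args) {
        // Mirrors the shape of the #642 run: findings in rounds 1 and 2, SHIP in round 3.
        int rounds = run(round -> round < 3 ? Verdict.FINDINGS : Verdict.SHIP);
        System.out.println(rounds);  // 3
    }
}
```

Keeping the cap as a hard loop bound means a reviewer that never converges still terminates, at which point a human decides what to do with the remaining findings.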
…isor post

Verified both findings against runcycles/cycles-spring-ai-starter/.../advisor/CyclesBudgetStreamAdvisor.java. Both correct; both applied.

Finding 1 (medium): streaming commit-failure release path

Reviewer was right. The actual source attaches doOnError to the upstream Flux BEFORE concatWith adds the commit Mono. In Reactor, that means doOnError observes upstream terminal signals only: commit-Mono errors propagate to the subscriber as onError but do NOT re-trigger the upstream's doOnError. The source's own javadoc confirms this scope: "commit failures in fail-closed mode propagate as onError to subscribers correctly", with no mention of release on commit failure.

So the post's earlier claim "the doOnError path then releases the reservation rather than leaving it stranded" was wrong, and the "clean reservation-state invariant" framing was wrong. On commit failure the reservation is NOT explicitly released; cleanup relies on the server's reservation TTL expiry. The pseudocode also had the wrong operator order (doOnError after concatWith), which contradicted the actual source.

Fix:
- Reorder pseudocode to match source: doOnNext → doOnError → doOnCancel → concatWith(commitMono)
- Rename concatWith arg to commitThenEmptyOrError(...) to make the failure path visible
- Rewrite the prose to: (a) note doOnError observes upstream errors only, (b) say commit-failure cleanup relies on server-side TTL expiry, (c) frame the tradeoff honestly: fail-closed on subscriber signal, deliberate non-handling of commit-failure release

Finding 2 (low): "tool that internally calls an LLM" overclaim

Reviewer was right. "Goes through the chat advisor" is only true if the tool uses the auto-configured ChatClient. Tools that use a raw provider SDK or build a custom ChatClient.Builder without the starter's ChatClientCustomizer bypass the advisor entirely.

Fix: qualify the claim to "via the auto-configured ChatClient" and add an explicit note that bypassing tools (raw SDK / custom builder) get neither the tool-gate commit nor the chat-advisor reservation, leaving their LLM cost invisible to Cycles.

Title 50/51, description 154/160 unchanged.
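The scoping point in Finding 1 can be illustrated without Reactor. The sketch below is a plain-Java analogy built from nested try/catch, not the starter's code and not Reactor operators. The inner catch plays the role of a doOnError attached to the upstream before the commit is appended, and the outer catch plays the role of the subscriber. `StreamOrderingModel` and the log labels are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class StreamOrderingModel {
    // Returns the ordered log of what happened for a given failure scenario.
    static List<String> run(boolean upstreamFails, boolean commitFails) {
        List<String> log = new ArrayList<>();
        try {
            // Stage 1: the upstream, with an error hook scoped to this stage only.
            try {
                if (upstreamFails) {
                    throw new IllegalStateException("upstream");
                }
                log.add("upstream complete");
            } catch (RuntimeException e) {
                log.add("release");  // the hook fires only for upstream errors
                throw e;             // the error still reaches the subscriber
            }
            // Stage 2: the commit, appended after the upstream completes.
            // It is outside the inner try, so the release hook cannot see it.
            if (commitFails) {
                throw new IllegalStateException("commit");
            }
            log.add("commit");
            log.add("subscriber onComplete");
        } catch (RuntimeException e) {
            log.add("subscriber onError: " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false, false)); // [upstream complete, commit, subscriber onComplete]
        System.out.println(run(true, false));  // [release, subscriber onError: upstream]
        System.out.println(run(false, true));  // [upstream complete, subscriber onError: commit]
    }
}
```

In the third scenario the subscriber sees the failure but no release is logged, which matches the corrected prose: a commit failure is surfaced to the subscriber while the reservation is left for server-side TTL expiry.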
Two changes to .claude/skills/blog/SKILL.md and .agents/skills/blog/SKILL.md (kept in sync):

1. Phase 4 (Claude internal Cycle 1): split the fact-check step into a text-claim fact-check and a separate source-code audit step. The source-code audit fetches the actual upstream files (gh api / base64 -d) and verifies operator order in reactive/async code, method signatures, error/release paths, fluent-builder requirements, and any quoted identifier. Calls out explicitly that the post's own pseudocode is NOT ground truth; it is the thing being audited.
2. Phase 9a (codex external reviewer prompt template): expanded the prompt skeleton to require codex to name the upstream source repos and fetch the relevant source files BEFORE judging code-level claims. Same verification list as Phase 4. Added a "why this is mandatory" paragraph citing the PR #642 miss (Reactor doOnError/concatWith operator-order bug that shipped through 3 codex rounds + Claude cycles 1-3 before a sibling codex session with a broader prompt caught it).

Rationale: prose-only audits and README cross-checks miss bugs where the prose matches the README's surface description but contradicts the actual source code. The fix is to make source-fetching mandatory in the prompts.

No app code or blog content changed by this commit.
…oring
Five changes to .claude/skills/blog/SKILL.md and
.agents/skills/blog/SKILL.md (kept in sync):
1. Add a top-level "Review goal" section listing the eight
   quality dimensions (factual, credibility, cross-links, SEO,
   code accuracy, structure & flow, terminology, tone & style)
   that EVERY review pass (Claude internal, codex, and human)
   must cover. No dimension is "owned" by a single phase. Closes
   the gap where a clean factual review could ship past a tone
   or SEO issue.
2. Phase 4 (Cycle 1) expanded from 4 parallel agents to 5: added
   an explicit "Style / tone / terminology audit" agent covering
   dims 6, 7, 8. Each agent annotated with which dimensions it
   owns, so the parallel set jointly covers all eight.
3. Phase 5 (Cycle 2) re-read checklist made explicit: flow &
   integration, consistency, softening of absolutes, filler
   removal, split into four numbered checks instead of one
   prose line.
4. Phase 9a (codex prompt template) now requires:
   - Comprehensive coverage of all eight dimensions, with output
     bucketed by FACTUAL / OVERCLAIM / CROSS-LINKS / SEO / CODE
     / STRUCTURE / TERMINOLOGY / TONE / OPEN QUESTIONS: one
     bucket per dimension, NONE allowed if clean. Forces codex
     to address each dimension rather than picking favorites.
   - Source-code fetching (unchanged from prior commit).
5. NEW Phase 10 "Final Scoring & Summary" (renumbers old Phase
   10 Publish → Phase 11):
   - After all reviews settle, score the final post 1-10 across
     all 8 dimensions with one-line justifications, average to a
     single overall score that must remain >= 9.0.
   - Present a single final summary to the user: title/slug/path,
     frontmatter budget status, per-dimension scorecard, overall
     score, review cycles run, notable changes summary, open
     caveats, explicit "Ready to merge?" ask.
   - Wait for user confirmation before merge / final push.
Rationale: the user explicitly asked for "after all reviews are
done, score the final post and present me with final summary."
Encoding it in the skill means future /blog runs do it
automatically rather than relying on conversation memory.
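
The Phase 10 scoring rule (eight dimension scores from 1 to 10, averaged, with the overall score required to stay at or above 9.0) reduces to a short check. The following is a minimal sketch under stated assumptions: `ScorecardSketch` and its method names are hypothetical, the sample scores are made up for the demonstration, and the dimension labels are copied from the "Review goal" list in the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ScorecardSketch {
    static final int DIMENSIONS = 8;        // the eight review dimensions
    static final double MIN_OVERALL = 9.0;  // threshold stated in the commit message

    // Averages the per-dimension scores after validating count and range.
    static double overall(Map<String, Integer> scores) {
        if (scores.size() != DIMENSIONS) {
            throw new IllegalArgumentException("expected " + DIMENSIONS + " dimensions");
        }
        int sum = 0;
        for (int score : scores.values()) {
            if (score < 1 || score > 10) {
                throw new IllegalArgumentException("score out of range: " + score);
            }
            sum += score;
        }
        return sum / (double) DIMENSIONS;
    }

    // True when the post clears the bar and the "Ready to merge?" ask can go out.
    static boolean meetsThreshold(Map<String, Integer> scores) {
        return overall(scores) >= MIN_OVERALL;
    }

    public static void main(String[] args) {
        // Sample values only; not the actual scorecard for any post.
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("factual", 9);
        scores.put("credibility", 9);
        scores.put("cross-links", 10);
        scores.put("SEO", 9);
        scores.put("code accuracy", 9);
        scores.put("structure & flow", 10);
        scores.put("terminology", 9);
        scores.put("tone & style", 9);
        System.out.println(overall(scores));         // 9.25
        System.out.println(meetsThreshold(scores));  // true
    }
}
```

A plain average lets one weak dimension hide behind strong ones, so the per-dimension scorecard in the final summary still matters alongside the overall number.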
Summary
Two related changes, bundled because the skill update is the process improvement that came directly out of this post's review experience.
1. New blog post (`blog/cycles-spring-ai-starter-advisors-walkthrough.md`): covers how `cycles-spring-ai-starter` 0.3.1 inserts the reserve-commit-release lifecycle into Spring AI's advisor chain. Complements (not duplicates) `how-scalerx-wired-cycles-into-a-java-agent-runtime.md`: that post is `@Cycles` on raw OpenAI in plain Spring Boot; this one is Spring AI-native (advisor chain, `ChatClient.Builder`, Flux streaming, `SubjectResolver`, jtokkit, tool gating, `cycles.reservation_id` on Micrometer traces). Timed to the v0.1.0 → v0.3.1 launches in the last 48h.
2. `/blog` skill, Phase 9 rewrite (`.claude/skills/blog/SKILL.md` + `.agents/skills/blog/SKILL.md`). The old Phase 9 said "apply external reviewer feedback precisely", which contradicted the project's blog-process rule (feedback is input, not directive; evaluate on merit). New Phase 9: `codex-cli` (0.130.0) as automated external reviewer via `codex exec --sandbox read-only` + `codex exec resume --last`. Documents the 0.130.0 gotcha that `resume` does NOT accept `--sandbox` or `--cd`; those inherit from the original session. Loop until SHIP or stylistic-only; cap at 4 rounds.

Coverage (blog post)
Sections:
- `CyclesBudgetCallAdvisor`: reserve / commit (on real `ChatResponse.Usage`) / release
- `CyclesBudgetStreamAdvisor`: `Flux.defer` per-subscription, `concatWith(Mono.defer)` fail-closed commit
- `SubjectResolver` for per-tenant attribution (v0.3.0)
- `PromptTokenEstimator` via jtokkit + canonical `USD_MICROCENTS` math (preempts the v0.3.0 docs 10x bug)
- `CyclesToolGate.wrap` (opt-in by design)
- `cycles.reservation_id` on Micrometer observations (v0.3.0)

Review cycles completed (blog post)
- Internal cycles: fact-check (against `runcycles/cycles-spring-ai-starter` README + v0.3.1/v0.3.0/v0.2.0 release notes), SEO, full re-read, scorecard at 9.5/10. Commit `240f07d` (initial draft after internal cycles).
- Codex round 1 (`688a74f`): 16 findings + 2 open questions → 13 applied / 2 modified / 1 skipped / 2 deferred. See `#issuecomment-4450047730` for the full tally.
- Codex round 2 (`9677b4d`): 3 findings → all applied. Codex accepted all round-1 deferred/skipped items without re-litigation.
- Codex round 3: `SHIP`, no new findings.

Test plan
- `npm run dev`: verify post lands at `/blog/cycles-spring-ai-starter-advisors-walkthrough`
- `<title>` ≤ 60 chars (50-char frontmatter + ` — Cycles`)
- `/blog` skill Phase 9 in `.claude/skills/blog/SKILL.md` is discovered (the in-repo copy is what the running session uses)
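
The `<title>` budget in the test plan is simple arithmetic: the frontmatter title plus the site suffix has to fit in 60 characters. The sketch below checks that under one loudly stated assumption: the suffix is taken to be exactly a space, an em dash, a space, and "Cycles" (9 characters), which would also explain the 51-character title budget quoted in the review notes. `TitleBudgetSketch` and its members are hypothetical names.

```java
class TitleBudgetSketch {
    // Assumed suffix: space, em dash (U+2014), space, "Cycles" = 9 characters.
    static final String SITE_SUFFIX = " \u2014 Cycles";
    static final int MAX_TITLE_CHARS = 60;

    // Length of the rendered <title>, counted in Unicode code points.
    static int renderedLength(String frontmatterTitle) {
        String rendered = frontmatterTitle + SITE_SUFFIX;
        return rendered.codePointCount(0, rendered.length());
    }

    static boolean fits(String frontmatterTitle) {
        return renderedLength(frontmatterTitle) <= MAX_TITLE_CHARS;
    }

    public static void main(String[] args) {
        String fiftyChars = "x".repeat(50);              // stand-in for a 50-char title
        System.out.println(renderedLength(fiftyChars));  // 59
        System.out.println(fits(fiftyChars));            // true
        System.out.println(fits("x".repeat(52)));        // false (52 + 9 = 61)
    }
}
```

Counting code points rather than UTF-16 units keeps the check honest if a title ever contains characters outside the Basic Multilingual Plane.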