blog: Closing the Estimate-Actual Gap with cost_fn by amavashev · Pull Request #644 · runcycles/docs

amavashev · 2026-05-14T13:09:39Z

Summary

Adds blog/langchain-runcycles-cost-fn-actual-cost.md — the first dedicated post on langchain-runcycles, the LangChain AgentMiddleware integration that adds pre-execution budget authority for create_agent workflows. Focused angle: how v0.2.0's cost_fn parameter closes the estimate-as-actual gap that v0.1.x explicitly shipped with, plus v0.2.3's settlement-HTTP-failure correctness patch.

Differentiator vs existing LangChain/LangGraph coverage

Prior post	Angle
`langgraph-budget-control-durable-execution-retries-fan-out.md` (2026-03-21)	LangGraph graph-level controls (durable execution, retries, fan-out)
`26-integrations-every-ai-framework-one-budget-protocol.md`	Survey across all frameworks
This post	LangChain 1.x `AgentMiddleware`-level cost-actuals story; the layer below the graph

Codex round-1 confirmed: "Differentiator holds: cleanly distinguishes Spring AI advisor-chain, LangGraph graph-level controls, and this middleware-level actual-cost story."

Review cycles completed

Claude internal Cycles 1–3 — 3 parallel agents (link audit, source-code + fact audit, SEO+tone audit), self re-read, scorecard. Cycle-3 internal: 9.6/10.
External human reviewer — 7 findings, 5 applied / 0 modified / 2 skipped with reason (title alternative over 51-char budget; date intentional).
Glossary auto-linker — 5 links added, all verified.
Codex external reviewer — 3 rounds with new comprehensive 8-dimension + source-fetch prompt:
- Round 1: REVISE-MAJOR — 14 findings, including 4 factual errors human reviewer missed
- Round 2: REVISE-MINOR — 2 consistency follow-ups
- Round 3: SHIP

Where codex earned its run (catches human reviewer missed)

Streaming "deferred" — already shipped in v0.2.1 (with tests/test_model_gate_streaming.py + 3 regression tests verified). Removed false bullet.
Fan-out demo "deferred" — already shipped in v0.2.2 (examples/multi_agent_fanout.py + write-up). Removed false bullet.
CostFn alias type — actual source _config.py defines it as Callable[[Any], Amount], not Callable[[ModelResponse], Amount]. Fixed at two occurrences.
OpenAI vocabulary stale — current OpenAI pricing pages say "Input"/"Output", not "prompt"/"completion." Qualified.
"Last week" imprecision — v0.1.5 was 2026-05-10, 5 days before post date. Replaced with absolute date.

The two false-deferred claims would have been instantly checkable against the public GitHub releases page. The skill's source-fetch clause in the codex prompt template earned this run.

Where human reviewer earned its round (codex missed)

Missing Amount / Unit imports in the runnable code example
Actual-cost-greater-than-estimate behavior not addressed in the integration walkthrough
Stale-extractor paragraph too forgiving (didn't distinguish structural-failure-caught from valid-but-stale-silent-drift)
"Orthogonal honesty fix" heading too abstract — clearer as "v0.2.3: failed commits must not look successful"
"Most common silent bug" softened to "one of the easiest silent bugs to ship"

Final scorecard: 9.8 / 10

Dim	Score	One-line
Factual	9.5	All quotes verbatim; all code-level claims verified against source after codex caught 4 errors
Credibility	9.5	"What's still open" now accurate; honest about extractor staleness vs structural failure
Cross-links	10.0	14 internal links, all contextual, all verified
SEO	10.0	Title 44/51, desc 157/160, keywords + tags clean, H2s carry terms
Code	10.0	All imports complete, CostFn type matches source, snippets aligned
Structure	9.5	8 H2s, logical, differentiator holds vs sibling posts
Terminology	10.0	reserve-commit canonical throughout, all framework names cased
Tone	10.0	Casual phrases removed; competitor framing acknowledges before differentiates

Test plan

Render preview locally (npm run dev) — verify post lands at /blog/langchain-runcycles-cost-fn-actual-cost
Confirm <title> is ≤ 60 chars (44-char frontmatter + — Cycles)
Confirm description meta is 157 chars
Sanity-check the 14 internal links resolve in rendered HTML
Verify the runnable Python snippet copies cleanly (all imports present, Amount / Unit correctly imported from runcycles)
Optional second human-reviewer pass

Notes for `/blog` skill follow-up

Codex flagged that the prompt-template example paths (src/langchain_runcycles/...) are stale for this repo's actual layout (langchain_runcycles/..., no src/ prefix). Worth updating the skill's codex-prompt template to instruct codex to discover paths rather than trust example paths verbatim.

…post-internal-review)

…-extractor distinction, v0.2.3 heading, softer absolute

… CostFn alias type, soften absolutes, terminology + tone

… alias annotation

amavashev added 4 commits May 14, 2026 08:52

blog: langchain-runcycles cost_fn — closing the estimate-actual gap (…

d638bdc

…post-internal-review)

blog: apply external reviewer fixes — imports, actual>estimate, stale…

5ea9b95

…-extractor distinction, v0.2.3 heading, softer absolute

blog: apply codex round-1 review — remove stale-deferred bullets, fix…

26eeeeb

… CostFn alias type, soften absolutes, terminology + tone

blog: codex round-2 fixes — align snippet comment + fix second CostFn…

0ce7436

… alias annotation

amavashev mentioned this pull request May 14, 2026

skill(blog): codex prompt should discover paths, not trust examples #645

Merged

2 tasks

Merge branch 'main' into blog/langchain-runcycles-cost-fn-actual-cost

99c3776

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

blog: Closing the Estimate-Actual Gap with cost_fn#644

blog: Closing the Estimate-Actual Gap with cost_fn#644
amavashev wants to merge 5 commits into
mainfrom
blog/langchain-runcycles-cost-fn-actual-cost

amavashev commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

amavashev commented May 14, 2026

Summary

Differentiator vs existing LangChain/LangGraph coverage

Review cycles completed

Where codex earned its run (catches human reviewer missed)

Where human reviewer earned its round (codex missed)

Final scorecard: 9.8 / 10

Test plan

Notes for /blog skill follow-up

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Notes for `/blog` skill follow-up