Skip to content

blog: Closing the Estimate-Actual Gap with cost_fn#644

Open
amavashev wants to merge 5 commits into
mainfrom
blog/langchain-runcycles-cost-fn-actual-cost
Open

blog: Closing the Estimate-Actual Gap with cost_fn#644
amavashev wants to merge 5 commits into
mainfrom
blog/langchain-runcycles-cost-fn-actual-cost

Conversation

@amavashev
Copy link
Copy Markdown
Contributor

Summary

Adds blog/langchain-runcycles-cost-fn-actual-cost.md — the first dedicated post on langchain-runcycles, the LangChain AgentMiddleware integration that adds pre-execution budget authority for create_agent workflows. Focused angle: how v0.2.0's cost_fn parameter closes the estimate-as-actual gap that v0.1.x explicitly shipped with, plus v0.2.3's settlement-HTTP-failure correctness patch.

Differentiator vs existing LangChain/LangGraph coverage

Prior post Angle
langgraph-budget-control-durable-execution-retries-fan-out.md (2026-03-21) LangGraph graph-level controls (durable execution, retries, fan-out)
26-integrations-every-ai-framework-one-budget-protocol.md Survey across all frameworks
This post LangChain 1.x AgentMiddleware-level cost-actuals story; the layer below the graph

Codex round-1 confirmed: "Differentiator holds: cleanly distinguishes Spring AI advisor-chain, LangGraph graph-level controls, and this middleware-level actual-cost story."

Review cycles completed

  • Claude internal Cycles 1–3 — 3 parallel agents (link audit, source-code + fact audit, SEO+tone audit), self re-read, scorecard. Cycle-3 internal: 9.6/10.
  • External human reviewer — 7 findings, 5 applied / 0 modified / 2 skipped with reason (title alternative over 51-char budget; date intentional).
  • Glossary auto-linker — 5 links added, all verified.
  • Codex external reviewer — 3 rounds with new comprehensive 8-dimension + source-fetch prompt:
    • Round 1: REVISE-MAJOR — 14 findings, including 4 factual errors human reviewer missed
    • Round 2: REVISE-MINOR — 2 consistency follow-ups
    • Round 3: SHIP

Where codex earned its run (catches human reviewer missed)

  • Streaming "deferred" — already shipped in v0.2.1 (with tests/test_model_gate_streaming.py + 3 regression tests verified). Removed false bullet.
  • Fan-out demo "deferred" — already shipped in v0.2.2 (examples/multi_agent_fanout.py + write-up). Removed false bullet.
  • CostFn alias type — actual source _config.py defines it as Callable[[Any], Amount], not Callable[[ModelResponse], Amount]. Fixed at two occurrences.
  • OpenAI vocabulary stale — current OpenAI pricing pages say "Input"/"Output", not "prompt"/"completion." Qualified.
  • "Last week" imprecision — v0.1.5 was 2026-05-10, 5 days before post date. Replaced with absolute date.

The two false-deferred claims would have been instantly checkable against the public GitHub releases page. The skill's source-fetch clause in the codex prompt template earned this run.

Where human reviewer earned its round (codex missed)

  • Missing Amount / Unit imports in the runnable code example
  • Actual-cost-greater-than-estimate behavior not addressed in the integration walkthrough
  • Stale-extractor paragraph too forgiving (didn't distinguish structural-failure-caught from valid-but-stale-silent-drift)
  • "Orthogonal honesty fix" heading too abstract — clearer as "v0.2.3: failed commits must not look successful"
  • "Most common silent bug" softened to "one of the easiest silent bugs to ship"

Final scorecard: 9.8 / 10

Dim Score One-line
Factual 9.5 All quotes verbatim; all code-level claims verified against source after codex caught 4 errors
Credibility 9.5 "What's still open" now accurate; honest about extractor staleness vs structural failure
Cross-links 10.0 14 internal links, all contextual, all verified
SEO 10.0 Title 44/51, desc 157/160, keywords + tags clean, H2s carry terms
Code 10.0 All imports complete, CostFn type matches source, snippets aligned
Structure 9.5 8 H2s, logical, differentiator holds vs sibling posts
Terminology 10.0 reserve-commit canonical throughout, all framework names cased
Tone 10.0 Casual phrases removed; competitor framing acknowledges before differentiates

Test plan

  • Render preview locally (npm run dev) — verify post lands at /blog/langchain-runcycles-cost-fn-actual-cost
  • Confirm <title> is ≤ 60 chars (44-char frontmatter + — Cycles)
  • Confirm description meta is 157 chars
  • Sanity-check the 14 internal links resolve in rendered HTML
  • Verify the runnable Python snippet copies cleanly (all imports present, Amount / Unit correctly imported from runcycles)
  • Optional second human-reviewer pass

Notes for /blog skill follow-up

Codex flagged that the prompt-template example paths (src/langchain_runcycles/...) are stale for this repo's actual layout (langchain_runcycles/..., no src/ prefix). Worth updating the skill's codex-prompt template to instruct codex to discover paths rather than trust example paths verbatim.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant