v0.14.1
Forge v0.14.1 — Bug fixes for OpenAI-compatible providers (Together.ai, OpenRouter, Groq) and session persistence
Published: 2026-06-09 · Tag: v0.14.1 · Previous release: v0.14.0 (2026-06-09, OpenTelemetry Tracing v1)
Forge v0.14.1 is a patch release focused on operators running agents against OpenAI-compatible LLM providers — Together.ai, OpenRouter, Groq, Fireworks, Anyscale, vLLM, llama.cpp's server, and others — plus a class of session-persistence bugs that broke multi-turn channel conversations on OpenAI reasoning models (gpt-5-nano, o1, o3) and strict OpenAI-compatible gateways.
Seven bug fixes, no new features, no breaking changes. Backward-compatible upgrade from v0.14.0. 18 files changed, +1,683 / −38 lines, 7 issues closed, 7 PRs merged.
If you saw "something went wrong while processing your request, please try again" on Slack thread followups, "Anthropic API returned status 401" despite configuring OpenAI, "Unsupported parameter: 'max_tokens'", or a NetworkPolicy blocking calls to api.together.ai after forge package, this release fixes all of them.
The shipped fixes
1. Session persistence no longer poisons followup turns
Long-running Slack threads (and any session-persistent channel) succeeded on the first message and then failed every followup with "something went wrong while processing your request, please try again". The executor unconditionally wrote the LLM's assistant message to memory before checking content. When the provider hit finish_reason: length (output token cap) and returned an assistant turn with empty content AND no tool_calls, that invalid-per-OpenAI-spec shape landed in memory. The in-loop empty-response recovery papered over it for the current task, but persistSession wrote the polluted memory to .forge/sessions/<task_id>.json. The next request recovered the bad shape and strict OpenAI-spec providers (Moonshot, hosted OpenRouter, OpenAI strict mode) returned HTTP 400.
Fix:
- Substitute a placeholder content string ("(continuing — previous response was truncated by output token limit)") when the LLM returns the bad shape, so newly written sessions never contain it.
- Extend sanitizeMessages on LoadFromStore with a new stripEmptyAssistantTurns pass that rescues sessions already on disk; no manual rm <agent-dir>/.forge/sessions/<task-id>.json is required after upgrade.
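The two halves of this fix can be sketched roughly as follows. This is a minimal illustration, not Forge's actual code: the Message type, its fields, and the substitutePlaceholder helper are hypothetical, and only the stripEmptyAssistantTurns name and the placeholder string come from the notes above.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message; Forge's real type may differ.
type Message struct {
	Role      string
	Content   string
	ToolCalls []string // simplified: tool-call IDs only
}

// truncationPlaceholder is the placeholder string quoted in the release notes.
const truncationPlaceholder = "(continuing — previous response was truncated by output token limit)"

// isEmptyAssistantTurn reports whether m is an assistant message with neither
// content nor tool calls, the shape that strict providers reject.
func isEmptyAssistantTurn(m Message) bool {
	return m.Role == "assistant" && m.Content == "" && len(m.ToolCalls) == 0
}

// substitutePlaceholder (hypothetical name) fills in the placeholder before
// the message is written to memory, so new sessions never store the bad shape.
func substitutePlaceholder(m Message) Message {
	if isEmptyAssistantTurn(m) {
		m.Content = truncationPlaceholder
	}
	return m
}

// stripEmptyAssistantTurns drops empty assistant turns from a history that
// was loaded from disk, leaving every other message untouched.
func stripEmptyAssistantTurns(history []Message) []Message {
	out := make([]Message, 0, len(history))
	for _, m := range history {
		if !isEmptyAssistantTurn(m) {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	loaded := []Message{
		{Role: "user", Content: "review this PR"},
		{Role: "assistant"}, // poisoned turn persisted by an older build
	}
	fmt.Println(len(stripEmptyAssistantTurns(loaded)))
	fmt.Println(substitutePlaceholder(Message{Role: "assistant"}).Content)
}
```

Treating only an exactly empty string as "empty" is an assumption of this sketch; the real check may also cover whitespace-only content.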
2. Duplicate user message at session start no longer trips strict-mode providers
Same surface symptom as #131 — "something went wrong" on Slack thread followups — different root cause. The runner pre-appended params.Message to task.History before calling Execute so SSE clients could observe the inbound message in the in-flight task. The executor's !recovered first-interaction path then iterated task.History AND appended *msg separately, producing two consecutive identical user turns at the start of every fresh conversation. OpenAI reasoning models (gpt-5-nano, o1, o3) and strict OpenAI-compatible gateways (Together's Kimi) reject consecutive same-role messages with HTTP 400.
Fix:
- Strip the trailing task.History entry when it equals *msg before iterating (new a2aMessagesEqual helper).
- Extend sanitizeMessages with a collapseConsecutiveDuplicates pass on LoadFromStore. Surgical by design: only EXACT same-role + same-content + tool-call-free pairs collapse; workflow nudges ("Your response was empty...") and tool-bearing assistant turns survive.
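As a rough illustration of the exact-match rule described above, a collapse pass might look like the sketch below. The Message type and its fields are hypothetical stand-ins; only the collapseConsecutiveDuplicates name comes from the notes, and the real pass may compare more fields than this.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message; Forge's real type may differ.
type Message struct {
	Role      string
	Content   string
	ToolCalls []string // simplified: tool-call IDs only
}

// collapseConsecutiveDuplicates keeps the first of any run of adjacent
// messages that share the same role and content and carry no tool calls.
// Anything that differs in role, content, or tool calls is left alone.
func collapseConsecutiveDuplicates(history []Message) []Message {
	out := make([]Message, 0, len(history))
	for _, m := range history {
		if n := len(out); n > 0 {
			prev := out[n-1]
			if prev.Role == m.Role && prev.Content == m.Content &&
				len(prev.ToolCalls) == 0 && len(m.ToolCalls) == 0 {
				continue // exact duplicate of the previous turn: drop it
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "hello"},
		{Role: "user", Content: "hello"}, // duplicate appended at session start
		{Role: "assistant", Content: "hi"},
	}
	for _, m := range collapseConsecutiveDuplicates(history) {
		fmt.Println(m.Role, m.Content)
	}
}
```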
3. code-review skill no longer routes Anthropic-first when both API keys are set
The skill's code-review-diff.sh and code-review-file.sh picked Anthropic whenever ANTHROPIC_API_KEY was non-empty, even when the operator's forge.yaml pointed at an OpenAI-compatible provider (Together.ai, OpenRouter, Groq, Fireworks, Anyscale, vLLM) via OPENAI_BASE_URL and REVIEW_MODEL was clearly a non-Anthropic model. Operators with a stale ANTHROPIC_API_KEY co-resident with a live OPENAI_API_KEY in .forge/secrets.enc got Anthropic API returned status 401 and assumed the skill was broken.
Fix: new REVIEW_PROVIDER env var (values anthropic or openai) that always wins. When it is unset, the provider is auto-detected from the REVIEW_MODEL prefix (claude-* or anthropic/* → Anthropic; anything else → OpenAI), then by sole API key, and finally defaults to OpenAI when both keys exist with no other signal.
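The precedence order above can be expressed as a small decision function. The skill itself is implemented as shell scripts, so this Go sketch is illustrative only: the function name and signature are hypothetical, and two behaviors the notes do not specify (an unrecognized REVIEW_PROVIDER value, and neither key being set) are assumptions marked in the comments.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveReviewProvider (hypothetical name) returns "anthropic" or "openai"
// following the precedence described in the release notes.
func resolveReviewProvider(reviewProvider, reviewModel string, hasAnthropicKey, hasOpenAIKey bool) string {
	// 1. An explicit REVIEW_PROVIDER always wins.
	// Assumption: any other value is treated as unset.
	if reviewProvider == "anthropic" || reviewProvider == "openai" {
		return reviewProvider
	}
	// 2. Otherwise infer from the REVIEW_MODEL prefix.
	if reviewModel != "" {
		if strings.HasPrefix(reviewModel, "claude-") || strings.HasPrefix(reviewModel, "anthropic/") {
			return "anthropic"
		}
		return "openai"
	}
	// 3. Otherwise use whichever API key is the only one present.
	if hasAnthropicKey && !hasOpenAIKey {
		return "anthropic"
	}
	if hasOpenAIKey && !hasAnthropicKey {
		return "openai"
	}
	// 4. Both keys and no other signal: default to OpenAI.
	// Assumption: the neither-key case also falls through to this default.
	return "openai"
}

func main() {
	// Stale Anthropic key alongside a live OpenAI key, non-Anthropic model.
	fmt.Println(resolveReviewProvider("", "moonshotai/Kimi-K2.6", true, true))
	// Explicit override restores the old Anthropic-first behavior.
	fmt.Println(resolveReviewProvider("anthropic", "", true, true))
}
```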
The same PR also fixes a separate bug: the OPENAI_BASE_URL-as-Responses-API-toggle was exactly backwards — Together, OpenRouter, Groq, Fireworks, Anyscale all implement /chat/completions only, not OpenAI's proprietary /responses. Setting OPENAI_BASE_URL=https://api.together.ai/v1 silently POSTed to https://api.together.ai/v1/responses → 404. Decoupled into a separate OPENAI_USE_RESPONSES_API=1 opt-in for the Codex/OAuth flow.
4. code-review skill uses max_completion_tokens (not deprecated max_tokens)
OpenAI deprecated max_tokens in favor of max_completion_tokens for Chat Completions. Reasoning models (o1, o1-preview, o3, gpt-5) and strict OpenAI-compatible providers (Together.ai's Kimi-K2.6 series, Moonshot) reject the legacy field with HTTP 400:
"Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead."
Fix: one-token edit per script. The Anthropic branch keeps max_tokens (correct field for Anthropic's API). The Responses API branch uses no max-tokens field (auto-sizes).
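The per-branch field choice can be summarized in a few lines. The real change lives in the skill's shell scripts; the Go sketch below only illustrates which JSON field each branch sends. The branch labels and function names are hypothetical, and a real request body carries more fields (messages, for example) than shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenLimitField returns the JSON field that caps output tokens for a given
// API branch, or "" when the branch sends no such field.
// The branch labels are hypothetical names used only in this sketch.
func tokenLimitField(branch string) string {
	switch branch {
	case "anthropic":
		return "max_tokens" // correct field for Anthropic's API
	case "openai-chat":
		return "max_completion_tokens" // replaces the deprecated max_tokens
	default:
		return "" // Responses API branch: no max-tokens field
	}
}

// buildRequestBody builds a minimal JSON body with the right limit field.
func buildRequestBody(branch, model string, limit int) ([]byte, error) {
	body := map[string]any{"model": model}
	if field := tokenLimitField(branch); field != "" {
		body[field] = limit
	}
	return json.Marshal(body)
}

func main() {
	for _, branch := range []string{"anthropic", "openai-chat", "openai-responses"} {
		b, err := buildRequestBody(branch, "example-model", 4096)
		if err != nil {
			fmt.Println("marshal error:", err)
			continue
		}
		fmt.Println(branch, string(b))
	}
}
```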
5. Skill subprocesses inherit OPENAI_BASE_URL and the other standard SDK base-URL env vars
SkillCommandExecutor built a whitelist-only env for skill subprocesses where OPENAI_ORG_ID was always-passed but the standard SDK base-URL pointers (OPENAI_BASE_URL, ANTHROPIC_BASE_URL, OLLAMA_BASE_URL, GEMINI_BASE_URL) were not — unless each SKILL.md author remembered to declare each variable in env.optional. Every LLM-calling skill that forgot silently broke for OpenAI-compatible deployments. The operator's main agent loop hit Together.ai correctly; the same agent's skill subprocess hit api.openai.com with the Together.ai key and 401d.
Fix: four standard SDK variables special-cased alongside OPENAI_ORG_ID. Every skill that uses industry-standard env conventions just works — present, future, in-tree, and third-party.
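A whitelist-only environment with these special cases might be built along the following lines. This is a sketch under assumptions, not SkillCommandExecutor's actual code: buildSkillEnv and its parameters are hypothetical, and skipping empty values is a choice made here for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// alwaysPassed lists the variables forwarded to every skill subprocess,
// whether or not the skill declares them: OPENAI_ORG_ID plus the four
// standard SDK base-URL variables named in the release notes.
var alwaysPassed = []string{
	"OPENAI_ORG_ID",
	"OPENAI_BASE_URL",
	"ANTHROPIC_BASE_URL",
	"OLLAMA_BASE_URL",
	"GEMINI_BASE_URL",
}

// buildSkillEnv (hypothetical name) returns sorted KEY=VALUE entries for a
// skill subprocess: the always-passed variables plus those the skill
// declared, limited to keys that are set and non-empty in the parent env.
func buildSkillEnv(parent map[string]string, declared []string) []string {
	allowed := make(map[string]bool)
	for _, k := range alwaysPassed {
		allowed[k] = true
	}
	for _, k := range declared {
		allowed[k] = true
	}
	env := make([]string, 0, len(allowed))
	for k := range allowed {
		if v := parent[k]; v != "" {
			env = append(env, k+"="+v)
		}
	}
	sort.Strings(env) // deterministic order for display and testing
	return env
}

func main() {
	parent := map[string]string{
		"OPENAI_BASE_URL": "https://api.together.ai/v1",
		"OPENAI_API_KEY":  "example-key",
		"HOME":            "/home/agent", // not whitelisted, so not forwarded
	}
	for _, kv := range buildSkillEnv(parent, []string{"OPENAI_API_KEY"}) {
		fmt.Println(kv)
	}
}
```

The resulting slice has the shape expected by exec.Cmd's Env field, which is one plausible way to hand it to a subprocess.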
6. CLI wizard recognizes encrypted one_of keys and stops writing first-key placeholders
forge skills add did not validate one_of groups at all — operators got no confirmation that their encrypted key (e.g. OPENAI_API_KEY in .forge/secrets.enc) was detected. forge init's fallback wrote opts.EnvVars[OneOfEnv[0]] = "" (the first key in the list — ANTHROPIC_API_KEY for code-review), producing an empty .env line that misled operators into thinking they needed an Anthropic key when they had configured OpenAI.
Fix:
- forge skills add now mirrors RequiredEnv's three-source check for one_of groups: os.Getenv / .env / loadSecretPlaceholders. Output now shows OPENAI_API_KEY (secrets) — ok regardless of list order.
- forge init no longer pre-writes a placeholder. The runtime resolver already surfaces missing one_of groups clearly at forge run if neither key is set.
- .env writer no longer emits empty placeholder lines for every one_of member; only keys with non-empty values get written.
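The three-source check for a one_of group could be sketched as below. The function name, parameter shapes, and the "env" and ".env" source labels are hypothetical; only the "secrets" label and the three sources themselves appear in the notes, and the precedence between sources is an assumption.

```go
package main

import "fmt"

// resolveOneOf (hypothetical name) returns the first key in the group that
// any source provides, together with a label for the source that satisfied it.
// Every member is checked, so a key later in the list still satisfies the group.
func resolveOneOf(group []string, osEnv, dotEnv map[string]string, secrets map[string]bool) (key, source string, ok bool) {
	for _, k := range group {
		switch {
		case osEnv[k] != "": // process environment
			return k, "env", true
		case dotEnv[k] != "": // values parsed from the .env file
			return k, ".env", true
		case secrets[k]: // key present in the encrypted secrets store
			return k, "secrets", true
		}
	}
	return "", "", false
}

func main() {
	group := []string{"ANTHROPIC_API_KEY", "OPENAI_API_KEY"}
	secrets := map[string]bool{"OPENAI_API_KEY": true}
	if key, source, ok := resolveOneOf(group, nil, nil, secrets); ok {
		fmt.Printf("%s (%s)\n", key, source)
	} else {
		fmt.Println("one_of group unsatisfied")
	}
}
```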
7. forge package and forge run auto-add the LLM provider's custom base URL to the egress allowlist
Operators configuring an OpenAI-compatible provider via OPENAI_BASE_URL=https://api.together.ai/v1 shipped a NetworkPolicy that blocked the provider's hostname — deployed agents 401d or timed out depending on which side noticed first. Same trap Phase 6 of OTel Tracing v1 fixed for the OTLP collector (#107), but for the LLM provider.
Fix: added ModelRef.BaseURL and ModelFallback.BaseURL fields to the forge.yaml schema, plus two new helpers:
- security.LLMProviderDomains(cfg): extracts hostnames from cfg.Model.BaseURL + cfg.Model.Fallbacks[].BaseURL. Cfg-driven; used by both build and runtime.
- security.LLMProviderEnvDomains(envVars): extracts hostnames from the four canonical SDK base-URL env vars in the resolved env. Runtime safety net for deployments that haven't migrated to the schema field.
Both wired into egress_stage.go and runner.go alongside the existing AuthDomains / MCPDomains / OTelDomain merges. Operators going forward declare the URL once in forge.yaml:
    model:
      provider: openai
      name: moonshotai/Kimi-K2.6
      base_url: https://api.together.ai/v1
…and forge build && forge package && kubectl apply produces a NetworkPolicy that admits api.together.ai automatically.
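The hostname extraction that both helpers perform can be approximated with the standard net/url package. The sketch below is not the security package's actual code: providerDomains is a hypothetical stand-in that takes the base URLs as plain strings, and silently skipping unparseable or scheme-less values is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// providerDomains (hypothetical name) extracts the unique hostnames from a
// list of base URLs so they can be merged into an egress allowlist.
func providerDomains(baseURLs []string) []string {
	seen := make(map[string]bool)
	var hosts []string
	for _, raw := range baseURLs {
		if raw == "" {
			continue // field or env var not set
		}
		u, err := url.Parse(raw)
		if err != nil {
			continue // assumption: unparseable values are skipped, not fatal
		}
		// Hostname strips any port; it is empty for values without a scheme.
		h := u.Hostname()
		if h == "" || seen[h] {
			continue
		}
		seen[h] = true
		hosts = append(hosts, h)
	}
	sort.Strings(hosts) // stable order for generated manifests
	return hosts
}

func main() {
	urls := []string{
		"https://api.together.ai/v1", // model.base_url from the example above
		"",                           // unset fallback
	}
	fmt.Println(providerDomains(urls))
}
```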
Compatibility-restored deployment matrix
| Setup | Pre-v0.14.1 | Post-v0.14.1 |
|---|---|---|
| Agent on Together.ai / OpenRouter / Groq with OPENAI_BASE_URL set | Skill subprocess 401'd against OpenAI | ✓ hits the configured provider |
| gpt-5-nano or other OpenAI reasoning model | Followup turns 400'd on duplicate user message | ✓ multi-turn conversations work |
| code-review skill with stale Anthropic key in .forge/secrets.enc | Provider mis-routed to Anthropic, 401 | ✓ explicit REVIEW_PROVIDER override |
| code-review against Moonshot / Kimi via Together.ai | max_tokens rejected, HTTP 400 | ✓ max_completion_tokens accepted |
| forge package with OPENAI_BASE_URL=together.ai | NetworkPolicy blocked the provider | ✓ host auto-added |
| Long-running Slack thread where LLM hit finish_reason: length | Empty-assistant-turn poison broke followups | ✓ placeholder substitution + LoadFromStore sanitization |
| forge skills add with OPENAI_API_KEY encrypted | Wizard prompted for missing key | ✓ shows OPENAI_API_KEY (secrets) — ok |
Upgrade guide
Backward-compatible. No forge.yaml schema break. No code changes required for existing agents.
Three small things changed defaults; if you depended on them, you can override:
- code-review skill provider selection: previously preferred Anthropic when both keys were set. Now prefers OpenAI. To preserve the old behavior: add REVIEW_PROVIDER=anthropic to your env.
- code-review OPENAI_BASE_URL no longer triggers the Responses API. Previously the variable was both a base URL pointer AND a Responses API toggle. Now decoupled: setting OPENAI_BASE_URL always uses /chat/completions. If you relied on the Responses API: add OPENAI_USE_RESPONSES_API=1.
- forge init no longer pre-writes empty one_of placeholders to .env. If your CI scripts grep .env for ANTHROPIC_API_KEY= or OPENAI_API_KEY= to detect "user hasn't filled this in," migrate to checking forge skills add's install output instead; it now prints the satisfaction state for one_of groups.
Sessions corrupted by issues #131 or #143 on a pre-v0.14.1 build do not require a rm. Both fixes ship defense-in-depth on LoadFromStore that sanitizes the bad shape on load. Just upgrade and send the next message in the affected channel thread.
Installation
Download a pre-built binary from the assets attached below: forge-Darwin-arm64.tar.gz, forge-Darwin-x86_64.tar.gz, forge-Linux-arm64.tar.gz, forge-Linux-x86_64.tar.gz, forge-Windows-arm64.zip, forge-Windows-x86_64.zip. Verify with checksums.txt.
Container image: ghcr.io/initializ/forge:v0.14.1.
Homebrew: brew upgrade initializ/tap/forge.
From source:
    git clone --branch v0.14.1 https://github.com/initializ/forge
    cd forge
    go install ./forge-cli/cmd/forge
What's new since v0.14.0: per-PR changelog
- #132 fix(runtime): stop persisting empty assistant turns on finish_reason=length (closes #131)
- #134 fix(skills/code-review): provider routing + OpenAI-compatible endpoint (closes #133)
- #136 fix(skills): one_of env validation visibility + stop writing first-key placeholder (closes #135)
- #138 fix(skills): always pass OPENAI_BASE_URL / ANTHROPIC_BASE_URL / OLLAMA_BASE_URL / GEMINI_BASE_URL to skill subprocesses (closes #137)
- #140 fix(security): auto-add LLM provider base URL to egress allowlist at build + runtime (closes #139)
- #142 fix(skills/code-review): max_completion_tokens for OpenAI Chat Completions (closes #141)
- #144 fix(runtime): stop duplicating the inbound user message at the start of every fresh session (closes #143)
Full per-PR detail in CHANGELOG.md.
Compatibility
- Go: 1.25+
- A2A: 0.3.0
- Forge agent schema: unchanged from v0.14.0 (additive only: new ModelRef.BaseURL and ModelFallback.BaseURL fields)
- LLM providers verified: OpenAI (Chat Completions + Responses API), Anthropic, Ollama, Gemini, Together.ai, OpenRouter, Groq, Fireworks, Anyscale, Moonshot (Kimi-K2.6 series), Bedrock proxy patterns
- Container image: ghcr.io/initializ/forge:v0.14.1
Documentation
- Observability — Tracing (unchanged from v0.14.0)
- forge.yaml schema (now includes model.base_url and model.fallbacks[].base_url)
- CLI reference
- Environment variables
- Egress control
- CHANGELOG.md
Acknowledgements
All seven fixes landed within hours of being reported by operators running real production-shaped agents on Slack-channel-backed deployments. The session-poison bug (#131) was caught on a long-running PR-review thread; the duplicate-user bug (#143) was caught on a gpt-5-nano agent; the egress / wizard / subprocess-env bugs (#135, #137, #139) were caught in sequence on the same Together.ai-backed deployment, with each fix exposing the next layer. The 7-PR plan landed with the same discipline that delivered OpenTelemetry Tracing v1 in v0.14.0.