Forge v0.14.1 — Bug fixes for OpenAI-compatible providers (Together.ai, OpenRouter, Groq) and session persistence

Published: 2026-06-09 · Tag: v0.14.1 · Previous release: v0.14.0 (2026-06-09, OpenTelemetry Tracing v1)

Forge v0.14.1 is a patch release focused on operators running agents against OpenAI-compatible LLM providers — Together.ai, OpenRouter, Groq, Fireworks, Anyscale, vLLM, llama.cpp's server, and others — plus a class of session-persistence bugs that broke multi-turn channel conversations on OpenAI reasoning models (gpt-5-nano, o1, o3) and strict OpenAI-compatible gateways.

Seven bug fixes, no new features, no breaking changes. Backward-compatible upgrade from v0.14.0. 18 files changed, +1,683 / −38 lines, 7 issues closed, 7 PRs merged.

If you saw any of the following, this release fixes it: "something went wrong while processing your request, please try again" on Slack thread followups; "Anthropic API returned status 401" despite configuring OpenAI; "Unsupported parameter: 'max_tokens'"; or a NetworkPolicy blocking calls to api.together.ai after forge package.

The shipped fixes

1. Session persistence no longer poisons followup turns

Issue #131 · PR #132

Long-running Slack threads (and any session-persistent channel) succeeded on the first message and then failed every followup with "something went wrong while processing your request, please try again". The executor unconditionally wrote the LLM's assistant message to memory before checking its content. When the provider hit finish_reason: length (the output token cap) and returned an assistant turn with empty content and no tool_calls, that shape, which is invalid per the OpenAI spec, landed in memory. The in-loop empty-response recovery papered over it for the current task, but persistSession still wrote the polluted memory to .forge/sessions/<task_id>.json. The next request reloaded the bad shape from disk, and strict OpenAI-spec providers (Moonshot, hosted OpenRouter, OpenAI strict mode) returned HTTP 400.

Fix:

Substitute a placeholder content string ("(continuing — previous response was truncated by output token limit)") when the LLM returns the bad shape, so newly-written sessions never contain it.
Extend sanitizeMessages on LoadFromStore with a new stripEmptyAssistantTurns pass that rescues sessions already on disk, so no manual rm <agent-dir>/.forge/sessions/<task_id>.json is required after upgrade.
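
The load-time rescue pass can be pictured roughly as follows. This is a minimal sketch, not Forge's actual code: the Message type and its fields are hypothetical simplifications, the function name only echoes the pass named in these notes, and treating whitespace-only content as empty is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical, simplified chat message; Forge's real type may differ.
type Message struct {
	Role      string
	Content   string
	ToolCalls []string // tool call IDs, simplified for illustration
}

// isEmptyAssistantTurn reports whether m is an assistant message with neither
// text content nor tool calls, the shape that strict providers reject.
// Assumption: whitespace-only content counts as empty.
func isEmptyAssistantTurn(m Message) bool {
	return m.Role == "assistant" &&
		strings.TrimSpace(m.Content) == "" &&
		len(m.ToolCalls) == 0
}

// stripEmptyAssistantTurns returns a copy of msgs without empty assistant turns.
// All other messages keep their original order.
func stripEmptyAssistantTurns(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		if isEmptyAssistantTurn(m) {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "review this PR"},
		{Role: "assistant"}, // truncated turn: no content, no tool calls
		{Role: "assistant", ToolCalls: []string{"call_1"}}, // kept: carries a tool call
		{Role: "user", Content: "any update?"},
	}
	for _, m := range stripEmptyAssistantTurns(history) {
		fmt.Printf("%s %q tool_calls=%d\n", m.Role, m.Content, len(m.ToolCalls))
	}
}
```

Returning a fresh slice rather than editing in place keeps the loaded session untouched until the sanitized copy is accepted, which is one reasonable design choice for a pass that runs on every load.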

2. Duplicate user message at session start no longer trips strict-mode providers

Issue #143 · PR #144

Same surface symptom as #131 — "something went wrong" on Slack thread followups — different root cause. The runner pre-appended params.Message to task.History before calling Execute so SSE clients could observe the inbound message in the in-flight task. The executor's !recovered first-interaction path then iterated task.History AND appended *msg separately, producing two consecutive identical user turns at the start of every fresh conversation. OpenAI reasoning models (gpt-5-nano, o1, o3) and strict OpenAI-compatible gateways (Together's Kimi) reject consecutive same-role messages with HTTP 400.

Fix:

Strip the trailing task.History entry when it equals *msg before iterating (new a2aMessagesEqual helper).
Extend sanitizeMessages with a collapseConsecutiveDuplicates pass on LoadFromStore. Surgical by design — only EXACT same-role + same-content + tool-call-free pairs collapse; workflow nudges ("Your response was empty...") and tool-bearing assistant turns survive.
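
The collapse rule above is narrow enough to state in a few lines. The sketch below assumes a hypothetical Message type with string content and a simplified tool-call list; it illustrates the "exact same role, same content, no tool calls" condition and is not the shipped implementation.

```go
package main

import "fmt"

// Message is a hypothetical, simplified chat message used only for illustration.
type Message struct {
	Role      string
	Content   string
	ToolCalls []string
}

// collapseConsecutiveDuplicates drops a message only when it exactly repeats
// the previous kept message: same role, same content, and neither message
// carries tool calls. Anything else is preserved.
func collapseConsecutiveDuplicates(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for _, m := range msgs {
		if n := len(out); n > 0 {
			prev := out[n-1]
			if prev.Role == m.Role && prev.Content == m.Content &&
				len(prev.ToolCalls) == 0 && len(m.ToolCalls) == 0 {
				continue // exact duplicate of the previous turn
			}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "user", Content: "summarize the thread"},
		{Role: "user", Content: "summarize the thread"}, // duplicate: collapsed
		{Role: "assistant", Content: "Here is the summary."},
		{Role: "user", Content: "thanks"},
		{Role: "user", Content: "one more thing"}, // same role, different content: kept
	}
	for _, m := range collapseConsecutiveDuplicates(history) {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```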

3. `code-review` skill no longer routes Anthropic-first when both API keys are set

Issue #133 · PR #134

The skill's code-review-diff.sh and code-review-file.sh picked Anthropic whenever ANTHROPIC_API_KEY was non-empty, even when the operator's forge.yaml pointed at an OpenAI-compatible provider (Together.ai, OpenRouter, Groq, Fireworks, Anyscale, vLLM) via OPENAI_BASE_URL and REVIEW_MODEL was clearly a non-Anthropic model. Operators with a stale ANTHROPIC_API_KEY co-resident with a live OPENAI_API_KEY in .forge/secrets.enc got Anthropic API returned status 401 and assumed the skill was broken.

Fix: a new REVIEW_PROVIDER env var (values anthropic or openai) always wins. When it is unset, the provider is auto-detected from the REVIEW_MODEL prefix (claude-* or anthropic/* → Anthropic; anything else → OpenAI), then from whichever API key is the only one set, and finally defaults to OpenAI when both keys exist with no other signal.
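
The precedence order can be sketched as a small resolver. The skill itself is a pair of shell scripts, so this Go version only illustrates the ordering. The function name, the injected getenv parameter, and the behavior for unrecognized REVIEW_PROVIDER values or when no key is set at all are assumptions, since the notes do not describe those cases.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// resolveReviewProvider is a hypothetical helper mirroring the documented
// precedence: explicit override, then model prefix, then sole API key, then
// OpenAI when both keys are present. ok is false when nothing can be decided.
func resolveReviewProvider(getenv func(string) string) (provider string, ok bool) {
	// 1. Explicit override always wins.
	// Assumption: unrecognized values fall through to auto-detection.
	if p := getenv("REVIEW_PROVIDER"); p == "anthropic" || p == "openai" {
		return p, true
	}
	// 2. Model prefix: claude-* or anthropic/* means Anthropic, anything else OpenAI.
	if model := getenv("REVIEW_MODEL"); model != "" {
		if strings.HasPrefix(model, "claude-") || strings.HasPrefix(model, "anthropic/") {
			return "anthropic", true
		}
		return "openai", true
	}
	// 3. Sole API key, then 4. OpenAI when both keys exist.
	hasAnthropic := getenv("ANTHROPIC_API_KEY") != ""
	hasOpenAI := getenv("OPENAI_API_KEY") != ""
	switch {
	case hasAnthropic && !hasOpenAI:
		return "anthropic", true
	case hasOpenAI:
		return "openai", true
	}
	return "", false // assumption: no key at all is reported as undecided
}

func main() {
	provider, ok := resolveReviewProvider(os.Getenv)
	if !ok {
		fmt.Println("no provider could be resolved")
		return
	}
	fmt.Println("provider:", provider)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, makes the precedence easy to exercise with a plain map in tests.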

The same PR also fixes a separate bug: the OPENAI_BASE_URL-as-Responses-API-toggle was exactly backwards — Together, OpenRouter, Groq, Fireworks, Anyscale all implement /chat/completions only, not OpenAI's proprietary /responses. Setting OPENAI_BASE_URL=https://api.together.ai/v1 silently POSTed to https://api.together.ai/v1/responses → 404. Decoupled into a separate OPENAI_USE_RESPONSES_API=1 opt-in for the Codex/OAuth flow.
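
The decoupled behavior amounts to choosing the path suffix independently of the base URL. In this sketch the helper name is hypothetical, and the fallback to https://api.openai.com/v1 when OPENAI_BASE_URL is unset is an assumption based on OpenAI's standard endpoint rather than something stated in these notes.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// openAIEndpoint is a hypothetical helper: the base URL only selects the host,
// while OPENAI_USE_RESPONSES_API=1 alone selects the Responses API path.
func openAIEndpoint(getenv func(string) string) string {
	base := strings.TrimRight(getenv("OPENAI_BASE_URL"), "/")
	if base == "" {
		base = "https://api.openai.com/v1" // assumed default when no base URL is set
	}
	if getenv("OPENAI_USE_RESPONSES_API") == "1" {
		return base + "/responses"
	}
	return base + "/chat/completions"
}

func main() {
	fmt.Println(openAIEndpoint(os.Getenv))
}
```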

4. `code-review` skill uses `max_completion_tokens` (not deprecated `max_tokens`)

Issue #141 · PR #142

OpenAI deprecated max_tokens in favor of max_completion_tokens for Chat Completions. Reasoning models (o1, o1-preview, o3, gpt-5) and strict OpenAI-compatible providers (Together.ai's Kimi-K2.6 series, Moonshot) reject the legacy field with HTTP 400:

"Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead."

Fix: one-token edit per script. The Anthropic branch keeps max_tokens (correct field for Anthropic's API). The Responses API branch uses no max-tokens field (auto-sizes).
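
To make the three branches concrete, here is a small sketch of how the token-limit field differs per branch. The helper names and the reduced request body (model plus limit only, with no messages) are illustrative assumptions; the real scripts build their payloads in shell.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenLimitField returns the JSON field name for the output token cap,
// or "" when the branch sends no such field.
func tokenLimitField(provider string, useResponsesAPI bool) string {
	switch {
	case provider == "anthropic":
		return "max_tokens" // Anthropic's API keeps this field name
	case useResponsesAPI:
		return "" // the Responses API branch sends no max-tokens field
	default:
		return "max_completion_tokens" // OpenAI Chat Completions
	}
}

// buildBody assembles a deliberately partial request body for illustration.
func buildBody(provider, model string, useResponsesAPI bool, limit int) ([]byte, error) {
	body := map[string]any{"model": model}
	if field := tokenLimitField(provider, useResponsesAPI); field != "" {
		body[field] = limit
	}
	return json.Marshal(body)
}

func main() {
	body, err := buildBody("openai", "moonshotai/Kimi-K2.6", false, 1024)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```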

5. Skill subprocesses inherit `OPENAI_BASE_URL` and the other standard SDK base-URL env vars

Issue #137 · PR #138

SkillCommandExecutor built a whitelist-only env for skill subprocesses in which OPENAI_ORG_ID was always passed but the standard SDK base-URL pointers (OPENAI_BASE_URL, ANTHROPIC_BASE_URL, OLLAMA_BASE_URL, GEMINI_BASE_URL) were not, unless each SKILL.md author remembered to declare each variable in env.optional. Every LLM-calling skill that forgot silently broke for OpenAI-compatible deployments. The operator's main agent loop hit Together.ai correctly; the same agent's skill subprocess hit api.openai.com with the Together.ai key and 401'd.

Fix: the four standard SDK base-URL variables are now special-cased alongside OPENAI_ORG_ID, so every skill that follows industry-standard env conventions just works: present, future, in-tree, and third-party.
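
A whitelist-only environment with a fixed always-passed set might look like the sketch below. The function name, the map-based parent environment, and the choice to skip empty values are assumptions for illustration; only the five variable names come from these notes.

```go
package main

import (
	"fmt"
	"sort"
)

// alwaysPassed lists the variables forwarded to every skill subprocess
// regardless of what the skill declares.
var alwaysPassed = []string{
	"OPENAI_ORG_ID",
	"OPENAI_BASE_URL",
	"ANTHROPIC_BASE_URL",
	"OLLAMA_BASE_URL",
	"GEMINI_BASE_URL",
}

// buildSkillEnv is a hypothetical helper that returns KEY=VALUE entries for
// the always-passed variables plus those the skill declared. Anything else in
// the parent environment is withheld.
func buildSkillEnv(parent map[string]string, declared []string) []string {
	allowed := append(append([]string{}, alwaysPassed...), declared...)
	seen := make(map[string]bool, len(allowed))
	var env []string
	for _, key := range allowed {
		if seen[key] {
			continue
		}
		seen[key] = true
		if val := parent[key]; val != "" { // assumption: unset or empty values are skipped
			env = append(env, key+"="+val)
		}
	}
	sort.Strings(env) // deterministic order for display and tests
	return env
}

func main() {
	parent := map[string]string{
		"OPENAI_BASE_URL": "https://api.together.ai/v1",
		"OPENAI_API_KEY":  "example-key",
		"UNRELATED_TOKEN": "withheld",
	}
	for _, entry := range buildSkillEnv(parent, []string{"OPENAI_API_KEY"}) {
		fmt.Println(entry)
	}
}
```

The returned slice uses the same KEY=VALUE form that os/exec expects in Cmd.Env, so a caller could assign it directly when launching the subprocess.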

6. CLI wizard recognizes encrypted `one_of` keys and stops writing first-key placeholders

Issue #135 · PR #136

forge skills add did not validate one_of groups at all — operators got no confirmation that their encrypted key (e.g. OPENAI_API_KEY in .forge/secrets.enc) was detected. forge init's fallback wrote opts.EnvVars[OneOfEnv[0]] = "" (the first key in the list — ANTHROPIC_API_KEY for code-review), producing an empty .env line that misled operators into thinking they needed an Anthropic key when they had configured OpenAI.

Fix:

forge skills add now mirrors RequiredEnv's three-source check for one_of groups: os.Getenv / .env / loadSecretPlaceholders. Output now shows OPENAI_API_KEY (secrets) — ok regardless of list order.
forge init no longer pre-writes a placeholder. The runtime resolver already surfaces missing one_of groups clearly at forge run if neither key is set.
.env writer no longer emits empty placeholder lines for every one_of member — only keys with non-empty values get written.
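
The three-source check for a one_of group can be sketched as follows. The function name and parameter shapes are hypothetical. In particular, representing the encrypted store as a set of key names is an assumption about what loadSecretPlaceholders provides, and the order in which the three sources are consulted is illustrative.

```go
package main

import "fmt"

// resolveOneOf is a hypothetical helper that walks a one_of group and reports
// the first member satisfied by any source: the process environment, the .env
// file, or the encrypted secrets store (modeled here as a set of key names).
func resolveOneOf(group []string, osEnv, dotEnv map[string]string, secretKeys map[string]bool) (key, source string, ok bool) {
	for _, k := range group {
		switch {
		case osEnv[k] != "":
			return k, "env", true
		case dotEnv[k] != "":
			return k, ".env", true
		case secretKeys[k]:
			return k, "secrets", true
		}
	}
	return "", "", false
}

func main() {
	group := []string{"ANTHROPIC_API_KEY", "OPENAI_API_KEY"}
	secrets := map[string]bool{"OPENAI_API_KEY": true}

	// The second member is found even though it is not first in the list.
	if key, source, ok := resolveOneOf(group, nil, nil, secrets); ok {
		fmt.Printf("%s (%s): ok\n", key, source)
	} else {
		fmt.Println("one_of group unsatisfied")
	}
}
```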

7. `forge package` and `forge run` auto-add the LLM provider's custom base URL to the egress allowlist

Issue #139 · PR #140

Operators configuring an OpenAI-compatible provider via OPENAI_BASE_URL=https://api.together.ai/v1 shipped a NetworkPolicy that blocked the provider's hostname; deployed agents 401'd or timed out depending on which side noticed first. This is the same trap that Phase 6 of OTel Tracing v1 fixed for the OTLP collector (#107), now fixed for the LLM provider.

Fix: added ModelRef.BaseURL and ModelFallback.BaseURL fields to the forge.yaml schema, plus two new helpers:

security.LLMProviderDomains(cfg) — extracts hostnames from cfg.Model.BaseURL + cfg.Model.Fallbacks[].BaseURL. Cfg-driven; used by both build and runtime.
security.LLMProviderEnvDomains(envVars) — extracts hostnames from the four canonical SDK base-URL env vars in the resolved env. Runtime safety net for deployments that haven't migrated to the schema field.

Both are wired into egress_stage.go and runner.go alongside the existing AuthDomains / MCPDomains / OTelDomain merges. Going forward, operators declare the URL once in forge.yaml:

model:
  provider: openai
  name: moonshotai/Kimi-K2.6
  base_url: https://api.together.ai/v1

…and forge build && forge package && kubectl apply produces a NetworkPolicy that admits api.together.ai automatically.
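
Hostname extraction for the allowlist can be sketched with the standard net/url package. The function names below are hypothetical and are not the security.LLMProviderDomains or security.LLMProviderEnvDomains helpers themselves. Skipping values that do not parse to a hostname, and sorting the result, are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// baseURLEnvVars are the four canonical SDK base-URL variables named in these notes.
var baseURLEnvVars = []string{
	"OPENAI_BASE_URL",
	"ANTHROPIC_BASE_URL",
	"OLLAMA_BASE_URL",
	"GEMINI_BASE_URL",
}

// hostsFromBaseURLs returns the unique hostnames (without port) found in the
// given URLs. Entries that fail to parse or carry no host are ignored.
func hostsFromBaseURLs(rawURLs []string) []string {
	seen := make(map[string]bool)
	var hosts []string
	for _, raw := range rawURLs {
		u, err := url.Parse(strings.TrimSpace(raw))
		if err != nil {
			continue
		}
		host := u.Hostname()
		if host == "" || seen[host] {
			continue
		}
		seen[host] = true
		hosts = append(hosts, host)
	}
	sort.Strings(hosts)
	return hosts
}

// envBaseURLs collects the non-empty base-URL values from a resolved environment.
func envBaseURLs(env map[string]string) []string {
	var urls []string
	for _, name := range baseURLEnvVars {
		if v := env[name]; v != "" {
			urls = append(urls, v)
		}
	}
	return urls
}

func main() {
	env := map[string]string{
		"OPENAI_BASE_URL": "https://api.together.ai/v1",
		"OLLAMA_BASE_URL": "http://localhost:11434",
	}
	fmt.Println(hostsFromBaseURLs(envBaseURLs(env)))
}
```

A value without a scheme, such as api.together.ai/v1, parses with an empty host and is therefore skipped in this sketch, so base URLs are expected in their full https:// form.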

Compatibility-restored deployment matrix

Setup	Pre-v0.14.1	Post-v0.14.1
Agent on Together.ai / OpenRouter / Groq with `OPENAI_BASE_URL` set	Skill subprocess 401'd against OpenAI	✓ hits the configured provider
`gpt-5-nano` or other OpenAI reasoning model	Followup turns 400'd on duplicate user message	✓ multi-turn conversations work
`code-review` skill with stale Anthropic key in `.forge/secrets.enc`	Provider mis-routed to Anthropic, 401	✓ explicit `REVIEW_PROVIDER` override
`code-review` against Moonshot / Kimi via Together.ai	`max_tokens` rejected, HTTP 400	✓ `max_completion_tokens` accepted
`forge package` with `OPENAI_BASE_URL=https://api.together.ai/v1`	`NetworkPolicy` blocked the provider	✓ host auto-added
Long-running Slack thread where LLM hit `finish_reason: length`	Empty-assistant-turn poison broke followups	✓ placeholder substitution + LoadFromStore sanitization
`forge skills add` with `OPENAI_API_KEY` encrypted	Wizard prompted for missing key	✓ shows `OPENAI_API_KEY (secrets) — ok`

Upgrade guide

Backward-compatible. No forge.yaml schema break. No code changes required for existing agents.

Three small things changed defaults; if you depended on them, you can override:

code-review skill provider selection — previously preferred Anthropic when both keys were set. Now prefers OpenAI. To preserve the old behavior: add REVIEW_PROVIDER=anthropic to your env.
code-review OPENAI_BASE_URL no longer triggers the Responses API. Previously the variable was both a base URL pointer AND a Responses API toggle. Now decoupled: setting OPENAI_BASE_URL always uses /chat/completions. If you relied on the Responses API: add OPENAI_USE_RESPONSES_API=1.
forge init no longer pre-writes empty one_of placeholders to .env. If your CI scripts grep .env for ANTHROPIC_API_KEY= or OPENAI_API_KEY= to detect "user hasn't filled this in," migrate to checking forge skills add's install output instead — it now prints the satisfaction state for one_of groups.

Sessions corrupted by issues #131 or #143 on a pre-v0.14.1 build do not need to be deleted with rm. Both fixes ship defense-in-depth passes on LoadFromStore that sanitize the bad shape on load. Just upgrade and send the next message in the affected channel thread.

Installation

Download a pre-built binary from the assets attached below: forge-Darwin-arm64.tar.gz, forge-Darwin-x86_64.tar.gz, forge-Linux-arm64.tar.gz, forge-Linux-x86_64.tar.gz, forge-Windows-arm64.zip, forge-Windows-x86_64.zip. Verify with checksums.txt.

Container image: ghcr.io/initializ/forge:v0.14.1.

Homebrew: brew upgrade initializ/tap/forge.

From source:

git clone --branch v0.14.1 https://github.com/initializ/forge
cd forge
go install ./forge-cli/cmd/forge

What's new since v0.14.0 — per-PR changelog

#132 fix(runtime): stop persisting empty assistant turns on finish_reason=length (closes #131)
#134 fix(skills/code-review): provider routing + OpenAI-compatible endpoint (closes #133)
#136 fix(skills): one_of env validation visibility + stop writing first-key placeholder (closes #135)
#138 fix(skills): always pass OPENAI_BASE_URL / ANTHROPIC_BASE_URL / OLLAMA_BASE_URL / GEMINI_BASE_URL to skill subprocesses (closes #137)
#140 fix(security): auto-add LLM provider base URL to egress allowlist at build + runtime (closes #139)
#142 fix(skills/code-review): max_completion_tokens for OpenAI Chat Completions (closes #141)
#144 fix(runtime): stop duplicating the inbound user message at the start of every fresh session (closes #143)

Full per-PR detail in CHANGELOG.md.

Compatibility

Go: 1.25+
A2A: 0.3.0
Forge agent schema: unchanged from v0.14.0 (additive only — new ModelRef.BaseURL and ModelFallback.BaseURL fields)
LLM providers verified: OpenAI (Chat Completions + Responses API), Anthropic, Ollama, Gemini, Together.ai, OpenRouter, Groq, Fireworks, Anyscale, Moonshot (Kimi-K2.6 series), Bedrock proxy patterns
Container image: ghcr.io/initializ/forge:v0.14.1

Documentation

Observability — Tracing (unchanged from v0.14.0)
forge.yaml schema (now includes model.base_url and model.fallbacks[].base_url)
CLI reference
Environment variables
Egress control
CHANGELOG.md

Acknowledgements

All seven fixes landed within hours of being reported by operators running real production-shaped agents on Slack-channel-backed deployments. The session-poison bug (#131) was caught on a long-running PR-review thread; the duplicate-user bug (#143) was caught on a gpt-5-nano agent; the egress / wizard / subprocess-env bugs (#135, #137, #139) were caught in sequence on the same Together.ai-backed deployment, with each fix exposing the next layer. The 7-PR plan landed with the same discipline that delivered OpenTelemetry Tracing v1 in v0.14.0.
