fix(discord): raise thread title max tokens for reasoning models#64172

Merged
steipete merged 3 commits into openclaw:main from
hanamizuki:fix/discord-thread-title-reasoning-max-tokens
Apr 10, 2026

Conversation

@hanamizuki
Contributor

@hanamizuki hanamizuki commented Apr 10, 2026

Summary

  • Problem: autoThreadName: "generated" silently fails to rename auto-created threads when the selected simple-completion model is a reasoning model (e.g. MiniMax M2, Claude thinking models, OpenAI o-series).
  • Why it matters: The feature introduced in #43366 (feat(discord): add autoThreadName 'generated' strategy) appears to do nothing for any setup whose default completion model happens to reason: threads stay named after the raw first message with no error or user-visible log.
  • What changed: DISCORD_THREAD_TITLE_MAX_TOKENS raised from 24 to 512 in extensions/discord/src/monitor/thread-title.ts, plus the matching expectation in the existing unit test.
  • What did NOT change (scope boundary): No changes to the rename flow, provider selection, model-ref resolution, prompt text, or the non-reasoning code path.

Change Type

  • Bug fix

Scope

  • Integrations (Discord)

Linked Issue/PR

Root Cause

The simple-completion call in generateThreadTitle sets maxTokens: 24 as a cost/latency guard, assuming the model's entire output budget will go to a 3-6 word title. That assumption holds for instruct-only models but breaks for reasoning models whose API response contains a thinking content block before the text block:

  1. The provider emits content: [{ type: "thinking", thinking: "..." }, { type: "text", text: "..." }].
  2. With maxTokens: 24, the entire budget is consumed by the thinking block before any text token is produced.
  3. extractAssistantText(response) walks the content array looking for text blocks, finds none, and returns "".
  4. normalizeGeneratedThreadTitle("") → "", so generated || null → null, and maybeRenameDiscordAutoThread early-returns.
  5. The only log emitted on that path is a logVerbose — invisible unless the gateway is explicitly in verbose mode — so the failure is silent in normal operation.
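
The chain in steps 1 to 4 can be sketched as follows. This is a minimal illustration, not the repository's code: `ContentBlock`, `extractTextSketch`, and `titleOrNullSketch` are hypothetical stand-ins for the response shape, `extractAssistantText`, and the `generated || null` step described above.

```typescript
// Hypothetical content-block shapes, modeled on the response described in step 1.
type ContentBlock =
  | { type: "thinking"; thinking: string }
  | { type: "text"; text: string };

// Illustrative stand-in for extractAssistantText: only text blocks contribute.
function extractTextSketch(content: ContentBlock[]): string {
  return content
    .filter((block): block is { type: "text"; text: string } => block.type === "text")
    .map((block) => block.text)
    .join("")
    .trim();
}

// Illustrative stand-in for the `generated || null` step: an empty string becomes null.
function titleOrNullSketch(content: ContentBlock[]): string | null {
  const generated = extractTextSketch(content);
  return generated || null;
}

// A response cut off inside the thinking block: no text block was ever emitted.
const truncated: ContentBlock[] = [
  { type: "thinking", thinking: "The user is asking about" },
];

// A response with enough budget for both blocks.
const complete: ContentBlock[] = [
  { type: "thinking", thinking: "A short title is needed." },
  { type: "text", text: "Fix Discord thread titles" },
];

const skippedTitle = titleOrNullSketch(truncated); // null, so the rename would be skipped
const producedTitle = titleOrNullSketch(complete); // the generated title
```

Under these assumptions, the truncated response yields `null` without throwing, which is why the failure is only visible through logging.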

Bumping the ceiling to 512 gives enough headroom for a short thinking pass plus the title output. The generous ceiling only costs more tokens when the provider actually reasons; instruct-only models still emit a short title and stop early at natural end-of-sequence.
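
The budget argument above can be illustrated with a toy model. It assumes one token per array element and a thinking-first output order; the 40-token thinking length is an arbitrary illustrative number, and `SimulatedOutput` and `applyMaxTokens` are hypothetical names, not part of the codebase.

```typescript
// Toy model: each array element stands for one output token (real tokenizers differ).
interface SimulatedOutput {
  thinking: string[];
  text: string[];
}

// Truncate a thinking-first output at maxTokens, the way an output ceiling would.
function applyMaxTokens(output: SimulatedOutput, maxTokens: number): SimulatedOutput {
  const thinking = output.thinking.slice(0, maxTokens);
  const remaining = Math.max(0, maxTokens - thinking.length);
  return { thinking, text: output.text.slice(0, remaining) };
}

// Illustrative reasoning model: 40 thinking tokens (arbitrary), then a 4-token title.
const reasoningModel: SimulatedOutput = {
  thinking: Array.from({ length: 40 }, () => "t"),
  text: ["Fix", "Discord", "thread", "titles"],
};

// Illustrative instruct-only model: no thinking, same short title.
const instructModel: SimulatedOutput = {
  thinking: [],
  text: ["Fix", "Discord", "thread", "titles"],
};

const starved = applyMaxTokens(reasoningModel, 24); // thinking uses all 24 tokens, text is empty
const roomy = applyMaxTokens(reasoningModel, 512); // full thinking plus the full title
const unchanged = applyMaxTokens(instructModel, 512); // still only 4 tokens emitted
```

In this model the higher ceiling changes nothing for the instruct-only output, since it ends on its own well before the limit.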

Test plan

  • pnpm test:extension discord — 928/928 tests passing locally (112 files)
  • Live end-to-end verification on a running gateway with the default simple-completion model set to a MiniMax M2 reasoning model served through an Anthropic-compatible API endpoint:
    • Before: extractAssistantText returned ""; generateThreadTitle returned null; maybeRenameDiscordAutoThread early-returned with no visible log; the thread kept the raw first-message name.
    • After: the response included both a thinking block and a text block (total token usage well under 512); rawText was non-empty; a concise title was produced; the Discord PATCH /channels/{id} call succeeded; the thread was visibly renamed in the Discord UI.
  • Existing unit test thread-title.generate.test.ts updated to expect maxTokens: 512 (it was the only test asserting the constant value).

AI-Assisted

  • Marked as AI-assisted
  • Tooling: Claude Code (Claude Opus 4.6, 1M context)
  • Degree of testing: fully tested — Discord extension test lane green + live end-to-end validation on a running gateway
  • I understand what the code does and why the fix is correct
  • Session logs available on request

Notes

  • Unrelated pnpm tsgo errors in extensions/discord/src/components.ts and siblings (DiscordModalEntry, DiscordComponentModalEntry, etc.) reproduce on an otherwise-clean upstream/main checkout — pre-existing and untouched by this PR.

@openclaw-barnacle openclaw-barnacle bot added the channel: discord (Channel integration: discord) and size: XS labels Apr 10, 2026
@greptile-apps
Contributor

greptile-apps bot commented Apr 10, 2026

Greptile Summary

Raises DISCORD_THREAD_TITLE_MAX_TOKENS from 24 to 512 in extensions/discord/src/monitor/thread-title.ts to fix a silent failure when autoThreadName: "generated" is used with reasoning models (e.g. MiniMax-M2, Claude thinking models, o-series). The root cause is that the old 24-token ceiling is entirely consumed by the thinking content block, leaving no budget for the text output and causing extractAssistantText to return empty. The companion test expectation is updated to match.

Confidence Score: 5/5

Safe to merge — the change is minimal, correctly targets the root cause, and is backed by live end-to-end verification.

Only a single P2 style finding (preferred test model constant not updated when the test file was touched). No logic, correctness, or security issues found. The token budget increase is well-justified and scoped correctly.

No files require special attention.

Comments Outside Diff (1)

  1. extensions/discord/src/monitor/thread-title.generate.test.ts, line 27-37 (link)

    P2 Preferred test model constant not updated

    The project's testing guidelines (CLAUDE.md) say to prefer sonnet-4.6 for Anthropic model constants in tests and to update older examples when touching those test files. This file was touched in this PR, so the claude-opus-4-6 fixture values on these lines (and line 131) should be updated to sonnet-4.6.

    Context Used: CLAUDE.md (source)


```suggestion
  prepareSimpleCompletionModelForAgentMock.mockResolvedValue({
    selection: {
      provider: "anthropic",
      modelId: "claude-sonnet-4-6",
      agentDir: "/tmp/openclaw-agent",
    },
    model: {
      provider: "anthropic",
      id: "claude-sonnet-4-6",
    },
    auth: {
      apiKey: "sk-test",
      source: "env:TEST_API_KEY",
      mode: "api-key",
    },
  } as Awaited<ReturnType<typeof agentRuntimeModule.prepareSimpleCompletionModelForAgent>>);
```


Reviews (1): Last reviewed commit: "fix(discord): raise thread title max tok..."

@hanamizuki hanamizuki force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 0237f06 to 87ac961 Compare April 10, 2026 07:43
hanamizuki added a commit to hanamizuki/openclaw that referenced this pull request Apr 10, 2026
Local cherry-pick of upstream PR openclaw#64172. Keeps the live
gateway build on this fork in sync with the upstream fix while the PR is
under review.

Root cause: when the simple-completion model used for thread-title
generation is a reasoning model, the 24-token output budget is consumed
entirely by the internal thinking block, leaving no tokens for the text
output. extractAssistantText returns empty, the rename is silently
skipped. Raising the ceiling to 512 gives enough headroom for thinking
plus a short title.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@hanamizuki hanamizuki force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 87ac961 to 228e6ed Compare April 10, 2026 08:03
hanamizuki added a commit to hanamizuki/openclaw that referenced this pull request Apr 10, 2026
Follow repo testing guideline to prefer sonnet-4.6 for Anthropic model
constants in tests (per CLAUDE.md, flagged by Greptile review on openclaw#64172).
@hanamizuki
Contributor Author

The failing CI checks (build-artifacts, build-smoke, check, check-additional, checks-fast-contracts-protocol) are unrelated to this PR — they all fail on pre-existing TypeScript errors in extensions/zalo/src/setup-core.ts and extensions/zalo/src/setup-surface.ts (Cannot find name 'promptZaloAllowFrom', accountId type mismatch). I verified the same check job is failing on main as well, so these are not introduced by this PR. Could a maintainer re-run after the zalo issue is fixed, or let me know if a rebase would help?

Also addressed the Greptile review's P2 note by updating the claude-opus-4-6 test fixture to claude-sonnet-4-6 per the repo testing guideline (commit 2d1c1cd).

@steipete steipete force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 2d1c1cd to 78301d9 Compare April 10, 2026 12:53
steipete pushed a commit to hanamizuki/openclaw that referenced this pull request Apr 10, 2026
Follow repo testing guideline to prefer sonnet-4.6 for Anthropic model
constants in tests (per CLAUDE.md, flagged by Greptile review on openclaw#64172).
steipete added a commit to hanamizuki/openclaw that referenced this pull request Apr 10, 2026
@steipete steipete force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 78301d9 to 0c54fbb Compare April 10, 2026 12:54
@steipete steipete force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 0c54fbb to 8e36c35 Compare April 10, 2026 13:00
hanamizuki and others added 3 commits April 10, 2026 14:06
When the simple-completion model selected for thread-title generation is a
reasoning model (e.g. MiniMax M2, Claude thinking models, OpenAI o-series),
the 24-token output budget is entirely consumed by the internal thinking
block before any user-visible text is emitted. extractAssistantText then
returns an empty string, generateThreadTitle returns null, and the
auto-thread rename is silently skipped while the feature appears to do
nothing.

Raise DISCORD_THREAD_TITLE_MAX_TOKENS to 512 so there is enough headroom
for a short thinking pass plus the 3-6 word title output. The generous
ceiling only matters when the provider actually reasons; non-reasoning
models still emit a short title and stop early at end-of-sequence.

Verified live against a MiniMax M2 reasoning model served through an
Anthropic-compatible API endpoint: before the fix, the rename never fired;
after the fix, the thread is renamed with a concise generated title.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow repo testing guideline to prefer sonnet-4.6 for Anthropic model
constants in tests (per CLAUDE.md, flagged by Greptile review on openclaw#64172).
@steipete steipete force-pushed the fix/discord-thread-title-reasoning-max-tokens branch from 8e36c35 to 4d721a4 Compare April 10, 2026 13:07
@steipete steipete merged commit d2b9d91 into openclaw:main Apr 10, 2026
9 checks passed
steipete pushed a commit that referenced this pull request Apr 10, 2026
Follow repo testing guideline to prefer sonnet-4.6 for Anthropic model
constants in tests (per CLAUDE.md, flagged by Greptile review on #64172).
@steipete
Contributor

Landed via rebase onto main.

  • Gate: pnpm check; pnpm test; pnpm build; reran pnpm check + pnpm test extensions/discord/src/monitor/thread-title.generate.test.ts after each final rebase.
  • PR head before GitHub rebase merge: 4d721a4152137054e968fd65de58c9dfc1a06ff5
  • Main tip after merge: d2b9d918af8a12f53bc66705e151e7119c40fefe

Thanks @hanamizuki!

steipete pushed a commit to 100yenadmin/openclaw-1 that referenced this pull request Apr 10, 2026
Follow repo testing guideline to prefer sonnet-4.6 for Anthropic model
constants in tests (per CLAUDE.md, flagged by Greptile review on openclaw#64172).
steipete added a commit to 100yenadmin/openclaw-1 that referenced this pull request Apr 10, 2026
hanamizuki added a commit to hanamizuki/openclaw that referenced this pull request Apr 11, 2026
…models

Follow-up to openclaw#64172. That PR raised DISCORD_THREAD_TITLE_MAX_TOKENS
from 24 to 512, which unblocked very short messages but left two
failure modes for moderately complex inputs on reasoning models:

1. 512 is still too tight when the thinking block alone exceeds the
   budget - extractAssistantText returns empty, generateThreadTitle
   returns null, rename is silently skipped.
2. DEFAULT_THREAD_TITLE_TIMEOUT_MS = 10_000 kills ~15% of real
   production samples: observed MiniMax M2.7 thinking latency ranges
   1.7s to 17s for the same task, median 7s, p95 16.9s (n=20).

Raise MAX_TOKENS 512 -> 4096 and TIMEOUT 10_000 -> 60_000. Both
headrooms are safe because maybeRenameDiscordAutoThread is dispatched
without await - a longer rename cannot block message delivery. Worst
case is the thread keeps its original title for up to a minute longer
before being renamed, which is strictly better than the 15% silent
failure rate today.

Update the matching maxTokens assertion in the unit test. Full data,
percentile distribution, and why-these-numbers analysis in the PR
description.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
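
The "dispatched without await" argument in the follow-up commit message above can be sketched as below. This is an illustration under stated assumptions, not the gateway's code: `renameSketch` and `deliverMessageSketch` are hypothetical names, and `await Promise.resolve()` stands in for the slow model call and Discord request.

```typescript
// Records the order in which the two steps finish.
const events: string[] = [];

// Hypothetical stand-in for the rename: it suspends once, as a real network call would.
async function renameSketch(): Promise<void> {
  await Promise.resolve(); // stands in for the slow completion and PATCH request
  events.push("renamed");
}

// Hypothetical delivery path: the rename is started but never awaited.
function deliverMessageSketch(): void {
  // Fire-and-forget: `void` marks the promise as intentionally unawaited,
  // and the catch handler keeps a failed rename from becoming an unhandled rejection.
  void renameSketch().catch(() => {
    events.push("rename failed");
  });
  events.push("delivered");
}

deliverMessageSketch();
// At this point only "delivered" has been recorded; "renamed" is appended later,
// once the pending promise resumes.
```

Because delivery never waits on the rename promise, a longer timeout in this pattern only delays the rename itself.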
