
v4.4.0: streaming perf + Gemini thinking + DSML parser hardening #408

Merged: CJackHwang merged 26 commits into main from dev on May 2, 2026

@CJackHwang (Owner) commented May 2, 2026

Summary

  • Streaming perf: TTFT optimization with new accumulation buffer architecture,
    bufio.Scanner→Reader migration, keep-alive heartbeat, assistant turn refactor
  • Gemini thinking: full thinking block support, enabled by default for supported models
  • DSML parser hardening: hyphen/underscore tag variants, CDATA markdown fence resilience,
    line-offset tracking fixes, Node + Go parser alignment
  • Legacy cleanup: removed compatibility config/UI, history split logic, centralized
    reference marker stripping
  • Bug fixes: thinking-only empty output retry, continuation replay overlap prevention
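The "continuation replay overlap prevention" item can be illustrated with a minimal Go sketch. It assumes the idea is to drop the prefix of a continuation chunk that repeats the tail of the text already emitted. The function name `trimContinuationOverlap` and its string-based signature are hypothetical; the PR's own helper (`TrimContinuationOverlapFromBuilder`) works on a builder and may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// trimContinuationOverlap returns the part of next that is not already
// present at the end of prev. It looks for the longest suffix of prev that
// is also a prefix of next. Hypothetical helper, not the PR's implementation.
// For valid UTF-8 input the cut always falls on a rune boundary.
func trimContinuationOverlap(prev, next string) string {
	max := len(prev)
	if len(next) < max {
		max = len(next)
	}
	for n := max; n > 0; n-- {
		if strings.HasSuffix(prev, next[:n]) {
			return next[n:]
		}
	}
	return next // no overlap: keep the whole chunk
}

func main() {
	var emitted strings.Builder
	emitted.WriteString("The quick brown")

	// A replayed continuation that repeats the last word already sent.
	// Builder.String does not copy the accumulated bytes.
	fresh := trimContinuationOverlap(emitted.String(), "brown fox")
	emitted.WriteString(fresh)
	fmt.Printf("%q\n", emitted.String())
}
```

The scan is quadratic in the overlap length, which is acceptable for short replay windows; a production version would likely cap the window size.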

Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 📝 docs

Test plan

  • Streaming TTFT regression on chat endpoint
  • DSML tool-call parsing edge cases (hyphen/underscore/markdown fence variants)
  • Gemini thinking blocks functional verification
  • Empty output retry loop with thinking-only responses

VanceHud and others added 13 commits May 1, 2026 14:29
Fix deployment failure on Zeabur
feat(openai): retrieve uploaded file metadata
Core changes:
- stream.go: New accumulation buffer architecture with scanner goroutine
  + select loop, MinChars=16, MaxWait=10ms, first-flush-immediate
- dedupe.go: Add TrimContinuationOverlapFromBuilder to avoid string copies
- claude/stream_runtime_core.go: Integrate toolstream for incremental text
- claude/stream_runtime_finalize.go: toolstream flush support
- stream_emitter.js: Reduce DeltaCoalescer thresholds (160->16 chars, 80->20ms)
- empty_retry: Add thinking-aware empty output detection
- Fix reasoning_content leak and finish_reason=null in edge cases
- Fix tail content truncation when max_tokens exceeded

Tests: sync test expectations with upstream for thinking content
…tion

perf(streaming): optimize TTFT and reduce buffering latency
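The accumulation buffer described in this commit (producer goroutine, select loop, MinChars=16, MaxWait=10ms, first-flush-immediate) might look roughly like the Go sketch below. Only the three parameters come from the commit message. The function name `coalesce`, the channel-based interface, and counting bytes rather than runes are assumptions made for illustration, not the contents of stream.go.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Thresholds named in the commit message. How the real code applies them
// is an assumption of this sketch.
const (
	minChars = 16                    // flush once this many bytes are buffered
	maxWait  = 10 * time.Millisecond // or once buffered text has waited this long
)

// coalesce batches fragments from in and passes each batch to emit.
// The very first non-empty fragment is emitted immediately to keep TTFT low.
func coalesce(in <-chan string, emit func(string)) {
	var buf strings.Builder
	var deadline <-chan time.Time // nil (never fires) while the buffer is empty
	first := true

	flush := func() {
		if buf.Len() > 0 {
			emit(buf.String())
			buf.Reset()
		}
		deadline = nil
	}

	for {
		select {
		case s, ok := <-in:
			if !ok { // producer finished: emit whatever is left
				flush()
				return
			}
			if s == "" {
				continue
			}
			buf.WriteString(s)
			switch {
			case first:
				first = false
				flush()
			case buf.Len() >= minChars:
				flush()
			case deadline == nil:
				deadline = time.After(maxWait)
			}
		case <-deadline:
			flush()
		}
	}
}

func main() {
	in := make(chan string)
	go func() { // stands in for the goroutine that reads the upstream SSE body
		defer close(in)
		for _, s := range []string{"Hi", " there", ", this", " is a stream."} {
			in <- s
			time.Sleep(2 * time.Millisecond)
		}
	}()
	coalesce(in, func(s string) { fmt.Printf("%q\n", s) })
}
```

Batch boundaries after the first flush depend on timing, so the exact output of `main` varies between runs. The concatenation of all batches always equals the input.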

vercel Bot commented May 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project   Deployment   Actions            Updated (UTC)
ds2api    Ready        Preview, Comment   May 2, 2026 11:33pm


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eccd8c957b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/httpapi/gemini/handler_stream_runtime.go Outdated
Comment thread internal/sse/stream.go Outdated
@CJackHwang CJackHwang changed the title from "4.3.1" to "4.4.0" May 2, 2026
… into new assistantturn and completionruntime packages
…ing and track emitted text to prevent redundant output blocks
CJackHwang and others added 3 commits May 3, 2026 01:00
…rent input file handling into the completion runtime pipeline.
…tecture docs

Fill gaps identified in architecture audit: add artifacts/ and static/ to
directory tree, and document 7 auxiliary internal/ packages (textclean,
claudeconv, compat, rawsample, devcapture, util, version) in Section 3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CJackHwang and others added 2 commits May 3, 2026 03:09
Add compatibility for <dsml-tool-calls>/<dsml-invoke>/<dsml-parameter>
tag forms alongside the canonical pipe-prefixed DSML shell. Hyphenated
forms only activate when a DSML prefix is detected, preventing false
matches on bare XML lookalikes. Go and Node parsers aligned, with tests
covering here-doc CDATA, streaming sieve, and negative lookalike cases.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tripping

- ValidateTurn no longer errors on thinking-only responses, deferring to
  ShouldRetryEmptyOutput which now also covers thinking-only outputs.
- Empty output retry uses multi-turn follow-up with a regeneration prompt
  suffix and parent_message_id in the same DeepSeek session.
- Centralize StripReferenceMarkersEnabled into textclean package to
  eliminate duplicated hardcoded booleans across 4 protocol handlers.
- Log a deprecation warning when the legacy "compat" config key is used.
- Document thinking-only retry and reference marker stripping in API.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
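The retry behaviour described in this commit can be sketched in Go as follows, under stated assumptions. The `turn` and `followUp` types, the attempt limit, the tool-call exemption, and the suffix text are all hypothetical. Only the function name `ShouldRetryEmptyOutput`, the thinking-only case, and the use of a prompt suffix plus `parent_message_id` come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// turn is a hypothetical summary of one assistant response.
type turn struct {
	Text      string // visible answer text
	Thinking  string // reasoning content, not shown as the answer
	ToolCalls int    // number of parsed tool calls (assumed to count as output)
}

// shouldRetryEmptyOutput reports whether the turn should be regenerated.
// Thinking is deliberately ignored: a thinking-only turn still has no
// visible output, so it is treated like an empty one.
func shouldRetryEmptyOutput(t turn, attempt, maxAttempts int) bool {
	if attempt >= maxAttempts || t.ToolCalls > 0 {
		return false
	}
	return strings.TrimSpace(t.Text) == ""
}

// regenerateSuffix is placeholder wording, not the prompt used by the project.
const regenerateSuffix = "\n\n(Your previous reply was empty. Please answer again.)"

// followUp is a hypothetical request for the next turn in the same session.
type followUp struct {
	Prompt          string
	ParentMessageID string
}

// buildRetry chains the retry to the empty turn through its message ID.
func buildRetry(prompt, parentMessageID string) followUp {
	return followUp{Prompt: prompt + regenerateSuffix, ParentMessageID: parentMessageID}
}

func main() {
	t := turn{Thinking: "Let me consider the question..."}
	if shouldRetryEmptyOutput(t, 0, 2) {
		fmt.Printf("%+v\n", buildRetry("Summarize the file.", "msg-123"))
	}
}
```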
…ariants

Support `<dsml_tool_calls>`, `<dsml_invoke>`, `<dsml_parameter>` in
addition to the existing pipe, space, hyphen, and collapsed forms.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
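The prefix gating described in the two DSML commits above can be illustrated with a short Go sketch. It covers only the hyphen and underscore spellings quoted in the commit messages. The pipe, space, and collapsed forms are left out because their exact syntax is not shown here. The names `canonicalDSMLTag` and `tagName` and the canonical kind strings are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// dsmlTags maps accepted tag spellings to a canonical kind. Every key
// carries the "dsml" prefix, so bare lookalikes such as <tool-calls> or
// <invoke> are never matched.
var dsmlTags = map[string]string{
	"dsml-tool-calls": "tool_calls",
	"dsml-invoke":     "invoke",
	"dsml-parameter":  "parameter",
	"dsml_tool_calls": "tool_calls",
	"dsml_invoke":     "invoke",
	"dsml_parameter":  "parameter",
}

// tagName extracts the element name from an opening or closing tag,
// for example "dsml-invoke" from `<dsml-invoke name="x">`.
func tagName(raw string) string {
	s := strings.TrimPrefix(strings.TrimSpace(raw), "<")
	s = strings.TrimPrefix(s, "/")
	if i := strings.IndexAny(s, " \t\n/>"); i >= 0 {
		s = s[:i]
	}
	return s
}

// canonicalDSMLTag reports the canonical kind of a DSML tag, or false when
// the tag is not one of the recognized prefixed spellings.
func canonicalDSMLTag(raw string) (string, bool) {
	kind, ok := dsmlTags[tagName(raw)]
	return kind, ok
}

func main() {
	for _, tag := range []string{
		`<dsml-invoke name="read_file">`,
		"</dsml_tool_calls>",
		"<tool-calls>", // bare XML lookalike: must not match
	} {
		kind, ok := canonicalDSMLTag(tag)
		fmt.Printf("%-32s kind=%q ok=%v\n", tag, kind, ok)
	}
}
```

An exact-match table keeps mixed spellings such as `dsml-tool_calls` out; whether the real parsers accept those is not stated in the commits.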
@CJackHwang CJackHwang changed the title from "4.4.0" to "v4.4.0: streaming perf + Gemini thinking + DSML parser hardening" May 2, 2026
@CJackHwang CJackHwang merged commit c32fe30 into main May 2, 2026
14 checks passed
Development

Successfully merging this pull request may close these issues.

Model output is frequently garbled [Bug]
[Bug] The read tool's "file unchanged" optimization causes an edit deadlock in long sessions

3 participants