v0.5.1

youssefvdel released this 16 Jun 01:22
· 12 commits to main since this release

What's Changed

Features

  • Show unlock timestamp in throttle logs and API responses
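
One way the unlock timestamp could be computed and surfaced is sketched below. This is only an illustration: the `ThrottleEntry` shape, the `unlock_at` field, and the log format are assumptions, since these notes do not show the project's actual throttle record or response schema.

```typescript
// Hypothetical shape: the project's real throttle record is not shown in these notes.
interface ThrottleEntry {
  accountId: string;
  throttledAtMs: number; // epoch milliseconds when the rate limit was hit
  durationMs: number; // how long the account stays locked
}

// Returns the moment the account becomes usable again, as an ISO 8601 string.
function unlockTimestamp(entry: ThrottleEntry): string {
  return new Date(entry.throttledAtMs + entry.durationMs).toISOString();
}

// Builds a log line and an API-style body that both carry the unlock time.
// The field name `unlock_at` is an assumption made for this sketch.
function describeThrottle(entry: ThrottleEntry): {
  log: string;
  body: { error: string; unlock_at: string };
} {
  const unlockAt = unlockTimestamp(entry);
  return {
    log: `[throttle] account=${entry.accountId} locked until ${unlockAt}`,
    body: { error: "rate_limited", unlock_at: unlockAt },
  };
}
```

Storing the absolute unlock time rather than a remaining duration also makes the value meaningful after a restart, which fits the throttle persistence fix listed under Bug Fixes.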

Bug Fixes

  • Auto-switch accounts on rate limit
  • Non-streaming path now extracts local MCP tool calls
  • Fix emittedToolCallCount index for concurrent tool call SSE events
  • Remove prewarm pool and persist throttle state across restarts
  • Delete stale prewarmed sessions from Qwen servers
  • Release session on stream timeout (fixes chat leak)
  • Remove top-level tools from Qwen payload to preserve full thinking
  • Fix answer content lost with thinking_format=full
  • Full thinking_format support with real-time reasoning_content streaming
  • Stop sending the timeout error message to the client; terminate silently with [DONE]
  • Deduplicate feature_config and switch thinking_format to full
  • Delete Qwen session immediately after response (previously delayed by 60s)
  • Always write [DONE] on error paths to prevent SSE client hangs
  • Real error logging, thinking capture, upstream SSE error detection
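
Several of the fixes above concern ending the stream with `[DONE]` even when something fails. A minimal sketch of that pattern follows, assuming a hypothetical `SseSink` interface and `relayEvents` function (neither name comes from the project). For brevity the upstream is a synchronous iterable; a real stream would be asynchronous, but the `try`/`finally` structure would be the same.

```typescript
// Hypothetical sink interface standing in for an HTTP response stream.
interface SseSink {
  write(chunk: string): void;
  end(): void;
}

// Relays upstream payloads as SSE data events and always terminates the stream.
function relayEvents(upstream: Iterable<string>, sink: SseSink): void {
  try {
    for (const data of upstream) {
      sink.write(`data: ${data}\n\n`);
    }
  } catch (err) {
    // Log on the server only; the client just sees the stream end.
    console.error("upstream failed:", err);
  } finally {
    // Runs on success and on error, so the client always receives a terminator.
    sink.write("data: [DONE]\n\n");
    sink.end();
  }
}
```

Putting the terminator in `finally` rather than at the end of the happy path means a new error branch cannot forget it, which is the kind of omission that leaves an SSE client waiting indefinitely.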

Performance

  • Skip filter pipeline when inside tool call block (depth > 0)
  • Streaming parse optimization + sliding window fix
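
The depth-based skip could look roughly like the sketch below. The `<tool_call>` markers, the `ChunkProcessor` class, and the assumption that each marker arrives as its own chunk are all illustrative; the notes do not describe the real delimiters or pipeline.

```typescript
// Hypothetical markers; the real tool call delimiters are not given in these notes.
const OPEN = "<tool_call>";
const CLOSE = "</tool_call>";

type Filter = (text: string) => string;

class ChunkProcessor {
  private depth = 0; // nesting level of open tool call blocks
  private filters: Filter[];

  constructor(filters: Filter[]) {
    this.filters = filters;
  }

  // Assumes each marker arrives as a whole chunk, which keeps the sketch short.
  process(chunk: string): string {
    if (chunk === OPEN) {
      this.depth += 1;
      return chunk;
    }
    if (chunk === CLOSE) {
      this.depth = Math.max(0, this.depth - 1);
      return chunk;
    }
    // Inside a tool call block: pass the chunk through without running filters.
    if (this.depth > 0) return chunk;
    return this.filters.reduce((text, f) => f(text), chunk);
  }
}
```

Tool call payloads are passed through verbatim, so the filters run only on text that is not inside a tool call block.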

Refactor

  • Remove dead toolResultContents field across pipeline

Tests

  • Fix getAll test env var collision in CI
  • Delete throttleAccount test that wrote test data to real accounts.json

CI

  • Switch CI from setup-node/npm to oven-sh/setup-bun/bun

Chores

  • Clean up reports/docs, thinking format improvements
  • Bump version to 0.5.1