Skip to content

NovaMLX v1.1.0

Choose a tag to compare

@cnshsliu cnshsliu released this 10 May 13:52
· 112 commits to main since this release

What's New in v1.1.0

New Features

  • Python SDK (sdk/python/) — full client library with admin, streaming, tool calling examples and tests
  • Tokenhub integration — types + menu bar page
  • Audio transcription (/v1/audio/transcriptions) — Qwen3-ASR via Swift/MLX
  • Image generation (/v1/images/generations) — SDXL-Turbo via Swift/MLX
  • Modelfile system — user-authored model recipes with system prompt and sampling overrides
  • Per-request keep_alive — override model TTL per request
  • Harmony streaming — GPT-OSS channel-aware format
  • reasoning_effort parameter — OpenAI-standard thinking budget control
  • Logprobs supportlogprobs and top_logprobs (OpenAI standard)
  • Auto-load coordinator — SSE keep-alive for cold model loads
  • Nova capabilitiesnova.capabilities exposed on /v1/models

Logging Overhaul

  • Rotating log files — keeps last 5 rotated copies instead of truncating
  • Runtime log levelGET/PUT /admin/api/log-level admin endpoint
  • Spam reduction — SSE, RunLoop, generate noise demoted to debug
  • Module prefix convention — all logs use [Module] format (Engine, SSE, Auth, etc.)
  • AuthClient fix — replaced per-call file I/O with os.Logger

Infrastructure

  • E2E model test suiteScripts/test-all-models.sh (load → 4 API tests → unload per model)
  • Architecture doc — comprehensive architecture.md with module deep dives, request lifecycle, diagnostic playbook
  • Updated docs — CHANGELOG, DEVELOPMENT.md, features.md, features.zh-CN.md with corrected ports and new sections
  • .gitignore — added build artifact patterns, vendors, .grok

Bug Fixes

  • Tool message mapping preserves tool_calls and tool_call_id (OpenAI + Anthropic)
  • Streaming prompt_tokens plumbed through to usage stats
  • Benign macOS memory pressure warnings demoted from WARN to DEBUG
  • SSE finished(nil) demoted from WARN to DEBUG

Full Changelog

  • feat(api): add reasoning_effort parameter (OpenAI standard)
  • feat(api): add logprobs and top_logprobs support (OpenAI standard)
  • feat(api): add auto-load coordinator with SSE keep-alive for cold loads
  • docs: consolidate TODOs into TODO.md and retire P1.1 (VRAM recovery)
  • test(scheduler): chaos tests + assertions for race regression coverage
  • test(api): tool message mapping edge cases (OpenAI + Anthropic)
  • feat(api): expose nova.capabilities on /v1/models
  • chore(log): demote benign macOS .warning pressure to debug
  • fix(api): plumb prompt_tokens through to streaming usage
  • chore(log): demote SSE finished(nil) WARN to DEBUG
  • feat: audio transcription, image generation, modelfiles, keep_alive, harmony streaming
  • feat: logging overhaul, tokenhub, Python SDK, docs update