Release NovaMLX v1.1.0 · cnshsliu/novamlx

What's New in v1.1.0

Python SDK (sdk/python/) — full client library with admin, streaming, tool calling examples and tests
Tokenhub integration — types + menu bar page
Audio transcription (/v1/audio/transcriptions) — Qwen3-ASR via Swift/MLX
Image generation (/v1/images/generations) — SDXL-Turbo via Swift/MLX
Modelfile system — user-authored model recipes with system prompt and sampling overrides
Per-request keep_alive — override model TTL per request
Harmony streaming — GPT-OSS channel-aware format
reasoning_effort parameter — OpenAI-standard thinking budget control
Logprobs support — logprobs and top_logprobs (OpenAI standard)
Auto-load coordinator — SSE keep-alive for cold model loads
Nova capabilities — nova.capabilities exposed on /v1/models

Rotating log files — keeps last 5 rotated copies instead of truncating
Runtime log level — GET/PUT /admin/api/log-level admin endpoint
Spam reduction — SSE, RunLoop, generate noise demoted to debug
Module prefix convention — all logs use [Module] format (Engine, SSE, Auth, etc.)
AuthClient fix — replaced per-call file I/O with os.Logger

E2E model test suite — Scripts/test-all-models.sh (load → 4 API tests → unload per model)
Architecture doc — comprehensive architecture.md with module deep dives, request lifecycle, diagnostic playbook
Updated docs — CHANGELOG, DEVELOPMENT.md, features.md, features.zh-CN.md with corrected ports and new sections
.gitignore — added build artifact patterns, vendors, .grok

Tool message mapping preserves tool_calls and tool_call_id (OpenAI + Anthropic)
Streaming prompt_tokens plumbed through to usage stats
Benign macOS memory pressure warnings demoted from WARN to DEBUG
SSE finished(nil) demoted from WARN to DEBUG

feat(api): add reasoning_effort parameter (OpenAI standard)
feat(api): add logprobs and top_logprobs support (OpenAI standard)
feat(api): add auto-load coordinator with SSE keep-alive for cold loads
docs: consolidate TODOs into TODO.md and retire P1.1 (VRAM recovery)
test(scheduler): chaos tests + assertions for race regression coverage
test(api): tool message mapping edge cases (OpenAI + Anthropic)
feat(api): expose nova.capabilities on /v1/models
chore(log): demote benign macOS .warning pressure to debug
fix(api): plumb prompt_tokens through to streaming usage
chore(log): demote SSE finished(nil) WARN to DEBUG
feat: audio transcription, image generation, modelfiles, keep_alive, harmony streaming
feat: logging overhaul, tokenhub, Python SDK, docs update