v4.0.0 — Full MLX LLM stack

MKS-01 released this 20 Jun 16:28

Summary LLM and vision OCR now run in-process via mlx-lm + mlx-vlm on Apple Silicon, unifying with CSM-1B TTS under one framework. Ollama removed — no external daemon needed.

What changed

  • OllamaConfig → LLMConfig; config key ollama: → llm: (old key auto-migrated)
  • ollama dependency replaced by mlx-lm + mlx-vlm
  • +25–30% generation speed
  • Model discovery scans HF cache instead of Ollama API
  • Default model: mlx-community/Qwen3.5-9B-4bit
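
The key migration and the cache-based model discovery listed above could look roughly like the following. This is a minimal sketch, not the project's actual code: `migrate_config` and `discover_cached_models` are hypothetical names, the config is assumed to be a plain dict loaded from a file, and the only outside fact relied on is the Hugging Face hub cache layout, where each cached model repo is a directory named `models--<org>--<name>`.

```python
from pathlib import Path


def migrate_config(raw: dict) -> dict:
    """Return a copy of raw with a legacy top-level 'ollama' key renamed to 'llm'.

    Hypothetical helper: the real migration may handle more cases.
    """
    migrated = dict(raw)
    if "ollama" in migrated:
        legacy = migrated.pop("ollama")
        # If an 'llm' section already exists, keep it and drop the legacy one.
        migrated.setdefault("llm", legacy)
    return migrated


def discover_cached_models(hub_dir: Path) -> list[str]:
    """List model repo ids found in a Hugging Face hub cache directory.

    Hypothetical helper: it lists every cached model repo and does not
    check whether a repo is actually loadable by mlx-lm or mlx-vlm.
    """
    if not hub_dir.is_dir():
        return []
    repo_ids = []
    for entry in sorted(hub_dir.iterdir()):
        # Model repos are stored as models--<org>--<name>; datasets and
        # spaces use other prefixes and are skipped.
        if entry.is_dir() and entry.name.startswith("models--"):
            repo_ids.append(entry.name[len("models--"):].replace("--", "/"))
    return repo_ids
```

By default the hub cache usually lives at `~/.cache/huggingface/hub` (it can be relocated with environment variables such as `HF_HOME`), so a call like `discover_cached_models(Path.home() / ".cache" / "huggingface" / "hub")` would return entries such as `mlx-community/Qwen3.5-9B-4bit` once that model has been downloaded.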