v1.0.0 — the engine overhaul
TurboLLM 1.0.0 — engines reimagined: hardware-aware, self-updating, and bring-your-own from any source.
Added
- Hardware-aware recommendation + a unified, fit-labeled engine catalog (llama.cpp, KoboldCpp, llamafile, MLX, vLLM, + ik_llama / TurboQuant forks).
- KoboldCpp and llamafile as first-class engine kinds (GGUF, OpenAI-compatible), verified end-to-end.
- Guided "Add your own engine" (folder scan) + a build-from-source guide (Windows + CUDA: prereq check + commands).
- Honest engine updates — real upstream check, per-engine Off/Notify/Auto (default Notify), rollback-safe apply, "Rebuild available" for source builds.
- "Register my engine" funnel, HF-cache default model dir (zero-config onboarding), grouped engine/version dropdown.
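To illustrate the per-engine Off/Notify/Auto update modes and the rollback-safe apply listed above, here is a minimal Python sketch. It is not TurboLLM's actual code: the names `UpdatePolicy`, `plan_update`, and `apply_with_rollback` are hypothetical, versions are assumed to be plain tag strings compared for equality, and an engine is assumed to live in a single directory that can be copied as a backup.

```python
import shutil
from enum import Enum
from pathlib import Path
from typing import Callable


class UpdatePolicy(Enum):
    """Per-engine update mode (hypothetical names mirroring Off/Notify/Auto)."""
    OFF = "off"
    NOTIFY = "notify"
    AUTO = "auto"


def plan_update(installed: str, upstream: str,
                policy: UpdatePolicy = UpdatePolicy.NOTIFY) -> str:
    """Decide what to do given the installed and upstream version tags.

    Assumption: tags are opaque strings, so any difference counts as an update.
    """
    if policy is UpdatePolicy.OFF:
        return "skip"
    if installed == upstream:
        return "up-to-date"
    return "apply" if policy is UpdatePolicy.AUTO else "notify"


def apply_with_rollback(engine_dir: Path,
                        install: Callable[[Path], None]) -> bool:
    """Run `install` on engine_dir; restore the previous contents if it raises."""
    backup = engine_dir.with_name(engine_dir.name + ".bak")
    if backup.exists():
        shutil.rmtree(backup)
    shutil.copytree(engine_dir, backup)  # snapshot before touching anything
    try:
        install(engine_dir)
    except Exception:
        # Discard the half-applied update and put the snapshot back.
        shutil.rmtree(engine_dir, ignore_errors=True)
        backup.rename(engine_dir)
        return False
    shutil.rmtree(backup)  # update succeeded, snapshot no longer needed
    return True
```

With the default policy, `plan_update("b100", "b101")` returns `"notify"`, matching the default Notify mode; only `UpdatePolicy.AUTO` yields `"apply"`, at which point a caller would hand its installer to `apply_with_rollback`.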
Changed
- Redesigned, beginner-first Engines screen (status hero + Running-now switcher → unified Install & manage catalog → collapsed Advanced).
- De-pinned official llama.cpp for updates.
- Route-level code-splitting (~1 MB → ~314 kB initial JS).
Fixed
- The misleading "you're on the latest" for official llama.cpp (now checks real upstream).
- llamafile launch on current versions (--no-webui); cross-engine KV-cache-type bleed (turbo* gated to supporting engines).
- Loopback guard on engine add/scan (block LAN-triggered arbitrary binary execution).
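The loopback guard on engine add/scan can be pictured roughly as follows. This is a sketch under assumptions, not the shipped implementation: `is_loopback_client` and `guard_engine_request` are hypothetical names, and the server is assumed to expose the peer address of each request as a string.

```python
import ipaddress


def is_loopback_client(remote_addr: str) -> bool:
    """Return True only if the peer address is a loopback address."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Hostnames or malformed input are rejected rather than resolved.
        return False
    # Treat IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) like its IPv4 form.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return ip.is_loopback


def guard_engine_request(remote_addr: str) -> None:
    """Raise before any engine add/scan work if the caller is not local."""
    if not is_loopback_client(remote_addr):
        raise PermissionError(
            f"engine add/scan is restricted to loopback clients, got {remote_addr!r}"
        )
```

Calling `guard_engine_request("192.168.1.20")` raises `PermissionError`, while `"127.0.0.1"` and `"::1"` pass. Rejecting unparseable input by default keeps the check fail-closed, which is the safer choice when the guarded action can execute a binary.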
Full changelog: see CHANGELOG.md.