Headline
- Speculative decoding now accepts draft/MTP/EAGLE3 checkpoints, with new Gemma-4 MTP models ready to pull.
- ROCm installation and GPU detection are restored for Radeon RX RDNA2/3/4 dGPUs on Windows and Linux.
- Backend installs are now crash-safe, with resilient downloads that fall back gracefully when a release lookup or cache snapshot is unavailable.
- `lemonade bench --response-log` captures each model's responses and run metadata to a JSONL file for later quality evaluation.
- The `lemonade backends` command now lists only supported recipes and backends by default; use `lemonade backends --all` to see every available option.
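
Because the response log is JSONL (one JSON object per line), it can be inspected with standard-library Python. The sketch below is only an illustration: the file name `bench_responses.jsonl` is hypothetical, and no record fields are assumed, so the helpers just parse each line and report which top-level keys appear.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Parse a JSONL file into a list of Python objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def key_counts(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts


if __name__ == "__main__":
    # Hypothetical file name; substitute the path given to --response-log.
    log_path = Path("bench_responses.jsonl")
    if log_path.exists():
        records = load_jsonl(log_path)
        print(f"{len(records)} records")
        for key, count in key_counts(records).most_common():
            print(f"{key}: {count}")
```

Listing the keys first is a cautious way to discover the log's actual schema before writing any evaluation code against it.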
Breaking Changes
- The `--model-draft`, `-md`, and `--spec-draft-model` flags are now reserved for internal speculative-decoding support and can no longer be passed manually through `llamacpp_args`.
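
For existing configurations that may still carry these flags in `llamacpp_args`, a client-side pre-check can surface them before they reach the server. This is a minimal sketch, not Lemonade's own validation logic: the helper name `find_reserved_flags` is hypothetical, and it assumes `llamacpp_args` is held as a single shell-style string.

```python
import shlex

# Flags named in the breaking-change note above.
RESERVED_FLAGS = {"--model-draft", "-md", "--spec-draft-model"}


def find_reserved_flags(llamacpp_args):
    """Return the reserved flags found in a shell-style argument string, in order."""
    found = []
    for token in shlex.split(llamacpp_args):
        # Cover both the "--flag value" and "--flag=value" spellings.
        flag = token.split("=", 1)[0]
        if flag in RESERVED_FLAGS:
            found.append(flag)
    return found


if __name__ == "__main__":
    # "draft.gguf" is a placeholder file name used only for illustration.
    offending = find_reserved_flags("--ctx-size 4096 -md draft.gguf")
    if offending:
        print("Remove these flags from llamacpp_args:", ", ".join(offending))
```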
Lemonade Server
| Operating System | Downloads |
|---|---|
| Windows | lemonade.msi |
| Ubuntu 24.04+ | Launchpad PPA |
| Debian 13 | lemonade-server_10.8.1-debian13_amd64.deb |
| Fedora 43 | lemonade-server-10.8.1-fc43.x86_64.rpm |
| Fedora 44 | lemonade-server-10.8.1-fc44.x86_64.rpm |
| macOS | Lemonade-10.8.1-Darwin.pkg |
Other platforms? See our Installation Options for Docker, Snap, Arch, Debian, and more.
Embeddable Lemonade
Portable binaries for bundling into your own installer. Run `lemond ./` as a subprocess.
| Platform | Download |
|---|---|
| Ubuntu x64 | lemonade-embeddable-10.8.1-ubuntu-x64.tar.gz |
| Windows x64 | lemonade-embeddable-10.8.1-windows-x64.zip |
| macOS arm64 | lemonade-embeddable-10.8.1-macos-arm64.tar.gz |
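
Launching the embeddable binary from a host application might look like the sketch below. It assumes the archive was extracted into a hypothetical `lemonade-embeddable` directory that contains a `lemond` executable (the `lemond.exe` name on Windows is also an assumption), and it passes `./` as the only argument, as in the note above. Readiness checks and shutdown handling beyond a plain terminate are left out.

```python
import subprocess
import sys
from pathlib import Path


def lemond_command(install_dir):
    """Build the argv for running lemond from an extracted embeddable archive."""
    exe_name = "lemond.exe" if sys.platform == "win32" else "lemond"
    # Resolve to an absolute path so the command still works after cwd changes.
    return [str(Path(install_dir).resolve() / exe_name), "./"]


def start_lemond(install_dir):
    """Start lemond as a child process with the install directory as its cwd."""
    return subprocess.Popen(
        lemond_command(install_dir),
        cwd=str(Path(install_dir).resolve()),
    )


if __name__ == "__main__":
    # Hypothetical extraction directory; adjust to your installer layout.
    install_dir = Path("lemonade-embeddable")
    if Path(lemond_command(install_dir)[0]).exists():
        process = start_lemond(install_dir)
        try:
            process.wait()
        except KeyboardInterrupt:
            process.terminate()
            process.wait()
```

Setting the working directory to the extraction directory keeps the relative `./` argument pointing at the bundled files regardless of where the host application was started.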
What's Changed
Thanks @GabrielReusRodriguez, @Kushal1213, @Phqen1x, @abn, @bitgamma, @blackdeathdrow, @ckuethe, @fl0rianr, @github-actions, @ianbmacdonald, @jeremyfowers, @jtlayton, @kenvandine, @lucifer-vali, @matthewjhunter, @ramkrishna2910, @sagebind, @superm1 for your awesome contributions to this release!
- ci: support tagging releases from a release branch by @jeremyfowers in #2272
- ci: add repo-manager workflow by @jeremyfowers in #2276
- fix(whisper): drop gfx103X from rocm whisper supported archs by @ramkrishna2910 in #2274
- test(gguf): add unit tests for MTP / capability label detection (#2176) by @ramkrishna2910 in #2281
- fix(rocm): accept wildcard GPU arch families in TheRock install gate (#2093 follow-up) by @ramkrishna2910 in #2280
- fix(test): require Whisper model load before language transcription test by @fl0rianr in #2278
- docs(cli): correct launch note - LEMONADE_* recipe env vars no longer honored by @ramkrishna2910 in #2275
- ci(repo-manager): automatic repo-manager fully operational by @jeremyfowers in #2285
- docs(release): update release process for repo-manager automation by @jeremyfowers in #2288
- fix(ci): restart MacOS server before whisper metal tests by @fl0rianr in #2303
- auto update and validate sd-cpp by @fl0rianr in #2139
- fix(macOS): updates whisper from v1.8.4 to v1.8.5 for metal by @fl0rianr in #2309
- Fix Problem with .devcontainer. It did not copy .devcontainer/reinstall-cmake.sh when building container by @GabrielReusRodriguez in #2273
- Add support for additional draft checkpoint by @bitgamma in #2317
- docs: fix Debian 13 installation docs for issue #2299 by @superm1 in #2323
- docs: add documentation style guide for community and AI-assisted contributions by @kenvandine in #2054
- Fix copy-to-clipboard buttons silently failing in web-app over HTTP by @blackdeathdrow in #2260
- Fix(cli): Lemonade backends now shows only supported backends. Added --all option to show all backends. by @GabrielReusRodriguez in #2254
- fix(gpu): support wildcards in GPU detection logic by @jtlayton in #2295
- Fix ROCm whisper-server startup: add TheRock lib dir to LD_LIBRARY_PATH by @matthewjhunter in #2293
- Capture output from `lemonade bench` by @ckuethe in #2214
- fix(windows): use ProcessManager::run_command for 7z extraction instead of system() Fixes #2313 by @Phqen1x in #2322
- fix(server): fall back to installed llama.cpp binary when "latest" release lookup fails by @ianbmacdonald in #2279
- fix(moonshine): return 400 on invalid audio by @abn in #2326
- fix(backends): stage and verify backend install before removing the working binary by @ianbmacdonald in #2315
- docs: system-stats API by @jeremyfowers in #2284
- fix: return 400 instead of 500 when request body is empty or not valid JSON by @Kushal1213 in #2232
- devcontainers - Create the python env and install reqs to allow python test execution by @GabrielReusRodriguez in #2336
- systemd: add user-service symlink at /usr/lib/systemd/user/ by @lucifer-vali in #2173
- fix: resolve shared-repo GGUF variants orphaned by refs/main advance by @ianbmacdonald in #2311
- Fix: eviction_engine does not run nvidia-smi on AMD configs anymore and some testing problems fixed. by @GabrielReusRodriguez in #2331
- Update stable-diffusion.cpp to master-709-92a3b73 by @github-actions[bot] in #2335
- Support `image[]` parameter for `/v1/images/edits` by @sagebind in #2321
- Update llama.cpp to b9747 by @github-actions[bot] in #2333
- Better auto context size estimate by @bitgamma in #2337
- Version bump to 10.8.1 to prepare for the release by @kenvandine in #2374
New Contributors
- @jtlayton made their first contribution in #2295
- @matthewjhunter made their first contribution in #2293
- @sagebind made their first contribution in #2321
Full Changelog: v10.8.0...v10.8.1
Windows installers are signed. Free code signing provided by SignPath.io, certificate by SignPath Foundation. See our Code Signing Policy.