v2.7.0

cebtenzzre released this 08 Feb 18:44

· 285 commits to main since this release

What's Changed

Add 12 new model architectures for CPU and Metal inference (#1914)
These are Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, and StableLM.
We don't have official downloads for these yet, but TheBloke offers plenty of compatible GGUF quantizations.
Restore minimum window size of 720x480 (1b524c4)
Use ChatML for Mistral OpenOrca to make its output formatting more reliable (#1935)

Bug Fixes

Fix VRAM not being freed when CPU fallback occurs - this makes switching models more reliable (#1901)
Disable offloading of Mixtral to GPU because we crash otherwise (#1931)
Limit extensions scanned by LocalDocs to txt, pdf, md, rst - other formats were inserting useless binary data (#1912)
Fix missing scrollbar for chat history (490404d)
Accessibility improvements (4258bb1)

New Contributors

@boshk0 made their first contribution in #1924

Full Changelog: v2.6.2...v2.7.0

Contributors

boshk0

Assets 5