
feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants #2232

Merged
merged 11 commits into from
May 4, 2024

Conversation


@mudler mudler commented May 3, 2024

We can simply try to autoload the backends extracted into the asset dir, instead of hardcoding them in the binary. Besides simplifying maintenance (we no longer need to keep the index up to date), it also simplifies building: we just drop the backends into the folder that gets embedded in the binary at build time.

This will ALSO allow building variants of the same backend (e.g. with different instruction sets) without having to specify a hardcoded name, making it easy to ship a single binary for all the variants.

In this PR I've also added two build variants for llama.cpp: noavx, which disables only AVX, and fallback, which disables additional instruction sets.

Partly related to #1888

Potentially fixes many other issues whose root cause is an old CPU lacking support for the CPU instructions enabled at compile time.

At the very least, the bugs that should be closed because of insufficient CPU flagsets (using the Fixes prefix to close them on merge of this PR):


netlify bot commented May 3, 2024

Deploy Preview for localai canceled.

Name Link
🔨 Latest commit f9f516e
🔍 Latest deploy log https://app.netlify.com/sites/localai/deploys/663646d13c0aef000866cf49

@mudler mudler added the enhancement New feature or request label May 3, 2024
mudler added 2 commits May 4, 2024 00:48
We can simply try to autoload the backends extracted in the asset dir.
This will allow building variants of the same backend (e.g. with different instruction sets),
so we can have a single binary for all the variants.

Signed-off-by: mudler <mudler@localai.io>
Make the builds idempotent so that we can re-build

Signed-off-by: mudler <mudler@localai.io>

mudler commented May 3, 2024

note to self: we might want to add ccache to the images to speed up compilation times for the different llama.cpp variants


mudler commented May 3, 2024

This finally makes it possible to have multiple versions alongside each other, so we can fix issues like #2220 #973 #288 #1447 #1968

1:09AM DBG Loading from the following backends (in order): [llama-cpp llama-cpp-noavx stablediffusion]             

mudler added 2 commits May 4, 2024 10:06
Signed-off-by: mudler <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler added 3 commits May 4, 2024 10:37
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler added 3 commits May 4, 2024 14:44
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler changed the title feat(initializer): do not specify backends to autoload feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants May 4, 2024
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler mudler merged commit 530bec9 into master May 4, 2024
53 checks passed
@mudler mudler deleted the backends_from_assetdir branch May 4, 2024 15:56

mudler commented May 5, 2024

This also fixed #1916

truecharts-admin added a commit to truecharts/charts that referenced this pull request May 10, 2024
…5.0@f178386 by renovate (#21846)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | minor | `v2.14.0` -> `v2.15.0` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>mudler/LocalAI (docker.io/localai/localai)</summary>

### [`v2.15.0`](https://togithub.com/mudler/LocalAI/releases/tag/v2.15.0)

[Compare Source](https://togithub.com/mudler/LocalAI/compare/v2.14.0...v2.15.0)


![local-ai-release](https://togithub.com/mudler/LocalAI/assets/2420543/8d3738d8-7973-4c2d-9116-9d48b08ad61f)

### 🎉  LocalAI v2.15.0! 🚀

Hey awesome people! I'm happy to announce the release of LocalAI version
2.15.0! This update introduces several significant improvements and
features, enhancing usability, functionality, and user experience across
the board. Dive into the key highlights below, and don't forget to check
out the full changelog for more detailed updates.

##### 🌍 WebUI Upgrades: Turbocharged!

##### 🚀 Vision API Integration

The Chat WebUI now seamlessly integrates with the Vision API, making it easier for users to test image-processing models directly through the browser interface - this is a very simple and hackable interface, in less than 400 lines of code with Alpine.js and HTMX!


![output](https://togithub.com/mudler/LocalAI/assets/2420543/36d357ca-861d-46a9-899d-71f62fe4f977)

##### 💬 System Prompts in Chat

System prompts can now be set in the WebUI chat, guiding the user through interactions more intuitively and making our chat interface smarter and more responsive.


![output](https://togithub.com/mudler/LocalAI/assets/2420543/555a4ad2-18d4-41e5-91a5-436d5001b9f1)

##### 🌟 Revamped Welcome Page

New to LocalAI or haven't installed any models yet? No worries! The
updated welcome page now guides users through the model installation
process, ensuring you're set up and ready to go without any hassle. This
is a great first step for newcomers - thanks for your precious feedback!


![output](https://togithub.com/mudler/LocalAI/assets/2420543/77e286fc-e045-4650-8b70-0d482ca43f0f)

##### 🔄 Background Operations Indicator

Don't get lost with our new background operations indicator on the
WebUI, which shows when tasks are running in the background.


![output](https://togithub.com/mudler/LocalAI/assets/2420543/5be17b69-7c3b-48cb-b85f-74684b292818)

##### 🔍 Filter Models by Tag and Category

As our model gallery balloons, you can now effortlessly sift through
models by tag and category, making finding what you need a breeze.


![output](https://togithub.com/mudler/LocalAI/assets/2420543/d180e9e4-0d38-42a5-84cc-98778dc7a0ad)

##### 🔧 Single Binary Release

LocalAI is expanding into offering single binary releases, simplifying
the deployment process and making it easier to get LocalAI up and
running on any system.

For the moment we have bundled builds with the AVX and SSE instruction sets disabled. We are also planning to include CUDA builds.

##### 🧠 Expanded Model Gallery

This release introduces several exciting new models to our gallery, such
as 'Soliloquy', 'tess', 'moondream2', 'llama3-instruct-coder' and
'aurora', enhancing the diversity and capability of our AI offerings.
Our selection of one-click-install models is growing! We carefully pick models from the most trending ones on Hugging Face; feel free to submit your requests in a GitHub issue, hop into our Discord, or contribute by hosting your own gallery, or even by adding models directly to LocalAI!


![local-ai-gallery](https://togithub.com/mudler/LocalAI/assets/2420543/c664f67f-dde3-404c-b516-d21db1131fcd)

![local-ai-gallery-new](https://togithub.com/mudler/LocalAI/assets/2420543/e0f16a10-a353-411d-9f62-2524df95bf86)

Want to share your model configurations and customizations? See the
docs: https://localai.io/docs/getting-started/customize-model/

#### 📣 Let's Make Some Noise!

A gigantic THANK YOU to everyone who’s contributed—your feedback, bug
squashing, and feature suggestions are what make LocalAI shine. To all
our heroes out there supporting other users and sharing their expertise,
you’re the real MVPs!

Remember, LocalAI thrives on community support—not big corporate bucks.
If you love what we're building, show some love! A shoutout on social
(@&#8203;LocalAI_OSS and @&#8203;mudler_it on twitter/X), joining our
sponsors, or simply starring us on GitHub makes all the difference.

Also, if you haven't yet joined our Discord, come on over! Here's the
link: https://discord.gg/uJAeKSAGDy

Thanks a ton, and.. enjoy this release!

***

#### What's Changed

##### Bug fixes 🐛

- fix(webui): correct documentation URL for text2img by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2233
- fix(ux): fix small glitches by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2265

##### Exciting New Features 🎉

- feat: update ROCM and use smaller image by [@&#8203;cryptk](https://togithub.com/cryptk) in mudler/LocalAI#2196
- feat(llama.cpp): do not specify backends to autoload and add llama.cpp variants by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2232
- fix(webui): display small navbar with smaller screens by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2240
- feat(startup): show CPU/GPU information with --debug by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2241
- feat(single-build): generate single binaries for releases by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2246
- feat(webui): ux improvements by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2247
- fix: OpenVINO winograd always disabled by [@&#8203;fakezeta](https://togithub.com/fakezeta) in mudler/LocalAI#2252
- UI: flag `trust_remote_code` to users // favicon support by [@&#8203;dave-gray101](https://togithub.com/dave-gray101) in mudler/LocalAI#2253
- feat(ui): prompt for chat, support vision, enhancements by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2259

##### 🧠 Models

- fix(gallery): hermes-2-pro-llama3 models checksum changed by [@&#8203;Nold360](https://togithub.com/Nold360) in mudler/LocalAI#2236
- models(gallery): add moondream2 by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2237
- models(gallery): add llama3-llava by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2238
- models(gallery): add llama3-instruct-coder by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2242
- models(gallery): update poppy porpoise by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2243
- models(gallery): add lumimaid by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2244
- models(gallery): add openbiollm by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2245
- gallery: Added some OpenVINO models by [@&#8203;fakezeta](https://togithub.com/fakezeta) in mudler/LocalAI#2249
- models(gallery): Add Soliloquy by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2260
- models(gallery): add tess by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2266
- models(gallery): add lumimaid variant by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2267
- models(gallery): add kunocchini by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2268
- models(gallery): add aurora by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2270
- models(gallery): add tiamat by [@&#8203;mudler](https://togithub.com/mudler) in mudler/LocalAI#2269

##### 📖 Documentation and examples

- docs: updated Transformer parameters description by [@&#8203;fakezeta](https://togithub.com/fakezeta) in mudler/LocalAI#2234
- Update readme: add ShellOracle to community integrations by [@&#8203;djcopley](https://togithub.com/djcopley) in mudler/LocalAI#2254
- Add missing Homebrew dependencies by [@&#8203;michaelmior](https://togithub.com/michaelmior) in mudler/LocalAI#2256

##### 👒 Dependencies

- ⬆️ Update docs version mudler/LocalAI by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2228
- ⬆️ Update ggerganov/llama.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2229
- ⬆️ Update ggerganov/whisper.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2230
- build(deps): bump tqdm from 4.65.0 to 4.66.3 in /examples/langchain/langchainpy-localai-example in the pip group across 1 directory by [@&#8203;dependabot](https://togithub.com/dependabot) in mudler/LocalAI#2231
- ⬆️ Update ggerganov/llama.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2239
- ⬆️ Update ggerganov/llama.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2251
- ⬆️ Update ggerganov/llama.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2255
- ⬆️ Update ggerganov/llama.cpp by [@&#8203;localai-bot](https://togithub.com/localai-bot) in mudler/LocalAI#2263

##### Other Changes

- test: check the response URL during image gen in `app_test.go` by [@&#8203;dave-gray101](https://togithub.com/dave-gray101) in mudler/LocalAI#2248

#### New Contributors

- [@&#8203;Nold360](https://togithub.com/Nold360) made their first contribution in mudler/LocalAI#2236
- [@&#8203;djcopley](https://togithub.com/djcopley) made their first contribution in mudler/LocalAI#2254
- [@&#8203;michaelmior](https://togithub.com/michaelmior) made their first contribution in mudler/LocalAI#2256

**Full Changelog**: mudler/LocalAI@v2.14.0...v2.15.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Renovate
Bot](https://togithub.com/renovatebot/renovate).
