feat: migrate python backends from conda to uv #2215

cryptk · 2024-05-02T00:29:39Z

Description

Doing some work to try to speed up builds by migrating away from huge conda environments to a smaller venv setup, and by using uv instead of pip for faster installs.

This is a draft until I get the backends that use common-env/transformers migrated. That is when we will know if it's actually worth it to make the switch.

Notes for Reviewers

There is also a new feature for the Dockerfile that I added to allow me to build with only one/some of the "extras" backends to speed up testing cycle time. By default, if EXTRA_BACKENDS is not set to anything, it will build all of the backends just as before.

The logic behind the new feature is that if you set IMAGE_TYPE=extras on its own, you get all of the backends. If you set IMAGE_TYPE=extras and you set EXTRA_BACKENDS to a string, such as EXTRA_BACKENDS="diffusers,bark", then it will build with only diffusers and bark.

If you do not set IMAGE_TYPE=extras, then you will get no extra backends, no matter what EXTRA_BACKENDS is set to.
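The selection rules above can be sketched as a small shell function. This is only an illustration of the described behavior, not the code in the Dockerfile: `select_backends` and `ALL_BACKENDS` are hypothetical names, and the backend list is an illustrative subset.

```bash
# Hypothetical subset of the extras backends, for illustration only.
ALL_BACKENDS="diffusers bark vllm"

# select_backends IMAGE_TYPE EXTRA_BACKENDS
# Prints the space-separated backends that would be built.
select_backends() {
    image_type="$1"
    extra_backends="$2"

    if [ "$image_type" != "extras" ]; then
        # Without IMAGE_TYPE=extras no extra backend is built,
        # whatever EXTRA_BACKENDS is set to.
        return 0
    fi

    if [ -z "$extra_backends" ]; then
        # EXTRA_BACKENDS unset or empty: build everything, as before.
        echo "$ALL_BACKENDS"
    else
        # Comma-separated list: build only the named backends.
        echo "$extra_backends" | tr ',' ' '
    fi
}

select_backends extras "diffusers,bark"   # prints "diffusers bark"
```

Assuming the two variables are passed as Docker build arguments, a single-backend test build would set `IMAGE_TYPE=extras` and `EXTRA_BACKENDS=diffusers` and skip every other backend.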

Signed commits

Yes, I signed my commits.

cryptk · 2024-05-04T05:50:55Z

Some benchmarks from before and after this change:

Times and sizes are compared between my desktop and the Docker registry in my lab, with a 2.5GbE network link between them.
I was frequently I/O bound rather than network bound.
Push and pull times are for the entire process, including compression/decompression.

v2.14.0-cublas-cuda12-ffmpeg

46.2GB uncompressed size
22.57GB compressed
9.68GB max layer size pushed
5.93GB max layer size pulled
7m22s to push
10m47s to pull

New image (same config as the image above):

35.67GB uncompressed
16.41GB compressed
8.9GB max layer size pushed
4.71GB max layer size pulled
7m14s to push
10m16s to pull

So while the images are smaller, it does not save much time on the push/pull. What it does do is HEAVILY reduce the amount of time needed to build the images. For example, the hipblas-extras image that builds as part of the PR tests took ~55 minutes to 1 hour before this change, and takes about 30 minutes with it.
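To put the figures above in relative terms, the reductions can be computed with a short helper. `pct_saved` is a hypothetical name, and the inputs are the numbers quoted in this comment (times converted to seconds).

```bash
# pct_saved BEFORE AFTER
# Prints the relative reduction from BEFORE to AFTER as a percentage.
pct_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_saved 46.2 35.67    # uncompressed size, GB
pct_saved 22.57 16.41   # compressed size, GB
pct_saved 442 434       # push time, seconds (7m22s vs 7m14s)
pct_saved 647 616       # pull time, seconds (10m47s vs 10m16s)
```

The size reductions come out well above the push/pull time reductions, which is consistent with the note that the transfers were often I/O bound rather than network bound.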

golgeek · 2024-05-06T16:28:07Z

backend/python/vllm/requirements.txt

+grpcio==1.63.0
+protobuf
+certifi
+transformers==4.38.2


Looks like you're not done as this is still a draft, but as an FYI vllm has transformers>=4.40.0 in its requirements.txt.

cryptk · 2024-05-06

Good catch. I'll remove the version constraints from our requirements.txt files so that builds use the newest version that satisfies the requirements of the other dependencies.

golgeek · 2024-05-06T16:29:49Z

backend/python/vllm/install.sh

+uv pip install --requirement ${MY_DIR}/requirements.txt
+
+if [ -f "requirements-${BUILD_TYPE}.txt" ]; then
+    uv pip install --requirement ${MY_DIR}/requirements-${BUILD_TYPE}.txt


This would need a cuda requirements file to install flash-attn: it's not a direct dependency of vLLM, but vLLM benefits from it when it's installed.

cryptk · 2024-05-06

I'll get that added in.
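The install pattern discussed in this thread can be sketched as follows. This is not the PR's `install.sh`: `install_requirements` and the `INSTALLER` override are hypothetical, added so the logic can be dry-run without uv. One difference from the quoted hunk is deliberate: the hunk tests `requirements-${BUILD_TYPE}.txt` relative to the working directory but installs from `${MY_DIR}`, while the sketch checks the same path it installs.

```bash
# install_requirements DIR BUILD_TYPE
# Installs DIR/requirements.txt, then DIR/requirements-BUILD_TYPE.txt
# if that file exists.
install_requirements() {
    dir="$1"
    build_type="$2"
    # INSTALLER is a hypothetical override; the default is the uv
    # command used in the quoted hunk.
    installer="${INSTALLER:-uv pip install}"

    $installer --requirement "${dir}/requirements.txt"

    # Build-specific extras (for example a GPU-only package) live in
    # their own file and are skipped when the file is absent.
    if [ -n "$build_type" ] && [ -f "${dir}/requirements-${build_type}.txt" ]; then
        $installer --requirement "${dir}/requirements-${build_type}.txt"
    fi
}
```

Under this layout, flash-attn would go into the build-specific file for CUDA builds (the exact filename depends on the `BUILD_TYPE` value and is an assumption here), so other builds never try to install it.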

- replace conda with UV for diffusers install (prototype for all extras backends)
- add ability to build docker with one/some/all extras backends instead of all or nothing

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

golgeek

Built and ran a few tests on different backends and it looks great!

Thanks a lot for working on this, it's indeed significantly faster to build, and it will simplify maintaining backend updates by a lot!

lunamidori5 · 2024-05-09T19:59:52Z

Just making sure, this is hitting master soon? @cryptk / @golgeek (I have downstream things that will need to be changed if so)

cryptk · 2024-05-09T21:19:30Z

> Just making sure, this is hitting master soon? @cryptk / @golgeek (I have downstream things that will need to be changed if so)

I think mudler was talking about bringing it in shortly after the next release, and 2.15 is building now. Your downstream things shouldn't break when this is merged, though; you will just be able to do them a bit faster.

lunamidori5 · 2024-05-10T00:28:27Z

> > Just making sure, this is hitting master soon? @cryptk / @golgeek (I have downstream things that will need to be changed if so)
>
> I think that Mudler was talking about bringing it in shortly after the next release, and 2.15 is building now, but your downstream things shouldn't break when this is merged, you will just be able to do those downstream things a bit faster

Noted, thank you sir!

mudler · 2024-05-10T13:07:51Z

@cryptk great work as always! I'd say it's safe to merge now as 2.15.0 is out, time to test this on master.

fakezeta · 2024-05-11T15:09:12Z

Hi @cryptk, this PR broke OpenVINO support since optimum is not installed.
Opening an issue to track it. I cannot work on this right now since I'm not at home; I hope to work on it tomorrow.

…6.0 by renovate (#22420)

cryptk force-pushed the feat_migrate_to_uv branch 10 times, most recently from 34651f0 to 30f7575 May 2, 2024 13:52

cryptk changed the title ~~feat: migrate diffusers backend from conda to uv~~ feat: migrate python backends from conda to uv May 2, 2024

cryptk force-pushed the feat_migrate_to_uv branch from 30f7575 to 3989dea May 3, 2024 15:54

dave-gray101 mentioned this pull request May 3, 2024

test: e2e /reranker endpoint #2211

Merged

cryptk force-pushed the feat_migrate_to_uv branch 5 times, most recently from dac03e0 to 9a5f5f2 May 4, 2024 04:46

cryptk force-pushed the feat_migrate_to_uv branch from 9a5f5f2 to 8931ca3 May 6, 2024 15:07

golgeek reviewed May 6, 2024

cryptk added the enhancement and area/backends labels May 7, 2024

lunamidori5 added the dependencies, area/container, and python labels and removed the dependencies label May 7, 2024

cryptk added 10 commits May 7, 2024 20:18

- 29da1e9 fix: adjust file perms on all install/run/test scripts
- 8f55b67 fix: add missing acclerate dependencies
- 4df197a fix: add some more missing dependencies to python backends
- 541cf7d fix: parler tests venv py dir fix
- 2e15003 fix: correct filename for transformers-musicgen tests
- 59dec93 fix: adjust the pwd for valle tests
- 8b566c4 feat: cleanup and optimization work for uv migration
- cb2559f fix: add setuptools to requirements-install for mamba
- 72f1dcd feat: more size optimization work
- 66230ce feat: make installs and tests more consistent, cleanup some deps

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

cryptk force-pushed the feat_migrate_to_uv branch from 8931ca3 to 66230ce May 8, 2024 01:18

- 86ccfae fix: cleanup

cryptk force-pushed the feat_migrate_to_uv branch from a24a9de to 86ccfae May 8, 2024 02:56

cryptk marked this pull request as ready for review May 8, 2024 03:16

- 7407254 fix: mamba backend is cublas only

golgeek reviewed May 9, 2024

golgeek previously approved these changes May 9, 2024

- 1979064 fix: uncomment lines in makefile

cryptk dismissed golgeek’s stale review via 1979064 May 9, 2024 15:09

golgeek approved these changes May 9, 2024

mudler approved these changes May 10, 2024

mudler merged commit 28a421c into mudler:master May 10, 2024
32 checks passed

cryptk deleted the feat_migrate_to_uv branch May 10, 2024 15:42

fakezeta mentioned this pull request May 11, 2024

OpenVINO libraries not installed in docker image after #2215 #2289

Closed
