
feat: bump llama.cpp, add gguf support #943

Merged
merged 7 commits into master from gguf on Aug 23, 2023

Conversation


@mudler (Owner) commented Aug 22, 2023

Description

This PR syncs up the llama backend to use gguf (go-skynet/go-llama.cpp#180). It also adds llama-stable to the targets so we can still load ggml models, adapts the current tests to use the llama backend for ggml, and uses a gguf model to run tests on the new backend.

To consume the new version of go-llama.cpp, it also bumps Go to 1.21 (images, pipelines, etc.).
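With two loaders in play (the gguf-capable llama backend and llama-stable for legacy ggml), a caller has to decide which one a given model file needs. Here is a minimal sketch of how that choice could be driven by sniffing the file header; `detectBackend` and the backend-name mapping are illustrative, not LocalAI's actual dispatch code. The only format fact assumed is that GGUF files begin with the 4-byte ASCII magic `GGUF`; everything else falls back to llama-stable in this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// detectBackend picks a backend name from a model file's leading bytes.
// GGUF files start with the ASCII magic "GGUF"; anything else is treated
// here as a legacy ggml model (a simplification for this sketch).
func detectBackend(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	magic := make([]byte, 4)
	if _, err := io.ReadFull(f, magic); err != nil {
		return "", err
	}
	if bytes.Equal(magic, []byte("GGUF")) {
		return "llama", nil // new gguf-capable backend
	}
	return "llama-stable", nil // legacy ggml loader
}

func main() {
	// Write a tiny file that starts with the GGUF magic to demo detection.
	tmp, err := os.CreateTemp("", "model-*.gguf")
	if err != nil {
		panic(err)
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write([]byte("GGUF")); err != nil {
		panic(err)
	}
	tmp.Close()

	backend, err := detectBackend(tmp.Name())
	if err != nil {
		panic(err)
	}
	fmt.Println(backend)
}
```

A real dispatcher would likely also validate the GGUF version field that follows the magic, but the 4-byte check is enough to route between the two targets this PR introduces.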

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
@mudler merged commit 1120847 into master on Aug 23, 2023
14 checks passed
@mudler deleted the gguf branch on August 23, 2023 at 23:18
@mudler (Owner, Author) commented Aug 23, 2023

let's see how it goes!
