llama: update llama.cpp to latest version (#244)
* llama: update llama.cpp to latest version

This commit updates llama.cpp to the latest version.

The motivation for this is that the pinned version of llama.cpp is
outdated: both the llama.cpp API and the model file format have changed
since it was last updated. In particular, the current version cannot
load models in the new GGUF format, and since many published models are
now only available as GGUF, this makes the crate difficult to use at
the moment.

The following changes have been made:
* update llama.cpp to latest version using
  git submodule update --remote --merge llama.cpp

* Manually copied the generated bindings.rs file from the target
  directory to the src directory. Hopefully this was the correct thing
  to do.

* Updated the llm-chain-llama crate to use llama_decode instead of
  llama_eval, which has now been deprecated.
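
For reference, the newer llama.cpp API builds a `llama_batch` and passes
it to `llama_decode` instead of calling `llama_eval` directly. The
following is only an illustrative sketch of the batch-building logic:
the `Batch` struct and `build_batch` helper are plain-Rust stand-ins
invented here, not the real FFI bindings.

```rust
// Stand-in for llama.cpp's llama_batch; field names mirror the C struct
// but this is NOT the real binding, just an illustration of the shape.
#[derive(Default)]
struct Batch {
    tokens: Vec<i32>,  // token ids to decode
    pos: Vec<i32>,     // position of each token in the sequence
    logits: Vec<bool>, // whether logits are requested for each token
}

// Populate a batch for a prompt: every token gets its position, and only
// the final token requests logits, since that is all sampling needs.
fn build_batch(prompt_tokens: &[i32]) -> Batch {
    let mut b = Batch::default();
    for (i, &tok) in prompt_tokens.iter().enumerate() {
        b.tokens.push(tok);
        b.pos.push(i as i32);
        b.logits.push(i == prompt_tokens.len() - 1);
    }
    b
}

fn main() {
    let batch = build_batch(&[1, 15043, 2787]);
    assert_eq!(batch.pos, vec![0, 1, 2]);
    assert_eq!(batch.logits, vec![false, false, true]);
    println!("batch of {} tokens ready for decoding", batch.tokens.len());
}
```

In the real crate the populated batch is handed to `llama_decode` on an
initialized context; the struct above only shows how the per-token data
lines up.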

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* build: increase timeout to 30 mins for ci jobs

This is an attempt to stop builds from failing with:

```
Error: The operation was canceled.
```

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* build: add fail-fast: false to build strategy

This is an attempt to prevent the currently failing Windows build from
causing the other matrix jobs to be cancelled (at least that is what
appears to be happening).

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* hnsw: update to hnsw_rs 0.2

This commit updates the hnsw_rs dependency to version 0.2 in an attempt
to fix the Windows build, where hnsw_rs version 0.1.19 currently fails
to compile:

```console
error[E0308]: mismatched types
   --> C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hnsw_rs-0.1.19\src\libext.rs:439:39
    |
439 |     let c_dist = DistCFFI::<f32>::new(c_func);
    |                  -------------------- ^^^^^^ expected `u32`, found `u64`
    |                  |
    |                  arguments to this function are incorrect
    |
    = note: expected fn pointer `extern "C" fn(_, _, u32) -> _`
               found fn pointer `extern "C" fn(_, _, u64) -> _`
note: associated function defined here
   --> C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\hnsw_rs-0.1.19\src\dist.rs:990:12
    |
990 |     pub fn new(f:DistCFnPtr<T>) -> Self {
    |            ^^^ ---------------
```

I was able to reproduce this issue locally by cross-compiling, which
produces the above error. Cross-compiling with version 0.2 works, so
I've attempted to upgrade to that version.

This is very much a suggestion, as I'm not familiar with the hnsw code,
but perhaps it will be useful to someone else and save some time
investigating the issue.
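
One plausible source of this kind of per-platform signature drift (an
assumption on my part, not confirmed from the hnsw_rs source) is
`c_ulong`, which is 64-bit on 64-bit Unix but 32-bit on Windows (LLP64),
so a callback type written directly in terms of `u64` only matches the
alias on the former. A tiny std-only sketch of the width difference:

```rust
use std::mem::size_of;
use std::os::raw::c_ulong;

fn main() {
    // On 64-bit Linux/macOS c_ulong is 8 bytes; on Windows it is 4,
    // which would explain an "expected u32, found u64" mismatch when a
    // fn-pointer signature hard-codes u64 instead of the C alias.
    println!("c_ulong is {} bytes on this target", size_of::<c_ulong>());
    assert!(size_of::<c_ulong>() == 4 || size_of::<c_ulong>() == 8);
}
```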

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* build: write bindings to src directory

This commit changes the build.rs script to write the generated
bindings to the src directory to avoid manual copying of the
bindings.rs file.
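
The change amounts to pointing the bindings output path at the source
tree instead of the build output directory. A minimal runnable sketch of
that idea (the `generated` string and `write_bindings` helper are
placeholders invented here; the real build.rs hands a path to bindgen's
write-to-file step):

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Write the generated bindings into a source directory so they are
// checked in alongside the crate, instead of landing under OUT_DIR and
// needing a manual copy.
fn write_bindings(src_dir: &Path, generated: &str) -> io::Result<PathBuf> {
    let path = src_dir.join("bindings.rs");
    fs::write(&path, generated)?;
    Ok(path)
}

fn main() -> io::Result<()> {
    // Use a temp dir here purely so the sketch is runnable anywhere.
    let path = write_bindings(&std::env::temp_dir(), "// placeholder bindings\n")?;
    println!("bindings written to {}", path.display());
    Ok(())
}
```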

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* doc: add instructions for updating llama.cpp
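
Judging from the other commits in this series, the documented flow is
presumably along these lines (a sketch only; the exact paths to the
checked-in bindings are an assumption):

```console
$ git submodule update --remote --merge llama.cpp
$ cargo build                          # build.rs regenerates src/bindings.rs
$ git add llama.cpp src/bindings.rs    # bindings path relative to the sys crate (assumed)
$ git commit
```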

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* src: update llama.cpp submodule

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! llama: update llama.cpp to latest version

This commit reverts the change to the StopSequence option in
llm-chain-llama/src/options.rs.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! llama: update llama.cpp to latest version

This commit removes the `From<llama_batch> for LlamaBatch` impl as it
is no longer needed.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! llama: update llama.cpp to latest version

This commit creates a new LlamaBatch for each newly sampled token
instead of reusing the same batch.
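
The shape of that change can be sketched as follows; the `TokenBatch`
struct and helper are illustrative stand-ins for the real FFI types, not
the crate's actual API.

```rust
// Plain-Rust stand-in for a single-token llama.cpp batch.
struct TokenBatch {
    token: i32,
    pos: i32,          // absolute position in the sequence so far
    wants_logits: bool // logits are always needed to sample the next token
}

// Build a fresh one-token batch per decode step rather than mutating
// and reusing a single batch across calls.
fn batch_for_sampled_token(token: i32, n_past: i32) -> TokenBatch {
    TokenBatch { token, pos: n_past, wants_logits: true }
}

fn main() {
    let mut n_past = 3; // e.g. three prompt tokens already decoded
    for &tok in &[99, 100] {
        let b = batch_for_sampled_token(tok, n_past);
        assert!(b.wants_logits);
        println!("decode token {} at pos {}", b.token, b.pos);
        n_past += 1;
    }
}
```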

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

* squash! llama: update llama.cpp to latest version

This commit extracts the logic for checking whether the prompt is a
question into a separate conditional check. I've also tried to clarify
the comment for this check so it is hopefully easier to understand.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>

---------

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
danbev authored Dec 15, 2023
1 parent 323a72a commit cad5646
Showing 23 changed files with 4,080 additions and 863 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/cicd.yaml
@@ -36,10 +36,12 @@ jobs:
 
   build_and_test:
     strategy:
+      fail-fast: false
       matrix:
         os: [ubuntu-latest, macos-13, windows-latest]
         rust-version: [stable]
     runs-on: ${{ matrix.os }}
+    timeout-minutes: 30
     steps:
       - name: Checkout repository
         uses: actions/checkout@v3
164 changes: 149 additions & 15 deletions Cargo.lock

2 changes: 1 addition & 1 deletion crates/llm-chain-hnsw/Cargo.toml
@@ -14,7 +14,7 @@ repository = "https://github.com/sobelio/llm-chain/"
 
 [dependencies]
 async-trait.workspace = true
-hnsw_rs = "0.1.19"
+hnsw_rs = "0.2"
 llm-chain = { path = "../llm-chain", version = "0.13.0", default-features = false }
 serde.workspace = true
 serde_json.workspace = true
17 changes: 10 additions & 7 deletions crates/llm-chain-hnsw/examples/dump_load.rs
@@ -1,3 +1,5 @@
+use hnsw_rs::{hnswio::*, prelude::*};
+use std::path::PathBuf;
 use std::sync::Arc;
 
 use llm_chain::{
@@ -16,7 +18,7 @@ async fn main() {
     let hnsw_index_fn = "hnsw_index".to_string();
     let mut embeddings = llm_chain_openai::embeddings::Embeddings::default();
     let document_store = Arc::new(Mutex::new(InMemoryDocumentStore::<EmptyMetadata>::new()));
-    let mut hnsw_vs = HnswVectorStore::new(
+    let hnsw_vs = HnswVectorStore::new(
         HnswArgs::default(),
         Arc::new(embeddings),
         document_store.clone(),
@@ -56,12 +58,13 @@ async fn main() {
     // Load
     println!("Loading hnsw index from file");
     embeddings = llm_chain_openai::embeddings::Embeddings::default();
-    hnsw_vs = HnswVectorStore::load_from_file(
-        hnsw_index_fn,
-        Arc::new(embeddings),
-        document_store.clone(),
-    )
-    .unwrap();
+
+    let mut hnswio = HnswIo::new(PathBuf::from("."), hnsw_index_fn);
+    let hnsw_loaded = hnswio.load_hnsw::<f32, DistCosine>().unwrap();
+    let hnsw_vs =
+        HnswVectorStore::load_from_file(hnsw_loaded, Arc::new(embeddings), document_store.clone())
+            .unwrap();
 
     println!("Loaded!");
 
     let response = hnsw_vs