This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

llama-cli: Could not load model: InvalidMagic { path: ... } #59

Closed
mguinhos opened this issue Mar 22, 2023 · 11 comments

Comments

@mguinhos

mguinhos commented Mar 22, 2023

The model runs successfully on llama.cpp but not in llama-rs.

Command:

cargo run --release -- -m C:\Users\Usuário\Downloads\LLaMA\7B\ggml-model-q4_0.bin -p "Tell me how cool the Rust programming language is:"
PS C:\Users\Usuário\Desktop\llama-rs> cargo run --release -- -m C:\Users\Usuário\Downloads\LLaMA\7B\ggml-model-q4_0.bin -p "Tell me how cool the Rust programming language is:"
    Finished release [optimized] target(s) in 2.83s
     Running `target\release\llama-cli.exe -m C:\Users\Usuário\Downloads\LLaMA\7B\ggml-model-q4_0.bin -p "Tell me how cool the Rust programming language is:"`
thread 'main' panicked at 'Could not load model: InvalidMagic { path: "C:\\Users\\Usuário\\Downloads\\LLaMA\\7B\\ggml-model-q4_0.bin" }', llama-cli\src\main.rs:147:10
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `target\release\llama-cli.exe -m C:\Users\Usuário\Downloads\LLaMA\7B\ggml-model-q4_0.bin -p "Tell me how cool the Rust programming language is:"` (exit code: 101)
@philpax
Collaborator

philpax commented Mar 22, 2023

ggerganov/llama.cpp#252 changed the model format, and we're not compatible with it yet. Thanks for spotting this - we'll need to expedite the fix.

In the meantime, you can re-quantize the model with a version of llama.cpp that predates that, or find a quantized model floating around the internet from before then.
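Roughly, the older-tree route looks like this (a sketch only; the script names and arguments are as I recall from the llama.cpp README at the time, and the commit placeholder needs to be filled in with any commit that predates that PR):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout <commit-before-252>
make
python3 convert-pth-to-ggml.py /path/to/LLaMA/7B/ 1
./quantize /path/to/LLaMA/7B/ggml-model-f16.bin /path/to/LLaMA/7B/ggml-model-q4_0.bin 2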

@mguinhos
Author

Hmm... thanks, I will try to re-quantize the model with a previous version!

@mguinhos
Author

> ggerganov/llama.cpp#252 changed the model format, and we're not compatible with it yet. Thanks for spotting this - we'll need to expedite the fix.
>
> In the meantime, you can re-quantize the model with a version of llama.cpp that predates that, or find a quantized model floating around the internet from before then.

Got it! Tried with the previous alpaca version that I had!

@philpax
Collaborator

philpax commented Mar 22, 2023

Great! We'll leave this issue open as a reminder that we'll need to update to handle the new format.

@mguinhos
Author

Changing the code a bit is sufficient to handle the new versioned file format:

  #[error("file is pre-versioned, generate another please! at {path:?}")]
  PreVersioned { path: PathBuf },
  #[error("invalid magic number for {path:?}")]
  InvalidMagic { path: PathBuf },
  #[error("invalid version number for {path:?}")]
  InvalidVersion { path: PathBuf },

...

// Verify magic
{
    let magic = read_i32(&mut reader)?;
    if magic == 0x67676d6c {
        return Err(LoadError::PreVersioned {
            path: main_path.to_owned(),
        });
    }
    
    if magic != 0x67676d66 {
        return Err(LoadError::InvalidMagic {
            path: main_path.to_owned(),
        });
    }
}

// Verify the version
{
    let format_version = read_i32(&mut reader)?;
    if format_version != 1 {
        return Err(LoadError::InvalidVersion {
            path: main_path.to_owned(),
        });
    }
}

...

// Load vocabulary
let mut vocab = Vocabulary::default();
for i in 0..hparams.n_vocab {
    let len = read_i32(&mut reader)?;
    if let Ok(word) = read_string(&mut reader, len as usize) {
        vocab.mapping.push(word);
    } else {
        load_progress_callback(LoadProgress::BadToken {
            index: i.try_into()?,
        });
        vocab.mapping.push("�".to_string());
    }

    // The versioned format stores a score after each token string.
    let score: f32 = read_i32(&mut reader)? as f32;
    vocab.score.push(score);
}

It works without issues.
But I don't know if it's sufficient; nothing panicked, and it did the inference.

@mguinhos
Author

mguinhos commented Mar 22, 2023

I think the change in the binary format was just the addition of the version number and of a per-token score in the vocabulary, but I am not sure.
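
Concretely, here is how I read the two layouts (a sketch only; the constant names below are mine, not from llama.cpp or llama-rs):

// Old (unversioned) file:
//   magic: i32 = 0x67676d6c ("ggml")
//   hparams, vocab entries (len: i32 + word bytes), tensors...
//
// New (versioned) file:
//   magic: i32 = 0x67676d66 ("ggmf")
//   format_version: i32 = 1
//   hparams, vocab entries (len: i32 + word bytes + score: f32), tensors...

pub const MAGIC_UNVERSIONED: i32 = 0x67676d6c;
pub const MAGIC_VERSIONED: i32 = 0x67676d66;
pub const FORMAT_VERSION: i32 = 1;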

@ghost

ghost commented Mar 23, 2023

After fixing this bug, and adding score: Vec<f32> to the Vocabulary struct, the 7B model works, but 65B does not. It crashes with an allocation error in the ggml library.
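
For reference, the struct change is roughly this (a sketch, not the exact llama-rs definition):

// Sketch only; the actual Vocabulary in llama-rs may carry more fields.
#[derive(Default)]
pub struct Vocabulary {
    /// Token id -> token string.
    pub mapping: Vec<String>,
    /// Per-token score introduced by the versioned model format.
    pub score: Vec<f32>,
}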

@mguinhos
Author

Related pull request: #61

@RoyVorster
Contributor

RoyVorster commented Mar 23, 2023

@mguinhos thanks for the reference. I hadn't even seen the issue. Feel free to modify the PR. I was just running llama-rs for the first time, ran into this issue, and figured it'd be best to share the small fixes.

@RoyVorster
Contributor

Can probably close this issue now?

@philpax philpax closed this as completed Mar 24, 2023
@vv9k

vv9k commented Apr 5, 2023

I'm using the current main branch of llama-rs. I got the 7B model, used the Python script to convert it, then quantized it with the latest commit of llama.cpp, and I'm getting this error:

thread 'main' panicked at 'Could not load model: InvalidMagic { path: "LLaMA/7B/ggml-model-q4_0.bin" }', llama-cli/src/main.rs:206:6

llama.cpp works with the same model:

❯ ./main -m LLaMA/7B/ggml-model-q4_0.bin -p "test"
main: seed = 1680698608
llama_model_load: loading model from 'LLaMA/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 4017.70 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 5809.78 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from '/home/wojtek/special_downloads/LLaMA/7B/ggml-model-q4_0.bin'
llama_model_load: model size =  4017.27 MB / num tensors = 291
llama_init_from_file: kv self size  =  256.00 MB

system_info: n_threads = 12 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
sampling: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.100000
generate: n_ctx = 512, n_batch = 8, n_predict = 128, n_keep = 0


 test_suite = True

# The suite of tests to see running.
test_suite( 'TestSuite' ) [end of text]

llama_print_timings:        load time =  1165.30 ms
llama_print_timings:      sample time =    14.24 ms /    27 runs   (    0.53 ms per run)
llama_print_timings: prompt eval time =   763.14 ms /     2 tokens (  381.57 ms per token)
llama_print_timings:        eval time =  4288.87 ms /    26 runs   (  164.96 ms per run)
llama_print_timings:       total time =  5468.84 ms

EDIT:

I managed to get it to work by reconverting and requantizing the model with llama.cpp commit 5cb63e2, from before the format change. I'm using the current main of llama-rs, so it should work with the newer format, but I'm still getting the InvalidMagic error above.
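
In case it helps anyone else debug this, here is a small sketch for checking which magic a file actually carries (the two known values are the ones from the snippet earlier in this thread, read as a little-endian i32; anything else is presumably a newer llama.cpp format):

use std::fs::File;
use std::io::Read;

// Print the first four bytes of a model file as the loader would read them.
fn main() -> std::io::Result<()> {
    let path = std::env::args().nth(1).expect("usage: peek-magic <model path>");
    let mut buf = [0u8; 4];
    File::open(&path)?.read_exact(&mut buf)?;
    match u32::from_le_bytes(buf) {
        0x67676d6c => println!("{path}: old unversioned ggml format"),
        0x67676d66 => println!("{path}: versioned ggmf format (an i32 version follows)"),
        other => println!("{path}: unknown magic {other:#010x}"),
    }
    Ok(())
}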
