
Quantized Phi-3 example fails "cannot find llama.attention.head_count in metadata" #2154

MoonKraken opened this issue May 2, 2024 · 4 comments


MoonKraken commented May 2, 2024

```
cargo run --example quantized-phi --release -- --prompt "what is the best thing about rust?" --which phi-3
   Compiling candle-examples v0.5.0 (/Users/kenk/Documents/Code/OpenSource/candle/candle-examples)
    Finished release [optimized] target(s) in 3.21s
     Running `target/release/examples/quantized-phi --prompt 'what is the best thing about rust?' --which phi-3`
avx: false, neon: true, simd128: false, f16c: false
temp: 0.80 repeat-penalty: 1.10 repeat-last-n: 64
Running on CPU, to run on GPU(metal), build this example with `--features metal`
loaded 195 tensors (2.39GB) in 0.08s
Error: cannot find llama.attention.head_count in metadata
```

Hardware: M1 MacBook

This issue does not occur with phi-2 or with any other example I've tried, and it still occurs with the metal feature enabled.
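
For context, the quantized llama loader that this example re-uses reads its hyperparameters from GGUF metadata keys with a `llama.` prefix, and the error means that key is simply absent from the file, which suggests the file stores its metadata under a different prefix. A minimal sketch of inspecting those keys with candle's `gguf_file` module, assuming a local copy of the file (the filename is an assumption):

```rust
use candle_core::quantized::gguf_file;
use std::fs::File;

fn main() -> anyhow::Result<()> {
    // Assumed local path to the downloaded GGUF file.
    let mut file = File::open("Phi-3-mini-4k-instruct-q4.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    // Print the attention metadata keys so the prefix is visible.
    for (key, value) in content.metadata.iter() {
        if key.contains("attention.head_count") {
            println!("{key} = {value:?}");
        }
    }
    Ok(())
}
```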


socathie commented May 3, 2024

I believe this is not a candle issue. I downloaded the model a few days ago and it runs without error, while my colleague gets the same error that @MoonKraken mentioned.

Upon investigation, I found that my model file has a SHA256 hash of 1cd9a9df07350196623f93bf4829cf228959e07ad32f787b8fdd7f5956f5b9de while his is 8a83c7fb9049a9b2e92266fa7ad04933bb53aa1e85136b7b30f1b8000ff2edef; his hash matches the one currently on the model page.

Googling my hash, I found that it matches a file on a branch that isn't linked from the model page here, and the author seems to have continued pushing to that branch, which suggests it is the correct one.

This might be a serious issue: it suggests that the commit on the model page has somehow been "switched" to another (wrong) one without the author knowing it.
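
For anyone who wants to check which of the two files they have locally, here is a minimal sketch of computing the digest with the `sha2` crate (the filename is an assumption); compare the output against the two hashes above:

```rust
use sha2::{Digest, Sha256};
use std::{fs, io};

fn main() -> io::Result<()> {
    // Assumed local path to the downloaded GGUF file.
    let mut file = fs::File::open("Phi-3-mini-4k-instruct-q4.gguf")?;
    let mut hasher = Sha256::new();
    // Stream the file through the hasher rather than loading it into memory.
    io::copy(&mut file, &mut hasher)?;
    println!("{:x}", hasher.finalize());
    Ok(())
}
```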

@LaurentMazare (Collaborator)

Thanks for looking into this; it seems pretty odd for the hash to have changed like this. I've just modified the example code in #2156 so that it forces the use of the separate branch you mentioned, so hopefully that will fix it for your colleague and others.
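
For reference, candle's examples fetch model files through the hf-hub crate, which can pin a specific revision. A minimal sketch of what such a pin looks like; the branch name is a placeholder, and the repo id and filename are assumptions here, not necessarily what #2156 uses:

```rust
use hf_hub::{api::sync::Api, Repo, RepoType};

fn main() -> anyhow::Result<()> {
    let api = Api::new()?;
    // "<branch>" stands in for the separate branch mentioned above.
    let repo = api.repo(Repo::with_revision(
        "microsoft/Phi-3-mini-4k-instruct-gguf".to_string(),
        RepoType::Model,
        "<branch>".to_string(),
    ));
    // Downloads the pinned file (or finds it in the cache) and returns its path.
    let model_path = repo.get("Phi-3-mini-4k-instruct-q4.gguf")?;
    println!("{}", model_path.display());
    Ok(())
}
```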


socathie commented May 3, 2024

Thanks - this is a great temporary solution. Is there any way we can also flag this to the Hugging Face team?

@LaurentMazare (Collaborator)

Yeah, not sure what they would think of this.
Anyway, I've also put together #2157, which uses the new naming convention with an implementation closer to the phi-3 codebase rather than re-using the llama one, so hopefully we should also be covered for upcoming phi-3 models if they use the current "main" branch.
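
Not necessarily how #2157 implements it, but a minimal sketch of the prefix handling a phi-3-aware loader has to deal with, assuming the metadata HashMap from `gguf_file::Content` as in the earlier sketch:

```rust
use candle_core::quantized::gguf_file;
use std::collections::HashMap;

// Hypothetical helper: try the phi3 prefix first, then fall back to the
// llama prefix used by files that follow the old naming convention.
fn head_count(metadata: &HashMap<String, gguf_file::Value>) -> anyhow::Result<u32> {
    for prefix in ["phi3", "llama"] {
        let key = format!("{prefix}.attention.head_count");
        if let Some(value) = metadata.get(&key) {
            return Ok(value.to_u32()?);
        }
    }
    anyhow::bail!("cannot find attention.head_count under any known prefix")
}
```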
