-
Could not find a GGUF version of https://huggingface.co/intfloat/e5-mistral-7b-instruct. I am trying to convert it to GGUF using convert.py, but I get a `KeyError: 'embed_tokens.weight'`. Any suggestions? Thanks.
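For anyone hitting the same error: a hedged sketch of one common cause of this kind of KeyError is a prefix mismatch between the tensor names stored in the checkpoint and the canonical names the converter looks up. The helper and the name set below are illustrative assumptions, not llama.cpp code or the real checkpoint contents.

```python
# Hedged sketch, not llama.cpp code: a KeyError like the one above can
# occur when the checkpoint stores a bare "embed_tokens.weight" while the
# converter expects the prefixed "model.embed_tokens.weight" (or vice
# versa). This toy helper simulates normalizing checkpoint keys before
# conversion; the key names here are stand-ins for illustration.

def normalize_keys(state_dict, prefix="model."):
    """Return a copy of state_dict with the expected prefix added to any
    key that lacks it (lm_head.weight conventionally stays unprefixed)."""
    fixed = {}
    for name, tensor in state_dict.items():
        if not name.startswith(prefix) and name != "lm_head.weight":
            name = prefix + name
        fixed[name] = tensor
    return fixed

ckpt = {
    "embed_tokens.weight": None,   # missing "model." prefix
    "model.norm.weight": None,     # already prefixed
    "lm_head.weight": None,        # left as-is by convention
}
print(sorted(normalize_keys(ckpt)))
```

Inspecting the actual key names in the checkpoint (e.g. with the safetensors metadata) is the quickest way to confirm which side the mismatch is on.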
Replies: 4 comments 4 replies
-
Is this error because embedding models are not supported by the conversion tool?
-
Were you able to get this working?
-
No. I did not know how to do it.
-
I probably got it working by modifying tensor_mapping.py and updating the TensorNameMap dictionary with the tensor names taken strictly from the LoRA adapter.
The modified file is uploaded here:
https://gist.github.com/s3nh/a06f827bc492eb4b667db09d44b922e7
Then:
- convert the base model and the LoRA adapter to fp16.bin
- merge them
- quantize with llama.cpp/quantize
I got feedback that it looks OK, so you can give it a try and prove me wrong eventually.
https://huggingface.co/s3nh/intfloat-e5-mistral-7b-instruct-GGUF
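The TensorNameMap tweak described above can be sketched roughly as follows. This is a toy illustration, not a copy of the gist: the source names and helper functions are assumptions, and only the GGUF target names (token_embd / output_norm) follow the actual GGUF naming convention. The real edit goes in gguf-py's tensor_mapping.py.

```python
# Hedged sketch of the approach described above: extend a
# TensorNameMap-style dictionary so that tensor names found in the
# checkpoint or LoRA adapter (assumed names here) resolve to the GGUF
# names the converter writes. This toy table only illustrates the idea.

TENSOR_MAP = {
    "model.embed_tokens.weight": "token_embd.weight",
    "model.norm.weight": "output_norm.weight",
}

def add_mapping(src_name, gguf_name, table=TENSOR_MAP):
    """Register an extra source-name -> GGUF-name entry, mimicking an
    edit to the TensorNameMap dictionary."""
    table[src_name] = gguf_name
    return table

def map_name(src_name, table=TENSOR_MAP):
    """Resolve a checkpoint tensor name to its GGUF name, raising
    KeyError when unmapped (the failure mode the question reports)."""
    return table[src_name]

# Example: register the bare (unprefixed) name seen in the checkpoint,
# so lookups no longer fail with KeyError.
add_mapping("embed_tokens.weight", "token_embd.weight")
print(map_name("embed_tokens.weight"))  # token_embd.weight
```

With the extra entries registered, the convert-to-fp16, merge, and quantize steps listed above can proceed on the renamed tensors.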