
The convert-hf-to-gguf-update.py script doesn't seem to work. #7088

Closed
LiyanJin opened this issue May 5, 2024 · 15 comments

@LiyanJin

LiyanJin commented May 5, 2024

Ubuntu 20.04, CUDA Toolkit 12.2
GPU: Nvidia A100 24G
RAM: 10G (available)

When I use convert-hf-to-gguf-update.py in llama.cpp to convert the HF model to GGUF, it neither reports any error nor generates the GGUF file.
[screenshot: error1]

When I use convert-hf-to-gguf.py in llama.cpp to convert the HF model to GGUF, this error occurs:
[screenshot: error2]

Has anyone faced this problem? Does anyone know how to fix this problem?

@CrispStrobe
Contributor

CrispStrobe commented May 5, 2024

The update script takes the HF token as a parameter. Before running it, you must add your own model to the script, unless it is one of the already listed ones. You do this as follows: after the line
{"name": "llama-bpe", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/meta-llama/Meta-Llama-3-8B", },
you add a line like
{"name": "llama-bpe-1", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/your/repo", },
Run the update script, then, it seems, you copy-paste the generated function into convert-hf-to-gguf.py, replacing the function there (see the illustrative snippet below), copy the generated tokenizer files to your model path:
cp llama.cpp/models/tokenizers/llama-bpe/* /your/model/path
and run the conversion script:
python convert-hf-to-gguf.py /your/model/path --outtype f16 --outfile /path/to/outputmodelfile.bin
(which takes these parameters: convert-hf-to-gguf.py [-h] [--vocab-only] [--awq-path AWQ_PATH] [--outfile OUTFILE] [--outtype {f32,f16}] [--bigendian] [--use-temp-file] model)
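For illustration (and assuming I am reading the update script correctly), the generated snippet that ends up in get_vocab_base_pre() in convert-hf-to-gguf.py looks roughly like this; the hash value below is only a placeholder, the update script prints the real one for your tokenizer:

    if chkhsh == "0000000000000000000000000000000000000000000000000000000000000000":
        # ref: https://huggingface.co/your/repo
        res = "llama-bpe-1"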

@teleprint-me
Contributor

python convert-hf-to-gguf-update.py <hf-read-token>
python convert-hf-to-gguf.py /path/to/models/meta-llama/Meta-Llama-3-8B-Instruct-HF --outtype f16

Llama-3-8B is already included in the hash sums, so you shouldn't need to add it. If you run into issues with the conversion process, you'll need to generate the vocabs: follow the instructions in the script's output and copy the retrieved vocabs to the model path. Then you should be able to convert.

@CrispStrobe
Contributor

but from the screenshot, the OP wants to work on a "merged" model, which might result in a different hash?

@LiyanJin
Author

LiyanJin commented May 6, 2024

@CrispStrobe @teleprint-me Thank you both for the answers. I tried the following command
python convert-hf-to-gguf.py ~/llama3_hf_merged/ --outfile ~/model/gml-model-f16.gguf --outtype f16
and encountered the following problem
[screenshot: error3]
Then I attempted this command: cp llama.cpp/models/tokenizers/llama-bpe/* /your/model/path
But I couldn't find any files at that path. So I copied all the files in llama.cpp/models/ to my folder, like this
cp ~/llama.cpp/models/* ~/llama3_hf_merged/
and executed the following command again
python convert-hf-to-gguf.py ~/llama3_hf_merged/ --outfile ~/model/gml-model-f16.gguf --outtype f16
but still encountered this error
[screenshot: error4]
Could you please take another look? Did I make a mistake somewhere?

@teleprint-me
Contributor

teleprint-me commented May 7, 2024

You're almost there 🥲

Some things to note:

  • I am not an expert. I'm learning as I go. Experience is a good teacher. 😅
  • The error is complaining about the vocabulary. Try with the --vocab-type bpe flag. Reading the errors helps you understand what might have gone wrong; this is a great tip, especially if you're not a programmer.
  • Merged models will have mixed results. Something to do with the merging/conversion process. Not sure.
  • The tokenizer's encodings are hashed, not the language model. So if the vocabulary was modified in any way, shape, or form, the hashes will not match. You'll need to add it, generate the new method, then copy it over to the conversion script to apply it (see the sketch after the command below).
python convert.py /path/to/models/meta-llama/Meta-Llama-3-8B-Instruct-HF --vocab-type bpe --outtype f16
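For context on the hashing point above, here is a simplified sketch (not the exact code) of the check the update script computes and convert-hf-to-gguf.py re-checks at conversion time: it encodes a fixed test string with the HF tokenizer and hashes the resulting token IDs, so any change to the vocabulary changes the value:

    from hashlib import sha256
    from transformers import AutoTokenizer

    chktxt = "..."  # the script defines a long, fixed test string here
    tokenizer = AutoTokenizer.from_pretrained("/your/model/path")
    # hash the token IDs produced for the test string; a modified vocab gives a different value
    chkhsh = sha256(str(tokenizer.encode(chktxt)).encode()).hexdigest()
    print(chkhsh)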

Even if the conversion process succeeds, I suspect you'll not get the expected output from the model, because the model's weights were merged. I don't have enough insight to answer why this is the case.

Note that this won't work with convert-hf-to-gguf.py, because it is the convert.py script that has a vocabulary factory implemented in it. Not too sure about the HF script; would need to check.

@CrispStrobe
Contributor

CrispStrobe commented May 8, 2024

convert.py is not yet adapted to the bpe fix, so if that is needed, use convert-hf-to-gguf.py instead

you should check all the paths and contents involved. (edit:) The message about tokenizer.model being missing might be misleading: you should not place that file there, but rather make sure that the update script fixes things so that the bpe tokenizer is recognized. This involves both the updated function in the convert script AND copying the new tokenizer files that the update script generates into the model directory.

(if you are in another situation, you could of course download such a tokenizer.model file, e.g. via
wget "https://huggingface.co/repo/model/resolve/main/tokenizer.model?download=true" -O ~/llama3_hf_merged/tokenizer.model
or you could just generate it from the tokenizer.json file with:
python convert.py path/to/where/the/tokenizer.json/lies --vocab-only --outfile /path/to/where/you/need/it/tokenizer.model --vocab-type bpe)

& of course, the placeholders ("/your/model/path", "repo/model") must be changed

& the copy command should be not
cp ~/llama.cpp/models/* ~/llama3_hf_merged/
but
cp ~/llama.cpp/models/tokenizers/llama-bpe/* ~/llama3_hf_merged/
(IF the above is correct, and of course only AFTER the first run of the update script)

@LiyanJin
Author

LiyanJin commented May 8, 2024


@teleprint-me Appreciate the time you took to address this problem. I wanted to express my gratitude for your thorough explanation. It helped me understand the problem much better.

@LiyanJin
Author

LiyanJin commented May 8, 2024

Regarding the suggested cp command: I can't find a sub-directory named 'tokenizers' under the 'models' directory.

@CrispStrobe Thank you for your valuable insights on this issue. Your help has been invaluable.

@CrispStrobe
Contributor

did it work for you now?
the models/tokenizers/ path is created and filled by the convert-hf-to-gguf-update.py script
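If you want a quick sanity check that the update script actually produced those files (this is just a generic check, not part of llama.cpp), listing the directory should show the downloaded tokenizer files such as tokenizer.json and tokenizer_config.json:

    import os
    # list what convert-hf-to-gguf-update.py downloaded for the llama-bpe entry
    print(os.listdir("llama.cpp/models/tokenizers/llama-bpe"))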

@satyaloka93

The tokenizer.json has changed, starting at line 2332. Look at Meta's official repo and check your file. That bpe error goes away.
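If you want to check this yourself, a quick way (assuming you have both your file and the official one locally; this is a generic sketch, not something from llama.cpp) is to compare file hashes:

    import hashlib, os

    def sha256_of(path):
        # hash a file so two tokenizer.json copies can be compared quickly
        with open(os.path.expanduser(path), "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    print(sha256_of("~/llama3_hf_merged/tokenizer.json"))
    print(sha256_of("~/Meta-Llama-3-8B/tokenizer.json"))  # adjust to wherever the official copy lives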

@satyaloka93

Also the tokenizer_config.json, chat template, and eos token change. NousResearch was a bad choice to download from, it appears.

@LiyanJin
Author

@CrispStrobe Thank you for your concern. I am still trying, and I will let you know when I have results!

@CrispStrobe
Contributor

CrispStrobe commented May 10, 2024

maybe you will find this quickly hacked Kaggle notebook useful as an illustration

@github-actions github-actions bot added the stale label Jun 10, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@SanjayDevadiga

SanjayDevadiga commented Jul 10, 2024


I encountered the same problem when I was converting the phi-2 model.
This is how I fixed it:
1. Check convert_hf_to_gguf_update.py line 66.
For me there was no entry for phi-2, so I added the line below to that list:
model = [
    {"name": "phi-2", "tokt": TOKENIZER_TYPE.BPE, "repo": "https://huggingface.co/microsoft/phi-2", },
    # ...rest of the items
]
2. Run convert_hf_to_gguf_update.py:
python convert_hf_to_gguf_update.py <your_huggingface_token>
You can create or find a token in your Hugging Face profile settings (I created a token with read/write access).
3. After the previous script completes, run the conversion script:
python convert-hf-to-gguf.py ~/llama3_hf_merged/ --outfile ~/model/gml-model-f16.gguf --outtype f16

I hope it works.
