Format conversion issue after downstream SFT #236

@LiuZhihhxx

Description

I used my own dataset to fine-tune the bitnet-b1.58-2B-4T-bf16 model for a downstream task. The saved checkpoint directory looks like this:

path/to/my/ckpt  
    ├── chat_template.jinja  
    ├── config.json  
    ├── generation_config.json  
    ├── model.safetensors  
    ├── optimizer.pt  
    ├── rng_state.pth  
    ├── scheduler.pt  
    ├── special_tokens_map.json  
    ├── tokenizer.json  
    ├── tokenizer_config.json  
    ├── trainer_state.json  
    └── training_args.bin  

Now I'm trying to convert this model to GGUF format, following the README:

python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s

(with models/BitNet-b1.58-2B-4T replaced by my actual model path)

I also modified setup_env.py to support my local model path, but running the command fails with the following error:

INFO:hf-to-gguf:Loading model: checkpoint-1800
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 4096
INFO:hf-to-gguf:gguf: embedding length = 2560
INFO:hf-to-gguf:gguf: feed forward length = 6912
INFO:hf-to-gguf:gguf: head count = 20
INFO:hf-to-gguf:gguf: key-value head count = 5
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 0
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 1165, in <module>
    main()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 1150, in main
    model_instance.set_vocab()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 957, in set_vocab
    self._set_vocab_sentencepiece()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 383, in _set_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /data/personal/liuzhi/projects/LLaMA-Factory/ckpt/bitnet-b1.58-2B-4T-bf16/20250428_ele_v2/checkpoint-1800/tokenizer.model

It seems the tokenizer.model file is missing, but this file doesn't exist in the official code or model files either.
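For reference, here is a minimal sketch I used to check which tokenizer files are actually present in the checkpoint directory (the file names are the standard Hugging Face ones; `tokenizer.model` is the SentencePiece file the conversion script expects in `_set_vocab_sentencepiece`):

```python
from pathlib import Path

# Tokenizer files the conversion path may look for; tokenizer.model is the
# SentencePiece vocabulary that convert-hf-to-gguf-bitnet.py raises on.
CANDIDATES = ["tokenizer.model", "tokenizer.json", "tokenizer_config.json"]

def tokenizer_files(ckpt_dir: str) -> dict[str, bool]:
    """Report which tokenizer files exist in a checkpoint directory."""
    d = Path(ckpt_dir)
    return {name: (d / name).is_file() for name in CANDIDATES}
```

In my checkpoint this reports `tokenizer.json` and `tokenizer_config.json` as present but `tokenizer.model` as missing, which matches the traceback above.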

Could you please provide some guidance on how to convert my model to GGUF format and run inference, as with the official model? Any help is appreciated!
