Format conversion issue after downstream SFT #236

@LiuZhihhxx

Description

I used my own dataset to fine-tune the bitnet-b1.58-2B-4T-bf16 model for a downstream task. The saved checkpoint directory looks like this:

path/to/my/ckpt  
    ├── chat_template.jinja  
    ├── config.json  
    ├── generation_config.json  
    ├── model.safetensors  
    ├── optimizer.pt  
    ├── rng_state.pth  
    ├── scheduler.pt  
    ├── special_tokens_map.json  
    ├── tokenizer.json  
    ├── tokenizer_config.json  
    ├── trainer_state.json  
    └── training_args.bin  

Now I'm trying to convert this model to GGUF format, following the README:

python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s

(with models/BitNet-b1.58-2B-4T replaced by my actual model path)

I also modified setup_env.py to support my local model path, but running the command fails with the following error:

INFO:hf-to-gguf:Loading model: checkpoint-1800
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 4096
INFO:hf-to-gguf:gguf: embedding length = 2560
INFO:hf-to-gguf:gguf: feed forward length = 6912
INFO:hf-to-gguf:gguf: head count = 20
INFO:hf-to-gguf:gguf: key-value head count = 5
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 0
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 1165, in <module>
    main()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 1150, in main
    model_instance.set_vocab()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 957, in set_vocab
    self._set_vocab_sentencepiece()
  File "/data/personal/liuzhi/projects/LLaMA-Factory/BitNet/utils/convert-hf-to-gguf-bitnet.py", line 383, in _set_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /data/personal/liuzhi/projects/LLaMA-Factory/ckpt/bitnet-b1.58-2B-4T-bf16/20250428_ele_v2/checkpoint-1800/tokenizer.model

It seems the tokenizer.model file is missing, but this file doesn't exist in the official code or model files either.
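For reference, here is a minimal sketch I used to check which tokenizer files are actually present in the checkpoint directory (the file names are the standard Hugging Face ones; `tokenizer.model` is the SentencePiece file the conversion script expects in `_set_vocab_sentencepiece`):

```python
from pathlib import Path

# Tokenizer files the conversion path may look for; tokenizer.model is the
# SentencePiece vocabulary that convert-hf-to-gguf-bitnet.py raises on.
CANDIDATES = ["tokenizer.model", "tokenizer.json", "tokenizer_config.json"]

def tokenizer_files(ckpt_dir: str) -> dict[str, bool]:
    """Report which tokenizer files exist in a checkpoint directory."""
    d = Path(ckpt_dir)
    return {name: (d / name).is_file() for name in CANDIDATES}
```

In my checkpoint this reports `tokenizer.json` and `tokenizer_config.json` as present but `tokenizer.model` as missing, which matches the traceback above.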

Could you please provide some guidance on how to convert my model to GGUF format and run inference, as with the official model? Any help is appreciated!
