Name and Version
./build/bin/llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 6653 (e74c92e84)
built with cc (GCC) 15.2.1 20250813 for x86_64-pc-linux-gnu
Python environment (pip):
Requirement already satisfied: tokenizers in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (0.22.1)
Requirement already satisfied: transformers in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (4.56.2)
Requirement already satisfied: safetensors in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (0.6.2)
Requirement already satisfied: huggingface-hub<2.0,>=0.16.4 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from tokenizers) (0.35.0rc0)
Requirement already satisfied: filelock in /usr/lib/python3.13/site-packages (from transformers) (3.19.1)
Requirement already satisfied: numpy>=1.17 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from transformers) (2.3.2)
Requirement already satisfied: packaging>=20.0 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from transformers) (25.0)
Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3.13/site-packages (from transformers) (6.0.2)
Requirement already satisfied: regex!=2019.12.17 in /usr/lib/python3.13/site-packages (from transformers) (2025.9.18)
Requirement already satisfied: requests in /usr/lib/python3.13/site-packages (from transformers) (2.32.5)
Requirement already satisfied: tqdm>=4.27 in /usr/lib/python3.13/site-packages (from transformers) (4.67.1)
Requirement already satisfied: fsspec>=2023.5.0 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from huggingface-hub<2.0,>=0.16.4->tokenizers) (2024.12.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from huggingface-hub<2.0,>=0.16.4->tokenizers) (4.15.0)
Requirement already satisfied: hf-xet<2.0.0,>=1.1.3 in /home/alpha/AI/exui_v3/venv/lib/python3.13/site-packages (from huggingface-hub<2.0,>=0.16.4->tokenizers) (1.1.7)
Requirement already satisfied: charset_normalizer<4,>=2 in /usr/lib/python3.13/site-packages (from requests->transformers) (3.4.3)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3.13/site-packages (from requests->transformers) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3.13/site-packages (from requests->transformers) (2.5.0)
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
Python/Bash scripts
Command line
python convert_hf_to_gguf.py --outfile /home/alpha/Models/GGUF/GLM-4.6-BF16/zai-org_GLM-4.6-bf16 --outtype bf16 /home/alpha/Models/Raw/zai-org_GLM-4.6 --split-max-size 100G
Problem description & steps to reproduce
convert_hf_to_gguf.py errors out with missing tensors (expected from mtp.safetensors) when attempting to convert GLM 4.6 to GGUF.
GLM 4.5 appears to convert normally.
I've verified the SHA256 checksums of my local files against https://huggingface.co/zai-org/GLM-4.6, and also manually redownloaded mtp.safetensors just in case; a quick way to re-check that file is sketched below.
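For reference, here is a minimal sketch of how I would double-check the file. The path is an assumption based on my command line below; it hashes mtp.safetensors (to compare against the SHA256 shown on the Hugging Face file page) and lists the tensor names the file actually contains:

```python
import hashlib
from safetensors import safe_open

# Hypothetical local path; adjust to your model directory.
path = "/home/alpha/Models/Raw/zai-org_GLM-4.6/mtp.safetensors"

# Compare this digest against the SHA256 listed on the Hugging Face file page.
h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
print("sha256:", h.hexdigest())

# List the tensor names actually stored in the file; the converter expects
# the model.layers.92.* (MTP) tensors to come from it.
with safe_open(path, framework="pt") as st:
    keys = list(st.keys())
print(len(keys), "tensors in file")
print([k for k in keys if "layers.92" in k][:10])
```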
It seems GLM 4.6 conversion worked for you, per: #16359
Did you use any workarounds, and what command did you use?
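For what it's worth, a small diagnostic sketch (hedged; paths and file layout are assumptions based on my command line above) that checks which shard the checkpoint's index assigns the layer-92 tensors to, and whether that shard exists on disk:

```python
import json
import os

# Hypothetical local path; adjust to your model directory.
model_dir = "/home/alpha/Models/Raw/zai-org_GLM-4.6"
index_path = os.path.join(model_dir, "model.safetensors.index.json")

with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Group the expected layer-92 tensors by the shard file the index maps them to.
shards = {}
for tensor, shard in weight_map.items():
    if tensor.startswith("model.layers.92."):
        shards.setdefault(shard, []).append(tensor)

# Note: depending on the checkpoint, the MTP tensors may live in a separate
# mtp.safetensors that is not listed in the index at all; an empty result
# here would indicate that.
for shard, tensors in sorted(shards.items()):
    present = os.path.exists(os.path.join(model_dir, shard))
    print(f"{shard}: {len(tensors)} layer-92 tensors, on disk: {present}")
```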
First Bad Commit
Relevant log output
...
INFO:hf-to-gguf:blk.90.attn_v.weight, torch.bfloat16 --> BF16, shape = {5120, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00092-of-00092.safetensors'
INFO:hf-to-gguf:output.weight, torch.bfloat16 --> BF16, shape = {5120, 151552}
INFO:hf-to-gguf:blk.91.attn_norm.weight, torch.bfloat16 --> F32, shape = {5120}
INFO:hf-to-gguf:blk.91.ffn_down_exps.weight, torch.bfloat16 --> BF16, shape = {1536, 5120, 160}
INFO:hf-to-gguf:blk.91.ffn_gate_exps.weight, torch.bfloat16 --> BF16, shape = {5120, 1536, 160}
INFO:hf-to-gguf:blk.91.ffn_up_exps.weight, torch.bfloat16 --> BF16, shape = {5120, 1536, 160}
INFO:hf-to-gguf:blk.91.exp_probs_b.bias, torch.float32 --> F32, shape = {160}
INFO:hf-to-gguf:blk.91.ffn_gate_inp.weight, torch.bfloat16 --> F32, shape = {5120, 160}
INFO:hf-to-gguf:blk.91.ffn_down_shexp.weight, torch.bfloat16 --> BF16, shape = {1536, 5120}
INFO:hf-to-gguf:blk.91.ffn_gate_shexp.weight, torch.bfloat16 --> BF16, shape = {5120, 1536}
INFO:hf-to-gguf:blk.91.ffn_up_shexp.weight, torch.bfloat16 --> BF16, shape = {5120, 1536}
INFO:hf-to-gguf:blk.91.post_attention_norm.weight, torch.bfloat16 --> F32, shape = {5120}
INFO:hf-to-gguf:blk.91.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.91.attn_k.bias, torch.bfloat16 --> F32, shape = {1024}
INFO:hf-to-gguf:blk.91.attn_k.weight, torch.bfloat16 --> BF16, shape = {5120, 1024}
INFO:hf-to-gguf:blk.91.attn_output.weight, torch.bfloat16 --> BF16, shape = {12288, 5120}
INFO:hf-to-gguf:blk.91.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.91.attn_q.bias, torch.bfloat16 --> F32, shape = {12288}
INFO:hf-to-gguf:blk.91.attn_q.weight, torch.bfloat16 --> BF16, shape = {5120, 12288}
INFO:hf-to-gguf:blk.91.attn_v.bias, torch.bfloat16 --> F32, shape = {1024}
INFO:hf-to-gguf:blk.91.attn_v.weight, torch.bfloat16 --> BF16, shape = {5120, 1024}
INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {5120}
Traceback (most recent call last):
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 9365, in <module>
main()
~~~~^^
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 9359, in main
model_instance.write()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 429, in write
self.prepare_tensors()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 7271, in prepare_tensors
super().prepare_tensors()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 282, in prepare_tensors
for name, data_torch in chain(self.generate_extra_tensors(), self.get_tensors()):
~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/alpha/AI/llama.cpp/convert_hf_to_gguf.py", line 227, in get_tensors
raise ValueError(f"Missing or incomplete model files: {missing_files}\n"
f"Missing tensors: {missing}")
ValueError: Missing or incomplete model files: ['mtp.safetensors']
Missing tensors: ['model.layers.92.eh_proj.weight', 'model.layers.92.enorm.weight', 'model.layers.92.hnorm.weight', 'model.layers.92.input_layernorm.weight', 'model.layers.92.mlp.experts.0.down_proj.weight', 'model.layers.92.mlp.experts.0.gate_proj.weight', 'model.layers.92.mlp.experts.0.up_proj.weight', 'model.layers.92.mlp.experts.1.down_proj.weight', 'model.layers.92.mlp.experts.1.gate_proj.weight', 'model.layers.92.mlp.experts.1.up_proj.weight', 'model.layers.92.mlp.experts.10.down_proj.weight', 'model.layers.92.mlp.experts.10.gate_proj.weight', 'model.layers.92.mlp.experts.10.up_proj.weight', 'model.layers.92.mlp.experts.100.down_proj.weight', 'model.layers.92.mlp.experts.100.gate_proj.weight', 'model.layers.92.mlp.experts.100.up_proj.weight', 'model.layers.92.mlp.experts.101.down_proj.weight', 'model.layers.92.mlp.experts.101.gate_proj.weight', 'model.layers.92.mlp.experts.101.up_proj.weight', 'model.layers.92.mlp.experts.102.down_proj.weight', 'model.layers.92.mlp.experts.102.gate_proj.weight', 'model.layers.92.mlp.experts.102.up_proj.weight', 'model.layers.92.mlp.experts.103.down_proj.weight', 'model.layers.92.mlp.experts.103.gate_proj.weight', 'model.layers.92.mlp.experts.103.up_proj.weight', 'model.layers.92.mlp.experts.104.down_proj.weight', 'model.layers.92.mlp.experts.104.gate_proj.weight', 'model.layers.92.mlp.experts.104.up_proj.weight', 'model.layers.92.mlp.experts.105.down_proj.weight', 'model.layers.92.mlp.experts.105.gate_proj.weight', 'model.layers.92.mlp.experts.105.up_proj.weight', 'model.layers.92.mlp.experts.106.down_proj.weight', 'model.layers.92.mlp.experts.106.gate_proj.weight', 'model.layers.92.mlp.experts.106.up_proj.weight', 'model.layers.92.mlp.experts.107.down_proj.weight', 'model.layers.92.mlp.experts.107.gate_proj.weight', 'model.layers.92.mlp.experts.107.up_proj.weight', 'model.layers.92.mlp.experts.108.down_proj.weight', 'model.layers.92.mlp.experts.108.gate_proj.weight', 'model.layers.92.mlp.experts.108.up_proj.weight', 'model.layers.92.mlp.experts.109.down_proj.weight', 'model.layers.92.mlp.experts.109.gate_proj.weight', 'model.layers.92.mlp.experts.109.up_proj.weight', 'model.layers.92.mlp.experts.11.down_proj.weight', 'model.layers.92.mlp.experts.11.gate_proj.weight', 'model.layers.92.mlp.experts.11.up_proj.weight', 'model.layers.92.mlp.experts.110.down_proj.weight', 'model.layers.92.mlp.experts.110.gate_proj.weight', 'model.layers.92.mlp.experts.110.up_proj.weight', 'model.layers.92.mlp.experts.111.down_proj.weight', 'model.layers.92.mlp.experts.111.gate_proj.weight', 'model.layers.92.mlp.experts.111.up_proj.weight', 'model.layers.92.mlp.experts.112.down_proj.weight', 'model.layers.92.mlp.experts.112.gate_proj.weight', 'model.layers.92.mlp.experts.112.up_proj.weight', 'model.layers.92.mlp.experts.113.down_proj.weight', 'model.layers.92.mlp.experts.113.gate_proj.weight', 'model.layers.92.mlp.experts.113.up_proj.weight', 'model.layers.92.mlp.experts.114.down_proj.weight', 'model.layers.92.mlp.experts.114.gate_proj.weight', 'model.layers.92.mlp.experts.114.up_proj.weight', 'model.layers.92.mlp.experts.115.down_proj.weight', 'model.layers.92.mlp.experts.115.gate_proj.weight', 'model.layers.92.mlp.experts.115.up_proj.weight', 'model.layers.92.mlp.experts.116.down_proj.weight', 'model.layers.92.mlp.experts.116.gate_proj.weight', 'model.layers.92.mlp.experts.116.up_proj.weight', 'model.layers.92.mlp.experts.117.down_proj.weight', 'model.layers.92.mlp.experts.117.gate_proj.weight', 'model.layers.92.mlp.experts.117.up_proj.weight', 
'model.layers.92.mlp.experts.118.down_proj.weight', 'model.layers.92.mlp.experts.118.gate_proj.weight', 'model.layers.92.mlp.experts.118.up_proj.weight', 'model.layers.92.mlp.experts.119.down_proj.weight', 'model.layers.92.mlp.experts.119.gate_proj.weight', 'model.layers.92.mlp.experts.119.up_proj.weight', 'model.layers.92.mlp.experts.12.down_proj.weight', 'model.layers.92.mlp.experts.12.gate_proj.weight', 'model.layers.92.mlp.experts.12.up_proj.weight', 'model.layers.92.mlp.experts.120.down_proj.weight', 'model.layers.92.mlp.experts.120.gate_proj.weight', 'model.layers.92.mlp.experts.120.up_proj.weight', 'model.layers.92.mlp.experts.121.down_proj.weight', 'model.layers.92.mlp.experts.121.gate_proj.weight', 'model.layers.92.mlp.experts.121.up_proj.weight', 'model.layers.92.mlp.experts.122.down_proj.weight', 'model.layers.92.mlp.experts.122.gate_proj.weight', 'model.layers.92.mlp.experts.122.up_proj.weight', 'model.layers.92.mlp.experts.123.down_proj.weight', 'model.layers.92.mlp.experts.123.gate_proj.weight', 'model.layers.92.mlp.experts.123.up_proj.weight', 'model.layers.92.mlp.experts.124.down_proj.weight', 'model.layers.92.mlp.experts.124.gate_proj.weight', 'model.layers.92.mlp.experts.124.up_proj.weight', 'model.layers.92.mlp.experts.125.down_proj.weight', 'model.layers.92.mlp.experts.125.gate_proj.weight', 'model.layers.92.mlp.experts.125.up_proj.weight', 'model.layers.92.mlp.experts.126.down_proj.weight', 'model.layers.92.mlp.experts.126.gate_proj.weight', 'model.layers.92.mlp.experts.126.up_proj.weight', 'model.layers.92.mlp.experts.127.down_proj.weight', 'model.layers.92.mlp.experts.127.gate_proj.weight', 'model.layers.92.mlp.experts.127.up_proj.weight', 'model.layers.92.mlp.experts.128.down_proj.weight', 'model.layers.92.mlp.experts.128.gate_proj.weight', 'model.layers.92.mlp.experts.128.up_proj.weight', 'model.layers.92.mlp.experts.129.down_proj.weight', 'model.layers.92.mlp.experts.129.gate_proj.weight', 'model.layers.92.mlp.experts.129.up_proj.weight', 'model.layers.92.mlp.experts.13.down_proj.weight', 'model.layers.92.mlp.experts.13.gate_proj.weight', 'model.layers.92.mlp.experts.13.up_proj.weight', 'model.layers.92.mlp.experts.130.down_proj.weight', 'model.layers.92.mlp.experts.130.gate_proj.weight', 'model.layers.92.mlp.experts.130.up_proj.weight', 'model.layers.92.mlp.experts.131.down_proj.weight', 'model.layers.92.mlp.experts.131.gate_proj.weight', 'model.layers.92.mlp.experts.131.up_proj.weight', 'model.layers.92.mlp.experts.132.down_proj.weight', 'model.layers.92.mlp.experts.132.gate_proj.weight', 'model.layers.92.mlp.experts.132.up_proj.weight', 'model.layers.92.mlp.experts.133.down_proj.weight', 'model.layers.92.mlp.experts.133.gate_proj.weight', 'model.layers.92.mlp.experts.133.up_proj.weight', 'model.layers.92.mlp.experts.134.down_proj.weight', 'model.layers.92.mlp.experts.134.gate_proj.weight', 'model.layers.92.mlp.experts.134.up_proj.weight', 'model.layers.92.mlp.experts.135.down_proj.weight', 'model.layers.92.mlp.experts.135.gate_proj.weight', 'model.layers.92.mlp.experts.135.up_proj.weight', 'model.layers.92.mlp.experts.136.down_proj.weight', 'model.layers.92.mlp.experts.136.gate_proj.weight', 'model.layers.92.mlp.experts.136.up_proj.weight', 'model.layers.92.mlp.experts.137.down_proj.weight', 'model.layers.92.mlp.experts.137.gate_proj.weight', 'model.layers.92.mlp.experts.137.up_proj.weight', 'model.layers.92.mlp.experts.138.down_proj.weight', 'model.layers.92.mlp.experts.138.gate_proj.weight', 'model.layers.92.mlp.experts.138.up_proj.weight', 
'model.layers.92.mlp.experts.139.down_proj.weight', 'model.layers.92.mlp.experts.139.gate_proj.weight', 'model.layers.92.mlp.experts.139.up_proj.weight', 'model.layers.92.mlp.experts.14.down_proj.weight', 'model.layers.92.mlp.experts.14.gate_proj.weight', 'model.layers.92.mlp.experts.14.up_proj.weight', 'model.layers.92.mlp.experts.140.down_proj.weight', 'model.layers.92.mlp.experts.140.gate_proj.weight', 'model.layers.92.mlp.experts.140.up_proj.weight', 'model.layers.92.mlp.experts.141.down_proj.weight', 'model.layers.92.mlp.experts.141.gate_proj.weight', 'model.layers.92.mlp.experts.141.up_proj.weight', 'model.layers.92.mlp.experts.142.down_proj.weight', 'model.layers.92.mlp.experts.142.gate_proj.weight', 'model.layers.92.mlp.experts.142.up_proj.weight', 'model.layers.92.mlp.experts.143.down_proj.weight', 'model.layers.92.mlp.experts.143.gate_proj.weight', 'model.layers.92.mlp.experts.143.up_proj.weight', 'model.layers.92.mlp.experts.144.down_proj.weight', 'model.layers.92.mlp.experts.144.gate_proj.weight', 'model.layers.92.mlp.experts.144.up_proj.weight', 'model.layers.92.mlp.experts.145.down_proj.weight', 'model.layers.92.mlp.experts.145.gate_proj.weight', 'model.layers.92.mlp.experts.145.up_proj.weight', 'model.layers.92.mlp.experts.146.down_proj.weight', 'model.layers.92.mlp.experts.146.gate_proj.weight', 'model.layers.92.mlp.experts.146.up_proj.weight', 'model.layers.92.mlp.experts.147.down_proj.weight', 'model.layers.92.mlp.experts.147.gate_proj.weight', 'model.layers.92.mlp.experts.147.up_proj.weight', 'model.layers.92.mlp.experts.148.down_proj.weight', 'model.layers.92.mlp.experts.148.gate_proj.weight', 'model.layers.92.mlp.experts.148.up_proj.weight', 'model.layers.92.mlp.experts.149.down_proj.weight', 'model.layers.92.mlp.experts.149.gate_proj.weight', 'model.layers.92.mlp.experts.149.up_proj.weight', 'model.layers.92.mlp.experts.15.down_proj.weight', 'model.layers.92.mlp.experts.15.gate_proj.weight', 'model.layers.92.mlp.experts.15.up_proj.weight', 'model.layers.92.mlp.experts.150.down_proj.weight', 'model.layers.92.mlp.experts.150.gate_proj.weight', 'model.layers.92.mlp.experts.150.up_proj.weight', 'model.layers.92.mlp.experts.151.down_proj.weight', 'model.layers.92.mlp.experts.151.gate_proj.weight', 'model.layers.92.mlp.experts.151.up_proj.weight', 'model.layers.92.mlp.experts.152.down_proj.weight', 'model.layers.92.mlp.experts.152.gate_proj.weight', 'model.layers.92.mlp.experts.152.up_proj.weight', 'model.layers.92.mlp.experts.153.down_proj.weight', 'model.layers.92.mlp.experts.153.gate_proj.weight', 'model.layers.92.mlp.experts.153.up_proj.weight', 'model.layers.92.mlp.experts.154.down_proj.weight', 'model.layers.92.mlp.experts.154.gate_proj.weight', 'model.layers.92.mlp.experts.154.up_proj.weight', 'model.layers.92.mlp.experts.155.down_proj.weight', 'model.layers.92.mlp.experts.155.gate_proj.weight', 'model.layers.92.mlp.experts.155.up_proj.weight', 'model.layers.92.mlp.experts.156.down_proj.weight', 'model.layers.92.mlp.experts.156.gate_proj.weight', 'model.layers.92.mlp.experts.156.up_proj.weight', 'model.layers.92.mlp.experts.157.down_proj.weight', 'model.layers.92.mlp.experts.157.gate_proj.weight', 'model.layers.92.mlp.experts.157.up_proj.weight', 'model.layers.92.mlp.experts.158.down_proj.weight', 'model.layers.92.mlp.experts.158.gate_proj.weight', 'model.layers.92.mlp.experts.158.up_proj.weight', 'model.layers.92.mlp.experts.159.down_proj.weight', 'model.layers.92.mlp.experts.159.gate_proj.weight', 'model.layers.92.mlp.experts.159.up_proj.weight', 
'model.layers.92.mlp.experts.16.down_proj.weight', 'model.layers.92.mlp.experts.16.gate_proj.weight', 'model.layers.92.mlp.experts.16.up_proj.weight', 'model.layers.92.mlp.experts.17.down_proj.weight', 'model.layers.92.mlp.experts.17.gate_proj.weight', 'model.layers.92.mlp.experts.17.up_proj.weight', 'model.layers.92.mlp.experts.18.down_proj.weight', 'model.layers.92.mlp.experts.18.gate_proj.weight', 'model.layers.92.mlp.experts.18.up_proj.weight', 'model.layers.92.mlp.experts.19.down_proj.weight', 'model.layers.92.mlp.experts.19.gate_proj.weight', 'model.layers.92.mlp.experts.19.up_proj.weight', 'model.layers.92.mlp.experts.2.down_proj.weight', 'model.layers.92.mlp.experts.2.gate_proj.weight', 'model.layers.92.mlp.experts.2.up_proj.weight', 'model.layers.92.mlp.experts.20.down_proj.weight', 'model.layers.92.mlp.experts.20.gate_proj.weight', 'model.layers.92.mlp.experts.20.up_proj.weight', 'model.layers.92.mlp.experts.21.down_proj.weight', 'model.layers.92.mlp.experts.21.gate_proj.weight', 'model.layers.92.mlp.experts.21.up_proj.weight', 'model.layers.92.mlp.experts.22.down_proj.weight', 'model.layers.92.mlp.experts.22.gate_proj.weight', 'model.layers.92.mlp.experts.22.up_proj.weight', 'model.layers.92.mlp.experts.23.down_proj.weight', 'model.layers.92.mlp.experts.23.gate_proj.weight', 'model.layers.92.mlp.experts.23.up_proj.weight', 'model.layers.92.mlp.experts.24.down_proj.weight', 'model.layers.92.mlp.experts.24.gate_proj.weight', 'model.layers.92.mlp.experts.24.up_proj.weight', 'model.layers.92.mlp.experts.25.down_proj.weight', 'model.layers.92.mlp.experts.25.gate_proj.weight', 'model.layers.92.mlp.experts.25.up_proj.weight', 'model.layers.92.mlp.experts.26.down_proj.weight', 'model.layers.92.mlp.experts.26.gate_proj.weight', 'model.layers.92.mlp.experts.26.up_proj.weight', 'model.layers.92.mlp.experts.27.down_proj.weight', 'model.layers.92.mlp.experts.27.gate_proj.weight', 'model.layers.92.mlp.experts.27.up_proj.weight', 'model.layers.92.mlp.experts.28.down_proj.weight', 'model.layers.92.mlp.experts.28.gate_proj.weight', 'model.layers.92.mlp.experts.28.up_proj.weight', 'model.layers.92.mlp.experts.29.down_proj.weight', 'model.layers.92.mlp.experts.29.gate_proj.weight', 'model.layers.92.mlp.experts.29.up_proj.weight', 'model.layers.92.mlp.experts.3.down_proj.weight', 'model.layers.92.mlp.experts.3.gate_proj.weight', 'model.layers.92.mlp.experts.3.up_proj.weight', 'model.layers.92.mlp.experts.30.down_proj.weight', 'model.layers.92.mlp.experts.30.gate_proj.weight', 'model.layers.92.mlp.experts.30.up_proj.weight', 'model.layers.92.mlp.experts.31.down_proj.weight', 'model.layers.92.mlp.experts.31.gate_proj.weight', 'model.layers.92.mlp.experts.31.up_proj.weight', 'model.layers.92.mlp.experts.32.down_proj.weight', 'model.layers.92.mlp.experts.32.gate_proj.weight', 'model.layers.92.mlp.experts.32.up_proj.weight', 'model.layers.92.mlp.experts.33.down_proj.weight', 'model.layers.92.mlp.experts.33.gate_proj.weight', 'model.layers.92.mlp.experts.33.up_proj.weight', 'model.layers.92.mlp.experts.34.down_proj.weight', 'model.layers.92.mlp.experts.34.gate_proj.weight', 'model.layers.92.mlp.experts.34.up_proj.weight', 'model.layers.92.mlp.experts.35.down_proj.weight', 'model.layers.92.mlp.experts.35.gate_proj.weight', 'model.layers.92.mlp.experts.35.up_proj.weight', 'model.layers.92.mlp.experts.36.down_proj.weight', 'model.layers.92.mlp.experts.36.gate_proj.weight', 'model.layers.92.mlp.experts.36.up_proj.weight', 'model.layers.92.mlp.experts.37.down_proj.weight', 
'model.layers.92.mlp.experts.37.gate_proj.weight', 'model.layers.92.mlp.experts.37.up_proj.weight', 'model.layers.92.mlp.experts.38.down_proj.weight', 'model.layers.92.mlp.experts.38.gate_proj.weight', 'model.layers.92.mlp.experts.38.up_proj.weight', 'model.layers.92.mlp.experts.39.down_proj.weight', 'model.layers.92.mlp.experts.39.gate_proj.weight', 'model.layers.92.mlp.experts.39.up_proj.weight', 'model.layers.92.mlp.experts.4.down_proj.weight', 'model.layers.92.mlp.experts.4.gate_proj.weight', 'model.layers.92.mlp.experts.4.up_proj.weight', 'model.layers.92.mlp.experts.40.down_proj.weight', 'model.layers.92.mlp.experts.40.gate_proj.weight', 'model.layers.92.mlp.experts.40.up_proj.weight', 'model.layers.92.mlp.experts.41.down_proj.weight', 'model.layers.92.mlp.experts.41.gate_proj.weight', 'model.layers.92.mlp.experts.41.up_proj.weight', 'model.layers.92.mlp.experts.42.down_proj.weight', 'model.layers.92.mlp.experts.42.gate_proj.weight', 'model.layers.92.mlp.experts.42.up_proj.weight', 'model.layers.92.mlp.experts.43.down_proj.weight', 'model.layers.92.mlp.experts.43.gate_proj.weight', 'model.layers.92.mlp.experts.43.up_proj.weight', 'model.layers.92.mlp.experts.44.down_proj.weight', 'model.layers.92.mlp.experts.44.gate_proj.weight', 'model.layers.92.mlp.experts.44.up_proj.weight', 'model.layers.92.mlp.experts.45.down_proj.weight', 'model.layers.92.mlp.experts.45.gate_proj.weight', 'model.layers.92.mlp.experts.45.up_proj.weight', 'model.layers.92.mlp.experts.46.down_proj.weight', 'model.layers.92.mlp.experts.46.gate_proj.weight', 'model.layers.92.mlp.experts.46.up_proj.weight', 'model.layers.92.mlp.experts.47.down_proj.weight', 'model.layers.92.mlp.experts.47.gate_proj.weight', 'model.layers.92.mlp.experts.47.up_proj.weight', 'model.layers.92.mlp.experts.48.down_proj.weight', 'model.layers.92.mlp.experts.48.gate_proj.weight', 'model.layers.92.mlp.experts.48.up_proj.weight', 'model.layers.92.mlp.experts.49.down_proj.weight', 'model.layers.92.mlp.experts.49.gate_proj.weight', 'model.layers.92.mlp.experts.49.up_proj.weight', 'model.layers.92.mlp.experts.5.down_proj.weight', 'model.layers.92.mlp.experts.5.gate_proj.weight', 'model.layers.92.mlp.experts.5.up_proj.weight', 'model.layers.92.mlp.experts.50.down_proj.weight', 'model.layers.92.mlp.experts.50.gate_proj.weight', 'model.layers.92.mlp.experts.50.up_proj.weight', 'model.layers.92.mlp.experts.51.down_proj.weight', 'model.layers.92.mlp.experts.51.gate_proj.weight', 'model.layers.92.mlp.experts.51.up_proj.weight', 'model.layers.92.mlp.experts.52.down_proj.weight', 'model.layers.92.mlp.experts.52.gate_proj.weight', 'model.layers.92.mlp.experts.52.up_proj.weight', 'model.layers.92.mlp.experts.53.down_proj.weight', 'model.layers.92.mlp.experts.53.gate_proj.weight', 'model.layers.92.mlp.experts.53.up_proj.weight', 'model.layers.92.mlp.experts.54.down_proj.weight', 'model.layers.92.mlp.experts.54.gate_proj.weight', 'model.layers.92.mlp.experts.54.up_proj.weight', 'model.layers.92.mlp.experts.55.down_proj.weight', 'model.layers.92.mlp.experts.55.gate_proj.weight', 'model.layers.92.mlp.experts.55.up_proj.weight', 'model.layers.92.mlp.experts.56.down_proj.weight', 'model.layers.92.mlp.experts.56.gate_proj.weight', 'model.layers.92.mlp.experts.56.up_proj.weight', 'model.layers.92.mlp.experts.57.down_proj.weight', 'model.layers.92.mlp.experts.57.gate_proj.weight', 'model.layers.92.mlp.experts.57.up_proj.weight', 'model.layers.92.mlp.experts.58.down_proj.weight', 'model.layers.92.mlp.experts.58.gate_proj.weight', 
'model.layers.92.mlp.experts.58.up_proj.weight', 'model.layers.92.mlp.experts.59.down_proj.weight', 'model.layers.92.mlp.experts.59.gate_proj.weight', 'model.layers.92.mlp.experts.59.up_proj.weight', 'model.layers.92.mlp.experts.6.down_proj.weight', 'model.layers.92.mlp.experts.6.gate_proj.weight', 'model.layers.92.mlp.experts.6.up_proj.weight', 'model.layers.92.mlp.experts.60.down_proj.weight', 'model.layers.92.mlp.experts.60.gate_proj.weight', 'model.layers.92.mlp.experts.60.up_proj.weight', 'model.layers.92.mlp.experts.61.down_proj.weight', 'model.layers.92.mlp.experts.61.gate_proj.weight', 'model.layers.92.mlp.experts.61.up_proj.weight', 'model.layers.92.mlp.experts.62.down_proj.weight', 'model.layers.92.mlp.experts.62.gate_proj.weight', 'model.layers.92.mlp.experts.62.up_proj.weight', 'model.layers.92.mlp.experts.63.down_proj.weight', 'model.layers.92.mlp.experts.63.gate_proj.weight', 'model.layers.92.mlp.experts.63.up_proj.weight', 'model.layers.92.mlp.experts.64.down_proj.weight', 'model.layers.92.mlp.experts.64.gate_proj.weight', 'model.layers.92.mlp.experts.64.up_proj.weight', 'model.layers.92.mlp.experts.65.down_proj.weight', 'model.layers.92.mlp.experts.65.gate_proj.weight', 'model.layers.92.mlp.experts.65.up_proj.weight', 'model.layers.92.mlp.experts.66.down_proj.weight', 'model.layers.92.mlp.experts.66.gate_proj.weight', 'model.layers.92.mlp.experts.66.up_proj.weight', 'model.layers.92.mlp.experts.67.down_proj.weight', 'model.layers.92.mlp.experts.67.gate_proj.weight', 'model.layers.92.mlp.experts.67.up_proj.weight', 'model.layers.92.mlp.experts.68.down_proj.weight', 'model.layers.92.mlp.experts.68.gate_proj.weight', 'model.layers.92.mlp.experts.68.up_proj.weight', 'model.layers.92.mlp.experts.69.down_proj.weight', 'model.layers.92.mlp.experts.69.gate_proj.weight', 'model.layers.92.mlp.experts.69.up_proj.weight', 'model.layers.92.mlp.experts.7.down_proj.weight', 'model.layers.92.mlp.experts.7.gate_proj.weight', 'model.layers.92.mlp.experts.7.up_proj.weight', 'model.layers.92.mlp.experts.70.down_proj.weight', 'model.layers.92.mlp.experts.70.gate_proj.weight', 'model.layers.92.mlp.experts.70.up_proj.weight', 'model.layers.92.mlp.experts.71.down_proj.weight', 'model.layers.92.mlp.experts.71.gate_proj.weight', 'model.layers.92.mlp.experts.71.up_proj.weight', 'model.layers.92.mlp.experts.72.down_proj.weight', 'model.layers.92.mlp.experts.72.gate_proj.weight', 'model.layers.92.mlp.experts.72.up_proj.weight', 'model.layers.92.mlp.experts.73.down_proj.weight', 'model.layers.92.mlp.experts.73.gate_proj.weight', 'model.layers.92.mlp.experts.73.up_proj.weight', 'model.layers.92.mlp.experts.74.down_proj.weight', 'model.layers.92.mlp.experts.74.gate_proj.weight', 'model.layers.92.mlp.experts.74.up_proj.weight', 'model.layers.92.mlp.experts.75.down_proj.weight', 'model.layers.92.mlp.experts.75.gate_proj.weight', 'model.layers.92.mlp.experts.75.up_proj.weight', 'model.layers.92.mlp.experts.76.down_proj.weight', 'model.layers.92.mlp.experts.76.gate_proj.weight', 'model.layers.92.mlp.experts.76.up_proj.weight', 'model.layers.92.mlp.experts.77.down_proj.weight', 'model.layers.92.mlp.experts.77.gate_proj.weight', 'model.layers.92.mlp.experts.77.up_proj.weight', 'model.layers.92.mlp.experts.78.down_proj.weight', 'model.layers.92.mlp.experts.78.gate_proj.weight', 'model.layers.92.mlp.experts.78.up_proj.weight', 'model.layers.92.mlp.experts.79.down_proj.weight', 'model.layers.92.mlp.experts.79.gate_proj.weight', 'model.layers.92.mlp.experts.79.up_proj.weight', 
'model.layers.92.mlp.experts.8.down_proj.weight', 'model.layers.92.mlp.experts.8.gate_proj.weight', 'model.layers.92.mlp.experts.8.up_proj.weight', 'model.layers.92.mlp.experts.80.down_proj.weight', 'model.layers.92.mlp.experts.80.gate_proj.weight', 'model.layers.92.mlp.experts.80.up_proj.weight', 'model.layers.92.mlp.experts.81.down_proj.weight', 'model.layers.92.mlp.experts.81.gate_proj.weight', 'model.layers.92.mlp.experts.81.up_proj.weight', 'model.layers.92.mlp.experts.82.down_proj.weight', 'model.layers.92.mlp.experts.82.gate_proj.weight', 'model.layers.92.mlp.experts.82.up_proj.weight', 'model.layers.92.mlp.experts.83.down_proj.weight', 'model.layers.92.mlp.experts.83.gate_proj.weight', 'model.layers.92.mlp.experts.83.up_proj.weight', 'model.layers.92.mlp.experts.84.down_proj.weight', 'model.layers.92.mlp.experts.84.gate_proj.weight', 'model.layers.92.mlp.experts.84.up_proj.weight', 'model.layers.92.mlp.experts.85.down_proj.weight', 'model.layers.92.mlp.experts.85.gate_proj.weight', 'model.layers.92.mlp.experts.85.up_proj.weight', 'model.layers.92.mlp.experts.86.down_proj.weight', 'model.layers.92.mlp.experts.86.gate_proj.weight', 'model.layers.92.mlp.experts.86.up_proj.weight', 'model.layers.92.mlp.experts.87.down_proj.weight', 'model.layers.92.mlp.experts.87.gate_proj.weight', 'model.layers.92.mlp.experts.87.up_proj.weight', 'model.layers.92.mlp.experts.88.down_proj.weight', 'model.layers.92.mlp.experts.88.gate_proj.weight', 'model.layers.92.mlp.experts.88.up_proj.weight', 'model.layers.92.mlp.experts.89.down_proj.weight', 'model.layers.92.mlp.experts.89.gate_proj.weight', 'model.layers.92.mlp.experts.89.up_proj.weight', 'model.layers.92.mlp.experts.9.down_proj.weight', 'model.layers.92.mlp.experts.9.gate_proj.weight', 'model.layers.92.mlp.experts.9.up_proj.weight', 'model.layers.92.mlp.experts.90.down_proj.weight', 'model.layers.92.mlp.experts.90.gate_proj.weight', 'model.layers.92.mlp.experts.90.up_proj.weight', 'model.layers.92.mlp.experts.91.down_proj.weight', 'model.layers.92.mlp.experts.91.gate_proj.weight', 'model.layers.92.mlp.experts.91.up_proj.weight', 'model.layers.92.mlp.experts.92.down_proj.weight', 'model.layers.92.mlp.experts.92.gate_proj.weight', 'model.layers.92.mlp.experts.92.up_proj.weight', 'model.layers.92.mlp.experts.93.down_proj.weight', 'model.layers.92.mlp.experts.93.gate_proj.weight', 'model.layers.92.mlp.experts.93.up_proj.weight', 'model.layers.92.mlp.experts.94.down_proj.weight', 'model.layers.92.mlp.experts.94.gate_proj.weight', 'model.layers.92.mlp.experts.94.up_proj.weight', 'model.layers.92.mlp.experts.95.down_proj.weight', 'model.layers.92.mlp.experts.95.gate_proj.weight', 'model.layers.92.mlp.experts.95.up_proj.weight', 'model.layers.92.mlp.experts.96.down_proj.weight', 'model.layers.92.mlp.experts.96.gate_proj.weight', 'model.layers.92.mlp.experts.96.up_proj.weight', 'model.layers.92.mlp.experts.97.down_proj.weight', 'model.layers.92.mlp.experts.97.gate_proj.weight', 'model.layers.92.mlp.experts.97.up_proj.weight', 'model.layers.92.mlp.experts.98.down_proj.weight', 'model.layers.92.mlp.experts.98.gate_proj.weight', 'model.layers.92.mlp.experts.98.up_proj.weight', 'model.layers.92.mlp.experts.99.down_proj.weight', 'model.layers.92.mlp.experts.99.gate_proj.weight', 'model.layers.92.mlp.experts.99.up_proj.weight', 'model.layers.92.mlp.gate.e_score_correction_bias', 'model.layers.92.mlp.gate.weight', 'model.layers.92.mlp.shared_experts.down_proj.weight', 'model.layers.92.mlp.shared_experts.gate_proj.weight', 
'model.layers.92.mlp.shared_experts.up_proj.weight', 'model.layers.92.post_attention_layernorm.weight', 'model.layers.92.self_attn.k_norm.weight', 'model.layers.92.self_attn.k_proj.bias', 'model.layers.92.self_attn.k_proj.weight', 'model.layers.92.self_attn.o_proj.weight', 'model.layers.92.self_attn.q_norm.weight', 'model.layers.92.self_attn.q_proj.bias', 'model.layers.92.self_attn.q_proj.weight', 'model.layers.92.self_attn.v_proj.bias', 'model.layers.92.self_attn.v_proj.weight', 'model.layers.92.shared_head.norm.weight']