GGUF model won't save out (tried multiple fixes) #3537
Replies: 3 comments 13 replies
-
I've been having this issue in Colab as well. Training finishes and produces output fine, but the GGUF file doesn't show up. It says something along the lines of "model not found".
-
@nolan-josh apologies for the issues you're having.
-
I checked memory usage during the merge. It does not grow. For example, Llama3_(8B)_Ollama with 2 NVIDIA GeForce RTX 4070 (2866MiB and
AttributeError: type object 'MODEL_ARCH' has no attribute 'RND1'
-
Hi everyone, I'm very new to Unsloth.
I've taken this notebook and edited the dataset to read in my custom one. That works fine, and inference works too,
BUT when I go to save out I get the following error:
`Unsloth: Preparing converter script...
INFO:unsloth_zoo.llama_cpp: Unsloth: Identifying llama.cpp gguf supported architectures...
ERROR:unsloth_zoo.llama_cpp: Unsloth: Error during download or introspection of original script: Failed to execute module convert_hf_to_gguf_original_gguf_yaxzp8q5 from /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py
Traceback (most recent call last):
File "/opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py", line 490, in _load_module_from_path
spec.loader.exec_module(module)
File "", line 940, in exec_module
File "", line 241, in _call_with_frames_removed
File "/workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py", line 4157, in
class Qwen3VLTextModel(Qwen3Model):
File "/workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py", line 4158, in Qwen3VLTextModel
model_arch = gguf.MODEL_ARCH.QWEN3VL
^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/enum.py", line 786, in getattr
raise AttributeError(name) from None
AttributeError: QWEN3VL
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py", line 535, in _download_convert_hf_to_gguf
module = _load_module_from_path(temp_original_file_path, original_module_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py", line 494, in _load_module_from_path
raise ImportError(f"Failed to execute module {module_name} from {filepath}") from e
ImportError: Failed to execute module convert_hf_to_gguf_original_gguf_yaxzp8q5 from /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py
AttributeError Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py:490, in _load_module_from_path(filepath, module_name)
489 try:
--> 490 spec.loader.exec_module(module)
491 except Exception as e:
492 # Clean up registry if exec fails
File :940, in exec_module(self, module)
File :241, in _call_with_frames_removed(f, *args, **kwds)
File /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py:4157
4153 return super().modify_tensors(data_torch, name, bid)
4156 @ModelBase.register("Qwen3VLForConditionalGeneration")
-> 4157 class Qwen3VLTextModel(Qwen3Model):
4158 model_arch = gguf.MODEL_ARCH.QWEN3VL
File /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py:4158, in Qwen3VLTextModel()
4156 @ModelBase.register("Qwen3VLForConditionalGeneration")
4157 class Qwen3VLTextModel(Qwen3Model):
-> 4158 model_arch = gguf.MODEL_ARCH.QWEN3VL
4160 def set_gguf_parameters(self):
File /opt/conda/lib/python3.11/enum.py:786, in EnumType.getattr(cls, name)
785 except KeyError:
--> 786 raise AttributeError(name) from None
AttributeError: QWEN3VL
The above exception was the direct cause of the following exception:
ImportError Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py:535, in _download_convert_hf_to_gguf(name)
534 try:
--> 535 module = _load_module_from_path(temp_original_file_path, original_module_name)
536 finally:
537 # Restore environment
File /opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py:494, in _load_module_from_path(filepath, module_name)
493 del sys.modules[module_name]
--> 494 raise ImportError(f"Failed to execute module {module_name} from {filepath}") from e
495 return module
ImportError: Failed to execute module convert_hf_to_gguf_original_gguf_yaxzp8q5 from /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/unsloth/save.py:1835, in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, first_conversion, push_to_hub, token, private, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
1834 try:
-> 1835 all_file_locations, want_full_precision, is_vlm_update = save_to_gguf(
1836 model_name=model_name,
1837 model_type=model_type,
1838 model_dtype=model_dtype,
1839 is_sentencepiece=False,
1840 model_directory=save_directory,
1841 quantization_method=quantization_methods,
1842 first_conversion=first_conversion,
1843 is_vlm=is_vlm, # Pass VLM flag
1844 is_gpt_oss = is_gpt_oss, # Pass gpt_oss Flag
1845 )
1846 except Exception as e:
File /opt/conda/lib/python3.11/site-packages/unsloth/save.py:1093, in save_to_gguf(model_name, model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, is_vlm, is_gpt_oss)
1092 with use_local_gguf():
-> 1093 converter_path, supported_text_archs, supported_vision_archs = _download_convert_hf_to_gguf()
1095 # Step 3: Initial GGUF conversion
File /opt/conda/lib/python3.11/site-packages/unsloth_zoo/llama_cpp.py:598, in _download_convert_hf_to_gguf(name)
597 except OSError as remove_error: logger.warning(f"Could not remove temp file {temp_original_file_path}: {remove_error}")
--> 598 raise RuntimeError(f"Failed during download/introspection of original script: {e}") from e
599 finally:
RuntimeError: Failed during download/introspection of original script: Failed to execute module convert_hf_to_gguf_original_gguf_yaxzp8q5 from /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
Cell In[17], line 9
6 if False: model.push_to_hub_gguf("hf/model", tokenizer, token = "")
8 # Save to 16bit GGUF
----> 9 if True: model.save_pretrained_gguf("model", tokenizer, quantization_method = "f16")
10 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "")
11 print("Model Saved")
File /opt/conda/lib/python3.11/site-packages/unsloth/save.py:1855, in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, first_conversion, push_to_hub, token, private, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
1848 raise RuntimeError(
1849 f"Unsloth: GGUF conversion failed in Kaggle environment.\n"
1850 f"This is likely due to the 20GB disk space limit.\n"
1851 f"Try saving to /tmp directory or use a smaller model.\n"
1852 f"Error: {e}"
1853 )
1854 else:
-> 1855 raise RuntimeError(f"Unsloth: GGUF conversion failed: {e}")
1857 # Step 9: Create Ollama modelfile
1858 modelfile_location = None
RuntimeError: Unsloth: GGUF conversion failed: Failed during download/introspection of original script: Failed to execute module convert_hf_to_gguf_original_gguf_yaxzp8q5 from /workspace/unsloth-notebooks/llama.cpp/original_gguf_yaxzp8q5.py`
I'm using the Docker image on Windows 11 with WSL. I have edited my Docker settings in Docker Engine to set defaultKeepStorage to 100GB (it was 20GB before). I'm really not sure why this is happening and I'm very stuck. I've tried other fixes from this discussions page but none of them work. I would really appreciate any help with this; it's the last hurdle for me.
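A possible diagnostic for the traceback above, offered as a sketch rather than a confirmed fix: the failing line is `gguf.MODEL_ARCH.QWEN3VL`, which suggests (this is an assumption, nothing in the thread confirms it) that the installed `gguf` Python package is older than the llama.cpp converter script being loaded. The helper below, `missing_archs`, is a hypothetical name and not part of Unsloth or llama.cpp; it only reports which architecture names the installed `MODEL_ARCH` enum lacks.

```python
import importlib


def missing_archs(required, module_name="gguf"):
    """Return the names from `required` that `<module>.MODEL_ARCH` does not define."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Package not installed at all: treat every architecture as missing.
        return list(required)
    arch_enum = getattr(module, "MODEL_ARCH", None)
    # Enum classes expose their member names through __members__.
    members = getattr(arch_enum, "__members__", {})
    return [name for name in required if name not in members]


if __name__ == "__main__":
    # QWEN3VL comes from the traceback above, RND1 from a reply in this thread.
    print("Missing from installed gguf:", missing_archs(["QWEN3VL", "RND1"]))
```

If either name is printed, the installed `gguf` package and the converter script are probably out of sync. Whether updating `gguf` (for example from the `gguf-py` directory of the llama.cpp checkout) actually resolves the save is untested here.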