Describe the bug
When I run test_qwen2_5_omni.py, I get the following error:
Traceback (most recent call last):
File "D:\miniconda3\envs\GPTAQ\Lib\importlib\metadata\__init__.py", line 563, in from_name
return next(cls.discover(name=name))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\StreamingMedia\quantize\GPTAQ\GPTQModel\tests\models\test_qwen2_5_omni.py", line 8, in <module>
from model_test import ModelTest
File "D:\StreamingMedia\quantize\GPTAQ\GPTQModel\tests\models\model_test.py", line 60, in <module>
from gptqmodel import BACKEND, DEBUG_ON, GPTQModel # noqa: E402
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\StreamingMedia\quantize\GPTAQ\GPTQModel\gptqmodel\__init__.py", line 14, in <module>
patch_triton_autotuner()
File "D:\StreamingMedia\quantize\GPTAQ\GPTQModel\gptqmodel\utils\nogil_patcher.py", line 46, in patch_triton_autotuner
triton_version_str = version("triton")
^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\importlib\metadata\__init__.py", line 1009, in version
return distribution(distribution_name).version
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\importlib\metadata\__init__.py", line 982, in distribution
return Distribution.from_name(distribution_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\importlib\metadata\__init__.py", line 565, in from_name
raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for triton
Triton is installed as a dependency when installing torch 2.8.0+xpu via:
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 numpy==1.26.3 --index-url https://download.pytorch.org/whl/xpu
However, pip list then shows pytorch-triton-xpu 3.4.0 and no package named triton, which is why version("triton") raises the error above.
As a workaround I replaced this line in nogil_patcher.py:

triton_version_str = version("triton")

with:

triton_version_str = version("pytorch-triton-xpu")
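A more defensive lookup could try the known distribution names and only fall back to the module attribute afterwards. A minimal sketch, assuming the candidate names below (the fallback import is my suggestion, not the project's current code):

from importlib.metadata import PackageNotFoundError, version

def detect_triton_version() -> str | None:
    # Triton ships under different distribution names depending on the wheel;
    # the torch 2.8.0+xpu build pulls in "pytorch-triton-xpu" instead of "triton".
    for dist_name in ("triton", "pytorch-triton-xpu", "pytorch-triton"):
        try:
            return version(dist_name)
        except PackageNotFoundError:
            continue
    try:
        import triton  # last resort: ask the module itself
    except ImportError:
        return None
    return getattr(triton, "__version__", None)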
With that patch applied, I hit a new error:
DEBUG Device-SMI initialisation failed for `xpu:0`: `xpu-smi` is not installed. Please follow the instructions at https://github.com/intel/xpumanager/blob/master/doc/smi_install_guide.md `
WARN Calibration dataset size should be more than 256. Current: 20.
error occurred, command_result:
LoadPercentage
------------
error occurred, command_result:
LoadPercentage
------------
error occurred, command_result:
LoadPercentage
------------
error occurred, command_result:
LoadPercentage
------------
WARN Disk IO speed estimation failed: [Errno 13] Permission denied: 'C:\\Users\\jh_ni\\AppData\\Local\\Temp\\tmpa01ai1zf'
WARN Disk subsystem write throughput detected at 0.0 MB/s; quantization may be slowed by IO.
error occurred, command_result:
LoadPercentage
error occurred, command_result:
LoadPercentage
------------
...layer inputs from 20 calibration batches
inputs (Pre Qwen2_5OmniDecoderLayer) Batch 0/20 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░| 0:00:06 / 0:02:00 [0/20] 0.0%
inputs (Pre Qwen2_5OmniDecoderLayer) Batch 0/20 ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░| 0:00:07 / 0:02:20 [0/20] 0.0%
Traceback (most recent call last):
File "D:\hou\quantize\GPTAQ\GPTQModel\tests\models\test_qwen2_5_omni.py", line 95, in <module>
tester.test_qwen2_5_omni()
File "D:\hou\quantize\GPTAQ\GPTQModel\tests\models\test_qwen2_5_omni.py", line 27, in test_qwen2_5_omni
model, tokenizer, processor = self.quantModel(self.NATIVE_MODEL_ID, trust_remote_code=self.TRUST_REMOTE_CODE,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\hou\quantize\GPTAQ\GPTQModel\tests\models\model_test.py", line 900, in quantModel
model.quantize(
File "D:\hou\quantize\GPTAQ\GPTQModel\gptqmodel\models\base.py", line 671, in quantize
result = module_looper.loop(
^^^^^^^^^^^^^^^^^^^
File "D:\hou\quantize\GPTAQ\GPTQModel\gptqmodel\looper\module_looper.py", line 1075, in loop
return self._loop_impl(fail_safe=fail_safe, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\hou\quantize\GPTAQ\GPTQModel\gptqmodel\looper\module_looper.py", line 1120, in _loop_impl
input_cache = self.cache_inputs(layers=layers,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\hou\quantize\GPTAQ\GPTQModel\gptqmodel\looper\module_looper.py", line 1067, in cache_inputs
return capture_stage.cache_inputs(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\hou\quantize\GPTAQ\GPTQModel\gptqmodel\looper\stage_inputs_capture.py", line 189, in cache_inputs
self.gptq_model.model.generate(
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 3884, in generate
thinker_result = self.thinker.generate(input_ids=input_ids, **thinker_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\generation\utils.py", line 2564, in generate
result = decoding_method(
^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\generation\utils.py", line 2784, in _sample
outputs = self(**model_inputs, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 1911, in forward
image_embeds = self.get_image_features(pixel_values, image_grid_thw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 1717, in get_image_features
image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 1219, in forward
hidden_states = blk(
^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\modeling_layers.py", line 94, in __call__
return super().__call__(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 1007, in forward
hidden_states = hidden_states + self.attn(
^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 953, in forward
splits = [
^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\transformers\models\qwen2_5_omni\modeling_qwen2_5_omni.py", line 954, in <listcomp>
torch.split(tensor, lengths.tolist(), dim=2) for tensor in (query_states, key_states, value_states)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\functional.py", line 222, in split
return tensor.split(split_size_or_sections, dim)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\envs\GPTAQ\Lib\site-packages\torch\_tensor.py", line 1052, in split
return torch._VF.split_with_sizes(self, split_size, dim)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: split_with_sizes expects split_sizes have only non-negative entries, but got split_sizes=[-1777792000, 1116, 64, 64, 64, 48, 64, 64, 64, 64, 64, 48, 64, 64, 64, 64, 64, 48, 64, 64, 64, 64, 64, 48, 32, 32, 32, 32, 32, 24]
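The huge negative first entry (-1777792000) suggests the lengths tensor computed upstream contains garbage on XPU (e.g. an overflow or a bad device-to-host copy) rather than a problem in torch.split itself, since split rejects negative sizes the same way on every backend. A minimal standalone repro of the same RuntimeError, with an illustrative shape:

import torch

x = torch.randn(1, 16, 1024, 64)  # stand-in for query_states
# The sizes sum to the dim-2 length, but the single negative entry is
# enough to trigger the identical error on any device.
torch.split(x, [-5, 1029], dim=2)
# RuntimeError: split_with_sizes expects split_sizes have only non-negative entries

Asserting (lengths >= 0).all() right before the split (after moving lengths to CPU) could confirm where the bad value first appears.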
GPU Info
Intel(R) Arc(TM) 140T GPU (16GB)
driver: 32.0.101.8132
Software Info
Windows 11 + Python 3.11