(nb_dev) intel@intel-Meteor-Lake-Client-Platform:~/lei/ipex-llm/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl$ python chat.py
2024-05-16 00:15:59,668 - INFO - Note: NumExpr detected 22 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-05-16 00:15:59,668 - INFO - NumExpr defaulting to 8 threads.
2024-05-16 00:16:00,056 - WARNING - CUDA extension not installed.
2024-05-16 00:16:00,057 - WARNING - CUDA extension not installed.
2024-05-16 00:16:02,304 - INFO - intel_extension_for_pytorch auto imported
Using disable_exllama is deprecated and will be removed in version 4.37. Use use_exllama instead and specify the version with exllama_config.The value of use_exllama will be overwritten by disable_exllama passed in GPTQConfig or stored in your config file.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 6.02it/s]
-------------------- Session 1 --------------------
Please input a picture: dog_cat.jpg
Please enter the text: what is the picture?
error: LLVM ERROR: GenXCisaBuilder failed for: < %.esimd133 = tail call <128 x float> @llvm.genx.dpas2.v128f32.v128f32.v128i32.v64i32(<128 x float> %.sroa.0196.027, <128 x i32> %.decomp.0, <64 x i32> %.esimd132, i32 10, i32 10, i32 8, i32 8, i32 1, i32 1)>: Intrinsic is not supported by platform
# save low bit
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True,
                                             modules_to_not_convert=['c_fc', 'out_proj'],
                                             torch_dtype=torch.float32)
model.save_low_bit(model_path + "-ipex-int4")

# load low bit (same directory the save step wrote to)
model = AutoModelForCausalLM.load_low_bit(model_path + "-ipex-int4",
                                          trust_remote_code=True,
                                          modules_to_not_convert=['c_fc', 'out_proj'],
                                          torch_dtype=torch.float32)
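One pitfall with this approach: `save_low_bit()` and `load_low_bit()` must point at the same directory, so it is worth deriving the path once and reusing it. A minimal sketch (the helper name and the `-ipex-int4` suffix are illustrative, not part of the ipex-llm API):

```python
def low_bit_dir(model_path: str, suffix: str = "-ipex-int4") -> str:
    """Derive the low-bit checkpoint directory from the original model path.

    Building the path in one place keeps save_low_bit() and load_low_bit()
    pointing at the same directory; saving to "...-ipex-int4" but loading
    "...-int4" would make load_low_bit fail to find the checkpoint.
    """
    return model_path.rstrip("/") + suffix

path = low_bit_dir("Qwen/Qwen-VL-Chat-Int4")
# model.save_low_bit(path)    # after AutoModelForCausalLM.from_pretrained(...)
# model = AutoModelForCausalLM.load_low_bit(path, trust_remote_code=True)
```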
Output:
-------------------- Session 1 --------------------
Please input a picture: test.jpg
Please enter the text: what is it
---------- Response ----------
-------------------- Session 1 --------------------
Please input a picture: pic.jpg
Please enter the text: what is it?
---------- Response ----------
This is an anime scene. Against a blue sky, a black boy with wings on his back stands with his arms raised. Next to him is a white boy with a black cardigan and a tie.
Followed the guide to set up Qwen-VL:
ipex-llm version: 2.1.0b20240515, Python version: 3.9
https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl/chat.py
Model downloaded from https://hf-mirror.com/Qwen/Qwen-VL-Chat-Int4/tree/main
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 155H]
Registry and code: 13 MB
Command: python chat.py
Uptime: 72.320985 s