
MTL Linux Qwen-VL: LLVM ERROR: GenXCisaBuilder failed #11086

Open
lei-sun-intel opened this issue May 21, 2024 · 1 comment
@lei-sun-intel
Followed the guide to set up Qwen-VL.
ipex-llm version: 2.1.0b20240515, Python version: 3.9
https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl/chat.py

download model from https://hf-mirror.com/Qwen/Qwen-VL-Chat-Int4/tree/main

(nb_dev) intel@intel-Meteor-Lake-Client-Platform:~/lei/ipex-llm/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl$ python chat.py
2024-05-16 00:15:59,668 - INFO - Note: NumExpr detected 22 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-05-16 00:15:59,668 - INFO - NumExpr defaulting to 8 threads.
2024-05-16 00:16:00,056 - WARNING - CUDA extension not installed.
2024-05-16 00:16:00,057 - WARNING - CUDA extension not installed.
2024-05-16 00:16:02,304 - INFO - intel_extension_for_pytorch auto imported
Using disable_exllama is deprecated and will be removed in version 4.37. Use use_exllama instead and specify the version with exllama_config.The value of use_exllama will be overwritten by disable_exllama passed in GPTQConfig or stored in your config file.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 6.02it/s]
-------------------- Session 1 --------------------
Please input a picture: dog_cat.jpg
Please enter the text: what is the picture?
error: LLVM ERROR: GenXCisaBuilder failed for: < %.esimd133 = tail call <128 x float> @llvm.genx.dpas2.v128f32.v128f32.v128i32.v64i32(<128 x float> %.sroa.0196.027, <128 x i32> %.decomp.0, <64 x i32> %.esimd132, i32 10, i32 10, i32 8, i32 8, i32 1, i32 1)>: Intrinsic is not supported by platform

LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 155H]
Registry and code: 13 MB
Command: python chat.py
Uptime: 72.320985 s

leonardozcm commented May 28, 2024

We need to do some adaptation work for this GPTQ-quantized Qwen-VL model (https://github.com/intel-analytics/ipex-llm/blob/main/python/llm/example/GPU/PyTorch-Models/Model/qwen-vl/chat.py). If your goal is to run Qwen-VL on MTL Linux, we recommend using save_low_bit on another machine with sufficient memory to save Qwen-VL-Chat as an int4 model in ipex-llm format, and then loading it on MTL Linux:

First, check your Linux driver and Level Zero versions; refer to https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#linux

Then SAVE Qwen-VL:

    # save the model in ipex-llm low-bit (int4) format
    import torch
    from ipex_llm.transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_path,
                                                 load_in_4bit=True,
                                                 trust_remote_code=True,
                                                 modules_to_not_convert=['c_fc', 'out_proj'],
                                                 torch_dtype=torch.float32)
    model.save_low_bit(model_path + "-ipex-int4")

LOAD:

    # load the low-bit model saved above (path must match the save step)
    model = AutoModelForCausalLM.load_low_bit(model_path + "-ipex-int4",
                                              trust_remote_code=True,
                                              modules_to_not_convert=['c_fc', 'out_proj'],
                                              torch_dtype=torch.float32)
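Once loaded, the model can be used the same way as in the original chat.py example. A minimal inference sketch, assuming Qwen-VL-Chat's remote-code API (`tokenizer.from_list_format` and `model.chat`) and an Intel GPU exposed as the `xpu` device; the image file name and `model_path` are placeholders, and this requires the downloaded weights to actually run:

```python
# Hypothetical usage sketch: run one chat turn with the saved int4 model.
# Assumes Qwen-VL-Chat's remote-code API and an 'xpu' device; requires
# the model weights locally, so it is not verified here.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "Qwen/Qwen-VL-Chat"  # adjust to your local checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.load_low_bit(model_path + "-ipex-int4",
                                          trust_remote_code=True,
                                          modules_to_not_convert=['c_fc', 'out_proj'],
                                          torch_dtype=torch.float32)
model = model.to('xpu')  # move to the Intel GPU

# Qwen-VL's tokenizer builds a multimodal prompt from a list of segments.
query = tokenizer.from_list_format([
    {'image': 'dog_cat.jpg'},          # picture from the session above
    {'text': 'what is the picture?'},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```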

output:

-------------------- Session 1 --------------------
 Please input a picture: test.jpg
 Please enter the text: what is it
---------- Response ----------
-------------------- Session 1 --------------------
 Please input a picture: pic.jpg
 Please enter the text: what is it?
---------- Response ----------
This is an anime scene. Against a blue sky, a black boy with wings on his back stands with his arms raised. Next to him is a white boy with a black cardigan and a tie.  
