phi-3-mini support #10913

Closed
aoke79 opened this issue Apr 29, 2024 · 2 comments

aoke79 commented Apr 29, 2024

Hi,
I've tried phi-3-mini with the phi-2 example code and ran into the errors below. Could you please take a look?

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>python ./generate.py --repo-id-or-model-path "C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What is AI?"
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-04-29 16:28:26,539 - INFO - intel_extension_for_pytorch auto imported
2024-04-29 16:28:26,695 - WARNING - flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
2024-04-29 16:28:26,695 - WARNING - Current flash-attenton does not support window_size. Either upgrade or use attn_implementation='eager'.
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 14.75it/s]
2024-04-29 16:28:26,869 - INFO - Converting the current model to sym_int4 format......
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-04-29 16:32:54,511 - WARNING - You are not running the flash-attention implementation, expect numerical differences.
Traceback (most recent call last):
File "C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed\generate.py", line 64, in
output = model.generate(input_ids,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\lookup.py", line 86, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\speculative.py", line 103, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\transformers\generation\utils.py", line 1474, in generate
return self.greedy_search(
^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\transformers\generation\utils.py", line 2375, in greedy_search
next_tokens = next_tokens * unfinished_sequences + pad_token_id * (1 - unfinished_sequences)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'Tensor' and 'list'

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What is AI?"
'C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What' is not recognized as an internal or external command,
operable program or batch file.

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>
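For context, the TypeError from greedy_search suggests that pad_token_id reached transformers as a Python list rather than an int (Phi-3's generation config exposes eos_token_id as a list, and transformers falls back to it when no pad token is set). A minimal workaround sketch, assuming model, tokenizer, and input_ids are prepared as in the example's generate.py (this is only an illustration, not the official ipex-llm fix):

# Workaround sketch: make sure pad_token_id reaches transformers as a plain int.
# Assumes `model`, `tokenizer`, and `input_ids` are set up as in generate.py.
eos = tokenizer.eos_token_id
pad_id = tokenizer.pad_token_id
if pad_id is None:
    # Phi-3's config may expose eos_token_id as a list; use its first entry.
    pad_id = eos[0] if isinstance(eos, list) else eos

output = model.generate(input_ids,
                        max_new_tokens=32,
                        pad_token_id=pad_id)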

@JinBridger
Member

Hi aoke79,

We recently added a Phi-3 example for both CPU and GPU. Could you please try it and see if it works?

Here's the link to the Phi-3 GPU example: link. Please feel free to ask if you run into any problems. :)
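For reference, a minimal sketch of how a model can be loaded with ipex-llm low-bit optimizations on an Intel GPU; the path, prompt, and exact arguments here are illustrative and may differ from the official Phi-3 example:

# Sketch of loading Phi-3 with ipex-llm 4-bit (sym_int4) optimizations on an Intel GPU.
# Argument names and the model path are illustrative, not copied from the official example.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "C:\\AIGC\\hf\\Phi-3-mini-128k-instruct"  # local checkpoint path
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.half().to("xpu")  # move the converted model to the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to("xpu")
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))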

@Edward-Lin

It works, thanks a lot! There are some warnings like the ones below; if that's OK, please help close this ticket. Thanks a lot.

2024-05-21 10:29:49,985 - INFO - intel_extension_for_pytorch auto imported
2024-05-21 10:29:50,037 - WARNING - flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
2024-05-21 10:29:50,037 - WARNING - Current flash-attention does not support window_size. Either upgrade or use attn_implementation='eager'.
Loading checkpoint shards: 100%|███████████████████████████████████████████| 2/2 [00:00<00:00, 15.89it/s]
2024-05-21 10:29:50,206 - INFO - Converting the current model to sym_int4 format......
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\nn\init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\models\phi3.py:93: UserWarning: You are not running the flash-attention implementation, expect numerical differences.
warnings.warn("You are not running the flash-attention implementation, "
