phi-3-mini support #10913

Closed
aoke79 opened this issue Apr 29, 2024 · 2 comments

aoke79 commented Apr 29, 2024

Hi,
I've tried phi-3-mini with the phi-2 example code and ran into the errors below. Could you please take a look?

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>python ./generate.py --repo-id-or-model-path "C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What is AI?"
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
2024-04-29 16:28:26,539 - INFO - intel_extension_for_pytorch auto imported
2024-04-29 16:28:26,695 - WARNING - flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
2024-04-29 16:28:26,695 - WARNING - Current flash-attenton does not support window_size. Either upgrade or use attn_implementation='eager'.
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:00<00:00, 14.75it/s]
2024-04-29 16:28:26,869 - INFO - Converting the current model to sym_int4 format......
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2024-04-29 16:32:54,511 - WARNING - You are not running the flash-attention implementation, expect numerical differences.
Traceback (most recent call last):
File "C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed\generate.py", line 64, in
output = model.generate(input_ids,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\lookup.py", line 86, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\speculative.py", line 103, in generate
return original_generate(self,
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\transformers\generation\utils.py", line 1474, in generate
return self.greedy_search(
^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\transformers\generation\utils.py", line 2375, in greedy_search
next_tokens = next_tokens * unfinished_sequences + pad_token_id * (1 - unfinished_sequences)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'Tensor' and 'list'

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What is AI?"
'C:\AIGC\hf\Phi-3-mini-128k-instruct" --prompt "What' is not recognized as an internal or external command,
operable program or batch file.

(env_p311) C:\AIGC\llama\ipex-llm\python\llm\example\GPU\HF-Transformers-AutoModels\Model\phi-3-ed>
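For context, the TypeError from greedy_search suggests that pad_token_id reached transformers as a Python list rather than an int (Phi-3's generation config exposes eos_token_id as a list, and transformers falls back to it when no pad token is set). A minimal workaround sketch, assuming model, tokenizer, and input_ids are prepared as in the example's generate.py (this is only an illustration, not the official ipex-llm fix):

# Workaround sketch: make sure pad_token_id reaches transformers as a plain int.
# Assumes `model`, `tokenizer`, and `input_ids` are set up as in generate.py.
eos = tokenizer.eos_token_id
pad_id = tokenizer.pad_token_id
if pad_id is None:
    # Phi-3's config may expose eos_token_id as a list; use its first entry.
    pad_id = eos[0] if isinstance(eos, list) else eos

output = model.generate(input_ids,
                        max_new_tokens=32,
                        pad_token_id=pad_id)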

@JinBridger
Member

Hi aoke79,

We recently added a Phi-3 example for both CPU and GPU. Could you please try it and see if it works?

Here's the link to the Phi-3 GPU example: link. Please feel free to ask if you run into any problems. :)
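For reference, a minimal sketch of how a model can be loaded with ipex-llm low-bit optimizations on an Intel GPU; the path, prompt, and exact arguments here are illustrative and may differ from the official Phi-3 example:

# Sketch of loading Phi-3 with ipex-llm 4-bit (sym_int4) optimizations on an Intel GPU.
# Argument names and the model path are illustrative, not copied from the official example.
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM

model_path = "C:\\AIGC\\hf\\Phi-3-mini-128k-instruct"  # local checkpoint path
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.half().to("xpu")  # move the converted model to the Intel GPU
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

with torch.inference_mode():
    input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to("xpu")
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))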

@Edward-Lin

It works, thanks a lot! There are some warnings like the ones below; if that's OK, please help close this ticket. Thanks a lot.

2024-05-21 10:29:49,985 - INFO - intel_extension_for_pytorch auto imported
2024-05-21 10:29:50,037 - WARNING - flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
2024-05-21 10:29:50,037 - WARNING - Current flash-attention does not support window_size. Either upgrade or use attn_implementation='eager'.
Loading checkpoint shards: 100%|███████████████████████████████████████████| 2/2 [00:00<00:00, 15.89it/s]
2024-05-21 10:29:50,206 - INFO - Converting the current model to sym_int4 format......
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\torch\nn\init.py:412: UserWarning: Initializing zero-element tensors is a no-op
warnings.warn("Initializing zero-element tensors is a no-op")
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
C:\ProgramData\anaconda3\envs\env_p311\Lib\site-packages\ipex_llm\transformers\models\phi3.py:93: UserWarning: You are not running the flash-attention implementation, expect numerical differences.
warnings.warn("You are not running the flash-attention implementation, "
