Issues: intel-analytics/ipex-llm
#11033: Default values of max_generated_tokens, top_k, top_p, and temperature? (opened May 15, 2024 by JamieVC)
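For the parameter question above, a minimal sketch of how these sampling parameters are typically passed through the Hugging Face `generate` API that ipex-llm models expose. The model path and all parameter values below are illustrative assumptions, not ipex-llm defaults.

```python
# Sketch: passing sampling parameters explicitly instead of relying on
# library defaults. The model path and all values are assumptions.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # assumed model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, load_in_4bit=True)

inputs = tokenizer("What is ipex-llm?", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=128,   # upper bound on the number of generated tokens
    do_sample=True,       # sampling must be on for top_k/top_p/temperature to apply
    top_k=50,             # consider only the 50 most likely next tokens
    top_p=0.9,            # nucleus sampling threshold
    temperature=0.7,      # values < 1.0 sharpen the distribution
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```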
#11026 [user issue]: Transform a string into llama2-specific and llama3-specific input? (opened May 15, 2024 by JamieVC)
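For context on the question above, a sketch of the published Llama-2 and Llama-3 chat prompt formats; the helper function names are assumptions for this example. In practice, `tokenizer.apply_chat_template(...)` from Hugging Face transformers produces the correct format per model and is usually the safer route.

```python
# Sketch of the Llama-2 and Llama-3 chat prompt formats.
# The helper function names are illustrative assumptions.
def to_llama2_prompt(user_msg: str, system_msg: str = "You are a helpful assistant.") -> str:
    # Llama-2 chat wraps the turn in [INST] ... [/INST] with an optional <<SYS>> block.
    return f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"

def to_llama3_prompt(user_msg: str, system_msg: str = "You are a helpful assistant.") -> str:
    # Llama-3 uses special header tokens around each role and ends with an
    # open assistant header so the model continues from there.
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_msg}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(to_llama2_prompt("What is ipex-llm?"))
print(to_llama3_prompt("What is ipex-llm?"))
```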
#11019 [user issue]: How to switch between multiple LLM models loaded in a Streamlit page? (opened May 14, 2024 by JamieVC)
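A common pattern for the question above (a sketch under assumed model paths, not the reporter's code) is to cache each loaded model with `st.cache_resource` so that switching models in the UI does not reload weights on every rerun:

```python
# Sketch: caching several models in Streamlit so switching between them
# does not reload weights on every rerun. Model paths are assumptions.
import streamlit as st
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

MODEL_PATHS = {
    "llama2-7b": "meta-llama/Llama-2-7b-chat-hf",
    "llama3-8b": "meta-llama/Meta-Llama-3-8B-Instruct",
}

@st.cache_resource  # loads each model at most once per process
def load_model(name: str):
    path = MODEL_PATHS[name]
    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(path, load_in_4bit=True)
    return tokenizer, model

choice = st.selectbox("Model", list(MODEL_PATHS))
tokenizer, model = load_model(choice)

prompt = st.text_input("Prompt")
if prompt:
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)
    st.write(tokenizer.decode(output[0], skip_special_tokens=True))
```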
#11006: MTL Windows Qwen-VL: AttributeError: 'QWenAttention' object has no attribute 'position_ids' (opened May 13, 2024 by juan-OY)
#10999: llama3-8B causes an MTL iGPU runtime error when ipex-llm runs inference (opened May 13, 2024 by zcwang)
#10994: ipex-llm version 0510 regresses relative to 0430, especially for batch sizes 16/32 and 8k input (opened May 12, 2024 by Fred-cell)
#10992: all-in-one tool for chatglm3-6b: second-token latency at batch size 1 is higher than at batch size 2 (opened May 12, 2024 by Fred-cell)
#10949 [user issue]: Main memory keeps declining with ipex-llm during local LLM inference on Intel Arc GPU (opened May 7, 2024 by sunyijin)
#10942 [user issue]: SpeechT5 on XPU (Intel Arc GPU 770) takes 8 seconds, while on CPU it takes 3 seconds (opened May 7, 2024 by shailesh837)
#10927 [user issue]: Unable to invoke the torch installed via the setup tutorial (opened May 5, 2024 by index1001012)
#10926 [user issue]: Second-token latency issue for llama3-8B-Instruct with int4 in the all-in-one tool (opened May 5, 2024 by Fred-cell)
#10924 [user issue]: Performance drop for neural-chat 7b with the new ipex-llm repo (2.5.0b20240425) vLLM serving (opened May 3, 2024 by Vasud-ha)
#10914 [user issue]: IndexError: list index out of range when ipex_fp16_gpu test_api is used in all-in-one (opened Apr 29, 2024 by Kpeacef)
#10897 [user issue]: Improve first-token latency for multi-GPU setups (via flash attention or an alternative) (opened Apr 26, 2024 by moutainriver)