[vLLM] QBits Perf Enhance #1581
⚡ Required checks status: All passing 🟢
Groups summary:
🟢 Format Scan Tests workflow
🟢 Optimize Unit Test workflow
🟢 NeuralChat Unit Test
🟢 Engine Unit Test workflow
🟢 Chat Bot Test workflow
Thank you for your contribution! 💜
intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py
Signed-off-by: Zhenzhong1 <109137058+Zhenzhong1@users.noreply.github.com>
@XuehaoSun Hi, if this
ready for merge.
Type of Change
Description
vLLM perf:

Expected Behavior & Potential Risk
N/A
How has this PR been tested?
Manual profiling.
Dependency Change?
N/A