
Support AutoRound quantization method for Intel GPU #1428

Merged

VincyZhang merged 37 commits into main from penghuic/support_autoaround_gpu on Apr 2, 2024

Conversation

PenghuiCheng (Collaborator) commented:

Type of Change

Support AutoRound quantization for Intel GPU.
No API changes.

Description

Support AutoRound quantization for Intel GPU.

Expected Behavior & Potential Risk

Models can be quantized with the AutoRound method; a minimal usage sketch follows.
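
For readers landing on this PR, here is a minimal sketch of what the user-facing flow might look like. The `AutoRoundConfig` class, its `bits`/`group_size` parameters, and `device_map="xpu"` are assumptions modeled on ITREX's other weight-only quantization configs and on the validation command later in this thread, not a verified excerpt of this PR; the merged config.py and modeling_auto.py are authoritative.

```python
# Hypothetical sketch: AutoRound weight-only quantization on an Intel GPU (XPU)
# via intel-extension-for-transformers. Config class and parameter names are
# assumptions; check the merged config.py / modeling_auto.py for the real API.
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import (
    AutoModelForCausalLM,
    AutoRoundConfig,  # assumed name, added by this PR's config.py changes
)

model_name = "mistralai/Mistral-7B-v0.1"
quant_config = AutoRoundConfig(
    bits=4,          # 4-bit weight-only quantization
    group_size=128,  # one scale per 128-weight group
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="xpu",  # target the Intel GPU
    trust_remote_code=True,
)

inputs = tokenizer("The weather is", return_tensors="pt").to("xpu")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```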

How has this PR been tested?

Tested locally.

Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>

github-actions bot commented Mar 27, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow
Check ID Status Error details
format-scan (pylint) success
format-scan (bandit) success
format-scan (cloc) success
format-scan (cpplint) success


🟢 Optimize Unit Test workflow
Check ID Status Error details
optimize-unit-test-baseline success
optimize-unit-test-PR-test success
Genreate-OptimizeUT-Report success


🟢 NeuralChat Unit Test
Check ID Status Error details
neuralchat-unit-test-baseline success
neuralchat-unit-test-PR-test success
Generate-NeuralChat-Report success


🟢 Engine Unit Test workflow
Check ID Status Error details
engine-unit-test-baseline success
engine-unit-test-PR-test success
Genreate-Engine-Report success


🟢 Chat Bot Test workflow
Check ID Status Error details
call-inference-llama-2-7b-chat-hf / inference test success
call-inference-mpt-7b-chat / inference test success

All of the above checks are required after the changes to intel_extension_for_transformers/transformers/llm/evaluation/lm_eval/evaluator.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, and intel_extension_for_transformers/transformers/utils/config.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds for the next 6 hours. If you have any other questions, contact VincyZhang or XuehaoSun for help.

@a32543254 (Contributor) left a comment:


LGTM

VincyZhang and others added 5 commits March 27, 2024 14:27
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>
@VincyZhang added the WIP label on Mar 29, 2024
PenghuiCheng and others added 14 commits March 29, 2024 16:08
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
VincyZhang and others added 16 commits April 1, 2024 11:25
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Wang, Chang <chang1.wang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Cheng Penghui <penghui.cheng@intel.com>
@VincyZhang merged commit 7084e7f into main on Apr 2, 2024
17 checks passed
@VincyZhang deleted the penghuic/support_autoaround_gpu branch on April 2, 2024 at 05:56
@changwangss (Collaborator) commented:

Intel GPU local validation passed.
ITREX PR: #1428
INC PR: intel/neural-compressor#1710
python run_generation_gpu_woq.py --model /mnt/cached_oses/models/Mistral-7B-v0.1 --woq --woq_algo AutoRound --benchmark --calib_len 32 --calib_iter 1 --max_input_length 32 --nsamples 20
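
For reference, the flags appear to map onto AutoRound's calibration knobs: --woq enables weight-only quantization, --woq_algo AutoRound selects the algorithm, and --calib_len, --calib_iter, and --nsamples bound the calibration sequence length, number of tuning iterations, and number of calibration samples. This reading is inferred from the flag names and AutoRound's usual parameters; run_generation_gpu_woq.py defines the exact semantics.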
