
Migrate SQ and WOQ to INC 3.x API. #1606

Merged
merged 38 commits into from
Jul 11, 2024

Conversation

changwangss
Collaborator

@changwangss changwangss commented Jun 12, 2024

Type of Change

Migrate the SQ and WOQ features to the INC 3.x API.
CI changes:
NeuralChat and the engine are no longer updated, so they have been removed from CI.
WOQ changes:

  1. Remove weight dtype fp4_e2m1_bnb.
  2. The nf4 and fp4 weight dtypes now support compressed models from INC.
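For background on the nf4 weight dtype mentioned above, here is a minimal, pure-Python sketch of block-wise 4-bit NormalFloat quantization, assuming the standard 16 NF4 levels. It is illustrative only and is not INC's implementation:

```python
# The 16 NF4 (4-bit NormalFloat) levels, as popularized by QLoRA/bitsandbytes.
NF4_LEVELS = [
    -1.0, -0.6961928010, -0.5250730515, -0.3949174881,
    -0.2844413817, -0.1847734302, -0.0910500363, 0.0,
    0.0795802996, 0.1609302014, 0.2461123019, 0.3379152417,
    0.4407098293, 0.5626170039, 0.7229568362, 1.0,
]

def quantize_nf4(weights):
    """Map each weight to the index of the nearest NF4 level after absmax scaling."""
    scale = max(abs(w) for w in weights) or 1.0
    idxs = [
        min(range(16), key=lambda i: abs(w / scale - NF4_LEVELS[i]))
        for w in weights
    ]
    return idxs, scale

def dequantize_nf4(idxs, scale):
    """Recover approximate weights from 4-bit indices and the per-block scale."""
    return [NF4_LEVELS[i] * scale for i in idxs]
```

In practice the indices are packed two per byte and a scale is stored per block (e.g. per 32 or 64 weights); a "compressed model from INC" ships weights already in such a packed layout.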

SQ changes:
Use the INC 3.x API.
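For context on the SQ (SmoothQuant) part: the core idea is to migrate activation outliers into the weights with a per-input-channel scale, leaving the matmul result unchanged. A minimal pure-Python sketch of that identity, assuming the usual alpha-balanced scale formula (this is not INC's API, just the underlying math):

```python
# SmoothQuant-style smoothing: for input channel j,
#   s_j = max|X_j|**alpha / max|W_j|**(1 - alpha),
# then X' = X / s and W' = s * W satisfy X' @ W' == X @ W, while X' has
# flattened outliers and is easier to quantize to int8.

def smooth_scales(x_absmax, w_absmax, alpha=0.5):
    """Per-channel smoothing factors from activation/weight abs-max statistics."""
    return [(xa ** alpha) / (wa ** (1 - alpha)) for xa, wa in zip(x_absmax, w_absmax)]

def apply_smoothing(x_row, w_rows, scales):
    """Divide activations and multiply weight rows by the per-channel scales."""
    x_s = [x / s for x, s in zip(x_row, scales)]
    w_s = [[w * s for w in row] for row, s in zip(w_rows, scales)]
    return x_s, w_s

def matvec(x_row, w_rows):
    """Compute x (1 x k) times W (k x n) in pure Python."""
    n = len(w_rows[0])
    return [
        sum(x_row[k] * w_rows[k][j] for k in range(len(x_row)))
        for j in range(n)
    ]
```

The migration in this PR swaps the framework code that computes and folds these scales from the INC 2.x entry points to the 3.x `neural_compressor.torch` API; the math itself is unchanged.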

Description

Detailed description
JIRA ticket: xxx

Expected Behavior & Potential Risk

The expected behavior triggered by this PR.

How has this PR been tested?

How to reproduce the test (including hardware information).

Dependency Change?

Any library dependency introduced or removed.

changwangss and others added 23 commits May 7, 2024 01:49
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

github-actions bot commented Jun 12, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow
Check ID Status Error details
format-scan (pylint) success
format-scan (bandit) success
format-scan (cloc) success
format-scan (cpplint) success

These checks are required after the changes to intel_extension_for_transformers/neural_chat/examples/finetuning/multi_modal/train.py, intel_extension_for_transformers/neural_chat/models/model_utils.py, intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py.

🟢 Optimize Unit Test workflow
Check ID Status Error details
optimize-unit-test-baseline success
optimize-unit-test-PR-test success
Genreate-OptimizeUT-Report success

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/evaluation/models.py, intel_extension_for_transformers/transformers/llm/quantization/autograd/functions.py, intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/sq_utils.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, intel_extension_for_transformers/transformers/utils/config.py, intel_extension_for_transformers/transformers/utils/utility.py, tests/CI/test_quantization.py, tests/CI/test_weight_only.py, tests/CI/test_weight_only_gpu.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact VincyZhang or XuehaoSun for help.

@changwangss
Collaborator Author

@XuehaoSun please update the CI to support INC 3.x API installation.

changwangss and others added 3 commits June 13, 2024 19:37
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
@chensuyue
Contributor

Any update for SQ part?

Signed-off-by: changwangss <chang1.wang@intel.com>
@changwangss changwangss removed the WIP label Jul 11, 2024
@XuehaoSun XuehaoSun merged commit a864bb2 into main Jul 11, 2024
14 checks passed
@XuehaoSun XuehaoSun deleted the wangchang/inc3.x branch July 11, 2024 05:43
@airMeng
Collaborator

airMeng commented Jul 12, 2024

Is the following error related to this?

Traceback (most recent call last):
  File "~/frameworks.ai.pytorch.ipex-gpu/examples/gpu/inference/python/llm/run_generation_woq.py", line 19, in <module>
    from intel_extension_for_transformers.transformers.modeling import AutoModelForCausalLM
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/__init__.py", line 44, in <module>
    from .modeling import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/__init__.py", line 21, in <module>
    from .modeling_auto import (AutoModel, AutoModelForCausalLM,
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 63, in <module>
    from ..llm.quantization.utils import (
  File "~/intel_extension_for_transformers-1.6.dev14+g79277b4d19b.gpu-py3.9.egg/intel_extension_for_transformers/transformers/llm/quantization/utils.py", line 26, in <module>
    from neural_compressor.torch.algorithms.weight_only.modules import WeightOnlyLinear
ModuleNotFoundError: No module named 'neural_compressor.torch'
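The `ModuleNotFoundError` above indicates the installed neural_compressor predates the 3.x release, which introduced the `neural_compressor.torch` subpackage. A stdlib-only guard one could use to detect this before importing (the helper name is hypothetical, not part of either library):

```python
import importlib.util

def has_inc3_torch_api() -> bool:
    """Return True when the INC 3.x PyTorch API (neural_compressor.torch) is importable.

    importlib.util.find_spec locates the submodule without executing it, so
    this check is cheap; it only imports the parent package to resolve the
    dotted name.
    """
    try:
        return importlib.util.find_spec("neural_compressor.torch") is not None
    except ModuleNotFoundError:
        # Raised when the parent package 'neural_compressor' is not installed at all.
        return False
```

If this returns False, upgrading with `pip install "neural-compressor>=3.0"` (the PyPI package name) should make the import in `quantization/utils.py` succeed.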
