Add autotune support for PT2E #2110

yiliu30 · 2025-01-13T04:54:03Z

Type of Change

feature or bug fix or documentation or validation or others
API changed or not

Description

Add half precision transformation for linear and conv.
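For readers new to autotune, the general idea can be sketched in plain Python: try candidate configurations in order, evaluate each one, and keep the first whose accuracy loss stays within a tolerance. This is only an illustrative sketch, not the neural_compressor implementation. `CandidateConfig`, `autotune_sketch`, `fake_eval`, and all accuracy numbers below are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple


@dataclass(frozen=True)
class CandidateConfig:
    """Hypothetical stand-in for one quantization / mixed-precision config."""

    name: str
    # Op types kept in half precision instead of int8 (illustrative only).
    half_precision_ops: Tuple[str, ...] = ()


def autotune_sketch(
    candidates: Iterable[CandidateConfig],
    eval_fn: Callable[[CandidateConfig], float],
    baseline: float,
    tolerable_loss: float = 0.01,
) -> Optional[CandidateConfig]:
    """Return the first candidate whose relative accuracy loss is tolerable."""
    for cfg in candidates:
        accuracy = eval_fn(cfg)
        relative_loss = (baseline - accuracy) / baseline
        if relative_loss <= tolerable_loss:
            return cfg
    # No candidate met the accuracy goal.
    return None


# Made-up accuracies that stand in for a real evaluation run.
FAKE_ACCURACY = {
    "int8_all_ops": 0.70,
    "int8_plus_half_linear_conv": 0.796,
}


def fake_eval(cfg: CandidateConfig) -> float:
    """Look up a made-up accuracy for the given candidate."""
    return FAKE_ACCURACY[cfg.name]


candidates = [
    CandidateConfig("int8_all_ops"),
    CandidateConfig("int8_plus_half_linear_conv", ("linear", "conv")),
]
best = autotune_sketch(candidates, fake_eval, baseline=0.80)
```

In this toy run the all-int8 candidate loses too much accuracy, so the loop falls through to the candidate that keeps linear and conv in half precision.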

Signed-off-by: yiliu30 <yi4.liu@intel.com>

thuang6 · 2025-01-23T08:56:47Z

Besides the feature PR, we'd better have a PR to update the autotune-related documents, and maybe a non-LLM model example PR to demonstrate the usage in a real case.

yiliu30 · 2025-01-24T00:41:39Z

Besides the feature PR, we'd better have a PR to update the autotune-related documents, and maybe a non-LLM model example PR to demonstrate the usage in a real case.

Sure, will do it in a separate PR.

thuang6

LGTM

* [pre-commit.ci] pre-commit autoupdate (#2107) Signed-off-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* Update publication list with new blog (#2111)
* Update publication list with new blog Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* Update publication list num Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com> --------- Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* update the License (#2108) Signed-off-by: yiliu30 <yi4.liu@intel.com>
* Add autotune support for PT2E (#2110) Add autotune support for PT2E and disable some conv1d-related test on HPU --------- Signed-off-by: yiliu30 <yi4.liu@intel.com> Co-authored-by: Xin He <xin3.he@intel.com>
* Add intel-extension-for-pytorch to Transformers-like API requirements (#2113) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* Add VLM quantization & loading into transformers-like API (#2116) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* Fix hf_device_map setting for transformers-like api (#2122) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add mapping entry to 1.20 (#2126) Signed-off-by: Huang, Tai <tai.huang@intel.com>
* fix bug of lwq gtpq (#2128) Signed-off-by: n1ck-guo <heng.guo@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add xfail for onnx test_layer_wise.py (#2129)
* Doc: Update fp8 accuracy test data and update docker image 1.20.0 (#2130) Signed-off-by: fengding <feng1.ding@intel.com>
* [SW-219274] - Changing the quant method name in lm-head (#150) (#2132)
* [SW-219274] - Changing the quant method name in lm-head (#150)
* Update helper_modules.py --------- Co-authored-by: Nir David <124874956+nirda7@users.noreply.github.com>
* Adapt ipex xpu transformers version (#2134) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add phi3 vlm transformers example (#2135) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
* fix saving issue for group_size=-1 (#2138) Signed-off-by: xin3he <xin3.he@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add v3.3 release faq (#2139) Signed-off-by: Kaihui-intel <kaihui.tang@intel.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* bump release version into 3.3 (#2140) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* upgrade numpy from 1.23.5 to 1.26.4 (#2115) Signed-off-by: xin3he <xin3.he@intel.com>
* Update publications (#2145)
* update publications Signed-off-by: chensuyue <suyue.chen@intel.com>
* Add transformers to align onnxruntime-extensions=1.14.0 (#2147) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* Freeze 2x package versions (#2151) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* fix vulnerability (#2149) Signed-off-by: xin3he <xin3.he@intel.com>
* Bump into v3.3.1 (#2152) Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* [SW-218939] fix memory mapping failure in UT (#2154) Signed-off-by: Xin He <xinhe3@habana.ai>
* [SW-223106] change code with robust implementation (#2153)
* fix ILITV-3854 Signed-off-by: xin3he <xin3.he@intel.com>
* fix sdxl_smooth_quant Signed-off-by: xin3he <xin3.he@intel.com>
* workaround for ILITV-3858 Signed-off-by: xin3he <xin3.he@intel.com>
* fix ILITV-3859 Signed-off-by: xin3he <xin3.he@intel.com>
* add xfail for torchvision Signed-off-by: Xin He <xinhe3@habana.ai>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Signed-off-by: Huang, Tai <tai.huang@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: fengding <feng1.ding@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
Co-authored-by: Yi Liu <yi4.liu@intel.com>
Co-authored-by: Xin He <xin3.he@intel.com>
Co-authored-by: Kaihui-intel <kaihui.tang@intel.com>
Co-authored-by: Huang, Tai <tai.huang@intel.com>
Co-authored-by: Heng Guo <heng.guo@intel.com>
Co-authored-by: fengding <feng1.ding@intel.com>
Co-authored-by: Wang, Chang <chang1.wang@intel.com>
Co-authored-by: Nir David <124874956+nirda7@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Xin He <xinhe3@habana.ai>

add bf16 for mixed precision

51d8210

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 added the Auto-Tune, PT2E, and PyTorch (Related to PyTorch F/W) labels Jan 13, 2025

add more ops

e4a4fb3

Signed-off-by: yiliu30 <yi4.liu@intel.com>

xin3he self-requested a review January 21, 2025 07:39

xin3he reviewed Jan 21, 2025

neural_compressor/torch/quantization/autotune.py

add more ops

de59f73

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 requested review from Kaihui-intel and xin3he January 21, 2025 09:05

yiliu30 and others added 6 commits January 22, 2025 09:38

Merge branch 'master' into pt2e/autotune

404903d

add more uts

d7f99da

Signed-off-by: yiliu30 <yi4.liu@intel.com>

Merge branch 'pt2e/autotune' of https://github.com/intel/neural-compr…

12be4d2

…essor into pt2e/autotune

disable conv1d for rtn on HPU (#2112)

a468783

minor fix

ee23363

Signed-off-by: yiliu30 <yi4.liu@intel.com>

Merge branch 'pt2e/autotune' of https://github.com/intel/neural-compr…

bba35d4

…essor into pt2e/autotune

xin3he approved these changes Jan 23, 2025

test/3x/torch/quantization/test_pt2e_quant.py (outdated)

neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py

add op_type

c180739

Signed-off-by: yiliu30 <yi4.liu@intel.com>

yiliu30 requested review from XuehaoSun and thuang6 January 23, 2025 08:00

XuehaoSun approved these changes Jan 23, 2025

thuang6 reviewed Jan 23, 2025

test/3x/torch/quantization/test_pt2e_quant.py

thuang6 approved these changes Jan 24, 2025

yiliu30 merged commit a617115 into master Jan 24, 2025
26 checks passed

yiliu30 deleted the pt2e/autotune branch January 24, 2025 01:35
