Add autotune support for PT2E #2110
Signed-off-by: yiliu30 <yi4.liu@intel.com>
neural_compressor/torch/algorithms/pt2e_quant/half_precision_rewriter.py
Besides the feature PR, we should have a PR that updates the autotune-related documents, and maybe a non-LLM model example PR to demonstrate the usage in a real case.
Sure, will do it in a separate PR.
LGTM
* [pre-commit.ci] pre-commit autoupdate (#2107)
  Signed-off-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* Update publication list with new blog (#2111)
  * Update publication list with new blog
    Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
  * Update publication list num
    Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
  ---------
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* update the License (#2108)
  Signed-off-by: yiliu30 <yi4.liu@intel.com>
* Add autotune support for PT2E (#2110)
  Add autotune support for PT2E and disable some conv1d-related test on HPU
  ---------
  Signed-off-by: yiliu30 <yi4.liu@intel.com>
  Co-authored-by: Xin He <xin3.he@intel.com>
* Add intel-extension-for-pytorch to Transformers-like API requirements (#2113)
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* Add VLM quantization & loading into transformers-like API (#2116)
  Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* Fix hf_device_map setting for transformers-like api (#2122)
  Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add mapping entry to 1.20 (#2126)
  Signed-off-by: Huang, Tai <tai.huang@intel.com>
* fix bug of lwq gtpq (#2128)
  Signed-off-by: n1ck-guo <heng.guo@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add xfail for onnx test_layer_wise.py (#2129)
* Doc: Update fp8 accuracy test data and update docker image 1.20.0 (#2130)
  Signed-off-by: fengding <feng1.ding@intel.com>
* [SW-219274] - Changing the quant method name in lm-head (#150) (#2132)
  * [SW-219274] - Changing the quant method name in lm-head (#150)
  * Update helper_modules.py
  ---------
  Co-authored-by: Nir David <124874956+nirda7@users.noreply.github.com>
* Adapt ipex xpu transformers version (#2134)
  Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add phi3 vlm transformers example (#2135)
  Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
* fix saving issue for group_size=-1 (#2138)
  Signed-off-by: xin3he <xin3.he@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add v3.3 release faq (#2139)
  Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
* bump release version into 3.3 (#2140)
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* upgrade numpy from 1.23.5 to 1.26.4 (#2115)
  Signed-off-by: xin3he <xin3.he@intel.com>
* Update publications (#2145)
  * update publications
    Signed-off-by: chensuyue <suyue.chen@intel.com>
* Add transformers to align onnxruntime-extensions=1.14.0 (#2147)
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* Freeze 2x package versions (#2151)
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* fix vulnerability (#2149)
  Signed-off-by: xin3he <xin3.he@intel.com>
* Bump into v3.3.1 (#2152)
  Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
* [SW-218939] fix memory mapping failure in UT (#2154)
  Signed-off-by: Xin He <xinhe3@habana.ai>
* [SW-223106] change code with robust implementation (#2153)
  * fix ILITV-3854
    Signed-off-by: xin3he <xin3.he@intel.com>
  * fix sdxl_smooth_quant
    Signed-off-by: xin3he <xin3.he@intel.com>
  * workaround for ILITV-3858
    Signed-off-by: xin3he <xin3.he@intel.com>
  * fix ILITV-3859
    Signed-off-by: xin3he <xin3.he@intel.com>
  * add xfail for torchvision
    Signed-off-by: Xin He <xinhe3@habana.ai>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
    for more information, see https://pre-commit.ci
---------
Signed-off-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Sun, Xuehao <xuehao.sun@intel.com>
Signed-off-by: yiliu30 <yi4.liu@intel.com>
Signed-off-by: Kaihui-intel <kaihui.tang@intel.com>
Signed-off-by: Huang, Tai <tai.huang@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: fengding <feng1.ding@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
Signed-off-by: chensuyue <suyue.chen@intel.com>
Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>
Co-authored-by: Yi Liu <yi4.liu@intel.com>
Co-authored-by: Xin He <xin3.he@intel.com>
Co-authored-by: Kaihui-intel <kaihui.tang@intel.com>
Co-authored-by: Huang, Tai <tai.huang@intel.com>
Co-authored-by: Heng Guo <heng.guo@intel.com>
Co-authored-by: fengding <feng1.ding@intel.com>
Co-authored-by: Wang, Chang <chang1.wang@intel.com>
Co-authored-by: Nir David <124874956+nirda7@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Xin He <xinhe3@habana.ai>
Type of Change
feature or bug fix or documentation or validation or others
API changed or not
Description
Add half-precision transformation for linear and conv operators.
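
To make the idea concrete, here is a minimal, library-free sketch of how an autotune loop could search over per-operator precision choices for linear and conv. It is not Neural Compressor's API: the names `candidate_configs`, `autotune_sketch`, and `toy_eval`, the `"fp16"`/`"fp32"` labels, and the accuracy numbers are all hypothetical and exist only for illustration. The loop tries candidates in order and returns the first one whose score stays within a tolerable relative loss of the baseline.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Iterator, Optional, Sequence, Tuple

# Hypothetical config shape: maps an operator type to a precision label.
Config = Dict[str, str]


def candidate_configs(op_types: Sequence[str], dtypes: Sequence[str]) -> Iterator[Config]:
    """Yield every assignment of a dtype to each operator type."""
    for combo in product(dtypes, repeat=len(op_types)):
        yield dict(zip(op_types, combo))


def autotune_sketch(
    configs: Iterable[Config],
    eval_fn: Callable[[Config], float],
    baseline: float,
    tolerable_loss: float = 0.01,
) -> Optional[Tuple[Config, float]]:
    """Return the first config whose score is within the tolerable relative loss."""
    threshold = baseline * (1.0 - tolerable_loss)
    for cfg in configs:
        score = eval_fn(cfg)
        if score >= threshold:
            return cfg, score
    return None  # no candidate met the accuracy goal


def toy_eval(cfg: Config) -> float:
    """Made-up accuracy model: each fp16 operator type costs a fixed penalty."""
    penalty = {"linear": 0.005, "conv": 0.02}  # illustrative numbers only
    return 1.0 - sum(penalty[op] for op, dtype in cfg.items() if dtype == "fp16")


if __name__ == "__main__":
    configs = candidate_configs(("linear", "conv"), ("fp16", "fp32"))
    print(autotune_sketch(configs, toy_eval, baseline=1.0, tolerable_loss=0.01))
```

In a real setup, `eval_fn` would convert and evaluate the actual model for each candidate; the stand-in here only keeps the sketch self-contained.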