[inductor][cpu]mobilenet_v2_quantized_qat float32 single thread static/dynamic shape CPP/default wrapper performance regression in 2024-04-28 nightly release #125672

Open
zxd1997066 opened this issue May 7, 2024 · 1 comment
Labels: module: inductor, oncall: pt2, triaged

Comments

@zxd1997066
Contributor

zxd1997066 commented May 7, 2024

🐛 Describe the bug

float32 static shape default wrapper

| suite | name | thread | batch_size_new | speed_up_new | inductor_new | eager_new | compilation_latency_new | batch_size_old | speed_up_old | inductor_old | eager_old | compilation_latency_old | Ratio Speedup(New/old) | Eager Ratio(old/new) | Inductor Ratio(old/new) | Compilation_latency_Ratio(old/new) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| torchbench | mobilenet_v2_quantized_qat | single | 1 | 0.986425 | 0.006968785 | 0.006874183743625 | 0.096932 | 1 | 0.990124 | 0.006050362 | 0.005990608624888 | 0.093676 | 1.0 | 0.87 | 0.87 | 0.97 |

float32 static shape cpp wrapper

| suite | name | thread | batch_size_new | speed_up_new | inductor_new | eager_new | compilation_latency_new | batch_size_old | speed_up_old | inductor_old | eager_old | compilation_latency_old | Ratio Speedup(New/old) | Eager Ratio(old/new) | Inductor Ratio(old/new) | Compilation_latency_Ratio(old/new) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| torchbench | mobilenet_v2_quantized_qat | single | 1 | 0.985836 | 0.006905878999999999 | 0.006808064129843999 | 0.096427 | 1 | 0.991287 | 0.006014836 | 0.005962428733932 | 0.093884 | 0.99 | 0.88 | 0.87 | 0.97 |

float32 dynamic shape cpp wrapper

| suite | name | thread | batch_size_new | speed_up_new | inductor_new | eager_new | compilation_latency_new | batch_size_old | speed_up_old | inductor_old | eager_old | compilation_latency_old | Ratio Speedup(New/old) | Eager Ratio(old/new) | Inductor Ratio(old/new) | Compilation_latency_Ratio(old/new) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| torchbench | mobilenet_v2_quantized_qat | single | 1 | 0.985912 | 0.006797219 | 0.006701459778728 | 0.096001 | 1 | 0.988582 | 0.006000043 | 0.005931534509026 | 0.093643 | 1.0 | 0.89 | 0.88 | 0.98 |
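
For readers checking the numbers, the ratio columns in the tables above can be recomputed from the raw timings. The sketch below uses the values from the "float32 static shape default wrapper" row; the dictionary keys and the `ratios` helper are illustrative names, not part of the benchmark tooling. It also assumes that `speed_up` is eager time divided by inductor time, which is consistent with the reported figures.

```python
# Values copied from the "float32 static shape default wrapper" row.
new = {
    "speed_up": 0.986425,
    "inductor": 0.006968785,
    "eager": 0.006874183743625,
    "compilation_latency": 0.096932,
}
old = {
    "speed_up": 0.990124,
    "inductor": 0.006050362,
    "eager": 0.005990608624888,
    "compilation_latency": 0.093676,
}


def ratios(new, old):
    """Recompute the four ratio columns (hypothetical helper, 2-decimal rounding)."""
    return {
        # Speedup ratio is new/old; the other three are old/new,
        # so a value below 1.0 means the new build is slower.
        "speedup_new_over_old": round(new["speed_up"] / old["speed_up"], 2),
        "eager_old_over_new": round(old["eager"] / new["eager"], 2),
        "inductor_old_over_new": round(old["inductor"] / new["inductor"], 2),
        "compilation_old_over_new": round(
            old["compilation_latency"] / new["compilation_latency"], 2
        ),
    }


result = ratios(new, old)
print(result)
```

Because eager and inductor slow down by nearly the same factor, the speedup ratio stays close to 1.0 even though both absolute timings regressed by roughly 13%.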

SW info

| name | target_branch | target_commit | refer_branch | refer_commit |
|---|---|---|---|---|
| torchbench | main | d6015d42 | main | d6015d42 |
| torch | main | 7478b7f | main | bad8d25 |
| torchvision | main | 0.19.0a0+2c4665f | main | 0.19.0a0+2c4665f |
| torchtext | main | 0.16.0a0+b0ebddc | main | 0.16.0a0+b0ebddc |
| torchaudio | main | 2.2.0a0+ea437b3 | main | 2.2.0a0+ea437b3 |
| torchdata | main | 0.7.1a0+0790338 | main | 0.7.1a0+0790338 |
| dynamo_benchmarks | main | nightly | main | nightly |

Repro:
inductor_single_run.sh
bash inductor_single_run.sh single inference performance torchbench mobilenet_v2_quantized_qat float32 first dynamic/static default/cpp
Suspected guilty commit: e62169a
torchbench-mobilenet_v2_quantized_qat-inference-float32-dynamic-cpp-single-performance-drop_guilty_commit.log
cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @WeizhuoZhang-intel @chuanqi129

@jgong5
Collaborator

jgong5 commented May 8, 2024

Both eager and inductor show the same performance degradation, so I don't think this is an inductor-specific problem. Can we run a perf profile to check which ops caused the regression? We can start from eager mode.
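
One possible way to collect such a per-op profile in eager mode is `torch.profiler`. The snippet below is only a sketch: the small `nn.Sequential` model is a stand-in (an assumption, not the torchbench `mobilenet_v2_quantized_qat` model), and `profile_eager` with its `warmup` and `iters` parameters is a hypothetical helper. Running the same script on both the target and reference torch commits and comparing the tables should indicate which ops got slower.

```python
import torch
import torch.nn as nn
from torch.profiler import ProfilerActivity, profile

# Match the single-thread setting used in the report.
torch.set_num_threads(1)

# Stand-in model (assumption): swap in the real benchmark model here.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
).eval()
example_input = torch.randn(1, 3, 64, 64)  # batch size 1, as in the tables


def profile_eager(model, example_input, warmup=3, iters=10):
    """Return per-op CPU time averages for eager-mode inference."""
    with torch.no_grad():
        # Warm up so one-time setup costs do not pollute the profile.
        for _ in range(warmup):
            model(example_input)
        with profile(activities=[ProfilerActivity.CPU]) as prof:
            for _ in range(iters):
                model(example_input)
    return prof.key_averages()


averages = profile_eager(model, example_input)
# Sort by self CPU time so the most expensive ops appear first.
print(averages.table(sort_by="self_cpu_time_total", row_limit=15))
```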

bdhirsh added the oncall: pt2, module: inductor, and triaged labels on May 8, 2024