update the baseline data for the operator benchmark #162693

LifengWang · 2025-09-11T05:03:58Z

Across the last four operator benchmark runs, five benchmark cases improved by more than 30% over the current baseline, so this PR updates the operator benchmark baseline data.
The new baseline for each of the five cases is the average of its four runs.

This PR also adds a pull request trigger for the operator benchmark workflow.

| Benchmarking Framework | Benchmarking Module Name | Case Name | tag | run_backward | baseline old | r1 | r2 | r3 | r4 | avg | speedup |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| PyTorch | add | add_M1_N1_K1_cpu | short | FALSE | 3.9497 | 2.57 | 2.54 | 2.38 | 2.31 | 2.45 | 1.61 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.118 | 50.02 | 49.80 | 46.78 | 48.94 | 48.88 | 1.37 |
| PyTorch | relu6 | relu6_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 68.739 | 51.17 | 51.19 | 48.07 | 50.42 | 50.21 | 1.37 |
| PyTorch | relu6 | relu6_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 69.1875 | 51.97 | 52.77 | 50.00 | 51.24 | 51.50 | 1.34 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.436 | 50.98 | 51.69 | 49.06 | 49.87 | 50.40 | 1.34 |
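
For readers who want to check the numbers, the `avg` and `speedup` columns can be recomputed from the four runs. The sketch below is illustrative only and is not part of this PR: the names `RUNS`, `summarize`, and `SUMMARY` are hypothetical, and it assumes the reported values are timings where lower is better, so that speedup is the old baseline divided by the new average and "more than 30% improvement" means a speedup above 1.30.

```python
# Hypothetical helper, not part of the PR: recompute "avg" and "speedup"
# from the four runs reported in the table above.
from statistics import mean

# case name -> (old baseline, [r1, r2, r3, r4]); values copied from the table.
RUNS = {
    "add_M1_N1_K1_cpu": (3.9497, [2.57, 2.54, 2.38, 2.31]),
    "functional.hardtanh_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8": (
        67.118, [50.02, 49.80, 46.78, 48.94]),
    "relu6_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8": (
        68.739, [51.17, 51.19, 48.07, 50.42]),
    "relu6_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8": (
        69.1875, [51.97, 52.77, 50.00, 51.24]),
    "functional.hardtanh_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8": (
        67.436, [50.98, 51.69, 49.06, 49.87]),
}


def summarize(old_baseline, runs):
    """Return (new_baseline, speedup), with speedup = old_baseline / new_baseline.

    Assumes the values are timings, so a smaller number is better.
    """
    new_baseline = mean(runs)
    return new_baseline, old_baseline / new_baseline


# case name -> (average of the four runs, speedup over the old baseline)
SUMMARY = {case: summarize(old, runs) for case, (old, runs) in RUNS.items()}

if __name__ == "__main__":
    for case, (avg, speedup) in SUMMARY.items():
        print(f"{case}: avg={avg:.2f} speedup={speedup:.2f}")
```

Under these assumptions, every case comes out above the 1.30 threshold, which matches the table.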

@chuanqi129 @huydhn @desertfire @jainapurva

pytorch-bot · 2025-09-11T05:04:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162693

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4720751 with merge base 6b59a19:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

LifengWang · 2025-09-11T05:06:46Z

@pytorchbot label "topic: not user facing"

LifengWang · 2025-09-11T05:15:42Z

@pytorchbot label "ciflow/op-benchmark"

pytorch-bot · 2025-09-11T05:15:51Z

To add these label(s) (ciflow/op-benchmark) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

LifengWang · 2025-09-11T05:18:56Z

@pytorchbot label "ciflow/op-benchmark"

huydhn · 2025-09-12T06:52:34Z

@pytorchbot rebase -b main

pytorchmergebot · 2025-09-12T06:54:59Z

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

pytorchmergebot · 2025-09-12T06:55:03Z

Successfully rebased update_baseline onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout update_baseline && git pull --rebase)

jainapurva · 2025-09-12T06:56:02Z

@LifengWang The fix-up PR for the operator benchmarks, #162744, has landed in main. Please rebase to pick up the latest changes.

jainapurva · 2025-09-12T18:07:50Z

@pytorchmergebot merge

pytorchmergebot · 2025-09-12T18:09:45Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

@chuanqi129

According to the results of the last four operator benchmark runs, we found that five models achieved more than a 30% improvement compared to the baseline. Therefore, we will update the operator benchmark baseline data.
We use the average results from the four runs as the new baseline for the five models.

And add a pull request trigger for the operator benchmark workflow

| Benchmarking Framework | Benchmarking Module Name | Case Name | tag | run_backward | baseline old | r1 | r2 | r3 | r4 | avg | speedup |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| PyTorch | add | add_M1_N1_K1_cpu | short | FALSE | 3.9497 | 2.57 | 2.54 | 2.38 | 2.31 | 2.45 | 1.61 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.118 | 50.02 | 49.80 | 46.78 | 48.94 | 48.88 | 1.37 |
| PyTorch | relu6 | relu6_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 68.739 | 51.17 | 51.19 | 48.07 | 50.42 | 50.21 | 1.37 |
| PyTorch | relu6 | relu6_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 69.1875 | 51.97 | 52.77 | 50.00 | 51.24 | 51.50 | 1.34 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.436 | 50.98 | 51.69 | 49.06 | 49.87 | 50.40 | 1.34 |

@chuanqi129 @huydhn @desertfire @jainapurva

Pull Request resolved: pytorch#162693
Approved by: https://github.com/huydhn


Camyll · 2025-10-06T21:44:41Z

@pytorchbot cherry-pick --onto release/2.9 --c critical


pytorchbot · 2025-10-06T21:50:37Z

Cherry picking #162693

The cherry pick PR is at #164789 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:

[v.2.9.0] Release Tracker #162497 (comment)

Details for Dev Infra team

Raised by workflow job

@chuanqi129

update the baseline data for the operator benchmark (#162693)

According to the results of the last four operator benchmark runs, we found that five models achieved more than a 30% improvement compared to the baseline. Therefore, we will update the operator benchmark baseline data.
We use the average results from the four runs as the new baseline for the five models.

And add a pull request trigger for the operator benchmark workflow

| Benchmarking Framework | Benchmarking Module Name | Case Name | tag | run_backward | baseline old | r1 | r2 | r3 | r4 | avg | speedup |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| PyTorch | add | add_M1_N1_K1_cpu | short | FALSE | 3.9497 | 2.57 | 2.54 | 2.38 | 2.31 | 2.45 | 1.61 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.118 | 50.02 | 49.80 | 46.78 | 48.94 | 48.88 | 1.37 |
| PyTorch | relu6 | relu6_dims(512 512)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 68.739 | 51.17 | 51.19 | 48.07 | 50.42 | 50.21 | 1.37 |
| PyTorch | relu6 | relu6_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 69.1875 | 51.97 | 52.77 | 50.00 | 51.24 | 51.50 | 1.34 |
| PyTorch | functional.hardtanh | functional.hardtanh_dims(256 1024)_contigFalse_inplaceFalse_dtypetorch.quint8 | short | FALSE | 67.436 | 50.98 | 51.69 | 49.06 | 49.87 | 50.40 | 1.34 |

@chuanqi129 @huydhn @desertfire @jainapurva

Pull Request resolved: #162693
Approved by: https://github.com/huydhn

(cherry picked from commit f7ea497)

Co-authored-by: LifengWang <lifeng.a.wang@intel.com>

pytorch-bot bot added the topic: not user facing topic category label Sep 11, 2025

pytorchbot added the open source label Sep 11, 2025

pytorch-bot bot added the ciflow/op-benchmark Trigger microbenchmark for operations. label Sep 11, 2025

LifengWang requested a review from a team as a code owner September 11, 2025 08:23

pytorch-bot bot removed the ciflow/op-benchmark Trigger microbenchmark for operations. label Sep 11, 2025

soulitzer requested a review from jainapurva September 11, 2025 21:05

soulitzer added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Sep 11, 2025

huydhn approved these changes Sep 11, 2025


huydhn mentioned this pull request Sep 12, 2025

Fix the regression issue caused by non-arrch64 platforms not hitting the MKLDNN path. #162168

Closed

jainapurva mentioned this pull request Sep 12, 2025

Fix operator benchmark issue#162708 #162744

Closed

LifengWang added 2 commits September 12, 2025 06:55

update the baseline data for the operator benchmark

612d8dd

Add pull request trigger for operator benchmark workflow

4720751

pytorchmergebot force-pushed the update_baseline branch from c6a01ca to 4720751 Compare September 12, 2025 06:55

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 12, 2025

pytorchmergebot added the merging label Sep 12, 2025

pytorchmergebot added the Merged label Sep 12, 2025

pytorchmergebot closed this in f7ea497 Sep 12, 2025

pytorchmergebot removed the merging label Sep 12, 2025

jainapurva mentioned this pull request Sep 22, 2025

Failing Operator benchmarks CI job #162507

Closed

pytorchbot mentioned this pull request Oct 6, 2025

[v.2.9.0] Release Tracker #162497

Open
