aldebaran BBS tuning #1284
Conversation
I assume you are using the re-tuning script for this PR. Ideally, we should not see any new kernels in the diff (I need to update merge.py so it does not add duplicates). The performance of the new sizes is low. Please merge only if this PR actually solves the issue.
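For context, here is a minimal sketch of the kind of deduplication merge.py would need, assuming merged tuning entries pair a problem size with the winning kernel's parameters; the entry structure and function name are hypothetical, not Tensile's actual schema:

```python
import json

def merge_without_duplicates(base_entries, new_entries):
    """Merge new tuning entries into base, skipping kernels whose
    parameters match one already present (so the diff stays clean).
    Entries are hypothetical (size, params) pairs, not Tensile's real schema."""
    def key(params):
        # Canonical string form so structurally equal kernels compare equal.
        return json.dumps(params, sort_keys=True)

    seen = {key(params) for _size, params in base_entries}
    merged = list(base_entries)
    for size, params in new_entries:
        if key(params) in seen:
            continue  # an identical kernel is already in the logic file
        seen.add(key(params))
        merged.append((size, params))
    return merged
```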
Same precheckin HMM failures as the baseline; LGTM if extended and staging testing pass.
It's a combination of retuning and regular tuning. All the kernels added are genuinely new kernels.
This PR is ready to merge. The failed tests are unrelated to these changes.
This patch causes a regression for large GEMMs (SWDEV-375718). The reason seems to be that there are no tuning points for larger M values (this tuning stops at M=16), so some large GEMMs pick the kernels tuned for the M=16 case (a very small tile size, hence the performance regression). I'm doing some extra tuning for larger M and will update the PR.
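To illustrate the failure mode (a sketch only, not rocBLAS's or Tensile's actual selection logic): if the selector falls back to the nearest tuned size when no exact match exists, any M beyond the largest tuned M inherits the M=16 kernel and its tiny macro tile:

```python
# Sketch of nearest-tuned-size fallback; all values are hypothetical.
TUNED_TILES = {16: (16, 16)}  # M -> macro tile; tuning stopped at M=16

def pick_tile(m, tuned=TUNED_TILES):
    """Return the macro tile of the nearest tuned M (illustrative policy)."""
    nearest = min(tuned, key=lambda t: abs(t - m))
    return tuned[nearest]

print(pick_tile(16))     # (16, 16): this size was tuned, fine
print(pick_tile(32768))  # (16, 16): far too small a tile for a large GEMM
```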
I added some new exact sizes with large M, following the pattern of the other tunings in this commit, and confirmed this fixes the regression. Only the Retune tool was used, but unfortunately new kernels were still added due to the known issue with the merge script.
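For reference, exact-size entries in Tensile logic files pair a problem size with the winning solution; a hedged Python rendering of that shape (every number below is illustrative, not taken from this PR):

```python
# Hypothetical exact-size entries, mirroring the
# [[M, N, batch, K], [solution_index, measured_gflops]] shape used in
# Tensile logic files. All values are made up for illustration.
new_exact_sizes = [
    [[4096, 4096, 1, 4096], [42, 35000.0]],
    [[8192, 8192, 1, 8192], [57, 38000.0]],
]
```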
Tuning for SWDEV-372453
All sizes are new and there are some new kernels as well.