Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Fix sched model for TSV110 core. #82343

Merged
merged 1 commit into from
Feb 22, 2024
Merged

[AArch64] Fix sched model for TSV110 core. #82343

merged 1 commit into from
Feb 22, 2024

Conversation

yugr
Copy link
Member

@yugr yugr commented Feb 20, 2024

Accumulator operand of MADD instruction can be bypassed from another MUL-like operation. Before this fix bypassing was incorrectly applied to multiplier operand.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 20, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Yury Gribov (yugr)

Changes

Accumulator operand of MADD instruction can be bypassed from another MUL-like operation. Before this fix bypassing was incorrectly applied to multiplier operand.


Full diff: https://github.com/llvm/llvm-project/pull/82343.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64SchedTSV110.td (+3-3)
diff --git a/llvm/lib/Target/AArch64/AArch64SchedTSV110.td b/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
index 0ae9a69fd48265..1c577a25bf7390 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
@@ -419,10 +419,10 @@ def : InstRW<[TSV110Wr_12cyc_1MDU],  (instregex "^(S|U)DIVWr$")>;
 def : InstRW<[TSV110Wr_20cyc_1MDU],  (instregex "^(S|U)DIVXr$")>;
 
 def TSV110ReadMAW : SchedReadAdvance<2, [TSV110Wr_3cyc_1MDU]>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
 def TSV110ReadMAQ : SchedReadAdvance<3, [TSV110Wr_4cyc_1MDU]>;
-def : InstRW<[TSV110Wr_4cyc_1MDU, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
+def : InstRW<[TSV110Wr_4cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
 def : InstRW<[TSV110Wr_4cyc_1MDU], (instregex "^(S|U)MULHrr$")>;
 
 

@davemgreen
Copy link
Collaborator

Could you add a test? Something like llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-forwarding.s maybe?

@yugr
Copy link
Member Author

yugr commented Feb 20, 2024

Could you add a test? Something like llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-forwarding.s maybe?

Right, done.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This looks good to me

Accumulator operand of MADD instruction can be bypassed from another
MUL-like operation. Before this fix bypassing was incorrectly applied to
multiplier operand.
@vfdff
Copy link
Contributor

vfdff commented Feb 22, 2024

Thanks. The shortest forward latency is 1-cycle according the document for madd/msub when their accumulator operand depend on MAC operation's result, LGTM

@yugr yugr merged commit 6193233 into llvm:main Feb 22, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants