[AArch64] Fix sched model for TSV110 core. #82343

yugr · 2024-02-20T11:19:25Z

The accumulator operand of a MADD instruction can be bypassed from another MUL-like operation. Before this fix, bypassing was incorrectly applied to a multiplier operand.

llvmbot · 2024-02-20T11:19:44Z

@llvm/pr-subscribers-backend-aarch64

Author: Yury Gribov (yugr)

Full diff: https://github.com/llvm/llvm-project/pull/82343.diff

1 file affected:

(modified) llvm/lib/Target/AArch64/AArch64SchedTSV110.td (+3-3)

diff --git a/llvm/lib/Target/AArch64/AArch64SchedTSV110.td b/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
index 0ae9a69fd48265..1c577a25bf7390 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedTSV110.td
@@ -419,10 +419,10 @@ def : InstRW<[TSV110Wr_12cyc_1MDU],  (instregex "^(S|U)DIVWr$")>;
 def : InstRW<[TSV110Wr_20cyc_1MDU],  (instregex "^(S|U)DIVXr$")>;
 
 def TSV110ReadMAW : SchedReadAdvance<2, [TSV110Wr_3cyc_1MDU]>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instrs MADDWrrr, MSUBWrrr)>;
 def TSV110ReadMAQ : SchedReadAdvance<3, [TSV110Wr_4cyc_1MDU]>;
-def : InstRW<[TSV110Wr_4cyc_1MDU, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
-def : InstRW<[TSV110Wr_3cyc_1MDU, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
+def : InstRW<[TSV110Wr_4cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAQ], (instrs MADDXrrr, MSUBXrrr)>;
+def : InstRW<[TSV110Wr_3cyc_1MDU, ReadIM, ReadIM, TSV110ReadMAW], (instregex "(S|U)(MADDL|MSUBL)rrr")>;
 def : InstRW<[TSV110Wr_4cyc_1MDU], (instregex "^(S|U)MULHrr$")>;
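
The effect of the diff can be illustrated with a small Python sketch. It assumes a simplified model in which the entries of an `InstRW` list are matched to instruction operands by position (the def first, then the uses in order), and that the operands of `MADDWrrr` are `Rd, Rn, Rm, Ra`. The helper `map_sched_list` is hypothetical and not part of LLVM.

```python
# Simplified, hypothetical model of positional matching between an
# InstRW list and the operands of MADD (Rd is the def; Rn, Rm, Ra are uses).
MADD_OPERANDS = ["Rd", "Rn", "Rm", "Ra"]


def map_sched_list(sched_list, operands=MADD_OPERANDS):
    """Pair each operand with the entry at the same position (None if unlisted)."""
    padded = list(sched_list) + [None] * (len(operands) - len(sched_list))
    return dict(zip(operands, padded))


# Before the fix: the read advance lands on the first use operand (Rn).
before = map_sched_list(["TSV110Wr_3cyc_1MDU", "TSV110ReadMAW"])

# After the fix: two ReadIM entries cover Rn and Rm, so the advance lands on Ra.
after = map_sched_list(
    ["TSV110Wr_3cyc_1MDU", "ReadIM", "ReadIM", "TSV110ReadMAW"]
)

print(before["Rn"], before["Ra"])  # TSV110ReadMAW None
print(after["Rn"], after["Ra"])    # ReadIM TSV110ReadMAW
```

Under this model the two added `ReadIM` entries act as placeholders for the multiplier operands, which is what moves `TSV110ReadMAW` onto the accumulator.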

davemgreen · 2024-02-20T13:00:31Z

Could you add a test? Something like llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-forwarding.s maybe?

yugr · 2024-02-20T17:27:15Z

Right, done.

davemgreen

Thanks. This looks good to me.

vfdff · 2024-02-22T08:48:13Z

Thanks. According to the documentation, the shortest forwarding latency for madd/msub is 1 cycle when their accumulator operand depends on the result of a MAC operation. LGTM.
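
The 1-cycle figure can be checked against the values in the diff with a short Python sketch. It assumes that the forwarded latency is the producer's write latency minus the consumer's read advance, clamped at zero, and that the write latencies are the 3 and 4 cycles suggested by the names `TSV110Wr_3cyc_1MDU` and `TSV110Wr_4cyc_1MDU` (their definitions are not shown in this PR). The function name is hypothetical.

```python
def forwarded_latency(write_latency, read_advance):
    """Cycles a dependent instruction waits when the read is advanced (never negative)."""
    return max(write_latency - read_advance, 0)


# TSV110ReadMAW: SchedReadAdvance<2, [TSV110Wr_3cyc_1MDU]>
maw = forwarded_latency(write_latency=3, read_advance=2)

# TSV110ReadMAQ: SchedReadAdvance<3, [TSV110Wr_4cyc_1MDU]>
maq = forwarded_latency(write_latency=4, read_advance=3)

print(maw, maq)  # 1 1
```

Both the 32-bit and 64-bit variants come out at 1 cycle under these assumptions, which matches the latency quoted above.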

yugr added backend:AArch64 performance labels Feb 20, 2024

yugr requested review from bryanpkc, davemgreen and ElvinaYakubova February 20, 2024 11:19

yugr self-assigned this Feb 20, 2024

davemgreen requested a review from vfdff February 20, 2024 12:59

yugr force-pushed the tsv110-madd branch from c4db729 to 58eae6f Compare February 20, 2024 17:26

davemgreen approved these changes Feb 21, 2024

[AArch64] Fix sched model for TSV110 core.

15d7f56

yugr force-pushed the tsv110-madd branch from 58eae6f to 15d7f56 Compare February 22, 2024 08:40

yugr merged commit 6193233 into llvm:main Feb 22, 2024
3 of 4 checks passed
