[AArch64] Split zero cycle zeroing per register class #154547
Closed
This change improves the accuracy of LLVM's model by splitting the AArch64 zero cycle zeroing subtarget features per register class. This aligns with how the microarchitecture is designed (each register bank has its own capabilities), similarly to how we improved ZCM modeling.
It splits `HasZeroCycleZeroingGP` into `HasZeroCycleZeroingGPR32` and `HasZeroCycleZeroingGPR64`, removes the opaque `FeatureZCZeroing`, and infers `FeatureNoZCZeroingFP` to be `FeatureNoZCZeroingFPR64` based on its single usage in `AArch64AsmPrinter.cpp`.
It also splits `arm64-zero-cycle-zeroing.ll` into 2 tests, one `-gpr` and one `-fpr`, similarly to ZCM, to make the tests more focused and manageable in correspondence with the new modeling.
The test cases are updated as well, exploiting the fact that this is a refactor patch:
- Replace `apple-a10` with `apple-m1`
- `-mtriple=arm64-apple-macosx -mcpu=generic` test case for GPR
- `-mtriple=arm64-apple-ios -mcpu=cyclone` FP workaround test case, with `-fullfp16` moved to another non-workaround test case
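To illustrate why a per-register-class split is more precise than a single GP flag, here is a minimal standalone sketch. It is not the code in this patch: `SubtargetModel`, `RegClass`, and `isZeroCycleZero` are hypothetical names made up for the example, and only the two predicate names `HasZeroCycleZeroingGPR32` and `HasZeroCycleZeroingGPR64` come from the description above.

```cpp
// Hypothetical, simplified stand-in for the subtarget predicates.
// The real predicates are generated from the AArch64 feature definitions;
// this struct only mirrors the two names mentioned in the description.
struct SubtargetModel {
  bool HasZeroCycleZeroingGPR32;
  bool HasZeroCycleZeroingGPR64;
};

// Hypothetical register class tag used only by this sketch.
enum class RegClass { GPR32, GPR64 };

// With one flag per register class, a query is answered for the class
// that is actually being zeroed, instead of for all GPRs at once.
constexpr bool isZeroCycleZero(const SubtargetModel &ST, RegClass RC) {
  return RC == RegClass::GPR32 ? ST.HasZeroCycleZeroingGPR32
                               : ST.HasZeroCycleZeroingGPR64;
}
```

Under this sketch, a core that zeroes only 32-bit registers for free would be described as `SubtargetModel{true, false}`, which a single combined GP flag could not express.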