[InterleavedAccess] Construct interleaved access store with shuffles #164000
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -239,7 +239,8 @@ static bool isDeInterleaveMask(ArrayRef<int> Mask, unsigned &Factor, | |
| /// I.e. <0, LaneLen, ... , LaneLen*(Factor - 1), 1, LaneLen + 1, ...> | ||
| /// E.g. For a Factor of 2 (LaneLen=4): <0, 4, 1, 5, 2, 6, 3, 7> | ||
| static bool isReInterleaveMask(ShuffleVectorInst *SVI, unsigned &Factor, | ||
| unsigned MaxFactor) { | ||
| unsigned MaxFactor, | ||
| bool InterleaveWithShuffles) { | ||
| unsigned NumElts = SVI->getShuffleMask().size(); | ||
| if (NumElts < 4) | ||
| return false; | ||
|
|
@@ -250,6 +251,13 @@ static bool isReInterleaveMask(ShuffleVectorInst *SVI, unsigned &Factor, | |
| return true; | ||
| } | ||
|
|
||
| if (InterleaveWithShuffles) { | ||
| for (unsigned i = 1; MaxFactor * i <= 16; i *= 2) { | ||
|
Contributor
LLVM style uses upper case for variable names. Why cap at 16?
Contributor
Author
The next interleaving factor is 32. Efficient interleaving at that factor requires more than 32 registers, which is not possible on AArch64 without spilling, so I stopped at 16. I will create a follow-up PR for all the corrections.
||
| Factor = i * MaxFactor; | ||
| if (SVI->isInterleave(Factor)) | ||
| return true; | ||
| } | ||
| } | ||
| return false; | ||
| } | ||
|
|
||
|
|
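As a rough illustration of the mask shape and the factor probing in this hunk, here is a self-contained sketch that runs outside LLVM. The helper names (`makeReInterleaveMask`, `isSimpleReInterleaveMask`, `widenedFactors`) are hypothetical and not part of the patch, and the check ignores undef mask elements, which the real `isInterleave` query may treat differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: build the re-interleave mask for `Factor` lanes of
// `LaneLen` elements each, i.e.
// <0, LaneLen, ..., LaneLen*(Factor - 1), 1, LaneLen + 1, ...>.
std::vector<int> makeReInterleaveMask(unsigned Factor, unsigned LaneLen) {
  assert(Factor > 0 && LaneLen > 0 && "expected a non-empty mask");
  std::vector<int> Mask;
  Mask.reserve(Factor * LaneLen);
  for (unsigned Elt = 0; Elt < LaneLen; ++Elt)
    for (unsigned Lane = 0; Lane < Factor; ++Lane)
      Mask.push_back(static_cast<int>(Lane * LaneLen + Elt));
  return Mask;
}

// Hypothetical, simplified check: true if `Mask` is exactly the re-interleave
// mask for `Factor`. Assumes no undef (-1) elements.
bool isSimpleReInterleaveMask(const std::vector<int> &Mask, unsigned Factor) {
  if (Factor == 0 || Mask.empty() || Mask.size() % Factor != 0)
    return false;
  unsigned LaneLen = static_cast<unsigned>(Mask.size()) / Factor;
  for (unsigned I = 0; I < Mask.size(); ++I) {
    unsigned Lane = I % Factor; // which source lane this element comes from
    unsigned Elt = I / Factor;  // position within that lane
    if (Mask[I] != static_cast<int>(Lane * LaneLen + Elt))
      return false;
  }
  return true;
}

// Mirrors the loop bounds in the hunk: the factors probed when interleaving
// with shuffles is enabled, capped at 16.
std::vector<unsigned> widenedFactors(unsigned MaxFactor) {
  std::vector<unsigned> Factors;
  if (MaxFactor == 0) // guard: the loop condition would never become false
    return Factors;
  for (unsigned I = 1; MaxFactor * I <= 16; I *= 2)
    Factors.push_back(I * MaxFactor);
  return Factors;
}
```

Under these assumptions, `makeReInterleaveMask(2, 4)` yields `<0, 4, 1, 5, 2, 6, 3, 7>`, matching the example in the doc comment, and `widenedFactors(4)` yields 4, 8 and 16, so a factor of 32 is never probed.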
@@ -528,7 +536,8 @@ bool InterleavedAccessImpl::lowerInterleavedStore( | |
| cast<FixedVectorType>(SVI->getType())->getNumElements(); | ||
| // Check if the shufflevector is RE-interleave shuffle. | ||
| unsigned Factor; | ||
| if (!isReInterleaveMask(SVI, Factor, MaxFactor)) | ||
| if (!isReInterleaveMask(SVI, Factor, MaxFactor, | ||
| TLI->isProfitableToInterleaveWithGatherScatter())) | ||
| return false; | ||
| assert(NumStoredElements % Factor == 0 && | ||
| "number of stored element should be a multiple of Factor"); | ||
|
|
||
This seems like a very specific hook, targeted at InterleavedAccessPass. It would be good to clarify in the comment what it means precisely, possibly with an example.
If it is about profitability for gathers/scatters, can it be determined by checking the costs of the gather/scatter sequence?