Slow codegen when using Vector128/256<T> parameter #78883
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
Description
The following method generates native code that reloads the `value` parameter from `rdx` every time it's used:
```csharp
private static unsafe void BitwiseFillVector128_BadCodeGen(Vector128<byte>* startAddress, Vector128<byte> value, ulong elementCount)
{
    // Main loop: four 16-byte stores per iteration, using the parameter directly.
    while (elementCount >= 4)
    {
        Vector128.Store(value, (byte*)startAddress);
        Vector128.Store(value, (byte*)(startAddress + 1));
        Vector128.Store(value, (byte*)(startAddress + 2));
        Vector128.Store(value, (byte*)(startAddress + 3));
        elementCount -= 4;
        startAddress += 4;
    }
    // Tail loop: one store per remaining element.
    while (elementCount > 0)
    {
        Vector128.Store(value, (byte*)startAddress);
        --elementCount;
        ++startAddress;
    }
}
```
If you copy `value` to a local/temp, the codegen is fixed:
```csharp
private static unsafe void BitwiseFillVector128_GoodCodeGen(Vector128<byte>* startAddress, Vector128<byte> value, ulong elementCount)
{
    // Workaround: copy the parameter to a local before using it.
    Vector128<byte> valueP = value;
    // Main loop: four 16-byte stores per iteration.
    while (elementCount >= 4)
    {
        Vector128.Store(valueP, (byte*)startAddress);
        Vector128.Store(valueP, (byte*)(startAddress + 1));
        Vector128.Store(valueP, (byte*)(startAddress + 2));
        Vector128.Store(valueP, (byte*)(startAddress + 3));
        elementCount -= 4;
        startAddress += 4;
    }
    // Tail loop: one store per remaining element.
    while (elementCount > 0)
    {
        Vector128.Store(valueP, (byte*)startAddress);
        --elementCount;
        ++startAddress;
    }
}
```
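A similar method using `Vector256` exhibits the same issue.
Reproduction Steps
...
Expected behavior
`value` is not reloaded from `rdx` every time it is needed
Actual behavior
`value` is reloaded from `rdx` every time it's needed
Regression?
No idea
Known Workarounds
Copy the `value` in question to a local temp before using it -- don't use parameters directly, in other words
Configuration
.NET 7.0.0 x64
Other information
No response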
|
Also, the codegen is fine if the method is inlined, e.g. |
CC @EgorBo. |
CC @tannergooding. |
This isn't something we'll be able to get to for .NET 8. It is a scenario we should support, however, so I'm moving it to 9 rather than Future. This is probably something where @jakobbotsch or @EgorBo would be better placed to weigh in. We notably track There's also been inverse cases in the past where knowing an arg, or a sequence of args, were sequential in memory would've allowed us to optimize them better. If there were an easy way to take advantage of this info, such as in morph, then we could opt to hoist such implicit byref args to avoid repeated memory access or to take advantage of the fact they are memory accesses to combine neighboring loads. |
We can maybe do something simple for implicit byref args by marking their created indirections as invariant when we are able to tell that the arg itself is never modified (common scenario). It would probably require some VN work to avoid regressions. The general problem is harder and touches upon having a more sophisticated aliasing model -- in this case the fact we'd like to make use of is that the ABI disallows implicit byref args from aliasing anything else. In particular the same optimization cannot be allowed if |
Definitely understand the broader aliasing issues for explicit byrefs. The biggest issues I've seen are the implicit cases like this one and I think that will help catch some of the worst offenders. The other case I particularly remember is Morph has a bit that will look for |
Moving to future. Codegen has always been like this, also fairly complex change. |
Description
The method `BitwiseFillVector128_BadCodeGen` generates native code that reloads the `value` parameter from `rdx` every time it's used.
If you copy `value` to a local/temp, as `BitwiseFillVector128_GoodCodeGen` does, the codegen is fixed.
A similar method using `Vector256`
exhibits the same issue.Sharplab link: https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIAYACY8gOgCUBXAOwwEt8YLAMIR8AB14AbGFADKMgG68wMXAG4a9Jq049+ggJI8ovLrmXqamgMxNSDIQwDeNBm6a3iKBgFkAFACUzq7uAL4hbhEMYiYK2Bgw2kgM3LjYAGaJXgwAQrwYAO68uDAAYlKSAGowYBjQ5KQAHAD6OdgAJiLtMADiMFx+1bX1TQA8wACeCQB8AFQMuBjYUBgAgu3tsLi4aAxDdVANjeNTMNMMcZIcMLsckhBcAOYMMNICPCLcGAFRLtTuAIYBQAFlJEn5XjB3hhPjwGNMALwMFA/f6AtxOdHoqJYvY1A5HFiyA4wPyXa67PyTBKzAKLZZrDZbXABDRo3FufYjRpEklk7BXG4MKmnWl+ekrdabVS4BgAagY5ACrJxWK5hyavOgpPJQpFNIC4qWkqZMvlDFIyrZHPc6sJxO1/MFlOpMDFEsZ0u25usVqs7I5kOhsIwDDgSJQ1ptC2NnuZ8ojUfR4WoqrcILBwqD/RhEC+8IYdFRuL+0bxww1PIdsCdFOFrtpHqlzJVqYDuLgcGzHzzPCTHLlcqbpu2/YBKfHUSiMV4cQSSRSZgyWW8eUKxTKFTtTWaPQgEE6B96/UG+O5Jxm82HXp25YJY1d511t3uTxebxzIeL7lLarPldGHJTifAVrgABQYJFdTZNMgVBaQsw/Ht80RZFv3RTFo1ggFtyrPldTAl1RTpWNmxlVsy1wrUawIoiDSNBkyO9BUlQo6MqOrHVQJgQj62IhiTRvc1LTYm0OPw7jeP1N1DWveMFV9VtsPcbtc3zcNkTHLE5LNOVE1glNYIzBCISQtS4XOItYN/MT/3tCTBSkhsSMYkcWRg9ssU7VSQy09FBx00cDKiUIARoUIgA==
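For readers who want to try the workaround on the `Vector256` variant, here is a minimal sketch. It is not the code behind the Sharplab link: the class name `FillSketch`, the method name `BitwiseFillVector256_LocalCopy`, and the small `Main` driver are hypothetical, and the sketch assumes .NET 7 or later with unsafe blocks enabled. It only shows the copy-to-local pattern; whether the JIT on a given runtime still reloads the parameter without the copy has to be checked in the disassembly.

```csharp
using System;
using System.Runtime.Intrinsics;

public static unsafe class FillSketch
{
    // Hypothetical Vector256 counterpart of BitwiseFillVector128_GoodCodeGen.
    public static void BitwiseFillVector256_LocalCopy(Vector256<byte>* startAddress, Vector256<byte> value, ulong elementCount)
    {
        // Workaround from the issue: copy the parameter to a local before using it.
        Vector256<byte> valueP = value;

        // Main loop: four 32-byte stores per iteration.
        while (elementCount >= 4)
        {
            Vector256.Store(valueP, (byte*)startAddress);
            Vector256.Store(valueP, (byte*)(startAddress + 1));
            Vector256.Store(valueP, (byte*)(startAddress + 2));
            Vector256.Store(valueP, (byte*)(startAddress + 3));
            elementCount -= 4;
            startAddress += 4;
        }

        // Tail loop: one store per remaining element.
        while (elementCount > 0)
        {
            Vector256.Store(valueP, (byte*)startAddress);
            --elementCount;
            ++startAddress;
        }
    }

    public static void Main()
    {
        // Fill five 32-byte elements (160 bytes) with 0xAB and check both ends.
        byte* buffer = stackalloc byte[5 * 32];
        BitwiseFillVector256_LocalCopy((Vector256<byte>*)buffer, Vector256.Create((byte)0xAB), 5);
        Console.WriteLine(buffer[0] == 0xAB && buffer[159] == 0xAB);
    }
}
```

The driver only checks that the fill is functionally correct; it says nothing about the generated code, which is the actual subject of this issue.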
Reproduction Steps
...
Expected behavior
`value` is not reloaded from `rdx` every time it is needed
Actual behavior
`value` is reloaded from `rdx` every time it's needed
Regression?
No idea
Known Workarounds
Copy the `value` in question to a local temp before using it -- don't use parameters directly, in other words
Configuration
.NET 7.0.0 x64
Other information
No response