-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor TranscodeUtf8 to allow trimming of Vector128<T> #47928
Conversation
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
TBH we should probably have a separate implementation of the UTF-8 logic for interpreted or wasm scenarios. The current validation and transcoding logic is heavily optimized to take into consideration things like branch prediction, cache locality, and efficient register utilization. This results in the methods being large and complex because they consist of a bunch of little worker loops that each work on one optimized scenario. A separate interpreted / wasm-specific implementation would be a straightforward "just do the simplest thing" loop. No cleverness, just a small simple loop that consists of easy instructions. Edit - an example of the "simple" validation and transcoding loops: // simplest validation loop
int GetIndexOfFirstInvalidUtf8Byte(ROS<byte> input, out int utf16CharCount)
{
int originalInputLength = input.Length;
utf16CharCount = 0;
while (!input.IsEmpty)
{
if (Rune.DecodeUtf8(input, out Rune rune, out int bytesConsumed) != OperationStatus.Done)
break;
input = input.Slice(bytesConsumed);
utf16CharCount += input.Utf16CodeUnitCount;
}
return originalInputLength - input.Length;
}
// simplest transcoding loop
OperationStatus TranscodeUtf8ToUtf16(ROS<byte> input, ROS<char> output, out int bytesConsumed, out int charsWritten)
{
int originalInputLength = input.Length;
int originalOutputLength = output.Length;
OperationStatus opStatus = OperationStatus.Done;
while (!input.IsEmpty)
{
opStatus = Rune.DecodeUtf8(input, out Rune rune, out int bytesConsumedJustNow);
if (opStatus != OperationStatus.Done) { break; }
if (!rune.TryWriteUtf16(output, out int charsWrittenJustNow)) { opStatus = OperationStatus.DestinationTooSmall; break; }
input = input.Slice(bytesConsumedJustNow);
output = output.Slice(charsWrittenJustNow);
}
bytesConsumed = originalInputLength - input.Length;
charsWritten = originalOutputLength - output.Length;
return opStatus;
} |
Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF, @tannergooding, @sbomer Issue DetailsThis resolves #47860 by moving where I logged dotnet/linker#1805 to track the linker not understanding things like CC. @eerhardt, @stephentoub
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! LGTM
@GrabYourPitchforks, could you log an "up-for-grabs" issue suggesting the above? |
Hello @tannergooding! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
This resolves #47860 by moving where
Unsafe.SkipInit
is called so we can trim outVector128<short>
in the default blazor_wasm template.I logged dotnet/linker#1805 to track the linker not understanding things like
SkipInit
(which areNonVersionable
) and dotnet/linker#1808 to track the linker not trimming out a local when its only usage is it being initialized viainitobj
CC. @eerhardt, @stephentoub