Fix incorrect bit match pattern in UTF-16 validation #24015

Merged
GrabYourPitchforks merged 1 commit into dotnet:master from GrabYourPitchforks:fix_utf16 on Apr 16, 2019

3 participants
Member

GrabYourPitchforks commented Apr 15, 2019

Fixes dotnet/corefx#36870.

In this code, mask has the bit pattern aa_bb_cc_..., where each pair of bits is "11" if the corresponding char in the input vector is a UTF-16 surrogate code point (U+D800..U+DFFF), "00" otherwise.

mask2 has the bit pattern xx_yy_zz_..., where each pair of bits is "00", "01", or (undefined) as specified in the comment starting on line 170. ANDing these two masks together should result in "01" if the corresponding char was a low surrogate char, "00" otherwise. This is stored in the local lowSurrogatesMask.

The local highSurrogatesMask is intended to be similar, but with "01" indicating that the corresponding char was a high surrogate char and "00" otherwise. This value was being generated incorrectly because the operands to the XOR and the AND operations were swapped. The intended behavior of this code (which this PR fixes and makes clearer in comments) is to first flip the low bit of each pair in mask2 (so that "00" means low surrogate, "01" means high surrogate, and garbage stays garbage); ANDing the result with mask then normalizes each pair to "01" or "00".
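
To make the bit-pair reasoning concrete, here is a rough scalar sketch in Python. It is not the vectorized code from this PR: the helper names, the 0b0101... constant, and the choice of "11" as a stand-in for the undefined bits are all assumptions made for illustration. In particular, high_surrogates_mask_swapped is only one guess at what "swapped operands" could look like; the original expression is not quoted in this description.

```python
# Scalar model of the bit-pair masks described above. The real code builds
# these masks with vector instructions; this only mimics the resulting bits.

# Assumed constant: the low bit of each of the 8 bit pairs.
LOW_BIT_OF_EACH_PAIR = 0b0101_0101_0101_0101


def build_masks(chars):
    """Return (mask, mask2) for up to 8 UTF-16 code units.

    Code unit i owns bits 2*i and 2*i + 1 of each mask.
    """
    mask = 0
    mask2 = 0
    for i, c in enumerate(chars):
        shift = 2 * i
        if 0xD800 <= c <= 0xDFFF:
            mask |= 0b11 << shift        # "11": surrogate code unit
            if c >= 0xDC00:
                mask2 |= 0b01 << shift   # "01": low surrogate ("00" for high)
        else:
            # mask2 is undefined for non-surrogates; "11" is an arbitrary stand-in.
            mask2 |= 0b11 << shift
    return mask, mask2


def low_surrogates_mask(mask, mask2):
    # "01" where the char is a low surrogate, "00" otherwise.
    return mask2 & mask


def high_surrogates_mask(mask, mask2):
    # Flip the low bit of each pair first, then let mask clear the garbage.
    return (mask2 ^ LOW_BIT_OF_EACH_PAIR) & mask


def high_surrogates_mask_swapped(mask, mask2):
    # Hypothetical buggy variant: XOR with mask, AND with the constant.
    # Garbage bits from non-surrogate chars are no longer cleared.
    return (mask2 ^ mask) & LOW_BIT_OF_EACH_PAIR


# 'A', high surrogate, low surrogate, 'B'
mask, mask2 = build_masks([0x0041, 0xD83D, 0xDE00, 0x0042])
print(bin(low_surrogates_mask(mask, mask2)))           # 0b10000
print(bin(high_surrogates_mask(mask, mask2)))          # 0b100
print(bin(high_surrogates_mask_swapped(mask, mask2)))  # 0b1000101
```

With the corrected operand order only the pair for the high surrogate is set. In the swapped variant, the stand-in garbage bits for 'A' and 'B' leak into the result, which is the kind of misclassification the fix is meant to prevent.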

Member Author

GrabYourPitchforks commented Apr 15, 2019

Failing Release CoreFX Tests CI build is due to known flaky test - see dotnet/corefx#30683.

Member Author

GrabYourPitchforks commented Apr 16, 2019

CI's going to go green - I'm just going to merge this to unblock everybody. Thanks for the reviews!

@GrabYourPitchforks GrabYourPitchforks merged commit fcc4beb into dotnet:master Apr 16, 2019

8 of 12 checks passed

Windows_NT x64 Release CoreFX Tests Build finished.
Ubuntu arm Cross Checked crossgen_comparison Build and Test Started.
Ubuntu arm Cross Release crossgen_comparison Build and Test Started.
Ubuntu x64 Checked CoreFX Tests Started.
Ubuntu x64 Formatting Build finished.
Windows_NT x64 Checked CoreFX Tests Build finished.
Windows_NT x64 Formatting Build finished.
Windows_NT x64 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Windows_NT x64 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Windows_NT x86 full_opt ryujit CoreCLR Perf Tests Correctness Build finished.
Windows_NT x86 min_opt ryujit CoreCLR Perf Tests Correctness Build finished.
license/cla All CLA requirements met.

@GrabYourPitchforks GrabYourPitchforks deleted the GrabYourPitchforks:fix_utf16 branch Apr 16, 2019

Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corefx that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/corert that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

stephentoub added a commit to dotnet/corefx that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

MichalStrehovsky added a commit to dotnet/corert that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

Dotnet-GitSync-Bot pushed a commit to Dotnet-GitSync-Bot/mono that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>

marek-safar added a commit to mono/mono that referenced this pull request Apr 16, 2019

Fix incorrect bit match pattern in UTF-16 validation (dotnet/coreclr#24015)

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>