Ascii.Equals and Ascii.EqualsIgnoreCase #84886

adamsitnik · 2023-04-15T17:09:44Z

dotnet-issue-labeler · 2023-04-15T17:09:52Z

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

ghost · 2023-04-15T17:23:17Z

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Issue Details

null

Author:	adamsitnik
Assignees:	adamsitnik
Labels:	`area-System.Text.Encoding`, `new-api-needs-documentation`
Milestone:	-

jeffhandley · 2023-04-16T17:15:19Z

src/libraries/System.Text.Encoding/tests/Ascii/EqualsTests.cs

+
+namespace System.Text.Tests
+{
+    public static class EqualsTests


I suggest adding a couple of tests that show real-world use cases like those we see in networking or aspnetcore, where input is compared against a constant. Such tests could use a non-ASCII UTF-8 sequence that could match something provided by client input in one of those scenarios.
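A minimal sketch of the kind of scenario described above, assuming .NET 8 or later where the `Ascii` class is available. The `TransferEncodingSketch` type and `IsChunked` helper are hypothetical names made up for illustration; they are not taken from the PR or from the networking stack.

```csharp
using System;
using System.Text;

public static class TransferEncodingSketch
{
    // Hypothetical helper (not from the PR): compares a raw header value
    // received from a client against the ASCII constant "chunked",
    // ignoring ASCII case.
    public static bool IsChunked(ReadOnlySpan<byte> headerValue)
        => Ascii.EqualsIgnoreCase(headerValue, "chunked");
}
```

A test along these lines could pass `Encoding.UTF8.GetBytes("CHUNKED")`, which is expected to match, and `Encoding.UTF8.GetBytes("chunk\u00E9")`, which has the same byte length as the constant (7 bytes) but contains a non-ASCII sequence and is expected not to match.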

gfoidl · 2023-04-17T11:48:20Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+                for (int i = 0; i < right.Length; i++)
+                {
+                    char c = right[i];
+                    byte b = left[i];


Do we care about the bounds check here?

gfoidl · 2023-04-17T12:07:59Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+        {
+            Debug.Assert(left.Length == right.Length);
+
+            for (int i = 0; i < left.Length; i++)


Is no vectorization needed here?

gfoidl · 2023-04-17T12:40:17Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+            }
+            else
+            {
+                (Vector64<ushort> lower, Vector64<ushort> upper) = Vector64.Widen(bytes);


IIRC Vector64 uses the software fallback on x64.
crossgen2 with main-branch shows that too.

I don't know if there are plans or WIP to hw-accelerate Vector64 in the .NET 8 timeframe.
Otherwise maybe read the value as long into a Vector128 and do the widening then. Something like

static Vector128<ushort> LoadAndWiden(ref byte ptr)
{
    if (AdvSimd.IsSupported)
    {
        Vector64<byte> vec64 = Vector64.LoadUnsafe(ref ptr);
        return AdvSimd.ZeroExtendWideningLower(vec64);
    }
    else if (Sse2.IsSupported)
    {
        ulong value = Unsafe.ReadUnaligned<ulong>(ref ptr);
        Vector128<byte> vec = Vector128.CreateScalarUnsafe(value).AsByte();
        return Sse2.UnpackLow(vec, Vector128<byte>.Zero).AsUInt16();
    }
    else
    {
        Vector64<byte> vec64 = Vector64.LoadUnsafe(ref ptr);
        (Vector64<ushort> lower, Vector64<ushort> upper) = Vector64.Widen(vec64);
        return Vector128.Create(lower, upper);
    }
}

gfoidl · 2023-04-17T12:42:14Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+        }
+
+        private static bool VectorContainsNonAsciiChar(Vector64<byte> bytes)
+            => !Utf8Utility.AllBytesInUInt64AreAscii(bytes.AsUInt64().ToScalar());


Same concern here for Vector64. Pass in ulong as Utf8Utility.AllBytesInUInt64AreAscii expects that.
So get rid of Vector64 where possible and use ulong instead?

Edit: the vector is stored to stack, then the ulong read from there. Not ideal, but not too bad.
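For illustration, a `ulong`-based ASCII check can be sketched as a single mask test. This is an assumption about what such a helper does in general, not a copy of `Utf8Utility.AllBytesInUInt64AreAscii`; the `AsciiMaskSketch` type and `AllBytesAreAscii` name are hypothetical.

```csharp
using System;

public static class AsciiMaskSketch
{
    // Hypothetical helper: a byte is ASCII when its high bit (0x80) is clear,
    // so eight bytes packed into a ulong are all ASCII when no high bit is set.
    public static bool AllBytesAreAscii(ulong value)
        => (value & 0x8080808080808080UL) == 0;
}
```

Passing the eight bytes as a `ulong` directly, as suggested above, avoids going through a `Vector64<byte>` just to extract the scalar again.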

adamsitnik · 2023-04-21T19:16:26Z

@gfoidl thank you for your suggestions and ideas! I won't have the time to implement all perf improvements now (I started paternity leave very recently and now I just want to get this issue off my TODO list). Would you be interested in sending a PR with improvements once this PR is merged?

gfoidl · 2023-04-21T19:20:20Z

> Would you be interested in sending a PR with improvements once this PR is merged?

Yep, I can do the follow up.

> paternity leave

Congratulations!

jeffhandley

Looks good to me. @adamsitnik requested we merge this without the vectorization improvements recommended by @gfoidl so that we can get in ahead of the Preview 4 snap (and while Adam is unavailable), and then potentially follow up with an additional PR to improve the vectorization.

@gfoidl -- would you like to submit a follow-up PR with those recommendations you made?

jeffhandley · 2023-04-21T19:41:44Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+                for (int i = 0; i < right.Length; i++)
+                {
+                    char c = right[i];
+                    byte b = left[i];


Since we've already confirmed left.Length == right.Length above, I think we're OK without another bounds check here.
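The loop under discussion can be sketched as a standalone scalar comparison. This is a simplified stand-in rather than the code from the PR: the `ScalarEqualsSketch` type and `EqualsBytesChars` name are hypothetical, and whether the JIT actually elides the range check on `left[i]` is not something the sketch demonstrates.

```csharp
using System;

public static class ScalarEqualsSketch
{
    // Hypothetical scalar comparison of ASCII bytes against ASCII chars.
    public static bool EqualsBytesChars(ReadOnlySpan<byte> left, ReadOnlySpan<char> right)
    {
        if (left.Length != right.Length)
        {
            return false;
        }

        for (int i = 0; i < right.Length; i++)
        {
            char c = right[i];
            byte b = left[i]; // always in range: the lengths were compared above

            // Elements are equal only when the values match and are ASCII (<= 0x7F).
            if (b != c || b > 0x7F)
            {
                return false;
            }
        }

        return true;
    }
}
```

Because the lengths are compared up front, indexing `left` with a loop variable bounded by `right.Length` can never go out of range, which is the reasoning behind not adding a further explicit check.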

jeffhandley · 2023-04-21T19:43:26Z

src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Equality.cs

+        {
+            Debug.Assert(left.Length == right.Length);
+
+            for (int i = 0; i < left.Length; i++)


Perhaps vectorization here would be a further perf opportunity that was missed. I recommend we proceed with merging this without it and potentially follow up with vectorizing it in a separate PR.

gfoidl · 2023-04-21T19:51:42Z

@jeffhandley yes I'll do a follow up PR to address the perf-points (maybe soon, otherwise in about 2 weeks as rough timeframe).
Bring this PR in...

jeffhandley · 2023-04-21T19:52:24Z

Awesome; thank you! I'll merge when green.

gfoidl · 2023-05-08T18:37:34Z

@jeffhandley follow up PR is here: #85926

ghost assigned adamsitnik Apr 15, 2023

dotnet-issue-labeler bot added the needs-area-label and new-api-needs-documentation labels Apr 15, 2023

adamsitnik added the area-System.Text.Encoding label and removed the needs-area-label label Apr 15, 2023

Ascii.Equals and Ascii.EqualsIgnoreCase

2958544

adamsitnik force-pushed the asciiEquals branch from 2fa7e15 to 2958544 Compare April 15, 2023 17:34

build-analysis bot mentioned this pull request Apr 15, 2023

WasmTestOnBrowser-System.Text.Json.Tests.WorkItemExecution timing out #84434

Closed

MihaZupan reviewed Apr 15, 2023

src/libraries/System.Net.Http/src/System/Net/Http/ByteArrayHelpers.cs (outdated)

adamsitnik added 2 commits April 16, 2023 15:54

add Equals(byte, byte) and Equals(char, char)

4c01430

address code review feedback: remove ByteArrayHelpers type, use Ascii…

7a39700

… directly

jeffhandley reviewed Apr 16, 2023

gfoidl reviewed Apr 17, 2023

Apply suggestions from code review

896feb6

Co-authored-by: Günther Foidl <gue@korporal.at>

build-analysis bot mentioned this pull request Apr 21, 2023

[wasm] interpreter timeouts when WebSocket closes unexpectedly #84101

Closed

address code review and API review feedback

d72e89b

adamsitnik marked this pull request as ready for review April 21, 2023 18:23

adamsitnik requested review from jeffhandley and stephentoub April 21, 2023 18:25

Merge remote-tracking branch 'upstream/main' into asciiEquals

933c3e7

jeffhandley approved these changes Apr 21, 2023

build-analysis bot mentioned this pull request Apr 21, 2023

Tracking issue for CI build timeouts #76454

Closed

jeffhandley merged commit 880d44c into dotnet:main Apr 22, 2023
167 checks passed

gfoidl added a commit to gfoidl/dotnet-runtime that referenced this pull request May 8, 2023

Optimized Ascii.Equals

d67ae7a

Cf. dotnet#84886 (comment)

gfoidl mentioned this pull request May 8, 2023

Optimizations for Ascii.Equals and Ascii.EqualsIgnoreCase #85926

Merged

dotnet locked as resolved and limited conversation to collaborators Jun 7, 2023
