Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetIndexOfFirstNonAsciiByte_Vector path in Ascii.Utility.cs is never exercised for AdvSimd/SSE4.1 #89924

Open
neon-sunset opened this issue Aug 3, 2023 · 4 comments
Labels
Milestone

Comments

@neon-sunset
Copy link
Contributor

neon-sunset commented Aug 3, 2023

Description

It appears that GetIndexOfFirstNonAsciiByte in https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs#L104 does not exercise the GetIndexOfFirstNonAsciiByte_Vector path despite it having explicit support for AdvSimd and SSE4.1.

Is this intended? Should SSE4.1 and AdvSimd targets be allowed to take _Vector instead?

Regression?

No

Data

Preliminary benchmark shows that on osx-arm64 apple-m1 the time to traverse 1_000_000 bytes is 28.71us with current _Intrinsified path and is improved by ~40% to 20.12us if we relax the check and allow calling GetIndexOfFirstNonAsciiByte_Vector for Vector128.IsHardwareAccelerated.

@neon-sunset neon-sunset added the tenet-performance Performance related issue label Aug 3, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 3, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 3, 2023
@EgorBo
Copy link
Member

EgorBo commented Aug 3, 2023

Looks like a regression (or leftover) from #88532 cc @anthonycanino

@MihaZupan MihaZupan added area-System.Text.Encoding and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 3, 2023
@ghost
Copy link

ghost commented Aug 3, 2023

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

It appears that GetIndexOfFirstNonAsciiByte in https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs#L104 does not exercise the GetIndexOfFirstNonAsciiByte_Vector path despite it having explicit support for AdvSimd and SSE4.1.

Is this intended? Should SSE4.1 and AdvSimd targets be allowed to take _Vector instead?

Regression?

No.

Data

Preliminary benchmark shows that on osx-arm64 apple-m1 the time to traverse 1_000_000 bytes is 28.71us with current _Intrinsified path and is improved by ~40% to 20.12us if we relax the check and allow calling GetIndexOfFirstNonAsciiByte_Vector for Vector128.IsHardwareAccelerated.

Author: neon-sunset
Assignees: -
Labels:

area-System.Text.Encoding, tenet-performance, untriaged

Milestone: -

@eiriktsarpalis eiriktsarpalis added this to the 8.0.0 milestone Aug 9, 2023
@eiriktsarpalis eiriktsarpalis removed the untriaged New issue has not been triaged by the area owner label Aug 9, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 14, 2023
@stephentoub
Copy link
Member

stephentoub commented Aug 14, 2023

Regression? Yes, vs .NET 7

@neon-sunset, @tannergooding pointed out to me offline it was using the _Intrinsified path in .NET 7:

public static unsafe nuint GetIndexOfFirstNonAsciiByte(byte* pBuffer, nuint bufferLength)
{
// If SSE2 is supported, use those specific intrinsics instead of the generic vectorized
// code below. This has two benefits: (a) we can take advantage of specific instructions like
// pmovmskb which we know are optimized, and (b) we can avoid downclocking the processor while
// this method is running.
return (Sse2.IsSupported || (AdvSimd.Arm64.IsSupported && BitConverter.IsLittleEndian))
? GetIndexOfFirstNonAsciiByte_Intrinsified(pBuffer, bufferLength)
: GetIndexOfFirstNonAsciiByte_Default(pBuffer, bufferLength);
}

Did you measure it to be a regression from .NET 7, and if so, can you share the benchmark and your resulting numbers?

Or are you just saying it's not as good in .NET 8 as it could be but it's not a regression?

EDIT: Tanner just commented on the PR with similar feedback:
#90527 (comment)

@neon-sunset
Copy link
Contributor Author

Or are you just saying it's not as good in .NET 8 as it could be but it's not a regression?

Yes, it appears to not be a regression because I mistakenly assumed the _Vector path was present in .NET 7. I have updated the description.

@stephentoub stephentoub modified the milestones: 8.0.0, 9.0.0 Aug 14, 2023
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants