Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ascii.Equals #84887

Closed
adamsitnik opened this issue Apr 15, 2023 · 5 comments · Fixed by #84886
Closed

Ascii.Equals #84887

adamsitnik opened this issue Apr 15, 2023 · 5 comments · Fixed by #84886
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Encoding
Milestone

Comments

@adamsitnik
Copy link
Member

adamsitnik commented Apr 15, 2023

Background and motivation

Based on an ASP.NET and dotnet/runtime code study, I concluded that from all of the methods that were initially proposed in #28230 and were reverted in 0d69abd before #75012 got merged, we need only Equals and EqualsIgnoreCase.

The tricky part was deciding what should happen when both input values are equal, but they are not valid ASCII. Example:

Ascii.Equals(new byte[] { 128 }, new char[] { (char)128 });

The initial proposal (#28230) suggested to throw. We have received a very strong push back about it. Based on real-life use cases from dotnet/runtime and ASP.NET:

https://github.com/dotnet/aspnetcore/blob/8968058c9e5fdfdd1242426a03dc80609997edab/src/Http/Routing/src/Matching/Ascii.cs#L16
https://github.com/dotnet/aspnetcore/blob/8968058c9e5fdfdd1242426a03dc80609997edab/src/Shared/ServerInfrastructure/StringUtilities.cs#L420

together with @stephentoub and @GrabYourPitchforks we came to agreement, that these methods should be used only when at least one of the input buffers is guaranteed to be valid ASCII (common use case: comparing user input with a const value) and the equality checks should return false in such case.

API Proposal

namespace System.Text;

public static partial class Ascii
{
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool Equals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right);
    public static bool Equals(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    public static bool Equals(ReadOnlySpan<char> left, ReadOnlySpan<char> right);
    
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters, ignoring case considerations.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal ignoring case considerations and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<char> left, ReadOnlySpan<char> right);
}

Alternative Designs

Equals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right) and Equals(ReadOnlySpan<char> left, ReadOnlySpan<char> right) could be removed, as they can be expressed by calling left.SequenceEqual(right) with Ascii.IsValid calls if needed.

@adamsitnik adamsitnik added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Encoding labels Apr 15, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Apr 15, 2023
@ghost
Copy link

ghost commented Apr 15, 2023

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Based on an ASP.NET and dotnet/runtime code study, I concluded that from all of the methods that were initially proposed in #28230 and were reverted in 0d69abd before #75012 got merged, we need only Equals and EqualsIgnoreCase.

The tricky part was deciding what should happen when both input values are equal, but they are not valid ASCII. Example:

Ascii.Equals(new byte[] { 128 }, new char[] { (char)128 });

The initial proposal (#28230) suggested to throw. We have received a very strong push back about it. Based on real-life use cases from dotnet/runtime and ASP.NET:

https://github.com/dotnet/aspnetcore/blob/8968058c9e5fdfdd1242426a03dc80609997edab/src/Http/Routing/src/Matching/Ascii.cs#L16
https://github.com/dotnet/aspnetcore/blob/8968058c9e5fdfdd1242426a03dc80609997edab/src/Shared/ServerInfrastructure/StringUtilities.cs#L420

together with @stephentoub and @GrabYourPitchforks we came to agreement, that these methods should be used only when at least one of the input buffers is guaranteed to be valid ASCII (common use case: comparing user input with a const value) and the equality checks should return false in such case.

API Proposal

namespace System.Text;

public static partial class Ascii
{
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool Equals(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters, ignoring case considerations.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal ignoring case considerations and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<char> left, ReadOnlySpan<char> right);
}

The proposal does not include Equals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right) and Equals(ReadOnlySpan<char> left, ReadOnlySpan<char> right) methods, as they can be expressed by calling left.SequenceEqual(right) with Ascii.IsValid calls if needed.

Author: adamsitnik
Assignees: -
Labels:

api-suggestion, area-System.Text.Encoding

Milestone: -

@stephentoub stephentoub added blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Apr 15, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Apr 15, 2023
@jeffhandley
Copy link
Member

The proposal does not include Equals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right) and Equals(ReadOnlySpan<char> left, ReadOnlySpan<char> right) methods, as they can be expressed by calling left.SequenceEqual(right)

I suggest we provide those two methods so that Equals and EqualsIgnoreCase have the same set of overloads, improving discoverability and consistency of usage, even if those two methods simply call SequenceEqual.

@adamsitnik
Copy link
Member Author

@jeffhandley good point, I've updated the proposal and the PR

@tarekgh tarekgh added this to the 8.0.0 milestone Apr 17, 2023
@bartonjs
Copy link
Member

bartonjs commented Apr 20, 2023

Video

  • We added <char>,<byte> overloads so we weren't opinionated on the order of UTF-16 and ASCII inputs.
  • We discussed collapsing the two methods groups into one with a bool parameter (or StringComparison),
    but decided that it was better as-proposed.
  • We've not yet had a public method named EqualsIgnoreCase, but we were all sort of surprised by that, and believe it's the best name.
namespace System.Text;

public static partial class Ascii
{
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool Equals(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right);
    public static bool Equals(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    public static bool Equals(ReadOnlySpan<char> left, ReadOnlySpan<byte> right);
    public static bool Equals(ReadOnlySpan<char> left, ReadOnlySpan<char> right);
    
    /// <summary>
    /// Determines whether the provided buffers contain equal ASCII characters, ignoring case considerations.
    /// </summary>
    /// <param name="left">The buffer to compare with <paramref name="right" />.</param>
    /// <param name="right">The buffer to compare with <paramref name="left" />.</param>
    /// <returns><see langword="true" /> if the corresponding elements in <paramref name="left" /> and <paramref name="right" /> were equal ignoring case considerations and ASCII. <see langword="false" /> otherwise.</returns>
    /// <remarks>If both buffers contain equal, but non-ASCII characters, the method returns <see langword="false" />.</remarks>
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<byte> left, ReadOnlySpan<char> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<char> left, ReadOnlySpan<byte> right);
    public static bool EqualsIgnoreCase(ReadOnlySpan<char> left, ReadOnlySpan<char> right);
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed blocking Marks issues that we want to fast track in order to unblock other important work api-ready-for-review API is ready for review, it is NOT ready for implementation labels Apr 20, 2023
@mphelt
Copy link
Contributor

mphelt commented Apr 21, 2023

I know it's pretty late for the party, but shouldn't EqualsIgnoreCase methods be rather put inside nested IgnoreCase type as just Equals?

@dotnet dotnet locked as resolved and limited conversation to collaborators May 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Encoding
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants