[API Proposal]: MemoryExtensions.TryWrite for UTF8 #79376

Closed
stephentoub opened this issue Dec 8, 2022 · 8 comments · Fixed by #83852
Labels: api-approved, area-System.Buffers

stephentoub (Member) commented Dec 8, 2022

Background and motivation

In .NET 6, as part of rolling out the improved support for string interpolation, we added some TryWrite extension methods to MemoryExtensions that enable interpolating directly into a Span<char>, e.g.

Span<char> span = ...;
bool wrote = span.TryWrite($"The current day/time is {DateTime.Now}", out int charsWritten);

We should consider enabling the same capability for writing UTF8 into a Span<byte>.
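
As a point of reference, here is a minimal sketch of the existing UTF16 API in use, assuming .NET 6 or later. The `CharTryWriteDemo` class and its `Format` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class CharTryWriteDemo
{
    // Hypothetical helper: interpolates "id=<id>" directly into the caller's buffer.
    // Returns the written text, or null when the buffer is too small.
    public static string? Format(Span<char> destination, int id)
    {
        // No intermediate string is allocated for the interpolation itself;
        // the handler writes straight into 'destination'.
        return destination.TryWrite(CultureInfo.InvariantCulture, $"id={id}", out int charsWritten)
            ? new string(destination[..charsWritten])
            : null;
    }
}
```

When the text does not fit, `TryWrite` returns false rather than throwing, so the caller can retry with a larger buffer.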

API Proposal

namespace System
{
    public static class MemoryExtensions
    {
+        public static bool TryWrite(this Span<byte> utf8Destination, [InterpolatedStringHandlerArgument("utf8Destination")] ref TryWriteUtf8InterpolatedStringHandler handler, out int bytesWritten);
+        public static bool TryWrite(this Span<byte> utf8Destination, IFormatProvider? provider, [InterpolatedStringHandlerArgument("utf8Destination", "provider")] ref TryWriteUtf8InterpolatedStringHandler handler, out int bytesWritten);

+        [EditorBrowsable(EditorBrowsableState.Never)]
+        [InterpolatedStringHandlerAttribute]
+        public ref struct TryWriteUtf8InterpolatedStringHandler
+        {
+            public TryWriteUtf8InterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, out bool shouldAppend);
+            public TryWriteUtf8InterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, IFormatProvider? provider, out bool shouldAppend);

+            public bool AppendLiteral(string value);
+            // public bool AppendLiteral(ReadOnlySpan<byte> value); // if the C# compiler supports interpolation with u8 literals

+            public bool AppendFormatted(scoped ReadOnlySpan<char> value);
+            public bool AppendFormatted(scoped ReadOnlySpan<char> value, int alignment = 0, string? format = null);

+            public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value);
+            public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value, int alignment = 0, string? format = null);

+            public bool AppendFormatted<T>(T value);
+            public bool AppendFormatted<T>(T value, string? format);
+            public bool AppendFormatted<T>(T value, int alignment);
+            public bool AppendFormatted<T>(T value, int alignment, string? format);

+            public bool AppendFormatted(object? value, int alignment = 0, string? format = null);

+            public bool AppendFormatted(string? value);
+            public bool AppendFormatted(string? value, int alignment = 0, string? format = null);
+        }
    }
}

This is essentially the exact same design as for MemoryExtensions.TryWrite, with a few specific differences:

  • The TryWrite methods take a Span<byte> instead of a Span<char>.
  • In addition to the ReadOnlySpan<char>-based AppendFormatted methods, there are also ReadOnlySpan<byte>-based AppendFormatted methods, which expect UTF8 data, such as a u8 literal. The ReadOnlySpan<byte> data would be memcpy'd into the destination, whereas the ReadOnlySpan<char> data would be Encoding.UTF8-encoded into the destination.
  • The implementation of AppendFormatted<T> will check for IUtf8SpanFormattable before it checks for ISpanFormattable, preferring to use a type's built-in UTF8 formatting support if supplied.
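
The differences listed above can be sketched roughly as follows. This is not the runtime's implementation, only an illustration that assumes .NET 8 or later (where IUtf8SpanFormattable exists). The `Utf8AppendSketch` class, its method names, and the 256-char scratch buffer are all hypothetical.

```csharp
using System;
using System.Text;

public static class Utf8AppendSketch
{
    // Hypothetical dispatch order for a formatted value (alignment and format omitted).
    public static bool TryAppend<T>(T value, Span<byte> destination, out int bytesWritten)
    {
        // 1. Prefer the type's own UTF8 formatting: no transcoding needed.
        if (value is IUtf8SpanFormattable utf8Formattable)
        {
            return utf8Formattable.TryFormat(destination, out bytesWritten, default, null);
        }

        // 2. Otherwise format as UTF16 into scratch space, then transcode.
        if (value is ISpanFormattable utf16Formattable)
        {
            Span<char> scratch = stackalloc char[256]; // assumed size, illustration only
            if (utf16Formattable.TryFormat(scratch, out int charsWritten, default, null))
            {
                return TryTranscode(scratch[..charsWritten], destination, out bytesWritten);
            }
        }

        // 3. Last resort: ToString() and transcode the resulting string.
        return TryTranscode(value?.ToString(), destination, out bytesWritten);
    }

    // ReadOnlySpan<char> input: UTF16 data is encoded to UTF8 in the destination.
    public static bool TryTranscode(ReadOnlySpan<char> text, Span<byte> destination, out int bytesWritten)
    {
        if (Encoding.UTF8.GetByteCount(text) > destination.Length)
        {
            bytesWritten = 0;
            return false;
        }

        bytesWritten = Encoding.UTF8.GetBytes(text, destination);
        return true;
    }

    // ReadOnlySpan<byte> input: already UTF8, so a plain copy is enough.
    public static bool TryCopyUtf8(ReadOnlySpan<byte> utf8Value, Span<byte> destination, out int bytesWritten)
    {
        bool copied = utf8Value.TryCopyTo(destination);
        bytesWritten = copied ? utf8Value.Length : 0;
        return copied;
    }
}
```

The point of checking IUtf8SpanFormattable first is that types such as int can write their digits as bytes directly, skipping the UTF16 scratch buffer and the transcoding step.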

Also note that the AppendLiteral method takes a string value, as that's what's supported by the C# language. If C# were to ever support u8 literals in string interpolation, we could add an appropriate AppendLiteral(ReadOnlySpan<byte> utf8Value) overload.
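
To see why AppendLiteral receives a string, it may help to look at roughly what the compiler generates for an interpolated string argument. The sketch below calls the existing UTF16 handler by hand, assuming .NET 6 or later; the real lowering differs in detail, and the `LoweringSketch` class and `WriteManually` method are hypothetical names.

```csharp
using System;

public static class LoweringSketch
{
    // Hand-written approximation of: buffer.AsSpan().TryWrite($"x = {x}", out int charsWritten)
    public static string? WriteManually(char[] buffer, int x)
    {
        Span<char> destination = buffer;

        // literalLength: 4 ("x = "), formattedCount: 1 ({x}).
        var handler = new MemoryExtensions.TryWriteInterpolatedStringHandler(
            4, 1, destination, out bool shouldAppend);

        if (shouldAppend)
        {
            // Each literal segment arrives as a string; each hole as a typed value.
            // Appending stops at the first call that reports failure.
            _ = handler.AppendLiteral("x = ") && handler.AppendFormatted(x);
        }

        return destination.TryWrite(ref handler, out int charsWritten)
            ? new string(destination[..charsWritten])
            : null;
    }
}
```

In a UTF8 handler, each such string literal has to be transcoded at run time, which is what an AppendLiteral(ReadOnlySpan<byte>) overload would avoid if the language could pass u8 data for the literal segments.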

API Usage

Span<byte> utf8 = ...;
bool wrote = utf8.TryWrite($"The current day/time is {DateTime.Now}", out int bytesWritten);

Alternative Designs

No response

Risks

No response

stephentoub added the api-suggestion and area-System.Buffers labels Dec 8, 2022
ghost added the untriaged label Dec 8, 2022
ghost commented Dec 8, 2022

Tagging subscribers to this area: @dotnet/area-system-buffers
See info in area-owners.md if you want to be subscribed.


stephentoub (Member, Author) commented Dec 8, 2022

cc: @333fred for string interpolation
cc: @tannergooding for generic math interfaces, since we discussed possible future UTF8 support for something ISpanFormattable-like

Zintom commented Dec 8, 2022

Love it!

333fred (Member) commented Dec 8, 2022

@stephentoub was there a specific thing you wanted me to comment on, or just awareness?

stephentoub (Member, Author) commented

Just awareness.

dakersnar added the api-ready-for-review label and removed the api-suggestion and untriaged labels Feb 10, 2023
dakersnar added this to the 8.0.0 milestone Feb 10, 2023
stephentoub changed the title from "[API Proposal]: Utf8Formatter.TryWrite" to "[API Proposal]: MemoryExtensions.TryWrite for UTF8" Mar 16, 2023
stephentoub (Member, Author) commented

terrajobst (Member) commented Mar 21, 2023

Video

  • We should rename the method to make it clear that the buffer receives the UTF8 encoding, not the UTF16 encoding
  • We're OK with supporting both UTF8 and UTF16, with UTF16 potentially generating a warning telling the user that they should use a u8 literal to avoid the transcoding
  • Will C# support interpolating strings with the u8 suffix? Seems like it should? @dotnet/ldm
  • We should consider making it so that the code still compiles if one removes the $, but this would apply to the existing UTF16 version as well

namespace System;

public static partial class MemoryExtensions
{
    public static bool TryWriteUtf8(this Span<byte> destination, [InterpolatedStringHandlerArgument("destination")] ref TryWriteUtf8InterpolatedStringHandler handler, out int bytesWritten);
    public static bool TryWriteUtf8(this Span<byte> destination, IFormatProvider? provider, [InterpolatedStringHandlerArgument("destination", "provider")] ref TryWriteUtf8InterpolatedStringHandler handler, out int bytesWritten);

    [EditorBrowsable(EditorBrowsableState.Never)]
    [InterpolatedStringHandlerAttribute]
    public ref struct TryWriteUtf8InterpolatedStringHandler
    {
        public TryWriteUtf8InterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, out bool shouldAppend);
        public TryWriteUtf8InterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, IFormatProvider? provider, out bool shouldAppend);

        public bool AppendLiteral(string value);
        // public bool AppendLiteral(ReadOnlySpan<byte> value); // if the C# compiler supports interpolation with u8 literals

        public bool AppendFormatted(scoped ReadOnlySpan<char> value);
        public bool AppendFormatted(scoped ReadOnlySpan<char> value, int alignment = 0, string? format = null);

        public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value);
        public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value, int alignment = 0, string? format = null);

        public bool AppendFormatted<T>(T value);
        public bool AppendFormatted<T>(T value, string? format);
        public bool AppendFormatted<T>(T value, int alignment);
        public bool AppendFormatted<T>(T value, int alignment, string? format);

        public bool AppendFormatted(object? value, int alignment = 0, string? format = null);

        public bool AppendFormatted(string? value);
        public bool AppendFormatted(string? value, int alignment = 0, string? format = null);
    }
}

terrajobst added the api-approved label and removed the api-ready-for-review label Mar 21, 2023
stephentoub self-assigned this Mar 21, 2023
ghost added the in-pr label Mar 22, 2023
bartonjs added the api-ready-for-review label and removed the api-approved label Mar 23, 2023
bartonjs (Member) commented Mar 23, 2023

Video

  • We revisited this and decided to drop the extension methods and instead expose plain static methods on System.Text.Unicode.Utf8

namespace System.Text.Unicode;

public static partial class Utf8
{
    public static bool TryWrite(Span<byte> destination, [InterpolatedStringHandlerArgument("destination")] ref TryWriteInterpolatedStringHandler handler, out int bytesWritten);
    public static bool TryWrite(Span<byte> destination, IFormatProvider? provider, [InterpolatedStringHandlerArgument("destination", "provider")] ref TryWriteInterpolatedStringHandler handler, out int bytesWritten);

    [EditorBrowsable(EditorBrowsableState.Never)]
    [InterpolatedStringHandlerAttribute]
    public ref struct TryWriteInterpolatedStringHandler
    {
        public TryWriteInterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, out bool shouldAppend);
        public TryWriteInterpolatedStringHandler(int literalLength, int formattedCount, Span<byte> utf8Destination, IFormatProvider? provider, out bool shouldAppend);

        public bool AppendLiteral(string value);
        // public bool AppendLiteral(ReadOnlySpan<byte> value); // if the C# compiler supports interpolation with u8 literals

        public bool AppendFormatted(scoped ReadOnlySpan<char> value);
        public bool AppendFormatted(scoped ReadOnlySpan<char> value, int alignment = 0, string? format = null);

        public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value);
        public bool AppendFormatted(scoped ReadOnlySpan<byte> utf8Value, int alignment = 0, string? format = null);

        public bool AppendFormatted<T>(T value);
        public bool AppendFormatted<T>(T value, string? format);
        public bool AppendFormatted<T>(T value, int alignment);
        public bool AppendFormatted<T>(T value, int alignment, string? format);

        public bool AppendFormatted(object? value, int alignment = 0, string? format = null);

        public bool AppendFormatted(string? value);
        public bool AppendFormatted(string? value, int alignment = 0, string? format = null);
    }
}
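
For illustration, a call against the shape approved above might look like the following. This is a minimal sketch that assumes .NET 8 or later, where System.Text.Unicode.Utf8.TryWrite is available; the `Utf8TryWriteDemo` class and its `Format` method are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.Unicode;

public static class Utf8TryWriteDemo
{
    // Hypothetical helper: interpolates directly into a UTF8 byte buffer.
    // Returns the decoded text for inspection, or null when the buffer is too small.
    public static string? Format(Span<byte> destination, int id, string name)
    {
        // Plain static call: Utf8.TryWrite(destination, ...) instead of destination.TryWrite(...).
        bool wrote = Utf8.TryWrite(
            destination,
            CultureInfo.InvariantCulture,
            $"id={id}, name={name}",
            out int bytesWritten);

        return wrote ? Encoding.UTF8.GetString(destination[..bytesWritten]) : null;
    }
}
```

Because the methods are no longer extensions on Span<byte>, there is no overload clash with the UTF16 TryWrite on Span<char>, and the Utf8 class name makes the target encoding explicit at the call site.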

bartonjs added the api-approved label and removed the api-ready-for-review label Mar 23, 2023
ghost removed the in-pr label Mar 23, 2023
ghost added the in-pr label Mar 23, 2023
ghost removed the in-pr label Apr 6, 2023
ghost locked as resolved and limited conversation to collaborators May 6, 2023