Add flexible ISO 8601 support to Utf8Parser/Formatter (and beyond?) #28942
Comments
If this format is as common as is cited, then I'd say yes. If there's value in adding support to the UTF8-based implementation, there's similar value in adding support to the UTF16-based implementation.
I've created a new logger that only supports writing directly to UTF-8, in order to support zero-allocation formatting. The DateTime format is important when creating a human-readable log. For example, I'd be happy to write something like this:
// Adding the log's prefix
logging.AddZLoggerConsole(options =>
{
// Action<IBufferWriter<byte>, LogInfo>?
options.PrefixFormatter = (writer, info) => ZString.Utf8Format(writer, "[{0}][{1:yyyy-MM-dd HH:mm:ss}]", info.LogLevel, info.Timestamp.DateTime.ToLocalTime());
});
// {1:yyyy-MM-dd HH:mm:ss} will translate to
// Utf8Formatter.TryFormat(info.Timestamp, dest, out var written, StandardFormat.Parse("yyyy-MM-dd HH:mm:ss"));
// If you can set this option, you will get the following message
// [Debug][2020-04-09 07:56:12]foo
logger.ZLogDebug("foo");
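For comparison with the custom pattern wished for above, here is a minimal sketch of what `Utf8Formatter` can produce with its existing single-character standard formats. As far as I understand, `StandardFormat` holds only a format symbol plus a precision, so a pattern string such as `yyyy-MM-dd HH:mm:ss` is not something it can express today. The class name `Utf8TimestampDemo` and the helper `FormatUtf8` are hypothetical names used only for illustration.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Text;

public static class Utf8TimestampDemo
{
    // Formats a DateTime as UTF-8 using one of the standard format symbols
    // that Utf8Formatter accepts for DateTime (for example 'G', 'O', 'R'),
    // then decodes the bytes so the result is easy to inspect.
    public static string FormatUtf8(DateTime value, char symbol)
    {
        // 64 bytes is more than enough for any of the standard formats.
        Span<byte> buffer = stackalloc byte[64];

        if (!Utf8Formatter.TryFormat(value, buffer, out int written, new StandardFormat(symbol)))
        {
            throw new InvalidOperationException("Destination buffer too small.");
        }

        return Encoding.UTF8.GetString(buffer.Slice(0, written));
    }
}
```

Calling `Utf8TimestampDemo.FormatUtf8(new DateTime(2020, 4, 9, 7, 56, 12, DateTimeKind.Utc), 'O')` yields `2020-04-09T07:56:12.0000000Z`, and the same value with `'G'` yields `04/09/2020 07:56:12`. Neither matches the `2020-04-09 07:56:12` shape from the log prefix above, which is the gap this comment is pointing at.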
.... officially 8601 allows for leap seconds, so the inclusive range should be 00-60. This is mostly fine as
Sorry for misusing this issue about an So, although running the risk of duplicating issues that you fixed in
And can I ask for Thumbs up for extracting what
In particular, Java (and NodaTime, and probably some databases) supports storing and passing nanosecond precision, so 9 digits would help with interoperability. (In terms of a maximum limit, the 19 digits that would be available if NTP shifted to a 64-bit fractional second should likely be enough, although even that is overkill.) However, note that actually and accurately getting a timestamp that precise requires special hardware; no consumer or server hardware is going to do it, and the best that can be done is 7 digits (in most cases; some hardware or older software stacks may still be limited to fewer than 3). Doing this may also play havoc with round-tripping (because ingestion could only accept 7 digits, and so would return truncated values).
Sure, I was not talking about getting the precision. That requires not using I'm just talking about accepting and producing the required format that can be expressed through the existing API, truncating or rounding to 7 digits when parsing and producing zero padding on formatting.
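The truncating half of that suggestion could look roughly like the sketch below. It assumes the caller has already isolated the fractional-second digits as UTF-8 bytes, and it truncates rather than rounds; the names `FractionDemo` and `TryParseFractionToTicks` are hypothetical and do not correspond to any existing API.

```csharp
using System;

public static class FractionDemo
{
    // Converts the digits that follow the decimal point of a seconds field
    // into .NET ticks (100 ns units). Digits beyond the 7th are validated
    // but otherwise ignored (truncation); shorter inputs are scaled up as if
    // they had been zero-padded to 7 digits.
    public static bool TryParseFractionToTicks(ReadOnlySpan<byte> digits, out int ticks)
    {
        ticks = 0;
        if (digits.IsEmpty)
        {
            return false; // the profile requires at least one fraction digit
        }

        int used = 0;
        foreach (byte b in digits)
        {
            if (b < (byte)'0' || b > (byte)'9')
            {
                ticks = 0;
                return false; // not a decimal digit
            }

            if (used < 7)
            {
                ticks = ticks * 10 + (b - '0');
                used++;
            }
        }

        // Scale short fractions: ".5" means 5,000,000 ticks.
        for (; used < 7; used++)
        {
            ticks *= 10;
        }

        return true;
    }
}
```

With this sketch, the nine digits `123456789` produce 1234567 ticks and the single digit `5` produces 5000000 ticks. Formatting in the other direction would append two zeros to reach nine digits, which is the round-tripping caveat raised in the previous comment.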
Is there yet a consensus on whether this should be implemented, as the issue is still open but with a suggestion-label? For what it's worth, I think it's an important missing piece in Utf8Parser.
This proposal stems from a discussion (dotnet/coreclr#22999) regarding adding a new format specifier “I” to the UTF8 DateTime(Offset) parser (and subsequently formatter) for ISO 8601 strings. The new format was to be used by the Utf8JsonReader/Writer. Given feedback, it is clear that adding a new format specifier requires further investigation and design.
The workaround for the Utf8JsonReader/Writer was to implement the parsing and formatting logic internally.
We are looking to add performant UTF-8 DateTime(Offset) read/write support (https://github.com/dotnet/corefx/issues/34690, https://github.com/dotnet/corefx/issues/34576) to the Utf8JsonReader/Writer for the following ISO 8601 profile:
YYYY-MM-DD[Thh:mm[:ss[.s]][TZD]]
where:
YYYY = four-digit year
MM = two-digit month (01=January, etc.)
DD = two-digit day of month (01 through 31)
T = ‘T’ or ‘ ’
hh = two digits of hour (00 through 23) (am/pm NOT allowed)
mm = two digits of minute (00 through 59)
ss = two digits of second (00 through 59)
s = one or more digits representing a decimal fraction of a second
TZD = time zone designator (Z or +hh:mm or -hh:mm or +hhmm or -hhmm or +hh or -hh)
Rationale and Usage
ISO 8601 is an unambiguous and prolific date and time standard, important for interoperating with different systems/languages.
As part of 3.0 we are adding the new Utf8JsonReader/Writer/Document types. We have implemented (Try)GetDateTime(Offset) and WriteString(Value) methods on the Json Reader/Writer types (dotnet/corefx#35903, dotnet/corefx#35966) to read and write ISO 8601 DateTime(Offset) strings. The parsing/formatting logic is currently internal to the types:
Utf8JsonReader: https://github.com/dotnet/corefx/blob/ef1b1835a8459cee50d895b1e2040bb2336eeeda/src/System.Text.Json/src/System/Text/Json/Reader/JsonReaderHelper.Date.cs#L43-L357
Utf8JsonWriter: https://github.com/dotnet/corefx/blob/52f3ad9f5f1276833fe8f53c79b5a995c8865df9/src/System.Text.Json/src/System/Text/Json/Writer/JsonWriterHelper.Date.cs#L22-L126
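To show how the reader and writer methods mentioned above fit together, here is a small round-trip sketch using `Utf8JsonReader.GetDateTimeOffset` and `Utf8JsonWriter.WriteStringValue`. The wrapper class `JsonDateDemo` and its two helpers are hypothetical names for illustration only, and the sketch treats a bare JSON string as the whole document to keep it short.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

public static class JsonDateDemo
{
    // Reads a JSON document consisting of a single string token and
    // interprets that token as an ISO 8601 DateTimeOffset.
    public static DateTimeOffset ReadDate(string json)
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));
        reader.Read(); // advance to the string token
        return reader.GetDateTimeOffset(); // throws FormatException if not parseable
    }

    // Writes a DateTimeOffset as a single JSON string value and returns
    // the resulting JSON text.
    public static string WriteDate(DateTimeOffset value)
    {
        var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStringValue(value);
        } // disposing the writer flushes it to the stream

        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```

For example, `JsonDateDemo.ReadDate("\"2019-01-30T12:01:02.5+01:00\"")` returns 30 January 2019, 12:01:02.5 at offset +01:00, and passing that value through `WriteDate` and back through `ReadDate` gives an equal `DateTimeOffset`. Callers outside Json currently have no equivalent on `Utf8Parser`/`Utf8Formatter`, which is what motivates the proposal.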
The proposal for adding ISO 8601 support is to add a new custom format specifier to the Utf8Parser/Formatter for the (Try)GetDateTime(Offset) methods to depend on (in place of the internal logic) with the following specifications. This will be particularly beneficial if there is a need for flexible ISO 8601 processing beyond Json.
API (No changes)
API methods for DateTime reading and writing have already been approved.
These methods could use the already existing Utf8Parser/Formatter types for parsing and formatting DateTime data to replace the internally implemented parse/format logic.
Details
Of note are the format and standardFormat parameters of the TryParse and TryFormat methods of the Utf8Parser/Formatter. The current options for the format are “G”/standard, “O”, and “R”. Of these, only “O” implements a profile of the ISO 8601 standard, the strict Round-trip format specifier: yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffK. Most ISO 8601 adopters do not use this profile. Thus, it is beneficial to implement a new format which is more permissive and also very performant. We can call it “I”.
Open Questions
... Utf8JsonReader/Writer rather than rolling up to Utf8Parser/Formatter?
... Utf8Parser/Formatter types, should it be as a new API (e.g. TryParseAsISO/TryFormatToISO) rather than a new format specifier (the format/standardFormat arguments)?
... Utf8Parser/Formatter, do we also need/want to roll up support to DateTime{Offset}.{Try}Parse/Format/ToString?
Pull Requests
Utf8Parser: Add UTF8 DateTime(Offset) parser for new date and time format "I" coreclr#22999.
Utf8JsonReader: Add (Try)GetDateTime(Offset) to Utf8JsonReader corefx#35903.
Utf8JsonWriter: Add (Try)GetDateTime(Offset) to Utf8JsonReader corefx#35903.