Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: unexpected change in parseRFC3339 behaviour #55876

Closed
synenka opened this issue Sep 26, 2022 · 9 comments
Closed

time: unexpected change in parseRFC3339 behaviour #55876

synenka opened this issue Sep 26, 2022 · 9 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. release-blocker
Milestone

Comments

@synenka
Copy link

synenka commented Sep 26, 2022

What version of Go are you using (go version)?

go1.20

Does this issue reproduce with the latest release?

yes

What did you do?

ts, err := time.Parse(time.RFC3339, "2020-04-14T08:13:26--4:00")

After http://go.dev/cl/425038 change was submitted, "2020-04-14T08:13:26--4:00" date format, which previously could be parsed as RFC3339, fails to be parsed.

Here is a playground example: https://go.dev/play/p/PqyYj1t2T10 shows that this format was accepted with go1.19 but not with the latest dev version.

What did you expect to see?

time.Parse behavior is consistent and follows RFC3339 specification.

What did you see instead?

time.Parse behavior changed unexpectedly after http://go.dev/cl/425038.

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Sep 26, 2022

CC @dsnet

@ianlancetaylor ianlancetaylor added this to the Go1.20 milestone Sep 26, 2022
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Sep 26, 2022
@dsnet
Copy link
Member

dsnet commented Sep 26, 2022

Thanks for the report.

If RFC 3339 is the standard we're determining correctness by, then the old behavior was buggy and the new behavior is compliant.

The relevant grammar for RFC 3339 says:

time-hour       = 2DIGIT  ; 00-23
time-minute     = 2DIGIT  ; 00-59
time-numoffset  = ("+" / "-") time-hour ":" time-minute

From this, we can see that:

  • +00:00 is valid
  • -10:15 is valid
  • --4:00 is invalid since -4 violates the grammar for time-hour, which only permits two decimal digits
  • +24:00 is invalid since 24 exceeds the specified range of [0:23] for time-hour
  • +00:60 is invalid since 60 exceeds the specified range of [0:59] for time-minute

@apparentlymart
Copy link

apparentlymart commented Sep 27, 2022

It seems that this optimization (#54093) has implicitly introduced some (but not all) of the potential-new-strictness discussed in #54580. Was that intentional? (I see that @dsnet is the author of both.)

I think the ideas proposed in #54580 are worth considering and could give us a correct RFC 3339 parser whereas today we have only an approximation of one, but the current contract for time.Parse describes it only as an implementation of the generalized time format and doesn't say anything about particular format strings being treated as special.

I would personally prefer a change framed as an optimization to not change the documented behavior and to discuss the tradeoffs of stricter parsing separately in #54580. I maintain some software that accidentally exposed Go's previous not-quite-right implementation of RFC3339 parsing as a public interface and so while I would be pleased if Go had been stricter before that code was written I will now probably have to face the same sort of concern this issue is discussing but downstream one level. Specifically: I'm going to need to decide whether to accept this breaking change as a breaking change to my own software too, or to maintain my own parser that is "bug-compatible" with what Go used to do. That's of course my problem rather than yours 😀 but I'd prefer that tradeoff to be made in the context of your proposal issue rather than as a side-effect of a change presented only as an optimization.

@dsnet
Copy link
Member

dsnet commented Sep 27, 2022

I would personally prefer a change framed as an optimization to not change the documented behavior

CL 425038 is what changed the behavior, which does not present itself as an optimization. It presents itself as a bug fix.

In my opinion, this is fixing a real bug since it doesn't make sense for individual hour, minute, or second fields to each have a sign when the entire time zone offset already has a sign.

For example:

time.Parse(time.RFC3339, "2000-01-01T00:00:00-03:05") // -0305
time.Parse(time.RFC3339, "2000-01-01T00:00:00--3:-5") // +0305
time.Parse(time.RFC3339, "2000-01-01T00:00:00--3:+5") // +0255
time.Parse(time.RFC3339, "2000-01-01T00:00:00-+3:-5") // -0255
time.Parse(time.RFC3339, "2000-01-01T00:00:00-+3:+5") // -0305
time.Parse(time.RFC3339, "2000-01-01T00:00:00+-3:-5") // -0305
time.Parse(time.RFC3339, "2000-01-01T00:00:00+-3:+5") // -0255
time.Parse(time.RFC3339, "2000-01-01T00:00:00++3:-5") // +0255
time.Parse(time.RFC3339, "2000-01-01T00:00:00++3:+5") // +0305
time.Parse(time.RFC3339, "2000-01-01T00:00:00+03:05") // +0305

It is bizarre to allow back-to-back signs, where the scope of the very first sign is unclear in relationship to the sign present on all the individual fields.

If we want to revert CL 425038 in the name of backwards compatibility, then I feel all the more strongly that we provide a way to parse strictly according to RFC 3339 (#54580).

@bcmills bcmills added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Sep 27, 2022
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 27, 2022
@apparentlymart
Copy link

apparentlymart commented Sep 27, 2022

Knowing that this is limited only to extra signs in the timezone offset makes me feel better about keeping it, at least for my purposes. Sorry for my misunderstanding about the scope.

With that said, I do like the idea of there being a built-in way to strictly parse the RFC 3339 format and hope that will be approved independently of this issue.

@dsnet
Copy link
Member

dsnet commented Sep 27, 2022

No worries, I feel the same way as you that having the ability to strictly parse RFC 3339 is necessary when implementing protocols that do require true RFC 3339.

@heschi
Copy link
Contributor

heschi commented Oct 5, 2022

If the old behavior was a bug, and we only know of one affected user, then let's stick with the newly-compliant behavior unless there are strong objections I'm not aware of. No changes needed here as I understand it, closing.

@heschi heschi closed this as completed Oct 5, 2022
@bcmills
Copy link
Member

bcmills commented Oct 5, 2022

I agree with Heschi and Joe: this looks like a bug-fix.

From a Go compatibility perspective, the comment in https://pkg.go.dev/time@go1.19.1#pkg-constants says:

Numeric time zone offsets format as follows:

"-0700"     ±hhmm
"-07:00"    ±hh:mm
"-07"       ±hh
"-070000"   ±hhmmss
"-07:00:00" ±hh:mm:ss

Replacing the sign in the format with a Z triggers the ISO 8601 behavior of printing Z instead of an offset for the UTC zone.

Note that all of those forms include the leading - in the format token to be consumed and require a single explicit ± on the corresponding formatted value. The docs seem unambiguous here: the time.RFC3339 format string never should have accepted --4:00. (The documentation and the implementation disagreed, and here the implementation was fixed to match the existing documentation.)

@apparentlymart
Copy link

apparentlymart commented Oct 5, 2022

For what it's worth I'm sorry that I misunderstood the scope of this change at first read. My part of this was actually just me being careless: I had opened several tabs to try to understand what had changed here and then confused myself and reached the wrong conclusion.

Now that I am looking at the right set of changes, I totally agree that this was a bug that has now been fixed, and think it's perfectly reasonable to pass this same behavior change on to the users of my software with the same justification: there's no good reason to include additional sign markers inside the timezone offset components.

Of course you all don't need my permission to close this issue anyway 😀 but just wanted to affirm that this decision seems reasonable to me and I don't expect to do anything unusual in response to it other than to note the fix in my software's changelog once we upgrade to a version of Go whose standard library includes this fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. release-blocker
Projects
None yet
Development

No branches or pull requests

7 participants