-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
time: Parse too greedy when parsing some formats #12919
Comments
Basically, this means that we should change getnum to only take a single digit if taking two digits would cause the value to be out of range. Right now it always takes two digits if there are two digits, and then later checks whether they are out of range. I can't decide whether this change would be a good idea or not. |
That feels like it would invalidly parse (or attempt to) a lot of values. I think that would cause more problems that solve. Without field width, or field separators, I don't think you can solve all cases for this format (without making assumptions that may make some invalid strings valid). Thank you. |
This particular format is unambiguous, since "304 PM" indicates a 1-2 digit hour followed by exactly 2 digits for minutes, then whitespace, then an AM/PM indicator. The problem is, when you're looking at a starting digit of 1 for the hour, you don't know whether you're dealing with hour 1, 10, 11, or 12. Without the AM/PM indicator (24 hour clock), we would have the same issue with starting digits of 0 or 1. One way to resolve this would be to look at 304 as a special case, in which we gather 3-4 digits and then take the last 2 as minutes and the first 1-2 as the hour. 30405 could be treated the same way (5-6 digits). But then there's also 102, 106, 12006, 10206, 1022006, and so on. There are 5 variable width numeric fields and 5 corresponding fixed width numeric fields, plus 2 numeric year representations, so 5 * (2^4-1) * 3 + 5 * 2 = 235 unambiguous representations that we don't handle currently. Even more if you allow the variable width field to be somewhere other than at the front. Still, I can envision code that could handle all those situations gracefully. One option: when encountering a single digit format specifier, verify that the remaining adjacent numeric fields are fixed width, and look ahead to count the number of digits available. If there is an extra digit, parse 2, otherwise parse 1. If you have adjacent variable width numeric fields in the format, such as "34", this should probably be an "ambiguous layout" error, at least when parsing. |
Instead of treating all ambiguities I suggest that getnum function is doing its job perfectly only problem I see is that we need to throw a different error when input layout for HH:MM is given as H:MM / HH:M. |
What version of Go are you using (
go version
)?$ go version go version devel +2d823fd Tue Oct 13 01:04:42 2015 +0000 linux/amd64
What operating system and processor architecture are you using?
linux/amd64
What did you do?
Attempted to parse this time:
"618 AM MDT TUE OCT 13 2015"
with this format string:"304 PM MST Mon Jan 02 2006
"What did you expect to see?
No parsing error, and a time.Time representing that moment.
What did you see instead?
The parsing fails, and the error reported:
parsing time "618 AM MDT TUE OCT 13 2015": hour out of range
. The hour is interpreted as61
.The format works properly with a two digit hour. It seems
getnum()
in time/format.go takes up to two characters whenever the second character is valid which leads to the issue.Playground example with examples that fail and pass for the given format:
http://play.golang.org/p/mZ-m7gJlvy
Apologies if my code is messy...
The format is used as a time/version stamp for the US National Weather Service in their forecast discussion products, for a real world example, http://www.wrh.noaa.gov/total_forecast/getprod.php?wfo=boi&pil=AFD&sid=BOI&version=1
Thank you.
The text was updated successfully, but these errors were encountered: