strptime/1 ignores ISO-8601 TimeZone (format string "%z") #2195

Open
vintnes opened this issue Oct 15, 2020 · 5 comments

Comments

@vintnes

vintnes commented Oct 15, 2020

~ uname -a | awk '$2 = "REDACTED"'
Linux REDACTED 4.9.0-13-amd64 #1 SMP Debian 4.9.228-1 (2020-07-05) x86_64 GNU/Linux

~ jq/jq --version
jq-1.6-128-ga17dd32-dirty

~ jq/jq -nc \
' "2020-10-15T17:30:00-0400"
, "2020-10-15T17:30:00+0000"
| strptime("%FT%T%z")
'
[2020,9,15,17,30,0,4,288]
[2020,9,15,17,30,0,4,288]
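
For comparison, here is a minimal Python sketch (not jq, and not part of the original report) of what honoring `%z` would look like for the same two inputs. The helper name `parse_with_offset` is hypothetical. It relies only on the standard `datetime.strptime`, whose `%z` directive accepts offsets written as `-0400`. If the offset is respected, the two inputs should resolve to instants four hours apart rather than to identical values.

```python
from datetime import datetime, timezone


def parse_with_offset(text: str) -> datetime:
    """Parse a timestamp such as 2020-10-15T17:30:00-0400 into an aware datetime."""
    # %z consumes the numeric UTC offset, so the result carries tzinfo.
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")


for text in ("2020-10-15T17:30:00-0400", "2020-10-15T17:30:00+0000"):
    parsed = parse_with_offset(text)
    # Normalize to UTC so the effect of the offset is visible.
    utc = parsed.astimezone(timezone.utc)
    print(text, "->", utc.isoformat(), int(parsed.timestamp()))
```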
@vintnes
Author

vintnes commented Oct 16, 2020

Here is my absolutely horrific regex workaround with optional fractional seconds and colon offset.

def fromdateiso8601offset
  : capture
    ( "^"
    + "(?<datetime>[-:0-9T]{19})"
    + "(?<subseconds>\\.[0-9]+)?"  # optional
    + "(?<offset_sign>[-+])"
    + "(?<offset_hours>[0-9]{2})"
    + ":?"
    + "(?<offset_minutes>[0-9]{2})"
    + "$"
    )

  | .datetime += "Z"
  | .subseconds //= 0
  | .offset_sign += "1"  # string math ftw

  | (.subseconds, .offset_sign, .offset_hours, .offset_minutes) |= tonumber

  | (.datetime | fromdateiso8601)
  + ( ( .offset_hours * 3600
      + .offset_minutes * 60
      )
    * .offset_sign * -1  # the Earth rotates eastward
    )
  + .subseconds
  ;

@gpearce

gpearce commented Feb 16, 2021

👍 +1

echo '{"time":"2021-02-16T15:36:29+0000"}{"time":"2021-02-16T15:36:29+0100"}' | 
jq '{input: .time, hour: (.time | strptime("%Y-%m-%dT%H:%M:%S%z") | strftime("%H") ), epoch: (.time | strptime("%Y-%m-%dT%H:%M:%S %z") | mktime)}'

❌ CentOS7:

$ jq --version
jq-1.6
---
{
  "input": "2021-02-16T15:36:29 +0000",
  "hour": "15",
  "epoch": 1613489789
}
{
  "input": "2021-02-16T15:36:29 +0100",
  "hour": "15",
  "epoch": 1613489789
}

✅ MacOS (Big Sur):

$ jq --version
jq-1.6
---
{
  "input": "2021-02-16T15:36:29+0000",
  "hour": "15",
  "epoch": 1613489789
}
{
  "input": "2021-02-16T15:36:29+0100",
  "hour": "14",
  "epoch": 1613486189
}

@gpearce

gpearce commented Feb 16, 2021

@vintnes thanks for the example above! 🏅

We found that line 23 of the function should be:

* (.offset_sign * -1)

else the offset is applied backwards.

Working example:

echo '{"time":"1970-01-01T02:00:00-0100"}{"time":"1970-01-01T02:00:00+0000"}{"time":"1970-01-01T02:00:00+0100"}' | 
jq -r 'def fromdateiso8601offset
  : capture
    ( "^"
    + "(?<datetime>[-:0-9T]{19})"
    + "(?<subseconds>\\.[0-9]+)?" # optional
    + "(?<offset_sign>[-+])"
    + "(?<offset_hours>[0-9]{2})"
    + ":?"
    + "(?<offset_minutes>[0-9]{2})"
    + "$"
    )

  | .datetime += "Z"
  | .subseconds //= 0
  | .offset_sign += "1" # string math ftw

  | (.subseconds, .offset_sign, .offset_hours, .offset_minutes) |= tonumber

  | (.datetime | fromdateiso8601)
  + ( ( .offset_hours * 3600
      + .offset_minutes * 60
      )
    * (.offset_sign * -1)
    )
  + .subseconds
  ;
  {time: .time, epoch: (.time | fromdateiso8601offset), utc: (.time | fromdateiso8601offset | strftime("%Y-%m-%dT%H:%M:%S %z"))}'
{
  "time": "1970-01-01T02:00:00-0100",
  "epoch": 10800,
  "utc": "1970-01-01T03:00:00 +0000"
}
{
  "time": "1970-01-01T02:00:00+0000",
  "epoch": 7200,
  "utc": "1970-01-01T02:00:00 +0000"
}
{
  "time": "1970-01-01T02:00:00+0100",
  "epoch": 3600,
  "utc": "1970-01-01T01:00:00 +0000"
}
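
As a sanity check on the sign convention, the same arithmetic can be sketched in Python. This is an illustrative port of the regex workaround, not the jq code itself; the function name `from_iso8601_offset` and the constant `PATTERN` are hypothetical. The rule it encodes is that UTC equals local time minus the offset, so a `+0100` timestamp maps to an earlier epoch and a `-0100` timestamp to a later one.

```python
import calendar
import re
import time

# Same shape as the jq capture: 19-char datetime, optional fractional
# seconds, then a signed offset with an optional colon.
PATTERN = re.compile(
    r"^(?P<datetime>[-:0-9T]{19})"
    r"(?P<subseconds>\.[0-9]+)?"
    r"(?P<offset_sign>[-+])"
    r"(?P<offset_hours>[0-9]{2})"
    r":?"
    r"(?P<offset_minutes>[0-9]{2})$"
)


def from_iso8601_offset(text: str) -> float:
    """Return seconds since the Unix epoch for a timestamp with a numeric offset."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported timestamp: {text!r}")
    # Treat the datetime part as if it were UTC, then correct for the offset.
    naive = calendar.timegm(time.strptime(match["datetime"], "%Y-%m-%dT%H:%M:%S"))
    sign = 1 if match["offset_sign"] == "+" else -1
    offset = sign * (int(match["offset_hours"]) * 3600 + int(match["offset_minutes"]) * 60)
    subseconds = float(match["subseconds"] or 0)
    # UTC = local - offset.
    return naive - offset + subseconds


for sample in (
    "1970-01-01T02:00:00-0100",
    "1970-01-01T02:00:00+0000",
    "1970-01-01T02:00:00+0100",
):
    print(sample, from_iso8601_offset(sample))
```

The three samples are the inputs from the working example above, so the printed epochs can be compared directly against the jq output.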

@vintnes
Author

vintnes commented Feb 17, 2021

Thanks @gpearce for reminding me about the rotation of the earth.

@jmarianer

I modified the regex provided by @vintnes to bypass strptime entirely and created a fork. I'm not sure what the performance implications are of using a regex as opposed to strptime (or sscanf on systems that don't have it); I also don't know what the performance goal is for jq. That said, my gut instinct is that (a) fromdateiso8601 isn't so widely used as to be a performance bottleneck, and (b) the regex is simple enough that it can be executed quickly (there's no backtracking, for example).
