Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python date conversion issue with json_format #27

Closed
gitpushdashf opened this issue Nov 13, 2020 · 5 comments
Closed

Python date conversion issue with json_format #27

gitpushdashf opened this issue Nov 13, 2020 · 5 comments

Comments

@gitpushdashf
Copy link

First off, thank you so much for adding the json_format stuff. It's just been fantastic to work with.

Noticing a bug with our code, or could maybe be with json_format.

I know about the DST issue mentioned here: https://github.com/google/fhir/releases/tag/v0.6.2

We are trying to normalize to UTC (or rather, treat all dates as UTC). At least for now, in this bit of code.

What we're seeing is that when converting from FHIR JSON to protobuf to FHIR JSON, 2016-03-03T10:10:57-07:00 ends up getting represented as 2016-03-04. I've tried forcing UTC timezone on my system and am still seeing that.

Does this seem possible? Likely? One comparison I had seemed to indicate it was off by exactly 8 hours.

I can come up with a more concrete test case if you like. This is using Synthea generated JSON LPRs.

My colleague is suggesting that maybe it's how the protobufs are converting to microseconds without timezones, then back to a date string. I still don't see how the date would up being ahead and not behind (or the same, looking at the date and not the time).

Thank you!

@gitpushdashf
Copy link
Author

On first pass, not really able to reproduce this how I thought I would. I'll have to keep digging.

https://github.com/gitpushdashf/issue27

$ python3 issue.py | grep -- '[0-9]*T' | grep start | tail -n 1
              "start": "2020-07-24T08:07:57-06:00",
$ cat 25dc5040-5875-8486-a49c-ed4789b7c88c.json | grep -- '[0-9]*T' | grep start | tail -n 1
              "start": "2020-07-24T08:07:57-06:00",
$ cat 25dc5040-5875-8486-a49c-ed4789b7c88c.json | grep -- '[0-9]*T' | grep start > orig-starts
$ python3 issue.py | grep -- '[0-9]*T' | grep start > new-starts
$ diff -Naur new-starts orig-starts
$

@gitpushdashf
Copy link
Author

I'm also using freezegun, so maybe there's some compatibility quirk between the two.

@Cam2337
Copy link
Collaborator

Cam2337 commented Nov 13, 2020

Hey @gitpushdashf!

First off, thank you so much for adding the json_format stuff. It's just been fantastic to work with.

Thank you, and we're glad to hear it! :)

We are trying to normalize to UTC (or rather, treat all dates as UTC). At least for now, in this bit of code.

What we're seeing is that when converting from FHIR JSON to protobuf to FHIR JSON, 2016-03-03T10:10:57-07:00 ends up getting represented as 2016-03-04. I've tried forcing UTC timezone on my system and am still seeing that.

Hmm interesting - I'd like to see a dump between each phase of the "round trip". The issue you're describing is strange for a couple of reasons to me:

  1. The precision of the result (2016-03-04) is in DAY while your initial FHIR JSON value is specified in a higher precision; not sure if this was intentional?
  2. Given that your original timezone is 7hrs behind UTC, it's interesting that the resulting value would be basically be rolled-over into the next day..

What value of default_timezone are you providing to json_format when parsing? I'm wondering if maybe the Python libraries fail to pick up a timezone offset, and perhaps we should be failing more loudly?

@Cam2337
Copy link
Collaborator

Cam2337 commented Nov 13, 2020

I'm also using freezegun, so maybe there's some compatibility quirk between the two.

Oh interesting -- could you provide a gist of an example testcase using freezegun?

@gitpushdashf
Copy link
Author

Heh, I'm sorry to waste your time. It actually looks like json_format is working fine.

My initial discrepancies were with our protobuf LPRs, converted a while ago in Java. It seems that the time fields on those may have been thrown off, vs the original json.

With that test case I wasn't able to see a single difference between the source LPR from Synthea and the outputted one, after converting to protobuf and back. Seems pretty remarkable to me. Even timzeone notations were kept.

To add to the complication, my Synthea patient had two Colonoscopies back to back, one on the third and one on the fourth. I didn't notice the one on the fourth and thought it was just messing with me after finding there was some time discrepancy between the protobuf and json.

Anyway, I very much appreciate your reply. If we do indeed find some kind of time zone or time conversion issue, I'll write up some more involved repository that actually demonstrates it and will send it your way.

In the meanwhile, thank you for your time and I hope you have a wonderful weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants