JSON native support for datetime encoding #4498
Conversation
Build failing on Python 2.6 and 3.2, I'll try to get my hands on 2.6 and take a look.

Ok

Yes, date parsing won't work properly otherwise; added an example to the docs about this. Rebased it all to current master. Want me to squash any of these commits?
looks good... I guess it's 'impossible' to figure out if the user specified an 'invalid' date_unit, though if it's out of range of valid dates then you could...? e.g. if I wrote a date in ns, most read-back in s should fail
I guess we could try and sniff the values to try and deduce the relative magnitude, or alternatively do something like

```python
for unit in ('s', 'ms', 'us', 'ns'):
    try:
        to_datetime(date_data, unit=unit)
        break
    except OverflowError:
        pass
```

which would only happen if the user passed
ok....that makes sense....default will be 'ms' anyhow, correct?

Default will be 'ms'.

Ok I've added timestamp precision detection when deserialising. It seems to work pretty well, so I've set the default to
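For reference, here is what magnitude-based precision detection could look like as a standalone sketch; `detect_date_unit` and its thresholds are illustrative, not the PR's actual implementation:

```python
# Illustrative sketch of magnitude-based unit detection (not the PR's
# actual code). Assumes values are epoch offsets from 1970; each unit's
# plausible range is roughly 1000x the previous one's.
def detect_date_unit(values):
    """Guess 's', 'ms', 'us' or 'ns' from the magnitude of the stamps."""
    max_abs = max(abs(v) for v in values)
    # ~1e11 seconds is already past the year 5000, so anything bigger
    # is assumed to be expressed in a finer unit.
    for unit, bound in (('s', 1e11), ('ms', 1e14), ('us', 1e17)):
        if max_abs < bound:
            return unit
    return 'ns'
```

The overflow-based approach discussed above works the other way around: try each unit in turn and keep the first one that parses without going out of range.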
```python
    raise ValueError('date_unit must be one of %s' %
                     (self.stamp_units,))
else:
    self.min_stamp *= (next((pow(1e3, i)
```
shouldn't the min_stamp just be in a class-level dictionary? And then look up the date_unit, which you then default if date_unit is None; also lower-case the date_unit (in case you are passed something else)
Ok done
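The shape of the suggested change, sketched with illustrative names (`Parser`, `_MIN_STAMPS`); the real constants and attribute names in the PR may differ:

```python
# Sketch of the class-level lookup suggested above. Names and the
# per-unit minimum stamps are illustrative, not the PR's actual values.
class Parser(object):
    _STAMP_UNITS = ('s', 'ms', 'us', 'ns')
    # one year's worth of each unit, used as the minimum plausible stamp
    _MIN_STAMPS = {'s': 31536000,
                   'ms': 31536000 * 10**3,
                   'us': 31536000 * 10**6,
                   'ns': 31536000 * 10**9}

    def __init__(self, date_unit=None):
        if date_unit is not None:
            # lower-case defensively, in case the caller passes 'MS' etc.
            date_unit = date_unit.lower()
            if date_unit not in self._STAMP_UNITS:
                raise ValueError('date_unit must be one of %s'
                                 % (self._STAMP_UNITS,))
            self.min_stamp = self._MIN_STAMPS[date_unit]
        else:
            # no unit given: fall back to seconds
            self.min_stamp = self._MIN_STAMPS['s']
        self.date_unit = date_unit
```

A dictionary lookup like this avoids recomputing the scale factor on every call and makes the unit/threshold mapping visible in one place.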
@Komnomnomnom just a couple of minor points....this is pretty awesome though! squash down to a couple of commits and do those minor points and we can merge

Ok, added a commit with the fixes; will wait to hear back before squashing, so it's easier to review the latest changes.

ok...one more thing....can you include a small json file from 0.12 (generate it with 0.12), and put in a compat checker, just to make sure that you would interpret it correctly? (I think it works, just making sure we don't break the existing)...
Good call @jreback! This exposed a few things:

1) The NaT handling in 0.12 was not correct. This has already been addressed by the previous changes, but it means I wasn't able to include NaTs in my compat test.

0.12:

```python
In [81]: s = pd.Series([pd.NaT])

In [82]: s.to_json()
Out[82]: '{"0":-9223372036854775808}'
```

This PR:

```python
In [7]: s = pd.Series([pd.NaT])

In [8]: s.to_json()
Out[8]: '{"0":null}'
```

Added a line to the release notes about this (bug fix).

2) This is an odd one, I don't fully understand why, but if a date column including

```cython
cdef inline int64_t cast_from_unit(object unit, object ts) except -1:
```

All pandas tests still pass for me with this change, but I am not that familiar with Cython or this code though so....

3) Had to modify json.py a bit more to handle deserialising

4) I removed the casting to
I think in the call for NaT testing, you still need to check if any
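The truncated point above presumably concerns the int64 sentinel that 0.12 emitted for NaT; a sketch of such a check on the decoding side (the helper name is hypothetical, not the PR's actual code):

```python
import numpy as np
import pandas as pd

# 0.12 serialised NaT as the raw int64 minimum
# (see the '{"0":-9223372036854775808}' example earlier in the thread).
NAT_SENTINEL = np.iinfo(np.int64).min  # -9223372036854775808

def restore_nat(stamps):
    """Map the raw int64 sentinel back to pd.NaT after decoding.

    Hypothetical helper for illustration only.
    """
    return [pd.NaT if s == NAT_SENTINEL else s for s in stamps]
```

Checking for the sentinel on read keeps the new decoder compatible with epoch-format output produced by 0.12.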
Sweet, that makes it compatible with the NaT handling from 0.12. It still won't be able to parse NaTs from 0.12 serialised with iso format though:

```python
In [187]: s = pd.Series([pd.NaT])

In [189]: s.to_json(date_format='iso')
Out[189]: '{"0":"0001-255-255T00:00:00"}'
```
gr8; since default was looks good to go....pls squash down (after passing) and will get it in
Any idea why Travis would be choking on the compat test? It appears it might be a problem related to reading the external file; do they need to be whitelisted or anything? Works for me locally on the same Python / numpy version.
In

Doh, didn't think of that, thanks
Squishified commits and rebased to master. Good to go once travis signs off, if you're happy with it.

looks good to me! thanks! bombs away....
JSON native support for datetime encoding
https://travis-ci.org/pydata/pandas/jobs/10223886

I had this fail once and then work locally....??
This adds support at the C level for encoding datetime values in JSON. I added a new parameter, `date_unit`, which controls the precision of the encoding; options are seconds, milliseconds, microseconds and nanoseconds. It defaults to milliseconds, which should fix #4362. I also added support for NaTs and tried to detect when numpy arrays of datetime64 longs are passed.

Note that datetime decoding is still handled at the Python level, but I added a `date_unit` param here too so timestamps can be converted correctly.

valgrind looks happy with the changes and all tests pass ok for me on Python 2.7 & 3.3.
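A quick round-trip of the parameters described above, as a sketch (assuming a pandas version with this PR merged, i.e. `date_unit` available on both `to_json` and `read_json`):

```python
from io import StringIO

import pandas as pd

# Round-trip sketch of the date_unit parameter described in the PR.
s = pd.Series(pd.to_datetime(['2013-01-01', '2013-06-15']))

# epoch milliseconds is the default unit described above
encoded = s.to_json(date_unit='ms')
print(encoded)  # {"0":1356998400000,"1":1371254400000}

# decoding also takes date_unit so timestamps convert back correctly
decoded = pd.read_json(StringIO(encoded), typ='series', date_unit='ms')
```

With matching units on both sides the epoch stamps survive the round trip; mismatched units are exactly the out-of-range case the unit-detection discussion covers.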