-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error in num2date cinversion #134
Comments
Microseconds are now included in the comparison (#118) - and the 2nd and 5th dates are off by 13 microseconds. The calculation itself has an accuracy of about 100 milliseconds. Perhaps we should modify the comparison to ignore differences of less than a specified tolerance? @davidhassell - what do you think? |
Hi, Thank you for prompt response. I do not think that the comparison is an issue. I can imagine a case when 13 microseconds difference matters, so comparison did a good job. I would say, that in given example the issue is in handling dates as float/double. The integer number of seconds from midnight should never result in non-zero microseconds. In unix the internal format for time (struct timeval) is two integers: seconds and microseconds. It would be great to have something like that in cftime... |
cftime time uses julian day/fractional julian day for it's arithmetic - so it's subject to usual issues of floating point precision/comparison. Probably a better way to compare dates given a specified tolerance would be something like this: import numpy as np
import cftime as nc4
iarr=np.arange(86400, dtype=np.int32)
units='hours since 2018-01-01 00:00:00 UTC'
dates1 = nc4.num2date(iarr,units)[-5:]
dates2 = nc4.num2date(iarr[-5:],units)
datediff = [diff.microseconds for diff in dates1-dates2]
print(np.abs(datediff) < 100) # tolerance of 100 microseconds
[ True True True True True] |
Thank you for the workaround! I have already implemented something like this, but such solutions in my applications tend to strike back after some time... So, please, keep the comparison. It works perfectly well. If floating-point julian day is something that is not easy to fix, probably the dates produced by num2date could be rounded internally to, say, nearest millisecond? Then a millisecond accuracy could be declared as a feature of num2date. Makes sense? |
This is a very complicated issue that has been discussed often - there are no easy solutions. Rounding to the nearest millisecond will not give you millisecond accuracy, since a 64 bit julian day is only accurate to about O(10) milliseconds. There are certainly more accurate algorithms out there, but none that I have found so far that work will all the calendars available in the CF metadata standard. |
Thanks! Indeed, climatologists are quite creative in inventing oteher-planet calendars, and there even a few-days per year accuracy ("360_day" calendar) would be tolerable by design. But for calendars like proleptic_georgian accuracy does matter. Actually, I would be fine even with one-second accuracy, as long as it results in exact values when it is possible. The issue in the example above is that num2date returns different values of datetime depending on the length of the input array it is fed with. The input values are exactly the same... With the latest cftime even the returned objects are different.
I wonder if it is desirable behaviour... |
That is caused by this Line 501 in cb16118
which was intended to fix another problem (#78). If you comment out the if-block here Line 513 in cb16118
you will see that all the dates are the same, but the calendar formatting may be messed up because the date will not land exactly on midnight. The question is - which behaviour is less desirable? |
Sorry, I was mistaken - the cause is actually the fact that python datetime calculations are used in one instance and cftime calculations in the other. If you use Line 272 in cb16118
The issue with negative times was discovered here Arguably, |
Looks like a mess... Certainly, returning different objects depending on subtle differences in the input is confusing. Having two time-handling libraries, one precise, but limited in time, another imprecise, but more flexible and universal, i would prefer to explicitly force use of one or another depending on my application, may be even with different functoins. In the current implementation one can force cftime, but, unfortunately, one can not force using datetime. That seems to be a missing feature. Automatic choice and/or fallbacks is a way to an unexpected behaviour, so if datetime is forced, an exception should be generated. Makes sense? |
That is something I did not understand. The check for postimes checks for positive interval rather than positive time. The datetime module is perfectly capable of using negative intervals (timedeltas) or multiply positive intervals with negative numbers. The reason for this check was, probably, to guard against getting into before the datetime's MINYEAR with negative offsets. |
Yes, that was the reason. |
Thank you! Just coming back to the issue. Is there a way (or can it be implemented?) to force num2date to return datetime.datetime objects? |
If use_only_cftime_datetimes=False, python datetime instances will be returned where possible (but not for negative times - extra logic would need to be added check datetime's MINYEAR for that to work). |
Thank you! Indeed we regularly deal with negative times. I seek for much simpler option to get datetime instance where possible, and an exception when resulting date does not fit the datetime range. Not much logic is needed for that. |
Are you looking for a Or, perhaps instead of adding a new kwarg, we could use |
Yep. That's what i was after. I guess, it should also work for
While it is possible to instruct |
We don't want to have different objects returned depending on the input - as you suggest, that leads to surprises and hard to debug errors. I can add have num2date return python datetime objects if |
Great! That should be fine for my applications. |
@rkouznetsov - could you test PR #139 (branch |
@rkouznetsov - have you had a chance to try PR #139 (branch issue134) yet? |
Yep. Sorry. Got a bit confused with github interface. Please see my comment at #139 |
PR #139 merged |
Originally posted to Unidata/netcdf4-python#981, but is still reproducible with freshly-cloned cftime.
Results in
[ True False True True False]
Should be
[ True True True True True]
Environment:
Ubuntu 18.04, python-numpy 1:1.13.3-2ubuntu1, cftime (revision ea68823).
Thank you!
The text was updated successfully, but these errors were encountered: