Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time dimension with month units cannot be loaded #181

Closed
Datseris opened this issue Jun 28, 2022 · 9 comments
Closed

Time dimension with month units cannot be loaded #181

Datseris opened this issue Jun 28, 2022 · 9 comments

Comments

@Datseris
Copy link
Contributor

Datseris commented Jun 28, 2022

Hi there, I have this error:

julia> ds = NCDataset(joinpath(dir, "ts_COLDSTATE.nc"));

julia> ds["time"]
time (240)
  Datatype:    Float64
  Dimensions:  time
  Attributes:
   standard_name        = time
   units                = months since 1900-1-1 00:00:00
   calendar             = proleptic_gregorian
   axis                 = T

julia> ds["time"][:]
ERROR: MethodError: Cannot `convert` an object of type Float64 to an object of type DateTime
Closest candidates are:
  convert(::Type{T}, ::Intervals.AnchoredInterval{P, T}) where {P, T} at C:\Users\m300808\.julia\packages\Intervals\7oPSS\src\anchoredinterval.jl:181
  convert(::Type{T}, ::P) where {T, P<:(Polynomials.AbstractPolynomial{T})} at C:\Users\m300808\.julia\packages\Polynomials\Rrf70\src\common.jl:434
  convert(::Type{T}, ::Intervals.Interval{T}) where T at C:\Users\m300808\.julia\packages\Intervals\7oPSS\src\interval.jl:240
  ...
Stacktrace:
 [1] setindex!(A::Vector{DateTime}, x::Float64, i1::Int64)
   @ Base .\array.jl:903
 [2] macro expansion
   @ C:\Users\m300808\.julia\packages\NCDatasets\4ceso\src\cfvariable.jl:663 [inlined]
 [3] macro expansion
   @ .\simdloop.jl:77 [inlined]
 [4] CFtransformdata!
   @ C:\Users\m300808\.julia\packages\NCDatasets\4ceso\src\cfvariable.jl:662 [inlined]
 [5] CFtransformdata
   @ C:\Users\m300808\.julia\packages\NCDatasets\4ceso\src\cfvariable.jl:671 [inlined]
 [6] getindex(v::NCDatasets.CFVariable{DateTime, 1, NCDatasets.Variable{Float64, 1, NCDataset{Nothing}}, NCDatasets.Attributes{NCDataset{Nothing}}, NamedTuple{(:fillvalue, :missing_values, :scale_factor, :add_offset, :calendar, :time_origin, :time_factor), Tuple{Nothing, Tuple{}, Nothing, Nothing, String, Nothing, Nothing}}}, indexes::Colon)
   @ NCDatasets C:\Users\m300808\.julia\packages\NCDatasets\4ceso\src\cfvariable.jl:735
 [7] top-level scope
   @ REPL[21]:1

As you can see, even though the time specification follows CF conventions, it actually stores the data as floats. So, someone did a bad job exporting the files properly, that is true, however, I think in this case we shouldn't have an error. We should have a warning clause, and the dimension should be loaded as any other floating point dimension, once a detection of Floats in time happens.

What do you think? The data are here: https://owncloud.gwdg.de/index.php/s/qdLsvW6iMmnYEPK

Versions latest stable all:

 [85f8d34a] NCDatasets v0.12.5
@visr
Copy link
Contributor

visr commented Jun 28, 2022

Why do you think it was a bad job to store the time coordinates as Float64? The first example in the CF conventions on time, also uses a double underneath:

double time(time) ;
  time:long_name = "time" ;
  time:units = "days since 1990-1-1 0:0:0" ;

I normally thing integers are a better match for discrete timestamps though, which is the reason DateTime is backed by integers, and hence this issue.

To resolve this, I guess it's fine to round to the nearest millisecond?

@Datseris
Copy link
Contributor Author

Why do you think it was a bad job to store the time coordinates as Float64?

I've loaded data .nc from about 20 different sources/researchgroups/models, and this is the first time I got this specific error. So I thought it was an export mistake. I stand corrected, so I apologize for wrongly blaming the exporting person.

To resolve this, I guess it's fine to round to the nearest millisecond?

The problem is, I don't even know what the data are. I mean, I don't know if the data are literally counting the time entries from 1 to 240, or if they are representing milliseconds from an epoch. I would need to go back to xarray to actually load the data, which will take a bit of time. I've updated the original post with a link to the data for clarity.

@visr
Copy link
Contributor

visr commented Jun 28, 2022

The data are actually integers. You can avoid using the CF convention transformations like this:

julia> variable(ds, "time")[:]
240-element Vector{Float64}:
   0.0
   1.0
   2.0
   3.0

I see in the code that it already is doing rounding to milliseconds in lines like

@inline asdate(data,time_origin,time_factor,DTcast) =
    convert(DTcast,time_origin + Dates.Millisecond(round(Int64,time_factor * data)))

But for some reason it is not hitting that path. For scalar getindex it is also wrong I think since it doesn't return a DateTime.

julia> ds["time"][1]
0.0

julia> ds["time"][1:2]
ERROR: MethodError: Cannot `convert` an object of type Float64 to an object of type Dates.DateTime
Closest candidates are:
  convert(::Type{Dates.DateTime}, ::T2) where T2<:Union{DateTimeJulian, DateTimeProlepticGregorian, DateTimeStandard} at d:\visser_mn\.julia\packages\CFTime\n09Um\src\CFTime.jl:308
  convert(::Type{Dates.DateTime}, ::Dates.Date) at d:\visser_mn\.julia\juliaup\julia-1.8.0-rc1+0~x64\share\julia\stdlib\v1.8\Dates\src\conversions.jl:30
  convert(::Type{Dates.DateTime}, ::Dates.Millisecond) at d:\visser_mn\.julia\juliaup\julia-1.8.0-rc1+0~x64\share\julia\stdlib\v1.8\Dates\src\conversions.jl:34
  ...
Stacktrace:
 [1] setindex!(A::Vector{Dates.DateTime}, x::Float64, i1::Int64)
   @ Base .\array.jl:966
 [2] macro expansion
   @ d:\visser_mn\.julia\packages\NCDatasets\4ceso\src\cfvariable.jl:663 [inlined]

@Alexander-Barth
Copy link
Owner

Alexander-Barth commented Jun 28, 2022

Thanks, I did not see the link to the file initially. I got it.

@Alexander-Barth
Copy link
Owner

Alexander-Barth commented Jun 28, 2022

I think the issue is the month time unit. So far we check for these:

https://github.com/JuliaGeo/CFTime.jl/blob/master/src/CFTime.jl#L434

xarray does not decode time variables with month either:

In [1]: import xarray as xr

In [2]: xr.open_dataset("ts_COLDSTATE.nc")["time"]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/.local/lib/python3.8/site-packages/xarray/coding/times.py:261, in decode_cf_datetime(num_dates, units, calendar, use_cftime)
    260 try:
--> 261     dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
    262 except (KeyError, OutOfBoundsDatetime, OverflowError):

File ~/.local/lib/python3.8/site-packages/xarray/coding/times.py:207, in _decode_datetime_with_pandas(flat_num_dates, units, calendar)
    206 delta, ref_date = _unpack_netcdf_time_units(units)
--> 207 delta = _netcdf_to_numpy_timeunit(delta)
    208 try:

File ~/.local/lib/python3.8/site-packages/xarray/coding/times.py:104, in _netcdf_to_numpy_timeunit(units)
    103     units = f"{units}s"
--> 104 return {
    105     "nanoseconds": "ns",
    106     "microseconds": "us",
    107     "milliseconds": "ms",
    108     "seconds": "s",
    109     "minutes": "m",
    110     "hours": "h",
    111     "days": "D",
    112 }[units]

KeyError: 'months'

ValueError: unable to decode time units 'months since 1900-1-1 00:00:00' with "calendar 'proleptic_gregorian'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.

In [3]: import cftime

In [4]: cftime.__version__
Out[4]: '1.6.0'

But I think it is standard conform (https://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/build/ch04s04.html):

We recommend that the unit year be used with caution. The Udunits package defines a year to be exactly 365.242198781 days (the interval between 2 successive passages of the sun through vernal equinox). It is not a calendar year. Udunits includes the following definitions for years: a common_year is 365 days, a leap_year is 366 days, a Julian_year is 365.25 days, and a Gregorian_year is 365.2425 days.

For similar reasons the unit month, which is defined in udunits.dat to be exactly year/12, should also be used with caution.

@visr
Copy link
Contributor

visr commented Jun 28, 2022

Ah I see. But then why don't we hit the else error("unknown units $(tunit)") in the CFTime code you linked?

@Alexander-Barth
Copy link
Owner

Alexander-Barth commented Jun 28, 2022

Yes, we should have an error or warning. I guess we might had it in the past, but then it was silenced by an overzealous try/catch.

In master, we have now a warning:

ulia> ds["time"]
┌ Warning: unknown units "months"
└ @ NCDatasets ~/.julia/dev/NCDatasets/src/cfvariable.jl:406

With the master version of CFTime, the month units is now understood as year/12 where year is 365.242198781 days .

@Alexander-Barth Alexander-Barth changed the title Time dimension saved as Floats cannot be loaded Time dimension with month units cannot be loaded Jul 7, 2022
@Alexander-Barth
Copy link
Owner

Is there anything left before this can be closed?

@Datseris
Copy link
Contributor Author

Datseris commented Jul 7, 2022

ops, sorry, i haven't tested this yet. I'll test it and reopen if anything goes south! (I'm assuming taht an update will just fix things, as it will update CFTime)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants