Handle non-canonical NetCDF axis order #89
stars doesn't prescribe any order (or names), but has the labels "x" and "y" behind the x and y (spatial raster) dimensions, if there are any. It has (and uses) an |
Right. So it's important that we get the x and y spatial dimensions into the right slots in the st_dimensions of a stars object. I guess that's not quite a canonical axis order, but it is in a sense. |
Do we have a test case or example for this? @adrfantini I recall a netcdf file that took infinite time to read through GDAL, was that one with non-standard axis order? |
To be honest I do not recall this happening to me, but I might be wrong. Can you find the issue? |
This file had the axis order switched. https://github.com/r-spatial/stars/pull/88/files#diff-ae514b50c29888928e1d25553d71ec03 But I would actually say this should be closed until it becomes an issue with a real file. Theoretically people shouldn't be doing this. I know they do, but they shouldn't. |
I feel like EDIT: the difference between supporting generic and CF-compliant netCDF files should also be clearly stated in the docs. |
I 100% agree with this. This issue of axis order is somewhat grey though. COARDS relies on axis order and axis/dim name to link coordinate variables to dimensions. CF extends things to allow more generic treatment of axes so axis order can't always be assumed. In practice people (almost never) mess it up -- but I'll admit that I have -- and the files still work with CF-compliant implementations. |
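To make the COARDS/CF distinction above concrete, here is a minimal sketch of how an axis role could be guessed from a coordinate variable's attributes rather than from its position. It assumes the attributes have already been read into a plain named list; `guess_axis` is a hypothetical helper, not a function from stars or ncmeta, and it only covers the common cases (the CF `axis` attribute, `standard_name`, and latitude/longitude/time units). Detection of vertical axes via the `positive` attribute is left out.

```r
# Hypothetical helper: guess the CF axis role ("X", "Y", "Z" or "T") of a
# coordinate variable from its attributes, given as a plain named list.
guess_axis <- function(atts) {
  # An explicit CF `axis` attribute wins over everything else.
  axis <- atts[["axis"]]
  if (!is.null(axis) && toupper(axis) %in% c("X", "Y", "Z", "T")) {
    return(toupper(axis))
  }
  units <- if (is.null(atts[["units"]])) "" else atts[["units"]]
  std <- if (is.null(atts[["standard_name"]])) "" else atts[["standard_name"]]
  if (std == "longitude" || units %in% c("degrees_east", "degree_east")) return("X")
  if (std == "latitude" || units %in% c("degrees_north", "degree_north")) return("Y")
  # CF time units look like "<unit> since <reference date>".
  if (std == "time" || grepl(" since ", units, fixed = TRUE)) return("T")
  NA_character_  # role could not be determined from these attributes
}

# Toy attribute lists, stored in T, Y, X order (made-up values).
coords <- list(
  time = list(units = "days since 1950-01-01", standard_name = "time"),
  latitude = list(units = "degrees_north"),
  longitude = list(units = "degrees_east", axis = "X")
)
roles <- vapply(coords, guess_axis, character(1))
```

Because the roles come from attributes, the result does not depend on the order in which the dimensions happen to be stored.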
I also agree. OTOH, the file read comes in as
> r = read_ncdf('bcsd_obs_1999_borked.nc')
> r
stars object with 3 dimensions and 2 attributes
attribute(s):
   pr [mm/m]         tas [C]
 Min.   :  0.59   Min.   :-0.421
 1st Qu.: 56.14   1st Qu.: 8.899
 Median : 81.88   Median :15.658
 Mean   :101.26   Mean   :15.489
 3rd Qu.:121.07   3rd Qu.:21.780
 Max.   :848.55   Max.   :29.386
 NA's   :7116     NA's   :7116
dimension(s):
          from to offset delta  refsys point                    values
time         1 12     NA    NA POSIXct    NA 1999-01-31,...,1999-12-31 [x]
longitude    1 81    -85 0.125      NA    NA                      NULL [y]
latitude     1 33     33 0.125      NA    NA                      NULL
i.e. with
> attr(attr(r, "dimensions"), "raster")$dimensions = c("longitude", "latitude")
> r
stars object with 3 dimensions and 2 attributes
attribute(s):
   pr [mm/m]         tas [C]
 Min.   :  0.59   Min.   :-0.421
 1st Qu.: 56.14   1st Qu.: 8.899
 Median : 81.88   Median :15.658
 Mean   :101.26   Mean   :15.489
 3rd Qu.:121.07   3rd Qu.:21.780
 Max.   :848.55   Max.   :29.386
 NA's   :7116     NA's   :7116
dimension(s):
          from to offset delta  refsys point                    values
time         1 12     NA    NA POSIXct    NA 1999-01-31,...,1999-12-31
longitude    1 81    -85 0.125      NA    NA                      NULL [x]
latitude     1 33     33 0.125      NA    NA                      NULL [y] |
Yeah. It can also be figured out by linking each coordinate variable to the dimension id it is on, then seeing that the data variable is defined YXT rather than XYT, and correcting that right up front. |
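A minimal base R sketch of that up-front correction, assuming the axis role of each stored dimension is already known (for example from the coordinate variables). `canonical_order` is a hypothetical helper for illustration, not something stars or ncmeta provides, and the toy array stands in for data read from a file.

```r
# Hypothetical helper: permute an array into canonical X, Y, Z, T order.
# `axes` holds the axis role of each array dimension, in stored order.
canonical_order <- function(arr, axes) {
  canonical <- c("X", "Y", "Z", "T")
  stopifnot(length(axes) == length(dim(arr)),
            !anyDuplicated(axes),
            all(axes %in% canonical))
  # Rank each stored axis by its canonical position; order() then gives,
  # for each output slot, the stored dimension that belongs there.
  perm <- order(match(axes, canonical))
  aperm(arr, perm)
}

# Toy variable stored as T, Y, X: 12 time steps, 33 latitudes, 81 longitudes.
tyx <- array(seq_len(12 * 33 * 81), dim = c(12, 33, 81))
xyt <- canonical_order(tyx, axes = c("T", "Y", "X"))
dim(xyt)
```

After the permutation the array has dimensions 81, 33, 12, and element `xyt[i, j, k]` equals `tyx[k, j, i]`, so only the layout changes, not the values.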
@edzer I think that we should provide a simpler one-liner function, other than Note that the CF conventions say:
So it is recommended, but NOT mandatory, to set a proper dimension ordering.
However...
So, in short, in the CF-conventions:
IMHO
|
I just introduced the
nc_coord_var(f)
#> # A tibble: 5 x 6
#>   variable       X     Y     Z     T     bounds
#>   <chr>          <chr> <chr> <chr> <chr> <chr>
#> 1 RAINNC_present XLONG XLAT  NA    Time  NA
#> 2 Time           NA    NA    NA    Time  time_bnds
#> 3 T2_present     XLONG XLAT  NA    Time  NA
#> 4 U10_present    XLONG XLAT  NA    Time  NA
#> 5 V10_present    XLONG XLAT  NA    Time  NA
It has a bunch of logic to figure out which variables belong with which coordinate. There is still the issue of which NetCDF dimension order each variable and coordinate variable is defined on. This is really only an issue when lat and lon are 2D; e.g. there is a chance that lat is defined e.g.
nc <- RNetCDF::open.nc(system.file("nc/test_stageiv_xyt.nc", package = "stars"))
# Attach the NetCDF dimension ids of each variable named in vars$name.
add_dimids <- function(nc, vars) {
  vars$dimids <- lapply(vars$name, function(x) RNetCDF::var.inq.nc(nc, x)$dimids)
  vars  # return the augmented table, not just the value of the assignment
} |
This issue has come up again in #199 -- should I implement the fix for it? I tried a while back, but read_ncdf was in flux and I wasn't able to get my contribution in. |
It seems much less in flux now - @mdsumner ? |
I got in touch over in @mdsumner's fork. Am working on it now. |
Given the work done in recent PRs, I think this can be considered fixed. I've had really good luck reading non-canonical axis order files with the latest implementation. |
Thanks! |
While XYZT is the typical axis order, it is by no means a guarantee. I tried permuting the dimensions of a file to be TYX and convinced read_ncdf to give me a very odd result. We should be relying on the dimension ids of variables to determine the axis order and enforcing a canonical axis order after reading out of NetCDF. I'll see if I can work up a PR that handles this.