-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
'sub' does not allow to read all variables from a netcdf file, ignores lat and lon #65
Comments
The logic behind the GDAL netcdf driver is this: if the file contains multiple variables, they are presented as sub-datasets, and you can select one of them. If there is only one, as in this case, the driver doesn't return any subdatasets but only a single one, and you cannot select. However, you can explicitly ask GDAL to read it: > gdal_utils("info", "test.nc") output: Driver: netCDF/Network Common Data Format
Files: test.nc
Size is 424, 412
Coordinate System is `'
Metadata:
NC_GLOBAL#CDI=Climate Data Interface version 1.9.1 (http://mpimet.mpg.de/cdi)
NC_GLOBAL#CDO=Climate Data Operators version 1.9.1 (http://mpimet.mpg.de/cdo)
NC_GLOBAL#contact=obc@dmi.dk
NC_GLOBAL#Conventions=CF-1.6
NC_GLOBAL#CORDEX_domain=EUR-11
NC_GLOBAL#creation_date=2012-12-03 03:14:15
NC_GLOBAL#driving_experiment=ICHEC-EC-EARTH,rcp85,r3i1p1
NC_GLOBAL#driving_experiment_name=rcp85
NC_GLOBAL#driving_model_ensemble_member=r3i1p1
NC_GLOBAL#driving_model_id=ICHEC-EC-EARTH
NC_GLOBAL#experiment=Scenario run using ECEARTH as driving model
NC_GLOBAL#experiment_id=rcp85
NC_GLOBAL#frequency=mon
NC_GLOBAL#history=Wed Oct 03 11:53:06 2018: cdo -C seltimestep,1 /home/netapp-clima-scratch/fraffael/SPI/EC-EARTH_HIRHAM5/prOK_mon_2071-2098.nc /dev/shm/test.nc
Mon Oct 01 18:02:40 2018: cdo mulc,86400 pr_mon_2071-2098.nc prOK_mon_2071-2098.nc
Mon Oct 01 12:24:12 2018: cdo selyear,2071/2098 pr_mon_2071-2100.nc pr_mon_2071-2098.nc
Mon Oct 1 12:23:45 2018: ncrcat /home/esp-shared-a/RegionalModels/CORDEX/EUR-11/rcp85/MM/EC-EARTH_HIRHAM5/pr_EUR-11_ICHEC-EC-EARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_207101-208012.nc /home/esp-shared-a/RegionalModels/CORDEX/EUR-11/rcp85/MM/EC-EARTH_HIRHAM5/pr_EUR-11_ICHEC-EC-EARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_208101-209012.nc /home/esp-shared-a/RegionalModels/CORDEX/EUR-11/rcp85/MM/EC-EARTH_HIRHAM5/pr_EUR-11_ICHEC-EC-EARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_209101-210012.nc pr_mon_2071-2100.nc
Mon Nov 18 04:53:38 2013: /usr/local/bin/ncks --mk_rec_dmn time pr_EUR-11_ICHEC-EC-EARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_207101-208012.nc out1.nc
Fri Dec 07 08:23:08 2012: cdo setday,16 NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/mon/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_207101-208012.nc NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/mon/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_207101-208012_sub.nc
Fri Dec 07 08:23:07 2012: cdo monmean NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/mon/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_20710101-20801231.nc NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/mon/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_207101-208012.nc
Fri Dec 07 08:22:56 2012: cdo mergetime NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/day/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_day_20710101-20751231.nc NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/day/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_day_20760101-20801231.nc NAM-44/DMI/DMI-ECEARTH/rcp85/r3i1p1/DMI-HIRHAM5/v1/mon/tas/tas_NAM-44_DMI-ECEARTH_rcp85_r3i1p1_DMI-HIRHAM5_v1_mon_20710101-20801231.nc
NC_GLOBAL#institute_id=DMI
NC_GLOBAL#institution=Danish Meteorological Institute
NC_GLOBAL#model_id=DMI-HIRHAM5
NC_GLOBAL#NCO="4.6.4"
NC_GLOBAL#nco_openmp_thread_number=1
NC_GLOBAL#product=output
NC_GLOBAL#project_id=CORDEX
NC_GLOBAL#rcm_version_id=v1
NC_GLOBAL#tracking_id=e9cc9b3b-9755-400b-b101-49c24b811781
NETCDF_DIM_EXTRA={time}
NETCDF_DIM_time_DEF={1,6}
NETCDF_DIM_time_VALUES=44241.5
pr#cell_methods=time: mean
pr#coordinates=lat lon
pr#long_name=Precipitation
pr#missing_value=1e+20
pr#standard_name=precipitation_flux
pr#units=kg m-2 s-1
pr#_FillValue=1e+20
time#axis=T
time#bounds=time_bnds
time#calendar=proleptic_gregorian
time#standard_name=time
time#units=days since 1949-12-01 00:00:00
Subdatasets:
SUBDATASET_1_NAME=NETCDF:"test.nc":time_bnds
SUBDATASET_1_DESC=[1x2] time_bnds (64-bit floating-point)
SUBDATASET_2_NAME=NETCDF:"test.nc":lon
SUBDATASET_2_DESC=[412x424] longitude (64-bit floating-point)
SUBDATASET_3_NAME=NETCDF:"test.nc":lat
SUBDATASET_3_DESC=[412x424] latitude (64-bit floating-point)
SUBDATASET_4_NAME=NETCDF:"test.nc":pr
SUBDATASET_4_DESC=[1x412x424] precipitation_flux (32-bit floating-point)
Geolocation:
LINE_OFFSET=0
LINE_STEP=1
PIXEL_OFFSET=0
PIXEL_STEP=1
SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]]
X_BAND=1
X_DATASET=NETCDF:"test.nc":lon
Y_BAND=1
Y_DATASET=NETCDF:"test.nc":lat
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 412.0)
Upper Right ( 424.0, 0.0)
Lower Right ( 424.0, 412.0)
Center ( 212.0, 206.0)
Band 1 Block=424x1 Type=Float32, ColorInterp=Undefined
NoData Value=1.00000002004087734e+20
Unit Type: kg m-2 s-1
Metadata:
cell_methods=time: mean
coordinates=lat lon
long_name=Precipitation
missing_value=1e+20
NETCDF_DIM_time=44241.5
NETCDF_VARNAME=pr
standard_name=precipitation_flux
units=kg m-2 s-1
_FillValue=1e+20 > (lon = read_stars('NETCDF:"test.nc":lon'))
stars object with 2 dimensions and 1 attribute
attribute(s):
NETCDF:"test.nc":lon [°]
Min. :-44.594
1st Qu.: -7.067
Median : 10.091
Mean : 10.225
3rd Qu.: 28.112
Max. : 64.964
dimension(s):
from to offset delta refsys point values
x 1 424 NA NA NA NA NULL [x]
y 1 412 NA NA NA NA NULL [y]
Warning message:
In get_raster(affine = pr$geotransform[c(3, 5)], dimensions = c("x", :
setting NA affine values to zero It's a bit convoluted, but this is how GDAL deals with these data. I'd be happy to add a |
That would be handy. |
This now works: > library(stars)
Loading required package: abind
Loading required package: sf
Linking to GEOS 3.6.2, GDAL 2.2.3, PROJ 4.9.3
> (r = read_stars('test.nc', curvilinear = c('lon', 'lat')))
stars object with 3 dimensions and 1 attribute
attribute(s):
test.nc
Min. : 0.000
1st Qu.: 1.171
Median : 2.236
Mean : 2.840
3rd Qu.: 3.961
Max. :92.606
dimension(s):
from to offset delta refsys point
x 1 424 NA NA +proj=longlat +datum=WGS8... NA
y 1 412 NA NA +proj=longlat +datum=WGS8... NA
time 1 1 2071-01-16 12:00:00 NA POSIXct NA
values
x [424x412] -44.5939 [°], ..., 64.9644 [°] [x]
y [424x412] 21.9878 [°], ..., 72.585 [°] [y]
time NULL
curvilinear grid
Warning messages:
1: In get_raster(affine = pr$geotransform[c(3, 5)], dimensions = c("x", :
setting NA affine values to zero
2: In get_raster(affine = pr$geotransform[c(3, 5)], dimensions = c("x", :
setting NA affine values to zero
3: In get_raster(affine = pr$geotransform[c(3, 5)], dimensions = c("x", :
setting NA affine values to zero |
I also see room for making this automatic: without specifying curvilinear, we see > (r = read_stars('test.nc'))
stars object with 3 dimensions and 1 attribute
attribute(s):
test.nc
Min. : 0.000
1st Qu.: 1.171
Median : 2.236
Mean : 2.840
3rd Qu.: 3.961
Max. :92.606
dimension(s):
from to offset delta refsys point values
x 1 424 NA NA NA NA NULL [x]
y 1 412 NA NA NA NA NULL [y]
time 1 1 2071-01-16 12:00:00 NA POSIXct NA NULL
Warning message:
In get_raster(affine = pr$geotransform[c(3, 5)], dimensions = c("x", :
setting NA affine values to zero i.e. no geotransform; in the metadata returned by gdal however we see a metadata element (in I was trying to automate this and looking into the |
I think we should use GDAL to discover the curvilinear arrays, then read and apply them with NetCDF. Rectilnear is a different game, still exploring |
To @edzer's last comment, yes GDAL has the best hueristics here and gdalwarp oftens does fine with no manual interpretation. I am still not au fait with GCPs vs geolocation arrays, and maybe there are other types too. |
If we can find an example of rectilnear grid that GDAL identifies coordinates for that would clear up a question for me. I think it might be an opportunity to improve that library to our benefit. It's very active in this area recently |
I'm not sure I understand, can you be a bit more specific? GDAL should be able to understand any regular rectilinear lat-lon grid, for example, but I'm clearly missing something. |
Oh that's a good point, I mean particularly non regular 1d arrays - gdal ignores them for the transform, but does it identify them as coordinates formally as it does for 2d ones? I need to show some examples. I definitely need more non-regular rectilinear cases. |
At your command! GDALINFO:
|
Cool, that's a good one - definitely haven't seen that kind of grid in CMIP5 yet - and the Arakawa grids I've used before are way more complicated than this. But, it looks like gridlon_t and gridlat_t only apply in a limited area? (Perhaps the central lines?). Can you map with only those axes ? (doesn't seem so) |
Or, are they pertinent only between a range of lats? (Greenland is obviously wrong, and the grid is less dense there as per geolon/lat_t, so does it apply only south of that, as per this mask classification?). It's not that good a data set because there's no way to know down south what anything is, do you have a data set that uses these coordinates and not just the mask? |
I really do not know much, this is an ocean model and they might have changed the land/topography/bathymetry to better suite their flows... and I don't have any idea what those variables are for. Here is the same dataset with 2 timesteps for SST, in case you need it. |
Well that's interesting, there's a relationship between the mask and the sst, but not by anything explicit (300 and 360 is clear, but that's pretty tenuous). The At any rate the SST data is more useful as an example, so I'm satisfied - thanks! |
Yeah I'm sorry I can't dig more into it right now, I just went scavenging for a suitable example in our archive, but I don't know anyone using those models right now. But yeah, the two grids actually are slightly different, despite having similar names and being in the same folder... they might come from different simulations, I don't know. |
Did you close the PR inadvertently? |
ah yes, was right in the crux of the GH outage - I'm still getting some confusing behaviour, but probably quirks of caching here and there. In the meantime I cleaned up, and will try to stick to "mdsumner-dev" branch on my side for PRs. |
@edzer
MRE with the above file: r = read_stars('test (2).nc', curvilinear = c('lon', 'lat'))
sf = st_sf(a=1, st_geometry(st_point(c(0, 30))), crs=4326) %>% st_buffer(1)
r[sf] |
Should work now; note that cropping doesn't work (happy to receive a PR on this if there is a need). |
Sorry I'm late. Can confirm working. Cropping would be nice but not a priority for me, for now (since I transform to |
I am reading the file from #64 : https://drive.google.com/open?id=1YnrUnpydoNdMUUSJ_TrTh-nBxl693dik
I want to read it as a curvilinear grid and as such I need to extract the variables
lat
andlon
, which are bidimensional, and assign the curvilinear values to the tridimensional variablepr
(although in this example the time dimension is redundant since it has length 1).However, it seems that no matter the value of
sub
,read_stars
only wants to read thepr
variable:read_ncdf
from Mike's branch instead works, so a perfectly viable workaround exists.I suspect this is a limitation of the underlying GDAL. Should I just default on reading
netCDF
withread_ncdf
when possible?The text was updated successfully, but these errors were encountered: