Error reading rainfall grids #261
Return an xarray Dataset like the following:
<xarray.Dataset>
Dimensions:    (latitude: 1, longitude: 1, time: 366)
Coordinates:
  * time       (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
  * latitude   (latitude) float64 -27.52
  * longitude  (longitude) float64 132.1
Data variables:
    rainfall   (time, latitude, longitude) float32 0.0 0.0 7.44684e-13 0.0 ...
Attributes:
    crs:      EPSG:4326
Data Cube, version 1.3.2
And the conda environment at NCI is:
Fails hard with the first file:
Error opening source dataset: NETCDF:/g/data/rr5/agcd/0_05/rainfall/daily/2000/rr.2000010120000101.grid.nc:rain_day
It then continues on and tries to read the CRS of the source object, which is None:
/g/data/v10/public/modules/agdc-py3/1.5.0/lib/python3.6/site-packages/datacube/storage/storage.py in _rasterio_crs_wkt(src)
     62 if str(rasterio.__version__) >= '0.36.0':
     63     def _rasterio_crs_wkt(src):
---> 64         return str(src.crs.wkt)
     65 else:
     66     def _rasterio_crs_wkt(src):

AttributeError: 'NoneType' object has no attribute 'wkt'
Steps to reproduce the behaviour
import datacube

dc = datacube.Datacube()
rain = dc.load(product='bom_rainfall_grids',
               longitude=132.1, latitude=-27.5,
               time=('2000-1-1', '2001-1-1'))
The conda environment being used at NCI is:
We should read changelogs of our dependencies!
From rasterio 1.0a9:
This will need a small code update on our part to get this working again. And we should add a test for opening datasets with overridden CRSs.
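As a rough illustration of the kind of guard this implies, here is a minimal sketch, assuming the failure comes from `src.crs` being `None`, as the traceback suggests. The function name `crs_wkt_or_override` and the `override_wkt` parameter are hypothetical and not part of datacube or rasterio. A stand-in object replaces a real rasterio dataset so the example runs without any files.

```python
from types import SimpleNamespace


def crs_wkt_or_override(src, override_wkt=None):
    """Return the CRS of `src` as WKT, or the override if `src` has no CRS.

    Hypothetical helper for illustration only; `src` is anything with a
    `crs` attribute that is either None or has a `wkt` attribute.
    """
    crs = getattr(src, "crs", None)
    if crs is not None:
        # The file carries its own CRS: use it.
        return str(crs.wkt)
    if override_wkt is not None:
        # No CRS in the file: fall back to the dataset-level override.
        return override_wkt
    # Neither is available: fail with a clear message instead of
    # an AttributeError on None.
    raise ValueError("source has no CRS and no override was supplied")


# Stand-ins for an opened source: one without a CRS, one with.
src_without_crs = SimpleNamespace(crs=None)
src_with_crs = SimpleNamespace(crs=SimpleNamespace(wkt="WKT-FROM-FILE"))

wkt_from_override = crs_wkt_or_override(src_without_crs, override_wkt="OVERRIDE-WKT")
wkt_from_file = crs_wkt_or_override(src_with_crs, override_wkt="OVERRIDE-WKT")
```

A test for overridden CRSs could follow the same shape: open a source that reports no CRS, supply an override, and assert that the override is what comes back.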
Comment related to the source data: I'm trying to reignite a conversation with BoM folks about open use of a daily-synced, NetCDF version of their surface meteorology grids (such as the rainfall here) that CSIRO operates for itself and its collaborators. This will hopefully spark an update to the NCI holdings of these data, which we can then inject into the data cube.
Cool, glad it was informative.
A question regarding overriding an unknown CRS: it could become problematic down the line if the override is a default, as there is a chance that an incorrect datum could be assigned, yielding geospatial offsets.
Some packages tend to fall back to native image/array coordinates if the CRS is unknown. The dataset is still useable, but functions such as reprojection will become unusable.
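The fallback described above can be sketched as follows. This is only an illustration under stated assumptions: `GeoInfo`, `pixel_to_coords` and `can_reproject` are hypothetical names, and the transform is simplified to a north-up grid with no rotation.

```python
from typing import NamedTuple, Optional, Tuple


class GeoInfo(NamedTuple):
    """Hypothetical container for the georeferencing of a raster."""
    crs: Optional[str]
    # Simplified north-up transform:
    # (origin_x, pixel_width, origin_y, pixel_height)
    transform: Optional[Tuple[float, float, float, float]]


def pixel_to_coords(info: GeoInfo, row: int, col: int) -> Tuple[float, float]:
    """Map a pixel index to (x, y), using array coordinates if the CRS is unknown."""
    if info.crs is None or info.transform is None:
        # No georeferencing: x is the column index, y is the row index.
        return float(col), float(row)
    origin_x, pixel_width, origin_y, pixel_height = info.transform
    return origin_x + col * pixel_width, origin_y + row * pixel_height


def can_reproject(info: GeoInfo) -> bool:
    """Reprojection needs both a CRS and a transform."""
    return info.crs is not None and info.transform is not None


unknown = GeoInfo(crs=None, transform=None)
known = GeoInfo(crs="EPSG:4326", transform=(112.0, 0.05, -10.0, -0.05))

print(pixel_to_coords(unknown, row=2, col=3), can_reproject(unknown))
print(pixel_to_coords(known, row=0, col=0), can_reproject(known))
```

The data remain indexable in both cases, but only the second object carries enough information to be placed on a map or reprojected.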
Is it required to assign a default CRS for datasets with an unknown CRS?
The CRS override is set at the Dataset level, on the assumption that CRS metadata is available alongside the data but isn't in a form that GDAL can read directly.
Geospatial is a core part of ODC, so I'm pretty sure that it's required to have a CRS and Transform available for every Dataset that we need to load.