Error reading rainfall grids #261

Closed
sixy6e opened this issue Jul 12, 2017 · 6 comments

Comments

@sixy6e
Contributor

commented Jul 12, 2017

Expected behaviour

Return an xarray Dataset like the following:

<xarray.Dataset>
Dimensions:    (latitude: 1, longitude: 1, time: 366)
Coordinates:
  * time       (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 ...
  * latitude   (latitude) float64 -27.52
  * longitude  (longitude) float64 132.1
Data variables:
    rainfall   (time, latitude, longitude) float32 0.0 0.0 7.44684e-13 0.0 ...
Attributes:
    crs:      EPSG:4326

The above was obtained with:

Data Cube, version 1.3.2
GDAL 2.1.3, released 2017/20/01
rasterio, version 1.0a8

And the conda environment at NCI is:
/g/data/v10/public/modules/agdc-py3-env/20170427

Actual behaviour

Fails hard with the first file:

Error opening source dataset: NETCDF:/g/data/rr5/agcd/0_05/rainfall/daily/2000/rr.2000010120000101.grid.nc:rain_day

It then continues on, trying to access the CRS of the source object, which is None:

/g/data/v10/public/modules/agdc-py3/1.5.0/lib/python3.6/site-packages/datacube/storage/storage.py in _rasterio_crs_wkt(src)
     62 if str(rasterio.__version__) >= '0.36.0':
     63     def _rasterio_crs_wkt(src):
---> 64         return str(src.crs.wkt)
     65 else:
     66     def _rasterio_crs_wkt(src):

AttributeError: 'NoneType' object has no attribute 'wkt'

Steps to reproduce the behaviour

import datacube
dc = datacube.Datacube()
rain = dc.load(product='bom_rainfall_grids', longitude=132.1, latitude=-27.5, time=('2000-1-1', '2001-1-1'))

Environment information

  • Which datacube --version are you using?
    Open Data Cube core, version 1.5.0

  • What datacube deployment/environment are you running against?
    GDAL 2.2.1, released 2017/06/23
    rasterio, version 1.0a9

The conda environment being used at NCI is:
/g/data/v10/public/modules/agdc-py3-env/20170710

@omad

Member

commented Jul 12, 2017

We should read changelogs of our dependencies!

From rasterio 1.0a9:

  • The crs property of a dataset is now None instead of CRS() when the dataset's coordinate reference system is undefined (#1057).

This will need a small code update on our part to get this working again. And we should add a test for opening datasets with overridden CRSs.
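As a rough illustration of the kind of guard this implies, here is a minimal sketch. It is not the actual datacube fix: `rasterio_crs_wkt` is a hypothetical helper, the stand-in objects replace real rasterio datasets so the example runs without any files, and it assumes the empty `CRS()` returned by older rasterio releases is falsy.

```python
from types import SimpleNamespace


def rasterio_crs_wkt(src):
    """Return the CRS of `src` as a WKT string, or None if it is undefined.

    Hypothetical helper for illustration only. `src` is any object with a
    `crs` attribute, such as an open rasterio dataset.
    """
    crs = getattr(src, "crs", None)
    # rasterio 1.0a9 reports an undefined CRS as None; earlier releases
    # returned an empty CRS(), assumed here to be falsy. Treat both as
    # "no CRS" instead of dereferencing `.wkt` on None.
    if not crs:
        return None
    return str(crs.wkt)


# Stand-ins for rasterio datasets (assumptions, not real rasterio objects).
no_crs = SimpleNamespace(crs=None)
with_crs = SimpleNamespace(crs=SimpleNamespace(wkt='GEOGCS["WGS 84"]'))

print(rasterio_crs_wkt(no_crs))
print(rasterio_crs_wkt(with_crs))
```

The two stand-ins double as the regression test mentioned above: one source with no CRS and one with a CRS, both of which should be read without raising.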

omad added a commit that referenced this issue Jul 12, 2017

Fix for being unable to read from files with missing CRS
Caused by the 1.0a9 release of rasterio

Closes #261
@omad

Member

commented Jul 12, 2017

BTW, thanks for the great issue description Josh!

@mpaget

Contributor

commented Jul 12, 2017

Comment related to the source data: I'm trying to reignite a conversation with BoM folks about open use of a daily synced, netCDF-ed version of their surface meteorology grids (such as rainfall here) that CSIRO operates for itself and its collaborators. This will hopefully spark an update to the NCI holdings of these data, which we can then inject into the data cube.

omad referenced this issue Jul 12, 2017
3 of 3 tasks complete
@sixy6e

Contributor Author

commented Jul 12, 2017

Cool, glad it was informative.

A question regarding overriding an unknown CRS: it could become problematic down the line if we override with a default, as there is a chance that an incorrect datum could be assigned, yielding potential geospatial offsets.

Some packages tend to fall back to native image/array coordinates if the CRS is unknown. The dataset is still usable, but functions such as reprojection become unavailable.
It just means that while one can still analyse the dataset, the software makes no guarantee about its usability, rather than giving a kind of false assurance. Provenance tracking would be impacted too.

Is it required to assign a default CRS for datasets with an unknown CRS?

@sixy6e

Contributor Author

commented Jul 12, 2017

@mpaget
That'd be great to get daily synced data, as well as definitive CRS info.

@omad

Member

commented Jul 12, 2017

The CRS override is set at the Dataset level, on the assumption that metadata is available alongside the data but isn't readily readable by GDAL.

Geospatial is a core part of ODC, so I'm pretty sure that it's required to have a CRS and Transform available for every Dataset that we need to load.
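To make that policy concrete, here is a minimal sketch of how a CRS might be resolved without ever assigning a silent default. The names `resolve_crs` and `MissingCRSError` are hypothetical and not part of datacube, and letting the override take precedence over the file's own CRS is an assumption made for illustration.

```python
class MissingCRSError(ValueError):
    """Neither the file nor the dataset metadata supplies a CRS (hypothetical)."""


def resolve_crs(file_crs, override_crs=None):
    """Choose the CRS for a dataset without inventing a default.

    Assumption: an override recorded in the dataset metadata wins over
    whatever the file itself reports.
    """
    if override_crs is not None:
        return override_crs
    if file_crs is not None:
        return file_crs
    # No guess is made here, so a wrong datum can never be assigned silently.
    raise MissingCRSError("no CRS in the file and no override in the dataset metadata")


print(resolve_crs(None, override_crs="EPSG:4326"))  # file lacks a CRS, metadata supplies it
print(resolve_crs("EPSG:4326"))                     # file reports its own CRS
```

Raising instead of falling back keeps the failure explicit: a dataset with no CRS from either source cannot be loaded until its metadata is corrected.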

omad closed this in #262 Jul 13, 2017
