Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

odc-geo migration #1424

Merged
merged 24 commits into from May 5, 2023
Merged

odc-geo migration #1424

merged 24 commits into from May 5, 2023

Conversation

Ariana-B
Copy link
Contributor

@Ariana-B Ariana-B commented Mar 24, 2023

Reason for this pull request

Geometry utilities have already been extracted from core into a separate package, odc-geo, however the code has not been replaced with calls to the odc-geo library.

Proposed changes

  • Replace all datacube.utils.geometry imports with the corresponding odc.geo imports

  • Remove geometry tests

  • Mark datacube geometry utilities as deprecated but still maintain backwards compatibility for now

  • Remove NetCDF Dataset wrapper

  • Mark xarray .geobox property as deprecated

  • Closes #xxxx

  • Tests added / passed

  • Fully documented, including docs/about/whats_new.rst for all changes


📚 Documentation preview 📚: https://datacube-core--1424.org.readthedocs.build/en/1424/

@robbibt
Copy link
Contributor

robbibt commented Mar 26, 2023

Hey @Ariana-B, really looking forward to this - will be fantastic to have odc-geo functionality available natively in datacube. A question from a user perspective: one of the most useful things about odc-geo for my work is being able to access odc-geo functionality via the .odc extension directly from an xarray dataset, which at the moment is achieved by importing odc.geo.xr: https://odc-geo.readthedocs.io/en/latest/intro-xr.html

Will it be possible to access that .odc extension on data loaded from the datacube with the changes you're making here too? For example, I'd love to be able to do something like this (without having to manually import odc.geo.xr):

import datacube

dc = datacube.Datacube()

ds = dc.load(product=..., x=..., y=..., time=...)

# Access odc-geo Geobox for loaded data
ds.odc.geobox

@codecov
Copy link

codecov bot commented May 3, 2023

Codecov Report

Patch coverage: 97.01% and project coverage change: -6.34 ⚠️

Comparison is base (001870f) 91.97% compared to head (c905fd4) 85.63%.

Additional details and impacted files
@@               Coverage Diff               @@
##           develop-1.9    #1424      +/-   ##
===============================================
- Coverage        91.97%   85.63%   -6.34%     
===============================================
  Files              135      134       -1     
  Lines            15070    14958     -112     
===============================================
- Hits             13860    12809    -1051     
- Misses            1210     2149     +939     
Impacted Files Coverage Δ
datacube/drivers/netcdf/_write.py 100.00% <ø> (ø)
datacube/virtual/impl.py 72.88% <66.66%> (-11.56%) ⬇️
datacube/api/query.py 93.75% <83.33%> (+0.03%) ⬆️
datacube/api/core.py 95.69% <100.00%> (+0.05%) ⬆️
datacube/api/grid_workflow.py 97.05% <100.00%> (ø)
datacube/drivers/_types.py 100.00% <100.00%> (ø)
datacube/drivers/netcdf/writer.py 94.70% <100.00%> (+0.06%) ⬆️
datacube/drivers/postgis/_api.py 87.26% <100.00%> (ø)
datacube/drivers/postgis/_connections.py 83.45% <100.00%> (ø)
datacube/drivers/postgis/_spatial.py 94.11% <100.00%> (+0.04%) ⬆️
... and 19 more

... and 6 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@Ariana-B Ariana-B changed the base branch from develop to develop-1.9 May 3, 2023 03:36
datacube/api/core.py Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to other reviewers: Ariana and I looked into it and the latest versions of the NetCDF library accept unicode strings, so this module is no longer required.

Copy link
Contributor

@SpacemanPaul SpacemanPaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looking great. See various minor inline comments, also you said "Mark datacube geometry utilities as deprecated but still maintain backwards compatibility for now" but that doesn't seem to have been done.

@Ariana-B Ariana-B requested a review from SpacemanPaul May 3, 2023 06:36
Copy link
Contributor

@SpacemanPaul SpacemanPaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks Ariana.

As discussed, we'll give Kirill a chance to review before merging.

@robbibt
Copy link
Contributor

robbibt commented May 4, 2023

Very excited about this PR! So at the moment (datacube < 1.9), the main way we access geo information about data loaded from the datacube is via the .geobox property that's automatically added to xarray data after you import datacube, e.g.:

ds.geobox

With odc.geo.xr imported, this geo information is automatically created via the new and more feature-rich .odc. extension instead, e.g.:

ds.odc.geobox

Is the plan that data loaded from datacube 1.9 will contain just the new automatically created .odc.geobox Geobox extension? Or will datacube 1.9 still add the older .geobox property to loaded data?

My feeling is that we don't need both - we should rely on the .odc.geobox one created by odc.geo instead as it's much more powerful, and it will potentially be confusing to users to have the same geobox acessible from different places (although it probably depends on how much backwards compatibility matters to us for the 1.9 release, as removing .geobox is likely to break user code).

@Kirill888
Copy link
Member

This document captures the main differences of odc-geo vs datacube.geometry.

via the .geobox property that's automatically added to xarray data after you import datacube

One could expose .geobox directly on an xarray for backwards compatibility using the following function in odc-geo

https://github.com/opendatacube/odc-geo/blob/b47103463f5beab27c48819744794e72c405bccc/odc/geo/_xr_interop.py#L762-L767

I think there is no issue having import odc.geo.xr present somewhere under import datacube code-path. It's not loaded by default in odc-geo imports because xarray is an optional and expensive dependency of odc-geo. This iss not the case with datacube.

Probably best to keep .geobox accessor for now, but it should generated deprecation warning on (first?) access. Attaching methods like this directly to xarray is now considered sub-optimal as there is now extension mechanisms present in xarray library.

@@ -44,7 +50,7 @@
mid_longitude,
)

from .tools import (
from .tools import ( # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these to __all__ below would remove the need for # noqa

Copy link
Member

@Kirill888 Kirill888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Ariana-B for going through this important but tedious change. I think this is good to merge. I would probably make sure that xx.geobox still works before making a release, it should issue deprecation warning stating that xx.odc.geobox is the new way to access this information.

from datacube.utils.math import affine_from_axis
from odc.geo import CRS, CRSError, wh_
from odc.geo.geobox import GeoBox

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing that I have missed in the previous review. This file should just delegate geobox extraction to .odc.geo.xr with a warning, this extraction logic is in odc-geo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented the change; let me know if it aligns with what you had in mind.

@robbibt
Copy link
Contributor

robbibt commented May 4, 2023

Thanks @Kirill888, happy with us just marking the current .geobox as deprecated for now.

On a similar note, should these also be marked as deprecated? They all seem like things we can access more reliably via .odc.
https://github.com/opendatacube/datacube-core/pull/1424/files/2eb5ef6c81b46939126208c2a344ce74d9f08e44#diff-32d508f00b999e6cf6606a0676f5f5262674086e8cef094545e42eb776630555L182-L187

xarray.Dataset.geobox = property(_xarray_geobox)    # type: ignore
xarray.Dataset.affine = property(_xarray_affine)    # type: ignore
xarray.Dataset.extent = property(_xarray_extent)    # type: ignore
xarray.DataArray.geobox = property(_xarray_geobox)  # type: ignore
xarray.DataArray.affine = property(_xarray_affine)  # type: ignore
xarray.DataArray.extent = property(_xarray_extent)  # type: ignore

@Ariana-B Ariana-B merged commit 5c96e60 into develop-1.9 May 5, 2023
23 checks passed
@Ariana-B Ariana-B deleted the odc_geo_migration branch May 5, 2023 06:03
@Ariana-B Ariana-B self-assigned this Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants