Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi location deprecation #1546

Merged
merged 24 commits into from Feb 14, 2024
Merged

Conversation

SpacemanPaul
Copy link
Contributor

@SpacemanPaul SpacemanPaul commented Feb 9, 2024

Reason for this pull request

Support for multiple location in the ODC is rarely used (indeed has NEVER been used as far as I am aware) and adds needless complexity to the index driver API, making it unnecessarily difficult to implement new features and maintain existing features which ARE used regularly. What support for multiple locations exists is not particularly consistent, and appears to only be potentially useful in very specific circumstances (i.e. the NCI, as it was 5 years ago).

As per EP-13, this PR deprecates all location methods, and either introduces new simpler methods to replace them or makes minor changes to the behaviour of existing methods to better suit a single location-per-dataset design in future.

It's a bit of a beast, my sincere apologies to the reviewers.

Proposed changes

Refer to section 7.4 of EP-13 for a full description of how location behaviour will change from 1.8 through 1.9 to 2.0, but in brief:

Dataset Resource methods

  • get_locations(id_) deprecated in favour of new get_location(id_) method that always returns a single location.
  • get_archived_locations(), get_archived_location_times() deprecated (no replacements as functionality will be dropped)
  • add_location(), remove_location(), archive_location() and restore_location() all deprecated (no replacements as functionality will be dropped).
  • get_datasets_for_location() kept for now. (No changes)
  • update() has some backwards incompatible changes when used to modify location. In 1.8, update with a new location always added the location, keeping the old one. In 1.9, if the dataset currently has a single location in the index, AND the Dataset model passed into update has a single (new) location, then the new location REPLACES the old location, which is removed (not archived). If either the dataset currently has multiple locations in the index OR the model passed to update has multiple locations, then new locations are added, keeping all old ones, as per 1.8 behaviour.

Dataset model

  • The uris argument to the Dataset constructor is now deprecated in favour of the new uri argument that takes a single location only.

  • uris was formerly an attribute on the Dataset model, it now a deprecated read-only property exposing a private attribute _uris.

  • New uri attribute on the Dataset model holds a single uri.

  • New temporary transition property is_legacy on Dataset object returns true if the dataset has multiple locations, false if one or zero locations.

  • The local_uri and local_path properties on the the Dataset object will be preserved, with their behaviour in 2.0 trivially downgraded to:

    • local_uri() returns the uri if is "local" (file:) or None if no location, or a non-local uri.
    • local_path() will return the uri as a file path if it is local, or None otherwise.

DatasetTuple

The dataset tuple object, as used for bulk read and copy operations, now has some special sauce to assist migration from a multi-location design to a single-location design.

Misc

  • Change "locations" or "uris" to "location" or "uri" in various CLI and API report formats. The behaviour of some CLI and API reporting endpoints may change slightly in the multilocation case.
  • Comment tests where deprecated methods/properties are called, to make it easier to spot internal usages of deprecated methods/properties.
  • Misc code cleanup, mostly related to location handling, but also some parallel cleanup of other recently deprecated functionality, e.g. the old GridSpec class. I also changed all the remaining self.types references in the postgres driver to self.products.
  • Tests added / passed
  • Fully documented, including docs/about/whats_new.rst for all changes

Copy link

codecov bot commented Feb 9, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (b0946a3) 85.43% compared to head (b0e009b) 85.65%.

Files Patch % Lines
datacube/index/postgres/_datasets.py 95.00% 2 Missing ⚠️
datacube/index/abstract.py 96.66% 1 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           develop-1.9    #1546      +/-   ##
===============================================
+ Coverage        85.43%   85.65%   +0.21%     
===============================================
  Files              140      140              
  Lines            15047    15175     +128     
===============================================
+ Hits             12856    12998     +142     
+ Misses            2191     2177      -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@SpacemanPaul SpacemanPaul marked this pull request as ready for review February 12, 2024 02:13
datacube/index/hl.py Show resolved Hide resolved
datacube/index/null/_datasets.py Outdated Show resolved Hide resolved
datacube/index/memory/_datasets.py Show resolved Hide resolved
datacube/index/memory/_datasets.py Show resolved Hide resolved
datacube/index/postgis/_datasets.py Show resolved Hide resolved
datacube/index/postgis/_datasets.py Outdated Show resolved Hide resolved
datacube/index/postgis/_datasets.py Outdated Show resolved Hide resolved
datacube/index/postgres/_datasets.py Show resolved Hide resolved
datacube/model/__init__.py Show resolved Hide resolved
datacube/model/__init__.py Show resolved Hide resolved
Copy link
Contributor

@Ariana-B Ariana-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, although if at all possible it would be nice if we could better differentiate between the various 'locations' methods/properties to avoid confusion

@SpacemanPaul SpacemanPaul merged commit 2ee5434 into develop-1.9 Feb 14, 2024
28 checks passed
@SpacemanPaul SpacemanPaul deleted the multi_location_deprecation branch February 14, 2024 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants