Skip to content
This repository has been archived by the owner on Sep 9, 2024. It is now read-only.

Features/update asset roles #73

Merged
merged 11 commits into from
Feb 6, 2024
Merged

Conversation

anilnatha
Copy link
Collaborator

@anilnatha anilnatha commented Jan 31, 2024

Purpose

Updated dataset asset lists so that we now use the URI/HREF of the asset as the key in the list the assets. The role/type of the data (e.g. "metadata", "data", etc.) is now an attribute of the asset with a key of "roles".

Proposed Changes

  • [CHANGE] Use href as unique keys for list of assets instead of asset role

Issues

Testing

  • Tested using built-in unit tests

…onger used a key to group assets. Instead an asset role is a piece of metadata for an asset and the fileame of the asset is used as the key.
…fied as metadata key/value pair. In these situations we use the key as the asset role.
@coveralls
Copy link

coveralls commented Jan 31, 2024

Coverage Status

coverage: 85.912% (+0.2%) from 85.749%
when pulling e31fbd5 on features/update-asset-roles
into 1c06546 on develop.

Comment on lines 71 to 74
location = dataset['assets'][filename]['href']
file_type = dataset['assets'][filename]['type']
title = dataset['assets'][filename]['title']
description = dataset['assets'][filename]['description']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type, Title, and Description are all optional elements. What will happen here if they aren't provided?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this code block is we are parsing STAC from U-DS differently than we are parsing STAC from the to/from_stac methods. in the to/from_stac methods, we map the DataFile.type property to an asset_role, here we are mapping it to the asset.type property, which are not the same and will cause issues here. (e.g. the 'type' of a file might be "application/netcdf" but if we call "to_stac" on that, we'll get some bizarre asset-roles.

here we should either treat this like the CMR query (where we look for asset roles and if they are there use them, if not, check the "key" if it looks like "data" or "metadata" then make that the filetype, but we cannot mix/match these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if we shouldn't just call "from_Stac" against the results here to keep it acting the same?

mike-gangl and others added 3 commits January 31, 2024 08:43
…ding to be stored for the properties type and roles. Also added new property "roles" to Datafile to assist with the previously mentioned improvement.
@anilnatha
Copy link
Collaborator Author

Hey @mike-gangl I've made the necessary updates, and tested using both the unit tests (all passed) and the 5_working_with_chirp_data.ipynb tutorial notebook. please review at your earliest.

I was being cautious with trying not to conflate the role and type information. The changes I've made work for CMR presently, and my hope is that they will also work if/when changes are made to CMR to adopt the usage of role information in the same manner that we will be using it with stac.

The information you provided through this effort really helped me better understand our usage of Collections, datasets, and datafiles. 👍🏽

@anilnatha
Copy link
Collaborator Author

Oh and the checks on Github are failing because calls to SPS are returning a status of 500. will need to look into why that is happening.

Comment on lines 71 to 74
location = dataset['assets'][filename]['href']
file_type = dataset['assets'][filename]['type']
title = dataset['assets'][filename]['title']
description = dataset['assets'][filename]['description']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if we shouldn't just call "from_Stac" against the results here to keep it acting the same?

@mike-gangl mike-gangl merged commit 746b930 into develop Feb 6, 2024
7 checks passed
@mike-gangl mike-gangl deleted the features/update-asset-roles branch February 6, 2024 17:17
mike-gangl added a commit that referenced this pull request Feb 12, 2024
* Downgrade38 (#59)

* downgrade to support python3.8

* updated poetry version

---------

Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>

* Pytest (#57)

* adding pytest coverage commands

* remove unneeded python versions form testing

* updated poetry version

* updated coverate command

* Update python-app.yml

added coverall support

* fixed poetry lock

---------

Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>

* version bump and release fix to use python 3.8 and correct poetry version (#61)

Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>

* Added missing files and improved README (#55)

Co-authored-by: mike-gangl <59702631+mike-gangl@users.noreply.github.com>
Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>

* 0.2.1 merge (#62) (#63)

* Downgrade38 (#59)

* downgrade to support python3.8

* updated poetry version

---------



* Pytest (#57)

* adding pytest coverage commands

* remove unneeded python versions form testing

* updated poetry version

* updated coverate command

* Update python-app.yml

added coverall support

* fixed poetry lock

---------



* version bump and release fix to use python 3.8 and correct poetry version (#61)



---------

Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>

* Issues/58 (#64)

* removed un-needed mercury dashboard

* restructured environments to dev/test/prod values

* added environment settings, added process tests (high level) to ensure venue and project setting was successfully available.

* updated unity environments with correct endpoints

* updated changelog

* fixed json encoding error

* Features/update asset roles (#73)

* Modified the way assets are catalogued so that the asset type is no longer used a key to group assets. Instead an asset role is a piece of metadata for an asset and the fileame of the asset is used as the key.

* Updated changelog and version number.

* We now use the href and not just the filename for the asset key to ensure keys are unique.

* Modified how to handle CMR assets that don't have an asset role specified as metadata key/value pair. In these situations we use the key as the asset role.

* Updated changelog.

* Fixed description in changelog

* added correct client ids for dev, prod

* Updated many methods in Collection to avoid conflating the values needing to be stored for the properties type and roles. Also added new property "roles" to Datafile to assist with the previously mentioned improvement.

* Fixed bug with to_stac method.

* Fixed pluralization with roles key in data_service

---------

Co-authored-by: mike-gangl <michael.e.gangl@jpl.nasa.gov>

* roles is already an array, removed "array of array" (#74)

* Filename key (#75)

* added conversion of asset key to filename if it starts with './'

* smarter basepath naming

---------

Co-authored-by: Anil Natha <anilnatha@users.noreply.github.com>
Co-authored-by: Rishi Verma <riverma@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants