Ghgc 269/fix workflows api #179
…ackend convention
…e item assets for GHGC
1. Removed the CMR input and its usages.
2. Added the missing attributes to s3Input, which helps with item discovery. The s3Input is later used in the dataset schema and then in the cogDataset schema.
3. Added pydantic validators for the new attributes.
…collection matches
…eda-data-airflow into GHGC-269/fix-workflows_api
…eda-data-airflow into GHGC-269/fix-workflows_api
fix: list workflows API
Add renders to schema
2. Filter out unnecessary attributes that do not align with the STAC spec.
3. User-provided attributes take precedence over the generated ones.
…_payload json prepared for ingest api now allows arbitrary key value pairs
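The filtering and precedence rules mentioned in the commit notes can be pictured with a small sketch. This is not the code from this PR: the function name `build_item_asset`, the `DISCOVERY_ONLY_KEYS` set, and the `GENERATED_DEFAULTS` values are all assumptions chosen for illustration, based only on the asset shapes used in the test payloads.

```python
from typing import Any, Dict

# Assumption: keys that only steer discovery and are not STAC asset fields.
DISCOVERY_ONLY_KEYS = {"regex"}

# Assumption: values generated when the user does not supply them.
GENERATED_DEFAULTS: Dict[str, Any] = {
    "type": "image/tiff; application=geotiff; profile=cloud-optimized",
    "roles": ["data", "layer"],
}


def build_item_asset(user_asset: Dict[str, Any]) -> Dict[str, Any]:
    """Merge generated defaults with user input, then drop discovery-only keys."""
    # The later mapping wins, so user-provided values override generated ones.
    merged = {**GENERATED_DEFAULTS, **user_asset}
    return {k: v for k, v in merged.items() if k not in DISCOVERY_ONLY_KEYS}


asset = build_item_asset(
    {
        "title": "Tmax Above 86",
        "roles": ["data"],
        "regex": ".*-ssp126_(.*)_tmax_above_86.tif",
    }
)
```

Here `asset` keeps the user's `roles`, picks up the generated `type`, and no longer carries `regex`.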
Testing
Case 1: No discovery_items.assets, but item_assets provided
omi-trno2-item-assets-only
{
"collection": "omi-trno2-item-assets-only",
"data_type": "cog",
"spatial_extent": {
"xmin": -127,
"ymin": 29,
"xmax": -103,
"ymax": 52
},
"temporal_extent": {
"startdate": "1995-01-01T00:00:00Z",
"enddate": "2095-03-31T00:00:00Z"
},
"description": "OMI_trno2 - 0.10 x 0.10 Annual as Cloud-Optimized GeoTIFFs (COGs)",
"discovery_items": [
{
"bucket": "veda-data-store-staging",
"datetime_range": "year",
"discovery": "s3",
"filename_regex": "^(.*).tif$",
"prefix": "OMI_trno2-COG/"
}
],
"is_periodic": true,
"license": "MIT",
"sample_files": ["s3://veda-data-store-staging/OMI_trno2-COG/OMI_trno2_0.10x0.10_2005_Col3_V4.tif"],
"providers": [
{
"name": "NASA VEDA",
"roles": [
"host"
],
"url": "https://www.earthdata.nasa.gov/dashboard/"
}
],
"renders": {
"dashboard": {
"assets": [
"cog_default"
],
"colormap_name": "reds",
"rescale": [
[
0,
3000000000000000
]
],
"title": "VEDA Dashboard Render Parameters"
}
},
"item_assets": {
"no2": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "NO2 values",
"description": "description",
"other_attr": "lets see"
}
},
"assets": {
"thumbnail": {
"description": "Photo by [Mick Truyts](https://unsplash.com/photos/x6WQeNYJC1w) (Power plant shooting steam at the sky)",
"href": "https://thumbnails.openveda.cloud/no2--dataset-cover.jpg",
"roles": [
"thumbnail"
],
"title": "Thumbnail",
"type": "image/jpeg"
}
},
"time_density": "year",
"title": "DELETE ME OMI_trno2"
}
Output:
Case 2: item_assets and discovery.assets provided
omi-trno2-custom-assets
{
"collection": "omi-trno2-custom-assets",
"data_type": "cog",
"spatial_extent": {
"xmin": -127,
"ymin": 29,
"xmax": -103,
"ymax": 52
},
"temporal_extent": {
"startdate": "1995-01-01T00:00:00Z",
"enddate": "2095-03-31T00:00:00Z"
},
"description": "OMI_trno2 - 0.10 x 0.10 Annual as Cloud-Optimized GeoTIFFs (COGs)",
"discovery_items": [
{
"bucket": "veda-data-store-staging",
"datetime_range": "year",
"discovery": "s3",
"filename_regex": "^(.*).tif$",
"prefix": "OMI_trno2-COG/",
"assets": {
"no2": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "NO2 values",
"description": "description",
"other_attr": "lets see",
"regex": ".*"
}
}
}
],
"is_periodic": true,
"license": "MIT",
"sample_files": ["s3://veda-data-store-staging/OMI_trno2-COG/OMI_trno2_0.10x0.10_2005_Col3_V4.tif"],
"providers": [
{
"name": "NASA VEDA",
"roles": [
"host"
],
"url": "https://www.earthdata.nasa.gov/dashboard/"
}
],
"renders": {
"dashboard": {
"assets": [
"cog_default"
],
"colormap_name": "reds",
"rescale": [
[
0,
3000000000000000
]
],
"title": "VEDA Dashboard Render Parameters"
}
},
"assets": {
"thumbnail": {
"description": "Photo by [Mick Truyts](https://unsplash.com/photos/x6WQeNYJC1w) (Power plant shooting steam at the sky)",
"href": "https://thumbnails.openveda.cloud/no2--dataset-cover.jpg",
"roles": [
"thumbnail"
],
"title": "Thumbnail",
"type": "image/jpeg"
}
},
"item_assets": {
"no2": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "NO2 values",
"description": "description",
"other_attr": "lets see"
}
},
"time_density": "year",
"title": "DELETE ME OMI_trno2"
}
Output:
Case 3: No assets anywhere
omi-trno2-no-assets
{
"collection": "omi-trno2-no-assets",
"data_type": "cog",
"spatial_extent": {
"xmin": -127,
"ymin": 29,
"xmax": -103,
"ymax": 52
},
"temporal_extent": {
"startdate": "1995-01-01T00:00:00Z",
"enddate": "2095-03-31T00:00:00Z"
},
"description": "OMI_trno2 - 0.10 x 0.10 Annual as Cloud-Optimized GeoTIFFs (COGs)",
"discovery_items": [
{
"bucket": "veda-data-store-staging",
"datetime_range": "year",
"discovery": "s3",
"filename_regex": "^(.*).tif$",
"prefix": "OMI_trno2-COG/"
}
],
"is_periodic": true,
"license": "MIT",
"sample_files": ["s3://veda-data-store-staging/OMI_trno2-COG/OMI_trno2_0.10x0.10_2005_Col3_V4.tif"],
"providers": [
{
"name": "NASA VEDA",
"roles": [
"host"
],
"url": "https://www.earthdata.nasa.gov/dashboard/"
}
],
"renders": {
"dashboard": {
"assets": [
"cog_default"
],
"colormap_name": "reds",
"rescale": [
[
0,
3000000000000000
]
],
"title": "VEDA Dashboard Render Parameters"
}
},
"assets": {
"thumbnail": {
"description": "Photo by [Mick Truyts](https://unsplash.com/photos/x6WQeNYJC1w) (Power plant shooting steam at the sky)",
"href": "https://thumbnails.openveda.cloud/no2--dataset-cover.jpg",
"roles": [
"thumbnail"
],
"title": "Thumbnail",
"type": "image/jpeg"
}
},
"time_density": "year",
"title": "DELETE ME OMI_trno2"
}
Output:
Case 4: multi assets
climdex-tmaxxf-access-cm2-ssp126-multi-asset
{
"collection": "climdex-tmaxxf-access-cm2-ssp126-multi-asset",
"data_type": "cog",
"spatial_extent": {
"xmin": -180,
"ymin": -90,
"xmax": 180,
"ymax": 90
},
"temporal_extent": {
"startdate": "2015-01-01T00:00:00Z",
"enddate": "2101-12-31T23:59:59Z"
},
"description": "CLIMDEX ACCESS CM2 SSP125 - variable tmaxXF",
"is_periodic": true,
"license": "MIT",
"item_assets": {
"cog_default": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Default COG Layer",
"description": "Cloud optimized default layer to display on map"
},
"tmax_above_86": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Tmax Above 86",
"description": "Tmax Above 86"
},
"tmax_above_90": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Tmax Above 90",
"description": "Tmax Above 90"
},
"tmax_above_100": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Tmax Above 100",
"description": "Tmax Above 100"
},
"tmax_above_110": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Tmax Above 110",
"description": "Tmax Above 110"
},
"tmax_above_115": {
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Tmax Above 115",
"description": "Tmax Above 115"
}
},
"sample_files": ["s3://veda-data-store-staging/climdex-tmaxxf-access-cm2-ssp126/tmaxXF-ACCESS-CM2-ssp126_2099_tmax_above_86.tif"],
"providers": [
{
"name": "NASA VEDA",
"url": "https://www.earthdata.nasa.gov/dashboard/",
"roles": [
"host"
]
}
],
"renders": {
"dashboard": {
"assets": [
"cog_default"
],
"colormap_name": "reds",
"rescale": [
[
0,
3000000000000000
]
],
"title": "VEDA Dashboard Render Parameters"
}
},
"assets": {
"thumbnail": {
"title": "Thumbnail",
"description": "Photo by NASA (CMIP6 Climdex TmaxXF Screenshot)",
"href": "https://thumbnails.openveda.cloud/cmip6-climdex-tmaxxf-access-cm2.png",
"type": "image/png",
"roles": ["thumbnail"]
}
},
"time_density": "year",
"title": "DELETE ME CLIMDEX",
"discovery_items": [
{
"collection": "climdex-tmaxxf-access-cm2-ssp126-deleteme",
"bucket": "veda-data-store-staging",
"prefix": "climdex-tmaxxf-access-cm2-ssp126/",
"filename_regex": ".*-ssp126_209(.*)_tmax.*.tif$",
"id_regex": ".*-ssp126_(.*)_tmax.*.tif$",
"id_template": "climdex-tmaxxf-access-cm2-ssp126-{}",
"datetime_range": "year",
"assets": {
"tmax_above_86": {
"title": "Tmax Above 86",
"description": "Tmax Above 86",
"regex": ".*-ssp126_(.*)_tmax_above_86.tif"
},
"tmax_above_90": {
"title": "Tmax Above 90",
"description": "Tmax Above 90",
"regex": ".*-ssp126_(.*)_tmax_above_90.tif"
},
"tmax_above_100": {
"title": "Tmax Above 100",
"description": "Tmax Above 100",
"regex": ".*-ssp126_(.*)_tmax_above_100.tif"
},
"tmax_above_110": {
"title": "Tmax Above 110",
"description": "Tmax Above 110",
"regex": ".*-ssp126_(.*)_tmax_above_110.tif"
},
"tmax_above_115": {
"title": "Tmax Above 115",
"description": "Tmax Above 115",
"regex": ".*-ssp126_(.*)_tmax_above_115.tif"
}
},
"discovery": "s3",
"upload": false
}
]
}
Output:
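To make the Case 4 discovery item easier to read, here is a rough sketch of how its `filename_regex`, `id_regex`, `id_template`, and per-asset `regex` values could combine to group files into multi-asset items. The regex strings are copied from the payload (only two of the five asset patterns are shown); the `classify` helper and the use of `re.match` are assumptions, not the discovery code that actually runs in Airflow.

```python
import re
from typing import Dict, Optional

# Values copied from the Case 4 discovery item.
FILENAME_REGEX = r".*-ssp126_209(.*)_tmax.*.tif$"
ID_REGEX = r".*-ssp126_(.*)_tmax.*.tif$"
ID_TEMPLATE = "climdex-tmaxxf-access-cm2-ssp126-{}"
ASSET_REGEXES: Dict[str, str] = {
    "tmax_above_86": r".*-ssp126_(.*)_tmax_above_86.tif",
    "tmax_above_90": r".*-ssp126_(.*)_tmax_above_90.tif",
}


def classify(filename: str) -> Optional[Dict[str, str]]:
    """Return the item id and asset key for a file, or None if it is skipped."""
    # Hypothetical helper: files that fail the filename filter are not discovered.
    if not re.match(FILENAME_REGEX, filename):
        return None
    id_match = re.match(ID_REGEX, filename)
    if id_match is None:
        return None
    # The first asset pattern that matches decides the asset key.
    asset_key = next(
        (key for key, pattern in ASSET_REGEXES.items() if re.match(pattern, filename)),
        None,
    )
    if asset_key is None:
        return None
    # The captured group (the year here) fills the id template.
    return {"item_id": ID_TEMPLATE.format(id_match.group(1)), "asset": asset_key}


result = classify("tmaxXF-ACCESS-CM2-ssp126_2099_tmax_above_86.tif")
skipped = classify("tmaxXF-ACCESS-CM2-ssp126_2050_tmax_above_86.tif")
```

Under these assumptions the 2099 file lands in item `climdex-tmaxxf-access-cm2-ssp126-2099` under the `tmax_above_86` asset key, while the 2050 file is skipped because `filename_regex` only admits years starting with 209.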
}
}
response = await start_discovery_workflow_execution(discovery)
if (dataset.item_assets):
Use item_assets if no assets are provided, and if that's also not provided, send nothing to airflow - which handles adding the cog_default asset there
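A minimal sketch of the fallback order described in this comment, assuming plain dictionaries for the discovery item and the dataset's item_assets. The helper names `choose_discovery_assets` and `build_discovery_payload` are hypothetical and are not taken from the PR.

```python
from typing import Any, Dict, Optional

Assets = Dict[str, Dict[str, Any]]


def choose_discovery_assets(
    discovery_assets: Optional[Assets], item_assets: Optional[Assets]
) -> Optional[Assets]:
    """Prefer assets on the discovery item, then item_assets, else nothing."""
    if discovery_assets:
        return discovery_assets
    if item_assets:
        return item_assets
    # Returning None means the key is omitted and the downstream default applies.
    return None


def build_discovery_payload(
    discovery_item: Dict[str, Any], item_assets: Optional[Assets]
) -> Dict[str, Any]:
    """Copy the discovery item and attach assets only when some were chosen."""
    payload = {k: v for k, v in discovery_item.items() if k != "assets"}
    assets = choose_discovery_assets(discovery_item.get("assets"), item_assets)
    if assets is not None:
        payload["assets"] = assets
    return payload


base = {"bucket": "veda-data-store-staging", "prefix": "OMI_trno2-COG/"}
case1 = build_discovery_payload(base, {"no2": {"title": "NO2 values"}})
case3 = build_discovery_payload(base, None)
```

In this sketch, Case 1 (item_assets only) forwards the `no2` asset, and Case 3 (no assets anywhere) sends no `assets` key at all.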
I think we need an exclude_unset=True somewhere in the dataset path to publishing a collection record to pgstac, maybe as the last step before posting to the ingest API? I see some published
EDIT: this also means we don't have all the validations we thought we had in place for the ingest-api. I think we will probably be addressing that in the transactions work in a way that can be applied in the ingest api as well.
I think it is just a switch to the stac_pydantic to_dict method which defaults exclude_unset=True
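To show what excluding unset fields changes in the serialized record, here is a standard-library stand-in. It is not pydantic or stac_pydantic code: `CollectionSketch` and its `to_dict` method are hypothetical and only imitate the idea of tracking which fields the caller actually provided.

```python
from typing import Any, Dict


class CollectionSketch:
    """Toy record that remembers which fields were explicitly provided."""

    FIELDS = ("id", "assets", "renders", "stac_extensions")

    def __init__(self, **provided: Any) -> None:
        unknown = set(provided) - set(self.FIELDS)
        if unknown:
            raise TypeError(f"unexpected fields: {sorted(unknown)}")
        self.fields_set = set(provided)
        # Fields that were not provided fall back to None.
        self.values = {name: provided.get(name) for name in self.FIELDS}

    def to_dict(self, exclude_unset: bool = True) -> Dict[str, Any]:
        """Serialize, optionally dropping fields the caller never set."""
        if not exclude_unset:
            return dict(self.values)
        return {k: v for k, v in self.values.items() if k in self.fields_set}


record = CollectionSketch(id="omi-trno2-no-assets", renders=None)
full = record.to_dict(exclude_unset=False)
trimmed = record.to_dict()
```

`full` carries `None` for every optional field, while `trimmed` contains only `id` and the explicitly supplied `renders`. An explicit `None` survives, which is the difference between excluding unset fields and excluding null ones.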
    print("Success:", response.json())
else:
    print("Error:", response.status_code, response.text)
There were apparently two ingest methods defined - that tripped me up for a while 😓
oh no! 🙃
@@ -57,7 +57,6 @@ class DashboardCollection(Collection):
    assets: Optional[Dict]
    extent: SpatioTemporalExtent
    renders: Optional[Dict]
    stac_extensions: Optional[List[str]]
stac_pydantic.Collection already defines this
One change to id_template default value + I am still looking at whether these fields should be required or if the preference was to default them when not provided (from datasets/publish). Either way I think we can open a new issue to address this error.
{'loc': ('body', 'COGDataset', 'spatial_extent'), 'msg': 'field required', 'type': 'value_error.missing'},
{'loc': ('body', 'COGDataset', 'temporal_extent'), 'msg': 'field required', 'type': 'value_error.missing'},
{'loc': ('body', 'COGDataset', 'sample_files'), 'msg': 'field required', 'type': 'value_error.missing'}
Summary: Summary of changes
Addresses GHGC-269: Fix workflows api for GHGC
Changes
PR Checklist