Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minimal profile for backends #56

Closed
wants to merge 5 commits into from
Closed

minimal profile for backends #56

wants to merge 5 commits into from

Conversation

jdries
Copy link
Contributor

@jdries jdries commented Dec 15, 2022

No description provided.

@jdries jdries requested a review from m-mohr December 15, 2022 13:41
## Collections
A minimal backend shall have at least one raster collection, allowing to build raster data cubes.

## Processes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but the problem is that ESA asked for something even more minimal, as the core profile for openEO platform is already quite demanding. (Minimal openEO backends would not be able to join the openEO platform federation, then they would indeed hace to support the core profile.)

Copy link
Collaborator

@m-mohr m-mohr Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... somewhat makes sense. A couple of questions arise though:

  • How was this list compiled? Looking at the "minimal" list is looks like it could be more minimal (e.g. remove normalized_difference? filters?).
  • What's the difference between a minimal openEO backend and a backend that wants to join the federation? Why do we specifiy something for a minimal openEO backend here? Thus, I understand this as "minimal openEO Platform backend".
  • What's the use of having two lists? One specified a minimal profile, the other one was until now a requirement to be implemented as part of the federation contract. That's contradicting.
  • Should we have this in one place and one table?
  • Do these lists actually make sense at all? Maybe more as a guideline? A "vector backend" would never be able to fullfill these...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo it makes no sense to provide and maintain two different lists as platform if we don't do anything with this one here.
As it's not that easy to actually fulfil the platform core profile (as discussed our backend is missing some of them) I would be in favour of a simpler list. Having all processes listed here would of course not mean that you automatically become a platform backend (there are other points to consider that should be clearer defined somewhere).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A profile is indeed mostly a guideline. Basically, anyone wanting to implement a backend will need guidance on which processes to implement first, and then a minimal list like this can be helpful.
If we don't have a 'minimal' list for raster backends, providers can implement anything at all and call themselves openEO compliant.
If we only have the openEO platform list, then the threshold is quite high.
Hence the two lists?
As to the contents of the list, I'm fine with reducing it or adapting in any way.
I specified it here because didn't have another good location, but openEO.org is also a good option?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have a 'minimal' list for raster backends, providers can implement anything at all and call themselves openEO compliant. [...] but openEO.org is also a good option?

Indeed and that is intentional. You can choose what you need or want, openEO doesn't tell you what you have to implement. If you just implement load_collection and save_result you can be openEO compliant and I don't think why we should raise the bar. Guidance is okay, but it shouldn't be required to be openEO compliant. But that's unrelated to openEO Platform, where we can define a profile for becoming part of the federation or so. But we have two lists now and I still don't exactly understand why we need two of them.

If you actually want to define a "minimal" guideline for openEO itself, I'd say propose it in the PSC and/or community meetings. If this is meant for openEO Platform then it's a different story and should live here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, moved the discussion elsewhere so we can indeed have it in the PSC:
https://github.com/Open-EO/openeo-api/issues/490


https://api.openeo.org/#tag/Capabilities/operation/connect

[Authentication](https://api.openeo.org/#section/Authentication) is also mandatory, but this profile does not define a specific method. We highlight that openID connect + PKCE is one of the only standardized and secure methods that are available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PKCE is conceptually only a consideration between the client and identity provider, the back-end has little involvement (at most in specifying the grant_types in the default_clients for providers under /credentials/oidc). To keep the scope of this minimal profile minimal, it is maybe better to not mention PKCE?



## File formats
[GeoTiff](./fileformats.md#geotiff) is the only mandatory output format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have GeoJSON as mandatory input/output format too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does any of the processes listed actually require this as output format? I don't think so so am wondering why we would list it here as output format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as output format probably not, but filter_spatial requires it as input format I think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True! If we add input format here it should be included as the list filter_spatial below as minimal process.


https://api.openeo.org/#tag/Capabilities

https://api.openeo.org/#tag/Capabilities/operation/connect
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@m-mohr are these #tag/... URL stable enough to point to a specific API subset or are there better ways?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As stable as redoc is... Should be good enough for now.

### Data Cubes

- `apply`: Apply a process to each pixel
- `apply_dimension`: Apply a process to pixels along a dimension
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't reduce_dimension a better candidate for a minimal profile than apply_dimension? (especially because there are no UDFs in the minimal profile)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, under Math and logic below there are currently no processes that take an array as input (like mean, min, max), so apply_dimension nor reduce_dimension can not be used anyway

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, you need reduce_dimension to be able to easily get results for the only supported file format, GeoTiff, in the minimal profile. So reduce_dimension should be included with at least a small number of reducers for composites (min/max/mean?), I'm not sure about apply_dimension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batch jobs do allow writing to geotiff without reduce_dimension, it just writes a geotiff per date in that case.
But we can indeed switch to reduce_dimension, not sure why I went for apply_dimension...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your implementation can do that and it's allowed, indeed. But that's not minimal for me. Minimal, for me, is providing a 3D cube to write a single GeoTiff.

- `not`: Inverting a boolean
- `or`: Logical OR
- `xor`: Logical XOR (exclusive or)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted above about apply_dimension: there is probably need for some minimal array proceses too: array_element, mean, min, max?

## Collections
A minimal backend shall have at least one raster collection, allowing to build raster data cubes.

## Processes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo it makes no sense to provide and maintain two different lists as platform if we don't do anything with this one here.
As it's not that easy to actually fulfil the platform core profile (as discussed our backend is missing some of them) I would be in favour of a simpler list. Having all processes listed here would of course not mean that you automatically become a platform backend (there are other points to consider that should be clearer defined somewhere).

- `gt`: Greater than comparison
- `gte`: Greater than or equal to comparison
- `if`: If-Then-Else conditional
- `is_infinite`: Value is an infinite number - **experimental**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not list experimental processes as required in a minimal_profile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it is also not implemented yet by any backend afaik. I thought backends would implement it soon when we added it to the other profile as it's a relatively simple process, but that did not happen yet. So I think we can remove it for now (or start implementing it).


For backends with more than one collection:
- `merge_cubes`: Merge two data cubes
- `resample_cube_spatial`: Resample the spatial dimensions to match a target data cube
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this one is not that minimal. It's from this list the only one we couldn't implemented in our backend. This is certainly due to our approach but maybe also means it's a more complicated to support than others.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resample_cube_spatial is now implicitly part of merge_cubes and mask as of v2.0.0 so if we have merge_cubes it seems we should also have resample_cube_spatial. But hearing that SH can't implement it(?), I fear that we may actually regret doing Open-EO/openeo-processes#405 - Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dthiex if you do support merge_cubes, but not resample_cube_spatial, is it possible that cubes on sentinelhub or somehow aligned already by default more easily? Or does merge_cubes result in an error in a lot of cases?

In any case, if you remove it from this list, then there's a lot of example process graphs that don't work on a minimal backend.

@@ -0,0 +1,175 @@
# Minimal profile for openEO backends
This document describes the requirements for a backend that aims to provide minimal functionality for access and processing of gridded
earth observation data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earth Observation (EO)



## File formats
[GeoTiff](./fileformats.md#geotiff) is the only mandatory output format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does any of the processes listed actually require this as output format? I don't think so so am wondering why we would list it here as output format.

@dthiex
Copy link
Contributor

dthiex commented May 5, 2023

@jdries I assume we currently don't want to merge this and want to wait for the profiles defined within SAP09. Can we close this? I don't think the effort here was useless and once we have the profiles define we can us the description here and just need to updated it with the respective info.

@jdries
Copy link
Contributor Author

jdries commented May 5, 2023

indeed closing this one, in fact, we agreed to open one in openeo.org

@jdries jdries closed this May 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants