Dynamic Process function generation #40

Open
bgoesswe opened this issue Nov 21, 2018 · 9 comments

Comments

@bgoesswe
Member

Since back ends may support different sets of processes, and these can be retrieved from the GET /processes endpoint, it would be a major improvement to generate the process functions dynamically once a back end provider is chosen.
e.g.: https://stackoverflow.com/questions/23812760/dynamic-functions-creation-from-json-python

It is at least something I want to look into.
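A minimal sketch of the idea, assuming a process listing shaped like a list of objects with an `id`, an optional `description` and a `parameters` list. The names `make_process_function`, `generate_process_functions` and the `example_process` entry are hypothetical and only serve as illustration; the generated functions just return a process graph node as a plain dict.

```python
from typing import Any, Callable, Dict, List

Node = Dict[str, Any]


def make_process_function(spec: Dict[str, Any]) -> Callable[..., Node]:
    """Build a function that returns a process graph node for one process spec."""
    process_id = spec["id"]
    allowed = {param["name"] for param in spec.get("parameters", [])}

    def process(**arguments: Any) -> Node:
        # Reject arguments the backend did not declare for this process.
        unknown = set(arguments) - allowed
        if unknown:
            raise TypeError(f"{process_id}() got unexpected arguments: {sorted(unknown)}")
        return {"process_id": process_id, "arguments": arguments}

    # Copy name and description so help() and tracebacks stay readable.
    process.__name__ = process_id
    process.__doc__ = spec.get("description", "")
    return process


def generate_process_functions(listing: List[Dict[str, Any]]) -> Dict[str, Callable[..., Node]]:
    """Map each process id in the listing to a generated function."""
    return {spec["id"]: make_process_function(spec) for spec in listing}


# Hypothetical listing, standing in for the parsed response of GET /processes.
EXAMPLE_LISTING = [
    {
        "id": "example_process",
        "description": "Illustrative process, not part of any real backend.",
        "parameters": [{"name": "data"}, {"name": "factor"}],
    },
]

functions = generate_process_functions(EXAMPLE_LISTING)
node = functions["example_process"](data="cube1", factor=2)
```

Because the functions only exist at runtime, IDE auto-completion and static inspection will not see them, which is the trade-off raised later in this thread.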

@lforesta
Contributor

This is an old issue, but it's becoming more important as time passes since back-ends have more functionality now.
It also seems to me that non-custom processes are missing in the client, e.g. linear_scale_range was not there (#96). Does the current client need a one-to-one mapping of the processes defined in openEO?

@soxofaan
Member

I think there are 2 separate aspects to this:

  1. make sure all "official" processes from the openEO API are supported by the python client
  2. detect that a backend does not support a process and fail early in the client instead of waiting for the backend to fail

Concerning 1.: I'm not a big fan of dynamic generation of functions/methods, as it breaks some features that are important for the end user: normal discovery and documentation of methods (by looking at the source code of ImageCollectionClient, or using the code inspection features of their IDE), straightforward exceptions and backtraces when something goes wrong, a lower barrier to entry for contributing/fixing things, ....

An alternative solution for 1. is still using traditional hardcoded methods but using unit tests that compare the openEO API description with the available methods of ImageCollectionClient and fail when something is missing. We are using this approach in the python driver to support dedicated python exceptions for each openEO error code (as defined in https://open-eo.github.io/openeo-api/errors/):
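A minimal sketch of such a check, not the actual test from the python driver. `ExampleClient` is a hypothetical stand-in for ImageCollectionClient, and the list of official process ids is hardcoded here where a real test would parse it from the openEO API description.

```python
from typing import Iterable, Set


class ExampleClient:
    """Hypothetical stand-in for a client class with hardcoded process methods."""

    def load_collection(self):
        ...

    def filter_bbox(self):
        ...


def missing_processes(process_ids: Iterable[str], client_class: type) -> Set[str]:
    """Return the process ids that have no same-named callable on the class."""
    return {
        process_id
        for process_id in process_ids
        if not callable(getattr(client_class, process_id, None))
    }


def test_all_official_processes_are_implemented():
    # Assumption: in practice this list is parsed from the API description.
    official = ["load_collection", "filter_bbox"]
    assert missing_processes(official, ExampleClient) == set()
```

With a test like this, adding a process to the specification makes the test suite fail until a matching method is written, while the methods themselves stay ordinary, inspectable code.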

That being said, there are probably some ways to reduce the necessary boilerplate code and overhead to implement a process as a method in the client.

About 2.: this should be relatively straightforward to implement. It should be optional for now, however, because probably not all backends properly declare which processes they support in the capabilities endpoint (the VITO backend doesn't, for example).
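A rough sketch of such an early check, assuming the process graph is a flat dict of node id to `{"process_id": ..., "arguments": ...}` and that the set of supported process ids has already been fetched from the backend. `check_process_graph`, `UnsupportedProcessError` and the `strict` flag are hypothetical names, not part of the client.

```python
from typing import Any, Dict, Iterable, List


class UnsupportedProcessError(Exception):
    """Raised when a process graph uses a process the backend does not list."""


def check_process_graph(
    graph: Dict[str, Dict[str, Any]],
    supported: Iterable[str],
    strict: bool = True,
) -> List[str]:
    """Return the unsupported process ids, raising first if strict is set."""
    supported_ids = set(supported)
    used_ids = {node["process_id"] for node in graph.values()}
    unsupported = sorted(used_ids - supported_ids)
    if unsupported and strict:
        raise UnsupportedProcessError(f"not supported by backend: {unsupported}")
    return unsupported


# Hypothetical graph and backend listing for illustration.
graph = {
    "load1": {"process_id": "load_collection", "arguments": {"id": "SENTINEL2"}},
    "scale1": {"process_id": "linear_scale_range", "arguments": {"data": {"from_node": "load1"}}},
}
problems = check_process_graph(graph, supported=["load_collection"], strict=False)
```

Passing `strict=False` keeps the check optional: it only reports the mismatch, which is useful against backends whose process listing is incomplete.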

@jdries
Collaborator

jdries commented Nov 13, 2019

There exists a way to add custom/unsupported processes:
https://open-eo.github.io/openeo-python-client/#openeo.rest.imagecollectionclient.ImageCollectionClient.graph_add_process

Perhaps we need to improve documentation so that people find it more easily?

About dynamically generated processes, I agree with Stefaan. I would however not object to someone showing how this can be done in the python client (as a separate way of building process graphs, separate from the ImageCollection class).

@bgoesswe
Member Author

I'll investigate possible dynamic generation strategies in the "process_generation" branch.

@bgoesswe
Member Author

bgoesswe commented Jan 28, 2020

I have worked on this issue for a while now and haven't found a suitable existing solution for dynamically generating processes in the Python client, so I built one myself.
In the "process_generation" branch, I created a Python tool that generates a Python file from a backend URL (e.g. see here for EURAC).
The "ProcessParser" can be used either from a Python client script or as a command-line call with arguments.
How to use the generated processes can be seen in this example.
The advantage is that you can use the statically defined processes of the Python client together with the generated processes available from the backend within the same program, so you do not have to choose between the two strategies. A disadvantage is that you are relying on the documentation provided by the backend.
I was also thinking about doing it in a more object-oriented way by generating a new class that inherits from "ImageCollectionClient" with additional generated methods, but there I ran (atm) into the issue of methods with the same name, and I am not sure if this is a good way to go anyway.
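To illustrate the file-generation approach in general terms (this is not the "ProcessParser" code from the branch), here is a small sketch that renders Python source text from a process listing. `render_process_function` and `render_module` are hypothetical names, and the listing format is assumed to have an `id`, an optional `description` and a `parameters` list.

```python
import keyword
from typing import Any, Dict, List


def _check_identifier(name: str) -> str:
    """Refuse names that cannot be used as Python identifiers."""
    if not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError(f"not a valid Python identifier: {name!r}")
    return name


def render_process_function(spec: Dict[str, Any]) -> str:
    """Render the source code of one function for one process spec."""
    name = _check_identifier(spec["id"])
    params = [_check_identifier(p["name"]) for p in spec.get("parameters", [])]
    signature = ", ".join(params)
    arguments = ", ".join(f'"{p}": {p}' for p in params)
    doc = spec.get("description", "")
    return (
        f"def {name}({signature}):\n"
        # repr() yields a safely quoted string literal to serve as the docstring.
        f"    {doc!r}\n"
        f'    return {{"process_id": "{name}", "arguments": {{{arguments}}}}}\n'
    )


def render_module(listing: List[Dict[str, Any]]) -> str:
    """Render the source of a module with one function per listed process."""
    return "\n\n".join(render_process_function(spec) for spec in listing)


source = render_module([
    {"id": "make_larger", "description": "multiply a raster cube with a factor",
     "parameters": [{"name": "data"}, {"name": "factor"}]},
])
```

Writing `source` to a `.py` file gives normal, importable functions with docstrings, so auto-completion works, at the cost of regenerating the file whenever the backend changes.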

@soxofaan
Member

Interesting work. Can you create a pull request? That might help with further fine-tuning and discussion.

@bgoesswe
Member Author

Some points to consider from today's discussion:

Issues static:

  • How to handle different backend implementations (e.g. some parameters of processes might not be supported on every backend)
  • In API version 1.0 there are now custom_processes, which cannot be defined statically.
  • Moving error handling to the backend: If something is not supported it will not show the line of code where an error happened, but a backend error message.

Issues dynamic:

  • No convenience functions possible (e.g. operations)
  • No auto-completion (if not generated before)
  • Dependent on backend definition and documentation.

In my opinion, the basic functionality (e.g. load_collection, filters) should be static for convenience reasons. Other things should be dynamically generated (e.g. custom processes). The main issue is to decide where to draw the line on what should be static or not.

soxofaan added a commit to soxofaan/openeo-python-client that referenced this issue Feb 11, 2020
adds a property `dynamic` to ImageCollectionClient instances,
which allows to call processes that are dynamically detected from the
backend process listing (and not necessarily predefined in the client)
@soxofaan
Member

Inspired by yesterday's discussion I also played a bit with the following idea: add a property .dynamic to ImageCollection objects that delegates all function calls to the corresponding dynamically detected processes. The full pull request (WIP) is at #118

The basic unit test shows how it is intended to work:

  • see
    https://github.com/Open-EO/openeo-python-client/pull/118/files#diff-8759acd033807b1640bc4f0c60fa473b
  • I first create a dummy backend with a custom process make_larger that takes a raster cube and float as parameters:
        {
            "id": "make_larger",
            "description": "multiply a raster cube with a factor",
            "parameters": [
                {"name": "data", "schema": {"type": "object", "subtype": "raster-cube"}},
                {"name": "factor", "schema": {"type": "float"}},
            ],
        }
  • Then I can "call" this process through the dynamic property as follows
        cube = session040.load_collection("SENTINEL2")
        cube = cube.dynamic.make_larger(factor=42)
  • This will inject a "make_larger" node into the process graph of the resulting cube

Some notes:

  • the name .dynamic is the best I could come up with for now, if someone has a better idea: please let me know
  • by using a property .dynamic you can clearly separate "static" predefined methods and dynamically detected processes. Obviously, this makes it possible to have a predefined convenience function hardcoded in the client and a custom process in a backend with the same name
  • The current implementation only supports processes that have exactly one "raster-cube" parameter (to which the self of the ImageCollection instance will be bound). However, to support all kinds of processes we could also define a comparable .dynamic property on the Connection object
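To make the delegation idea concrete, here is a self-contained sketch. It is not the code from the pull request: `Cube` and `DynamicProcesses` are hypothetical stand-ins, the process graph is assumed to be a flat dict of nodes, and the cube is assumed to be passed on as a `{"from_node": ...}` reference.

```python
from typing import Any, Callable, Dict, List

Spec = Dict[str, Any]


class Cube:
    """Hypothetical minimal cube that carries a flat process graph."""

    def __init__(self, graph: Dict[str, Any], node_id: str, listing: List[Spec]):
        self.graph = graph
        self.node_id = node_id
        self._listing = listing

    def add_process(self, process_id: str, arguments: Dict[str, Any]) -> "Cube":
        # Return a new cube so the original graph is left untouched.
        new_id = f"{process_id}{len(self.graph) + 1}"
        graph = dict(self.graph)
        graph[new_id] = {"process_id": process_id, "arguments": arguments}
        return Cube(graph, new_id, self._listing)

    @property
    def dynamic(self) -> "DynamicProcesses":
        return DynamicProcesses(self, self._listing)


class DynamicProcesses:
    """Resolves attribute access to processes found in the backend listing."""

    def __init__(self, cube: Cube, listing: List[Spec]):
        self._cube = cube
        self._specs = {spec["id"]: spec for spec in listing}

    def __getattr__(self, name: str) -> Callable[..., Cube]:
        # Only called when normal attribute lookup fails.
        specs = self.__dict__.get("_specs", {})
        if name not in specs:
            raise AttributeError(f"backend does not list a process named {name!r}")
        cube_params = [
            param["name"]
            for param in specs[name].get("parameters", [])
            if param.get("schema", {}).get("subtype") == "raster-cube"
        ]
        if len(cube_params) != 1:
            raise NotImplementedError(f"{name!r} needs exactly one raster-cube parameter")
        cube = self._cube

        def call(**arguments: Any) -> Cube:
            # Bind the current cube to the single raster-cube parameter.
            node_arguments = dict(arguments)
            node_arguments[cube_params[0]] = {"from_node": cube.node_id}
            return cube.add_process(name, node_arguments)

        return call
```

With the make_larger definition above in the listing, `cube.dynamic.make_larger(factor=42)` returns a new cube whose graph has one extra node referring back to the original one, while unknown names raise an ordinary AttributeError.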

@m-mohr
Member

m-mohr commented Feb 14, 2020

@soxofaan

  • by using a property .dynamic you can clearly separate "static" predefined methods and dynamically detected processes. Obviously, this makes it possible to have a predefined convenience function hardcoded in the client and a custom process in a backend with the same name

In general I like this, but I don't think a user cares whether something is dynamic or hard-coded. Ideally that would be completely hidden, and the call would simply be cube.make_larger(factor=42).

  • the name .dynamic is the best I could come up with for now, if someone has a better idea: please let me know

If we need such a "prefix": custom? proprietary?

From the API perspective there are in general two kinds of processes:

  • user-defined (/process_graphs) -> listUserProcesses()
  • predefined (/processes) -> listProcesses()

If required, you could split predefined into two categories:

  • core (i.e. the ones following the process spec)
  • custom / proprietary (i.e. processes that don't follow the process spec) - custom is less descriptive, proprietary is descriptive but hard to type (IMHO).
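As a rough illustration of that split, the sketch below partitions the ids of predefined processes by membership in a known set of core ids. `split_predefined` is a hypothetical helper, and matching by id alone is a simplification: a process with a core name could still deviate from the process spec.

```python
from typing import Dict, Iterable, List


def split_predefined(predefined_ids: Iterable[str], core_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Partition predefined process ids into 'core' and 'custom' by id."""
    core = set(core_ids)
    result: Dict[str, List[str]] = {"core": [], "custom": []}
    for process_id in sorted(set(predefined_ids)):
        result["core" if process_id in core else "custom"].append(process_id)
    return result


# Hypothetical ids for illustration.
categories = split_predefined(
    predefined_ids=["load_collection", "make_larger", "filter_bbox"],
    core_ids=["load_collection", "filter_bbox"],
)
```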
