Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize load_geojson #415

Closed
soxofaan opened this issue Mar 10, 2023 · 3 comments · Fixed by #412, #413, #428 or #427
Closed

Generalize load_geojson #415

soxofaan opened this issue Mar 10, 2023 · 3 comments · Fixed by #412, #413, #428 or #427
Labels
Milestone

Comments

@soxofaan
Copy link
Member

I considered raising this in #412 or #322 but maybe it's better to keep this a separate thread:

load_geojson (as introduced in #412) is closely tied to GeoJSON (by name and supported format), which has the following problems:

  • it's out of line with other load_ processes which support multiple formats, and use /file_formats to express that
  • it supports loading data from inlined data in the process graph and from remote URLs, but remote vector data might also come in other formats (geopackage, shapefiles, next generation GeoJSON, ...), so why not support that too? (Inline data can also come in other formats actually: WKT string, CovJSON, ...)

Couple of lines of thought:

  • generalize load_geojson, e.g. to load_vector, to also support other vector formats
  • generalize even further to e.g. load_cube to also support raster data, not only something like a remote GeoTIFF, but also some kind of inline raster data cube construct (also see OpenEO processes validation suite #204)
  • generalize, but split the inline and external URL case (to avoid some schema specification issues) :
    • load_inline which could support inline GeoJSON, WKT, other JSON based vector format (e.g. CovJson), some kind of inline raster data format (again see OpenEO processes validation suite #204), ...
    • load_external to load from a URL
@m-mohr
Copy link
Member

m-mohr commented Mar 10, 2023

I'm fine with load_external.

I have some issues with load_inline though as it is problematic to make nice UIs for it. Right now I can easily detect in the Web Editor if a parameter is geojson and then I can render a map for it. If we have load_inline with a file format parameter I can only guess from the file format name whether something is GeoJSON or not, same applies for other formats. That makes the implementation much more difficult, especially when the parameter order is in a way that you specify the data first and the format second. load_geojson makes the use of this (at least in the UI) much more simple.

In general, we run into issues again with the fact that we don't define file formats in openEO. The "nice" behavior of load_geojson with e.g. the properties parameter is then not defined centrally but defined by back-ends so that there is usually no common behavior for importing. There shouldn't be because we can't do that for all file format and it should be open to other implementations, on the other hand it feels nice to have for simple formats such as GeoJSON that are just used so often and in so many central places.

@m-mohr m-mohr added this to the 2.0.0 milestone Mar 10, 2023
This was linked to pull requests Mar 10, 2023
@m-mohr m-mohr added the vector label Mar 10, 2023
@jdries
Copy link
Contributor

jdries commented Mar 13, 2023

A reminder perhaps, but another issue with a specific process like load_geojson is in UDP's where the data to load is a parameter, and the UDP designer would like a user friendly approach to allow loading vector data from different sources without having to make a complex construction inside the UDP.
Could potentially be solved by simply having a vector cube as a udp parameter but then we do have more reliance on the ability to construct vector cubes loading logic client side.

@m-mohr
Copy link
Member

m-mohr commented Mar 31, 2023

@jdries Yes, a vector cube should probably be passed in. geojson as subtype has be deprecated (except for load_geojson, if accepted).

Other than that, I propose explicit processes if we can inline the data (e.g. load_geojson, load_covjson) and then processes for loading from other sources (e.g. load_uploaded_files for the user workspace and load_http for loading from external HTTP(S) URLs).

m-mohr added a commit that referenced this issue Mar 31, 2023
m-mohr added a commit that referenced this issue Mar 31, 2023
This was referenced Mar 31, 2023
This was linked to pull requests Mar 31, 2023
m-mohr added a commit that referenced this issue Apr 3, 2023
* Add load_http #415
* load_http -> load_url
* Apply suggestions from code review

Co-authored-by: Stefaan Lippens <soxofaan@users.noreply.github.com>
m-mohr added a commit that referenced this issue Apr 3, 2023
* Add load_geojson #346 #415
@m-mohr m-mohr closed this as completed Apr 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment