Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DataCube.aggregate_spatial should return a vector cube #386

Closed
soxofaan opened this issue Mar 10, 2023 · 3 comments
Closed

DataCube.aggregate_spatial should return a vector cube #386

soxofaan opened this issue Mar 10, 2023 · 3 comments

Comments

@soxofaan
Copy link
Member

DataCube.aggregate_spatial currently returns a (raster) DataCube:

@openeo_process
def aggregate_spatial(
self,
geometries: Union[shapely.geometry.base.BaseGeometry, dict, str, pathlib.Path, Parameter, "VectorCube"],
reducer: Union[str, PGNode, typing.Callable],
target_dimension: Optional[str] = None,
crs: str = None,
context: Optional[dict] = None,
# TODO arguments: target dimension, context
) -> 'DataCube':
"""
Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons)
over the spatial dimensions.
:param geometries: a shapely geometry, a GeoJSON-style dictionary,
a public GeoJSON URL, or a path (that is valid for the back-end) to a GeoJSON file.
:param reducer: a callback function that creates a process graph, see :ref:`callbackfunctions`
:param target_dimension: The new dimension name to be used for storing the results.
:param crs: The spatial reference system of the provided polygon.
By default longitude-latitude (EPSG:4326) is assumed.
:param context: Additional data to be passed to the reducer process.
.. note:: this ``crs`` argument is a non-standard/experimental feature, only supported by specific back-ends.
See https://github.com/Open-EO/openeo-processes/issues/235 for details.
"""
# TODO #279 aggregate_spatial should return a VectorCube, not a DataCube
valid_geojson_types = [
"Point", "MultiPoint", "LineString", "MultiLineString",
"Polygon", "MultiPolygon", "GeometryCollection", "Feature", "FeatureCollection"
]
geometries = self._get_geometry_argument(geometries, valid_geojson_types=valid_geojson_types, crs=crs)
reducer = self._get_callback(reducer, parent_parameters=["data"])
return self.process(
process_id="aggregate_spatial", data=THIS, geometries=geometries, reducer=reducer,
**dict_no_none(target_dimension=target_dimension, context=context)
)

but it should actually return a vector cube.

Changing this should be easy, but I'm unsure if this would cause some trouble down the line in usecases that depend on the current implementation (of returning a DataCube object)

@jdries
Copy link
Collaborator

jdries commented Mar 13, 2023

I would change it. The only use case I can think of is running a udf on the result of aggregate_spatial, so if our new vector cube has the same methods for that, we should be fine?

@soxofaan
Copy link
Member Author

I indeed started with a PR (#389). I'm already touching more code than I anticipated, and I still have failing tests

@soxofaan
Copy link
Member Author

done

The only use case I can think of is running a udf on the result of aggregate_spatial, so if our new vector cube has the same methods for that, we should be fine?

Indeed. That's actually even being improved in #385 / #388

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants