Introduce data structure to simplify multigeometry API #703

Open
5 of 6 tasks
isVoid opened this issue Sep 29, 2022 · 1 comment · Fixed by #976
Labels: improvement, libcuspatial

Comments

@isVoid
Contributor

isVoid commented Sep 29, 2022

[UPDATE] 04/10/2023

Recent development added multipoint_range, multilinestring_range, and multipolygon_range, which are flexible views over geometry arrays. cuSpatial's API should be refactored to use these data structures. For examples of the refactoring, see:
#979 (polygons argument of quadtree_point_in_polygon)
and
https://github.com/rapidsai/cuspatial/blob/branch-23.06/cpp/include/cuspatial/experimental/linestring_distance.cuh

Original Post

As demonstrated in #677, the geometry inputs of the header-only API can quickly get out of control for complex geometries. We should simplify the APIs to improve the developer experience.

Device view over the physical memory of nested types

The data structure should be a view over the physical memory layout. It should therefore be cheap to construct on the host, pass to the device, and call methods on from device code.

Unlike cudf::column_view and cudf::column_device_view, the data structure is fully templated (not type-erased) and always assumes that the views are device views.
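As a rough illustration of the idea above, the following host-only C++ sketch shows a fully templated, non-owning view over a multilinestring array. The names `point2d`, `multilinestring_view`, and `make_multilinestring_view` are hypothetical and are not cuSpatial's actual API. The sketch assumes random-access iterators (raw pointers work) and omits the `__host__ __device__` annotations that a real device view would need.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 2D point type used only for this sketch.
template <typename T>
struct point2d {
  T x;
  T y;
};

// Hypothetical non-owning view over a multilinestring array.
// All iterator types are template parameters, so nothing is type erased.
template <typename GeometryIt, typename PartIt, typename PointIt>
class multilinestring_view {
 public:
  multilinestring_view(GeometryIt geometry_begin, GeometryIt geometry_end,
                       PartIt part_begin, PartIt part_end,
                       PointIt point_begin, PointIt point_end)
    : geometry_begin_(geometry_begin), geometry_end_(geometry_end),
      part_begin_(part_begin), part_end_(part_end),
      point_begin_(point_begin), point_end_(point_end)
  {
    // Arrow-style offsets hold n + 1 entries for n elements.
    assert(geometry_end_ - geometry_begin_ >= 1);
    assert(part_end_ - part_begin_ >= 1);
  }

  // Number of multilinestrings in the array.
  std::size_t size() const
  {
    return static_cast<std::size_t>(geometry_end_ - geometry_begin_) - 1;
  }

  // Total number of linestrings across all multilinestrings.
  std::size_t num_linestrings() const
  {
    return static_cast<std::size_t>(part_end_ - part_begin_) - 1;
  }

  // Total number of points across all linestrings.
  std::size_t num_points() const
  {
    return static_cast<std::size_t>(point_end_ - point_begin_);
  }

 private:
  GeometryIt geometry_begin_, geometry_end_;
  PartIt part_begin_, part_end_;
  PointIt point_begin_, point_end_;
};

// Factory that deduces the iterator types from its arguments.
template <typename GeometryIt, typename PartIt, typename PointIt>
multilinestring_view<GeometryIt, PartIt, PointIt> make_multilinestring_view(
  GeometryIt geometry_begin, GeometryIt geometry_end,
  PartIt part_begin, PartIt part_end,
  PointIt point_begin, PointIt point_end)
{
  return multilinestring_view<GeometryIt, PartIt, PointIt>(
    geometry_begin, geometry_end, part_begin, part_end, point_begin, point_end);
}
```

Because the view stores only iterators, constructing and copying it costs a handful of pointer copies, which is what makes passing it by value to a kernel plausible.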

Assumes GeoArrow memory layout (#649)

The structure holds views of the offset arrays and the point array. Offset arrays are assumed to always conform to Arrow's offsets layout, as specified for the variable-size binary layout (https://arrow.apache.org/docs/format/Columnar.html#variable-size-binary-layout).
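To make the offsets convention concrete, here is a minimal host-only C++ sketch. An offsets buffer for n elements has n + 1 non-decreasing entries, and element i covers the half-open range [offsets[i], offsets[i + 1]). The helper names `num_elements`, `element_length`, and `is_valid_offsets` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements described by an Arrow-style offsets buffer
// (n + 1 entries describe n elements).
inline std::size_t num_elements(std::vector<std::int32_t> const& offsets)
{
  assert(!offsets.empty());
  return offsets.size() - 1;
}

// Number of child values in element i, i.e. the size of the half-open
// range [offsets[i], offsets[i + 1]).
inline std::int32_t element_length(std::vector<std::int32_t> const& offsets, std::size_t i)
{
  assert(i + 1 < offsets.size());
  return offsets[i + 1] - offsets[i];
}

// An offsets buffer must be non-empty and non-decreasing.
inline bool is_valid_offsets(std::vector<std::int32_t> const& offsets)
{
  return !offsets.empty() && std::is_sorted(offsets.begin(), offsets.end());
}
```

For example, the offsets {0, 2, 5, 5, 6} describe four elements of lengths 2, 3, 0, and 1. Two equal neighbouring offsets encode an empty element.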

Accessors

Because geometries are complex, developers may need different access patterns from the data structure.

Element-wise accessors (top-down access)

The simplest is the element-wise accessor. When a kernel is launched at the per-geometry level, accessors like .element(idx) should return a geometry object from the array.
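A minimal host-only sketch of what such an accessor might look like for a multipoint array follows. The types `multipoint_ref_sketch` and `multipoint_array_sketch` are hypothetical and are not cuSpatial's actual classes. A device implementation would mark the member functions `__host__ __device__` and call `element(idx)` with the thread's geometry index.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 2D point type used only for this sketch.
struct point2d {
  float x;
  float y;
};

// Hypothetical reference to one multipoint: a half-open range of points.
struct multipoint_ref_sketch {
  point2d const* first;
  point2d const* last;

  std::size_t size() const { return static_cast<std::size_t>(last - first); }

  point2d const& operator[](std::size_t i) const
  {
    assert(i < size());
    return first[i];
  }

  point2d const* begin() const { return first; }
  point2d const* end() const { return last; }
};

// Hypothetical non-owning view over an array of multipoints.
struct multipoint_array_sketch {
  int const* offsets;           // num_multipoints + 1 entries, Arrow-style
  std::size_t num_multipoints;  // number of multipoints in the array
  point2d const* points;        // flattened point storage

  std::size_t size() const { return num_multipoints; }

  // Top-down access: return the idx-th multipoint as a lightweight reference.
  multipoint_ref_sketch element(std::size_t idx) const
  {
    assert(idx < num_multipoints);
    return {points + offsets[idx], points + offsets[idx + 1]};
  }
};
```

The returned object is itself a view, so a per-geometry algorithm can loop over its points without touching the offsets directly.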

Component-wise accessors (bottom-up access)

Other parallel patterns may require a thread to work on one component of the array, for example parallelizing over the points of a multilinestring_array. In this case, bottom-up traversal utilities should be supported; they can be implemented easily with a binary search in the offset array.
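The binary search mentioned above can be sketched in host-only C++ as follows. The function `parent_index` is a hypothetical helper, not part of cuSpatial. It uses `std::upper_bound` to find the element whose half-open range contains a given child index; device code could use an equivalent such as `thrust::upper_bound`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index i such that offsets[i] <= child_idx < offsets[i + 1].
// Empty elements (equal neighbouring offsets) are skipped automatically.
inline std::size_t parent_index(std::vector<int> const& offsets, int child_idx)
{
  assert(offsets.size() >= 2);
  assert(child_idx >= offsets.front() && child_idx < offsets.back());
  // First offset strictly greater than child_idx; its predecessor is the parent.
  auto it = std::upper_bound(offsets.begin(), offsets.end(), child_idx);
  return static_cast<std::size_t>(it - offsets.begin()) - 1;
}
```

For a multilinestring array, the lookup can be applied twice: `parent_index(part_offsets, point_idx)` gives the linestring that owns a point, and `parent_index(geometry_offsets, linestring_idx)` then gives the multilinestring that owns that linestring.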

Tasks

  1. improvement libcuspatial
    isVoid
  2. improvement libcuspatial
    isVoid
  3. improvement libcuspatial
    isVoid
  4. improvement libcuspatial tech debt
    isVoid
  5. improvement libcuspatial tech debt
    isVoid
@isVoid isVoid added improvement Improvement / enhancement to an existing function c++ labels Sep 29, 2022
@isVoid isVoid self-assigned this Sep 29, 2022
@harrism harrism changed the title from "Introduce certain structure to simplify multigeometry API" to "Introduce data structure to simplify multigeometry API" Oct 4, 2022
rapids-bot bot pushed a commit that referenced this issue Oct 19, 2022
…I to support multipoint to multipoint distance. (#731)

closes #704

Contributes to #703 

This PR introduces the `multipoint_range` interface and simplifies the API of `point_distance`. It also updates `point_distance` to support multipoint-multipoint distance.

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Mark Harris (https://github.com/harrism)

URL: #731
rapids-bot bot pushed a commit that referenced this issue Nov 7, 2022
…nge`, adds support to multilinestring distance (#755)

Note, this is the first part of `pairwise_linestring_distance` refactoring, part 1 of PR: #753

Depends on #752 

Contributes to #706, #703

Closes #745

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Mark Harris (https://github.com/harrism)

URL: #755
@harrism harrism added libcuspatial Relates to the cuSpatial C++ library and removed c++ labels Nov 16, 2022
@rapids-bot rapids-bot bot closed this as completed in #976 Mar 17, 2023
rapids-bot bot pushed a commit that referenced this issue Mar 17, 2023
…cts (#976)

- Adds `multipolygon_range`, `multipolygon_ref`, `polygon_ref` as non-owning objects for geoarrow compliant polygon types.

- Adds `pairwise_point_polygon_distance` to compute the shortest distances between two columns of multipoints and multipolygons.

- Refactors `is_point_in_polygon` with geometry object input. Dependent on #973 since geometry objects require geoarrow input.

closes #703

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: #976
@isVoid
Contributor Author

isVoid commented Apr 10, 2023

Since not all of the PRs in the related list have been completed, I am reopening this issue.
