Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sg/area dist debug #33

Merged
merged 20 commits into from
Dec 28, 2023
Merged

Sg/area dist debug #33

merged 20 commits into from
Dec 28, 2023

Conversation

skygering
Copy link
Collaborator

I realized that I missed a few geometry types that were coming up in my simulations.

I also realized that my code failed with empty geometries as it was trying to access the first point. This then led me to the discovery of an issue with the implementation of isempty at least with LibGEOS. So now I have to check isempty and find if there is a non-zero number of points to determine the type of the points within a geometry that returns an area of 0.

import LibGEOS as LG
import GeoInterface as GI

p = LG.readgeom("POLYGON EMPTY")
GI.isempty(p)  # false
GI.npoint(p) # 0

This seems like too much work just to return 0. I am wondering if it would be bad to just return an Int 0 for everything but non-empty polygons / multipolygon / collections.

This isn't the best type-wise, but there is no guarantee that we can even determine the type.

Thoughts?

Copy link
Member

@rafaqz rafaqz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to either pick a fixed floating point output type (e.g. Float64), or detect it once very early and push that through. Then all those zeros will be type stable and we cont need to do the zero(typeof(x(...

function area(::GI.MultiPointTrait, multipoint)
GI.isempty(multipoint) && return 0
np = GI.npoint(multipoint)
np == 0 && return 0
Copy link
Member

@rafaqz rafaqz Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make area always return Float64 for type stability?

Another option is to check the type once at the start, then push it through as the first argument T to an _area(T, trait, geom) method that returns e.g. zero(T). We can let users specify it too if they need to

function area(::CT, curve) where CT <: GI.AbstractCurveTrait
GI.isempty(curve) && return 0
np = GI.npoint(curve)
np == 0 && return 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@skygering
Copy link
Collaborator Author

I wanted to preserve the flexibility so that if I run a simulation with Float32 or Float64 polygons, the area returns the correct type. But I can see why that is tricky, especially with the empty geometries for which we would have to default to something. Do you think it is better just for it to return Float64 so that the user can plan?

@rafaqz
Copy link
Member

rafaqz commented Dec 28, 2023

Maybe default to Float64 but let the user pass in the type?

@skygering
Copy link
Collaborator Author

Okay, I updated the type stuff! They all have optional parameters now. I also just simplified some stuff.

Something I am noting is that I am doing GI.getpoint(geom, i) a lot in order to deal with "closed" curves. This is my way of making it so that it doesn't matter if a Geometry library repeats the last coordinates in linear rings / polygons, we still repeat the last point for our algorithms. I don't have an idea for a better way of doing this though.

As discussed in #32, this can be problematic if a library does a poor job of implementing GeoInterface. I kinda feel like that is the library's fault though, not ours. For example, using tuples to convert an input geometry into a GeoInterface geometry feels like cheating, because we aren't really dealing with an arbitrary polygon, we're converting a given polygon to a specific type...

However, if you think this is still slower than GI.getpoint(geom) and using the iterator regardless, then we can think up a new way of iterating. Maybe that can be a separate PR though.

Copy link
Member

@rafaqz rafaqz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it will be much faster :)

Lets not worry about getgeom for now. But I do worry that getting individual points from LibGeos or ArchGDAL will never be fast (and geointerface should actually caution against operating at the point level too much)

@rafaqz rafaqz merged commit 4544514 into JuliaGeo:main Dec 28, 2023
3 checks passed
@skygering skygering deleted the sg/area_dist_debug branch January 2, 2024 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants