
hexify multipolygons #124

Merged
merged 2 commits into from Jan 18, 2021

Conversation

knaaptime (Member) commented Jan 18, 2021

#125 is resolved by first exploding the input, then handling it via split-apply-combine if the resulting unary_union is itself a multipolygon.
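The approach can be sketched roughly as follows. This is a minimal illustration with hypothetical names, not tobler's actual code; `_to_hex` stands in for the real h3 tessellation step:

```python
# Sketch of the explode + split-apply-combine strategy described above.
from shapely.geometry import MultiPolygon, Polygon
from shapely.ops import unary_union


def _to_hex(polygon):
    # stand-in for the real h3 tessellation of a single polygon
    return polygon


def hexify(geoms):
    # explode: replace each multipolygon with its component polygons
    exploded = []
    for geom in geoms:
        if isinstance(geom, MultiPolygon):
            exploded.extend(geom.geoms)
        else:
            exploded.append(geom)

    union = unary_union(exploded)
    if isinstance(union, Polygon):
        return [_to_hex(union)]  # one contiguous region: single pass
    # split-apply-combine: tessellate each disjoint part, then concatenate
    return [_to_hex(part) for part in union.geoms]
```

Two disjoint squares yield two parts to tessellate; overlapping inputs collapse into a single polygon first.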

codecov-io commented Jan 18, 2021

Codecov Report

Merging #124 (399ea20) into master (080fca1) will increase coverage by 4.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #124      +/-   ##
==========================================
+ Coverage   78.30%   82.32%   +4.02%     
==========================================
  Files          15       15              
  Lines         742      758      +16     
==========================================
+ Hits          581      624      +43     
+ Misses        161      134      -27     
Impacted Files                               Coverage Δ
tobler/tests/test_utils.py                   100.00% <100.00%> (ø)
tobler/util/util.py                           89.33% <100.00%> (+2.03%) ⬆️
tobler/area_weighted/area_interpolate.py      78.92% <0.00%> (+11.15%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

# h3 hexes only work on polygons, not multipolygons
source = source.explode()

if isinstance(source.unary_union, Polygon):
martinfleis (Member) commented:

We do the expensive source.unary_union here and then again in _to_hex or in the loop below. On larger dataframes, that can cause a significant slowdown. My sense is that most of this function's time is spent computing the unary union, so we certainly don't want to do it twice.

knaaptime (Member, Author) replied:

Good point, thanks. I added a new PR that switches to a try/except, which should cut out one unary_union call when it isn't necessary. Unless there's a more efficient way you can think of?
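A hedged sketch of that try/except pattern (illustrative names, not tobler's actual implementation): the expensive union is computed exactly once, the single-polygon path is attempted first, and only the failure case falls back to a per-part loop.

```python
from shapely.geometry import Polygon
from shapely.ops import unary_union


def _to_hex(polygon):
    # stand-in for the real h3 tessellation; it only accepts a Polygon
    if not isinstance(polygon, Polygon):
        raise TypeError("expected a single Polygon")
    return polygon


def hexify_once(geoms):
    union = unary_union(geoms)    # expensive: computed exactly once
    try:
        return [_to_hex(union)]   # succeeds when the union is a Polygon
    except TypeError:
        # the union is a MultiPolygon: tessellate each disjoint part
        return [_to_hex(part) for part in union.geoms]
```

Compared with an up-front isinstance check on a second unary_union call, this pays for the union only once in both the common and the multipolygon case.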

darribas (Member) commented Jan 18, 2021:

For an approximate but faster solution, we could offer an option to simply cover the bounding box (or its alpha shape). That'd avoid the unary union and, in some cases, it would yield the same result.

Otherwise, I'd agree with @martinfleis: do the unary union once and reuse it wherever it's needed.
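The bounding-box idea could be sketched like this (an illustrative assumption, not an option tobler necessarily provides): combine the per-geometry bounds, which is cheap, and cover the resulting rectangle instead of the exact union.

```python
from shapely.geometry import box


def bounding_box_cover(geoms):
    # combine per-geometry bounds without ever computing a unary union;
    # .bounds returns (minx, miny, maxx, maxy) for each geometry
    minxs, minys, maxxs, maxys = zip(*(g.bounds for g in geoms))
    return box(min(minxs), min(minys), max(maxxs), max(maxys))
```

The cover is exact when the layer already fills its bounding box, and an over-approximation otherwise (extra hexes would fall outside the source geometries).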

This was referenced Jan 19, 2021

5 participants