Perf: Use bounding boxes as pre-step to speed up around statement #167
Recently, someone posted a question on help.openstreetmap.org regarding the following very long-running query:
The following screenshot gives a first impression of the runtime issue in terms of the number of objects to be checked. Our reference for comparison is a large bbox around Rennes, which already includes a 10 km buffer. All the other, smaller bboxes belong to boundary=administrative objects, where each way segment/node needs to be checked against the way segments/nodes in the large bbox. For the sake of clarity, the segments/nodes themselves are not shown on the screenshot.
Performance of the around statement was already a topic about two years ago (see e.g. #25). One of the possible optimizations we discussed back then was the introduction of an additional bbox check to quickly discard non-intersecting ways.
This PR now includes a first version of bbox-based pruning of irrelevant segments: the previous logic is still fully in place, but I've added bbox intersection checks in a number of places to avoid expensive calculations. I see this as a starting point for further discussion and for improvements/cleanup of the code in this pull request.
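To illustrate the idea for discussion, here is a minimal Python sketch of such a pre-check. It is not the code in this PR: the names `BBox`, `expand` and `needs_exact_check` are made up for illustration, the Earth radius is an assumed mean value, and segments are treated as straight lines in lat/lon space. The box of one segment is grown by the around radius, and the expensive distance calculation is only needed if the grown box intersects the other segment's box. Near the poles and the 180th meridian the sketch simply gives up and requests the full check, which is one assumed way of handling corner cases.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius for this sketch


@dataclass(frozen=True)
class BBox:
    south: float
    west: float
    north: float
    east: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes are disjoint if one lies entirely to one side of the other.
        return not (self.north < other.south or other.north < self.south
                    or self.east < other.west or other.east < self.west)


def segment_bbox(lat1, lon1, lat2, lon2) -> BBox:
    """Bounding box of a segment given by its two end points (degrees)."""
    return BBox(min(lat1, lat2), min(lon1, lon2),
                max(lat1, lat2), max(lon1, lon2))


def expand(box: BBox, radius_m: float):
    """Grow box by radius_m on all sides.

    Returns None when the grown box would touch a pole or cross the
    180th meridian, so the caller can fall back to the unpruned logic.
    """
    r = radius_m / EARTH_RADIUS_M          # angular radius in radians
    dlat = math.degrees(r)
    south, north = box.south - dlat, box.north + dlat
    if south <= -90.0 or north >= 90.0:
        return None
    # Longitude padding grows with latitude; use the worst case in the box.
    max_abs_lat = math.radians(max(abs(box.south), abs(box.north)))
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(max_abs_lat)))
    west, east = box.west - dlon, box.east + dlon
    if west < -180.0 or east > 180.0:
        return None
    return BBox(south, west, north, east)


def needs_exact_check(seg_a, seg_b, radius_m: float) -> bool:
    """True if the expensive distance calculation cannot be skipped."""
    grown = expand(segment_bbox(*seg_a), radius_m)
    if grown is None:
        return True  # corner case: keep the existing, unpruned behaviour
    return grown.intersects(segment_bbox(*seg_b))


rennes = (48.11, -1.68, 48.12, -1.67)
nearby = (48.15, -1.70, 48.16, -1.69)
paris = (48.85, 2.35, 48.86, 2.36)
print(needs_exact_check(rennes, nearby, 10_000))
print(needs_exact_check(rennes, paris, 10_000))
```

The check is deliberately conservative: it may ask for an exact calculation that later turns out to be unnecessary, but it should never discard a pair that is actually within the radius, so results stay identical to the unpruned logic.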
Speedups are quite nice so far: instead of 211.5 minutes, the above-mentioned query now takes only 1.75 minutes (about 120x faster).
I have a few open points that deserve a closer look.
=> For those corner cases, we're now falling back to the current logic.
Finding Points Within a Distance of a Latitude_Longitude Using Bounding Coordinates.pdf