Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Densifier runs expensive buffer operation #541

Open
aaime opened this issue May 20, 2020 · 4 comments
Open

Densifier runs expensive buffer operation #541

aaime opened this issue May 20, 2020 · 4 comments

Comments

@aaime
Copy link

aaime commented May 20, 2020

The GeoTools rendering code uses the densifier to add extra points in long lines, to prepare them for reprojection, which could generate curves out of the original straight lines.
We noticed that when dealing with large geometries (e.g., Natural Earth 10m land, containing geometries with tens to hundred of thousands of points) it takes a lot of time, and uses much memory.

The culprit seems to be the buffer(0) call here:
https://github.com/locationtech/jts/blob/master/modules/core/src/main/java/org/locationtech/jts/densify/Densifier.java#L156

We don't really need to make invalid inputs valid, we'd be content to have an invalid output when the input is invalid.

Also, the algorithm is generating lots of Coordinate objects, while the input was an optimized coordinate sequence, would be nice if the code was creating another of the same type:

  • No un-packing of the original coordinate sequence into Coordinate[]
  • Creation of an output coordinate sequence of the same type (this might be difficult, one has to foresee exactly how many points the output will have, it would require two passes over the input I guess...)

As usual, timing will likely mean we'll create our own rendering-optimized variant of the densifier, but wanted to at least let you know about the issue.

@dr-jts
Copy link
Contributor

dr-jts commented May 20, 2020

Thanks for pointing this out. Yes, buffer is an expensive operation (and is also a bit of a hack, but is safe enough here.

Sounds like it would be nice to add an option to turn validation off, if not needed. That is simple enough to do with a flag.

This is one of the (many) JTS classes which has not be modernized to use the CoordinateSequence API properly. Although as you point out in this case it's a bit awkward to do so. Two passes would work, I think. Although I wonder if it would be better to use an intermediate "extendable" CoordinateSequence, and then convert to the required one for output? I think there might be a PR around for such a thing.

Anyway, seems perfectly reasonable for GeoTools to roll its own code, since this is a fairly simple algorithm, and it then can introduce whatever optimizations or refinements that are needed.

@FObermaier
Copy link
Contributor

Although I wonder if it would be better to use an intermediate "extendable" CoordinateSequence

See PR #271

@dr-jts
Copy link
Contributor

dr-jts commented Sep 11, 2020

Just to be clear, the use of buffer(0) is not to fix invalid input. It is used to ensure that densified geometry is valid. A densified polygonal might become invalid, since adding vertices may move line segment slightly due to numerical precision limitations.

This should be a very rare situation, however. And if the densified geometry is only used for rendering purposes, then a mild invalidity seems like it should not cause any problems? So the addition of a flag to disable polygon topology checking/correction still seems like a good solution.

@dr-jts
Copy link
Contributor

dr-jts commented Sep 12, 2020

Partially fixed by #595

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants