
Optimize ST_Simplify (and family) #480

Closed
wants to merge 42 commits

Conversation

@Algunenano Algunenano commented Sep 23, 2019

First the benchmarks:

  • Dataset: 13 multipolygons with an average of 260843.07 points per geometry.
  • Method: number of iterations completed by pgbench running the same query with tolerances 1M, 1, and 0, each for 20 seconds.

Trunk:

  • Tolerance 1000000: 302 its / 20 seconds.
  • Tolerance 1: 39 its / 20 seconds.
  • Tolerance 0: 39 its / 20 seconds.

PR:

  • Tolerance 1000000: 612 its / 20 seconds: 2x as fast as trunk
  • Tolerance 1: 93 its / 20 seconds: 2.38x as fast as trunk
  • Tolerance 0: 107 its / 20 seconds: 2.74x as fast as trunk

Now the meat:
There is no algorithm change behind this speedup; at the end of the day Douglas-Peucker is still Douglas-Peucker. So how did I do it? (Ordered by impact):

  • ptarray_dp_findsplit_in_place now contains the code of distance2d_sqr_pt_seg inlined manually to avoid recalculations. As the segment AB is always the same during the loop, caching as many intermediate results as possible yields a big performance win. It also includes some tricks the compiler isn't clever enough to apply on its own (using AB == -BA, avoiding the division, and A/B > 1 == A > B when both are positive).
  • ptarray_simplify_in_place has been rewritten to use an array of bools instead of the outlist + sort (see the sketch after this list). The conditions and input for ptarray_dp_findsplit_in_place have also changed to, IMO, make the intent clearer. It now uses memcpy (without checking for i != j), which is slightly better but only noticeable after the rest of the improvements.
  • lwgeom_simplify_in_place now returns an int indicating whether the geometry has been modified and drops the bbox if it has, which allows us to avoid serialization in some cases.
  • lwgeom_simplify_in_place now stops simplifying a polygon once a ring is dropped, as according to the spec any inner ring should be smaller than the ring that was just dropped.
  • ST_Simplify now clones the gserialized input and calls lwgeom_simplify_in_place instead of getting a pointer and cloning the geometry, which was slower.

Functions affected by these changes: ST_Simplify directly and ST_Subdivide indirectly.
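
To make the ptarray_simplify_in_place bullet more concrete, here is a minimal, self-contained sketch of the keep-mask approach. This is not the PostGIS code: pt2d and find_split are hypothetical stand-ins for the real point type and split search, and error handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical point type and split search, standing in for the PostGIS ones.
 * find_split() is assumed to return the index of the point strictly between
 * i1 and i2 that is farthest from the segment pts[i1]-pts[i2], writing its
 * squared distance to *d2, or to return i1 when there are no interior points. */
typedef struct { double x, y; } pt2d;

extern uint32_t find_split(const pt2d *pts, uint32_t i1, uint32_t i2, double *d2);

static uint32_t
simplify_in_place(pt2d *pts, uint32_t n, double tolerance)
{
	double tol_sqr = tolerance * tolerance;
	bool *keep;                           /* survivor mask, replaces outlist + sort */
	uint32_t (*stack)[2];                 /* pending (start, end) index pairs */
	uint32_t sp = 0, i, j;

	if (n < 3)
		return n;

	keep = calloc(n, sizeof(*keep));      /* error handling omitted in the sketch */
	stack = malloc(n * sizeof(*stack));   /* n pairs is a safe upper bound */

	keep[0] = keep[n - 1] = true;         /* endpoints always survive */
	stack[sp][0] = 0;
	stack[sp][1] = n - 1;
	sp++;

	while (sp > 0)
	{
		double d2 = 0.0;
		uint32_t i1, i2, split;

		sp--;
		i1 = stack[sp][0];
		i2 = stack[sp][1];

		split = find_split(pts, i1, i2, &d2);
		if (split == i1 || d2 <= tol_sqr)
			continue;                     /* every point strictly inside (i1, i2) is dropped */

		keep[split] = true;               /* keep the farthest point, recurse on both halves */
		stack[sp][0] = i1;    stack[sp][1] = split; sp++;
		stack[sp][0] = split; stack[sp][1] = i2;    sp++;
	}

	/* Compact in place: the kept indexes are already in ascending order. */
	for (i = 0, j = 0; i < n; i++)
	{
		if (!keep[i])
			continue;
		memcpy(&pts[j], &pts[i], sizeof(pt2d));   /* unconditional copy, as in the PR */
		j++;
	}

	free(stack);
	free(keep);
	return j;                             /* new point count */
}
```

The compaction pass at the end is what replaces the old outlist + sort: since the survivors are marked in index order, a single forward copy over the array is enough.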

Other stuff:

  • distance2d_sqr_pt_seg: I've applied the same tricks as in ptarray_dp_findsplit_in_place where possible, and removed distance2d_sqr_seg to force callers to use this other, faster function and only calculate the square root when necessary (which was almost never). A sketch of this style of distance calculation follows below.
    Functions (indirectly) affected by this: ST_Split, ST_Node, ST_OffsetCurve.
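
To illustrate the distance tricks (both this point and the first bullet of the main list), here is a rough sketch of a squared point-to-segment distance written in that style. These are hypothetical names, not the actual PostGIS functions: no square root is taken, the r < 0 / r > 1 checks compare the raw dot product against 0 and |AB|² instead of dividing, and the segment-only terms are computed once by the caller and reused for every point tested against the same segment.

```c
#include <stdint.h>

/* Hypothetical names, not the PostGIS functions.  Squared distance from P to
 * segment A-B: no sqrt, and the r < 0 / r > 1 checks are done on the raw dot
 * product instead of dividing by |AB|^2.  dx, dy and ab_len_sq depend only on
 * A and B, so the caller computes them once per segment. */
typedef struct { double x, y; } pt2d;

static double
pt_seg_dist_sqr(const pt2d *p, const pt2d *a,
                double dx, double dy, double ab_len_sq)
{
	double apx = p->x - a->x;
	double apy = p->y - a->y;
	double dot = apx * dx + apy * dy;       /* equals r * |AB|^2 */

	if (dot <= 0.0)                         /* r <= 0: projection falls before A */
		return apx * apx + apy * apy;       /* distance to A (also covers A == B) */

	if (dot >= ab_len_sq)                   /* r >= 1: projection falls past B */
	{
		double bpx = apx - dx;              /* P - B == (P - A) - (B - A) */
		double bpy = apy - dy;
		return bpx * bpx + bpy * bpy;       /* distance to B */
	}

	{
		double cross = apx * dy - apy * dx; /* twice the area of triangle ABP */
		return (cross * cross) / ab_len_sq; /* squared perpendicular distance */
	}
}

/* Typical caller: A-B is fixed while scanning the interior points, so the
 * segment terms are hoisted out of the loop. */
static uint32_t
farthest_from_segment(const pt2d *pts, uint32_t i1, uint32_t i2, double *d2_out)
{
	double dx = pts[i2].x - pts[i1].x;
	double dy = pts[i2].y - pts[i1].y;
	double ab_len_sq = dx * dx + dy * dy;
	double max_d2 = -1.0;
	uint32_t split = i1, i;

	for (i = i1 + 1; i < i2; i++)
	{
		double d2 = pt_seg_dist_sqr(&pts[i], &pts[i1], dx, dy, ab_len_sq);
		if (d2 > max_d2)
		{
			max_d2 = d2;
			split = i;
		}
	}
	*d2_out = max_d2;
	return split;
}
```

The dot <= 0 and dot >= ab_len_sq checks are the "A/B > 1 == A > B when both are positive" trick from the description above: they replace computing r = dot / |AB|² and testing it against 0 and 1.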

Note: I haven't cleaned up the commits, so reviewing the changes as a whole is preferable.


pramsey commented Sep 23, 2019

Regarding the performance numbers: cool. How does it work on more "normally" sized data sets (npoints < 100)? Presumably well, because the no-op overhead is lower, but good to confirm?


Algunenano commented Sep 23, 2019

Regarding the performance numbers: cool. How does it work on more "normally" sized data sets (npoints < 100)? Presumably well, because the no-op overhead is lower, but good to confirm?

Testing with a table of 11680 ST_MultiLineString geometries (AVG(ST_NPoints) == 43):

Trunk:

  • Tolerance 1000000: 659 its / 20 seconds.
  • Tolerance 1: 39 its / 20 seconds.
  • Tolerance 0: 38 its / 20 seconds.

PR:

  • Tolerance 1000000: 651 its / 20 seconds: ~Same as trunk
  • Tolerance 1: 95 its / 20 seconds: 2.43x as fast as trunk
  • Tolerance 0: 109 its / 20 seconds: 2.86x as fast as trunk

@Algunenano

Note that in the case where the performance is the same in trunk and with the changes, the share of time spent in lwgeom_simplify_in_place is 6.34%; the rest goes to reading from disk, joining parallel plans, and serializing and deserializing the geometry, so there isn't much for these changes to gain there.


dr-jts commented Sep 23, 2019

Is the goal of this to make MVT generation faster? If so, would a simpler simplification algorithm be better (e.g. simple decimation, or something based on a sliding window)?

@Algunenano

In this case it's about rendering faster: CARTO uses ST_Simplify from Mapnik (PNG and other formats) and also as part of ST_AsMVTGeom, so it's a 2-for-1.

It might be possible to find a better simplification algorithm for MVTs, but the simplification step only has a performance impact in some corner cases (big polygons with high simplification outside the tile box), and I hope the improvements both here and in ST_RemoveRepeatedPoints will reduce them quite a bit, so the focus will again be on the whole clipping + validation step (the slowest one in most cases).


pramsey commented Sep 23, 2019

The simplify is generally preceded by a remove-repeated-points(tolerance), so to some extent that "rough filter" has already been applied.

@Algunenano

The numbers for the line benchmark were looking too familiar, so I rechecked them: I had been using the old table (polygons) for tolerance 1 and 0 😓. Here are the proper numbers:

Trunk:

  • Tolerance 1000000: 685 its / 20 seconds.
  • Tolerance 1: 491 its / 20 seconds.
  • Tolerance 0: 488 its / 20 seconds.

PR:

  • Tolerance 1000000: 701 its / 20 seconds: ~Same as trunk
  • Tolerance 1: 658 its / 20 seconds: 1.34x as fast as trunk
  • Tolerance 0: 694 its / 20 seconds: 1.42x as fast as trunk

At this point, the time spent in the simplification function is ~6%. In fact, the currently high cost of the function is giving us a parallel plan that is slower, and the same happens with ST_RemoveRepeatedPoints. I'm considering reducing both of their costs to LOW (100 vs 10000) for as long as we don't have a system in place to scale the cost based on tuple / geometry size.


Komzpa commented Sep 24, 2019

For tolerance 0, a further optimization is possible: in O(N), loop over the points and check whether the previous one lies on the line between the pre-previous and the current one. If it does, overwrite it with the current point; otherwise, append. This will help Subdivide more, together with MVT.
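
A rough, self-contained sketch of that single pass (hypothetical pt2d type, not PostGIS code; degenerate segments where the pre-previous and current points coincide are not handled specially):

```c
#include <stdint.h>

/* Single O(N) pass for tolerance 0: a point is dropped when it lies exactly
 * on the segment between the previous kept point and the incoming one. */
typedef struct { double x, y; } pt2d;

static uint32_t
drop_collinear_in_place(pt2d *pts, uint32_t n)
{
	uint32_t out = 1;                       /* pts[0..out] are the points kept so far */
	uint32_t i;

	if (n < 3)
		return n;

	for (i = 2; i < n; i++)
	{
		const pt2d a = pts[out - 1];        /* pre-previous (kept) */
		const pt2d b = pts[out];            /* previous, candidate for removal */
		const pt2d c = pts[i];              /* current */
		double acx = c.x - a.x, acy = c.y - a.y;
		double abx = b.x - a.x, aby = b.y - a.y;
		double cross = abx * acy - aby * acx;  /* 0 => a, b, c are collinear */
		double dot = abx * acx + aby * acy;    /* projection of b onto a->c */
		double len_sq = acx * acx + acy * acy;

		if (cross == 0.0 && dot >= 0.0 && dot <= len_sq)
			pts[out] = c;                   /* b sits on segment a-c: overwrite it */
		else
			pts[++out] = c;                 /* keep b and append c */
	}
	return out + 1;                         /* new point count */
}
```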

@strk strk closed this in 86057e2 Sep 24, 2019
Algunenano pushed a commit to Algunenano/postgis that referenced this pull request Oct 2, 2019
Closes #4510
Closes postgis#480

git-svn-id: http://svn.osgeo.org/postgis/trunk@17821 b70326c6-7e19-0410-871a-916f4a2858ee
@Algunenano Algunenano deleted the speed_simplify branch November 15, 2019 15:21