Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEOSGeom_setPrecision returns error for valid geometry #511

Open
dr-jts opened this issue Nov 20, 2021 · 19 comments
Open

GEOSGeom_setPrecision returns error for valid geometry #511

dr-jts opened this issue Nov 20, 2021 · 19 comments
Assignees
Labels
Bug JTS Issue also appears in JTS

Comments

@dr-jts
Copy link
Contributor

dr-jts commented Nov 20, 2021

Original issue: GEOS-1127.

GEOSGeom_setPrecision returns error on valid polygon geometry input, with certain large grid sizes.

This fails in JTS as well. Note that in JTS GeometryPrecisionReducer is used directly. That takes a scale factor, which is the reciprocal of the grid size.

Test Cases

Note: Case 1 is now resolved by #504, but in included here so it can be used for testing solutions.

Analysis

The reason this fails is because the Snap-Rounding noding used in GeometryPrecisionReducer is only designed to work for situations where the precision of the geometry coordinates are reduced from the full 53 bits of FP precision. Using non-integral precision scale factors (as in the failure cases) does not reduce the coordinate precision, and thus encounters robustness issues.

Possible Solution

The PrecisionModel is designed to support integral scale factors for larger scales (= small grid sizes). Large grid sizes can produce small scale factors which contain high-precision decimal places. The PrecisionModel class can be extended to support integral grid sizes as well by allowing it to accept negative inputs, which will be interpreted as grid Sizes. (So for example a PrecisionModel(-1123) specifies a grid size of 1123.). If a grid size is specified then the internal precision rounding calculation will use the grid size rather than the reciprocal scale factor. This will prevent increasing the precision of the rounded numbers.

This has been tested in a JTS prototype and works well.

The GEOSGeom_setPrecision grid size can accept negative inputs and use them as scale factors for GeometryPrecisionReducer.

@dr-jts dr-jts added the Bug label Nov 20, 2021
@dr-jts
Copy link
Contributor Author

dr-jts commented Nov 20, 2021

@mngr777 Here's the precision reduction issue.

A question for you: the proposed fix will work if integral grid sizes are used (i.e. 153 instead of 152.87406, or 1223 instead of 1222.99245256). As long as the integral grid size is larger than the fractional one this should not cause downstream problems. Will this work for your use case?

@mngr777
Copy link
Contributor

mngr777 commented Nov 21, 2021

@Komzpa please take a look too.

@Komzpa
Copy link
Contributor

Komzpa commented Nov 21, 2021

Snap-Rounding noding used in GeometryPrecisionReducer is only designed to work for situations where the precision of the geometry coordinates are reduced from the full 53 bits of FP precision does not sound reasonable? The datatypes in the intermediate representations are always floats so there can't really be a situation where precision is reduced ever. The only thing it should be doing is to widen the gaps between individual coordinates along one axis to be at least gridSize, the coordinates are still stored as doubles.

To me intuitively the current error mode looks like a simple mistake in error propagation analysis when spelling out the algorithm (something along the line of using heuristics of gridSize/100 and gridSize instead of gridSize/2 and thoroughly checking > >= compares to be complimentary). I am not convinced that making a float-point in nature grid "integerish" will fix anything rather than sweeping the issue under the carpet; the approach will still show interesting effects on coordinates after 2**54 where the distance between floats is bigger than 1.

scale = 0.000817666534169 is nowhere near any floating point precision range boundary that I'd call an excuse.

Generally speaking, if ST_MakeValid(ST_SnapToGrid(ST_Segmentize(geom, x/2),x)) can succeed, then GEOSGeom_setPrecision should succeed and give a similar result. If a dumb rounding can get two nodes to the same coordinates then the smart snapper should also be able to - at the very least by using dumb rounding to figure out what's supposed to become same node. The charts by @kalenikaliaksandr suggest that it often fails to do so:

image

For grid sizes: these sized are pixel sizes of Web Mercator maps, are industry standard and can't really be changed. Not having them in proper way can create Moiré artifacts on rendering. The following grids have to be supported:

>>> [(n, 2*6378137*math.pi / 256 / 2**n) for n in range(0,22)]
[(0, 156543.03392804097), (1, 78271.51696402048), (2, 39135.75848201024), (3, 19567.87924100512), (4, 9783.93962050256), (5, 4891.96981025128), (6, 2445.98490512564), (7, 1222.99245256282), (8, 611.49622628141), (9, 305.748113140705), (10, 152.8740565703525), (11, 76.43702828517625), (12, 38.21851414258813), (13, 19.109257071294063), (14, 9.554628535647032), (15, 4.777314267823516), (16, 2.388657133911758), (17, 1.194328566955879), (18, 0.5971642834779395), (19, 0.29858214173896974), (20, 0.14929107086948487), (21, 0.07464553543474244)]

@dr-jts
Copy link
Contributor Author

dr-jts commented Nov 21, 2021

The datatypes in the intermediate representations are always floats so there can't really be a situation where precision is reduced ever.

In this context "precision" means the number of significant digits stored. Even though the representation is FP (for convenience), the values stored can be lower precision. E.g.. if you store a value of 1 or 10 or 100 in an FP number, that is lower precision than storing 3.14159265358979323846. Values with low precision are less likely to cause numerical robustness problems.

To me intuitively the current error mode looks like a simple mistake in error propagation analysis when spelling out the algorithm (something along the line of using heuristics of gridSize/100 and gridSize instead of gridSize/2 and thoroughly checking > >= compares to be complimentary).

Intuition is not code :). The Snap-Rounding code has turned out to be surprisingly delicate. But this might be a line of research to try.

Generally speaking, if ST_MakeValid(ST_SnapToGrid(ST_Segmentize(geom, x/2),x)) can succeed, then GEOSGeom_setPrecision should succeed and give a similar result.

That's an interesting comparison, but not quite apples-to-apples, since it adds (a lot) of vertices to the input geometry. It would be interesting to know if using ST_Segmentize before applying precision reduction would also work.

Also note that the algorithm used by ST_MakeValid can introduce vertices of higher precision at intersection points. That might still work for in the precision reduction case - have to test it. More experimentation required.

Moire artifacts on rendering

Is this a proven issue, or just a theoretical concern?

For vector tiles the output is eventually stored as fairly low-precision integers. Is this the case here? If so it seems like the coordinates will be rounded off anyway?

@Komzpa
Copy link
Contributor

Komzpa commented Nov 22, 2021

In this context "precision" means the number of significant digits stored

floats store as 0.x * 10^y. there are always 53 front-aligned digits, not back-aligned so you can have insignificant leading digits.

In geometry robustness is dictated by amount of (countable) float grid within the allowed error bound. A great write up by Vladimir Agafonkin https://observablehq.com/@mourner/non-robust-arithmetic-as-art

It's also the reason why old overlay code sometimes can compute overlays near (0,0) and can't in their real position: https://github.com/gojuno/lostgis/blob/master/sql/functions/ST_Safe_Intersection.sql#L31

Intuition is not code :).

A hack we found that works is to ST_SnapToGrid(geom, eps/3) before feeding into ST_ReducePrecision(geom, eps) - it essentially classifies the nodes as "on grid lines" and "on grid centers". This confirms my sus that there is a bogus if (within 0.1*eps) somewhere that is misclassifying nodes.

The Snap-Rounding code has turned out to be surprisingly delicate. But this might be a line of research to try.

Another one is that if you promise that integer grid sizes are robust (or that there is at least one robust grid size), then we can just wrap the thing into a scaler that will rescale the geometry so that the grid size is that legal one, snap and then scale back.

Moire artifacts on rendering

Is this a proven issue, or just a theoretical concern?

Similar things in the past triggered a vertical or horizontal gap between squares on square grid.

@dr-jts
Copy link
Contributor Author

dr-jts commented Nov 22, 2021

A hack we found that works is to ST_SnapToGrid(geom, eps/3) before feeding into ST_ReducePrecision(geom, eps) - it essentially classifies the nodes as "on grid lines" and "on grid centers". This confirms my sus that there is a bogus if (within 0.1*eps) somewhere that is misclassifying nodes.

Sounds like a good hack! Not sure if it really reveals the cause of the underlying issue, but it might be a direction worth pursuing.

Another one is that if you promise that integer grid sizes are robust (or that there is at least one robust grid size), then we can just wrap the thing into a scaler that will rescale the geometry so that the grid size is that legal one, snap and then scale back.

Yes, I thought of this as well. Unfortunately I just tried it, and it looks like the GeometryPrecisionReducer still fails on the scaled geometry (using a scale factor of 1).

However, it simplifies the situation which might make it easier to debug. It's possible that the problem is not with the Snap-Rounding algorithm, but with how the topology it generates is interpreted by the topology-building code.

@dr-jts
Copy link
Contributor Author

dr-jts commented Nov 24, 2021

JTS now has the ability to specify a PrecisionModel via a gridsize. A negative argument to GeometryPrecisionReducer indicates a grid size rather than a scale factor. This should get ported to GEOS soonish.

Update: the support for grid size is now ported to GEOS: 20a8074

Note that for GEOSGeom_setPrecision (which takes grid size as a parameter) the following holds:

  • A negative argument indicates a scale factor
  • Grid size arguments are now processed more accurately, since they can be passed with full precision to GeometryPrecisionReducer

@pramsey
Copy link
Member

pramsey commented Nov 26, 2021

Closed by #513

@pramsey pramsey closed this as completed Nov 26, 2021
@mngr777
Copy link
Contributor

mngr777 commented Dec 2, 2021

@pramsey please consider re-opening the issue as second case haven't been fixed and I am working on it currently.

I'm currently investigating the second case. Here's a simplified case that I believe still captures the issue:
POLYGON((2027603.4369208599 8183234.881054974, 2027112.7433669677 8183234.881054974, 2027112.7433669677 8183529.661146606, 2027603.4369208599 8183529.661146606, 2027603.4369208599 8183234.881054974 ),( 2027343.8203659565 8183290.493334692, 2027335.5549540422 8183271.617113829, 2027338.5549540422 8183271.617113829, 2027340.0243713208 8183271.314402774, 2027339.723808695 8183273.065802623, 2027340.8926633487 8183267.6602488635, 2027345.8926633487 8183267.6602488635, 2027343.8203659565 8183290.493334692))

Same exception is raised as for the original case: TopologyException: Ring edge missing at 2027415.7837199997 8183195.557739999

This geometry is successfully simplified if tolerance is set to 0.0 in SnapRoundingNoder::addIntersectionPixels
reduced-zero-tolerance

Enlarged picture of a hole in the polygon:
hole

The only 2 intersections added by SnapRoundingIntersectionAdder::processNearVertex heuristic are points a and b with segment (c, d).

@mngr777
Copy link
Contributor

mngr777 commented Dec 2, 2021

Reproducing in PostGIS:

# SELECT ST_ReducePrecision('SRID=3857; POLYGON((2027603.4369208599 8183234.881054974, 2027112.7433669677 8183234.881054974, 2027112.7433669677 8183529.661146606, 2027603.4369208599 8183529.661146606, 2027603.4369208599 8183234.881054974 ),( 2027343.8203659565 8183290.493334692, 2027335.5549540422 8183271.617113829, 2027338.5549540422 8183271.617113829, 2027340.0243713208 8183271.314402774, 2027339.723808695 8183273.065802623, 2027340.8926633487 8183267.6602488635, 2027345.8926633487 8183267.6602488635, 2027343.8203659565 8183290.493334692))', 152.87406);
ERROR:  lwgeom_reduceprecision: GEOS Error: TopologyException: Ring edge missing at 2027415.7837199997 8183195.5577399991

@pramsey pramsey reopened this Dec 2, 2021
@pramsey pramsey added the JTS Issue also appears in JTS label Dec 2, 2021
@pramsey pramsey assigned pramsey and dr-jts and unassigned pramsey Dec 2, 2021
@dr-jts
Copy link
Contributor Author

dr-jts commented Dec 5, 2021

Snap-Rounding is designed to operate in a finite fixed-precision grid. As such in theory it can/should be implemented using integral arithmetic. (This is what wagyu does). The JTS implementation is an attempt to support this concept but using FP numbers. I'm starting to wonder if it would be worth implementing a Snap-Rounding algorithm purely using integers, at least to support GeometryPrecisionReducer. This will involve input geometries being mapped onto an integer grid using an appropriate scale factor and offset, and then mapped back after the snap-rounding has restored topology (note that I think the integer precision will have to be limited to < 53 bits, so the inverse mapping does not itself cause precision loss. I think wagyu uses 32 bit ints?). There may or may not be a performance penalty - the mapping will cost some time, but perhaps the integer snap-rounding will be faster.

This is one reason why I still assert that GeometryPrecisionReducer should not be expected to handle high-precision grid sizes. My assumption is that working in the integer domain will cause them to be rounded anyway.

@Komzpa
Copy link
Contributor

Komzpa commented Dec 5, 2021

If you're willing to rewrite it to be on integer grid then all the hacks that do anything about eps/100 need to be ripped away first, as they don't belong to snap rounding - the only op that's going to be needed to be used is grid-cell-equality-comparator.

If you just grid it to the eps naively first it does work.

23:52:29 [kom] > SELECT ST_AsText(ST_ReducePrecision(ST_SnapToGrid('SRID=3857; POLYGON((2027603.4369208599 8183234.881054974, 2027112.7433669677 8183234.881054974, 2027112.7433669677 8183529.661146606, 2027603.4369208599 8183529.661146606, 2027603.4369208599 8183234.881054974 ),( 2027343.8203659565 8183290.493334692, 2027335.5549540422 8183271.617113829, 2027338.5549540422 8183271.617113829, 2027340.0243713208 8183271.314402774, 2027339.723808695 8183273.065802623, 2027340.8926633487 8183267.6602488635, 2027345.8926633487 8183267.6602488635, 2027343.8203659565 8183290.493334692))',152.87406), 152.87406));
                                                                                                                                                    st_astext                                                                                                                                                      
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
POLYGON((2027415.7837199997 8183195.557739999,2027415.7837199997 8183348.431799999,2027262.9096599997 8183195.557739999,2027110.0355999998 8183195.557739999,2027110.0355999998 8183501.305859999,2027568.6577799998 8183501.305859999,2027568.6577799998 8183195.557739999,2027415.7837199997 8183195.557739999))
(1 row)

Time: 0,806 ms

@dr-jts
Copy link
Contributor Author

dr-jts commented Dec 6, 2021

If you just grid it to the eps naively first it does work.

Yes, but this is almost certainly just due to the fact that with robustness issues like this one almost any "jiggling" of the input coordinates will no longer trigger the robustness failure.

And this not a general-purpose fix. Some geometry can become invalid when snapped to a grid, and then the reducing precision algorithm fails. Here's an example:

SELECT ST_AsText(ST_ReducePrecision(ST_SnapToGrid('POLYGON ((1 20, 9 20, 8.49999 1, 2.499999 18, 2.7 10, 5.66 9, 1 1, 1 20))',1), 1));

ERROR:  lwgeom_reduceprecision: GEOS Error: TopologyException: side location conflict at 5 8

image

@mngr777
Copy link
Contributor

mngr777 commented Dec 6, 2021

Case 2 does not depend on exact numbers and is easy to reproduce with following conditions:

  • overall similar geometry, of course, rectangle with a hole near it's lower edge etc.;
  • points are snapped as shown on the diagram below, blue lines separate points being snapped to different coordinates;
  • point a is closer than gridSize / 100 to b -- this means that intersection will not be added for point a and segment (b, c) in SnapRoundingIntersectionAdder::processNearVertex;
  • point a is closer than gridSize / 100 to (c, d), so that intersection is detected.

hole-snapping-1

E.g. here's a similar case with different numbers:

# SELECT ST_ReducePrecision('SRID=3857; POLYGON((200 0, -100 0, -100 200, 200 200, 200 0),(50.1 80, 30 40, 49.6 40, 50.4 39.6, 50.1 55, 50.6 38, 60 40, 50.1 80))', 100.0);
ERROR:  lwgeom_reduceprecision: GEOS Error: TopologyException: Ring edge missing at 100 0

I couldn't find any related bugs to fix. It seems like an error that intersection is added at point a for (c, d) segment, but not for (b, c) since (a, a') perpendicular to (c, d) has to cross (b, c), if that makes sense. Adding nodes for either both segments or none fixes the issue. The only fix I see (me being very new to computational geometry) is to collect all cases of p being close to a segment (p1, p2) ignoring if p is too close to p1 or p2, and then decide which ones to use so that this situation does not happen.

@dr-jts
Copy link
Contributor Author

dr-jts commented Dec 6, 2021

Good analysis, @mngr777 . Well, as @Komzpa points out, the introduction of hacks using magic numbers (or magic division factors, in this case) is always suspect and asking for trouble. My original thinking (blind hope?) was that the snap-rounding would mask the use of a fudge-factor. Clearly it does not. More research required, I'm afraid.

One thing I have recalled to mind is that technically speaking snap-rounding only applies to line arrangements whose coordinates already lie on the snapping grid. And as I showed above, it's easy to find examples where rounding the coordinates ahead of time corrupts the geometry topology. This is why the JTS snap-rounding executes in the input coordinate domain, at least initially. But this is pushing the approach perhaps further than it allows.

This also says to me that just working in the integer domain may not be a solution for this particular issue, since it necessarily requires rounding the input coordinates up front. (It may be that wagyu is a counter-example for this in some clever way. I don't know enough about its failure modes and algorithm to say).

@Komzpa
Copy link
Contributor

Komzpa commented Dec 7, 2021

@mngr777 it feels like an simple fix.

The epsilon-distance circle lookup has to be replaced with box lookup originating from the snaprounding grid. If a line crosses a box of values that are going to be rounded to the same value, a node has to be dropped on that line, so that after rounding they are snapped together.

In the case on the image it will cause them to just go to separate resulting grid cells and not interfere in any way.

@dr-jts
Copy link
Contributor Author

dr-jts commented Dec 7, 2021

The epsilon-distance circle lookup has to be replaced with box lookup originating from the snaprounding grid. If a line crosses a box of values that are going to be rounded to the same value, a node has to be dropped on that line, so that after rounding they are snapped together.

In the case on the image it will cause them to just go to separate resulting grid cells and not interfere in any way.

This is basically the description of the classic Snap-Rounding algorithm. But as I said, Snap-Rounding assumes input coordinates lie on the grid. If this is not the case then this is really outside the design envelope of SR. I found some failure cases which I tried to solve with that heuristic. They are in the JTS unit tests.

Maybe it's worth trying removing the nearnessTol heuristic in SnapRoundingIntersectionAdder::processNearVertex for cases which fail with it. (Will need to control this via a flag). Then can try both with and without, and perhaps that will reduce the failures to zero.

Could also make the SnapRoundingNoder.NEARNESS_FACTOR a parameter, and then try with increasing values of that.

@dr-jts
Copy link
Contributor Author

dr-jts commented Dec 7, 2021

Here's another possible solution:

  • Try the current GeometryPrecisionReducer algorithm
  • If it fails, do this:
    • Perform a pointwise precision reduction
    • If the result is valid, return it
    • If the result is invalid, use GeometryFixer to make it valid
    • GeometryFixer can introduce full-precision coordinates, so repeat until a valid result is obtained

@mngr777 @Komzpa would you be able to try this on your dataset? This will require a bit of GEOS code to be added around the call to GeometryPrecisionReducer.

@pramsey
Copy link
Member

pramsey commented Mar 9, 2023

Is this still live?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug JTS Issue also appears in JTS
Projects
None yet
Development

No branches or pull requests

4 participants