Improve scale handling for PrecisionModel #956

dr-jts · 2023-09-08T23:24:36Z

Problem

GEOS_setPrecision accepts a parameter specifying the grid size of the desired precision. The most common usage is to reduce the number of decimal places in a geometry (e.g. to 3 decimal places). In this case, the grid size is a negative power of 10 (e.g. 0.001). However, this can result in inexact rounding of ordinate values.

For example, this geosop command reproduces this PostGIS issue and shows how reducing the precision of a geometry using grid size 0.001 produces unwanted "extra decimals" (note that in geosop the reducePrecision operation parameter is negative to indicate the value is a grid size):

bin/geosop -a "LINESTRING (657035.913 6475590.114,657075.57 6475500)" reducePrecision N-0.001

LINESTRING (657035.9130000001 6475590.114, 657075.5700000001 6475500)

The same issue can happen when using a scale factor that is a negative power of 10:

bin/geosop -a "POINT (674169.89 198051.619820510769063)" reducePrecision .00001
LINESTRING (700000 199999.99999999997, 700000 1400000)

Cause

There are two causes of this problem:

computing the scale factor as the reciprocal of the grid size may be inexact for some values (i.e. the reciprocal of a fractional power of 10 may not be an exact integral value)
the arithmetic used to scale values is subject to rounding issues when non-integral values are used:

return util::round(val / gridSize) * gridSize;

Because negative powers of 10 cannot be represented exactly in binary FP numbers this can cause the resulting "rounded" ordinate values to be inexact.

In contrast, if an integral scale factor is used directly, the rounding arithmetic is:

return util::round(val * scale) / scale;

This produces the desired result:

bin/geosop -a "LINESTRING (657035.913 6475590.114,657075.57 6475500)" reducePrecision 1000   

LINESTRING (657035.913 6475590.114, 657075.57 6475500)

It appears that in FP arithmetic multiplying by an integer value is more accurate than dividing by an (approximately equal) fractional value

Proposed Fix

This PR proposes a fix which has no impact on the GEOS API or semantics (and hence on downstream usage).

The fix does two things:

It detects when the PrecisionModel scale factor OR the equivalent grid size is "essentially integral" (i.e. very close to an integer). If one of these is the case, then the exact integer value of grid size or scale factor is used.
The arithmetic to adjust ordinate values uses the larger of the scale factor and grid size. This ensures that an integral grid size or scale factor is used if present (which is almost always the case). This produces a more exact scaled value (thanks to the magic voodoo of FP arithmetic).

Almost all (or all?) uses of GEOS_setPrecision use either grid sizes equivalent to integral scale factors OR grid sizes which are obviously NOT integral (e.g. say 0.5), so the heuristic of deciding "close" is fine for practical purposes.

In the very rare cases where both grid size and scale factor are fractional, the original values are used for scaling. Since they are fractional already, the output won't be exact in any case.

As a nice side effect, this fix allows dropping support for negative scale factors, which was introduced solely to accommodate the use of grid size rather than scale factor. (This PR will be extended to delete this code if that is agreed to).

dr-jts · 2023-09-09T00:04:19Z

I wonder if the inverse problem may occur as well: an integral grid size (e.g. 1000) would produce undesired "slop" when converted to a reciprocal scale factor?

This might require building a fuzzer to test a lot of values to see if this occurs. It's possible that the guard digits in FP arithmetic will handle this problem.

capi/geos_ts_c.cpp

pramsey

Looks OK to me. How do we feel about the behaviour when gridSize is > 0 (1, 10, 100, etc)?

dr-jts · 2023-09-11T21:51:34Z

How do we feel about the behaviour when gridSize is > 0 (1, 10, 100, etc)?

It seems to work. But perhaps should only do the integral test when the gridSize is < 1.

If the gridSize > 1, I haven't seen this produce rounding issues. Perhaps that's because the scaling is cutting off digits before the decimal point, so the FP arithmetic works better?

strk · 2023-09-12T16:33:30Z

Seems ok to me, now I wonder how results compare with ST_SnapToGrid

dr-jts · 2023-09-12T16:53:27Z

I wonder how results compare with ST_SnapToGrid

ST_SnapToGrid has the same issue - see PG-5425 and PG-3929. That PostGIS code could benefit from a similar fix.

I will add that data as a test case.

Update

Indeed, this is exactly the same behaviour, and the new code will fix it.
(Note that using a negative value forces geosop reducePrecision to treat the value as a gridSize rather than a scale factor, which ends up using the scaling arithmetic that produces rounding errors).

bin/geosop -a "LINESTRING(674169.89 198051.38,674197.7 198065.55,674200.36 198052.38)" reducePrecision N-0.001  
-->  
LINESTRING (674169.89 198051.38, 674197.7000000001 198065.55000000002, 674200.36 198052.38)

bin/geosop -a "LINESTRING(674169.89 198051.38,674197.7 198065.55,674200.36 198052.38)" reducePrecision 1000
-->
LINESTRING (674169.89 198051.38, 674197.7 198065.55, 674200.36 198052.38)

mwtoews · 2023-09-12T20:46:11Z

ST_SnapToGrid has the same issue - see PG-5425 and PG-3929. That PostGIS code could benefit from a similar fix.

Also recently reported on postgis-users.
As @pramsey noted a workaround for PostGIS is to use ST_AsText(g1, maxdecimaldigits = 3).
A similar workaround for GEOS is to use GEOSWKTWriter_setRoundingPrecision() / writer->setRoundingPrecision()

mwtoews · 2023-09-12T21:38:02Z

Another quirk that could be worth fixing here or in a separate issue is 0.001 or N-1000:

$ ./bin/geosop -a "LINESTRING (657035.913 6475590.114,657075.57 6475500)" reducePrecision 0.001
LINESTRING Z EMPTY

where the expected result should have been:

LINESTRING (657000 6476000, 657000 6476000)

And what should happen with N-100000000? LINESTRING EMPTY? or LINESTRING (0 0, 0 0)?

dr-jts · 2023-09-12T22:44:41Z

Also recently reported on postgis-users.

Good find. There are lots of examples of this occurring in ST_SnapToGrid! All solved by this fix.

dr-jts · 2023-09-12T22:48:09Z

Another quirk that could be worth fixing here or in a separate issue

@mwtoews This is a different problem, so should be a separate issue. (Is there already an issue for this?)

Technically zero-length linestrings are invalid, but it seems good to bend the rules in this case.

what should happen with N-100000000? LINESTRING EMPTY? or LINESTRING (0 0, 0 0)?

Probably the latter - so as to not lose information.

dr-jts · 2023-09-13T15:20:13Z

How do we feel about the behaviour when gridSize is > 0 (1, 10, 100, etc)?

I have found a case where a large gridsize (100,000) causes rounding error when using the reciprocal scale factor.

Using the gridsize 100,000 directly works:

bin/geosop -a "LINESTRING(674169.89 198051.619820510769063, 674197.71234 1448065.55674200)" -f wkt reducePrecision N-100000
LINESTRING (700000 200000, 700000 1400000)

Using the reciprocal scale factor of 0.00001 causes rounding error:

bin/geosop -a "LINESTRING(674169.89 198051.619820510769063, 674197.71234 1448065.55674200)" -f wkt reducePrecision .00001
LINESTRING (700000 199999.99999999997, 700000 1400000)

To resolve this, gridsizes > 1 need to be computed using the more stable arithmetic formula:

return util::round(val / gridSize) * gridSize;

Also, it's best to move the integral rounding logic into PrecisionModel, so it is widely available.

capi/geos_ts_c.cpp

src/geom/PrecisionModel.cpp

dr-jts · 2023-12-06T18:29:27Z

Another case of this: https://gis.stackexchange.com/questions/471495/st-reduceprecision-function-does-not-work-properly/471517

dr-jts requested review from pramsey and dbaston September 8, 2023 23:24

robe2 requested a review from strk September 9, 2023 17:59

dbaston reviewed Sep 11, 2023

View reviewed changes

capi/geos_ts_c.cpp Outdated Show resolved Hide resolved

pramsey approved these changes Sep 11, 2023

View reviewed changes

dr-jts force-pushed the fix-setprecision-scale-factor branch from c5c0733 to 3d19930 Compare September 12, 2023 18:41

mwtoews mentioned this pull request Sep 13, 2023

Should reducing precision produce empty or zero-length results? #957

Open

pramsey reviewed Sep 14, 2023

View reviewed changes

capi/geos_ts_c.cpp Outdated Show resolved Hide resolved

pramsey reviewed Sep 14, 2023

View reviewed changes

src/geom/PrecisionModel.cpp Show resolved Hide resolved

dr-jts added 12 commits September 20, 2023 12:45

Improve scale handling for GEOS_setPrecision

10ef3b8

Refactor Grid Size integer tolerance as const

88de03e

Simplify logic, ensure floor works

4d2ffcb

Use better bump for safe floor evaluation

4754d77

Use std::round

7ece4a6

Only check integral scale facter for gridsize lt 1

7173943

Add test from PostGIS SnapToGrid issue

d43f7e0

Add SnapToGrid test case

6c7c52c

Add more tests

0662640

Add test case

f97c617

Add utility test function

a4f4c61

Add test with multiple scales

eaf387a

dr-jts added 6 commits September 20, 2023 12:45

Move rounding logic into PrecisionModel

7cdbd35

Avoid use of negative grid size for PM

1b3f1af

Make helper function memsafe

7d0214d

Remove dead code

7d5a5f8

Handle scale = 0 situation

778c028

Add test case for large scale factor

1bede1e

dr-jts force-pushed the fix-setprecision-scale-factor branch from 88909c1 to 1bede1e Compare September 20, 2023 19:47

dr-jts merged commit a423241 into libgeos:main Sep 20, 2023
27 checks passed

dr-jts deleted the fix-setprecision-scale-factor branch September 20, 2023 22:11

dr-jts changed the title ~~Improve scale handling for GEOS_setPrecision~~ Improve scale handling for PrecisionModel Sep 20, 2023

dr-jts added a commit that referenced this pull request Sep 22, 2023

Improve scale handling for PrecisionModel (#956)

cbd4f26

dr-jts mentioned this pull request Nov 6, 2023

set_precision on a valid polygon gives error: TopologyException: Ring edge missing at ... #985

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve scale handling for PrecisionModel #956

Improve scale handling for PrecisionModel #956

dr-jts commented Sep 8, 2023 •

edited

Loading

dr-jts commented Sep 9, 2023

pramsey left a comment

dr-jts commented Sep 11, 2023 •

edited

Loading

strk commented Sep 12, 2023

dr-jts commented Sep 12, 2023 •

edited

Loading

mwtoews commented Sep 12, 2023

mwtoews commented Sep 12, 2023

dr-jts commented Sep 12, 2023

dr-jts commented Sep 12, 2023

dr-jts commented Sep 13, 2023

dr-jts commented Dec 6, 2023

Improve scale handling for PrecisionModel #956

Improve scale handling for PrecisionModel #956

Conversation

dr-jts commented Sep 8, 2023 • edited Loading

Problem

Cause

Proposed Fix

dr-jts commented Sep 9, 2023

pramsey left a comment

Choose a reason for hiding this comment

dr-jts commented Sep 11, 2023 • edited Loading

strk commented Sep 12, 2023

dr-jts commented Sep 12, 2023 • edited Loading

Update

mwtoews commented Sep 12, 2023

mwtoews commented Sep 12, 2023

dr-jts commented Sep 12, 2023

dr-jts commented Sep 12, 2023

dr-jts commented Sep 13, 2023

dr-jts commented Dec 6, 2023

dr-jts commented Sep 8, 2023 •

edited

Loading

dr-jts commented Sep 11, 2023 •

edited

Loading

dr-jts commented Sep 12, 2023 •

edited

Loading