Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC99 text: Geometry coordinate precision #9300

Merged
merged 6 commits into from
Mar 14, 2024
Merged

Conversation

rouault
Copy link
Member

@rouault rouault commented Feb 24, 2024

@rouault rouault marked this pull request as draft February 24, 2024 17:27
@rouault rouault force-pushed the rfc99_text branch 8 times, most recently from 19aaf96 to b95ed37 Compare February 26, 2024 17:37

It will *not* be able to store it in its metadata. The possibility of storing
the coordinate metadata in the .csvt side-car file has been considered, but it
would not be backwards-compatible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should precision be detected in the way that column types are?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously it could, but I'm not sure I want to do that implementation effort :-). It could be relatively easy when constructing point geometries from X,Y columns, but it becomes trickier when importing them from WKT, because our WKT import code should take note of that.

Comment on lines +236 to +238
WKB export methods will be modified in a similar way as in the prototype
https://github.com/OSGeo/gdal/pull/6974 to nullify least significant bits from
the precision specifications.
Copy link
Contributor

@theroggy theroggy Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly (which might not be the case!), the coordinates will be rounded as such...
However, simply rounding coordinates can/will lead to invalid polygons (e.g. self-intersections, polygon rings touching,...).
So, I wonder, why not use GEOS set_precision to round the coordinates in the export methods?

Copy link
Member Author

@rouault rouault Mar 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the coordinates will be rounded as such...

yes

However, simply rounding coordinates can/will lead to invalid polygons (e.g. self-intersections, polygon rings touching,...).

This may indeed be possible if users indicate a precision not consistent of the actual point spacing of their geometries

So, I wonder, why not use GEOS set_precision to round the coordinates in the export methods?

Good suggestion. However there would be a cost in doing so systematically. But I'm inclined to add a GEOSGeom_setPrecision_r() step in a ogr2ogr workflow when the user explicitly specifies -xyRes, which can indeed causes generation of invalid geometries. For other workflows, the assumption is that the coordinate precision of the source is correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC text and implementation updated with the above

Copy link
Contributor

@theroggy theroggy Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a configuration option could be introduced to give the user the option to apply set_precision systematically when writing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a configuration option could be introduced to give the user the option to apply set_precision systematically when writing?

It is not obvious to me that this is needed at that point, because either people use ogr2ogr and it takes care of that, or they use individual steps of the OGR API (CreateLayer(), CreateFeature()) and can explicitly invoke OGRGeometry::SetPrecision() if needed. A config option to do that systematically (not clear at that point if that would be in each driver that honours coordinate precision, or in in the generic CreateFeature() code) could potentially be added as a further enhancement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the drivers that honour the coordinate precision: if the precision is set and the configuration option is enabled, the validity of the geometries are guaranteed by gdal but a performance penalty can occur. If the the precision is set and configuration option is not enabled... it is the user's responsibility to make sure the geometries are rounded in a way they are valid as otherwise they can end up becoming invalid when saved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh well, I might consider doing this, but for GeoPackage (or formats that would use the same strategy), the value that is written will not be "exactly" the one after rounding to the precision, due to zeroing out least significant bits. It will be compatible with the requested precision however. That is if you ask for a precision of let's say 0.1 and you encode 0.1, the value you get when reading back will be 0.125 for example. Would that be a problem? My intuition would be that if the geometries are valid with respect to the 0.1 precision, those small changes of coordinate values < 0.1 shouldn't induce new in-validities, but I might be wrong...
If so, I was thinking about potentially also applying the precision when reading back so that 0.125 gets seen as 0.1, but that cost might be undesirable, and would only be done by the GDAL reader, so likely not a good option.
Or perhaps we should apply the zeroing out of LSBs in GeoPackage WKB only if a specific creation option is specified like DISCARD_COORD_LSB=YES ? (not sure which default should be used)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good questions. I follow the same intuition regarding the probable limited impact of zeroing LSBs, but I'm not at all a specialist in floating point representations, so not sure at all.

Applying precision when reading indeed sounds as not ideal... hopefully it can be avoided.

Maybe it is worth a test to see if zeroing out the LSB's after using GEOS set_precision is still having a significant impact?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is worth a test to see if zeroing out the LSB's after using GEOS set_precision is still having a significant impact?

on a quick check, as expected, the zeroing out of the LSB's can be reversed to the initial "decimal" precision, since we are conservative on the number of zeroed LSB's, and the geometry is still valid. However I've revered the default behaviour of the GPKG driver to not do that zeroing out by default. See the last commit (4fb6233) of this PR for the updates, as well as the new OGR_APPLY_GEOM_SET_PRECISION configuration option that is taken into account by the generic code path of CreateFeature() to run OGRGeometry::SetPrecision().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

@rouault rouault force-pushed the rfc99_text branch 2 times, most recently from f9093e0 to 0dfef9a Compare March 4, 2024 23:21
@PostholerCom
Copy link

  1. It might be interesting, if not useful, to break out the -xyRes into -xRes and -yRes as to keep continuity with -zRes and -mRes (and gdal*) switches for complete control over the precision and for non strictly geo-spatial uses of ogr2ogr.

  2. Say I have a CSV file with the desired highest resolution found in any geom at 4 decimal places. When I ogr2ogr that CSV to Shapefile it adds excess precision. Is it possible to have ogr detect the highest precision without implicitly specifying it? Can that be done without reading ahead the entire source file? Maybe precision can be adjusted upwards on-the-fly as the file is read?

If so, the precision can be determined from the source without specifically adding meta data or modifying the driver(s)?

@rouault
Copy link
Member Author

rouault commented Mar 5, 2024

  1. It might be interesting, if not useful, to break out the -xyRes into -xRes and -yRes as to keep continuity with -zRes and -mRes (and gdal*) switches for complete control over the precision and for non strictly geo-spatial uses of ogr2ogr.

I'm not super keen on that. This could be done outside of GDAL if needed

Say I have a CSV file with the desired highest resolution found in any geom at 4 decimal places. When I ogr2ogr that CSV to Shapefile it adds excess precision. Is it possible to have ogr detect the highest precision without implicitly specifying it? Can that be done without reading ahead the entire source file?

It would require to read ahead the entire source file. Could potentially be done as a further enhancement of the CSV driver if desired (cf #9300 (review))

Maybe precision can be adjusted upwards on-the-fly as the file is read?

Wouldn't fit at all with the spirit of this RFC where the coordinate precision is, once determined, a constant value at the geometry-column level. In ogr2ogr type of scenarios, you need to declare the coordinate precision of the output layer at the time it is created

- a few drivers (GeoJSON, JSONFG, OpenFileGDB) have layer creation options to
specify coordinate precision, but there is currently no driver agnostic way
of specifying it.

Copy link
Contributor

@theroggy theroggy Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is relevant, but just for information.

The motivation why I'm interested in applying precisions in general... is to avoid slivers. And I like that this feature offers a way to standardise/specify the precision at the layer/dataset level in a clear and transparent way. When calculating several overlays between different layers after each other, or if calculating overlays between layers where features have been snapped on line segments of the other layer, slivers or dangling nodes of typically ~1e-8 (m) or narrower will appear in the result due to rounding issues. Using a GEOS set_precision of >=1e-8 will remove such dangling nodes from larger polygons and will change sliver polygons to EMPTY geometries or NULL, making it easy to identify and remove them.

@rouault rouault marked this pull request as ready for review March 6, 2024 19:58
@schwehr
Copy link
Member

schwehr commented Mar 7, 2024

Can we work towards unifying on just one of British or American English? SetFromMetre versus SetFromMeter.

I suggest we aim for American English based on that dominating the current code base.

A partial look at the code base for just meter versus metre that shows that the code base is very inconsistent for this particular word.

meter:

find frmts ogr port autotest -name '*.cpp' | xargs egrep 'meter' | egrep -i -v 'diameter|parameter' | wc -l
183
find frmts ogr port autotest -name '*.h' | xargs egrep 'meter' | egrep -i -v 'diameter|parameter' | wc -l
98
find frmts ogr port autotest -name '*.py' | xargs egrep 'meter' | egrep -i -v 'diameter|parameter' | wc -l
46

metre:

find frmts ogr port autotest -name '*.cpp' | xargs egrep 'metre' | egrep -i -v 'diameter|parameter' | wc -l
74
find frmts ogr port autotest -name '*.h' | xargs egrep 'metre' | egrep -i -v 'diameter|parameter' | wc -l
3
find frmts ogr port autotest -name '*.py' | xargs egrep 'metre' | egrep -i -v 'diameter|parameter' | wc -l
215

@rouault
Copy link
Member Author

rouault commented Mar 7, 2024

I suggest we aim for American English

done

@rouault rouault changed the title [WIP] Add RFC99 text: Geometry coordinate precision Add RFC99 text: Geometry coordinate precision Mar 7, 2024
@rouault
Copy link
Member Author

rouault commented Mar 13, 2024

Motion adopted. Will wait before merging (after updating voting history & status) that the PR with the candidate implementation is also merged

@rouault rouault added this to the 3.9.0 milestone Mar 14, 2024
@rouault rouault merged commit 9930cd7 into OSGeo:master Mar 14, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants