Coordinate rounding should use seven digits #466

Open
tucotuco opened this issue Jan 25, 2021 · 3 comments

@tucotuco

It is not clear where the choice to use six digits for rounding came from. Georeferencing best practices have recommended seven digits since 2001, to ensure that transformations back and forth between coordinate systems or formats are preserved without coordinate drift. The effect of using six digits is that georeferences done following best practices all raise this flag, which is an unfortunate result for those actually making the effort to do things right. I recommend changing the flag to use seven digits. The relevant line currently reads:

* <p>Coordinate precision will be 6 decimals at most, any more precise values will be rounded.

See
https://docs.gbif-uat.org/georeferencing-quick-reference-guide/1.0/en/#s-coordinate-format
https://github.com/VertNet/georefcalculator/blob/eb1d7ad4b92a523c7e5c649764d1fbe9dfe896c4/source/python/point.py#L27
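To illustrate the effect described above, here is a minimal sketch of a rounding check with a configurable digit limit. It is not the actual parser code: the function name `round_coordinate`, the string input, and the half-even rounding mode are all assumptions made for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_coordinate(value: str, max_decimals: int):
    """Round a decimal-degree value and report whether rounding changed it.

    Hypothetical helper: the name, signature and rounding mode are
    assumptions for illustration, not the real implementation.
    """
    original = Decimal(value)
    quantum = Decimal(1).scaleb(-max_decimals)  # e.g. 0.000001 for 6 decimals
    rounded = original.quantize(quantum, rounding=ROUND_HALF_EVEN)
    flagged = rounded != original  # True only if precision was actually lost
    return rounded, flagged


# A georeference recorded with seven decimals, as best practices recommend.
latitude = "-33.1234567"

print(round_coordinate(latitude, 6))  # value is altered, so the flag is raised
print(round_coordinate(latitude, 7))  # value is unchanged, so no flag
```

With a limit of six, every seven-digit georeference is altered and flagged; with a limit of seven, it passes through untouched.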

@timrobertson100
Member

timrobertson100 commented Jan 25, 2021

This is probably carried over from the days when we had to reduce cardinalities for search technology. Originally we had 5DP (~1 m precision) in very old generations of indexing on MySQL; when things progressed, that was increased to 6DP (~10 cm). My guess is that we assumed ~10 cm precision was plenty for a global search system.
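The approximate ground distances quoted above follow from the length of one degree, roughly 111 km. A quick sketch of the arithmetic, assuming a constant 111,320 m per degree (a simplification: degrees of longitude shrink toward the poles):

```python
# Assumption: one degree is about 111,320 m (true at the equator).
METERS_PER_DEGREE = 111_320


def resolution_m(decimals: int) -> float:
    """Ground distance represented by one unit in the last decimal place."""
    return METERS_PER_DEGREE * 10 ** -decimals


for dp in (5, 6, 7):
    print(f"{dp}DP: {resolution_m(dp):.4f} m")
# 5DP: 1.1132 m
# 6DP: 0.1113 m
# 7DP: 0.0111 m
```

So 7DP corresponds to roughly 1 cm steps, and the worst-case rounding error is half a step at each level.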

Increasing to 7DP shouldn't be an issue since we geohash for most geo search now, but we should keep an eye on the batch map tile pyramid build performance. It runs every 2 hours and needs to finish in an acceptable time. If necessary we can apply the grouping in the map build, but I suspect it won't be an issue.

@MattBlissett correctly highlights it might impact cache hit ratios.

@MattBlissett
Member

I think we could keep 7 digits, but query the geocoding tables after rounding to 5-6 digits. The source maps aren't accurate to centimetre precision anyway, and the queries return multiple possible locations within several kilometres of any borders (ordered by distance from the point).

Otherwise, we would massively increase the processing needed for reinterpretation with new geo layers.

@timrobertson100
Member

timrobertson100 commented Jan 25, 2021

> Otherwise, we would massively increase the processing needed for reinterpretation with new geo layers.

Should we verify this? We might find that rounding to 7DP doesn't actually increase the number of distinct points significantly (how many records really have different coordinates within ~10 cm of each other?).
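One way to check would be to count distinct points at each precision and compare. A minimal sketch follows; the sample coordinates are made up for illustration and are not real occurrence data, and `distinct_points` is a hypothetical helper.

```python
def distinct_points(coords, decimals: int) -> int:
    """Count distinct (lat, lon) pairs after rounding to the given precision."""
    return len({(round(lat, decimals), round(lon, decimals)) for lat, lon in coords})


# Made-up sample: not real occurrence records.
sample = [
    (10.1234561, 20.1234561),
    (10.1234564, 20.1234563),  # differs from the first only in the 7th decimal
    (10.1234561, 20.1234561),  # exact duplicate of the first
    (48.8566140, 2.3522219),
]

at_6dp = distinct_points(sample, 6)
at_7dp = distinct_points(sample, 7)
print(f"6DP: {at_6dp} distinct, 7DP: {at_7dp} distinct")
```

Running the same comparison over the real index would show whether the ratio between the two counts is large enough to matter for reinterpretation cost.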
