Use 64bit floats for geoDistance calculation #58476

dadebue · 2024-01-03T17:15:23Z

Describe the unexpected behaviour

The geoDistance function (and the other geographical distance functions) uses only Float32 values for the calculation.
This is not precise enough for high-precision positioning systems (like RTK GPS), which can determine a position to within ±1 cm.
When geoDistance is called with two positions that are close enough together, the result is always 0, which is incorrect.
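A rough way to see why the result collapses to 0 is to round the example coordinates to 32-bit floats and compare them. The sketch below is not ClickHouse code. It assumes only that the function converts its arguments to Float32, as described above, and `to_float32` is a hypothetical helper written for this illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Coordinates from the reproduction query in this issue.
lon1, lon2 = 8.6230000, 8.6230001
lat1, lat2 = 49.3550000, 49.3550001

# As 64-bit floats the two points are distinct.
print(lon1 == lon2, lat1 == lat2)  # False False

# After rounding to 32 bits both pairs become the same value,
# so any distance computed from them is exactly 0.
print(to_float32(lon1) == to_float32(lon2))  # True
print(to_float32(lat1) == to_float32(lat2))  # True
```

Near 49 degrees the spacing between adjacent 32-bit floats is about 3.8e-6 degrees, which is far coarser than the 1e-7 degree step in the example.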

How to reproduce

Most recent ClickHouse version
SELECT geoDistance(8.623000, 49.355000, 8.6230001, 49.3550001); returns 0
The exact distance is 0.01327m (link to online calculation)
- Latitude 1: 49.3550001
- Latitude 2: 49.3550000
- Longitude 1: 8.6230001
- Longitude 2: 8.6230000
The exact distance can also be calculated in ClickHouse by writing out the full haversine formula manually (returns 0.01327037431348616):

SELECT (2*atan2(sqrt((sin((49.3550001-49.3550000)*pi()/360)*sin((49.3550001-49.3550000)*pi()/360))+(cos(49.3550001*pi()/180)*cos(49.3550000*pi()/180)*sin((8.6230001-8.6230000)*pi()/360)*sin((8.6230001-8.6230000)*pi()/360))),sqrt(1-((sin((49.3550001-49.3550000)*pi()/360)*sin((49.3550001-49.3550000)*pi()/360))+(cos(49.3550001*pi()/180)*cos(49.3550000*pi()/180)*sin((8.6230001-8.6230000)*pi()/360)*sin((8.6230001-8.6230000)*pi()/360))))))*6371000 distance;
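For readers who find the one-line query hard to follow, the same haversine calculation can be sketched in 64-bit arithmetic outside the database. This is an illustration only, not the ClickHouse implementation. The function name `haversine_m` is made up here, and the Earth radius of 6371000 m is taken from the query above.

```python
import math

EARTH_RADIUS_M = 6371000.0  # same radius as in the manual query


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    # a is the haversine of the central angle between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# Argument order follows geoDistance: lon1, lat1, lon2, lat2.
d = haversine_m(8.6230000, 49.3550000, 8.6230001, 49.3550001)
print(f"{d:.5f} m")  # roughly 0.01327 m, in line with the value quoted above
```

With Python's 64-bit floats the two points stay distinct and the result is about 1.3 cm, which is the order of magnitude an RTK receiver can resolve.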

Expected behavior

The geoDistance function should use Float64 values for the calculation to support high-precision position values.
SELECT geoDistance(8.623000, 49.355000, 8.6230001, 49.3550001); should return 0.01327

Additional context

The change was introduced in "Geo distance functions improve performance" #37524.
I'm no C++ expert, but I suppose those lines in src/Functions/greatCircleDistance.cpp need to be changed.


alexey-milovidov · 2024-01-03T23:56:37Z

Let's switch to Float64 and check if there will be any noticeable difference in performance.
I thought there would be no difference.

dadebue · 2024-01-04T10:26:14Z

BTW this chart clearly shows the difference between the geoDistance and manual haversine formula and the lack of precision...

geetptl · 2024-01-14T03:35:04Z

I'd like to work on this.

alexey-milovidov · 2024-01-14T04:22:16Z

@geetptl, thank you! This will be very helpful.

Uzair-90 · 2024-01-14T17:21:14Z

I'm interested in working on this issue. It aligns well with my skills and interests. I've reviewed the problem, and I believe I can contribute a solution.

alexey-milovidov · 2024-01-15T09:09:06Z

@Uzair-90, @geetptl, I don't know who will propose a working solution first, but I'd appreciate your work!

geetptl · 2024-01-16T07:35:51Z

Unsure how to link this PR here, but here's the link to it:

#58847

Uzair-90 · 2024-01-16T14:41:14Z

@alexey-milovidov Thank you for the opportunity. I will give it a try, and I highly appreciate it.

dadebue · 2024-01-29T13:05:35Z

@alexey-milovidov What do you think about the PR from geetptl #58847?

alexey-milovidov · 2024-01-29T17:47:52Z

@dadebue, it is unfinished - there is performance degradation.

alexey-milovidov · 2024-03-24T22:15:44Z

#61848

alexey-milovidov · 2024-03-26T00:54:23Z

@dadebue, @Uzair-90, @geetptl, I've reimplemented it. As a result, the performance increased instead of being degraded as expected before. Also, I've managed to make it 100% compatible and even allow the user to control the behavior.

https://s3.amazonaws.com/clickhouse-test-reports/61848/c2209c997ca08466a595b3ff9fec6db78e244123/performance_comparison_[2_4]/report.html

geetptl · 2024-03-26T01:57:57Z

@alexey-milovidov It was vacation when I was working on this issue, but student life caught up to me, along with my inexperience in C++. But thank you for your guidance in the now closed PR!

alexey-milovidov · 2024-03-26T04:05:37Z

Thank you, and it is actually your contribution :) I will add you as a co-author.

dadebue added the unfinished code label Jan 3, 2024

alexey-milovidov added the easy task Good for first contributors label Jan 3, 2024

alexey-milovidov assigned geetptl Jan 14, 2024

alexey-milovidov assigned Uzair-90 Jan 15, 2024

alexey-milovidov unassigned Uzair-90 and geetptl Mar 24, 2024

alexey-milovidov mentioned this issue Mar 24, 2024 in "Double precision of geoDistance if the arguments are Float64" #61848 (merged)

alexey-milovidov closed this as completed in #61848 Mar 25, 2024
