This repository has been archived by the owner on Feb 2, 2023. It is now read-only.

The same Dubai is sent 2 times from 2 different sources #819

Closed
Zlitus opened this issue Jun 12, 2019 · 1 comment


@Zlitus

Zlitus commented Jun 12, 2019

Do you want to request a feature or report a bug?
Bug

What is the current behavior?
When I run this search with Algolia Places, “Dubai” appears twice in the search results:

{
	"hitsPerPage": 5,
	"language": "fr",
	"type": "city",
	"query": "dubai"
}

The first one apparently comes from Geonames and the other one from OSM, and they have different postal codes. This looks like a duplication issue.

See a screenshot of the 2 results:
[Screenshot: the two “Dubai” results]

Thank you.

@JonathanMontane
Contributor

Hi, thanks for the report.
This happens because the two Dubai records are more than 25 km apart (25.6 km, to be exact).
OSM and Geonames contain many duplicates, so we have rules in place to merge them. One of those rules combines records that have the same administrative field, share the same name, and are within 25 km of each other. We can't remove the distance rule, because many countries have homonyms within the same administrative field that are far enough apart that they should not be merged. We also can't rely on lower-level administrative fields, as they are fairly likely to be missing or incorrect. So we use the heuristic that we found worked best for merging the data, and unfortunately this case falls just outside of its checks.

That being said, one of these two points is probably incorrect. If you update the data in either OSM or Geonames with a more accurate center point, the duplication will be resolved the next time we update the data.
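
For readers who want to see why a 25.6 km gap slips through, here is a minimal Python sketch of the merge rule described in the reply above: same name, same administrative field, and within 25 km. It is an illustration only, not the actual Algolia Places code. The `Place` type, the function names, the use of the haversine distance, and the coordinates are all assumptions made for the example; they are not the real Dubai records.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
MERGE_RADIUS_KM = 25.0       # distance threshold quoted in the reply


@dataclass(frozen=True)
class Place:
    """Hypothetical record shape; the real pipeline's schema is not shown in this thread."""
    name: str
    administrative: str
    lat: float
    lng: float
    source: str


def haversine_km(a: Place, b: Place) -> float:
    """Great-circle distance between two places, in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlng = radians(b.lng - a.lng)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlng / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def should_merge(a: Place, b: Place) -> bool:
    """Apply the three conditions from the reply: name, administrative field, distance."""
    return (
        a.name.casefold() == b.name.casefold()
        and a.administrative.casefold() == b.administrative.casefold()
        and haversine_km(a, b) <= MERGE_RADIUS_KM
    )


# Made-up coordinates chosen so the two points are roughly 25.6 km apart.
osm = Place("Dubai", "Dubai", 25.0, 55.0, "osm")
geonames = Place("Dubai", "Dubai", 25.2302, 55.0, "geonames")

if __name__ == "__main__":
    print(f"distance: {haversine_km(osm, geonames):.1f} km")
    print(f"merged:   {should_merge(osm, geonames)}")
```

With these made-up points the name and administrative field match, but the distance check fails, so both records stay in the index. Moving either center point a few hundred metres closer would be enough for the pair to merge, which is why correcting the upstream data fixes the duplicate.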
