-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
_geoPoint sorting not working #1872
Comments
Hello @otuerk Thanks for your report! Implementing this feature request should solve part of the problem you reported. Thanks again for this feedback! |
Thanks for the explanation @curquiza! Is it planned to make the sort more accurate in a future release? |
@meilisearch/product-team can tell you more about what we plan in the future :) |
Hello @otuerk 👋! Thanks for the feedback and sorry for the late reply! Yes, we are planning to fix this problem. We're sorry to see this happen. We have been thinking about several solutions like the one presented by @curquiza. We will decide in the next few days which one to implement and how far we will go if we add this type of behavior (I'm thinking of exposing an additional parameter that allows you to set the However, the presented solution might not be suitable for you if you need to have accurate sorting on very close distances. For example, if 2 documents are 200m apart and are placed in the same bucket because of this rate difference, they will have to be separated by other criteria. Since the search request presented here does not contain a |
I would prefer if meilisearch could provide a good default for the rate difference. |
Hi @gmourier and thank you for your explanation again. Just as a feedback from my side: For my use-case it's necessary to sort documents (each document represents a store in my case) pretty accurately since almost all data are located in the same city and sometimes very close by. I understand that the proposed solution might be a workaround, but I think that when you look at this feature in the docs you would expect it to sort accuartely and not just approximately. Right now the only way to get accurate data in my case is by doing the distance calculation directly via a DB query. But obviously this is much slower than what Meili returns. So I would really love to see an accurate way of sorting documents by distance - even if they are close by. |
Thank you! Can your end-users (if there are any) interact with an interface to search for stores by name, brand, or filter by other criteria? |
So yeah, in reality, there are two problems. The first one (which I think can be fixed quite easily) is an error of approximation with the data structure we use to store the geopoint. If this was implemented, your issue would disappear. The second problem we have, and that doesn’t seem to be an issue for you, is that currently, when you sort with a _geoPoint, you are only sorting your documents by geo distances and nothing more. [
{
"id": 0,
"_geo": { "lat": 1, "lng": 0 },
"food": "kebab",
},
{
"id": 1,
"_geo": { "lat": 1.5, "lng": 0 },
"food": "pizza",
},
{
"id": 2,
"_geo": { "lat": 2, "lng": 0 },
"food": "kebab",
},
{
"id": 3,
"_geo": { "lat": 5, "lng": 0 },
"food": "kebab",
}
] And your user sends this request:
He would get the three documents in the following order: If this doesn’t work for you, I would love to hear what is your use case! 😁 |
I get your point that prefering the documents containing kebab totally makes sense in this case since you provided a query that narrows down the search even more and pizza results might not be important in this case. But the case where I would like to use this feature is that on our landing page there's a map that shows stores close to your current location. Underneath the map there's a short list of 5 results closest to you and a button to paginate more results. There's no additional query here. The problem is that the result from Meili is not ordered properly (it doesn't make sense that the result with a distance of 4548 is sorted before the one with 3938). Of course I could order the results on the client side but when you use pagination that does not work anymore since you might get results that should actually have been sorted before the results you already have. E.g: Initial load returns results with distance (let's assume the distance is correct)
But the next pagination request to the server returns:
Now 350 should have been in the initial result, and re-sorting the whole result on the client side now is completely confusing for the user. And this is what actually happens when we use Meili for that kind of distance sorting. And now back to the example you provided, I would assume that even if I provide an additional query - if there are enough results that e.g. pagination is a thing you'd have the same problem again. |
Ok thank you, I guess we would need another query setting as @gmourier was saying to let you say you want to order documents with 0% of imprecision.
And this is our fault but I think it could be easily fixed when I have the time 😁 |
Hello @otuerk, I've been working on your bug. If you have the time I would love it if you could run your test again and tell me if it works with this branch: https://github.com/meilisearch/MeiliSearch/tree/fix_geo_search You'll need to re-index all your documents from a new |
424: Store the geopoint in three dimensions r=Kerollmops a=irevoire Related to this issue: meilisearch/meilisearch#1872 Fix the whole computation of distance for any “geo” operations (sort or filter). Now when you sort points they are returned to you in the right order. And when you filter on a specific radius you only get points included in the radius. This PR changes the way we store the geo points in the RTree. Instead of considering the latitude and longitude as orthogonal coordinates, we convert them to real orthogonal coordinates projected on a sphere with a radius of 1. This is the conversion formulae. ![image](https://user-images.githubusercontent.com/7032172/145990456-eefe840a-384f-4486-848b-81d0036814ec.png) Which, in rust, translate to this function: ```rust pub fn lat_lng_to_xyz(coord: &[f64; 2]) -> [f64; 3] { let [lat, lng] = coord.map(|f| f.to_radians()); let x = lat.cos() * lng.cos(); let y = lat.cos() * lng.sin(); let z = lat.sin(); [x, y, z] } ``` Storing the points on a sphere is easier / faster to compute than storing the point on an approximation of the real earth shape. But when we need to compute the distance between two points we still need to use the haversine distance which works with latitude and longitude. So, to do the fewest search-time computation possible I'm now associating every point with its `DocId` and its lat/lng. Co-authored-by: Tamo <tamo@meilisearch.com>
Fixed by #2005 that bump the milli dependency to milli v0.22.0 containing the fix of this issue 🙂 The bug fix will be released in Meilisearch v0.26.0. |
Describe the bug
The result is not sorted by distance to the given lat/lng when using _geoPoint sorting function
To Reproduce
Use the following data
When using the _geoPoint sort function
curl -X POST 'http://localhost:7700/indexes/test/search' -H 'Content-type:application/json' --data-binary '{ "sort": ["_geoPoint(48.2342154, 16.4205687):asc"] }'
You get the following result:
Which is wrong since the object with ID 3 is returned before the one with ID 4, although it's further away.
Expected behavior
Expected output would be:
MeiliSearch version: 0.23.1
EDIT from @irevoire
Steps to REALLY sort the documents by geo distances.
The text was updated successfully, but these errors were encountered: