-
Notifications
You must be signed in to change notification settings - Fork 286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase offset for decay function #296
Comments
Good catch! Perhaps the newly introduced distance_sort=true then can be reverted, as suggested by @lonvia in #294? However, introducing a zoomlevel dependent offset or decay still might improve user experience: If zoomed on city level, I'd have a preference for POIs and addresses near by, zoomed out with the same center coord probably not... |
Hmmh, it still does not fix the location bias. I'm currently experimenting with a better script like:
but this is far from easy to have a good balance and a zoomlevel dependent solution might be good where the normalization factor
The default behaviour is already fixed via #297. |
We really need better testing here. Is this possible with the geocoder-tester tool? |
For Ile de France, there seem to be few test cases. |
Thanks, cool, that this is already possible! We have to add a few negative (where location bias has to be ignored) and positive (where location bias can be used for a better search) |
The location bias should automatically be applied soon as the lat/lon parameters are added. I don't think we need an additional parameter there. A zoom parameter is a good idea as it is something users can understand (and automatically apply from the map that is shown). Nominatim uses a viewbox parameter and then has a binary bias of inside/outside that box. But that is not ideal either. Zoom sounds better. What do you think about setting up an instance with an explicit location bias parameter that allows experimenting a bit? |
I can do that, yes. Should I make the whole formula changeable via a URL parameter for the time being? |
Yes, I think just setting the overall number for the location bias would be good. That allows to collect a few data points and then retrofit them to a formula. |
Now I am confused :) ... so complete access to a formula or just a distance as location bias? |
The formular computes some number, right? I would suggest to set this number directly via a paramter. |
This is not how it works as this depends on the distance to the document specific coordinate. Also scores are relative for ElasticSearch and cannot be used in an absolute manner, ie. you cannot compare scores of two different searches. |
Right. Now I see what you mean by whole formula. Would be cool but only if it is not too much work. |
Ok, see this branch and here you can test it: http://46.4.67.134:2322/api?q=dresden E.g. the query Please make sure you use the proper URL encoding |
For me the formula
|
Additonally, I'd propose to introduce kind of elasticsearch's 'distance' offset,
where distance_offset is zoom level dependent, e.g. In consequence, all results closer than distance_offset would have the same (max) location boost. OSM-Wiki/Zoom_levels offers a formula, which could be used to calculate a distance_offset internally:
This would boost all results in a pixel radius of e.g. 500 pixels around the map center with the same location bias, decaying would start only for coords farer away. bias_radius should then be zoom level dependent as well, I think... |
If we have a distance_offset we do not need a bias_radius.
We would need to performance test this because of the two expensive operations (cos + exp). Our current formula just has multiplication |
BTW: did you mean |
bias_radius would influence the decay. It somewhat corresponds to the 'scale' parameter in the ES decay fuctions.
Oh, yes, you are right!
The distance_offset would be calculated once per query and then passed in the formula as parameter, so performance should not be an issue, I think. |
Indeed, it influences the function differently. But 4 variables just for location bias is a bit ugly and potentially too flexible. Maybe we tune all the parameters for development and expose only distance_offset? Maybe we should name it
Ah, got it - makes sense :) ! |
I have created some tests for this feature here: geocoders/geocoder-tester#36 And so we use this new photon release including the proposed function with dist_offset=5 in production at GH for 2 days without complaints :D The interesting thing is that the dist_offset does not have much influence in the results in my manual tests, the |
Sounds good! Concerning the dist_offset: a value of 5 (m) would treat coordinates around 5m of the given coord to have the max boost, right? This yields a good result if zoomed in on street level. E.g. for a map centre in Kassel, a search for München returns a good POI match.
Which range did you use for testing? As they need to be in meter, the probably need to be rather huge to show significant effects |
Ah, no I convert the distance into km before, so 5km: String strCode = "double dist = doc['coordinate'].planeDistance(params.lat, params.lon); " +
"double score = 0.1 + 0.5 / (1.0 + ( dist * 0.001 - " + radius + ") / 10.0); " +
"score";
Tested also hundreds of km |
Could you try it with the Math.max call included:
|
Ah, probably even this:
would be better as otherwise the close points are 0 or worse like in my case negative? |
In this case the maximum score would depend on the externally provided radius, which IMHO is hard to understand. Points closer than radius should not get a negative score, but just evaluate to maximum score:
|
Now I understand. Thanks! Now the problem was that my new location bias tests fail for radius=5
To fix this the location bias scale needed an increase from 0.5 to 1.8. Then still the Hoyerswerda test fails as there is a "Bautzener Brücke" close to the "Bautzener Allee" but the Allee is the correct one and the problem behind this is the radius parameter which makes all results in 5km more or less the same. IMO radius should be 0, please see my PR #306 for this proposal. |
For now fixed via #306 |
As discussed on the mailing list we need a location bias that behaves a bit more like the old functionality and one possibility could be to increase the offset so that results "far away" are still considered with an enabled location bias.
The text was updated successfully, but these errors were encountered: