enable reverse geocoding #19
Comments
Hi, I am currently trying to implement this in my Photon-powered app. I was even wondering whether querying Postgres directly would be a better option here, although it adds a tier to the production environment. I will dig more into this and the Solr docs this week, and I will update if I get better results. My approach so far was to use geofilt + geodist. Any advice on that? |
Since PostGIS 2.0 you have very performant queries of that kind using these operators: http://workshops.boundlessgeo.com/postgis-intro/knn.html 750 ms for Solr is really surprising; there should be ways to make it faster. We'll implement that in the next version of Photon, which will be based on Elasticsearch. This happens in our next sprint in a week. We'll let you know about our findings. |
A geodistance sort on lat,lng should do the job on ElasticSearch. |
Example using Java API:
MatchAllQueryBuilder query = QueryBuilders.matchAllQuery();
SearchRequestBuilder searchRequest = client.prepareSearch(INDEX);
searchRequest.setQuery(query).addSort(SortBuilders.geoDistanceSort("latlon").point(lat, lon).order(SortOrder.ASC));
searchRequest.setSize(1); |
I eventually found the problem with my query and I now get results under 100ms. Not sure the story will be relevant for ES since it's not clear to me what is provided by Lucene and what is provided by Solr / ES. The thing is you can't just sort a The solution is to filter results first, to get the items of a given area around the coordinate to reverse geocode.
I then switched to use
So the relevant parts of my final query are: &fq={!bbox sfield=coordinate} Note that the Also note that I return more than the first result, since from a UI point of view, I find it better to suggest nearby places. Relevant Solr documentation: https://cwiki.apache.org/confluence/display/solr/Spatial+Search Full Solr query:
Test stack to relativise results
|
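For readers following along, here is a minimal sketch of how the filter-then-sort request described above could be assembled as Solr request parameters. It is an illustration, not the poster's full query (which did not survive in this thread). The field name `coordinate` and the `{!bbox sfield=coordinate}` filter come from the comment; the class name, the top-level `sfield`/`pt`/`d` parameters, the `geodist() asc` sort and the `rows` value are assumptions based on the Solr spatial search documentation linked above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Photon or Solr.
class SolrReverseQuery {

    // Collects the request parameters for "filter to a bounding box, then sort by distance".
    static Map<String, String> params(double lat, double lon, double radiusKm, int rows) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");                          // match everything, the filter narrows it down
        p.put("sfield", "coordinate");              // spatial field used by geodist()
        p.put("pt", lat + "," + lon);               // Solr expects "lat,lon"
        p.put("d", String.valueOf(radiusKm));       // distance in kilometres
        p.put("fq", "{!bbox sfield=coordinate}");   // cheap bounding-box prefilter (from the comment)
        p.put("sort", "geodist() asc");             // nearest first, computed only on filtered docs
        p.put("rows", String.valueOf(rows));        // return a few nearby places, not just one
        return p;
    }

    // URL-encodes the parameters into a query string (requires Java 10+ for the Charset overload).
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(params(49.598095, 11.003661, 5.0, 5)));
    }
}
```

The resulting string would be appended to the core's `/select?` URL. The point of the design is that the sort only has to compute distances for documents that already passed the bounding-box filter.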
This is the geo query necessary for Elasticsearch:
{
"sort": [
{
"_geo_distance": {
"coordinate": [
11.003661,
49.598095
],
"order": "asc"
}
}
],
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"distance": "5km",
"coordinate": [
11.003661,
49.598095
]
}
}
}
}
}
The point has to be changed, of course :). It executes in 530ms for 100km and 460ms for 1km, so not much difference for my dataset (worldwide feed). But avoiding the filter leads to response times of over 3000ms. I think this could again be a pain point of the edge ngram, which produces too many entries and could therefore lead to slower response times. |
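Since the point has to be swapped in for every request, a small sketch of building that body from a latitude/longitude pair may help. It assumes the same `coordinate` field and the Elasticsearch 1.x `filtered` query shown above (that query type was removed in later Elasticsearch versions). The class name and the `"size": 1` entry are illustrative additions, not part of the original query.

```java
// Hypothetical helper that renders the request body shown above for a given point.
class ReverseGeocodeBody {

    // Builds a body with a geo_distance filter plus a _geo_distance sort.
    static String build(double lat, double lon, String distance) {
        // Elasticsearch reads geo_point arrays as [lon, lat], not [lat, lon].
        String point = "[" + lon + ", " + lat + "]";
        return "{\n"
            + "  \"size\": 1,\n" // assumption: only the nearest hit is wanted
            + "  \"sort\": [{\"_geo_distance\": {\"coordinate\": " + point + ", \"order\": \"asc\"}}],\n"
            + "  \"query\": {\"filtered\": {\n"
            + "    \"query\": {\"match_all\": {}},\n"
            + "    \"filter\": {\"geo_distance\": {\"distance\": \"" + distance
            + "\", \"coordinate\": " + point + "}}\n"
            + "  }}\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(build(49.598095, 11.003661, "5km"));
    }
}
```

Keeping the coordinate in one variable avoids the easy mistake of updating the filter point but not the sort point, or of mixing up the lon/lat order between the two.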
I am experimenting a bit with enabling reverse geocoding for my Photon installation. Is @karussell's query above still valid for the current version of ES? I am getting a |
Hm, we updated Elasticsearch from 1.1 to 1.3.1 and dynamic scripting isn't allowed anymore. For the sake of security you need to place the script in the script folder and add your file here to be copied to the right place. I am not totally sure that is your problem, but it would make sense. There is also good documentation for sorting by distance btw: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-by-distance.html |
@christophlingg There should not be a need for dynamic scripting for this. @Svantulden are you using some external library or directly the JSON request? If it is an external lib, make sure you use it correctly: http://stackoverflow.com/a/20175407/194609 |
Thank you both for your very quick reply! @karussell you were indeed right that the ES library Photon uses already adds I got reverse geocoding to work in my Photon installation with the following code: The query:
{
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"distance": "5km",
"coordinate": [
${lon},
${lat}
]
}
}
}
}
The searching & sorting code:
SearchResponse response = client.prepareSearch("photon").setSearchType(SearchType.QUERY_AND_FETCH)
.setQuery(query)
.addSort(SortBuilders.geoDistanceSort("coordinate").point(lat, lon).order(SortOrder.ASC))
.setSize(1)
.setTimeout(TimeValue.timeValueSeconds(7))
.execute()
.actionGet();
I didn't get the sorting to work as a JSON String, but I may be doing something wrong there. The query executes in an average of The data seems pretty accurate at first glance, need to test some more though. Would you be interested if I made my (finished) reverse geocoding work public via a PR? |
I think a PR is always appreciated, @christophlingg can veto, of course ;) For the Java code: this is not working but your Python query works? Did you replace the lat,lon parameters before passing the string to setQuery like done here and use the correct template? |
Reverse geocoding is something many users will be very happy about! If I recall correctly, an optimization step was necessary. It is too expensive to calculate the distance between the location and all Photon documents (currently more than 100 million). The distance calculation is not trivial and doing it that often will cause long queries -> a preselection was necessary: take all documents in a certain bbox (the geo index is used to make this performant) and take the closest within this bbox. Btw: when it comes to performance we cannot beat Nominatim here. PostGIS has better geoindex support than Elasticsearch has. I think they do it via geohashes. |
I think that's why I added the filter, where most of the calculation is done via a quadtree rather than a normal distance calculation.
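To make the preselection idea concrete, here is a small self-contained sketch: discard everything outside a bounding box with cheap comparisons, then run the expensive distance calculation only on the survivors. It is not Photon's or Elasticsearch's implementation. The linear scan stands in for a real geo index, the class and method names are made up for illustration, and the box ignores the antimeridian and the poles.

```java
// Hypothetical illustration of "bbox preselection, then exact distance".
class NearestWithinBox {
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two lat/lon points (haversine formula).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns the index of the closest point inside the box, or -1 if the box is empty.
    // Each entry of points is {lat, lon}.
    static int nearest(double[][] points, double lat, double lon, double radiusKm) {
        // Half-width of the box in degrees; longitude degrees shrink with cos(latitude).
        double boxLat = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        double boxLon = boxLat / Math.cos(Math.toRadians(lat));
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < points.length; i++) {
            // Cheap preselection: two comparisons instead of trigonometry.
            if (Math.abs(points[i][0] - lat) > boxLat || Math.abs(points[i][1] - lon) > boxLon) {
                continue;
            }
            double d = haversineKm(lat, lon, points[i][0], points[i][1]);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{49.60, 11.00}, {49.598, 11.004}, {52.52, 13.40}};
        System.out.println(nearest(points, 49.598095, 11.003661, 5.0));
    }
}
```

In a real index the box lookup would not touch every document, which is where most of the saving over a global distance sort comes from.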
I don't think ElasticSearch should be slower because of the used algorithm. And a read from a Quadtree (or similar index) should be similarly fast to fetching from a large hashmap. If they really are slower then some devs from ES will help. |
Sorry, I phrased that badly. I meant that having the sort as part of the query JSON string did not work (probably because the ES library expects just a query and not a sort). So I used I will first do some timings on the global dataset and some refactoring (working here) before I submit a PR. |
Any news on this? This is a much needed feature :) |
In a week I'll have enough time to submit a PR for this on the recently refactored Photon version. If anyone wants it before that, you can use the query above and write a small modification to the RequestHandler to do it yourself. |
Any news on this? :) |
I've made a PR with my Reverse Geocoding here: #164 |
was closed by #164 |