support geo sorting on multiple geo point values per doc #1846

Closed
tommyvn opened this Issue Apr 4, 2012 · 6 comments

@tommyvn

tommyvn commented Apr 4, 2012

I have multiple geo points per document and would like to sort on the distance from my center point to the closest matching location in each document. As things currently work, the sort appears to be based on an arbitrary location in the document, as in the following scenario:

I have the following mapping:

$ curl -XGET 'http://localhost:9200/work/offices/_mapping?pretty=true'
{
  "offices" : {
    "properties" : {
      "location" : {
        "properties" : {
          "address" : {
            "type" : "string"
          },
          "point" : {
            "type" : "geo_point"
          }
        }
      },
      "name" : {
        "type" : "string"
      }
    }
  }
}

Then I have the following query and response (I've only included hits to cut down on noise):

$ curl -XGET 'http://localhost:9200/work/offices/_search?pretty=true' -d '{ "fields": [ "name", "location" ], "query": { "match_all": {} }, "sort": { "_geo_distance": {"location.point": [-0.0976398, 51.4962307], "order": "asc", "unit": "km"} }}'
...
    "hits" : [ {
      "_index" : "work",
      "_type" : "offices",
      "_id" : "IDtt2WnSQnWEhuAxhlbPgw",
      "_score" : null,
      "fields" : {
        "location" : [ {
          "point" : [ -0.01655, 51.5007324 ],
          "address" : "E14 9SH"
        }, {
          "point" : [ -0.0976398, 51.4962307 ],
          "address" : "SE1 6PL"
        } ],
        "name" : "office3"
      },
      "sort" : [ 0.0 ]
    }, {
      "_index" : "work",
      "_type" : "offices",
      "_id" : "xzaVAHUoSGON8gY1-ggILQ",
      "_score" : null,
      "fields" : {
        "location" : [ {
          "point" : [ -0.01655, 51.5007324 ],
          "address" : "E14 9SH"
        }, {
          "point" : [ -0.0684337, 51.4843866 ],
          "address" : "SE1 5BA"
        } ],
        "name" : "office10"
      },
      "sort" : [ 2.413161036894697 ]
    }, {
      "_index" : "work",
      "_type" : "offices",
      "_id" : "ZWjuiVkWSD6H99ooKcKNuA",
      "_score" : null,
      "fields" : {
        "location" : [ {
          "point" : [ -0.0976398, 51.4962307 ],
          "address" : "SE1 6PL"
        }, {
          "point" : [ -0.0684337, 51.4843866 ],
          "address" : "SE1 5BA"
        } ],
        "name" : "office8"
      },
      "sort" : [ 2.413161036894697 ]
    }
...

office8 and office3 both have locations 0km away from my search point, yet office10 is sneaking in between the two at 2.4km away, and office8 is also being placed at 2.4km away (which it is, but only for its farther location point).
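
To make the expected ordering concrete, here is a minimal JavaScript sketch that computes, outside Elasticsearch, the distance from the search point to the closest location of each office in the response above. The `haversineKm` and `closestKm` helpers and the 6371 km earth radius are assumptions made for illustration; they are not part of Elasticsearch.

```javascript
// Hypothetical helper: great-circle distance in km (haversine, R = 6371 km assumed).
function haversineKm(lat1, lon1, lat2, lon2) {
    var R = 6371;
    var toRad = Math.PI / 180;
    var dLat = (lat2 - lat1) * toRad;
    var dLon = (lon2 - lon1) * toRad;
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.sin(dLon / 2) * Math.sin(dLon / 2) *
            Math.cos(lat1 * toRad) * Math.cos(lat2 * toRad);
    return 2 * R * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
}

// Hypothetical helper: distance to the closest of a document's [lon, lat] points.
function closestKm(center, points) {
    return Math.min.apply(null, points.map(function (p) {
        return haversineKm(center[1], center[0], p[1], p[0]);
    }));
}

var center = [-0.0976398, 51.4962307]; // [lon, lat], as in the query above
var offices = [
    { name: "office3",  points: [[-0.01655, 51.5007324], [-0.0976398, 51.4962307]] },
    { name: "office10", points: [[-0.01655, 51.5007324], [-0.0684337, 51.4843866]] },
    { name: "office8",  points: [[-0.0976398, 51.4962307], [-0.0684337, 51.4843866]] }
];

// Rank offices by their closest point, nearest first.
var ranked = offices.map(function (o) {
    return { name: o.name, km: closestKm(center, o.points) };
}).sort(function (a, b) { return a.km - b.km; });

ranked.forEach(function (r) { console.log(r.name, r.km.toFixed(3)); });
```

With a closest-point sort, office3 and office8 both come out at 0 km and office10 comes last at roughly 2.4 km, which is the ordering I would expect from the query.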

This was originally discussed here: https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/DxIUevwZfOs

@DominicWatson

DominicWatson commented Sep 7, 2012

While waiting for this enhancement, I've managed to get this working using the scripting functionality of ElasticSearch (which I have to say is utterly awesome, this is such a refreshing API to work with).

The following solution is very rough. It currently only supports kilometers and the lang-javascript plugin is required.

A temporary solution

I have the following javascript in ./config/scripts/geo/closestdistance.js:

(function(){
    var i, locations, calculateDistance, closest, distance;

    // Single-valued field: use the built-in distance helper directly.
    if ( !doc['latlon'].multiValued ){
        return doc['latlon'].distanceInKm( lat, lon );
    }

    calculateDistance = function( lat1, lat2, lon1, lon2 ){
        var R = 6371 // km
          , dLat = (lat2-lat1) * Math.PI / 180
          , dLon = (lon2-lon1) * Math.PI / 180
          , lat1 = lat1 * Math.PI / 180
          , lat2 = lat2 * Math.PI / 180
          , a = Math.sin(dLat/2) * Math.sin(dLat/2) + Math.sin(dLon/2) * Math.sin(dLon/2) * Math.cos(lat1) * Math.cos(lat2)
          , c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));

        return R * c;
    }

    locations = doc['latlon'].getValues();
    closest = calculateDistance( lat, locations[0].lat(), lon, locations[0].lon() );

    for( i=1; i < locations.length; i++ ){
        distance = calculateDistance( lat, locations[i].lat(), lon, locations[i].lon() );
        closest = distance < closest ? distance : closest;
    }

    return closest;
})();

Then, in my search body, I have something like:

{
    query : {
        custom_score : {
              query : {
                  query_string : {
                      query : "myquerystring..."
                  }
              }
            , params : {
                  lat : 54.5881
                , lon : -5.85829
              }
            , script : "geo_closestdistance"
        }
    },
    sort : [{_score : { order : "asc"}}]
}

Hopefully someone will find this useful.

@ghost

ghost commented Sep 8, 2012

Just so you know, Dominic: the ESP dev centre treated geo-location
searching as a rectangulation problem, regardless of how the measurements
are laid out on the globe. That was the search engine bought by Microsoft.
I've no idea whether they got geo into SharePoint search, or whether they
ended up with something smarter than right angles. I look forward to
leveraging this solution.

b.

@tbug

tbug commented Mar 27, 2013

I would also very much like to see this implemented.
Preferably controlled by the mode option as described in
http://www.elasticsearch.org/guide/reference/api/search/sort/

@tbug

tbug commented Mar 28, 2013

As a response to @DominicWatson:
This will run ~30% faster:

(function(){
    var i, locations, arcDist, closest, distance;
    arcDist = function( lat1, lat2, lon1, lon2 ){
        var R = 6371 // earth radius
          , CI = 0.017453292519943295 // pi/180
          , dLat = (lat2-lat1) * CI / 2
          , dLon = (lon2-lon1) * CI / 2
          , sinDLat = Math.sin(dLat)
          , sinDLon = Math.sin(dLon)
          , a = sinDLat*sinDLat + sinDLon*sinDLon * Math.cos(lat1*CI)*Math.cos(lat2*CI)
          , c = 2*Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
        return R * c;
    }
    locations = doc["locations"].getValues();
    closest = arcDist( lat, locations[0].lat(), lon, locations[0].lon() );
    for( i=1; i < locations.length; i++ ) {
        distance = arcDist( lat, locations[i].lat(), lon, locations[i].lon() );
        if(distance < closest) {
          closest = distance;
        }
    }
    return closest;
})();
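
As a sanity check on the rewrite, here is a small standalone sketch that compares the two haversine formulations outside Elasticsearch. The names `originalDist` and `fasterDist` are hypothetical labels for the two versions posted in this thread. The sketch checks only that the results agree and makes no claim about the ~30% timing figure.

```javascript
// Formulation from the first script: half-angles taken inside each Math.sin call.
function originalDist(lat1, lat2, lon1, lon2) {
    var R = 6371; // km
    var dLat = (lat2 - lat1) * Math.PI / 180;
    var dLon = (lon2 - lon1) * Math.PI / 180;
    var rLat1 = lat1 * Math.PI / 180;
    var rLat2 = lat2 * Math.PI / 180;
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.sin(dLon / 2) * Math.sin(dLon / 2) * Math.cos(rLat1) * Math.cos(rLat2);
    return R * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
}

// Formulation from the faster script: precomputed pi/180 and cached sines.
function fasterDist(lat1, lat2, lon1, lon2) {
    var R = 6371; // km
    var CI = 0.017453292519943295; // pi/180
    var sinDLat = Math.sin((lat2 - lat1) * CI / 2);
    var sinDLon = Math.sin((lon2 - lon1) * CI / 2);
    var a = sinDLat * sinDLat +
            sinDLon * sinDLon * Math.cos(lat1 * CI) * Math.cos(lat2 * CI);
    return R * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
}

// Two of the points from the original report (argument order: lat1, lat2, lon1, lon2).
var d1 = originalDist(51.4962307, 51.4843866, -0.0976398, -0.0684337);
var d2 = fasterDist(51.4962307, 51.4843866, -0.0976398, -0.0684337);
console.log(d1, d2, Math.abs(d1 - d2) < 1e-9);
```

Note that the faster version drops the single-valued shortcut from the earlier script, so it always goes through `getValues()`.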

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Mar 28, 2013

@martijnvg martijnvg closed this in 941aa17 Mar 28, 2013

@martijnvg

Member

martijnvg commented Mar 28, 2013

The next 0.90 release will support a mode option (min and max) when sorting by geo distance.
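
For anyone landing here later, a sketch of what a search body using that option might look like, reusing the `location.point` field and center point from the original report. The `mode` key is the one described on the sort documentation page linked earlier in this thread. Treat the exact syntax as an assumption to check against the documentation for your version.

```javascript
// Sketch of a search body that sorts each document by its closest point.
var body = {
    query: { match_all: {} },
    sort: [{
        _geo_distance: {
            "location.point": [-0.0976398, 51.4962307], // [lon, lat]
            order: "asc",
            unit: "km",
            mode: "min" // "min" sorts on the closest point, "max" on the farthest
        }
    }]
};

// Print the JSON that would be sent as the request body.
console.log(JSON.stringify(body, null, 2));
```

This would replace the script-based workaround above, since the sort itself then picks the closest point per document.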

pdegeus added a commit to pdegeus/elasticsearch that referenced this issue Apr 11, 2013

Merge branch 'master' of https://github.com/msimons/elasticsearch
# By Igor Motov (1) and Martijn van Groningen (1)
# Via Marco Simons (1) and Martijn van Groningen (1)
* 'master' of https://github.com/msimons/elasticsearch:
  Added sort mode to geo distance sorting. Closes elastic#1846
  Fix LeastUsedDistributor and ensure random distribution for multiple non-fs directories
@mouzt

mouzt commented Jul 6, 2015

Is there now a solution to this that is not rough?
