Skip to content

Indoqa/solr-spatial-clustering

Repository files navigation

Indoqa Solr Spatial Clustering

This project offers a distance-based spatial clustering search component for Apache Solr. It addresses the problem of reducing the amount of displayed markers on a map, described as Spatial Clustering, using a distance-based clustering algorithm based on GVM.

The search component aggregates all possible search results to a maximum amount of pins and adds this information to the standard search result representation. Like faceting, it can be used to query for a paged result slice (eg. for a result list) and a geographic overview of ALL search result items (spatial clusters) at once.

Spatial Clustering

Installation

Download the plugin jar from http://repo1.maven.org/maven2/com/indoqa/solr/spatial-clustering/7.5.0/spatial-clustering-7.5.0-jar-with-dependencies.jar and copy it into the /lib directory of your solr core.

Configuration

schema.xml

To enable spatial clustering, store the geo information (longitude and latitude) in your solr document:

<fieldType name="pdouble" class="solr.DoublePointField" />

<field name="latitude" type="pdouble" indexed="true" stored="true" />
<field name="longitude" type="pdouble" indexed="true" stored="true" />

Note: For legacy support of old Solr 4 SortableDoubleField, see branch legacy/solr-4.3

solrconfig.xml

Define the search component and map field names for id, longitude and latitude, as well as the maximum allowed number of clusters:

<searchComponent class="com.indoqa.solr.spatial.clustering.SpatialClusteringComponent" name="spatial-clustering">
  <str name="fieldId">id</str>
  <str name="fieldLon">longitude</str>
  <str name="fieldLat">latitude</str>

  <int name="maxSize">1000000</int>
</searchComponent>

After that, add the spatial component to your query component chain:

<requestHandler name="/search" class="solr.SearchHandler" default="true">
  <arr name="last-components">
    <str>spatial-clustering</str>
  </arr>
</requestHandler>

Usage

Query Parameters

  • spatial-clustering=true -> Enables spatial clustering
  • spatial-clustering.size=20 -> Optionally sets the maximum number of clusters (=pins)
  • spatial-clustering.min-result-count=100 -> Optionally sets the minimum number of documents required to do clustering

Result

Similar to facets, the computed clusters are added to the search result after the requested documents. There are two types of result pins:

  • single: Represents a single document, including the id of the referenced document.
  • cluster: Represents an aggregated pin covering more than one document, including the cluster size.
<lst name="spatial-clustering">
  <lst name="pin">
    <str name="type">single</str>
    <int name="size">1</int>
    <double name="longitude">16.345518</double>
    <double name="latitude">48.285202</double>
    <string name="reference">document-2313</string>
  </lst>
  <lst name="pin">
    <str name="type">cluster</str>
    <int name="size">3</int>
    <double name="longitude">16.2461115932</double>
    <double name="latitude">48.20259082573333</double>
  </lst>
  ...
  ...
</lst>

Build

Requirements

  • Apache Solr 7.5.0+
  • Java 8+

Build

  • Download the latest release
  • run maven clean install