William Lyon edited this page Aug 18, 2014 · 31 revisions

Neo4j Spatial Cypher GSoC Project Tutorial

Overview

This document explains how to use the initial version of the Neo4j Spatial Cypher project. This prototype was developed as part of Google Summer of Code 2014. Spatial Cypher aims to expose the power of the Neo4j Spatial plug-in (spatial layers/indexing, efficient spatial querying) to the Cypher query language.

Installation

Spatial Cypher depends on a fork of Spatial, https://github.com/johnymontana/spatial, which it needs for RegisteredLayerTypes.

Rough outline (TODO: fill in with actual commands)

  1. git clone https://github.com/johnymontana/spatial.git
  2. cd spatial
  3. mvn clean install
  4. cd ..
  5. git clone https://github.com/johnymontana/neo4j.git

FIXME: This is where it gets tricky. Because of what is (I think) a temporary circular dependency (neo4j <-> spatial), Maven cannot be used to build the project. Individual tests can, however, be run in the IntelliJ IDE during development. For example, this Maven command should run the SpatialQueriesTest test, but it fails:

  1. cd neo4j/community/cypher
  2. mvn -Dtest=SpatialQueriesTest test

In IntelliJ:

  1. Open neo4j/community/cypher/pom.xml
  2. Choose to import it as a new project
  3. Run an individual test by pressing CTRL-SHIFT-F10

Spatial Cypher tests are:

cypher-parent project

  • SpatialCreateAddTest.scala
  • SpatialDistanceTest.scala
  • SpatialIntersectsTest.scala
  • SpatialQueriesTest.scala
  • SpatialTest.scala

cypher-compiler-2.1 project

  • spatial/IntersectsTest.scala
  • spatial/SpatialCypherTest.scala
  • spatial/SpatialTransactionBoundQueryContextTest.scala

Functionality

The initial functionality exposed in this version of Spatial Cypher is a minimal set consisting of:

  • Creating spatial layer(s)
  • Adding Nodes to spatial layer(s)
  • Spatial querying

Creating Spatial layer

A spatial layer must be created before any spatial functionality is available. This is done with the spatialCreateLayer function. The first parameter is the name of the layer, which should correspond to a Node Label. The second parameter is the type of layer. The initial options for layer type are SimplePoint and WKT.

SimplePoint Layer: This layer type supports geometries of POINT type only (x,y). It uses a configuration of lat:lon, which means any node added to this layer must have lat and lon properties that specify the coordinates of the point.

WKT Layer: This layer type supports geometries specified as WKT (Well-Known Text).

This statement:

CREATE (n1:MyLayer) WITH spatialCreateLayer("MyLayer","SimplePoint") as layer

creates a spatial layer named MyLayer of type SimplePoint.

Adding Nodes to Spatial Layer

Nodes are added to the layer with the spatialAddNode function:

CREATE (n2:MyLayer {lat: 34.221, lon: 46.221, name: "Bob"}) WITH spatialAddNode(n2, "MyLayer")
CREATE (n3:MyLayer {lat: 35.321, lon: 52.341, name: "Olivia"}) WITH spatialAddNode(n3, "MyLayer")

Spatial querying

Initially the following spatial queries are exposed in Spatial Cypher:

  • Distance
  • Intersects

Distance: The Distance function calculates the distance between two geometries in a spatial layer using the GeometryEncoder specific to that layer. It takes as parameters two Nodes, two WKT strings, or a combination of a Node and a WKT string, followed by the layer name. This function can be used with Cypher pattern matching to find geometries within a certain distance:

MATCH (n:MyLayer) WHERE Distance(n, n2, "MyLayer") < 10 RETURN n

The query above will return all Nodes in the MyLayer spatial layer within 10 km of n2 (n2 must already be bound earlier in the query).
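For intuition about what a "within 10 km" filter means for lat/lon points, here is a minimal Python sketch of a great-circle (haversine) distance. This is an assumption for illustration only: the tutorial does not say which formula the layer's GeometryEncoder uses, and the function name `haversine_km` and the Earth radius constant are hypothetical choices of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Bob and Olivia from the spatialAddNode examples, as (lat, lon)
bob = (34.221, 46.221)
olivia = (35.321, 52.341)

# Mirrors the "< 10" predicate in the Cypher query above
olivia_is_near_bob = haversine_km(*bob, *olivia) < 10
print(olivia_is_near_bob)
```

With these coordinates the two points are several hundred kilometres apart, so a 10 km filter centred on Bob would not be expected to include Olivia.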

Intersects: The Intersects boolean function returns TRUE if two geometries intersect. Like the Distance function, Intersects takes two Nodes, two WKT strings, or a combination of a Node and a WKT string, as well as the layer name:

MATCH (n:MyLayer) WHERE Intersects(n, "POINT(46.221 34.221)", "MyLayer") RETURN n.name as n

-or-

MATCH (n:MyLayer) WHERE Intersects(n, n2, "MyLayer") RETURN n.name

Functionality to be implemented

A number of additional spatial query functions exist in the Spatial plugin, but have not yet been exposed as part of this Spatial Cypher project. This includes:

  • Contain
  • Cover
  • Covered By
  • Cross
  • Disjoint
  • Intersect Window
  • Overlap
  • Touch
  • Within
  • Within Distance

Also, the only layer types currently exposed in Spatial Cypher are SimplePoint and WKT. These layer types are defined and configured in this fork of Spatial, specifically the RegisteredLayerType class in SpatialDatabaseService. Additional layer types can be defined there.

Demo app: Finding Businesses by category within a polygon

spatialcypherdemo.herokuapp.com

Using data from the Yelp Academic Dataset we will build a Neo4j Spatial powered web application to help users find businesses on a map. Our main requirements are:

  1. The user should be able to select a business category they would like to search
  2. The user should be able to draw a polygon on a map and search only within that polygon
  3. The results should be displayed on a map

Map demo screenshot

The data

Yelp provides data on businesses, users, check-ins, and reviews. However, we are only interested in businesses. The data is in JSON format, with each business object defined like this:

{
  "business_id": "yZXEELxi8KMwzXCHP345GQ", 
  "full_address": "2550 S Rainbow Blvd\nSte  W1\nWestside\nLas Vegas, NV 89146", 
  "hours": {
    "Monday": {"close": "22:00", "open": "11:00"}, 
    "Tuesday": {"close": "22:00", "open": "11:00"}, 
    "Friday": {"close": "22:00", "open": "11:00"}, 
    "Wednesday": {"close": "22:00", "open": "11:00"}, 
    "Thursday": {"close": "22:00", "open": "11:00"}, 
    "Sunday": {"close": "22:00", "open": "11:00"}, 
    "Saturday": {"close": "22:00", "open": "11:00"}
  }, 
  "open": true, 
  "categories": ["Bars", "Turkish", "Mediterranean", "Nightlife", "Lounges", "Restaurants"], 
  "city": "Las Vegas", 
  "review_count": 18, 
  "name": "Maza Mediterranean Grill & Lounge", 
  "neighborhoods": ["Westside"], 
  "longitude": -115.2419755, 
  "state": "NV", 
  "stars": 4.5, 
  "latitude": 36.143396799999998,
  "attributes": {
    "Alcohol": "full_bar", 
    "Noise Level": "quiet", 
    "Music": {"dj": false}, 
    "Attire": "casual", 
    "Ambience": {
      "romantic": false, 
      "intimate": false, 
      "classy": false, 
      "hipster": false, 
      "divey": false, 
      "touristy": false, 
      "trendy": false, 
      "upscale": false, 
      "casual": false
    }, 
    "Good for Kids": true, 
    "Wheelchair Accessible": true, 
    "Good For Dancing": true, 
    "Delivery": true, 
    "Coat Check": false, 
    "Smoking": "outdoor", 
    "Accepts Credit Cards": true, 
    "Take-out": true, 
    "Price Range": 2, 
    "Happy Hour": false, 
    "Outdoor Seating": false, 
    "Takes Reservations": true, 
    "Waiter Service": true, 
    "Wi-Fi": "free", 
    "Caters": true, 
    "Good For": {
      "dessert": false, 
      "latenight": false, 
      "lunch": false, 
      "dinner": false, 
      "breakfast": false, 
      "brunch": false
    }, 
    "Parking": {
      "garage": false, 
      "street": false, 
      "validated": false, 
      "lot": true, 
      "valet": false
    }, 
    "Has TV": true, 
    "Good For Groups": true
  },
  "type": "business"
}

As you can see, this data provides quite a lot of information about each business. For our purposes we are only interested in the following attributes:

  1. business_id - we will want some unique way to refer to each business
  2. name - we will need to display this to the user
  3. categories - we will need to keep track of business categories
  4. latitude
  5. longitude

Data Model

Data model

Here we can see the data model we are dealing with. The blue nodes indicate those added by the Spatial plug-in that implement an in-graph RTree. The yellow nodes show our business and category nodes.

Creating a spatial layer

We will use Neo4j Spatial's spatial index to perform fast geospatial searches. To do this we need to create a spatial layer and then add nodes into that layer. Since our geospatial geometries are all points (lat,lon) we can use the SimplePoint type:

CREATE (n1:Business) WITH spatialCreateLayer("Business","SimplePoint") as layer

This Cypher statement uses the spatialCreateLayer function to create a Spatial layer called Business of the type SimplePoint. The SimplePoint layer defines the geometry for each node to be a POINT with lat and lon properties on each node defining the point.

Adding nodes to the layer

Now that we have created the Spatial Layer we are ready to start adding business data. For each business in the Yelp data we execute the following Cypher query:

CREATE (n:Business 
  {
    business_id: {business_id}, 
    lat: {latitude}, 
    lon: {longitude}, 
    name: {name}
  }) 
  WITH spatialAddNode(n, "Business")

This will create a Node for each business and add it to our Business spatial layer.
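The parameters in braces have to be supplied once per business. A minimal Python sketch of building that parameter map is shown here; `create_params` is a hypothetical helper, and how the query and parameters are sent to Neo4j depends on the client library in use, which this sketch deliberately leaves out.

```python
# The CREATE query from above, kept as a constant for reuse
CREATE_BUSINESS = (
    "CREATE (n:Business {business_id: {business_id}, lat: {latitude}, "
    "lon: {longitude}, name: {name}}) "
    'WITH spatialAddNode(n, "Business")'
)


def create_params(business):
    """Map a Yelp business record onto the parameters of CREATE_BUSINESS."""
    return {
        "business_id": business["business_id"],
        # SimplePoint layers read coordinates from lat/lon, so make sure
        # the values are numeric rather than strings
        "latitude": float(business["latitude"]),
        "longitude": float(business["longitude"]),
        "name": business["name"],
    }


business = {
    "business_id": "yZXEELxi8KMwzXCHP345GQ",
    "name": "Maza Mediterranean Grill & Lounge",
    "latitude": 36.1433968,
    "longitude": -115.2419755,
}
params = create_params(business)
print(params)
```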

We also need to add Categories. We could accomplish this with a Cypher FOREACH, but to be more explicit we'll just iterate over each Business and Category and execute the following query for each pair:

MATCH (b:Business {business_id: {business_id} })
MERGE (c:Category {name: {cat_name} })
CREATE UNIQUE (b)-[:IS_IN]->(c)

After this we'll have a data model consistent with the image above.
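The iteration over each Business and Category can be sketched in Python as a generator that yields one parameter map per pair, matching the `{business_id}` and `{cat_name}` parameters of the query above. The name `category_params` is hypothetical, and executing the query is again left to whichever Neo4j client is used.

```python
def category_params(businesses):
    """Yield one parameter map per (business, category) pair."""
    for business in businesses:
        for cat_name in business["categories"]:
            yield {"business_id": business["business_id"], "cat_name": cat_name}


businesses = [
    {"business_id": "yZXEELxi8KMwzXCHP345GQ", "categories": ["Bars", "Turkish"]},
    {"business_id": "example-id", "categories": []},  # hypothetical record with no categories
]
pairs = list(category_params(businesses))
for params in pairs:
    print(params)
```

Because the query uses MERGE for the Category node, running it once per pair creates each category only once, no matter how many businesses share it.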

Querying for businesses using the spatial index

We would like to query for businesses within a user-defined polygon and list all of the businesses that belong to a certain category. We can do that with a query like this:

MATCH (n:Business) WHERE Intersects(n, "POLYGON ((33.487 -112.038, 33.480 -112.039, 33.480 -112.030, 33.487 -112.0297, 33.487 -112.038))", "Business")
MATCH (n)-[:IS_IN]->(c:Category {name: "Restaurants"})
RETURN n

Note that we're using the Spatial Cypher Intersects function to find businesses within a polygon. We then use Cypher pattern matching to find businesses within a certain category.
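The polygon string has to be built from whatever vertices the user draws. A minimal Python sketch follows, assuming the vertices arrive as an ordered list of coordinate pairs; `polygon_wkt` is a hypothetical helper. It writes each pair in the order given, so the caller is responsible for supplying coordinates in the same order as the query above.

```python
def polygon_wkt(vertices):
    """Build a WKT POLYGON string from an ordered list of coordinate pairs."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    ring = list(vertices)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT ring must end where it starts
    coords = ", ".join(f"{a} {b}" for a, b in ring)
    return f"POLYGON (({coords}))"


# The four corners used in the query above, in the same coordinate order
drawn = [
    (33.487, -112.038),
    (33.480, -112.039),
    (33.480, -112.030),
    (33.487, -112.0297),
]
print(polygon_wkt(drawn))
```

The resulting string can then be passed as the second argument to Intersects, ideally as a query parameter rather than by string concatenation.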

Adding a web frontend using Mapbox

A simple web frontend is used to demonstrate these queries. Mapbox tiles are used to render the map, and the Leaflet draw plugin allows the user to draw a polygon on the map. Check it out here: http://spatialcypherdemo.herokuapp.com/