Custom Spark RDD that partitions geospatial data based on spatial proximity, for faster Orthogonal Range Query

codekrypt-dev/spatial-spark-rdd

 
 

Spatial RDD

Custom Spark RDD to partition geospatial data, based on spatial proximity, for faster orthogonal range queries.

What is this fork about?

This fork modifies the Spatial RDD to partition the dataset using a KD-tree and epsilon-approximation, following the approach in "Parallel Algorithms for Constructing Range and Nearest-Neighbor Searching Data Structures".
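
To make the partitioning idea concrete, here is a minimal sketch in plain Python (no Spark). It is not the code in this repository: it stands in for epsilon-approximation with a simple uniform random sample, builds KD-style median splits on that sample, and then routes every point to a leaf partition. The names `build_partition_tree` and `partition_of`, the sample size, and the tree depth are all assumptions made for illustration.

```python
import random
from itertools import count

def build_partition_tree(sample, max_depth, depth=0, ids=None):
    """Split a sample of 2D points at the median, alternating x and y.

    Returns ("leaf", partition_id) or ("node", axis, split, left, right).
    """
    ids = count() if ids is None else ids
    if depth == max_depth or len(sample) < 2:
        return ("leaf", next(ids))
    axis = depth % 2  # 0 = x, 1 = y
    pts = sorted(sample, key=lambda p: p[axis])
    split = pts[len(pts) // 2][axis]
    left = [p for p in pts if p[axis] < split]
    right = [p for p in pts if p[axis] >= split]
    return ("node", axis, split,
            build_partition_tree(left, max_depth, depth + 1, ids),
            build_partition_tree(right, max_depth, depth + 1, ids))

def partition_of(tree, point):
    """Walk the split tree and return the partition id for a point."""
    while tree[0] == "node":
        _, axis, split, left, right = tree
        tree = left if point[axis] < split else right
    return tree[1]

# Hypothetical data: 10,000 random points, splits learned from a sample of 500.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(10_000)]
sample = rng.sample(points, 500)
tree = build_partition_tree(sample, max_depth=3)  # up to 2**3 = 8 partitions

sizes = {}
for p in points:
    pid = partition_of(tree, p)
    sizes[pid] = sizes.get(pid, 0) + 1
print(sizes)
```

Because the splits come from a sample rather than the full dataset, partition sizes are only approximately balanced. In return, the split tree is small enough to build in one place and share with every worker.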

Note

  1. We chose to implement the KD-tree for 2D points.
  2. We use the KD-tree for both primary partitioning and secondary indexing.
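
The sketch below illustrates the two notes above with a 2D KD-tree and an orthogonal range query, written in plain Python. It is an illustration under assumptions, not this repository's implementation: `Node`, `build`, and `range_query` are hypothetical names, and the tree is meant to resemble the kind of local index that could be kept inside a single partition.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]

@dataclass
class Node:
    point: Point
    axis: int  # 0 = split on x, 1 = split on y
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a KD-tree by splitting at the median, alternating axes."""
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))

def range_query(node: Optional[Node], lo: Point, hi: Point,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Collect points inside the axis-aligned rectangle [lo, hi] (inclusive)."""
    out = [] if out is None else out
    if node is None:
        return out
    p, a = node.point, node.axis
    if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]:
        out.append(p)
    # The left subtree holds coordinates <= p[a], the right holds >= p[a],
    # so a side is skipped when the rectangle lies entirely on the other side.
    if lo[a] <= p[a]:
        range_query(node.left, lo, hi, out)
    if p[a] <= hi[a]:
        range_query(node.right, lo, hi, out)
    return out

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
index = build(points)
result = range_query(index, lo=(3, 1), hi=(8, 5))
print(sorted(result))  # [(5, 4), (7, 2), (8, 1)]
```

The pruning in `range_query` is what makes the index worthwhile: subtrees whose region cannot intersect the query rectangle are never visited.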