Custom Spark RDD to partition geospatial data, based on spatial proximity, for faster orthogonal range queries.
This fork modifies the Spatial RDD to partition the dataset using a KD tree and epsilon approximation, based on "Parallel Algorithms for Constructing Range and Nearest-Neighbor Searching Data Structures".
- The KD tree is implemented for 2D points.
- Both primary partitioning and secondary indexing use the KD tree.
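
To illustrate the idea, here is a minimal plain-Python sketch (not this fork's Spark code) of KD-tree partitioning for 2D points and partition pruning for an orthogonal range query. It assumes a uniform random sample can stand in for the epsilon approximation. All names (`Node`, `build`, `partition_of`, `partitions_for`, `range_query`) are hypothetical, and the linear scan inside each partition is a placeholder for the secondary KD-tree index.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    axis: int = 0                    # 0 splits on x, 1 splits on y
    split: float = 0.0               # points with coordinate < split go left
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    partition_id: int = -1           # >= 0 only on leaves


def build(sample, depth, max_depth, counter):
    """Split the sample at its median, alternating axes, down to max_depth."""
    if depth == max_depth or len(sample) <= 1:
        leaf = Node(partition_id=counter[0])
        counter[0] += 1
        return leaf
    axis = depth % 2
    ordered = sorted(sample, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(axis, ordered[mid][axis],
                build(ordered[:mid], depth + 1, max_depth, counter),
                build(ordered[mid:], depth + 1, max_depth, counter))


def partition_of(root, p):
    """Walk from the root to the leaf whose region contains point p."""
    node = root
    while node.partition_id < 0:
        node = node.left if p[node.axis] < node.split else node.right
    return node.partition_id


def partitions_for(root, box):
    """Return ids of partitions whose region can overlap box (xmin, ymin, xmax, ymax)."""
    lo, hi = (box[0], box[1]), (box[2], box[3])
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.partition_id >= 0:
            found.append(node.partition_id)
            continue
        if lo[node.axis] < node.split:
            stack.append(node.left)
        if hi[node.axis] >= node.split:
            stack.append(node.right)
    return sorted(found)


def range_query(root, partitions, box):
    """Scan only the candidate partitions; a local index would replace this scan."""
    hits = []
    for pid in partitions_for(root, box):
        hits.extend(p for p in partitions.get(pid, [])
                    if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3])
    return hits


random.seed(7)
points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(2000)]
sample = random.sample(points, 200)      # assumed stand-in for the epsilon approximation
root = build(sample, 0, 3, [0])          # depth 3 gives up to 8 partitions

partitions = {}
for p in points:
    partitions.setdefault(partition_of(root, p), []).append(p)

box = (20.0, 20.0, 40.0, 35.0)
hits = range_query(root, partitions, box)
```

Because the tree is built from the sample only, the split lines can be computed cheaply and then shared with every worker. Each point is routed by `partition_of`, and a query touches only the partitions returned by `partitions_for`.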