forked from apache/sedona

A Cluster Computing System for Processing Large-Scale Spatial Data


imdany/GeoSpark

 
 


GeoSpark Logo

| Stable | Latest | Source code |
|---|---|---|
| Maven Central | Sonatype Nexus (Snapshots) | Build Status |

GeoSpark@Twitter || GeoSpark Discussion Board || Join the chat at https://gitter.im/geospark-datasys/Lobby

GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark / SparkSQL with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) / SpatialSQL that efficiently load, process, and analyze large-scale spatial data across machines.

GeoSpark contains several modules:

| Name | API | Spark compatibility | Introduction |
|---|---|---|---|
| Core | RDD | Spark 2.X/1.X | SpatialRDDs and Query Operators. |
| SQL | SQL/DataFrame | SparkSQL 2.1+ | SQL interfaces for GeoSpark core. |
| Viz | RDD, SQL/DataFrame | RDD - Spark 2.X/1.X, SQL - Spark 2.1+ | Visualization for Spatial RDD and DataFrame. |
| Zeppelin | Apache Zeppelin | Spark 2.1+, Zeppelin 0.8.1+ | GeoSpark plugin for Apache Zeppelin. |
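
For readers new to spatial query operators, the following plain-Python sketch illustrates the general idea behind a partitioned spatial range query: points are grouped into grid cells (each cell standing in for one data partition), partitions that cannot overlap the query window are skipped, and the remaining points are filtered exactly. This is not GeoSpark's API. The names `grid_cell`, `partition_points`, and `range_query` are hypothetical, and the uniform grid is an assumption chosen for brevity; GeoSpark's own partitioning and indexing may differ.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[int, int]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def grid_cell(p: Point, cell_size: float) -> Cell:
    """Map a point to the integer grid cell that contains it."""
    return (int(p[0] // cell_size), int(p[1] // cell_size))


def partition_points(points: List[Point], cell_size: float) -> Dict[Cell, List[Point]]:
    """Group points by grid cell; each cell stands in for one partition."""
    parts: Dict[Cell, List[Point]] = defaultdict(list)
    for p in points:
        parts[grid_cell(p, cell_size)].append(p)
    return dict(parts)


def range_query(parts: Dict[Cell, List[Point]], window: Box, cell_size: float) -> List[Point]:
    """Return all points inside the window (inclusive), sorted."""
    min_x, min_y, max_x, max_y = window
    lo = grid_cell((min_x, min_y), cell_size)
    hi = grid_cell((max_x, max_y), cell_size)
    hits: List[Point] = []
    for cell, pts in parts.items():
        # Skip partitions whose cell lies outside the window's cell range.
        if not (lo[0] <= cell[0] <= hi[0] and lo[1] <= cell[1] <= hi[1]):
            continue
        # Exact check on the surviving candidates.
        hits.extend(
            p for p in pts
            if min_x <= p[0] <= max_x and min_y <= p[1] <= max_y
        )
    return sorted(hits)


points = [(0.5, 0.5), (1.5, 1.5), (2.5, 0.5), (9.0, 9.0)]
parts = partition_points(points, cell_size=1.0)
result = range_query(parts, window=(0.0, 0.0, 2.0, 2.0), cell_size=1.0)
print(result)
```

In a cluster setting, each partition would live on a different machine, so pruning whole partitions before the exact check is what keeps the query from touching all of the data.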

GeoSpark supports several programming languages: Scala, Java, SQL, Python and R.

Please visit the GeoSpark website for detailed documentation.

News!

Original Contributors

  • (Mo)hamed Sarwat (Twitter: @MoSarwat)
  • Jia Yu

Impact

GeoSpark Downloads on Maven Central

The GeoSpark ecosystem has around 10K downloads per month.


Languages

  • Java 58.4%
  • Python 21.0%
  • Scala 16.2%
  • Jupyter Notebook 4.1%
  • Other 0.3%