Merge pull request #127 from jbouffard/performance-master
Performance Refactor
Showing 90 changed files with 4,500 additions and 4,343 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -100,3 +100,4 @@ geopyspark/jars/*.jar | |
|
||
# Unit test performance results | ||
prof/ | ||
.ensime_cache |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
GeoPySpark | ||
*********** | ||
.. image:: https://travis-ci.org/locationtech-labs/geopyspark.svg?branch=master | ||
:target: https://travis-ci.org/locationtech-labs/geopyspark | ||
|
||
``GeoPySpark`` provides Python bindings for working with geospatial data using `PySpark <http://spark.apache.org/docs/latest/api/python/pyspark.html>`_. | ||
It will provide interfaces into GeoTrellis and GeoMesa LocationTech frameworks. | ||
It is currently under development, and has just entered alpha. | ||
|
||
Currently, only functionality from GeoTrellis is supported. Support for the | ||
GeoMesa LocationTech framework will be added at a later date. | ||
|
||
Contact and Support | ||
-------------------- | ||
|
||
If you need help, have questions, or would like to talk to the developers (let us | ||
know what you're working on!), you can contact us at: | ||
|
||
* `Gitter <https://gitter.im/geotrellis/geotrellis>`_ | ||
* `Mailing list <https://locationtech.org/mailman/listinfo/geotrellis-user>`_ | ||
|
||
As you may have noticed, the links above point to the GeoTrellis Gitter | ||
channel and mailing list. This is because this project is currently an | ||
offshoot of GeoTrellis, so we use their mailing list and Gitter channel | ||
as a means of contact. However, we will form our own if there is | ||
a need for it. | ||
|
||
Setup | ||
------ | ||
|
||
GeoPySpark Requirements | ||
^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
============ ============ | ||
Requirement Version | ||
============ ============ | ||
Java >=1.8 | ||
Scala 2.11.8 | ||
Python 3.3 - 3.5 | ||
Hadoop >=2.0.1 | ||
============ ============ | ||
|
||
Java 8 and Scala 2.11 are needed for GeoPySpark to work, as they are required | ||
by GeoTrellis. In addition, Spark needs to be installed and configured with the | ||
``SPARK_HOME`` environment variable set. | ||
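The Python row of the requirements table can also be checked from code. The following is a minimal sketch, not part of GeoPySpark: the names ``MIN_PYTHON``, ``MAX_PYTHON`` and ``python_supported`` are hypothetical, and the bounds are simply copied from the table above.

```python
import sys

# Supported range copied from the requirements table (Python 3.3 - 3.5).
# These constants are illustrative and are not defined by GeoPySpark.
MIN_PYTHON = (3, 3)
MAX_PYTHON = (3, 5)


def python_supported(version_info=None):
    """Return True if the (major, minor) version lies in the supported range."""
    info = sys.version_info if version_info is None else version_info
    # Compare only (major, minor); patch releases do not matter here.
    return MIN_PYTHON <= tuple(info[:2]) <= MAX_PYTHON
```

Calling ``python_supported()`` with no argument checks the running interpreter; passing a tuple such as ``(3, 4, 0)`` checks an arbitrary version.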
|
||
You can test to see if Spark is installed properly by running the following in the | ||
terminal: | ||
|
||
.. code:: console | ||
|
||
   > echo $SPARK_HOME | ||
   /usr/local/bin/spark | ||
|
||
If the output is a path leading to your Spark folder, then Spark has been | ||
configured correctly. | ||
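The same check can be done from Python before importing anything that depends on Spark. This is a rough sketch under the assumption that a usable installation is one where ``SPARK_HOME`` is set and names an existing directory; the helper ``find_spark_home`` is hypothetical and not part of GeoPySpark.

```python
import os


def find_spark_home(environ=None):
    """Return the SPARK_HOME path if it is set and is a directory, else None."""
    # Accept a mapping for testing; default to the real process environment.
    env = os.environ if environ is None else environ
    spark_home = env.get("SPARK_HOME")
    if spark_home and os.path.isdir(spark_home):
        return spark_home
    return None


if __name__ == "__main__":
    path = find_spark_home()
    if path is None:
        print("SPARK_HOME is unset or does not point to a directory")
    else:
        print("Spark found at", path)
```

This only confirms that the variable points at a directory; it does not verify that the directory holds a working Spark installation.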
|
||
How to Install | ||
^^^^^^^^^^^^^^^ | ||
|
||
Before installing, check the above table to make sure that the | ||
requirements are met. | ||
|
||
To install via ``pip``, open a terminal and run the following: | ||
|
||
.. code:: console | ||
|
||
   pip install geopyspark | ||
|
||
If you would rather install from source, you can do so by running the following | ||
in the terminal: | ||
|
||
.. code:: console | ||
|
||
   git clone https://github.com/locationtech-labs/geopyspark.git | ||
   cd geopyspark | ||
   make install | ||
|
||
This will assemble the backend ``jar`` that contains the Scala code, | ||
move it to the ``jars`` module, and then run the ``setup.py`` script. | ||
|
||
Make Targets | ||
^^^^^^^^^^^^ | ||
|
||
- **install** - install the ``GeoPySpark`` Python package locally | ||
- **wheel** - build a Python ``GeoPySpark`` wheel for distribution | ||
- **pyspark** - start a PySpark shell with the project jars | ||
- **docker-build** - build a Docker image for Jupyter with ``GeoPySpark`` | ||
|
||
Contributing | ||
------------ | ||
|
||
Feedback and contributions to GeoPySpark are always welcome. | ||
A CLA is required for contribution; see `Contributing <docs/contributing.rst>`_ for more | ||
information. | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line. | ||
SPHINXOPTS = | ||
SPHINXBUILD = sphinx-build | ||
SPHINXPROJ = GeoPySpark | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
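To make the catch-all rule concrete, the sketch below builds the argument list that the rule expands to for a given target, using the variable defaults from this Makefile. The function names are hypothetical; only ``sphinx-build -M`` itself comes from Sphinx.

```python
import subprocess


def sphinx_make_mode_command(target, sourcedir=".", builddir="_build",
                             sphinxbuild="sphinx-build", sphinxopts=()):
    """Build the argv list the catch-all rule would run for `make <target>`."""
    # Mirrors: $(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS)
    return [sphinxbuild, "-M", target, sourcedir, builddir, *sphinxopts]


def run_sphinx(target, **kwargs):
    """Run the command; needs Sphinx installed, so it is not called here."""
    return subprocess.run(sphinx_make_mode_command(target, **kwargs), check=True)
```

For example, ``make html`` corresponds to ``sphinx_make_mode_command("html")``, and extra options passed through ``SPHINXOPTS`` or ``O`` are appended at the end.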
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
Changelog | ||
========== | ||
|
||
0.1.0 | ||
------ | ||
|
||
The first release of GeoPySpark! After being in development for the past 5 | ||
months, it is now ready for its initial release! Since nothing has been changed | ||
or updated per se, we'll just go over the features that will be present in | ||
0.1.0. | ||
|
||
|
||
**geopyspark.geotrellis** | ||
|
||
- Create a ``RasterRDD`` from GeoTiffs that are stored locally, on S3, or on | ||
HDFS. | ||
- Serialize Python RDDs to Scala and back. | ||
- Perform various tiling operations such as ``tile_to_layout``, ``cut_tiles``, | ||
and ``pyramid``. | ||
- Stitch together a ``TiledRasterRDD`` to create one ``Raster``. | ||
- ``rasterize`` geometries and turn them into ``RasterRDD``. | ||
- ``reclassify`` values of Rasters in RDDs. | ||
- Calculate ``cost_distance`` on a ``TiledRasterRDD``. | ||
- Perform local and focal operations on ``TiledRasterRDD``. | ||
- Read, write, and query GeoTrellis tile layers. | ||
- Read tiles from a layer. | ||
|
||
**Documentation** | ||
|
||
- Added docstrings to all Python classes, methods, etc. | ||
- Core-Concepts. | ||
- Ingesting and creating a tile server with greyscale data. | ||
- Ingesting and creating a tile server with data from Sentinel. |
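The docs configuration in this commit enables ``sphinx.ext.napoleon`` with ``napoleon_google_docstring = True``, so docstrings in the Google layout are what autodoc will render. As an illustration of that layout only, here is a small made-up function; ``scale`` is hypothetical and does not exist in GeoPySpark.

```python
def scale(values, factor=2):
    """Multiply every value by a constant factor.

    Args:
        values (list of float): The numbers to scale.
        factor (float, optional): The multiplier. Defaults to 2.

    Returns:
        list of float: A new list with each value multiplied by ``factor``.
    """
    return [v * factor for v in values]
```

Napoleon converts the ``Args`` and ``Returns`` sections into reStructuredText fields before autodoc processes the docstring.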
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,162 @@ | ||
#!/usr/bin/env python3 | ||
# -*- coding: utf-8 -*- | ||
# | ||
# GeoPySpark documentation build configuration file, created by | ||
# sphinx-quickstart on Wed Apr 12 16:16:48 2017. | ||
# | ||
# This file is execfile()d with the current directory set to its | ||
# containing dir. | ||
# | ||
# Note that not all possible configuration values are present in this | ||
# autogenerated file. | ||
# | ||
# All configuration values have a default; values that are commented out | ||
# serve to show the default. | ||
|
||
# If extensions (or modules to document with autodoc) are in another directory, | ||
# add these directories to sys.path here. If the directory is relative to the | ||
# documentation root, use os.path.abspath to make it absolute, like shown here. | ||
# | ||
import os | ||
import sys | ||
from geopyspark.geopyspark_utils import setup_environment | ||
|
||
setup_environment() | ||
sys.path.insert(0, os.path.abspath('../geopyspark/')) | ||
|
||
|
||
# -- General configuration ------------------------------------------------ | ||
|
||
# If your documentation needs a minimal Sphinx version, state it here. | ||
# | ||
# needs_sphinx = '1.0' | ||
|
||
# Add any Sphinx extension module names here, as strings. They can be | ||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom | ||
# ones. | ||
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon'] | ||
|
||
napoleon_google_docstring = True | ||
|
||
# Add any paths that contain templates here, relative to this directory. | ||
templates_path = ['_templates'] | ||
|
||
# The suffix(es) of source filenames. | ||
# You can specify multiple suffixes as a list of strings: | ||
# | ||
# source_suffix = ['.rst', '.md'] | ||
source_suffix = '.rst' | ||
|
||
# The master toctree document. | ||
master_doc = 'index' | ||
|
||
# General information about the project. | ||
project = 'GeoPySpark' | ||
copyright = '2017, Jacob Bouffard, James McClean, Eugene Cheipesh' | ||
author = 'Jacob Bouffard, James McClean, Eugene Cheipesh' | ||
|
||
# The version info for the project you're documenting, acts as replacement for | ||
# |version| and |release|, also used in various other places throughout the | ||
# built documents. | ||
# | ||
# The short X.Y version. | ||
version = '0.1.0' | ||
# The full version, including alpha/beta/rc tags. | ||
release = '0.1.0' | ||
|
||
# The language for content autogenerated by Sphinx. Refer to documentation | ||
# for a list of supported languages. | ||
# | ||
# This is also used if you do content translation via gettext catalogs. | ||
# Usually you set "language" from the command line for these cases. | ||
language = None | ||
|
||
# List of patterns, relative to source directory, that match files and | ||
# directories to ignore when looking for source files. | ||
# These patterns also affect html_static_path and html_extra_path | ||
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] | ||
|
||
# The name of the Pygments (syntax highlighting) style to use. | ||
pygments_style = 'sphinx' | ||
|
||
# If true, `todo` and `todoList` produce output, else they produce nothing. | ||
todo_include_todos = False | ||
|
||
|
||
# -- Options for HTML output ---------------------------------------------- | ||
|
||
# The theme to use for HTML and HTML Help pages. See the documentation for | ||
# a list of builtin themes. | ||
# | ||
html_theme = 'alabaster' | ||
|
||
# Theme options are theme-specific and customize the look and feel of a theme | ||
# further. For a list of options available for each theme, see the | ||
# documentation. | ||
# | ||
# html_theme_options = {} | ||
|
||
# Add any paths that contain custom static files (such as style sheets) here, | ||
# relative to this directory. They are copied after the builtin static files, | ||
# so a file named "default.css" will overwrite the builtin "default.css". | ||
html_static_path = ['_static'] | ||
|
||
|
||
# -- Options for HTMLHelp output ------------------------------------------ | ||
|
||
# Output file base name for HTML help builder. | ||
htmlhelp_basename = 'GeoPySparkdoc' | ||
|
||
|
||
# -- Options for LaTeX output --------------------------------------------- | ||
|
||
latex_elements = { | ||
# The paper size ('letterpaper' or 'a4paper'). | ||
# | ||
# 'papersize': 'letterpaper', | ||
|
||
# The font size ('10pt', '11pt' or '12pt'). | ||
# | ||
# 'pointsize': '10pt', | ||
|
||
# Additional stuff for the LaTeX preamble. | ||
# | ||
# 'preamble': '', | ||
|
||
# Latex figure (float) alignment | ||
# | ||
# 'figure_align': 'htbp', | ||
} | ||
|
||
# Grouping the document tree into LaTeX files. List of tuples | ||
# (source start file, target name, title, | ||
# author, documentclass [howto, manual, or own class]). | ||
latex_documents = [ | ||
    (master_doc, 'GeoPySpark.tex', 'GeoPySpark Documentation', | ||
     'Jacob Bouffard, James McClean, Eugene Cheipesh', 'manual'), | ||
] | ||
|
||
|
||
# -- Options for manual page output --------------------------------------- | ||
|
||
# One entry per manual page. List of tuples | ||
# (source start file, name, description, authors, manual section). | ||
man_pages = [ | ||
    (master_doc, 'geopyspark', 'GeoPySpark Documentation', | ||
     [author], 1) | ||
] | ||
|
||
|
||
# -- Options for Texinfo output ------------------------------------------- | ||
|
||
# Grouping the document tree into Texinfo files. List of tuples | ||
# (source start file, target name, title, author, | ||
# dir menu entry, description, category) | ||
texinfo_documents = [ | ||
    (master_doc, 'GeoPySpark', 'GeoPySpark Documentation', | ||
     author, 'GeoPySpark', 'One line description of project.', | ||
     'Miscellaneous'), | ||
] | ||
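The generated comments in this file describe ``version`` as the short X.Y form and ``release`` as the full string, while both are set to ``'0.1.0'`` here. If the two were meant to differ, one possible approach is to derive the short form from the full one. This is a sketch only; ``short_version`` is a hypothetical helper, not something Sphinx or this commit provides.

```python
release = "0.1.0"  # full version string, same value as in conf.py


def short_version(full_release):
    """Return the X.Y prefix of an X.Y.Z[...] release string."""
    # Keep the first two dot-separated components and drop the rest.
    return ".".join(full_release.split(".")[:2])


version = short_version(release)
```

Deriving one value from the other keeps them from drifting apart when the release number is bumped.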
|
||
|
||
|