Merge pull request #493 from jbouffard/bug-fix/doc-links-fix
Links and Formatting Fixes for the Docs
Jacob Bouffard committed Sep 13, 2017
2 parents 686ad28 + c6bb133 commit 00f5331
Showing 5 changed files with 250 additions and 263 deletions.
82 changes: 41 additions & 41 deletions docs/guides/catalog.rst
@@ -1,59 +1,56 @@
Catalog
=======

The ``catalog`` module allows for users to retrieve information, query,
and write to/from GeoTrellis layers.

Before beginning, all examples in this guide need the following boilerplate
code:

.. code::
curl -o /tmp/cropped.tif https://s3.amazonaws.com/geopyspark-test/example-files/cropped.tif
.. code:: python3
import datetime
import geopyspark as gps
import numpy as np
from pyspark import SparkContext
from shapely.geometry import MultiPolygon, box
.. code:: python3
!curl -o /tmp/cropped.tif https://s3.amazonaws.com/geopyspark-test/example-files/cropped.tif
.. code:: python3
conf = gps.geopyspark_conf(master="local[*]", appName="layers")
pysc = SparkContext(conf=conf)
.. code:: python3
# Setting up the Spatial Data to be used in this example
spatial_raster_layer = gps.geotiff.get(layer_type=gps.LayerType.SPATIAL, uri="/tmp/cropped.tif")
spatial_tiled_layer = spatial_raster_layer.tile_to_layout(layout=gps.GlobalLayout(), target_crs=3857)
.. code:: python3
# Setting up the Spatial-Temporal Data to be used in this example
def make_raster(x, y, v, cols=4, rows=4, crs=4326):
    cells = np.zeros((1, rows, cols), dtype='float32')
    cells.fill(v)
    # extent of a single cell is 1
    extent = gps.TemporalProjectedExtent(extent = gps.Extent(x, y, x + cols, y + rows),
                                         epsg=crs,
                                         instant=datetime.datetime.now())
    return (extent, gps.Tile.from_numpy_array(cells))
layer = [
    make_raster(0, 0, v=1),
    make_raster(3, 2, v=2),
    make_raster(6, 0, v=3)
]
rdd = pysc.parallelize(layer)
space_time_raster_layer = gps.RasterLayer.from_numpy_rdd(gps.LayerType.SPACETIME, rdd)
space_time_tiled_layer = space_time_raster_layer.tile_to_layout(layout=gps.GlobalLayout(tile_size=5))
space_time_pyramid = space_time_tiled_layer.pyramid()
Catalog
=======

The ``catalog`` module allows for users to retrieve information, query,
and write to/from GeoTrellis layers.
What is a Catalog?
------------------
@@ -130,9 +127,10 @@ backend and the API of GeoPySpark.
Saving Data to a Backend
------------------------

The ``write`` function will save a given ``TiledRasterLayer`` to a
specified backend. If the catalog does not exist when calling this
function, then it will be created along with the saved layer.
The :meth:`~geopyspark.geotrellis.catalog.write` function will save a
given :class:`~geopyspark.geotrellis.layer.TiledRasterLayer` to a specified
backend. If the catalog does not exist when calling this function, then it
will be created along with the saved layer.

**Note**: It is not possible to save a layer to a catalog if the layer
name and zoom already exist. If you wish to overwrite an existing, saved
@@ -185,9 +183,9 @@ units of time that can be used to space apart data in the catalog.
Saving a Pyramid
~~~~~~~~~~~~~~~~

For those that are unfamiliar with the ``Pyramid`` class, please see the
[Pyramid section] of the visualization guide. Otherwise, please continue
on.
For those that are unfamiliar with the :class:`~geopyspark.geotrellis.layer.Pyramid`
class, please see the [Pyramid section] of the visualization guide. Otherwise,
please continue on.

As of right now, there is no way to directly save a ``Pyramid``.
However, because a ``Pyramid`` is just a collection of
@@ -208,10 +206,10 @@ through the layers of the ``Pyramid`` and save one individually.
Reading Metadata From a Saved Layer
-----------------------------------

It is possible to retrieve the ``Metadata`` for a layer without reading
in the whole layer. This is done using the ``read_layer_metadata``
function. There is no difference between spatial and spatial-temporal
layers when using this function.
It is possible to retrieve the :class:`~geopyspark.geotrellis.Metadata` for a layer
without reading in the whole layer. This is done using the
:meth:`~geopyspark.geotrellis.catalog.read_layer_metadata` function.
There is no difference between spatial and spatial-temporal layers when using this function.

.. code:: python3
@@ -229,8 +227,9 @@ Reading a Tile From a Saved Layer
---------------------------------

One can read a single tile that has been saved to a layer using the
``read_value`` function. This will either return a ``Tile`` or ``None``
depending on whether or not the specified tile exists.
:meth:`~geopyspark.geotrellis.catalog.read_value` function. This will either
return a :class:`~geopyspark.geotrellis.Tile` or ``None`` depending on whether
or not the specified tile exists.

Reading a Tile From a Saved, Spatial Layer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -239,7 +238,7 @@ Reading a Tile From a Saved, Spatial Layer
# The Tile being read will be the smallest key of the layer
min_key = spatial_tiled_layer.layer_metadata.bounds.minKey
gps.read_value(uri="file:///tmp/spatial-catalog",
layer_name="spatial-layer",
layer_zoom=11,
@@ -253,7 +252,7 @@ Reading a Tile From a Saved, Spatial-Temporal Layer
# The Tile being read will be the largest key of the layer
max_key = space_time_tiled_layer.layer_metadata.bounds.maxKey
gps.read_value(uri="file:///tmp/spacetime-catalog",
layer_name="spacetime-layer",
layer_zoom=7,
@@ -267,8 +266,9 @@ Reading a Layer
There are two ways one can read a layer in GeoPySpark: reading the
entire layer or just portions of it. The former will be the goal
discussed in this section. While all of the layer will be read, the
function for doing so is called, ``query``. There is no difference
between spatial and spatial-temporal layers when using this function.
function for doing so is called :meth:`~geopyspark.geotrellis.catalog.query`.
There is no difference between spatial and spatial-temporal layers when using
this function.

**Note**: What distinguishes between a full and partial read is the
parameters given to ``query``. If no filters were given, then the whole
@@ -296,7 +296,7 @@ One can query an area of a spatial layer that covers the region of
interest by providing a geometry that represents this region. This area
can be represented as: ``shapely.geometry`` (specifically ``Polygon``\ s
and ``MultiPolygon``\ s), the ``wkb`` representation of the geometry, or
an ``Extent``.
an :class:`~geopyspark.geotrellis.Extent`.

**Note**: It is important that the given geometry is in the same
projection as the queried layer. Otherwise, either the wrong area or
@@ -311,7 +311,7 @@ given are in the same projection.
.. code:: python3
layer_extent = spatial_tiled_layer.layer_metadata.extent
# Creates a Polygon from the cropped Extent of the Layer
poly = box(layer_extent.xmin+100, layer_extent.ymin+100, layer_extent.xmax-100, layer_extent.ymax-100)
@@ -351,7 +351,7 @@ the geometry is in.
# Because we queried the whole Extent of the layer, we should have gotten back the whole thing.
querried_extent = querried_spatial_layer.layer_metadata.layout_definition.extent
base_extent = spatial_tiled_layer.layer_metadata.layout_definition.extent
querried_extent == base_extent
Querying a Spatial-Temporal Layer
@@ -366,7 +366,7 @@ Querying by Time
.. code:: python3
min_key = space_time_tiled_layer.layer_metadata.bounds.minKey
# Returns a TiledRasterLayer whose keys intersect the given time interval.
# In this case, the entire layer will be read.
gps.query(uri="file:///tmp/spacetime-catalog",
62 changes: 32 additions & 30 deletions docs/guides/core-concepts.rst
@@ -1,10 +1,3 @@

.. code:: python3
import datetime
import numpy as np
import geopyspark as gps
Core Concepts
=============

@@ -14,12 +7,21 @@ terminology and data representations have carried over. This section
seeks to explain this jargon in addition to describing how GeoTrellis
types are represented in GeoPySpark.

Before beginning, all examples in this guide need the following boilerplate
code:

.. code:: python3
import datetime
import numpy as np
import geopyspark as gps
Rasters
-------

GeoPySpark differs in how it represents rasters from other geo-spatial
Python libraries like rasterio. In GeoPySpark, they are represented by
the ``Tile`` class. This class contains a numpy array (referred to as
the :class:`~geopyspark.geotrellis.Tile` class. This class contains a numpy array (referred to as
``cells``) that represents the cells of the raster in addition to other
information regarding the data. Along with ``cells``, ``Tile`` can also
have the ``no_data_value`` of the raster.
@@ -32,7 +34,7 @@ bands, even if the original raster just contained one.
arr = np.array([[[0, 0, 0, 0],
                 [1, 1, 1, 1],
                 [2, 2, 2, 2]]], dtype=np.int16)
# The resulting Tile will set -10 as the no_data_value for the raster
gps.Tile.from_numpy_array(numpy_array=arr, no_data_value=-10)
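As a rough illustration of the band-first layout described above, the following sketch uses plain nested lists in place of a numpy array, so it runs without GeoPySpark. The helper names ``shape`` and ``valid_values`` and the sample values are assumptions made for this example; they are not part of the GeoPySpark API.

```python
# Plain nested lists stand in for the numpy array. The layout is
# (bands, rows, cols), so a single-band raster still has an outer band axis.
NO_DATA = -10  # same sentinel value as the Tile example above

cells = [[[0, -10, 0, 0],
          [1, 1, -10, 1],
          [2, 2, 2, 2]]]


def shape(cells):
    """Return (bands, rows, cols) for a band-first nested list."""
    return (len(cells), len(cells[0]), len(cells[0][0]))


def valid_values(cells, no_data_value):
    """Flatten all bands and drop every cell equal to the no-data sentinel."""
    return [v for band in cells for row in band for v in row
            if v != no_data_value]


cell_shape = shape(cells)
valid = valid_values(cells, NO_DATA)
mean_valid = sum(valid) / len(valid)
```

Here two of the twelve cells hold the sentinel, so only ten values contribute to the mean. This is the kind of bookkeeping a ``no_data_value`` makes possible.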
@@ -47,7 +49,7 @@ Extent
Describes the area on Earth a raster represents. This area is
represented by coordinates that are in some Coordinate Reference System.
Thus, depending on the system in use, the values that outline the
``extent`` can vary. ``Extent`` can also be referred to as a *bounding
:class:`~geopyspark.geotrellis.Extent` can vary. ``Extent`` can also be referred to as a *bounding
box*.

**Note**: The values within the ``Extent`` must be ``float``\ s and not
@@ -61,27 +63,27 @@ box*.
ProjectedExtent
---------------

``ProjectedExtent`` describes both the area on Earth a raster represents
:class:`~geopyspark.geotrellis.ProjectedExtent` describes both the area on Earth a raster represents
in addition to its CRS. Either the EPSG code or a proj4 string can be
used to indicate the CRS of the ``ProjectedExtent``.

.. code:: python3
# Using an EPSG code
gps.ProjectedExtent(extent=extent, epsg=3857)
.. code:: python3
# Using a Proj4 String
proj4 = "+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +a=6378137 +b=6378137 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs "
gps.ProjectedExtent(extent=extent, proj4=proj4)
TemporalProjectedExtent
-----------------------

Similar to ``ProjectedExtent``, ``TemporalProjectedExtent`` describes
Similar to ``ProjectedExtent``, :class:`~geopyspark.geotrellis.TemporalProjectedExtent` describes
the area on Earth the raster represents, its CRS, and the time the data
represents. This point in time, called ``instant``, is an instance
of ``datetime.datetime``.
@@ -94,7 +96,7 @@ of ``datetime.datetime``.
TileLayout
----------

``TileLayout`` describes the grid which represents how rasters are
:class:`~geopyspark.geotrellis.TileLayout` describes the grid which represents how rasters are
organized and arranged in a layer. ``layoutCols`` and ``layoutRows``
detail how many columns and rows the grid itself has, respectively.
While ``tileCols`` and ``tileRows`` tell how many columns and rows each
@@ -103,14 +105,14 @@ individual raster has.
.. code:: python3
# Describes a layer where there are four rasters in a 2x2 grid. Each raster has 256 cols and rows.
tile_layout = gps.TileLayout(layoutCols=2, layoutRows=2, tileCols=256, tileRows=256)
tile_layout
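To make the arithmetic behind the four fields concrete, here is a minimal sketch that runs without GeoPySpark. ``TileLayoutSketch`` is a hypothetical stand-in that only mirrors the field names used above, and ``total_cells`` is a helper invented for this example.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the four fields named above; it is not
# the geopyspark class.
TileLayoutSketch = namedtuple(
    "TileLayoutSketch", "layoutCols layoutRows tileCols tileRows")


def total_cells(layout):
    """Return (cols, rows) of cells covered by the whole grid."""
    return (layout.layoutCols * layout.tileCols,
            layout.layoutRows * layout.tileRows)


layout = TileLayoutSketch(layoutCols=2, layoutRows=2, tileCols=256, tileRows=256)
tile_count = layout.layoutCols * layout.layoutRows  # rasters in the grid
grid_cols, grid_rows = total_cells(layout)          # cells across the grid
```

For the 2x2 layout of 256x256 rasters, this gives four rasters covering a 512 by 512 grid of cells.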
LayoutDefinition
----------------

``LayoutDefinition`` describes both how the rasters are organized in a
:class:`~geopyspark.geotrellis.LayoutDefinition` describes both how the rasters are organized in a
layer as well as the area covered by the grid.
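One way to see why pairing an extent with a tile layout is useful: together they fix the area each grid position covers. The sketch below works that out with hypothetical stand-in types (``ExtentSketch``, ``TileLayoutSketch``) and a made-up helper, ``key_extent``. It assumes row 0 is the top row of the grid and does not call GeoPySpark.

```python
from collections import namedtuple

# Hypothetical stand-ins; they only mirror field names used in this guide.
ExtentSketch = namedtuple("ExtentSketch", "xmin ymin xmax ymax")
TileLayoutSketch = namedtuple(
    "TileLayoutSketch", "layoutCols layoutRows tileCols tileRows")


def key_extent(extent, layout, col, row):
    """Area covered by the raster at (col, row), assuming row 0 is the top row."""
    tile_width = (extent.xmax - extent.xmin) / layout.layoutCols
    tile_height = (extent.ymax - extent.ymin) / layout.layoutRows
    xmin = extent.xmin + col * tile_width
    ymax = extent.ymax - row * tile_height
    return ExtentSketch(xmin, ymax - tile_height, xmin + tile_width, ymax)


grid_extent = ExtentSketch(0.0, 0.0, 100.0, 100.0)
layout = TileLayoutSketch(layoutCols=2, layoutRows=2, tileCols=256, tileRows=256)

upper_left = key_extent(grid_extent, layout, col=0, row=0)
lower_right = key_extent(grid_extent, layout, col=1, row=1)
```

With a 100 by 100 extent split into a 2x2 grid, each raster covers a 50 by 50 area. The upper-left raster spans x 0 to 50 and y 50 to 100.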

.. code:: python3
@@ -129,7 +131,7 @@ produce a layout based on the data they are given.
LocalLayout
~~~~~~~~~~~

``LocalLayout`` is the first tiling strategy that produces a layout
:class:`~geopyspark.geotrellis.LocalLayout` is the first tiling strategy that produces a layout
where the grid is constructed over all of the pixels within a layer of a
given tile size. The resulting layout will match the original resolution
of the cells within the rasters.
@@ -156,7 +158,7 @@ performed.**
GlobalLayout
~~~~~~~~~~~~

The other tiling strategy is ``GlobalLayout`` which makes a layout where
The other tiling strategy is :class:`~geopyspark.geotrellis.GlobalLayout` which makes a layout where
the grid is constructed over the global extent CRS. The cell resolution
of the resulting layer will be multiplied by a power of 2 for the CRS. Thus,
using this strategy will result in either up or down sampling of the
@@ -184,7 +186,7 @@ level, then the ``zoom`` parameter must be set.
SpatialKey
----------

``SpatialKey``\ s describe the positions of rasters within the grid of
:class:`~geopyspark.geotrellis.SpatialKey`\ s describe the positions of rasters within the grid of
the layout. This grid is a 2D plane where the location of a raster is
represented by a pair of coordinates, ``col`` and ``row``, respectively.
As its name and attributes suggest, ``SpatialKey`` deals solely with
@@ -197,7 +199,7 @@ spatial data.
SpaceTimeKey
------------

Like ``SpatialKey``\ s, ``SpaceTimeKey``\ s describe the position of a
Like ``SpatialKey``\ s, :class:`~geopyspark.geotrellis.SpaceTimeKey`\ s describe the position of a
raster in a layout. However, the grid is a 3D plane where a location of
a raster is represented by a pair of coordinates, ``col`` and ``row``,
as well as a z value that represents a point in time called,
@@ -212,7 +214,7 @@ deal with spatial-temporal data.
Bounds
------

``Bounds`` represents the extent of the layout grid in terms of
:class:`~geopyspark.geotrellis.Bounds` represents the extent of the layout grid in terms of
keys. It has both ``minKey`` and ``maxKey`` attributes. These can
either be a ``SpatialKey`` or a ``SpaceTimeKey`` depending on the type
of data within the layer. The ``minKey`` is the left, uppermost cell in
@@ -221,34 +223,34 @@ the grid and the ``maxKey`` is the right, bottommost cell.
.. code:: python3
# Creating a Bounds from SpatialKeys
min_spatial_key = gps.SpatialKey(0, 0)
max_spatial_key = gps.SpatialKey(10, 10)
bounds = gps.Bounds(min_spatial_key, max_spatial_key)
bounds
.. code:: python3
# Creating a Bounds from SpaceTimeKeys
min_space_time_key = gps.SpaceTimeKey(0, 0, 1.0)
max_space_time_key = gps.SpaceTimeKey(10, 10, 1.0)
gps.Bounds(min_space_time_key, max_space_time_key)
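The relationship between the two corner keys and the grid they enclose can be sketched in plain Python. ``SpatialKeySketch`` and ``keys_within`` are hypothetical names used only for this illustration, and the sketch assumes both corner keys are included in the bounds.

```python
from collections import namedtuple
from itertools import product

# Hypothetical stand-in with the same two fields as a spatial key.
SpatialKeySketch = namedtuple("SpatialKeySketch", "col row")


def keys_within(min_key, max_key):
    """List every key from the upper-left to the lower-right corner.

    Both corners are assumed to be inclusive; keys are produced row by row.
    """
    rows = range(min_key.row, max_key.row + 1)
    cols = range(min_key.col, max_key.col + 1)
    return [SpatialKeySketch(col, row) for row, col in product(rows, cols)]


keys = keys_within(SpatialKeySketch(0, 0), SpatialKeySketch(10, 10))
```

Under that inclusive assumption, bounds running from (0, 0) to (10, 10) describe an 11 by 11 grid, or 121 keys in total.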
Metadata
--------

``Metadata`` contains information about the values within a layer. This
:class:`~geopyspark.geotrellis.Metadata` contains information about the values within a layer. This
data pertains to the layout, projection, and extent of the data
contained within the layer.

The below example shows how to construct ``Metadata`` by hand; however,
this is almost never required and ``Metadata`` can be produced using
easier means. For ``RasterLayer``, one call the method,
``collect_metadata()`` and ``TiledRasterLayer`` has the attribute,
``layer_metadata``.
easier means. For ``RasterLayer``, one can call the
:meth:`~geopyspark.geotrellis.layer.RasterLayer.collect_metadata` method, and
``TiledRasterLayer`` has the ``layer_metadata`` attribute.

.. code:: python3