Continued work on geometry documentation (#33)
caspervdw committed Feb 12, 2020
1 parent 7a4e131 commit cda3576
Showing 13 changed files with 552 additions and 382 deletions.
2 changes: 1 addition & 1 deletion CHANGES.rst
@@ -7,7 +7,7 @@ Changelog of dask-geomodeling

- Added GeometryWKTSource.

- Reworked the docstrings of all rasterblocks.
- Updated all docstrings.

- Renamed the 'location' parameter of raster.misc.Step to 'value'.

119 changes: 51 additions & 68 deletions dask_geomodeling/geometry/aggregate.py
@@ -113,47 +113,43 @@ def bucketize(bboxes):

class AggregateRaster(GeometryBlock):
"""
Compute zonal statistics and add them to the geometry properties
:param source: the source of geometry data
:param raster: the source of raster data
:param statistic: the type of statistic to perform. can be
``'sum', 'count', 'min', 'max', 'mean', 'median', 'p<percentile>'``.
:param projection: the projection to perform the aggregation in
:param pixel_size: the pixel size to perform aggregation in
:param max_pixels: the maximum number of pixels to use for aggregation.
defaults to the geomodeling.raster-limit setting.
:param column_name: the name of the column to output the results
:param auto_pixel_size: determines whether the pixel_size is
adjusted when a raster is too large. Default False.
:returns: GeometryBlock with aggregation results in ``column_name``
:type source: GeometryBlock
:type raster: RasterBlock
:type statistic: string
:type projection: string or None
:type pixel_size: float or None
:type max_pixels: int or None
:type column_name: string
:type auto_pixel_size: boolean
The currently implemented statistics are sum, count, min, max, mean,
median, and percentile. If projection or max_resolution are not
given, these are taken from the provided RasterBlock.
The count statistic calculates the number of active cells in the raster. A
percentile statistic can be selected using text value starting with 'p'
followed by something that can be parsed as a float value, for example
``'p33.3'``.
Only geometries that intersect the requested bbox are aggregated.
Aggregation is done in a specified projection and with a specified pixel
size.
Compute statistics of a raster for each geometry in a geometry source.
A statistic is computed in a specific projection and with a specified cell
size. If ``projection`` or ``pixel_size`` are not given, these default to
the native projection and cell size of the provided raster source.
Should the combination of the requested pixel_size and the extent of the
source geometry cause the requested raster size to exceed max_pixels, the
pixel_size is adjusted automatically if ``auto_pixel_size = True``, else
a RuntimeError is raised.
source geometry cause the required raster size to exceed ``max_pixels``,
the ``pixel_size`` can be adjusted automatically if ``auto_pixel_size`` is
set to ``True``, else (the default) a RuntimeError is raised.
Please note that for any field operation on the result of this block
a GetSeriesBlock should be used to retrieve data from the added column. The
name of the added column is determined by the ``column_name`` parameter.
Args:
source (GeometryBlock): The geometry source for which the statistics are
determined.
raster (RasterBlock): The raster source that is sampled.
statistic (str): The type of statistical analysis that should be
performed. The options are: ``{"sum", "count", "min", "max", "mean",
"median", "p<percentile>"}``. Percentiles are provided for example as
follows: ``"p50"``. Default ``"sum"``.
projection (str, optional): Projection to perform the aggregation in, for
example ``"EPSG:28992"``. Defaults to the native projection of the
supplied raster.
pixel_size (float, optional): The raster cell size used in the
aggregation. Defaults to the cell size of the supplied raster.
max_pixels (int, optional): The maximum number of pixels (cells) in the
aggregation. Defaults to the ``geomodeling.raster-limit`` setting.
column_name (str, optional): The name of the column where the result
should be placed. Defaults to ``"agg"``.
auto_pixel_size (bool): Determines whether the pixel size is adjusted
automatically when ``max_pixels`` is exceeded. Defaults to ``False``.
Returns:
GeometryBlock with aggregation results in an added column
The global raster-limit setting can be adapted as follows:
>>> from dask import config
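
To illustrate the docstring above, here is a minimal sketch of how a statistic name such as ``"p33.3"`` could be parsed, and how ``pixel_size`` might be coarsened when ``max_pixels`` is exceeded. The helper names ``parse_statistic`` and ``fit_pixel_size`` are hypothetical, and doubling the pixel size until the grid fits is an assumption made for this sketch, not necessarily what the library does.

```python
import math

PLAIN_STATISTICS = {"sum", "count", "min", "max", "mean", "median"}


def parse_statistic(statistic):
    """Return (kind, percentile) for a statistic name (hypothetical helper)."""
    if statistic in PLAIN_STATISTICS:
        return statistic, None
    if statistic.startswith("p"):
        try:
            percentile = float(statistic[1:])
        except ValueError:
            raise ValueError(f"Unknown statistic {statistic!r}") from None
        # This comparison also rejects NaN and infinite values.
        if not 0 <= percentile <= 100:
            raise ValueError(f"Percentile out of range in {statistic!r}")
        return "percentile", percentile
    raise ValueError(f"Unknown statistic {statistic!r}")


def fit_pixel_size(width, height, pixel_size, max_pixels, auto_pixel_size=False):
    """Return a pixel size whose grid over the extent fits in max_pixels.

    Assumption: the pixel size is doubled until the grid is small enough.
    """
    if width <= 0 or height <= 0 or pixel_size <= 0 or max_pixels < 1:
        raise ValueError("width, height, pixel_size and max_pixels must be positive")

    def n_pixels(size):
        return math.ceil(width / size) * math.ceil(height / size)

    if n_pixels(pixel_size) <= max_pixels:
        return pixel_size
    if not auto_pixel_size:
        raise RuntimeError("The required raster size exceeds max_pixels")
    while n_pixels(pixel_size) > max_pixels:
        pixel_size *= 2
    return pixel_size
```

With ``auto_pixel_size=False`` the sketch raises ``RuntimeError``, mirroring the default behaviour described in the docstring.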
@@ -463,36 +459,23 @@ def process(geom_data, raster_data, process_kwargs):

class AggregateRasterAboveThreshold(AggregateRaster):
"""
Aggregate raster values ignoring values below some threshold. The
thresholds are supplied per geometry.
:param source: the source of geometry data
:param raster: the source of raster data
:param statistic: the type of statistic to perform. can be
``'sum', 'count', 'min', 'max', 'mean', 'median', 'p<percentile>'``.
:param projection: the projection to perform the aggregation in
:param pixel_size: the pixel size to perform aggregation in
:param max_pixels: the maximum number of pixels to use for aggregation
:param column_name: the name of the column to output the results
:param auto_pixel_size: determines whether the pixel_size is
adjusted when a raster is too large. Default False.
:param threshold_name: the name of the column with the thresholds
:returns: GeometryBlock with aggregation results in ``column_name``
:type source: GeometryBlock
:type raster: RasterBlock
:type statistic: string
:type projection: string
:type pixel_size: float
:type max_pixels: int
:type column_name: string
:type auto_pixel_size: boolean
:type threshold_name: string
See also:
:class:`dask_geomodeling.geometry.aggregate.AggregateRaster`
"""
Compute statistics of a per-feature masked raster for each geometry in a
geometry source.
Per feature, a threshold can be supplied to mask the raster with. Only
values that exceed the threshold of a specific feature are included for
the statistical value of that feature.
See :class:`dask_geomodeling.geometry.aggregate.AggregateRaster` for
further information.
Args:
*args: See :class:`dask_geomodeling.geometry.aggregate.AggregateRaster`
threshold_name (str): The column that holds the thresholds.
Returns:
GeometryBlock with aggregation results in an added column
"""
def __init__(
self,
source,
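
The per-feature masking described in the ``AggregateRasterAboveThreshold`` docstring can be sketched in plain Python. This is an illustration only: ``aggregate_above_threshold`` is a hypothetical helper, the raster cells per feature are given as plain lists, and returning ``None`` when nothing can be computed is an assumption of this sketch.

```python
def aggregate_above_threshold(cells, thresholds, statistic="sum"):
    """Aggregate, per feature, only the cell values above that feature's threshold.

    cells: dict mapping feature id -> list of raster values inside the feature
    thresholds: dict mapping feature id -> threshold for that feature
    """
    reducers = {
        "sum": sum,
        "count": len,
        "min": min,
        "max": max,
        "mean": lambda values: sum(values) / len(values),
    }
    reduce = reducers[statistic]
    result = {}
    for feature_id, values in cells.items():
        threshold = thresholds.get(feature_id)
        if threshold is None:
            # Assumption: without a threshold no statistic is computed.
            result[feature_id] = None
            continue
        kept = [value for value in values if value > threshold]
        if kept or statistic in ("sum", "count"):
            result[feature_id] = reduce(kept)
        else:
            # min, max and mean are undefined for an empty selection.
            result[feature_id] = None
    return result
```

Note that the comparison is strict (``value > threshold``), following the wording "values that exceed the threshold".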
90 changes: 66 additions & 24 deletions dask_geomodeling/geometry/base.py
@@ -11,7 +11,7 @@ class GeometryBlock(Block):
""" The base block for geometries
All geometry blocks must be derived from this base class and must implement
the following attributes:
the following attribute:
- ``columns``: a set of column names to expect in the dataframe
@@ -29,13 +29,18 @@ class GeometryBlock(Block):
- filters: dict of `Django <https://www.djangoproject.com/>`_ ORM-like
filters on properties (e.g. ``id=598``)
The data response contains the following:
The data response is a dictionary with the following fields:
- if mode was ``'intersects'``: a DataFrame of features with properties
- if mode was ``'extent'``: the bbox that contains all features
- (if mode was ``"intersects"`` or ``"centroid"``) ``"features"``:
a ``GeoDataFrame`` of features with properties
- (if mode was ``"extent"``) ``"extent"``: a tuple of 4 numbers
``(min_x, min_y, max_x, max_y)`` that represents the extent of the
geometries that would be returned by an ``"intersects"`` request.
- (for all modes) ``"projection"``: the EPSG or WKT representation of the
projection.
To be able to perform operations on properties, there is a helper type
called``SeriesBlock``. This is the block equivalent of a ``pandas.Series``.
called ``SeriesBlock``. This is the block equivalent of a ``pandas.Series``.
You can get a ``SeriesBlock`` from a ``GeometryBlock``, perform operations
on it, and set it back into a ``GeometryBlock``.
"""
@@ -54,7 +59,11 @@ def to_file(self, *args, **kwargs):
"""Utility function to export data from this block to a file on disk.
You need to specify the target file path as well as the extent geometry
you want to save.
you want to save. Feature properties can be saved by providing a field
mapping to the ``fields`` argument.
To stay within memory constraints or to parallelize an operation, the
``tile_size`` argument can be provided.
Args:
url (str): The target file path. The extension determines the format.
@@ -79,6 +88,7 @@ def to_file(self, *args, **kwargs):
Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
>>> config.set({"geomodeling.geometry-limit": 10000})
>>> config.set({"temporary_directory": '/my/alternative/tmp/dir'})
"""
from dask_geomodeling.geometry.sinks import to_file
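
The ``tile_size`` argument mentioned in the ``to_file`` docstring suggests that the requested extent is processed in tiles. A minimal sketch of such a split is given here, assuming square tiles aligned to the lower-left corner of the extent; ``tile_bbox`` is a hypothetical helper and not part of the library.

```python
import math


def tile_bbox(bbox, tile_size):
    """Split (min_x, min_y, max_x, max_y) into tiles of at most tile_size."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    min_x, min_y, max_x, max_y = bbox
    n_x = max(1, math.ceil((max_x - min_x) / tile_size))
    n_y = max(1, math.ceil((max_y - min_y) / tile_size))
    tiles = []
    for i in range(n_x):
        for j in range(n_y):
            x0 = min_x + i * tile_size
            y0 = min_y + j * tile_size
            # Clip the last row and column to the requested extent.
            tiles.append((x0, y0, min(x0 + tile_size, max_x), min(y0 + tile_size, max_y)))
    return tiles
```

Each tile can then be requested and written independently, which bounds memory use per request and allows the tiles to be processed in parallel.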
@@ -87,7 +97,16 @@ def to_file(self, *args, **kwargs):


class SeriesBlock(Block):
""" A helper block for GeometryBlocks, representing one single field"""
""" A block that represents one column from a GeometryBlock.
Use this helper class to modify (or to use logic on) a specific feature
property.
Use :class:`dask_geomodeling.geometry.base.GetSeriesBlock` to retrieve
a SeriesBlock from a GeometryBlock and
:class:`dask_geomodeling.geometry.base.SetSeriesBlock` to add a
SeriesBlock to a GeometryBlock.
"""

def __add__(self, other):
from . import Add
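
The ``__add__`` method above imports ``Add`` inside the method body, which points to the operator-overloading pattern in which an arithmetic operator builds a new block instead of computing a value. The following sketch shows that pattern with stand-in classes; ``Series``, ``Column`` and this ``Add`` are hypothetical and do not reproduce the library's classes, and it is an assumption that the truncated method returns ``Add(self, other)``.

```python
class Series:
    """Stand-in for a series-like block whose operators build new blocks."""

    def __add__(self, other):
        # Nothing is computed here; only a new node is created.
        return Add(self, other)

    def evaluate(self):
        raise NotImplementedError


class Column(Series):
    """Leaf node holding concrete values."""

    def __init__(self, values):
        self.values = list(values)

    def evaluate(self):
        return list(self.values)


class Add(Series):
    """Node that adds a series to another series or to a constant."""

    def __init__(self, left, right):
        self.args = (left, right)

    def evaluate(self):
        left, right = self.args
        left_values = left.evaluate()
        if isinstance(right, Series):
            return [a + b for a, b in zip(left_values, right.evaluate())]
        return [a + right for a in left_values]
```

Because ``Add`` is itself a ``Series``, expressions such as ``(a + b) + 1`` chain naturally and are only evaluated on request.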
@@ -181,14 +200,21 @@ def __xor__(self, other):


class GetSeriesBlock(SeriesBlock):
"""Get a column from a GeometryBlock.
"""
Obtain a single feature property column from a GeometryBlock.
Provide a GeometryBlock with one or more columns. One of these columns can
be read from this source into a SeriesBlock. This SeriesBlock can be used
to run, for example, classifications.
:param source: GeometryBlock
:param name: name of the column to get
:returns: SeriesBlock containing the property column
Args:
source (GeometryBlock): GeometryBlock with the column you want to load
into the SeriesBlock.
name (str): Name of the column to load into the SeriesBlock.
Returns:
SeriesBlock containing the single property column
:type source: GeometryBlock
:type name: string
"""

def __init__(self, source, name):
@@ -212,20 +238,36 @@ def process(data, name):


class SetSeriesBlock(GeometryBlock):
"""Set one or multiple columns (SeriesBlocks) in a GeometryBlock.
"""
Add one or multiple property columns (SeriesBlocks) to a GeometryBlock.
Provide the GeometryBlock that you want to add more properties to. Then
provide the SeriesBlock(s) which you want to add to the GeometryBlock. The
values of the SeriesBlock will be added to the features in the
GeometryBlock automatically (if they are derived from the same geometries
in previous operations, the features will have matching indexes so that
each property is matched to the correct feature).
The value which is set can also be a single value, in which case each
feature will get the same value as a property.
Args:
source (GeometryBlock): The base GeometryBlock to which the SeriesBlock
is added as a new column.
column (str): The name of the new column (if it exists, it will be
overwritten)
value (SeriesBlock, number, str, bool): The SeriesBlock or constant value
that has to be inserted in the destination column.
*args: It is possible to repeat the ``column`` and ``value``
arguments multiple times to insert more than one column.
:param source: source to add the extra columns to
:param column: name of the column to be set
:param value: series or constant value to set
:param args: string, SeriesBlock, ..., repeated multiple times
:returns: the source GeometryBlock with additional property columns
Example:
Add two columns to an existing ``view`` like this:
``SetSeriesBlock(view, "column_1", series_1, "column_2", series_2)``.
:type source: GeometryBlock
:type column: string
:type value: SeriesBlock, scalar
Returns:
The source GeometryBlock with additional property columns
Example:
>>> SetSeriesBlock(view, 'column_1', series_1, 'column_2', series_2)
"""

def __init__(self, source, column, value, *args):
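
The signature ``(source, column, value, *args)`` implies that extra columns are passed as alternating name and value arguments. A minimal sketch of how such arguments could be paired and applied is given here. The helpers ``pair_columns`` and ``set_columns`` are hypothetical, features are modelled as a dictionary of index to properties, and a dictionary of index to value stands in for a SeriesBlock.

```python
def pair_columns(column, value, *args):
    """Group (column, value, column, value, ...) into (name, value) pairs."""
    if len(args) % 2 != 0:
        raise ValueError("column and value arguments must come in pairs")
    pairs = [(column, value)] + list(zip(args[0::2], args[1::2]))
    for name, _ in pairs:
        if not isinstance(name, str):
            raise TypeError(f"column names must be strings, got {name!r}")
    return pairs


def set_columns(features, column, value, *args):
    """Return a copy of features with the given columns set.

    A dict value is matched to features by index; any other value is
    treated as a constant and assigned to every feature.
    """
    result = {index: dict(props) for index, props in features.items()}
    for name, column_value in pair_columns(column, value, *args):
        for index, props in result.items():
            if isinstance(column_value, dict):
                props[name] = column_value.get(index)
            else:
                props[name] = column_value
    return result
```

Matching by index is what makes the values land on the correct feature, which is the behaviour the docstring describes for series derived from the same geometries.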
58 changes: 34 additions & 24 deletions dask_geomodeling/geometry/constructive.py
@@ -11,17 +11,27 @@


class Buffer(BaseSingle):
"""Buffer geometries.
:param source: the geometry source
:param distance: a distance measure in the given projection.
:param projection: an EPSG or WKT string, e.g. EPSG:28992.
:param resolution: quarter circle segments. Default is 16.
"""
Buffer ('expand') geometries with a given value.
A GeometryBlock and a buffer distance are provided. Each feature in the
GeometryBlock is buffered with the distance provided, resulting in updated
geometries.
Args:
source (GeometryBlock): The source GeometryBlock whose geometry will be
updated.
distance (float): The distance used to buffer all features. The distance
is measured in the unit of the given projection (e.g. m, °).
projection (str): The projection used in the operation, provided as an
EPSG or WKT string, for example ``"EPSG:28992"``.
resolution (integer, optional): The resolution of the buffer provided as
the number of points used to represent a quarter of a circle. The
default value is ``16``.
Returns:
GeometryBlock with buffered geometries.
:type source: GeometryBlock
:type distance: float
:type projection: string
:type resolution: int
"""

def __init__(self, source, distance, projection, resolution=16):
@@ -35,20 +45,14 @@ def __init__(self, source, distance, projection, resolution=16):

@property
def distance(self):
"""Buffer distance.
The unit (e.g. m, °) is determined by the projection.
"""
return self.args[1]

@property
def projection(self):
"""Projection used for buffering."""
return self.args[2]

@property
def resolution(self):
"""Buffer resolution."""
return self.args[3]

def get_sources_and_requests(self, **request):
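
The ``resolution`` parameter of ``Buffer`` is described as the number of points used to represent a quarter of a circle. The sketch below shows what that means for the buffer of a single point, using only the standard library. ``buffer_point`` and ``polygon_area`` are hypothetical helpers written for this illustration; the block itself presumably delegates to a geometry library.

```python
import math


def buffer_point(x, y, distance, resolution=16):
    """Approximate a circle around (x, y) with `resolution` segments per quarter."""
    n = 4 * resolution
    return [
        (
            x + distance * math.cos(2 * math.pi * i / n),
            y + distance * math.sin(2 * math.pi * i / n),
        )
        for i in range(n)
    ]


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2
```

A higher resolution brings the polygon area closer to that of the true circle, at the cost of more vertices per buffered feature.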
@@ -88,16 +92,22 @@ def process(data, kwargs):


class Simplify(BaseSingle):
"""Simplify geometries up to given tolerance.
"""
Simplify geometries, mainly to make them computationally more efficient.
Provide a GeometryBlock and a tolerance value to simplify the geometries.
As a result all features in the GeometryBlock are simplified.
Args:
source (GeometryBlock): Source of the geometries to be simplified.
tolerance (float): The tolerance used in the simplification. If no
tolerance is given the ``"min_size"`` request parameter is used.
preserve_topology (boolean, optional): Determines whether the topology
should be preserved in the operation. Defaults to ``True``.
:param source: the geometry source
:param tolerance: the simplification tolerance. if no tolerance is given,
the ``min_size`` request param is used.
:param preserve_topology: whether to preserve topology. Default True.
Returns:
The input GeometryBlock with simplified geometries.
:type source: GeometryBlock
:type tolerance: float
:type preserve_topology: boolean
"""

def __init__(self, source, tolerance=None, preserve_topology=True):
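
To make the role of ``tolerance`` concrete, here is a minimal sketch of line simplification with a plain Douglas-Peucker pass, together with the documented fallback to the ``min_size`` request parameter. Douglas-Peucker is swapped in here as a well-known illustration; it is an assumption that the block behaves comparably, and ``preserve_topology`` is not modelled. The names ``resolve_tolerance`` and ``simplify_line`` are hypothetical.

```python
import math


def resolve_tolerance(tolerance, request):
    """Use the given tolerance, or fall back to the request's min_size."""
    return request["min_size"] if tolerance is None else tolerance


def _distance_to_segment(point, start, end):
    """Distance from point to the line segment start-end."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_distance = 0, 0.0
    for i in range(1, len(points) - 1):
        distance = _distance_to_segment(points[i], first, last)
        if distance > max_distance:
            index, max_distance = i, distance
    if max_distance <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right
```

A larger tolerance removes more vertices, which is why simplification can make later operations on the geometries cheaper.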