In [3]:
import findspark
spark_home = "/usr/local/spark"
findspark.init(spark_home)

import geopyspark as gps
import matplotlib.pyplot as plt

from colortools import Color
from pyspark import SparkContext

%matplotlib inline

In [None]:
!curl -o /tmp/cropped.tif https://s3.amazonaws.com/geopyspark-test/example-files/cropped.tif

In [4]:
conf = gps.geopyspark_conf(master="local[*]", appName="visualization")
pysc = SparkContext(conf=conf)

In [5]:
raster_layer = gps.geotiff.get(layer_type=gps.LayerType.SPATIAL, uri="/tmp/cropped.tif")
tiled_layer = raster_layer.tile_to_layout(layout=gps.GlobalLayout(), target_crs=3857)

# Visualizing Data in GeoPySpark

Data is visualized in GeoPySpark by running a server that lets it be viewed interactively. Before the data is put on the server, however, it must first be formatted and colored. This guide goes over the steps needed to create a visualization server in GeoPySpark.

## Pyramid

The `Pyramid` class represents a list of `TiledRasterLayer`s that cover the same area, where each layer is a level within the pyramid at a specific zoom level. Thus, as one moves up the pyramid (starting at level 0), the image becomes more zoomed in by a power of 2 for each level. It is this varying level of detail that allows an interactive tile server to be created from a `Pyramid`. This class is needed in order to create visualizations of the contents within its layers.
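
To make the power-of-2 relationship concrete, here is a minimal sketch in plain Python (it makes no GeoPySpark calls). It assumes the common web-map convention, which `GlobalLayout` is understood to follow, where zoom level `z` covers the whole extent with a `2**z` by `2**z` grid of tiles. The helper name `tiles_per_side` is hypothetical.

```python
def tiles_per_side(zoom):
    """Number of tile columns (and rows) at a zoom level, assuming a 2**zoom grid."""
    return 2 ** zoom


# Each step up in zoom doubles the grid along both axes,
# so the total tile count grows by a factor of four.
for zoom in range(4):
    side = tiles_per_side(zoom)
    print(f"zoom {zoom}: {side} x {side} = {side * side} tiles")
```

Under that assumption, every tile at one level is covered by four tiles at the next level, which is what lets a tile server swap in more detail as the viewer zooms.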

### Creating a Pyramid

There are currently two ways to create a `Pyramid` instance: through the `TiledRasterLayer.pyramid` method, or by passing a `[TiledRasterLayer]` or `{zoom_level: TiledRasterLayer}` to the `Pyramid` constructor.

Any `TiledRasterLayer` with a `max_zoom` can be pyramided. However, the resulting `Pyramid` may have limited functionality depending on the layout of the source `TiledRasterLayer`. In order to be used for visualization, the `Pyramid` **must** have been created from a `TiledRasterLayer` that was tiled using a `GlobalLayout` and whose tile sizes are a power of 2.
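
The power-of-2 requirement on tile sizes is easy to check before tiling. The following is a small sketch in plain Python; `is_power_of_two` is a hypothetical helper, not part of GeoPySpark, and the sizes tested are only illustrative.

```python
def is_power_of_two(n):
    """Return True if n is a positive integer power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so clearing the lowest set bit leaves zero.
    return n > 0 and (n & (n - 1)) == 0


for tile_size in (256, 300, 512):
    print(tile_size, is_power_of_two(tile_size))
```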

#### Via the pyramid Method

When using the `pyramid` method, a `Pyramid` instance will be created with levels from 0 to `TiledRasterLayer.zoom_level`. Thus, if a `TiledRasterLayer` has a `zoom_level` of 12, then the resulting `Pyramid` will have 13 levels, each corresponding to a zoom from 0 to 12.

In [6]:
pyramided = tiled_layer.pyramid()
pyramided

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=11, num_levels=12, is_cached=False)

#### Constructing a Pyramid Manually

In [7]:
gps.Pyramid([tiled_layer.tile_to_layout(gps.GlobalLayout(zoom=x)) for x in range(0, 13)])

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=12, num_levels=13, is_cached=False)

In [8]:
gps.Pyramid({x: tiled_layer.tile_to_layout(gps.GlobalLayout(zoom=x)) for x in range(0, 13)})

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=12, num_levels=13, is_cached=False)

### Computing the Histogram of a Pyramid

One can produce a `Histogram` instance representing all of the layers within a `Pyramid` via the `get_histogram` method.

In [9]:
hist = pyramided.get_histogram()
hist

<geopyspark.geotrellis.histogram.Histogram at 0x7fbd81068208>

### RDD Methods

`Pyramid` contains methods for working with the `RDD`s contained within its `TiledRasterLayer`s. A list of these can be found [here](layers.ipynb#rdd-methods). When used, all internal `RDD`s will be operated on.

### Map Algebra

While not as versatile as `TiledRasterLayer` in terms of map algebra operations, `Pyramid`s are still able to perform local operations between themselves, `int`s, and `float`s.

**Note**: Operations between two or more `Pyramid`s occur on a per-`Tile` basis and depend on the tiles having the same key. It is therefore possible to perform an operation between two `Pyramid`s and get a result where nothing has changed if the `Pyramid`s have no matching keys.

In [10]:
pyramided + 1

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=11, num_levels=12, is_cached=False)

In [11]:
(2 * (pyramided + 2)) / 3

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=11, num_levels=12, is_cached=False)

When performing operations on two or more `Pyramid`s, if the `Pyramid`s involved have different numbers of levels, then the resulting `Pyramid` will only have as many levels as the source `Pyramid` with the smallest level count.

In [12]:
small_pyramid = gps.Pyramid({x: tiled_layer.tile_to_layout(gps.GlobalLayout(zoom=x)) for x in range(0, 5)})
small_pyramid

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=4, num_levels=5, is_cached=False)

In [13]:
pyramided + small_pyramid

Pyramid(layer_type=LayerType.SPATIAL, max_zoom=4, num_levels=5, is_cached=False)
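
The level-count rule can be modeled without Spark. The sketch below is an assumption about the behavior, not GeoPySpark's actual implementation: each pyramid is a plain `dict` keyed by zoom level, each "layer" is just a number, and only the zoom levels present in both pyramids are combined. The name `combine_levels` is hypothetical.

```python
def combine_levels(left, right, op):
    """Apply op to the levels present in both pyramids, keyed by zoom."""
    shared = sorted(left.keys() & right.keys())
    return {zoom: op(left[zoom], right[zoom]) for zoom in shared}


# Stand-ins for real layers: each level is just a number here.
big = {zoom: 10 for zoom in range(0, 12)}   # 12 levels, zooms 0 through 11
small = {zoom: 1 for zoom in range(0, 5)}   # 5 levels, zooms 0 through 4

result = combine_levels(big, small, lambda a, b: a + b)
print(len(result), max(result))
```

In this model the result keeps 5 levels with a maximum zoom of 4, mirroring the `num_levels=5, max_zoom=4` shown in the output above.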

## ColorMap

The `ColorMap` class in GeoPySpark acts as a wrapper for the GeoTrellis `ColorMap` class. It is used to colorize the data within a layer when it's being visualized.

### Constructing a Color Ramp

Before we can initialize `ColorMap` we must first create a list of colors (or a color ramp) to pass in. This can be created either through a function in the `color` module or manually.

#### Using Matplotlib

The `get_colors_from_matplotlib` function creates a color ramp from the name of an existing color ramp in `Matplotlib` and the number of colors.

**Note**: This function will not work if `Matplotlib` is not installed.

In [14]:
color_ramp = gps.get_colors_from_matplotlib(ramp_name="viridis")
color_ramp

[1140937983,
 1141003775,
 1141069823,
 1157978367,
 1158044415,
 1158175743,
 1175018751,
 1175150335,
 1175216127,
 1175347711,
 1192190719,
 1192322047,
 1192388095,
 1192519423,
 1192585215,
 1192651263,
 1192782591,
 1209625599,
 1209691391,
 1209822975,
 1209888767,
 1209954559,
 1210085887,
 1210151679,
 1210217471,
 1210283263,
 1193637375,
 1193703167,
 1193768959,
 1193834751,
 1193966079,
 1194031871,
 1194097663,
 1177386239,
 1177517311,
 1177583103,
 1177648895,
 1160937471,
 1161068543,
 1161134335,
 1161200127,
 1144488447,
 1144619775,
 1127908351,
 1127973887,
 1128039679,
 1111327999,
 1111393791,
 1111524863,
 1094813439,
 1094878975,
 1078167551,
 1078233087,
 1061521407,
 1061652735,
 1044941055,
 1045006847,
 1028295167,
 1028360703,
 1028426239,
 1011714815,
 1011780351,
 995134207,
 995199743,
 978488319,
 978553855,
 961842175,
 961907711,
 945196031,
 945261823,
 928550143,
 928615679,
 911903999,
 911969535,
 895257855,
 895323391,
 878611967,
 878677503,
 ...]

In [15]:
gps.get_colors_from_matplotlib(ramp_name="hot", num_colors=150)

[167772415,
 218104063,
 301990143,
 385876223,
 436207871,
 520093951,
 603980031,
 704643327,
 738197759,
 822083839,
 922747135,
 956301567,
 1056964863,
 1140850943,
 1224737023,
 1275068671,
 1358954751,
 1442840831,
 1493172479,
 1577058559,
 1660944639,
 1761607935,
 1795162367,
 1879048447,
 1979711743,
 2013266175,
 2113929471,
 2197815551,
 2281701631,
 2332033279,
 2415919359,
 2499805439,
 2550137087,
 2634023167,
 2717909247,
 2818572543,
 2852126975,
 2936013055,
 3036676351,
 3120562431,
 3170894079,
 3254780159,
 3338666239,
 3388997887,
 3472883967,
 3556770047,
 3640656127,
 3690987775,
 3774873855,
 3875537151,
 3909091583,
 3992977663,
 4093640959,
 4177527039,
 4227858687,
 4278321407,
 4278649087,
 4278845695,
 4279173375,
 4279501055,
 4279894271,
 4280025343,
 4280353023,
 4280746239,
 4280877311,
 4281270527,
 4281598207,
 4281925887,
 4282122495,
 4282450175,
 4282777855,
 4282974463,
 4283302143,
 4283629823,
 4284023039,
 4284154111,
 4284481791,
 4284875007

#### From ColorTools

The second helper function for constructing a color ramp is `get_colors_from_colors`. This uses the `colortools` package to build the ramp from `[Color]` instances.

**Note**: This function will not work if `colortools` is not installed.

In [16]:
colors = [Color('green'), Color('red'), Color('blue')]
colors

[Color(0, 128, 0, 255), Color(255, 0, 0, 255), Color(0, 0, 255, 255)]

In [17]:
colors_color_ramp = gps.get_colors_from_colors(colors=colors)
colors_color_ramp

[8388863, 4278190335, 65535]
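
The integers in a color ramp are consistent with colors packed as 32-bit RGBA values, one byte per channel. The sketch below is plain Python rather than a GeoPySpark API; `pack_rgba` and `unpack_rgba` are hypothetical helpers that reproduce the values shown above from their channel components.

```python
def pack_rgba(red, green, blue, alpha=255):
    """Pack four 0-255 channel values into one 32-bit RGBA integer."""
    return (red << 24) | (green << 16) | (blue << 8) | alpha


def unpack_rgba(value):
    """Split a 32-bit RGBA integer back into (red, green, blue, alpha)."""
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


print(pack_rgba(0, 128, 0))        # Color('green') from the cell above
print(unpack_rgba(4278190335))     # second entry of colors_color_ramp
```

Writing a color in hexadecimal makes the channels easier to read: `0x008000FF` is the same green as `8388863`.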

### Creating a ColorMap

`ColorMap` has many different ways of being constructed depending on the inputs it's given. It has a general `build` method that can take various types for `breaks` and `colors` in addition to other `classmethod`s that have more specific inputs.

#### From a Histogram

In [18]:
gps.ColorMap.from_histogram(histogram=hist, color_list=color_ramp)

<geopyspark.geotrellis.color.ColorMap at 0x7fbd80df8f60>

#### From a List of Colors

In [19]:
# Creates a ColorMap instance that will have three colors for the values that are less than or equal to 0, 250, and
# 1000.
gps.ColorMap.from_colors(breaks=[0, 250, 1000], color_list=colors_color_ramp)

<geopyspark.geotrellis.color.ColorMap at 0x7fbd80dee438>

#### For NLCD Data

If the layers you are working with contain data from NLCD, then it is possible to construct a `ColorMap` without first making a color ramp and passing in a list of breaks.

In [20]:
gps.ColorMap.nlcd_colormap()

<geopyspark.geotrellis.color.ColorMap at 0x7fbd80df87f0>

#### From a Break Map

If there aren't many colors to work with in the layer, then it may be easier to construct a `ColorMap` using a `break_map`, a `dict` that maps tile values to colors.

In [21]:
# The three tile values are 1, 2, and 3 and they correspond to the colors 0x00000000, 0x00000001, and 0x00000002
# respectively.
break_map = {
    1: 0x00000000,
    2: 0x00000001,
    3: 0x00000002
}

gps.ColorMap.from_break_map(break_map=break_map)

<geopyspark.geotrellis.color.ColorMap at 0x7fbd80df36a0>

#### More General Build Method

As mentioned above, `ColorMap` has a more general `classmethod` called `build`, which accepts a wide range of types to construct a `ColorMap`. In the following example, `build` will be passed the same inputs used in the previous examples.

In [22]:
# build using a Histogram
gps.ColorMap.build(breaks=hist, colors=color_ramp)

# It is also possible to pass in the name of Matplotlib color ramp instead of constructing it yourself
gps.ColorMap.build(breaks=hist, colors="viridis")

# build using Colors
gps.ColorMap.build(breaks=[0, 250, 1000], colors=colors)

# build using a break map
gps.ColorMap.build(breaks=break_map)

<geopyspark.geotrellis.color.ColorMap at 0x7fbd80df8828>

#### Additional Coloring Options

In addition to supplying breaks and color values to `ColorMap`, there are other ways of changing the coloring strategy of a layer.

The following additional parameters can be changed:

- `no_data_color`: The color of the `no_data_value` of the `Tile`s. The default is `0x00000000`
- `fallback`: The color to use when a `Tile` value has no color mapping. The default is `0x00000000`
- `classification_strategy`: How the colors should be assigned to the values based on the breaks. The default is `ClassificationStrategy.LESS_THAN_OR_EQUAL_TO`.
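
To illustrate how these options interact, here is a rough pure-Python model of a "less than or equal to" classification. It is a sketch under stated assumptions, not the GeoTrellis implementation: `classify` and `NO_DATA` are hypothetical names, and the handling of values above the last break (returning `fallback`) is an assumption.

```python
from bisect import bisect_left

NO_DATA = None  # hypothetical stand-in for a tile's no_data_value


def classify(value, breaks, colors, no_data_color=0x00000000, fallback=0x00000000):
    """Pick the color of the first break that the value is less than or equal to."""
    if value is NO_DATA:
        return no_data_color
    # bisect_left returns the index of the first break >= value (breaks must be sorted).
    index = bisect_left(breaks, value)
    if index == len(breaks):
        # Assumption: values above every break have no mapping and get the fallback.
        return fallback
    return colors[index]


breaks = [0, 250, 1000]
colors = [8388863, 4278190335, 65535]  # green, red, blue as packed RGBA

for value in (-5, 100, 250, 251, 5000, NO_DATA):
    print(value, classify(value, breaks, colors))
```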

In [23]:
pysc.stop()