# Raster Properties Tutorial
This tutorial introduces the `Raster` class and examines routines to manage data values and spatial metadata.

## Introduction

Raster datasets are fundamental to pfdf: many routines require rasters as input, and many produce new rasters as output. In brief, a raster dataset is a rectangular grid of data values. The individual values (often called _pixels_) are regularly spaced along the X and Y axes, and each axis may use its own spacing interval. A raster is usually associated with some spatial metadata, which locates the raster's pixels in space. Some rasters also have a NoData value. When this is the case, pixels equal to the NoData value represent missing data.

A raster's spatial metadata consists of a coordinate reference system (CRS) and an affine transformation matrix (also known as the _transform_). The transform converts the data grid's row and column indices to spatial coordinates, and the CRS specifies the location of these coordinates on the Earth's surface. A transform defines a raster's resolution and alignment (the location of pixel edges) and takes the form:

$$
\begin{bmatrix}
dx & 0 & \mathrm{left}\\
0 & dy & \mathrm{top}
\end{bmatrix}
$$

Here _dx_ and _dy_ are the change in spatial coordinate when incrementing one column or row, and their absolute values define the raster's resolution. Meanwhile, _left_ and _top_ indicate the spatial coordinates of the data grid's left and top edges, which define the raster's alignment. The two remaining coefficients can be used to implement shear transforms, but pfdf only supports rectangular pixels, so these will always be 0 for our purposes.
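
As a rough illustration of how these coefficients work, the sketch below applies a transform of this form to row and column indices using plain numpy. The coefficient values are hypothetical and chosen only for the example, and the `pixel_to_xy` helper is our own name, not part of pfdf:

```python
import numpy as np

# Hypothetical transform coefficients (not taken from a real dataset)
dx, dy = 10.0, -10.0                # dy is negative when Y decreases down the rows
left, top = 500000.0, 4000000.0

# 2x3 affine matrix in the form shown above
affine = np.array([
    [dx, 0.0, left],
    [0.0, dy, top],
])

def pixel_to_xy(row, col):
    """Return the spatial coordinate of a pixel's top-left corner."""
    x, y = affine @ np.array([col, row, 1.0])
    return float(x), float(y)

print(pixel_to_xy(0, 0))   # top-left corner of the data grid
print(pixel_to_xy(2, 3))   # 3 columns to the right, 2 rows down
```

Note that the column index pairs with _dx_ and the X coordinate, while the row index pairs with _dy_ and the Y coordinate.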

In this tutorial, we'll see how to use ``Raster`` objects to manage data values and spatial metadata. Other routines are explored later in the [Raster Factories](07_Raster_Factories.ipynb) and [Preprocessing](04_Preprocessing.ipynb) tutorials.

## Prerequisites

### Install pfdf
To run this tutorial, you must have installed [pfdf 3+ with tutorial resources](https://ghsc.code-pages.usgs.gov/lhp/pfdf/resources/installation.html#tutorials) in your Jupyter kernel. The following line checks this is the case:

In [None]:
import check_installation

### Imports
We'll next import the ``Raster`` class from pfdf. We'll also use ``numpy`` to work with raster data grids.

In [None]:
from pfdf.raster import Raster
import numpy as np

### Example File
Finally, we'll create an example raster file to use in the tutorial. This dataset is a 50x75 grid of random values between 0 and 100 with a border of -128 NoData values along the edges. The raster is projected in EPSG:26911 with a 10 meter resolution.

In [None]:
from tools import examples
examples.build_raster()
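
For intuition, a data grid like the one described above could be sketched directly in numpy. This is not the code used by `examples.build_raster`; the 1-pixel border width, the `int8` dtype, the rows-by-columns reading of "50x75", and the random seed are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed for reproducibility
nodata = -128

# Assumption: 50 rows x 75 columns of integers from 0 to 100 (inclusive)
grid = rng.integers(0, 101, size=(50, 75)).astype(np.int8)

# Assumption: the NoData border is one pixel wide
grid[[0, -1], :] = nodata   # top and bottom rows
grid[:, [0, -1]] = nodata   # left and right columns

print(grid.shape)
print(grid[:3, :5])
```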

## Raster Object
We'll start by using the `from_file` command to create a `Raster` object for our example dataset (you can learn more about this command in the [Raster Factories Tutorial](07_Raster_Factories.ipynb)):

In [None]:
raster = Raster.from_file('examples/raster.tif')

Printing the object to the console, we can see a summary of the data grid and spatial metadata:

In [None]:
print(raster)

## Data Grid
You can use the `values` property to return a `Raster` object's data grid:

In [None]:
raster.values

`Raster` objects represent their data grids as numpy arrays, so they provide several properties determined by the array. For example, you can use the `shape` property to return the array shape (nrows x ncols), `size` to return the number of pixels, `dtype` to return the data type, and `nbytes` to return the memory consumed by the array. Users who prefer [rasterio's](https://rasterio.readthedocs.io/en/stable/index.html) syntax can also use `height` and `width` to return the number of rows and columns, respectively:

In [None]:
print(f'shape = {raster.shape}')
print(f'height = {raster.height}')
print(f'width = {raster.width}')
print(f'size = {raster.size}')
print(f'dtype = {raster.dtype}')
print(f'nbytes = {raster.nbytes}')
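
Since these properties mirror the underlying numpy array, their relationships can be checked on any array. The sketch below uses a hypothetical stand-in array rather than the example raster:

```python
import numpy as np

# Hypothetical stand-in for a raster data grid (50 rows x 75 columns of int8)
values = np.zeros((50, 75), dtype=np.int8)

nrows, ncols = values.shape   # analogous to height and width

# size is the total number of pixels
print(values.size == nrows * ncols)

# nbytes is the number of pixels times the bytes per element
print(values.nbytes == values.size * values.dtype.itemsize)
```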

The `values` property returns a read-only view of the `Raster` object's data grid. Most routines will work as normal, but you'll need to make a copy if you want to alter array elements directly:

In [None]:
# Most routines work as normal
median = np.median(raster.values)
print(median)

In [None]:
# But this will fail because it attempts to alter array elements
try:
    raster.values[0,:] = 0
except Exception:
    print('Failed because we attempted to change the array')

In [None]:
# This is fine because we copied the array first
values = raster.values.copy()
values[0,:] = 0
print(values)
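
The same behavior can be reproduced with numpy alone. The sketch below builds a read-only view of a small, made-up array; it illustrates the general numpy mechanism and is not a description of how `Raster` is implemented internally:

```python
import numpy as np

data = np.arange(12).reshape(3, 4)   # small made-up array

# Create a view of the array and mark it read-only
view = data.view()
view.flags.writeable = False

# Writing to a read-only array raises a ValueError
try:
    view[0, :] = 0
except ValueError as error:
    print(f'Failed: {error}')

# A copy owns its own memory and is writeable again
copy = view.copy()
copy[0, :] = 0
print(copy)
```

The original `data` array is unchanged by the edit to `copy`, which is why copying first is the safe way to modify raster values.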

## NoData Values
You can use the `nodata` property to return a raster's NoData value:

In [None]:
print(raster.nodata)

The `nodata_mask` property will return a boolean array indicating the locations of NoData values in the data grid. Here, `True` values indicate NoData pixels, and `False` values indicate data pixels. Inspecting the NoData mask for the example dataset, we can see locations of NoData pixels along the data grid's edges:

In [None]:
raster.nodata_mask

Alternatively, you can use the `data_mask` property to return the inverse mask, wherein `True` indicates data pixels and `False` is NoData:

In [None]:
raster.data_mask

These masks can be useful for manipulating and/or visualizing raster data values after processing.
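
As a minimal sketch of this kind of manipulation, the example below builds both masks by hand for a small, made-up grid and uses the data mask to compute a statistic that ignores NoData pixels. With a `Raster` object, the `nodata_mask` and `data_mask` properties would take the place of the hand-built masks:

```python
import numpy as np

nodata = -128

# Made-up 4x4 grid with a NoData border
values = np.array([
    [-128, -128, -128, -128],
    [-128,   10,   30, -128],
    [-128,   50,   70, -128],
    [-128, -128, -128, -128],
], dtype=np.int8)

nodata_mask = values == nodata   # True where pixels are NoData
data_mask = ~nodata_mask         # True where pixels hold data

# Boolean indexing keeps only the data pixels
mean = values[data_mask].mean()
print(mean)
```

Taking the mean of the full array instead would fold the -128 placeholder values into the result.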

## CRS
Several other properties return a raster's spatial metadata. The `crs` property returns the raster's coordinate reference system as a [pyproj.CRS](https://pyproj4.github.io/pyproj/stable/) object, `crs_units` reports the CRS's coordinate units along the X and Y axes, and `utm_zone` returns the CRS of the best UTM zone for the raster's center point:

In [None]:
raster.crs

In [None]:
raster.crs_units

In [None]:
raster.utm_zone
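
For background, a UTM zone can be estimated from a longitude using the standard 6-degree zone layout. The sketch below is not pfdf's implementation, which may select the zone differently; the `utm_zone_epsg` helper is a hypothetical name, the input coordinate is made up, and the special-case zones around Norway and Svalbard are ignored:

```python
import math

def utm_zone_epsg(lon, lat):
    """Estimate the UTM zone number and WGS 84 EPSG code for a lon/lat point.

    Assumes longitude in [-180, 180] degrees and ignores the
    Norway/Svalbard zone exceptions.
    """
    zone = int(math.floor((lon + 180) / 6)) + 1
    zone = min(zone, 60)                    # lon = 180 falls in zone 60
    base = 32600 if lat >= 0 else 32700     # WGS 84 UTM north / south series
    return zone, base + zone

# Made-up point in southern California
print(utm_zone_epsg(-117.0, 34.0))
```

The example raster uses EPSG:26911, which is the NAD83 variant of UTM zone 11N, so a point in that zone maps to zone 11 here as well.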

## Transform
You can use the `transform` property to return a raster's `Transform` object. This object manages the affine transform, and you can learn more in the [Spatial Metadata Tutorial](08_Spatial_Metadata.ipynb):

In [None]:
raster.transform

You can also use the `resolution` method to return the resolution along the X and Y axes, and `pixel_area` to return the area of a single pixel:

In [None]:
print(raster.resolution())
print(raster.pixel_area())

By default, these commands return values in meters (or square meters, in the case of pixel area), but you can use the `units` option to select other units:

In [None]:
resolution = raster.resolution(units='feet')
area = raster.pixel_area(units='feet')
print(resolution)
print(area)

You can find a list of supported units here: [Supported Units](https://ghsc.code-pages.usgs.gov/lhp/pfdf/guide/utils/units.html#supported-units)
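
The arithmetic behind such a conversion is straightforward. The sketch below converts a 10 meter resolution and the matching pixel area to feet by hand, assuming the international foot (exactly 0.3048 meters); whether pfdf uses this exact definition is an assumption:

```python
METERS_PER_FOOT = 0.3048   # international foot (assumed definition)

resolution_m = 10.0                   # pixel spacing in meters
area_m2 = resolution_m ** 2           # pixel area in square meters

# Lengths scale by the conversion factor; areas scale by its square
resolution_ft = resolution_m / METERS_PER_FOOT
area_ft2 = area_m2 / METERS_PER_FOOT ** 2

print(f'{resolution_ft:.2f} feet')
print(f'{area_ft2:.2f} square feet')
```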

## Bounding Box
You can use the `bounds` property to return a raster's `BoundingBox` object. This object manages the raster's bounding box, and you can learn more in the [Spatial Metadata Tutorial](08_Spatial_Metadata.ipynb):

In [None]:
raster.bounds

You can also use the `left`, `right`, `top`, and `bottom` properties to return the coordinates of specific edges, and the `center` property to return the (X, Y) coordinate of the raster's center point:

In [None]:
print(f'left = {raster.left}')
print(f'right = {raster.right}')
print(f'bottom = {raster.bottom}')
print(f'top = {raster.top}')
print(f'center = {raster.center}')
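
These edge coordinates follow from the transform and the shape of the data grid. As a sketch with hypothetical values (the coefficients below are made up, not read from the example raster):

```python
# Hypothetical grid shape and transform coefficients
nrows, ncols = 50, 75
dx, dy = 10.0, -10.0
left, top = 500000.0, 4000000.0

# The right edge is ncols steps of dx from the left edge,
# and the bottom edge is nrows steps of dy from the top edge
right = left + ncols * dx
bottom = top + nrows * dy

# The center is the midpoint of the bounding box
center = ((left + right) / 2, (bottom + top) / 2)

print(f'right = {right}')
print(f'bottom = {bottom}')
print(f'center = {center}')
```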

## Conclusion
In this tutorial, we've introduced raster datasets and seen how to use the `Raster` class to manage their data grids and spatial metadata. In the [next tutorial](07_Raster_Factories.ipynb), we'll see how to load and build `Raster` objects from a variety of different data sources.