# Rasterio

? verify crs and coord <-> pixel conversion functions and aviris

- rasterio depends on GDAL
- there is a command line interface `rio`

- [Rasterio](https://rasterio.readthedocs.io/en/latest/) (an older set of docs at https://mapbox.github.io/rasterio/ has been removed). Note: for some reason Read the Docs redirects to the "stable" version, but GitHub links to "latest", which should be preferred!
- https://rasterio.groups.io/g/main support forum
- https://github.com/mapbox/rasterio
- https://github.com/mapbox/rasterio-cookbook
- https://sgillies.net/tags/rasterio.html is good for current news!
- https://github.com/sgillies/affine

see also shapely and fiona, maybe also cartopy, pyproj, pysal, etc.

[GeoRaster](https://georaster.readthedocs.io/en/latest/) [[github](https://github.com/atedstone/georaster)] looks interesting

todo: read up on this good ref https://chris35wills.github.io/courses/pydata_stack/


## Rasterio Tutorials

- https://automating-gis-processes.github.io/CSC18/lessons/L6/overview.html, esp. https://automating-gis-processes.github.io/CSC18/lessons/L6/reading-raster.html
- [Accessing datasets located in buffers using MemoryFile and ZipMemoryFile (rasterio#977)](https://github.com/mapbox/rasterio/issues/977) is a nice review of ways to open files.

## Rasterio Versions

The 1.0 release was announced in the [Rasterio 1.0.0](https://sgillies.net/2018/07/13/rasterio-1-0-0.html) blog post.

[Migrating to Rasterio 1.0](https://rasterio.readthedocs.io/en/latest/topics/migrating-to-v1.html) outlines the changes.

See [CHANGES.txt](https://github.com/mapbox/rasterio/blob/master/CHANGES.txt) for changelog.

- ...
- `xy()` replaces the `ul()` method. Note that `xy(row, col)` defaults to the center of the pixel, while `ul(row, col)` returned the upper-left corner of the pixel; pass `offset='ul'` to `xy()` to get the old behavior.
- `read()` replaces `read_band()`
- `read_masks()` replaces `read_mask()`

### Transform Backstory

Rasterio v0.36 used GDAL geotransform arrays in their native form, but v1.0 moved toward `affine.Affine` instances.
During the change, a transitional `affine` attribute was added to hold the new `Affine` object and `transform` was deprecated; by v1.0, `transform` returned an `Affine` instance and the `affine` attribute was removed.

See [affine.Affine() vs. GDAL-style geotransforms](https://rasterio.readthedocs.io/en/latest/topics/migrating-to-v1.html#affine-affine-vs-gdal-style-geotransforms) (from Migrating to Rasterio 1.0) and the links therein for more on the transform vs. affine drama, as well as [mapbox/rasterio#86](https://github.com/mapbox/rasterio/issues/86).


## Rasterio Objects

- `DatasetReader`
- `Band`
- `Window`
- `CRS`
- `Affine`
- ...

### DatasetReader

Let's focus on reading raster data, and ignore the idea of writing it for the time being.

A dataset opened for reading is a [`DatasetReader`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader) object (older versions of Rasterio may create a `RasterReader`; see [rasterio#1221](https://github.com/mapbox/rasterio/issues/1221)).

There is some GDAL object beneath?

[rasterio._base.DatasetBase](https://rasterio.readthedocs.io/en/latest/api/rasterio._base.html#rasterio._base.DatasetBase) > [rasterio._io.DatasetReaderBase](https://rasterio.readthedocs.io/en/latest/api/rasterio._io.html#rasterio._io.DatasetReaderBase) > [rasterio.io.DatasetReader](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader)

#### DatasetReader Properties

note that some/all are generic dataset attributes also shared by DatasetWriter

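- `name`---the dataset's path or identifier
- `mode`---the mode the dataset was opened in (e.g. `'r'`, `'r+'`, `'w'`)
- `closed`---`True` once the dataset has been closed

- `count`---the number of raster bands in the dataset
- `width`---the number of columns
- `height`---the number of rows
- `bounds`---derived from `transform`
- `transform`---affine transformation matrix
- `crs`---the coordinate reference system
- `meta`
- `profile`
- descriptions?
- units?

- `indexes`
- `dtypes`

- `res`---the pixel size as `(x, y)`, in units of the CRS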



## Understanding the Rasterio Codebase

Affine?

Rasterio is confusing.

The basic `rasterio` module is in [`__init__.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py)

The [rasterio._base module](https://rasterio.readthedocs.io/en/latest/api/rasterio._base.html) (defined in [`_base.pyx`](https://github.com/mapbox/rasterio/blob/master/rasterio/_base.pyx)) defines numpy-free base classes, including `DatasetBase`. Its methods include `get_crs()` and `get_transform()`, as well as `read_crs()` and `read_transform()`, but the `read_*` methods appear to be called only by the corresponding `_get_*` methods, to set the values of `_crs` and `_transform` if they are not already set. Note that the docs say "`get_crs`, `set_crs`, `set_nodatavals`, `set_descriptions`, `set_units`, and `set_gcps` are deprecated and will be removed in version 1.0. They have been replaced by fully settable dataset properties crs, nodatavals, descriptions, units, and gcps.", so this section needs to be updated! Also check if/how `crs` and `transform` differ from their `read_` methods. And what's up with `get_transform`? It's still around.

TODO: it seems that the `transform` attribute is set in `_io.pyx`, but it's not clear exactly where. Note that some properties are defined with `@property` / `def property_name:` and some with the older Cython `property property_name:` syntax.

So `wkt` seems to be a different representation of the CRS.

__I think some need to be imported individually!__ e.g. `rasterio.plot`.
The underscore versions are compiled Cython extension modules (the `.pyx` files).

What's up with gcps and the warp module, and why does `DatasetReader` have a `gcps` attribute?

https://rasterio.readthedocs.io/en/latest/api/ lists modules and submodules but not all are available
 
- `compat`
- `coords`
- crs `_crs`
- drivers `_drivers` 
- `dtypes`
- `enums`
- `env`
- `errors`
- features `_features`
- fill `_fill`
- io `_io`?
- mask ?
- merge ?
- plot ?
- `profiles`
- sample ?
- `transform`
- `vfs`
- vrt ?
- warp `_warp`
- `windows`
 
`rasterio.io` is kind of available as `rasterio._io` but not certain


## Importing Rasterio

In [None]:
import rasterio

In [None]:
# Uncomment to examine rasterio package.
# dir(rasterio)
# help(rasterio)

questions and tasks:

- learn about masks

## Opening a Raster File

- [`rasterio.open()`](https://rasterio.readthedocs.io/en/latest/api/rasterio.html#rasterio.open) (defined in [`__init__.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py)) returns a [**`DatasetReader`**](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader) or [`DatasetWriter`](https://rasterio.readthedocs.io/en/stable/api/rasterio.io.html#rasterio.io.DatasetWriter).
- `rasterio.band()` (defined in [`__init__.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/__init__.py)) allows you to wrap a dataset and one or more of its bands up into a `rasterio.Band`, which is really just a namedtuple:

        Band = namedtuple('Band', ['ds', 'bidx', 'dtype', 'shape'])

It's not clear what you can do with a Band object!?

definitions and inheritance:

- `DatasetReader`, `DatasetWriter`, `MemoryFile`, `BufferedDatasetReader` are part of the [rasterio.io module](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html) (defined in [`io.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/io.py), but nothing is really defined until [`_io.pyx`](https://github.com/mapbox/rasterio/blob/master/rasterio/_io.pyx)).
- `DatasetReader` inherits from `rasterio._io.DatasetReaderBase`, `rasterio.windows.WindowMethodsMixin`, `rasterio.transform.TransformMethodsMixin`.
- The [rasterio._io module](https://rasterio.readthedocs.io/en/latest/api/rasterio._io.html) (defined in [`_io.pyx`](https://github.com/mapbox/rasterio/blob/master/rasterio/_io.pyx)) defines `DatasetReaderBase`, `DatasetWriterBase`, `MemoryFileBase`, `InMemoryRaster`, and some writer classes, also includes `read()`.
- `DatasetBase` is defined in [`_base.pyx`](https://github.com/mapbox/rasterio/blob/master/rasterio/_base.pyx). See [rasterio._base module](https://rasterio.readthedocs.io/en/latest/api/rasterio._base.html).

see also MemoryFile? for in-memory? see [In-Memory Files](https://rasterio.readthedocs.io/en/stable/topics/memory-files.html) but these seem to be related to network files or GDAL somethingorother!

read https://github.com/mapbox/rasterio/issues/86 sometime!
and see https://rasterio.readthedocs.io/en/latest/topics/migrating-to-v1.html

note `close()` method!?

what the hell is the *band cache*?

open in update mode? r+, w ?


In [None]:
rasterio.band?

## Read Data into Memory

[Reading Datasets](https://rasterio.readthedocs.io/en/latest/topics/reading.html)

Use [`dataset.read()`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.read) (defined on `DatasetReaderBase` in [`_io.pyx`](https://github.com/mapbox/rasterio/blob/master/rasterio/_io.pyx)) to read pixels into a numpy array. Without args it will read the entire dataset, or else specify which band or list of bands to read.

*Rasterio band order* for arrays is: `(bands, rows, columns)`. Use `rasterio.plot.reshape_as_image()` to convert axis order to `(rows, columns, bands)`, and `rasterio.plot.reshape_as_raster()` to convert back (conversion functions defined in [`plot.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/plot.py), and use `np.ma.transpose()` and `np.transpose()`, respectively).

See [Interoperability](https://rasterio.readthedocs.io/en/latest/topics/image_processing.html).


### windows and blocks

The `read()` method may be used to obtain a view on to a rectangular subset of the dataset, referred to as a *window*. The window may be specified using offsets, or better with a `Window` object. The [rasterio.windows module](https://rasterio.readthedocs.io/en/latest/api/rasterio.windows.html) (defined in [`windows.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/windows.py)) defines the [`Window`](https://rasterio.readthedocs.io/en/latest/api/rasterio.windows.html#rasterio.windows.Window) class and related utility functions.

See [Windowed reading and writing](https://rasterio.readthedocs.io/en/latest/topics/windowed-rw.html) for a good overview of windows and blocks.

Dataset attributes of interest when dealing with blocks are:

- [`block_shapes`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.block_shapes)
- [`block_size()`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.block_size)
- [`block_window()`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.block_window)
- [`block_windows()`](https://rasterio.readthedocs.io/en/latest/api/rasterio.io.html#rasterio.io.DatasetReader.block_windows)

Blocks and Windows are also discussed in [Concurrent processing](https://rasterio.readthedocs.io/en/latest/topics/concurrency.html).

The issue [pangeo-data/pangeo#183](https://github.com/pangeo-data/pangeo/issues/183) touches on how windows and blocks in Rasterio relate to chunks in dask and xarray.


## Spatial Indexing

In [None]:
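# Boilerplate for spatial indexing: use index() to convert map coordinates
# (in CRS units, e.g. meters) to a raster (row, col).
# Assumes `dataset` is an open DatasetReader, e.g. dataset = rasterio.open(path).

# Offsets from the dataset's upper-left corner, in CRS units.
m_east = 0
m_south = 0
x, y = (dataset.bounds.left + m_east, dataset.bounds.top - m_south)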
row, col = dataset.index(x, y)
print(row, col)

band1 = dataset.read(1)
# e.g. get pixel value at corresponding pixel
print(band1[row, col])

## bounding box

If the transform has a rotation, what does that mean for `bounds`?

## masks and nodata

[Nodata Masks](https://rasterio.readthedocs.io/en/latest/topics/masks.html)

[rasterio.mask module](https://rasterio.readthedocs.io/en/latest/api/rasterio.mask.html)

- `nodata`?
- `nodatavals`?
- `read_masks`?
- `write_mask()`?
- `dataset_mask()`?


When `masked=True` the `read()` method will return a masked array.


## Plotting with Rasterio

- [Plotting](https://rasterio.readthedocs.io/en/latest/topics/plotting.html)
- [rasterio.plot module](https://rasterio.readthedocs.io/en/latest/api/rasterio.plot.html)

see [`plot.py`](https://github.com/mapbox/rasterio/blob/master/rasterio/plot.py)

`rasterio.plot.show(source, with_bounds=True, contour=False, contour_label_kws=None, ax=None, title=None, transform=None, adjust='linear', **kwargs)`
is a wrapper around matplotlib: it draws with `imshow()`, or with `contour()` when `contour=True`.

source can be a Band or tuple (dataset, bidx), or array, or dataset, in which case the first band is displayed unless colorinterp metadata is set - how?

TODO it's really unclear how to plot three bands from a multispectral image! seems possible if you create a band object
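
One approach that avoids `Band` objects entirely is to read the three bands into an array and build an RGB image by hand. The helper below is my own sketch, not part of Rasterio: `to_rgb_image` is a hypothetical name, the per-band min/max stretch is just one possible scaling, and the band numbers in the comment are placeholders.

```python
import numpy as np


def to_rgb_image(bands):
    """Stretch a (3, rows, cols) array to 0..1 floats in (rows, cols, 3) order."""
    bands = np.asarray(bands, dtype="float64")
    lo = bands.min(axis=(1, 2), keepdims=True)
    hi = bands.max(axis=(1, 2), keepdims=True)
    # Avoid dividing by zero for constant bands.
    span = np.where(hi > lo, hi - lo, 1.0)
    stretched = (bands - lo) / span
    # Move the band axis last, as image libraries expect.
    return np.transpose(stretched, (1, 2, 0))


# Stand-in for something like dataset.read([4, 3, 2]) on a multispectral file.
bands = np.arange(3 * 4 * 5, dtype="uint16").reshape(3, 4, 5)
rgb = to_rgb_image(bands)
print(rgb.shape, rgb.min(), rgb.max())
```

The resulting array can be handed to `matplotlib.pyplot.imshow()`, which accepts `(rows, cols, 3)` floats in the 0 to 1 range.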

## rasterio merge module

[rasterio.merge module](https://rasterio.readthedocs.io/en/latest/api/rasterio.merge.html)

https://automating-gis-processes.github.io/CSC18/lessons/L6/raster-mosaic.html

## rasterio.transform module

[rasterio.transform module](https://rasterio.readthedocs.io/en/latest/api/rasterio.transform.html)

uses Affine.

Note the `transform` property is type `affine.Affine` and has properties for accessing:

    a b c
    d e f
    g h i


## Misc