<a href="https://colab.research.google.com/github/matthewshawnkehoe/Data-Analysis/blob/main/geotiff_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
# Toy example for .tif

!pip install keras-spatial
from keras_spatial.datagen import SpatialDataGenerator



**Quickstart**

1.   Create a SpatialDataGenerator and set the source raster
2.   Create a GeoDataFrame of 200x200 samples (in projection units) covering the spatial extent of the raster
3.   Create the generator producing arrays with shape [32, 128, 128, 1]
4.   Fit the model

In [4]:
from keras_spatial.datagen import SpatialDataGenerator

sdg = SpatialDataGenerator(source='/content/drive/MyDrive/data/train/BannockLakes_20180728.tif')
geodataframe = sdg.regular_grid(200, 200)
generator = sdg.flow_from_dataframe(geodataframe, 128, 128, batch_size=32)
# model(generator, ...)
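
To make the sampling step more concrete, here is a minimal sketch of how a regular grid of 200x200 tiles could be laid over a raster extent. It is plain Python and not the library's implementation: `regular_grid_bounds` is a hypothetical helper, the extent values are made up, and the real `regular_grid` may treat partial edge tiles or overlap differently.

```python
def regular_grid_bounds(xmin, ymin, xmax, ymax, xsize, ysize):
    """Return (minx, miny, maxx, maxy) tiles of xsize by ysize covering an extent.

    Hypothetical helper for illustration only. Tiles that would extend
    past the extent are dropped (an assumption, not library behavior).
    """
    n_cols = int((xmax - xmin) // xsize)
    n_rows = int((ymax - ymin) // ysize)
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = xmin + col * xsize
            y0 = ymin + row * ysize
            tiles.append((x0, y0, x0 + xsize, y0 + ysize))
    return tiles


# Made-up extent of 1000 x 600 projection units, tiled into 200 x 200 samples
tiles = regular_grid_bounds(0, 0, 1000, 600, 200, 200)
print(len(tiles))  # 5 columns x 3 rows of tiles
print(tiles[0])    # bounds of the first tile
```

Each tuple plays the role of one sample area; in the GeoDataFrame these would be polygon geometries rather than bare bounds.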

**Usage**

Keras Spatial provides a SpatialDataGenerator (SDG) modeled on the Keras ImageDataGenerator. The SDG allows users to work in spatial coordinates rather than pixels and to easily integrate data from different coordinate systems. Reprojection and resampling are handled automatically as needed. Because Keras Spatial is based on the rasterio package, a raster data source may be either a local file or a remote resource referenced by URL.

Because the SDG reads directly from larger raster data sources rather than small, preprocessed image files, it uses a GeoDataFrame to identify each sample area. The geometry associated with the dataframe is expected to be a polygon, but extraction is done using a windowed read based on the polygon's bounds. As with the ImageDataGenerator, the flow_from_dataframe method returns a generator that can be passed to the Keras model.
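
The idea of a windowed read based on bounds can be sketched without any geospatial library. The example below assumes a north-up raster with square pixels; `bounds_to_window`, the origin, and the pixel size are all hypothetical and chosen for illustration. Rasterio provides its own window utilities, which the SDG presumably relies on, so treat this only as a picture of the arithmetic.

```python
import math


def bounds_to_window(bounds, origin_x, origin_y, pixel_size):
    """Convert (minx, miny, maxx, maxy) into (col_off, row_off, n_cols, n_rows).

    Assumes a north-up raster with square pixels whose top-left corner
    sits at (origin_x, origin_y). Hypothetical helper, not library code.
    """
    minx, miny, maxx, maxy = bounds
    col_off = math.floor((minx - origin_x) / pixel_size)
    row_off = math.floor((origin_y - maxy) / pixel_size)  # rows count downward
    col_end = math.ceil((maxx - origin_x) / pixel_size)
    row_end = math.ceil((origin_y - miny) / pixel_size)
    return col_off, row_off, col_end - col_off, row_end - row_off


# A square sample polygon given as corner coordinates (made-up values)
polygon = [(200.0, 400.0), (400.0, 400.0), (400.0, 600.0), (200.0, 600.0)]
xs, ys = zip(*polygon)
bounds = (min(xs), min(ys), max(xs), max(ys))

# Only the bounding box of the polygon determines the pixels that are read
window = bounds_to_window(bounds, origin_x=0.0, origin_y=1000.0, pixel_size=2.0)
print(window)
```

Because only the bounds are used, a non-rectangular polygon would still yield a rectangular block of pixels.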

**SpatialDataGenerator class**

The SDG is similar to the ImageDataGenerator, albeit without the .flow and .flow_from_directory methods. The SDG also moves more configuration and settings onto the instance, so .flow_from_dataframe takes few arguments.

**Arguments**

*   source (path or url): raster source
*   width (int): width of the arrays produced by the generator
*   height (int): height of the arrays produced by the generator
*   indexes (int or tuple of ints): one or more raster bands to sample
*   interleave (str): type of interleave, 'band' or 'pixel' (default='pixel')
*   resampling (int): one of the values from rasterio.enums.Resampling (default=Resampling.nearest)

Raises RasterioIOError when the source is set if the file does not exist or the remote resource is not available.

**Examples**

In [6]:
from keras_spatial import SpatialDataGenerator

sdg = SpatialDataGenerator(source='/content/drive/MyDrive/data/train/BannockLakes_20180728.tif')
sdg.width, sdg.height = 128,128

The source must be set prior to calling flow_from_dataframe. Width and height can be set as attributes of the SDG or as arguments to flow_from_dataframe, but passing them as arguments to flow_from_dataframe is preferred.

The indexes argument selects bands in a multiband raster. By default all bands are read and the indexes argument is updated when the raster source is set.

In multiband situations, if interleave is set to 'pixel' (the default listed under Arguments), the numpy array has the shape `[batch_size, height, width, bands]`, which matches the channels-last layout that TensorFlow uses by default and the `[32, 128, 128, 1]` shape in the Quickstart. If interleave is set to 'band', the shape is `[batch_size, bands, height, width]`, which is not generally what you want; use with care.
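
As a quick reference for the two layouts, the sketch below computes the batch shape each interleave setting implies. `batch_shape` is a hypothetical helper written for this note, not part of keras-spatial, and it only encodes the axis orders described above.

```python
def batch_shape(batch_size, height, width, n_bands, interleave):
    """Return the array shape implied by an interleave setting.

    Hypothetical helper for illustration; it mirrors the axis orders
    described in the text and does not call keras-spatial.
    """
    if interleave == "band":
        return (batch_size, n_bands, height, width)
    if interleave == "pixel":
        return (batch_size, height, width, n_bands)
    raise ValueError("interleave must be 'band' or 'pixel'")


# Three selected bands, one 128 x 128 sample per batch
print(batch_shape(1, 128, 128, 3, "band"))
print(batch_shape(1, 128, 128, 3, "pixel"))
```

Checking the shape printed by the generator against the one your model's first layer expects is a cheap way to catch a layout mismatch early.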

In [7]:
# The source is assumed to be a 5-band raster
sdg = SpatialDataGenerator('/content/drive/MyDrive/data/train/BannockLakes_20180728.tif')

# Sample areas: a regular grid of 200x200 tiles (in projection units), as in the Quickstart
df = sdg.regular_grid(200, 200)

# 'band' interleave with indexes set to -1
sdg.interleave, sdg.indexes = 'band', -1
arr = next(sdg.flow_from_dataframe(df, 128, 128, batch_size=1))
print(arr.shape)

# 'band' interleave, first band only
sdg.interleave, sdg.indexes = 'band', 1
arr = next(sdg.flow_from_dataframe(df, 128, 128, batch_size=1))
print(arr.shape)

# 'pixel' interleave, bands 1 to 3
sdg.interleave, sdg.indexes = 'pixel', [1, 2, 3]
arr = next(sdg.flow_from_dataframe(df, 128, 128, batch_size=1))
print(arr.shape)

# 'pixel' interleave, first band only
sdg.interleave, sdg.indexes = 'pixel', 1
arr = next(sdg.flow_from_dataframe(df, 128, 128, batch_size=1))
print(arr.shape)