[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/UM-RMRS/raster_tools/blob/main/notebooks/zonal_module.ipynb)

# Raster Tools Zonal Module
## This notebook demonstrates the functionality of the Raster Tools zonal module 
by John Hogland 4/27/2023

# Install software for Colab

In [None]:
!pip install --upgrade gdown
!pip install --upgrade numba
!pip install --upgrade geopandas
!pip install mapclassify
!pip install --upgrade datascience
!pip install --upgrade gym
!pip install --upgrade folium
!pip install raster_tools

# The Process
In this notebook we create some raster datasets and use the zonal module to summarize and extract their values.

## Steps
- 1. Create random raster surfaces
- 2. Create zonal Raster and summarize
- 3. Create zonal Vector and summarize
- 4. Create random locations
- 5. Extract raster values from those locations

## Step 1: Create a random raster surface
### Import various packages

In [None]:
from raster_tools import Raster, general, zonal
import geopandas as gpd
import numpy as np

### Create a 3 band raster surface (y=1500 by x=1000 cells) of random numbers between 0 and 255
#### The values in this raster are the ones that get summarized

In [None]:
rs=Raster(np.random.randint(0,256,(3,1500,1000))) # upper bound is exclusive, so 256 includes the value 255
display(rs.xdata)
rs.plot(col='band',col_wrap=3)
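If reproducible results are needed, the random surface can be drawn from a seeded NumPy generator. The sketch below is only an illustration: the seed value is arbitrary and the names `rng` and `arr` are not part of the original notebook. NumPy's integer samplers exclude the upper bound, so 256 is used to allow the value 255.

```python
import numpy as np

# Seeded generator so the same "random" surface is produced on every run.
rng = np.random.default_rng(seed=42)  # the seed value is an arbitrary choice

# 3 bands, 1500 rows (y), 1000 columns (x); integers from 0 through 255.
arr = rng.integers(0, 256, size=(3, 1500, 1000))

print(arr.shape, arr.min(), arr.max())
```

The resulting array can be passed to `Raster(...)` in the same way as the array in the cell above.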

## Step 2: Create zonal Raster and summarize
This raster defines the zones used to summarize the values (dimensions = 1, 1500, 1000).

In [None]:
vls=np.ones((1,1500,1000))
vls[0,750:1500,0:500] = 2
vls[0,0:750,0:500] = 3
vls[0,750:1500,500:1000] = 4
z_rs=Raster(vls).astype(int)
display(z_rs.xdata)
z_rs.plot()
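A quick way to check the zone layout before summarizing is to count the cells in each zone. The sketch below rebuilds the same array with NumPy only; the names `zone_ids`, `cell_counts`, and `zone_sizes` are introduced here for illustration.

```python
import numpy as np

# Rebuild the zone array: every cell starts in zone 1,
# then three quadrants are reassigned to zones 2, 3 and 4.
vls = np.ones((1, 1500, 1000), dtype=int)
vls[0, 750:1500, 0:500] = 2
vls[0, 0:750, 0:500] = 3
vls[0, 750:1500, 500:1000] = 4

# Count how many cells fall in each zone.
zone_ids, cell_counts = np.unique(vls, return_counts=True)
zone_sizes = dict(zip(zone_ids.tolist(), cell_counts.tolist()))
print(zone_sizes)  # {1: 375000, 2: 375000, 3: 375000, 4: 375000}
```

Each zone covers one quarter of the raster (750 by 500 cells).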

### Summarize by zones (Raster)
Note: outputs are organized by zone and band. They can be pivoted on the band column to create a one-to-one record match with the zone dataset.

In [None]:
zonal.zonal_stats(z_rs,rs,list(zonal.ZONAL_STAT_FUNCS)).compute()
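To make the idea of a zonal summary concrete, the sketch below computes a few statistics per zone and band with plain NumPy on a small toy array. It illustrates the concept only and is not how raster_tools implements `zonal_stats`; the array sizes, the record keys, and the set of statistics are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 3 bands of 6 x 4 cells, and a matching 6 x 4 zone layout.
values = rng.integers(0, 256, size=(3, 6, 4))  # (band, y, x)
zones = np.ones((6, 4), dtype=int)
zones[3:, :2] = 2
zones[:3, :2] = 3
zones[3:, 2:] = 4

# One record per zone and band combination (long format).
records = []
for zone_id in np.unique(zones):
    mask = zones == zone_id  # cells that belong to this zone
    for band_idx in range(values.shape[0]):
        cells = values[band_idx][mask]
        records.append(
            {
                "zone": int(zone_id),
                "band": band_idx + 1,
                "count": int(cells.size),
                "mean": float(cells.mean()),
                "min": int(cells.min()),
                "max": int(cells.max()),
            }
        )

for rec in records[:3]:
    print(rec)
```

With 4 zones and 3 bands this produces 12 records, which is the same zone-by-band layout described in the note above.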

## Step 3: Create a zonal Vector and summarize
- Create a Vector object with an Albers projection ('EPSG:5070'). Note that the Vector and Raster projections must both be specified and must match. To match projections, use the geopandas to_crs() function.
- Overlapping areas are accounted for when using Vectors

In [None]:
from shapely import geometry
xmin, ymin, xmax, ymax = z_rs.bounds
x, y = (xmin, ymin)
lngx = (xmax-xmin)/2
lngy= (ymax-ymin)/2
geom_lst = []
while y < ymax:
    while x < xmax:
        geom = geometry.Polygon(
            [
                (x, y),
                (x, y + lngy),
                (x + lngx, y + lngy),
                (x + lngx, y),
                (x, y),
            ]
        )
        geom_lst.append(geom)
        x += lngx

    x = xmin
    y += lngy
    
vct=zonal.get_vector(gpd.GeoDataFrame({'geometry':geom_lst},crs='EPSG:5070')) # specifying an arbitrary projection for the example

#visualize the data
dt=vct.data.compute() #compute data to a geopandas dataframe
display(dt)
dt.plot(facecolor="none")
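The note above about matching projections can be sketched with geopandas alone. The example assumes a vector dataset stored in geographic coordinates (EPSG:4326); the point coordinates and the names `gdf` and `gdf_albers` are hypothetical.

```python
import geopandas as gpd
from shapely import geometry

# Hypothetical point stored in longitude/latitude (EPSG:4326).
gdf = gpd.GeoDataFrame(
    {"geometry": [geometry.Point(-114.0, 46.9)]}, crs="EPSG:4326"
)

# Reproject to Albers (EPSG:5070) so the CRS matches the raster's CRS.
gdf_albers = gdf.to_crs("EPSG:5070")

print(gdf.crs.to_epsg(), gdf_albers.crs.to_epsg())
```

The reprojected dataframe could then be passed to `zonal.get_vector` as in the cell above.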

### Summarize values using the zonal Vector layer
- Outputs are organized by zone and band and can be pivoted on the band column to create a one-to-one record match with the zone dataset.

In [None]:
odf=zonal.zonal_stats(vct,rs.set_crs('EPSG:5070'),list(zonal.ZONAL_STAT_FUNCS)).compute() # don't forget to set a projection for the raster dataset
odf

### Pivot data so that it can be merged with the original zone dataset

In [None]:
odf['pid']=list(np.arange(4))*3 # create a repeating index to pivot on
odfp=odf.pivot(index='pid',columns='band',values=odf.columns[1:-1])
odfp
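The effect of the pivot is easier to see on a small table. The sketch below uses pandas with hypothetical column names (`zone`, `band`, `mean`, `max`) and made-up values; the real output of `zonal_stats` has more statistics and may name its columns differently.

```python
import pandas as pd

# Toy long-format table: 2 zones x 3 bands, one row per zone and band.
long_df = pd.DataFrame(
    {
        "zone": [1, 1, 1, 2, 2, 2],
        "band": [1, 2, 3, 1, 2, 3],
        "mean": [10.0, 20.0, 30.0, 11.0, 21.0, 31.0],
        "max": [15, 25, 35, 16, 26, 36],
    }
)

# Pivot to wide format: one row per zone, one column per statistic and band.
wide_df = long_df.pivot(index="zone", columns="band", values=["mean", "max"])

# Flatten the (statistic, band) column labels into names such as "mean_1".
wide_df.columns = [f"{stat}_{band}" for stat, band in wide_df.columns]
print(wide_df)
```

After the pivot there is one row per zone, so the table can be joined to the zone dataset record for record.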

## Step 4: Create 150 random points to extract raster values

In [None]:
n=150
xmin, ymin, xmax, ymax = z_rs.bounds
xdif = xmax - xmin
ydif = ymax - ymin
pnts_lst = []
while len(pnts_lst) < n:
    x = (np.random.random() * xdif) + xmin
    y = (np.random.random() * ydif) + ymin
    pnt = geometry.Point([x, y])
    pnts_lst.append(pnt)

dic = {"geometry": pnts_lst}
pnts = gpd.GeoDataFrame(dic)
# Visualize the points
display(pnts)
pnts.plot()
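The same random coordinates can also be drawn without a loop. The sketch below uses NumPy only; the bounds are hypothetical stand-ins for `z_rs.bounds`, and the seed and the names `xs`, `ys`, and `coords` are assumptions made for the example.

```python
import numpy as np

n = 150
# Hypothetical bounds; in the notebook these come from z_rs.bounds.
xmin, ymin, xmax, ymax = 0.0, 0.0, 1000.0, 1500.0

rng = np.random.default_rng(0)  # seeded for repeatable points
xs = rng.uniform(xmin, xmax, size=n)  # random x coordinates
ys = rng.uniform(ymin, ymax, size=n)  # random y coordinates

# Stack into an (n, 2) array of x, y pairs.
coords = np.column_stack([xs, ys])
print(coords[:5])
```

The `xs` and `ys` arrays could be turned into point geometries with `geopandas.points_from_xy(xs, ys)`.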

## Step 5: Extract raster values using the point locations
- By default, extracted values are returned in a one-to-many dataframe that contains 2 columns and one row for each point and band combination (pnts*bands).
- Alternatively, the extracted values can be merged by column by setting the optional argument axis to 1.
- Here we set axis to 1, which produces a dataset that is ready to join with the pnts dataset.

In [None]:
edf=zonal.extract_points_eager(pnts,rs,column_name='test',axis=1).compute()
edf
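Conceptually, extracting values at points means converting each x, y coordinate to a row and column index and reading every band at that cell. The sketch below shows this with NumPy on a toy array. It is not the raster_tools implementation; the helper `sample_points`, the top-left origin, and the cell size are assumptions made for the example.

```python
import numpy as np


def sample_points(data, xs, ys, xmin, ymax, res):
    """Return an (n_points, n_bands) array of cell values at the x, y locations.

    Assumes row 0 of `data` is the top of the raster (y = ymax) and
    column 0 is the left edge (x = xmin), with square cells of size `res`.
    """
    cols = np.floor((xs - xmin) / res).astype(int)
    rows = np.floor((ymax - ys) / res).astype(int)
    return data[:, rows, cols].T


# Toy raster: 3 bands, 4 rows, 5 columns, with 1 x 1 cells.
data = np.arange(3 * 4 * 5).reshape(3, 4, 5)
xs = np.array([0.5, 4.5])
ys = np.array([3.5, 0.5])

sampled = sample_points(data, xs, ys, xmin=0.0, ymax=4.0, res=1.0)
print(sampled)
```

Each row of `sampled` holds one point and each column one band, which mirrors the column-wise layout requested with axis=1.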

### Join edf to the pnts dataframe

In [None]:
pnts_edf= pnts.join(edf)
display(pnts_edf)
pnts_edf.plot(column='test_1',legend=True)


# This ends the Raster Tools zonal module notebook
## Check out the other notebooks:
- https://github.com/UM-RMRS/raster_tools/blob/main/notebooks/README.md
## References
- Raster-Tools GitHub: https://github.com/UM-RMRS/raster_tools
- Hogland's Spatial Solutions: https://sites.google.com/view/hoglandsspatialsolutions/home
- Dask: https://dask.org/
- Geopandas: https://geopandas.org/en/stable/
- Xarray: https://docs.xarray.dev/en/stable/
- Jupyter: https://jupyter.org/
- Anaconda: https://www.anaconda.com/
- VS Code: https://code.visualstudio.com/
- ipywidgets: https://ipywidgets.readthedocs.io/en/latest/
- numpy: https://numpy.org/
- matplotlib: https://matplotlib.org/
- folium: https://python-visualization.github.io/folium/
- pandas: https://pandas.pydata.org/
- sklearn: https://scikit-learn.org/stable/index.html