# Pysheds

Pysheds is an open-source library designed to help with processing of digital elevation models (DEMs), particularly for hydrologic analysis. Pysheds performs many of the basic hydrologic functions offered by commercial software such as ArcGIS, including catchment delineation and accumulation computation.

Pysheds is designed with speed in mind. It can delineate a flow direction grid of ~10 million cells in less than 5 seconds. Flow accumulation for a grid of 36 million cells can be computed in about 15 seconds. These methods can be readily automated and incorporated into web services: for instance, a web map built with a library like leaflet.js could use pysheds to generate contributing areas for locations of interest on the fly.

The library is available on my GitHub: https://github.com/mdbartos/pysheds

A full feature list is given below:

Hydrologic Functions:
- flowdir: DEM to flow direction.
- catchment: Delineate catchment from flow direction.
- accumulation: Flow direction to flow accumulation.
- distance_to_outlet: Compute flow distance to outlet.
- resolve_flats: Resolve flats in a DEM using the modified method of Garbrecht and Martz (1997).
- fraction: Compute fractional contributing area between differently-sized grids.
- extract_river_network: Extract river network at a given accumulation threshold.
- cell_area: Compute (projected) area of cells.
- cell_distances: Compute (projected) channel length within cells.
- cell_dh: Compute the elevation change between cells.
- cell_slopes: Compute the slopes of cells.

Utilities:
- view: Returns a view of a dataset at a given bounding box and resolution.
- clip_to: Clip the current view to the extent of nonzero values in a given dataset.
- resize: Resize a dataset to a new resolution.
- rasterize: Convert a vector dataset to a raster dataset.
- polygonize: Convert a raster dataset to a vector dataset.
- check_cycles: Check for cycles in a flow direction grid.
- set_nodata: Set nodata value for a dataset.

I/O:
- read_ascii: Reads ascii gridded data.
- read_raster: Reads raster gridded data.
- to_ascii: Write grids to ascii files.


An overview with examples is given below:

## Read DEM data and plot

In [None]:
# Read elevation raster
# ----------------------------
from pysheds.grid import Grid

grid = Grid.from_raster('data/elevation.tiff')
dem = grid.read_raster('data/elevation.tiff')


import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors
import seaborn as sns

fig, ax = plt.subplots(figsize=(8,6))
fig.patch.set_alpha(0)

plt.imshow(dem, extent=grid.extent, cmap='terrain', zorder=1)
plt.colorbar(label='Elevation (m)')
plt.grid(zorder=0)
plt.title('Digital elevation map', size=14)
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.tight_layout()

## Condition the elevation data

The following commands remove imperfections (sinks) from the digital elevation model. A sink is a cell, or a group of cells, with no lower neighbour to drain to, so it cannot be assigned a drainage value. Drainage values indicate the direction that water will flow out of the cell, and are assigned during the process of creating a flow direction grid for the landscape. The resulting drainage network depends on finding the 'flow path' of every cell in the grid, so these conditioning steps (filling pits, filling depressions, and resolving flats) must be performed before creating a flow direction grid.

In [None]:
# Condition DEM
# ----------------------
# Fill pits in DEM
pit_filled_dem = grid.fill_pits(dem)

# Fill depressions in DEM
flooded_dem = grid.fill_depressions(pit_filled_dem)
    
# Resolve flats in DEM
inflated_dem = grid.resolve_flats(flooded_dem)
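
To make the idea of a sink concrete, here is a minimal plain-Python sketch that finds single-cell pits in a toy elevation grid and raises each one to the level of its lowest neighbour. The helpers `find_pits` and `fill_single_cell_pits` and the toy grid are hypothetical names invented for illustration; this is not how pysheds implements `fill_pits`, only one simple way to express the concept.

```python
def neighbours(dem, r, c):
    """Elevations of the 8 cells surrounding interior cell (r, c)."""
    return [dem[r + dr][c + dc]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]


def find_pits(dem):
    """Return (row, col) of interior cells strictly lower than all 8 neighbours."""
    pits = []
    for r in range(1, len(dem) - 1):
        for c in range(1, len(dem[0]) - 1):
            if dem[r][c] < min(neighbours(dem, r, c)):
                pits.append((r, c))
    return pits


def fill_single_cell_pits(dem):
    """Return a copy of dem with each pit raised to its lowest neighbour."""
    filled = [row[:] for row in dem]
    for r, c in find_pits(dem):
        filled[r][c] = min(neighbours(dem, r, c))
    return filled


# Toy elevation grid (made-up values); the 2.0 at (1, 1) is a pit.
toy_dem = [
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 2.0, 4.0, 5.0],
    [5.0, 4.0, 4.5, 3.0],
    [5.0, 5.0, 5.0, 5.0],
]

pits = find_pits(toy_dem)
filled_dem = fill_single_cell_pits(toy_dem)
```

Filling the pit leaves it level with its lowest neighbour, which creates a small flat. That is one reason a flat-resolution step follows the fill steps in the cell above.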

## Elevation to flow direction

A flow direction grid assigns a value to each cell to indicate the direction of flow: that is, the direction that water will flow from that particular cell based on the underlying topography of the landscape. This is a crucial step in hydrological modeling, as the direction of flow determines the ultimate destination of water flowing across the land surface. Flow direction grids are created using the following commands. For every 3x3 cell neighbourhood, the D8 routing scheme finds the neighbouring cell with the steepest downhill slope from the centre. Each number in the matrix below corresponds to a flow direction: if the centre cell flows due north, its value will be 64; if it flows northeast, its value will be 128, and so on. These numbers have no numeric meaning; they are simply codes that identify a direction, and they are assigned using the elevation values from the underlying DEM.

![_._](img/flow-direction.png)



In [None]:
# Determine D8 flow directions from DEM
# ----------------------
# Specify directional mapping
dirmap = (64, 128, 1, 2, 4, 8, 16, 32)
    
# Compute flow directions
# -------------------------------------
fdir = grid.flowdir(inflated_dem, dirmap=dirmap)
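
The D8 rule can be sketched in a few lines of plain Python. The sketch below assumes the `dirmap` ordering used above (N, NE, E, SE, S, SW, W, NW), unit cell spacing, and a return value of 0 for cells with no lower neighbour; the function name `d8_direction`, the `OFFSETS` table, and the 0 code are assumptions made for illustration and are not taken from the pysheds source.

```python
import math

# Direction codes in the order N, NE, E, SE, S, SW, W, NW.
DIRMAP = (64, 128, 1, 2, 4, 8, 16, 32)
# Matching (row, col) offsets; row indices increase southward.
OFFSETS = ((-1, 0), (-1, 1), (0, 1), (1, 1),
           (1, 0), (1, -1), (0, -1), (-1, -1))


def d8_direction(dem, r, c):
    """Return the code of the steepest downhill neighbour of (r, c).

    Returns 0 (an arbitrary choice for this sketch) if no neighbour is lower.
    """
    best_code, best_slope = 0, 0.0
    for code, (dr, dc) in zip(DIRMAP, OFFSETS):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(dem) and 0 <= cc < len(dem[0]):
            # Diagonal neighbours are sqrt(2) cell widths away.
            distance = math.hypot(dr, dc)
            slope = (dem[r][c] - dem[rr][cc]) / distance
            if slope > best_slope:
                best_code, best_slope = code, slope
    return best_code


# Toy grid sloping toward the south-east corner (made-up values).
toy_dem = [
    [9.0, 8.0, 7.0],
    [8.0, 5.0, 4.0],
    [7.0, 4.0, 1.0],
]
centre_code = d8_direction(toy_dem, 1, 1)   # flows south-east
corner_code = d8_direction(toy_dem, 2, 2)   # lowest cell, no outflow

# Lowest neighbour (SE, 7.5) is not the steepest once distance is considered.
tilted_dem = [
    [10.0, 10.0, 10.0],
    [10.0, 10.0, 8.0],
    [10.0, 10.0, 7.5],
]
tilted_code = d8_direction(tilted_dem, 1, 1)  # flows east
```

The second toy grid shows why slope is used rather than raw elevation: the south-east neighbour is lower, but the eastern neighbour is closer, so the drop per unit distance is larger to the east.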

In [None]:
fig = plt.figure(figsize=(8,6))
fig.patch.set_alpha(0)

plt.imshow(fdir, extent=grid.extent, cmap='viridis', zorder=2)
boundaries = ([0] + sorted(list(dirmap)))
plt.colorbar(boundaries= boundaries,
             values=sorted(dirmap))
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Flow direction grid', size=14)
plt.grid(zorder=-1)
plt.tight_layout()

## Compute accumulation from flow direction

The following commands calculate the flow into each cell by identifying the upstream cells that flow into each downslope cell. In other words, each cell’s flow accumulation value is determined by the number of upstream cells flowing into it based on landscape topography.

Each cell in the resulting flow accumulation grid contains a value that represents the number of cells upstream from that particular cell. Cells with high flow accumulation values are typically located in areas of lower elevation, such as valleys or drainage channels, where water converges as it follows the landscape.

In [20]:
# Calculate flow accumulation
# --------------------------
acc = grid.accumulation(fdir, dirmap=dirmap)
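
As a rough illustration of what an accumulation pass does, the sketch below processes a toy flow direction grid in upstream-to-downstream order and passes each cell's count to the cell it drains into. It assumes the same direction codes as `dirmap` above, treats any other value (here 0) as an outlet, and counts each cell as contributing to itself. The names `accumulate`, `OFFSETS` and `toy_fdir` are hypothetical, and the ordering strategy is an assumption of this sketch rather than a description of the pysheds implementation.

```python
from collections import deque

DIRMAP = (64, 128, 1, 2, 4, 8, 16, 32)           # N, NE, E, SE, S, SW, W, NW
OFFSETS = ((-1, 0), (-1, 1), (0, 1), (1, 1),
           (1, 0), (1, -1), (0, -1), (-1, -1))   # (row, col) steps
STEP = dict(zip(DIRMAP, OFFSETS))


def accumulate(fdir):
    """Return a grid counting, for each cell, itself plus all upstream cells."""
    n_rows, n_cols = len(fdir), len(fdir[0])

    def downstream(r, c):
        step = STEP.get(fdir[r][c])
        if step is None:                 # not a direction code: outlet
            return None
        rr, cc = r + step[0], c + step[1]
        if 0 <= rr < n_rows and 0 <= cc < n_cols:
            return rr, cc
        return None                      # flows off the grid

    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]

    # Number of immediate upstream neighbours for every cell.
    indegree = [[0] * n_cols for _ in range(n_rows)]
    for r, c in cells:
        target = downstream(r, c)
        if target is not None:
            indegree[target[0]][target[1]] += 1

    acc = [[1] * n_cols for _ in range(n_rows)]
    # Start from cells nothing flows into and work downstream.
    queue = deque(cell for cell in cells if indegree[cell[0]][cell[1]] == 0)
    while queue:
        r, c = queue.popleft()
        target = downstream(r, c)
        if target is None:
            continue
        tr, tc = target
        acc[tr][tc] += acc[r][c]
        indegree[tr][tc] -= 1
        if indegree[tr][tc] == 0:
            queue.append(target)
    return acc


# Toy flow direction grid: every cell drains to the south-east corner.
toy_fdir = [
    [2, 2, 4],
    [2, 2, 4],
    [1, 1, 0],
]
toy_acc = accumulate(toy_fdir)
```

In this toy grid the outlet in the corner ends up with a count equal to the total number of cells, since every cell drains through it.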

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
fig.patch.set_alpha(0)
plt.grid(True, zorder=0)
im = ax.imshow(acc, extent=grid.extent, zorder=2,
               cmap='cubehelix',
               norm=colors.LogNorm(1, acc.max()),
               interpolation='bilinear')
plt.colorbar(im, ax=ax, label='Upstream Cells')
plt.title('Flow Accumulation', size=14)
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.tight_layout()

## Delineate catchment from flow direction

Pour point placement is an important step in the process of watershed delineation. A pour point should lie within an area of high flow accumulation because it is used to calculate the total contributing water flow to that given point. In many cases you will already have a file containing the locations of your pour points, whether they are sampling sites, hydrometric stations, or another data source. In other cases, it may be necessary or preferable to create pour points manually. The example below specifies a pour point manually and snaps it to the nearest cell with high flow accumulation.

In [None]:
# Delineate a catchment
# ---------------------
# Specify pour point
x, y = -97.294, 32.737

# Snap pour point to high accumulation cell
x_snap, y_snap = grid.snap_to_mask(acc > 1000, (x, y))

# Delineate the catchment
catch = grid.catchment(x=x_snap, y=y_snap, fdir=fdir, dirmap=dirmap, 
                       xytype='coordinate')

# Crop and plot the catchment
# ---------------------------
# Clip the bounding box to the catchment
grid.clip_to(catch)
clipped_catch = grid.view(catch)
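
The snapping step can be pictured as a nearest-neighbour search over the cells that pass the accumulation threshold. The sketch below works in row/column index space on a toy grid rather than in geographic coordinates; `snap_to_nearest`, the toy accumulation values, and the threshold of 2 are all made up for illustration and do not mirror the internals of `grid.snap_to_mask`.

```python
def snap_to_nearest(mask, row, col):
    """Return the (row, col) of the True cell in mask closest to (row, col)."""
    candidates = [(r, c)
                  for r, line in enumerate(mask)
                  for c, flag in enumerate(line) if flag]
    if not candidates:
        raise ValueError("mask contains no True cells to snap to")
    # Squared Euclidean distance in index space is enough for ranking.
    return min(candidates,
               key=lambda rc: (rc[0] - row) ** 2 + (rc[1] - col) ** 2)


# Toy accumulation grid (made-up values) and a high-accumulation mask.
toy_acc = [
    [1, 1, 1],
    [1, 2, 3],
    [1, 3, 9],
]
threshold = 2
toy_mask = [[value > threshold for value in line] for line in toy_acc]

# A pour point placed on a low-accumulation cell moves to a channel cell.
snapped = snap_to_nearest(toy_mask, 0, 2)
# A pour point already on a channel cell stays where it is.
unchanged = snap_to_nearest(toy_mask, 2, 2)
```

Snapping matters because a pour point that sits even one cell off the channel would delineate only the few cells draining to that spot rather than the intended catchment.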

In [None]:
# Plot the catchment
fig, ax = plt.subplots(figsize=(8,6))
fig.patch.set_alpha(0)

plt.grid(True, zorder=0)
im = ax.imshow(np.where(clipped_catch, clipped_catch, np.nan), extent=grid.extent,
               zorder=1, cmap='Greys_r')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Delineated Catchment', size=14)

## Extract the river network

In [None]:
# Extract river network
# ---------------------
branches = grid.extract_river_network(fdir, acc > 50, dirmap=dirmap)

In [None]:
sns.set_palette('husl')
fig, ax = plt.subplots(figsize=(8.5,6.5))

plt.xlim(grid.bbox[0], grid.bbox[2])
plt.ylim(grid.bbox[1], grid.bbox[3])
ax.set_aspect('equal')

for branch in branches['features']:
    line = np.asarray(branch['geometry']['coordinates'])
    plt.plot(line[:, 0], line[:, 1])
    
_ = plt.title('D8 channels', size=14)

## Compute flow distance from flow direction


In [None]:
# Calculate distance to outlet from each cell
# -------------------------------------------
dist = grid.distance_to_outlet(x=x_snap, y=y_snap, fdir=fdir, dirmap=dirmap,
                               xytype='coordinate')
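
One way to think about flow distance is as a walk upstream from the outlet: cells that drain directly into the outlet are one step away, cells draining into those are two steps away, and so on. The plain-Python sketch below does this on a toy flow direction grid, counting every step as one cell, the same unit as the colorbar label in the plot that follows. The names `steps_to_outlet`, `OFFSETS` and `toy_fdir` are hypothetical, and the breadth-first strategy is an assumption of this sketch, not a statement about how pysheds computes distances.

```python
from collections import deque

DIRMAP = (64, 128, 1, 2, 4, 8, 16, 32)           # N, NE, E, SE, S, SW, W, NW
OFFSETS = ((-1, 0), (-1, 1), (0, 1), (1, 1),
           (1, 0), (1, -1), (0, -1), (-1, -1))   # (row, col) steps


def steps_to_outlet(fdir, outlet):
    """Return a grid of step counts to outlet (None if a cell never reaches it)."""
    n_rows, n_cols = len(fdir), len(fdir[0])
    dist = [[None] * n_cols for _ in range(n_rows)]
    dist[outlet[0]][outlet[1]] = 0
    queue = deque([outlet])
    while queue:
        r, c = queue.popleft()
        for code, (dr, dc) in zip(DIRMAP, OFFSETS):
            # The neighbour that would land on (r, c) by moving (dr, dc).
            nr, nc = r - dr, c - dc
            if (0 <= nr < n_rows and 0 <= nc < n_cols
                    and fdir[nr][nc] == code
                    and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


# Toy flow direction grid: every cell drains to the south-east corner.
toy_fdir = [
    [2, 2, 4],
    [2, 2, 4],
    [1, 1, 0],
]
toy_dist = steps_to_outlet(toy_fdir, (2, 2))
```

Cells outside the outlet's catchment are never visited and keep the value `None`, which plays the role of a nodata value in this sketch.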

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
fig.patch.set_alpha(0)
plt.grid(True, zorder=0)
im = ax.imshow(dist, extent=grid.extent, zorder=2,
               cmap='cubehelix_r')
plt.colorbar(im, ax=ax, label='Distance to outlet (cells)')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Flow Distance', size=14)

## Add land cover data

In [None]:
# Combine with land cover data
# ---------------------
terrain = grid.read_raster('data/impervious_area/impervious_area.tiff', window=grid.bbox,
                           window_crs=grid.crs, nodata=0)
# Reproject data to grid's coordinate reference system
projected_terrain = terrain.to_crs(grid.crs)
# View data in catchment's spatial extent
catchment_terrain = grid.view(projected_terrain, nodata=np.nan)

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
fig.patch.set_alpha(0)
plt.grid(True, zorder=0)
im = ax.imshow(catchment_terrain, extent=grid.extent, zorder=2,
               cmap='bone')
plt.colorbar(im, ax=ax, label='Percent impervious area')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title('Percent impervious area', size=14)

## Add vector data

In [None]:
# Convert catchment raster to vector and combine with soils shapefile
# ---------------------
# Read soils shapefile
import pandas as pd
import geopandas as gpd
from shapely import geometry, ops
soils = gpd.read_file('data/soils/soils.shp')
soil_id = 'MUKEY'
# Convert catchment raster to vector geometry and find intersection
shapes = grid.polygonize()
catchment_polygon = ops.unary_union([geometry.shape(shape)
                                     for shape, value in shapes])
soils = soils[soils.intersects(catchment_polygon)]
catchment_soils = gpd.GeoDataFrame(soils[soil_id], 
                                   geometry=soils.intersection(catchment_polygon))
# Convert soil types to simple integer values
soil_types = np.unique(catchment_soils[soil_id])
soil_types = pd.Series(np.arange(soil_types.size), index=soil_types)
catchment_soils[soil_id] = catchment_soils[soil_id].map(soil_types)
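
The last three lines of the cell above replace each soil key with a small integer so that the polygons can later be burned into a raster. The same idea in plain Python, without pandas, looks roughly like the sketch below; `encode_categories` and the sample key values are invented for illustration only.

```python
def encode_categories(labels):
    """Map each distinct label to an integer code, in sorted label order."""
    codes = {label: index
             for index, label in enumerate(sorted(set(labels)))}
    return [codes[label] for label in labels], codes


# Made-up soil keys standing in for the values of the MUKEY column.
sample_keys = ['373', '120', '373', '58']
encoded, lookup = encode_categories(sample_keys)
```

Keeping the lookup dictionary around makes it possible to translate raster values back to the original soil keys later on.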

In [None]:
fig, ax = plt.subplots(figsize=(8, 6))
catchment_soils.plot(ax=ax, column=soil_id, categorical=True, cmap='terrain',
                     linewidth=0.5, edgecolor='k', alpha=1, aspect='equal')
ax.set_xlim(grid.bbox[0], grid.bbox[2])
ax.set_ylim(grid.bbox[1], grid.bbox[3])
plt.xlabel('Longitude')
plt.ylabel('Latitude')
ax.set_title('Soil types (vector)', size=14)

## Convert from vector to raster

In [None]:
soil_polygons = zip(catchment_soils.geometry.values, catchment_soils[soil_id].values)
soil_raster = grid.rasterize(soil_polygons, fill=np.nan)

In [None]:
fig, ax = plt.subplots(figsize=(8, 6))
plt.imshow(soil_raster, cmap='terrain', extent=grid.extent, zorder=1)
boundaries = np.unique(soil_raster[~np.isnan(soil_raster)]).astype(int)
#plt.colorbar(boundaries=boundaries,
#             values=boundaries)
ax.set_xlim(grid.bbox[0], grid.bbox[2])
ax.set_ylim(grid.bbox[1], grid.bbox[3])
plt.xlabel('Longitude')
plt.ylabel('Latitude')
ax.set_title('Soil types (raster)', size=14)