# Adding a column for channel gradient 

In [None]:
import pandas as pd
import geopandas as gpd
import cartopy as cp
import cartopy.crs as ccrs
import rasterio as rio
import matplotlib.pyplot as plt
import numpy as np

In [None]:
df = pd.read_csv("el_study_chi_data_map.csv")
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.longitude, df.latitude))

# We have to tell geopandas which geographic coordinate system the data are in by using something called an EPSG code.
# All major geographic projections and transformation systems have one of these codes (EPSG:4326 is latitude/longitude on WGS84).
gdf.crs = "EPSG:4326" 

# The head command shows you what is in the file.
gdf.head()

## Making new data columns: slope and smoothed slope

Okay, we have flow distance and elevation in this file, but we also want to look at the slope of the channel. To get the slope, we need to calculate the change in elevation over the change in flow distance. The mathematical operation for this is called the gradient (or, in the notation of derivatives, `dz/dx`).

The python package `numpy` has a built-in function for calculating the gradient (`np.gradient`), which we use below to get the slope along the channel. Passing the flow distance as the second argument tells `numpy` the spacing between the points.

In [None]:
z = gdf.elevation
x = gdf.flow_distance
S = np.gradient(np.asarray(z),np.asarray(x))
gdf["slope"] = S
gdf.head()
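To see what `np.gradient` does with unevenly spaced points, here is a minimal sketch on a made-up profile. The names `x_demo` and `z_demo` and all of the values are invented for illustration; they are not from the data file.

```python
import numpy as np

# Hypothetical profile: flow distances (m) that are NOT evenly spaced
x_demo = np.array([0.0, 10.0, 25.0, 40.0, 60.0])

# A perfectly straight channel with a slope of 0.05 m/m
z_demo = 0.05 * x_demo + 100.0

# Passing the coordinates as the second argument accounts for the uneven spacing
S_demo = np.gradient(z_demo, x_demo)

# Leaving the coordinates out assumes one unit between neighbouring points,
# so the result is a change per node rather than a change per metre
S_per_node = np.gradient(z_demo)

print(S_demo)
print(S_per_node)
```

With the coordinates supplied, every value of `S_demo` comes out as the true slope of the straight line. Without them the numbers vary with the point spacing, which is why the flow distance is passed in the cell above.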

In this notebook, I want to isolate one of the basins. `pandas` has lots of easy ways to isolate data.

Line 2 of the cell below shows the syntax: a condition inside square brackets keeps only the rows where that condition is true. The `basin_key` is one of the data columns in the dataset (you can always see the data columns by using the `.head()` command on your dataset).

In [None]:
# First let's isolate just one of these basins. The Killmade Burn is basin 4
gdf_b1 = gdf[(gdf['basin_key'] == 4)]
gdf_b1.head()
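If the filtering syntax looks odd, it can help to split it into two steps. This is a small sketch on an invented table (the `demo` dataframe and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical table with the same kind of columns as the channel data
demo = pd.DataFrame({
    "basin_key": [0, 4, 4, 1],
    "elevation": [310.0, 250.0, 245.0, 400.0],
})

# Step 1: the comparison gives one True/False value per row
mask = demo["basin_key"] == 4
print(mask.tolist())

# Step 2: indexing with that mask keeps only the rows marked True
subset = demo[mask]
print(subset)
```

Writing `demo[(demo['basin_key'] == 4)]` in one line does exactly the same thing as the two steps here.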

Okay, now let's plot the channel profile and the channel slope.

Much of the code below is for plotting, but there are some key lines (lines 6-8 of the cell):

    # The main stem channel is the one with the minimum source key in this basin
    min_source = np.amin(gdf_b1.source_key)
    gdf_b2 = gdf_b1[(gdf_b1['source_key'] == min_source)]
    
In these three lines, we are isolating a single `source_key`, which identifies one channel in the DEM. The basin has lots of channels, but we only want one. In this dataset, the minimum source key in a basin belongs to the longest channel.
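If you want to check which channel is the longest rather than take it on trust, one option is to group the rows by `source_key` and compare the range of flow distances. This is only a sketch on an invented table; the `demo` values are made up, and the column names are assumed to match the ones used in this notebook.

```python
import pandas as pd

# Hypothetical basin with three channels (source keys 3, 7 and 9)
demo = pd.DataFrame({
    "source_key":    [3, 3, 3, 3, 7, 7, 9],
    "flow_distance": [0.0, 50.0, 100.0, 150.0, 60.0, 110.0, 120.0],
})

# Number of channel nodes belonging to each source
counts = demo.groupby("source_key").size()

# Length of each channel: largest minus smallest flow distance
lengths = demo.groupby("source_key")["flow_distance"].agg(lambda d: d.max() - d.min())

# The source key of the longest channel
longest = lengths.idxmax()

print(counts)
print(lengths)
print(longest)
```

On your own data you would replace `demo` with `gdf_b1` and compare `longest` with `np.amin(gdf_b1.source_key)`.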

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]

# First let's isolate just one of these basins. The Killmade Burn is basin 4
gdf_b1 = gdf[(gdf['basin_key'] == 4)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(gdf_b1.source_key)
gdf_b2 = gdf_b1[(gdf_b1['source_key'] == min_source)]
#gdf_b2 = gdf_b1

# Now make channel profile plots
z = gdf_b2.elevation
x_locs = gdf_b2.flow_distance
S = gdf_b2.slope

# Create two subplots and unpack the output array immediately
plt.clf()
f, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter(x_locs, z,s = 0.2)
ax2.scatter(x_locs, S,s = 1)


ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Elevation (m)")

ax2.set_xlabel("Distance from outlet ($m$)")
ax2.set_ylabel("Slope (m/m)")

plt.tight_layout()

This slope (bottom figure) is very noisy. One way to deal with this is to smooth the data. We can smooth the data by running a moving window over it and doing some averaging inside the window.

Python has lots of tools for this. In this case I use a `rolling` window and I have picked various settings. You don't need to worry about this too much; the only number that you might want to play with is the first number after `rolling`, which is the number of datapoints in the window. The bigger this number, the more smoothed the data becomes.

In [None]:
gdf['slope_rolling'] = gdf.slope.rolling(40,win_type='hamming').mean()
gdf.head()
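To get a feel for the window size, here is a minimal sketch on synthetic noise. Everything in it is invented for illustration (the `noisy` series, the random seed, the noise level), and it uses a plain unweighted mean rather than the `hamming` window from the cell above. As far as I know, weighted windows such as `win_type='hamming'` need SciPy to be installed.

```python
import numpy as np
import pandas as pd

# Hypothetical noisy slope: a constant 0.05 m/m plus random scatter
rng = np.random.default_rng(0)
noisy = pd.Series(0.05 + rng.normal(0.0, 0.02, size=200))

window = 40

# Default behaviour: the window trails behind each point, so the first
# (window - 1) values have too few points and are set to NaN
trailing = noisy.rolling(window).mean()

# Centring the window and allowing partial windows avoids the leading NaNs
centred = noisy.rolling(window, center=True, min_periods=1).mean()

print(trailing.isna().sum())
print(noisy.std(), centred.std())
```

The scatter of the smoothed series is much smaller than that of the raw series. The `NaN` values at the start of the trailing version are the reason the first rows of `slope_rolling` are empty when you call `gdf.head()`.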

This plot will show the slope and the rolling slope, so you can see how the rolling window smooths the data. 

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]

# First let's isolate just one of these basins. The Killmade Burn is basin 4
gdf_b1 = gdf[(gdf['basin_key'] == 4)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(gdf_b1.source_key)
gdf_b2 = gdf_b1[(gdf_b1['source_key'] == min_source)]
#gdf_b2 = gdf_b1

# Now make channel profile plots
# To get a single data column from a pandas dataframe (in this case called gdf_b2) you just put
# a full stop and then the name of the column
# If your column has spaces or funny characters in the name you need to use the square brackets like this:
# z = gdf_b2["elevation"]
# Which is an alternative way of isolating data
z = gdf_b2.elevation
x_locs = gdf_b2.flow_distance
S = gdf_b2.slope
SR = gdf_b2.slope_rolling

# Create two subplots and unpack the output array immediately
plt.clf()
f, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter(x_locs, S,s = 1)
ax2.scatter(x_locs, SR,s = 1)


ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Slope (m/m)")

ax2.set_xlabel("Distance from outlet ($m$)")
ax2.set_ylabel("Rolling Slope (m/m)")

plt.tight_layout()

## Looking at the gradient and where the high gradient channels are along the channel profile

Okay, the rolling slope allows us to see some spikes in the gradient. Can we see this in the right places along the channel profile?

In [None]:
plt.rcParams['figure.figsize'] = [10, 5]

# First let's isolate just one of these basins. The Killmade Burn is basin 4
gdf_b1 = gdf[(gdf['basin_key'] == 4)]

# The main stem channel is the one with the minimum source key in this basin
# If you want to play with this a bit you can change the source number to look at different channels
min_source = np.amin(gdf_b1.source_key)
gdf_b2 = gdf_b1[(gdf_b1['source_key'] == min_source)]
#gdf_b2 = gdf_b1

# Now make channel profile plots
z = gdf_b2.elevation
x_locs = gdf_b2.flow_distance
S = gdf_b2.slope
SR = gdf_b2.slope_rolling

# Create a single plot, then add a second y-axis for the slope
plt.clf()
f, ax1 = plt.subplots(1, 1)
ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis

# Make the scatter plots
ax1.scatter(x_locs, z,s = 1, label='Longitudinal profile')
ax2.scatter(x_locs, SR,s = 1,c="r", label='Channel slope')

# Some code to make sure the legend renders on the same axis
lines, labels = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(lines + lines2, labels + labels2, loc=0)


ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Elevation (m)")

ax2.set_xlabel("Distance from outlet ($m$)")
ax2.set_ylabel("Rolling Slope (m/m)")

plt.tight_layout()

## Saving the channel gradients to csv

I am afraid it is a little bit complicated to save the smoothed channel gradients to csv.

Why? Because there are jumps in the flow distance at the tributary junctions.

So to get the channel gradients properly, we need to loop through each source key and get the gradients one by one.

In [None]:
# Isolate the Killmade Burn
gdf_b1 = gdf[(gdf['basin_key'] == 4)]

# Now print to csv
gdf_b1.to_csv("killmade_channel_with_gradient.csv")
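The cell above writes out the slope columns as they were calculated over the whole table. A minimal sketch of the per-source-key loop described in this section could look like the following. The function name `add_channel_slopes`, the `demo` table and the small window size are all invented for illustration, and the sketch uses a plain centred mean instead of the `hamming` window; treat it as one possible approach rather than the definitive one.

```python
import numpy as np
import pandas as pd

def add_channel_slopes(df, window=3):
    """Hypothetical helper: slope and smoothed slope, one source key at a time."""
    out = df.copy()
    out["slope"] = np.nan
    out["slope_rolling"] = np.nan
    for key, channel in out.groupby("source_key"):
        if len(channel) < 2:
            continue  # np.gradient needs at least two points
        channel = channel.sort_values("flow_distance")
        s = np.gradient(channel["elevation"].to_numpy(),
                        channel["flow_distance"].to_numpy())
        slope = pd.Series(s, index=channel.index)
        # The index labels put each value back on the row it came from
        out.loc[channel.index, "slope"] = slope
        out.loc[channel.index, "slope_rolling"] = (
            slope.rolling(window, center=True, min_periods=1).mean())
    return out

# Invented data: a main stem (source 0) and a steeper tributary (source 1)
demo = pd.DataFrame({
    "source_key":    [0, 0, 0, 0, 1, 1, 1],
    "flow_distance": [0.0, 10.0, 20.0, 30.0, 20.0, 30.0, 40.0],
    "elevation":     [100.0, 101.0, 102.0, 103.0, 102.0, 107.0, 112.0],
})
result = add_channel_slopes(demo)
print(result)
```

Because each channel is handled separately, the jump in flow distance where the table switches from the main stem to the tributary never enters the gradient calculation. On the real data you would pass in `gdf_b1`, choose a larger window, and then call `.to_csv()` on the result.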