<a href="https://colab.research.google.com/github/LSDtopotools/lsdtt_notebooks/blob/master/lsdtopotools/channel_extraction_and_drainage_area_examples/what_is_in_the_channel_files.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# What is in the channel files that lsdtopotools produces

Last updated by Simon M Mudd 09/05/2023

In this notebook we will use an example where you have collected some channel characteristics in the field and we want to know the drainage area of the points. This will include the simplest possible example where all we have is the location of the points. 

## Stuff we need to do if you are in colab (not required in the lsdtopotools pytools container)

**If you are in the `docker_lsdtt_pytools` docker container, you do not need to do any of this. 
The following is for executing this code in the google colab environment only.**

If you are in the docker container you can skip to the **First get data** section. 

First we install `lsdviztools`. This will take around a minute. It is important you do this before the `condacolab` step. 

In [None]:
!pip install lsdviztools &> /dev/null

Now we need to install lsdtopotools. We do this using something called `mamba`. To get `mamba` we install something called `condacolab`. 

In [None]:
!pip install -q condacolab
import condacolab
condacolab.install()

Alternatively we can do this by downloading the mamba installer directly, but this frequently leads to conflicts because you need to keep the installer URL up to date. `condacolab` does all that for you so you don't need to worry about it. 

In [None]:
#%%bash
#MINICONDA_INSTALLER_SCRIPT=Mambaforge-Linux-x86_64.sh
#MINICONDA_PREFIX=/usr/local
#wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh &> /dev/null
#chmod +x $MINICONDA_INSTALLER_SCRIPT
#./$MINICONDA_INSTALLER_SCRIPT -b -f -p $MINICONDA_PREFIX &> /dev/null

Now use mamba to install `lsdtopotools`. 
This step takes a bit over a minute. 

In [None]:
!mamba install -y lsdtopotools &> /dev/null

The next line tests to see if it worked. If you get some output asking for a parameter file then `lsdtopotools` is installed. This notebook was tested on version 0.8.

In [None]:
!lsdtt-basic-metrics -v
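If you would rather check the installation from Python, the sketch below looks for the command line tools on the system `PATH`. The helper name `lsdtt_tool_available` is made up for this example; it only tells you whether an executable with that name can be found, not which version it is.

```python
import shutil

def lsdtt_tool_available(tool_name="lsdtt-basic-metrics"):
    """Return True if an executable called tool_name is found on the PATH."""
    return shutil.which(tool_name) is not None

# Check the two command line tools used in this notebook
for tool in ["lsdtt-basic-metrics", "lsdtt-chi-mapping"]:
    status = "found" if lsdtt_tool_available(tool) else "NOT found"
    print(tool + ": " + status)
```

If a tool is reported as not found, the install step above probably failed and is worth running again without the `&> /dev/null` so you can see the messages.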

## First get data

Before we do anything, we need to import a few packages:

In [None]:
import lsdviztools.lsdbasemaptools as bmt
from lsdviztools.lsdplottingtools import lsdmap_gdalio as gio
import lsdviztools.lsdmapwrappers as lsdmw

Now we need some data. We are going to download it using the opentopography scraper that is included with `lsdviztools`. You will need to get an opentopography.org account and copy in your API key.

You can sign up to an opentopography.org account here: https://portal.opentopography.org/myopentopo 

Before I actually do anything I am going to set up some filenames:

In [None]:
Dataset_prefix = "RioAguas"
source_name = "COP30"

r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"

DataDirectory = "./"
Base_file = r_prefix

Now let's grab the data. If you want to do this yourself for a new area, just choose your own lower left and upper right coordinates of your site.

In [None]:
# YOU NEED TO PUT YOUR API KEY IN A FILE
your_OT_api_key_file = "my_OT_api_key.txt"

with open(your_OT_api_key_file, 'r') as file:
    print("I am reading your OT API key from the file "+your_OT_api_key_file)
    api_key = file.read().rstrip()
    print("Your api key starts with: "+api_key[0:4])

SB_DEM = bmt.ot_scraper(source = source_name,
                        lower_left_coordinates = [36.97524478026287, -2.3631792251411805], 
                        upper_right_coordinates = [37.3200098350942, -1.7962073552766233],
                        prefix = Dataset_prefix, 
                        api_key_file = your_OT_api_key_file)
SB_DEM.print_parameters()
SB_DEM.download_pythonic()
DataDirectory = "./"
Fname = Dataset_prefix+"_"+source_name+".tif"
gio.convert4lsdtt(DataDirectory,Fname)
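If you choose your own corners, it is easy to swap them or mix up latitude and longitude. The sketch below is a small sanity check you could run before calling the scraper. It assumes the corners are given as `[latitude, longitude]` in decimal degrees, which is how the numbers in the cell above read, and `check_bounding_box` is a name invented for this example rather than part of `lsdviztools`.

```python
def check_bounding_box(lower_left, upper_right):
    """Check a bounding box given as [latitude, longitude] corners.

    Returns the (latitude, longitude) extent in degrees,
    or raises a ValueError if the corners do not make sense.
    """
    ll_lat, ll_lon = lower_left
    ur_lat, ur_lon = upper_right

    if not (-90 <= ll_lat <= 90 and -90 <= ur_lat <= 90):
        raise ValueError("Latitudes must be between -90 and 90")
    if not (-180 <= ll_lon <= 180 and -180 <= ur_lon <= 180):
        raise ValueError("Longitudes must be between -180 and 180")
    if ll_lat >= ur_lat:
        raise ValueError("The lower left latitude must be south of the upper right")
    if ll_lon >= ur_lon:
        raise ValueError("The lower left longitude must be west of the upper right")

    return ur_lat - ll_lat, ur_lon - ll_lon

# The corners used in this notebook
extent = check_bounding_box([36.97524478026287, -2.3631792251411805],
                            [37.3200098350942, -1.7962073552766233])
print("Extent in degrees (latitude, longitude):", extent)
```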

## Look at the hillshade

Right, let's see what this place looks like:

In [None]:
lsdtt_parameters = {"write_hillshade" : "true"}
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

In [None]:
%matplotlib inline
Base_file = r_prefix
DataDirectory = "./"
this_img = lsdmw.SimpleHillshade(DataDirectory,Base_file,cmap="gist_earth", save_fig=False, size_format="geomorphology",dpi=500)

## Now get a single basin

I add a basin outlet into a pandas dataframe and then copy this to a file. 
The points below are obtained just by clicking in google maps and copying the resulting lat-long into the below code. 

In [None]:
# Import pandas library
import pandas as pd

data = [ [37.15674383710805, -1.9049454817508027]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['latitude', 'longitude'])

df.to_csv("basin_outlets.csv",index=False)
df.head()

We can use the linux `cat` command to make sure the file is what we expect.

In [None]:
!cat basin_outlets.csv

## The different kinds of channel files

You can get channels in a number of ways from `lsdtopotools`.

The following are available in `lsdtt-basic-metrics`:
    
* `print_channels_to_csv:  true` This prints basic information about the channels to a csv file.
* `print_chi_data_maps: true` This includes the chi metric and drainage area
* `use_extended_channel_data: true` This adds some additional data columns to the above two csv files.

You can also control the extent of the channel network with `threshold_contributing_pixels`; a bigger number means a shorter channel network. 
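To get a feel for what a pixel threshold means on the ground, you can convert it to a drainage area. The sketch below assumes square pixels with a 30 m side, which is the nominal resolution of the COP30 data; check the resolution of your own raster before relying on the number. The function name `threshold_area_km2` is made up for this example.

```python
def threshold_area_km2(threshold_pixels, pixel_size_m):
    """Convert a contributing pixel threshold into a drainage area in km^2."""
    area_m2 = threshold_pixels * pixel_size_m ** 2
    return area_m2 / 1.0e6

# 2000 pixels of 30 m by 30 m (assumed resolution)
print(threshold_area_km2(2000, 30.0))   # 1.8
```

So with these assumptions a channel only begins once about 1.8 km$^2$ of the landscape drains to a pixel.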

Let's try these methods out. 

### `print_channels_to_csv`

First we use the `print_channels_to_csv`. This will produce a channel csv with *CN* in the filename:

In [None]:
## Get the basins and the channel profile
lsdtt_parameters = {"print_channels_to_csv" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv",
                    "threshold_contributing_pixels" : "2000"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

We can look at the data using `pandas`

In [None]:
# Import pandas library
import pandas as pd

df = pd.read_csv("RioAguas_COP30_UTM_CN.csv")
df.head()

So this file includes some flow routing information about the network. It has spatial coordinates (`latitude` and `longitude`) and then various information about the flow routing. Each junction is numbered (this includes sources). The `Junction Index` is the number of the junction upstream. There is also the `receiver_JI` which is the downstream junction to which the channel flows. In addition there are `NI` and `reciever_NI` columns. The `NI` is the node index, which is a number assigned to the pixel (in computational terms it is the index into the flattened array that holds the data) and there is the node index of the receiver pixel (`reciever_NI`). There is also the Strahler stream order (`Stream Order`) of the pixel. 
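Because every row stores both its own node index and the node index of its receiver, you can follow a channel downstream by hopping from receiver to receiver. The sketch below does this on a tiny made-up table. The node numbers are invented, the column names are passed in as arguments (check `df.columns` for the exact spelling in your file), and it assumes that the walk ends either at a node that is its own receiver or at a node that is not in the table.

```python
import pandas as pd

# A toy channel table with invented node indices
channels = pd.DataFrame({
    "NI":          [10, 11, 12, 20, 21],
    "receiver_NI": [11, 12, 12, 21, 11],
})

def trace_downstream(channel_table, start_node, ni_col="NI", receiver_col="receiver_NI"):
    """Return the list of node indices from start_node down to the end of the table."""
    receiver_of = dict(zip(channel_table[ni_col].tolist(),
                           channel_table[receiver_col].tolist()))
    path = [start_node]
    current = start_node
    # Stop when the node is its own receiver or is not in the table
    while current in receiver_of and receiver_of[current] != current:
        current = receiver_of[current]
        if current in path:
            break   # guard against accidental loops in the data
        path.append(current)
    return path

print(trace_downstream(channels, 20))   # [20, 21, 11, 12]
```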

**This does not include drainage area because this routine does not check if the basin is complete.**

This file is generally only used for plotting the location of the channel. In practice, we really only use the alternative option `print_chi_data_maps`. But we will show you what that does in a second. 

First though, let's look at what happens when you extend the data:

In [None]:
## Get the basins and the channel profile
lsdtt_parameters = {"print_channels_to_csv" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv",
                    "threshold_contributing_pixels" : "2000",
                    "use_extended_channel_data" : "true"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

In [None]:
!ls *.csv

In [None]:
df = pd.read_csv("RioAguas_COP30_UTM_CN.csv")
df.head()

So all this really does is add the elevation to the same file. 

### Very basic plotting of where the points are using `folium`

`lsdviztools` has a number of plotting routines for making pretty, publication-ready figures. But if you just want to see where the data are you can use the package `folium` in conjunction with `pandas`. Note this takes a wee while since there are a lot of points.

In [None]:
# This is for the area threshold points

import folium

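#create a map
#Recent folium releases no longer include the 'Stamen Terrain' tiles,
#so we use the default OpenStreetMap tiles here
this_map = folium.Map(prefer_canvas=True, tiles='OpenStreetMap')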

def plotDot(point):
    '''input: series that contains a numeric named latitude and a numeric named longitude
    this function creates a CircleMarker and adds it to your this_map'''
    folium.CircleMarker(location=[point.latitude, point.longitude],
                        radius=2,
                        weight=5).add_to(this_map)

#use df.apply(,axis=1) to "iterate" through every row in your dataframe
df.apply(plotDot, axis = 1)


#Set the zoom to the maximum possible
this_map.fit_bounds(this_map.get_bounds())

#Save the map to an HTML file
this_map.save('simple_dot_plot.html')

this_map
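If the map is too slow, one option is to plot only a subset of the points. The sketch below keeps every n-th row so that no more than a chosen number of points remain. The helper `thin_points` and the toy table are made up for this example; this is simple thinning for a quick look and not something you would want for a final figure.

```python
import pandas as pd

def thin_points(points, max_points=2000):
    """Return at most max_points rows by keeping every n-th row."""
    if len(points) <= max_points:
        return points
    step = -(-len(points) // max_points)   # ceiling division
    return points.iloc[::step]

# A toy table of 10000 points standing in for the channel csv
toy = pd.DataFrame({"latitude":  [37.0 + 0.0001 * i for i in range(10000)],
                    "longitude": [-2.0 + 0.0001 * i for i in range(10000)]})
print(len(thin_points(toy, 2000)))   # 2000
```

In the cell above you would then call `thin_points(df).apply(plotDot, axis = 1)` instead of applying `plotDot` to every row.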

### Now for `print_chi_data_maps`

**Firstly, you need to know what the chi coordinate is. If you know this already, skip ahead to the next cell.**

To understand it, we start with Morisawa's law that says slope ($S$) is related to drainage area ($A$) via two empirical constants, the concavity index ($\theta$) and the steepness index ($k_s$):

$S = k_s A^{-\theta}$

$S$ is the same as $dz/dx$, the derivative of elevation. $x$ is the flow distance. If we integrate this equation we get:
    
$z(x) = z(x_b) + \Big(\frac{k_s}{{A_0}^{\theta}}\Big) \int_{x_b}^{x} \Big(\frac{A_0}{A(x)}\Big)^{\theta} dx$

where $x_b$ is some arbitrary base level, and $A_0$ is a reference drainage area (this is to ensure the integrand is dimensionless). We almost always set $A_0$ to 1 $m^2$. The integral seems annoying and messy, but it is actually fairly easy to calculate from topographic data: you are just summing $(A_0/A)^{\theta}$, multiplied by the distance between pixels, along the length of the channel. You will never need to do this yourself because there is software for calculating the integral. It also has dimensions of length. So we can define a coordinate, $\chi$:

$\chi = \int_{x_b}^{x} \Big(\frac{A_0}{A(x)}\Big)^{\theta} dx$

which is just that integral, but it looks nicer in the equation:

$z(x) = z(x_b) + \Big(\frac{k_s}{{A_0}^{\theta}}\Big) \chi$

Now, have a look at that last equation. This is the equation of a line! 

From your school maths you might remember a line being written as $y = mx+b$. In this case the gradient of the line is $\Big(\frac{k_s}{{A_0}^{\theta}}\Big)$.

If $A_0$ = 1 $m^2$, then the gradient of the line is the channel steepness index ($k_s$)! This happens to be a very convenient way of extracting the channel steepness, and you can do it with much better accuracy than with slope-area plots. 
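To see this working, here is a minimal sketch on a synthetic channel. Every number in it is invented for illustration: the drainage area is assumed to decay exponentially upstream, the elevations are built from $S = k_s A^{-\theta}$ for that drainage area, and $\chi$ is then calculated numerically with the trapezium rule. This is not how `lsdtopotools` does the calculation internally; it is only meant to show that the gradient of elevation against $\chi$ gives back the steepness index.

```python
import numpy as np

# Hypothetical values chosen for illustration only
theta = 0.45          # concavity index
k_s = 50.0            # steepness index used to build the synthetic profile
A_0 = 1.0             # reference drainage area in m^2
z_b = 200.0           # elevation at the base level in m

# Flow distance from the outlet, with a 30 m spacing between nodes
x = 30.0 * np.arange(668)

# Assumed drainage area that decays exponentially upstream
A_outlet = 1.0e8
decay_length = 5000.0
A = A_outlet * np.exp(-x / decay_length)

# Elevation from integrating S = k_s * A**(-theta) analytically for this A(x)
z = z_b + k_s * A_outlet ** (-theta) * (decay_length / theta) * (
    np.exp(theta * x / decay_length) - 1.0)

# chi from the trapezium rule applied to (A_0 / A)**theta
integrand = (A_0 / A) ** theta
segments = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)
chi = np.concatenate(([0.0], np.cumsum(segments)))

# Straight-line fit of elevation against chi
gradient, intercept = np.polyfit(chi, z, 1)
print("gradient of z against chi: {:.3f}".format(gradient))
print("k_s / A_0**theta:          {:.3f}".format(k_s / A_0 ** theta))
```

The two printed numbers should agree closely, and the intercept should be close to the base level elevation `z_b`. Real channels are noisier than this, which is why the segmentation approach described later in the notebook exists.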

If you want to read more about this technique, which is now widely used in geomorphology, you can read this paper:

Perron, J.T., Royden, L., 2013. An integral approach to bedrock river profile analysis. Earth Surface Processes and Landforms 38, 570–576. https://doi.org/10.1002/esp.3302


Note: $\chi$ is the Greek letter chi, which a Greek person would pronounce a bit like the English "he". But because most geomorphologists did not benefit from a classical education they pronounce this letter "kai".

#### Why `print_chi_data_maps` is more restrictive than `print_channels_to_csv`

The `print_chi_data_maps` routine is more restrictive than `print_channels_to_csv` because it enforces complete basins. The reason is that you must calculate the chi coordinate using the drainage area, and if the drainage area is wrong then the chi coordinate is wrong. 

**A very common error in `lsdtopotools` is searching for channels in an incomplete basin. The program will reject the basin and you will not extract any channels.** 

Now, let's see what `print_chi_data_maps` does:

In [None]:
## Get the basins and the channel profile
lsdtt_parameters = {"print_chi_data_maps" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv",
                    "threshold_contributing_pixels" : "2000"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

This prints a file with `_chi_data_map.csv` in the filename. Let's load it:

In [None]:
df = pd.read_csv("RioAguas_COP30_UTM_chi_data_map.csv")
df.head()

So, what is all this? 

* `latitude` : the latitude in decimal degrees
* `longitude` : the longitude in decimal degrees   
* `chi` : the chi coordinate in metres
* `elevation` : the elevation in the same units as the DEM (usually metres)
* `flow_distance` : the flow distance **from the outlet**.
* `drainage_area` : the drainage area of the pixel in m$^2$
* `source_key` : each source pixel (or channel head) gets an integer key. Channels are coded by their sources. Longer channels overwrite shorter channels so the main stem channel (here defined as the longest channel) will have all pixels with the same `source_key`. 
* `basin_key` : Each basin is given a number denoted by the `basin_key`. The longest channel in each basin has the smallest source key in that basin. 
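Since the longest channel in each basin has the smallest source key in that basin, you can pull out the main stems with a `groupby`. The sketch below uses a tiny invented table with the same column names as the list above, so you can see what happens before trying it on the real csv. The function name `select_main_stems` is made up for this example.

```python
import pandas as pd

# A toy chi data map with two basins and invented values
chi_data = pd.DataFrame({
    "basin_key":     [0, 0, 0, 0, 1, 1, 1],
    "source_key":    [0, 0, 1, 2, 3, 3, 4],
    "flow_distance": [0.0, 30.0, 60.0, 60.0, 0.0, 30.0, 60.0],
})

def select_main_stems(chi_table):
    """Keep only the rows whose source_key is the smallest in their basin."""
    main_source = chi_table.groupby("basin_key")["source_key"].transform("min")
    return chi_table[chi_table["source_key"] == main_source]

main_stems = select_main_stems(chi_data)
print(main_stems)
```

With the real data you would call `select_main_stems(df)` after loading the `_chi_data_map.csv` file.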

Now let's look at the extended version of this csv:

In [None]:
## Get the basins and the channel profile
lsdtt_parameters = {"print_chi_data_maps" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv",
                    "threshold_contributing_pixels" : "2000",
                    "use_extended_channel_data" : "true"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

In [None]:
!ls *.csv

In [None]:
df = pd.read_csv("RioAguas_COP30_UTM_chi_data_map.csv")
df.head()

The extended data has many more data elements. These are repeated from the channel_data csv. But perhaps most useful is the `stream_order` data entry. 

## What about channel steepness?

If you want to be more complicated, you can use `lsdtt-chi-mapping` to extract the channel steepness. 

This uses the segmentation algorithm from Mudd et al 2014 (JGR-ES, https://doi.org/10.1002/2013JF002981).

To turn it on you need `"print_segmented_M_chi_map_to_csv" : "true"`

Here we set the concavity index (`m_over_n`) to `0.45`. You actually need to set this for `print_chi_data_maps` as well but the default is 0.45. 

**Warning: this is computationally intensive and might take a while.**

In [None]:
lsdtt_parameters = {"m_over_n" : "0.45",
                    "print_segmented_M_chi_map_to_csv" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()