<a href="https://colab.research.google.com/github/LSDtopotools/lsdtt_notebooks/blob/master/lsdtopotools/basic_examples/getting_a_channel_profile.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting a channel profile

Last updated by Simon M Mudd on 08/05/2023

You just want to get a channel profile! Or maybe you have some data and want to get information about the channel where you collected the data. 

This takes you through downloading the data and selecting a channel. 

You can either select a basin or a starting point. This shows you both ways to do it. 

This notebook will also show you how to take some field sites with geospatial locations and map channel information such as elevation and drainage area onto those points. 

## If you are on colab

**If you are in the `docker_lsdtt_pytools` docker container, you do not need to do any of this. 
The following is for executing this code in the google colab environment only.**

If you are in the docker container you can skip to the **First get data** section. 

First we install `lsdtopotools`. The first line downloads the package and the second installs it. The `/dev/null` stuff is just to stop the notebook printing a bunch of text to screen.  

In [None]:
!wget https://pkgs.geos.ed.ac.uk/geos-jammy/pool/world/l/lsdtopotools2/lsdtopotools2_0.9-1geos~22.04.1_amd64.deb  &> /dev/null
!apt install ./lsdtopotools2_0.9-1geos~22.04.1_amd64.deb  &> /dev/null

The next line tests to see if it worked. If you get some output asking for a parameter file then `lsdtopotools` is installed. This notebook was tested on version 0.9.

In [None]:
!lsdtt-basic-metrics -v

Now we install `lsdviztools`:

In [None]:
!pip install lsdviztools  &> /dev/null

## First get data

First we need to download some data. 

We are going to get some data from the centre of Lesotho, in some small catchments draining to the Orange River. 

We are going to download data using the opentopography scraper that is included with `lsdviztools`. You will need to get an opentopography.org account and copy in your API key. 

You can sign up to an opentopography.org account here: https://portal.opentopography.org/myopentopo 

In [None]:
import lsdviztools.lsdbasemaptools as bmt
from lsdviztools.lsdplottingtools import lsdmap_gdalio as gio

# YOU NEED TO PUT YOUR API KEY IN A FILE
your_OT_api_key_file = "my_OT_api_key.txt"

with open(your_OT_api_key_file, 'r') as file:
    print("I am reading your OT API key from the file "+your_OT_api_key_file)
    api_key = file.read().rstrip()
    print("Your api key starts with: "+api_key[0:4])

Dataset_prefix = "Lesotho"
source_name = "COP30"

# The scraper object for the Lesotho DEM
Lesotho_DEM = bmt.ot_scraper(source = source_name,
                        lower_left_coordinates = [-29.986795303183285, 28.210294055430822], 
                        upper_right_coordinates = [-29.546820300795922, 28.636351905601636],
                        prefix = Dataset_prefix, 
                        api_key_file = your_OT_api_key_file)
Lesotho_DEM.print_parameters()
Lesotho_DEM.download_pythonic()
DataDirectory = "./"
Fname = Dataset_prefix+"_"+source_name+".tif"
gio.convert4lsdtt(DataDirectory,Fname)
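
The corner values in the cell above are written as latitude then longitude. Before downloading, it can be worth checking that the corners are the right way around and getting a feel for how big the DEM will be. The following is a minimal sketch: `bounding_box_size_km` is a hypothetical helper (it is not part of `lsdviztools`) and it assumes a spherical Earth, so the sizes are rough estimates only.

```python
import math

def bounding_box_size_km(lower_left, upper_right):
    """Approximate (north-south, east-west) size in km of a [lat, lon] box.

    Hypothetical helper, not part of lsdviztools. Uses a spherical Earth,
    so treat the result as a rough estimate only.
    """
    ll_lat, ll_lon = lower_left
    ur_lat, ur_lon = upper_right
    if not (ll_lat < ur_lat and ll_lon < ur_lon):
        raise ValueError("lower-left corner must be south-west of upper-right corner")
    km_per_degree = math.pi * 6371.0 / 180.0   # about 111.2 km per degree of latitude
    mid_lat = math.radians(0.5 * (ll_lat + ur_lat))
    height_km = (ur_lat - ll_lat) * km_per_degree
    # Degrees of longitude get shorter away from the equator
    width_km = (ur_lon - ll_lon) * km_per_degree * math.cos(mid_lat)
    return height_km, width_km

height_km, width_km = bounding_box_size_km(
    [-29.986795303183285, 28.210294055430822],
    [-29.546820300795922, 28.636351905601636])
print(f"The DEM is roughly {height_km:.0f} km north-south by {width_km:.0f} km east-west")
```

If the corners are swapped the function raises a `ValueError` rather than letting you request an empty or inverted box.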

Let's check to see what the filenames we generated are:

In [None]:
!ls Lesotho*

## Look at the hillshade

Right, let's see what this place looks like:

In [None]:
import lsdviztools.lsdmapwrappers as lsdmw

In [None]:
lsdtt_parameters = {"write_hillshade" : "true"}

Dataset_prefix = "Lesotho"
source_name = "COP30"

r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

In [None]:
%matplotlib inline
Base_file = r_prefix
DataDirectory = "./"
this_img = lsdmw.SimpleHillshade(DataDirectory,Base_file,cmap="gist_earth", save_fig=False, size_format="geomorphology",dpi=500)

## Get a channel by basin

Let's get a channel by basin. I will make a pandas dataframe with the outlet location (I get this from google maps: just right click where you want it and copy the lat-long) and then create a csv file. 

In [None]:
# Import pandas library
import pandas as pd

data = [ [-29.74268168812215, 28.359698313955146]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['latitude', 'longitude'])

df.to_csv("basin_outlets.csv",index=False)
df.head()

From this outlet we will extract the basin and also get a channel profile. 
There are various options but the one that includes channel profile alongside drainage area and the chi coordinate (see https://onlinelibrary.wiley.com/doi/abs/10.1002/esp.3302) is `print_chi_data_maps`:

**WARNING** This will not accept basins that touch the edge of the DEM. If your channel joins a bigger river whose basin drains to the edge, put your outlet point a little upstream of that tributary junction. 

In [None]:
## Get the basins and the channel profile
lsdtt_parameters = {"print_basin_raster" : "true",
                    "print_chi_data_maps" : "true",
                    "get_basins_from_outlets" : "true",
                    "basin_outlet_csv" : "basin_outlets.csv"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-chi-mapping", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

In [None]:
!lsdtt-chi-mapping Test_01.driver

Okay, let's have a look at the basin we got:

In [None]:
%%capture             
Base_file = r_prefix
basins_img = lsdmw.PrintBasins_Complex(DataDirectory,Base_file,cmap="gist_earth", 
                             size_format="geomorphology",dpi=600, save_fig = True)

In [None]:
print(basins_img)
from IPython.display import display, Image
display(Image(filename=basins_img, width=800))

Okay, now the channel profile is in a csv file:

In [None]:
!ls Lesotho*csv

## Plot channels

We can plot the channels using the command line script from `lsdviztools`:

In [None]:
!lsdtt_plotbasicrasters -dir ./ -fname Lesotho_COP30_UTM -PCh true

This puts the plot in a subdirectory called `raster_plots`:

In [None]:
!ls raster_plots

In [None]:
from IPython.display import display, Image
display(Image(filename="raster_plots/Lesotho_COP30_UTM_ChElevation_chi_channels_and_basins.png", width=800))

We can also do that natively here:

In [None]:
%matplotlib inline
this_curv_img = lsdmw.PrintChannelsAndBasins(DataDirectory,Base_file,
                                       add_basin_labels = True, cmap = "jet", 
                                       size_format = "ESURF", fig_format = "png", 
                                       dpi = 300, save_fig = False)

Or we could load the data as a pandas dataframe and plot the profile:

In [None]:
df = pd.read_csv("Lesotho_COP30_UTM_chi_data_map.csv")
df.head()

In [None]:
import matplotlib.pyplot as plt
plt.scatter(df.flow_distance,df.elevation)

In [None]:
plt.scatter(df.chi,df.elevation)
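
The chi coordinate in the plot above is an integral of drainage area along the channel, starting at the outlet: $\chi = \int_{x_b}^{x} (A_0/A(x'))^{m/n} dx'$. If you want to see how that works, here is a minimal sketch that integrates chi along a single channel with the trapezoid rule. The values `A_0 = 1` and `m_over_n = 0.45` and the channel data are made up for illustration; `lsdtopotools` uses its own parameter values and handles tributaries, so do not expect these numbers to match the csv file.

```python
def chi_along_channel(flow_distance, drainage_area, A_0=1.0, m_over_n=0.45):
    """Integrate chi upstream from the outlet along one channel.

    flow_distance: distances from the outlet in metres, sorted from the
                   outlet (smallest) to the channel head (largest).
    drainage_area: drainage area in square metres at the same nodes.
    A_0 and m_over_n are illustrative values, not read from lsdtopotools.
    """
    chi = [0.0]
    for i in range(1, len(flow_distance)):
        dx = flow_distance[i] - flow_distance[i - 1]
        # Trapezoid rule on the integrand (A_0 / A)^(m/n)
        f_prev = (A_0 / drainage_area[i - 1]) ** m_over_n
        f_here = (A_0 / drainage_area[i]) ** m_over_n
        chi.append(chi[-1] + 0.5 * (f_prev + f_here) * dx)
    return chi

# A made-up channel: drainage area shrinks towards the channel head
distance = [0.0, 100.0, 200.0, 300.0]
area = [4.0e6, 3.0e6, 1.5e6, 5.0e5]
chi = chi_along_channel(distance, area)
print(chi)
```

Because the integrand gets bigger as drainage area gets smaller, chi grows faster near the channel head than near the outlet.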

## What if I don't want a basin but instead want to select a source?

In some cases your channel is cut off by the edge of the DEM but you still want a profile. The basin selection tools look to calculate drainage area and this will be incorrect in an incomplete basin. But you can select a channel by source, allowing you to just get elevation downstream. We need to tell `lsdtopotools` where the source point is, and we do that with a csv file:

In [None]:
# Import pandas library
import pandas as pd

data = [ [-29.574418486660242, 28.262105221820292]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['latitude', 'longitude'])

df.to_csv("channel_source.csv",index=False)
df.head()

In [None]:
lsdtt_parameters = {"extract_single_channel" : "true", 
                    "channel_source_fname" : "channel_source.csv"}

Dataset_prefix = "Lesotho"
source_name = "COP30"

r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

The file from this goes into something called `single_channel_nodes.csv`. We can load it into a pandas dataframe and have a look at it.

In [None]:
df = pd.read_csv("single_channel_nodes.csv")
df.head()

To plot columns from a data frame you need to know the column headers. Sometimes these have white space. So if you use the `list` function you can get the exact headers in order to plot the data. 

In [None]:
list(df)

In [None]:
import matplotlib.pyplot as plt
#plt.scatter(df["flow distance(m)"],df["elevation(m)"])
plt.scatter(df[" flow distance(m)"],df[" elevation(m)"])
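
If you would rather not remember the leading spaces, you can strip the whitespace from the headers once when you load the data. Here is a minimal sketch using the `csv` module from the standard library. The sample text is made up (including the `id` column) and only mimics headers with leading spaces. In pandas, `df.columns = df.columns.str.strip()` should do the same job on the dataframe you already loaded.

```python
import csv
import io

# A made-up sample that mimics headers with leading spaces
sample = "id, flow distance(m), elevation(m)\n0, 1200.0, 2150.5\n1, 1170.0, 2148.9\n"

reader = csv.DictReader(io.StringIO(sample))
rows = []
for row in reader:
    # Strip whitespace from every header and convert the values to numbers
    rows.append({key.strip(): float(value) for key, value in row.items()})

print(list(rows[0]))
flow_distance = [row["flow distance(m)"] for row in rows]
elevation = [row["elevation(m)"] for row in rows]
```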

## Another example: getting channel characteristics from specific sites

Here we will download data from another place, this time in Scotland. 

In [None]:
lower_left_coord = [54.725951808268405, -4.3164633347178825]
upper_right_coord = [55.70870947547077, -3.553609989341891]

import lsdviztools.lsdbasemaptools as bmt
from lsdviztools.lsdplottingtools import lsdmap_gdalio as gio

# YOU NEED TO PUT YOUR API KEY IN A FILE
your_OT_api_key_file = "my_OT_api_key.txt"

with open(your_OT_api_key_file, 'r') as file:
    print("I am reading your OT API key from the file "+your_OT_api_key_file)
    api_key = file.read().rstrip()
    print("Your api key starts with: "+api_key[0:4])

Dataset_prefix = "Nith"
source_name = "COP30"

Nith_DEM = bmt.ot_scraper(source = source_name,
                        lower_left_coordinates = lower_left_coord, 
                        upper_right_coordinates = upper_right_coord,
                        prefix = Dataset_prefix, 
                        api_key_file = your_OT_api_key_file)
Nith_DEM.print_parameters()
Nith_DEM.download_pythonic()
DataDirectory = "./"
Fname = Dataset_prefix+"_"+source_name+".tif"
gio.convert4lsdtt(DataDirectory,Fname)
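
The lines that read the API key are repeated for each download. One option is to wrap them in a small function that also gives a clearer message if the file is missing or empty. This is only a sketch: `read_api_key` is a hypothetical helper and is not part of `lsdviztools`.

```python
from pathlib import Path

def read_api_key(key_file="my_OT_api_key.txt"):
    """Return the OpenTopography API key stored in a text file.

    Hypothetical helper, not part of lsdviztools.
    """
    path = Path(key_file)
    if not path.is_file():
        raise FileNotFoundError(
            f"Could not find {key_file}: create it and paste your API key into it")
    # Remove the trailing newline and any stray spaces around the key
    api_key = path.read_text().strip()
    if not api_key:
        raise ValueError(f"{key_file} is empty")
    return api_key
```

You would then call `api_key = read_api_key(your_OT_api_key_file)` in place of the `with open(...)` block.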

We will extract a single basin, whose outlet is near the mouth of the River Nith.

In [None]:
# Import pandas library
import pandas as pd

data = [ [55.03554646506262, -3.60572251060801]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['latitude', 'longitude'])

df.to_csv("basin_outlets.csv",index=False)
df.head()

Now we will get the basin raster, print the hillshade, and get the channel data using the keyword `print_chi_data_maps`. I am also going to control how many tributaries I get by setting the threshold number of pixels that must drain into a given pixel to form a channel. Here I use 5000 pixels. This is a 30 m DEM, so each pixel is 900 m$^2$, and the channel sources will be pixels with 4.5 million square metres (4.5 square kilometres) of drainage area. I set this with the keyword `"threshold_contributing_pixels" : "5000"`.
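
If you want to try a different threshold, the arithmetic is easy to wrap in a couple of functions. This is a small sketch that assumes square pixels; the function names are made up for this notebook.

```python
def threshold_area_km2(threshold_pixels, pixel_size_m):
    """Drainage area (km^2) implied by a contributing-pixel threshold."""
    pixel_area_m2 = pixel_size_m ** 2
    return threshold_pixels * pixel_area_m2 / 1.0e6

def pixels_for_area_km2(area_km2, pixel_size_m):
    """Nearest whole number of pixels that gives a target drainage area."""
    return round(area_km2 * 1.0e6 / pixel_size_m ** 2)

print(threshold_area_km2(5000, 30))    # 4.5
print(pixels_for_area_km2(1.0, 30))    # 1111
```

So if you wanted channels to start at about 1 square kilometre on this DEM, you would set the threshold to roughly 1111 pixels.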

In [None]:
import lsdviztools.lsdmapwrappers as lsdmw

## Get the basins and the channel profile
lsdtt_parameters = {"print_basin_raster" : "true",
                    "write_hillshade" : "true",
                    "print_chi_data_maps" : "true",
                    "get_basins_from_outlets" : "true",
                    "threshold_contributing_pixels" : "5000",
                    "basin_outlet_csv" : "basin_outlets.csv",
                    "extend_channel_to_node_before_receiver_junction" : "false"}
r_prefix = Dataset_prefix+"_"+source_name +"_UTM"
w_prefix = Dataset_prefix+"_"+source_name +"_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool = "lsdtt-basic-metrics", 
                                 read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

Okay, let's look at the river we extracted:

In [None]:
%%capture             
Base_file = r_prefix
DataDirectory = "./"
basins_img = lsdmw.PrintBasins_Complex(DataDirectory,Base_file,cmap="gist_earth", 
                             size_format="geomorphology",dpi=600, save_fig = True)

In [None]:
print(basins_img)
from IPython.display import display, Image
display(Image(filename=basins_img, width=800))

We can use some plotting tools to see where this is:

In [None]:
%matplotlib inline
Base_file = r_prefix
DataDirectory = "./"
this_curv_img = lsdmw.PrintChannelsAndBasins(DataDirectory,Base_file,
                                       add_basin_labels = True, cmap = "jet", 
                                       size_format = "ESURF", fig_format = "png", 
                                       dpi = 300, save_fig = False)

Now say you had some sites and you wanted to find the drainage characteristics of those sites. I'm going to make the sites. The file, at minimum, needs a `latitude` and a `longitude` column. Make sure those are lower case:

In [None]:
import csv

data = [
    ["site","Easting","Northing","longitude","latitude"],
    ["Scar0001",275911,602987,-3.956336,55.305283],
    ["Aftn0001",263158,608025,-4.159543,55.347139],
    ["Camp0001",287165,594055,-3.775567,55.227772],
    ["Craw0001",282305,617799,-3.86191,55.439879],
    ["Dlwt0001",271106,595647,-4.028614,55.238134],
    ["Menn0001",283787,609803,-3.835166,55.368422],
    ["Nith010",291286,586310,-3.707828,55.159133],
    ["Nith053",259689,613818,-4.217112,55.398175],
    ["Nith027",272142,612363,-4.019952,55.388515],
    ["Nith0001",253724,609294,-4.30885,55.355815]
]

filename = "nith_sites.csv"

with open(filename, mode='w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)

print(f"Data has been successfully written to {filename}")
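
If you are using your own site file, a quick check of the header can save some confusion later. This is a minimal sketch with made-up in-memory files; `missing_coordinate_columns` is a hypothetical helper, and for a real file you would pass in `open("nith_sites.csv", newline="")` instead.

```python
import csv
import io

def missing_coordinate_columns(csv_file):
    """Return the required, lower-case column names absent from a CSV header.

    csv_file is any open text file (or file-like object).
    """
    required = ["latitude", "longitude"]
    header = next(csv.reader(csv_file))
    return [name for name in required if name not in header]

# In-memory examples: one good header and one with capitalised names
good = io.StringIO("site,Easting,Northing,longitude,latitude\nScar0001,275911,602987,-3.956336,55.305283\n")
bad = io.StringIO("site,Longitude,Latitude\nScar0001,-3.956336,55.305283\n")
print(missing_coordinate_columns(good))   # []
print(missing_coordinate_columns(bad))    # ['latitude', 'longitude']
```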

In [None]:
!cat nith_sites.csv

Now we need to import some stuff

In [None]:
import geopandas as gpd
import numpy as np
import pandas as pd

from scipy.spatial import cKDTree
from shapely.geometry import Point

We have two datasets. One is the channel data and the other is the site locations. This second dataset could be any set of points.

We will, in the next step, merge these datasets using a nearest neighbour search: for each site, we find the closest channel point and attach its channel data to that site.

For this to work, the two datasets must be in the same coordinate reference system. For this example it is not really a problem because both datasets have coordinates in a global reference frame with the code EPSG:4326. In the example below, we use .crs to define the coordinate reference system. 

However, sometimes you might have a dataset with another coordinate system (for example British National Grid, which is EPSG:27700), so you would need to change the corresponding EPSG code. You can look up the EPSG code for a coordinate system with a google search. 

In [None]:
# Load the channel data
dfA = pd.read_csv("Nith_COP30_UTM_chi_data_map.csv")
# Convert to a geopandas dataframe
gdfA = gpd.GeoDataFrame(
    dfA, geometry=gpd.points_from_xy(dfA.longitude, dfA.latitude))
# We have to tell the geopandas data what geographic system we are in by using something called an EPSG code. 
# All major geographic projection and transformation system have this code. 
gdfA.crs = "EPSG:4326" 


# Load the site data
dfB = pd.read_csv("nith_sites.csv")
gdfB = gpd.GeoDataFrame(
    dfB, geometry=gpd.points_from_xy(dfB.longitude, dfB.latitude))
# We have to tell the geopandas data what geographic system we are in by using something called an EPSG code. 
# All major geographic projection and transformation system have this code. 
gdfB.crs = "EPSG:4326" 

# IMPORTANT: we convert one of the datasets to the coordinate reference system of the other
gdfC = gdfB.to_crs(4326)

I now need to add a function for combining datasets. **You don't need to change anything in this function.** The first dataframe keeps all of its rows and data, and each row gains the properties of its nearest neighbour in the second dataframe, along with the distance to that neighbour.

In [None]:
def ckdnearest(gdA, gdB):

    nA = np.array(list(gdA.geometry.apply(lambda x: (x.x, x.y))))
    nB = np.array(list(gdB.geometry.apply(lambda x: (x.x, x.y))))
    btree = cKDTree(nB)
    dist, idx = btree.query(nA, k=1)
    gdB_nearest = gdB.iloc[idx].drop(columns="geometry").reset_index(drop=True)
    gdf = pd.concat(
        [
            gdA.reset_index(drop=True),
            gdB_nearest,
            pd.Series(dist, name='dist')
        ], 
        axis=1)

    return gdf
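
To see what the nearest neighbour search is doing, here is a brute force version in plain Python. It is not the KD-tree used in the function above: it simply checks every candidate point, which should give the same nearest point but gets slow for large datasets. The function name and the points are made up for illustration.

```python
import math

def nearest_point(target, candidates):
    """Return (index, distance) of the candidate closest to target.

    target and candidates are (x, y) tuples. Brute force: every candidate
    is checked, which is fine for small datasets but slow for large ones.
    """
    best_index, best_distance = None, math.inf
    for i, (x, y) in enumerate(candidates):
        distance = math.hypot(x - target[0], y - target[1])
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index, best_distance

# Made-up channel nodes and one made-up site, all as (x, y)
channel_nodes = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
site = (1.2, 0.9)
index, distance = nearest_point(site, channel_nodes)
print(index, distance)
```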

Now we merge the two files. 

In [None]:
new_gdp = ckdnearest(gdfC, gdfA)
new_gdp.head(10)
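
One thing to watch: because both datasets are in EPSG:4326, the coordinates are in degrees, so the `dist` column is in degrees and not metres. If you want to know roughly how far a site is from its matched channel point in metres, you can use the haversine formula on the two latitude and longitude pairs. This is a minimal sketch that assumes a spherical Earth with a radius of 6371 km; the second point in the example is made up.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in metres between two lat/long points (degrees).

    Assumes a spherical Earth, so expect errors of a few tenths of a percent.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# Site Scar0001 and a made-up channel node 0.001 degrees further north
print(haversine_m(55.305283, -3.956336, 55.306283, -3.956336))
```

A large distance is a hint that the site does not sit on any of the extracted channels, for example because it is on a tributary smaller than the pixel threshold.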

Super! Now we can print this new dataset to a file using the .to_csv function:

In [None]:
new_gdp.to_csv("updated_nith_site_infomations.csv")