<a href="https://colab.research.google.com/github/simon-m-mudd/smm_teaching_notebooks/blob/master/Basic_topography/Lesson_04_basic_topographic_features.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lesson 04: Basic topographic features

*This lesson made by Simon M Mudd and last updated 18/09/2023*

In this lesson we are going to do some basic analysis of topography. There are a lot of different software tools for doing this, for example:
* [Whitebox](https://www.whiteboxgeo.com/download-whiteboxtools/)
* [TopoToolbox](https://topotoolbox.wordpress.com/)
* [SAGA](http://www.saga-gis.org/en/index.html)

However, for this example we will use [`lsdtopotools`](https://github.com/LSDtopotools) because the person writing this lesson is also the lead developer of that software.

Instructions for installing `lsdtopotools` on colab are below. If you are doing this in the University of Edinburgh's Notable environment then the software is already installed.

The objective of this practical is to give you a taster of what kinds of things you might do with topographic data.

## Stuff we need to do if you are in colab (not required in the lsdtopotools pytools container or in notable)

**If you are in the `docker_lsdtt_pytools` docker container, you do not need to do any of this.
The following is for executing this code in the google colab environment only.**

If you are in the docker container you can skip to the **First import some stuff we need** section.

First we install `lsdviztools`. This will take around a minute. It is important you do this before the `condacolab` step.

In [None]:
!pip install lsdviztools &> /dev/null

Now we need to install lsdtopotools. We do this using something called `mamba`. To get `mamba` we install something called `condacolab`.

In [None]:
!pip install -q condacolab
import condacolab
condacolab.install()

Now use mamba to install `lsdtopotools`. For this lesson you also need `gdal` and `proj`.
This step takes a bit over a minute.

In [None]:
!mamba install -y gdal proj lsdtopotools &> /dev/null

And now we need to fix an annoying bug with `gdal`: we set the `PROJ_LIB` environment variable so that the projection library can find its data files.

In [None]:
import os
os.environ['PROJ_LIB'] = '/usr/local/share/proj/'
print(os.getenv('PROJ_LIB'))
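
If projections still fail after setting `PROJ_LIB`, it can help to check that the directory really contains the projection database. This is a minimal sketch, assuming the database file is called `proj.db` (which is the case for PROJ version 6 and later). The helper name `has_proj_database` is made up for this example and is not part of any library.

```python
import os

def has_proj_database(proj_dir):
    """Return True if proj_dir is set and contains the PROJ database file proj.db."""
    return bool(proj_dir) and os.path.isfile(os.path.join(proj_dir, "proj.db"))

# Check the directory currently named by PROJ_LIB (empty string if it is unset)
proj_dir = os.environ.get("PROJ_LIB", "")
print("PROJ_LIB =", repr(proj_dir), "| proj.db found:", has_proj_database(proj_dir))
```

If this reports that `proj.db` was not found, the path in the cell above probably does not match where `mamba` installed `proj` in your environment.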

## First import some stuff we need

First we make sure lsdviztools version is updated (it needs to be > 0.4.7):

**(You don't need to do this in colab because you already have the latest version.)**

In [None]:
!pip install lsdviztools --upgrade

Now we import a bunch of stuff

In [None]:
import lsdviztools.lsdbasemaptools as bmt
from lsdviztools.lsdplottingtools import lsdmap_gdalio as gio
import lsdviztools.lsdmapwrappers as lsdmw
import pandas as pd
import geopandas as gpd
import cartopy as cp
import cartopy.crs as ccrs
import rasterio as rio
import matplotlib.pyplot as plt
import numpy as np

## If you are in colab, get the data (again)

In colab you need to re-download the data in each session.

In notable the data has persistence, so you don't need to re-download.

You will need to copy `my_OT_api_key.txt` again here. See lesson 01 to see how to do that.

In [None]:
lower_left = [36.990554387425014, -2.318307057720176]
upper_right = [37.23367133834253, -1.8425313329873874]

your_OT_api_key_file = "my_OT_api_key.txt"

with open(your_OT_api_key_file, 'r') as file:
    print("I am reading your OT API key from the file "+your_OT_api_key_file)
    api_key = file.read().rstrip()
    print("Your api key starts with: "+api_key[0:4])

# This downloads Copernicus 30m DEM
Aguas_DEM = bmt.ot_scraper(source = "COP30",
                           lower_left_coordinates = lower_left,
                           upper_right_coordinates = upper_right,
                           prefix = "rio_aguas",
                           api_key_file = your_OT_api_key_file)
Aguas_DEM.print_parameters()
Aguas_DEM.download_pythonic()
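
A missing or empty key file is a common reason for the download to fail. The sketch below is a slightly more defensive way of reading the key. The function `read_api_key` is a hypothetical helper written for this lesson, not something provided by `lsdviztools`.

```python
from pathlib import Path

def read_api_key(key_file):
    """Read an API key from a text file, stripping surrounding whitespace."""
    path = Path(key_file)
    if not path.is_file():
        raise FileNotFoundError(f"Cannot find {key_file}; see lesson 01 for how to create it.")
    key = path.read_text().strip()
    if not key:
        raise ValueError(f"{key_file} exists but is empty.")
    return key
```

You would call it with `api_key = read_api_key("my_OT_api_key.txt")`. The error message then tells you straight away whether the file is missing or just empty.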

## Data preprocessing

For various historical reasons, **lsdtopotools** does not read *GeoTiff* format, so we need to convert any rasters to [ENVI bil](https://www.l3harrisgeospatial.com/docs/enviimagefiles.html#:~:text=The%20ENVI%20image%20format%20is,an%20accompanying%20ASCII%20header%20file.&text=Band%2Dinterleaved%2Dby%2Dline,to%20the%20number%20of%20bands.) format. **This is not the same as ESRI bil! MAKE SURE YOU USE ENVI BIL!!**

You could convert any file to `ENVI bil` format using `gdalwarp` with the parameter `-of ENVI` (`of` stands for output format), but `lsdviztools` has some built-in functions that do this for you in python.

We are going to use the Copernicus 30 dataset (from the last lesson) for this lesson, and here is the conversion syntax:

In [None]:
DataDirectory = "./"
RasterFile = "rio_aguas_COP30.tif"
gio.convert4lsdtt(DataDirectory, RasterFile,minimum_elevation=0.01,resolution=30)

**You can also convert with gdal. But the above script has some advantages:**

1. It figures out the UTM zone for you.
2. It handles the no data pixels (places in the DEM where you don't have data) better than gdal.

However, if you wanted to use gdal you would have a command like this:

```
!gdalwarp -t_srs EPSG:32630 rio_aguas_COP30.tif -r cubic -tr 30 30 -of ENVI rio_aguas_COP30_UTM.bil
```
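
The `EPSG:32630` in that command is the code for UTM zone 30 North, which is the zone that contains this DEM. If you need to work out the code for a different location, the standard rule can be sketched as below. This is not necessarily how `lsdviztools` does it, the function name is made up for this example, and it ignores the special-case zones around Norway and Svalbard.

```python
def utm_epsg_from_lat_lon(latitude, longitude):
    """Return the WGS84 UTM EPSG code for a point given in decimal degrees."""
    # UTM zones are 6 degrees wide and numbered 1 to 60, starting at 180 W
    zone = int((longitude + 180.0) // 6) + 1
    zone = min(zone, 60)  # a longitude of exactly 180 belongs to zone 60
    # EPSG codes are 326xx in the northern hemisphere and 327xx in the southern
    base = 32600 if latitude >= 0 else 32700
    return base + zone

# The lower left corner used in this lesson, as [latitude, longitude]
lower_left = [36.990554387425014, -2.318307057720176]
print(utm_epsg_from_lat_lon(lower_left[0], lower_left[1]))  # prints 32630
```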

You can search for specific files using the `!ls` command, so we can look for the file that has been created.

There is a `.tif` file from the last lesson, but the two files with extensions `.bil` and `.hdr` are from the conversion. ENVI bil files have all the data in the `.bil` file and all the georeferencing in the `.hdr` file. The `.hdr` file is an ascii file so you can easily open these files in a text editor and see all the important metadata.

In [None]:
!ls rio_aguas_COP30_UTM*
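
Because the `.hdr` file is plain text, you can also read it in python. The sketch below parses the simple `key = value` lines into a dictionary. It is a minimal reader written for this lesson (the name `read_envi_header` is made up), and it assumes each entry sits on a single line, so values that are wrapped over several lines inside `{ }` will be cut short.

```python
def read_envi_header(header_file):
    """Parse single-line `key = value` entries from an ENVI .hdr file into a dict."""
    entries = {}
    with open(header_file, "r") as f:
        for line in f:
            if "=" not in line:
                continue  # skips the leading "ENVI" line and any blank lines
            key, value = line.split("=", 1)
            entries[key.strip().lower()] = value.strip()
    return entries
```

For example, `read_envi_header("rio_aguas_COP30_UTM.hdr")` should give you a dictionary in which you can look up entries such as `"samples"`, `"lines"` and `"map info"`. All values are returned as text, so convert them with `int()` or `float()` if you want numbers.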

# Now let's do some basic topographic analysis

Now we will extract some topographic metrics using `lsdtopotools`.

The `lsdtt_parameters` are the various parameters that you can use to run an analysis. We will discuss these later. For now, we will just follow this recipe.

In [None]:
lsdtt_parameters = {"write_hillshade" : "true",
                    "surface_fitting_radius" : "60",
                    "print_slope" : "true"}
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = "rio_aguas_COP30_UTM",
                                 write_prefix= "rio_aguas_COP30_UTM",
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()

In [None]:
lsdtt_drive.run_lsdtt_command_line_tool()
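
To get a feel for what a gradient raster contains, here is a much simpler finite-difference sketch using `numpy`. This is not the method `lsdtopotools` uses, and both the function name and the synthetic DEM are made up for illustration.

```python
import numpy as np

def gradient_magnitude(elevation, cell_size):
    """Return the slope magnitude (m/m) of a 2D elevation array on a square grid."""
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    dz_dy, dz_dx = np.gradient(elevation, cell_size)
    return np.sqrt(dz_dx**2 + dz_dy**2)

# Synthetic 30 m DEM: a plane rising 0.3 m per metre in x and 0.4 m per metre in y
cell_size = 30.0
y, x = np.mgrid[0:5, 0:6] * cell_size
dem = 0.3 * x + 0.4 * y

slope = gradient_magnitude(dem, cell_size)
print(slope.min(), slope.max())
```

On a tilted plane the gradient is the same everywhere, so the minimum and maximum should both be very close to 0.5. On a real DEM the values vary from pixel to pixel, which is what the slope raster records.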

# Plot some data

We are now going to do some simple plots using a mapping package that we put together. There are more general ways to visualise data, but this makes pretty pictures quickly.  

In [None]:
%matplotlib inline
Base_file = "rio_aguas_COP30_UTM"
DataDirectory = "./"
this_img = lsdmw.SimpleHillshade(DataDirectory,Base_file,cmap="gist_earth", save_fig=True, size_format="geomorphology", dpi=600)

In [None]:
from IPython.display import Image
Image('rio_aguas_COP30_UTM_hillshade.png')

In [None]:
Base_file = "rio_aguas_COP30_UTM"
Drape_prefix = "rio_aguas_COP30_UTM_SLOPE"
DataDirectory = "./"
img_name2 = lsdmw.SimpleDrape(DataDirectory,Base_file, Drape_prefix,
                              cmap = "bwr", cbar_loc = "right",
                              cbar_label = "Gradient (m/m)",
                              save_fig=True, size_format="geomorphology",
                              colour_min_max = [0,1.25])

In [None]:
from IPython.display import Image
Image('rio_aguas_COP30_UTM_drape.png')

# Get some channel profiles

Okay, we will now run a different analysis. We will get some channel profiles.

In [None]:
lsdtt_parameters = {"print_basin_raster" : "true",
                    "print_chi_data_maps" : "true"}
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = "rio_aguas_COP30_UTM",
                                 write_prefix= "rio_aguas_COP30_UTM",
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()

In [None]:
lsdtt_drive.run_lsdtt_command_line_tool()

We can look to see what files we have using the following command. The `!` tells this notebook to run a command on the underlying linux operating system, and `ls` in linux is a command to list files.

In [None]:
!ls

The file with the channels is the one with `chi_data_map` in the filename. We are going to load this into a `pandas` dataframe. You can think of `pandas` as a kind of excel for python. It does data handling of spreadsheet-like information (and loads more.)

In [None]:
df = pd.read_csv("rio_aguas_COP30_UTM_chi_data_map.csv")
df.head()

Okay, now let's look at what we got. This script allows you to plot the channels.

In [None]:
# Look at the data frame above and see if you can change the plotting_column to another column

%matplotlib inline
fname_prefix = "rio_aguas_COP30_UTM"
ChannelFileName = "rio_aguas_COP30_UTM_chi_data_map.csv"
DataDirectory = "./"
lsdmw.PrintChiChannelsAndBasins(DataDirectory,fname_prefix, ChannelFileName,
                                add_basin_labels = True, cmap = "jet", cbar_loc = "right",
                                colorbarlabel = "elevation (m)", size_format = "ESURF", fig_format = "png",
                                dpi = 400,plotting_column = "elevation")

Image('rio_aguas_COP30_UTM_chi_channels_and_basins.png')

## Looking at individual channels using pandas

Okay, let's look at some individual channels. We can do this by selecting data in the dataframe.

In [None]:
# First let's isolate just one of these basins. There are only basins 0 and 1
df_b1 = df[(df['basin_key'] == 0)]
df_b1.head()
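
If you want to check which basins are in the table, or see how the selection works, here is a small sketch. The table is made up, but it uses the same column names (`basin_key`, `source_key`, `elevation`) as the chi data map.

```python
import pandas as pd

# A tiny made-up table that mimics a few columns of the chi data map
demo = pd.DataFrame({
    "basin_key":  [0, 0, 0, 1, 1],
    "source_key": [0, 0, 1, 2, 2],
    "elevation":  [520.0, 480.0, 610.0, 300.0, 250.0],
})

# Which basins are present?
basin_keys = sorted(demo["basin_key"].unique().tolist())
print(basin_keys)

# A boolean mask is True for the rows we want; indexing with it keeps only those rows
mask = demo["basin_key"] == 0
demo_b0 = demo[mask]
print(len(demo_b0), "rows in basin 0")
```

In the lesson you would run the same two steps on `df` instead of `demo`.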

We can plot this channel:

In [None]:
plt.rcParams['figure.figsize'] = [10, 5]

# First let's isolate just one of these basins. There are only basins 0 and 1
df_b1 = df[(df['basin_key'] == 0)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(df_b1.source_key)
df_b2 = df_b1[(df_b1['source_key'] == min_source)]

# Now make channel profile plots
z = df_b2.elevation
x_locs = df_b2.flow_distance

# Create a single subplot
plt.clf()
f, (ax1) = plt.subplots(1, 1)
ax1.scatter(x_locs, z,s = 0.2)

ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("elevation (m)")

plt.tight_layout()

Maybe you want to know the slope of this channel. You can do this by using the `numpy` gradient function.

In [None]:
z = df.elevation
x = df.flow_distance
S = np.gradient(np.asarray(z),np.asarray(x))
df["slope"] = S
df.head()
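
The cell above differentiates the whole table in one go. If the rows are stored channel by channel, the gradient at the point where one channel ends and the next begins is calculated from nodes in two different channels, so those values are not meaningful. A hedged alternative is to calculate the gradient separately for each `source_key`. The sketch below uses a made-up table and a made-up function name, and it assumes that the rows of each channel are in order along the channel with no repeated `flow_distance` values.

```python
import numpy as np
import pandas as pd

def slope_by_channel(df):
    """Return dz/dx computed separately for each source_key, aligned to df's index."""
    slope = pd.Series(np.nan, index=df.index, dtype=float)
    for _, channel in df.groupby("source_key"):
        if len(channel) < 2:
            continue  # np.gradient needs at least two points, so leave NaN
        slope.loc[channel.index] = np.gradient(channel["elevation"].to_numpy(),
                                               channel["flow_distance"].to_numpy())
    return slope

# Two made-up channels: one with a gradient of 0.1 and one with a gradient of 0.3
demo = pd.DataFrame({
    "source_key":    [0, 0, 0, 1, 1, 1],
    "flow_distance": [200.0, 100.0, 0.0, 150.0, 100.0, 50.0],
    "elevation":     [20.0, 10.0, 0.0, 45.0, 30.0, 15.0],
})
demo["slope"] = slope_by_channel(demo)
print(demo)
```

Each channel keeps its own constant gradient here, whereas differentiating the whole table at once would give odd values at the join between the two channels.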

Now we plot this. It is very similar to the plot above but now has the slope.

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]

# First let's isolate just one of these basins. There are only basins 0 and 1
df_b1 = df[(df['basin_key'] == 0)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(df_b1.source_key)
df_b2 = df_b1[(df_b1['source_key'] == min_source)]

# Now make channel profile plots
z = df_b2.elevation
x_locs = df_b2.flow_distance
S = df_b2.slope

# Create two subplots and unpack the output array immediately
plt.clf()
f, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter(x_locs, z,s = 0.2)
ax2.scatter(x_locs, S,s = 1,c="r")

ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("elevation (m)")

ax2.set_xlabel("Distance from outlet ($m$)")
ax2.set_ylabel("Slope (m/m)")

plt.tight_layout()

This slope (bottom figure) is very noisy. One way to deal with this is to smooth the data. We can smooth the data by running a moving window over it and doing some averaging inside the window.

Python has lots of tools for this. In this case I use a `rolling` window and I have picked various settings. You don't need to worry about this too much. The only number that you might want to play with is the first number after `rolling`, which is the number of datapoints in the window. The bigger this number, the more smoothed the data becomes.

In [None]:
df['slope_rolling'] = df.slope.rolling(40,win_type='hamming').mean()
df.head()
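
To see what a Hamming-weighted moving average does, here is a sketch of the same idea using only `numpy`. It is meant to show the principle and the effect of the window size, not to reproduce the `pandas` output row for row. The noisy values and the function name are made up for this example.

```python
import numpy as np

def hamming_moving_average(values, window):
    """Weighted moving average using Hamming weights; returns len(values) - window + 1 points."""
    weights = np.hamming(window)
    weights = weights / weights.sum()   # normalise so the weights sum to 1
    # mode="valid" only keeps positions where the window fits entirely inside the data
    return np.convolve(values, weights, mode="valid")

rng = np.random.default_rng(0)
noisy = 0.05 + 0.02 * rng.standard_normal(500)   # made-up noisy "slope" values

# Compare the scatter that remains after smoothing with a small and a large window
for window in (10, 40):
    smooth = hamming_moving_average(noisy, window)
    print(window, len(smooth), round(float(smooth.std()), 4))
```

The Hamming weights give the middle of the window more influence than the edges. The standard deviation printed for the larger window should be smaller, which is what "more smoothed" means in practice.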

Let's have a look at what this smoothing has done.

In [None]:
plt.rcParams['figure.figsize'] = [10, 5]

# First let's isolate just one of these basins. There are only basins 0 and 1
df_b1 = df[(df['basin_key'] == 0)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(df_b1.source_key)
df_b2 = df_b1[(df_b1['source_key'] == min_source)]

# Now make channel profile plots
z = df_b2.elevation
x_locs = df_b2.flow_distance
S = df_b2.slope
SR = df_b2.slope_rolling

# Create a single subplot
plt.clf()
f, (ax1) = plt.subplots(1, 1)
ax1.scatter(x_locs, S,s = 0.2, label = "slope")
ax1.scatter(x_locs, SR,s = 1,c="r", label = "rolling slope")

ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Slope and rolling slope (m/m)")

plt.legend()


plt.tight_layout()

Now we can compare the channel profile to the channel gradients and see if the channel gradient is steep where you think it might be.

In [None]:
plt.rcParams['figure.figsize'] = [10, 5]

# First let's isolate just one of these basins. There are only basins 0 and 1
df_b1 = df[(df['basin_key'] == 0)]

# The main stem channel is the one with the minimum source key in this basin
# If you want to play with this a bit you can change the source number to look at different channels
min_source = np.amin(df_b1.source_key)
df_b2 = df_b1[(df_b1['source_key'] == min_source)]

# Now make channel profile plots
z = df_b2.elevation
x_locs = df_b2.flow_distance
S = df_b2.slope
SR = df_b2.slope_rolling

# Create a single subplot, then add a twin axis for the slope
plt.clf()
f, (ax1) = plt.subplots(1, 1)
ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis

# Make the scatter plots
ax1.scatter(x_locs, z,s = 1, label='Longitudinal profile')
ax2.scatter(x_locs, SR,s = 1,c="r", label='Channel slope')

# Combine the legend entries from both axes so they appear in a single legend
lines, labels = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(lines + lines2, labels + labels2, loc=0)

ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Elevation (m)")

ax2.set_xlabel("Distance from outlet ($m$)")
ax2.set_ylabel("Rolling Slope (m/m)")

plt.tight_layout()

Okay, so now you have some very basic experience in getting topographic metrics (here we extracted gradient, but you could also extract curvature). You have also extracted some basins and can see where they are, and you can look at some channel characteristics.

## What you should have learned and potential modifications

You should now have seen:

* How to extract some basins and channels.
* How to look at some data in a csv and select and subset the data (e.g., by using `df_b1 = df[(df['basin_key'] == 0)]` syntax).
* How to plot some channel profiles and get channel slope.

Next steps

* Try picking another channel (either with a different basin or with a different source key) and repeating the steps.
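
To help with picking another channel, it is useful to list which source keys exist in each basin and how many nodes each one has. The sketch below does this on a small made-up table with the same `basin_key` and `source_key` column names. In the lesson you would use `df` instead of `demo`.

```python
import pandas as pd

# A made-up table: basin 0 has sources 0 and 1, basin 1 has sources 2 and 3
demo = pd.DataFrame({
    "basin_key":  [0, 0, 0, 0, 1, 1, 1],
    "source_key": [0, 0, 0, 1, 2, 2, 3],
})

# Count the number of channel nodes for every (basin, source) pair
channel_sizes = demo.groupby(["basin_key", "source_key"]).size()
print(channel_sizes)

# Select one channel by combining two conditions with &
one_channel = demo[(demo["basin_key"] == 1) & (demo["source_key"] == 2)]
print(len(one_channel), "nodes selected")
```

Channels with only a handful of nodes make for very short profiles, so the counts are a quick way to find channels that are worth plotting.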