# Getting data for southern Spain

*This lesson has been written by Simon M. Mudd at the University of Edinburgh*

*Last update 02/03/2021*

In this notebook we will grab some data from southern Spain using a python package called `lsdviztools`. 
We will also do a little bit of topographic analysis using **lsdtopotools**. **lsdtopotools** is a software package for analysing topography developed at the University of Edinburgh and other institutions. 

We are assuming you are on a Noteable notebook via the University of Edinburgh's Learn pages. This already has **lsdtopotools** installed. If this is not the case, you will need to use conda to install it yourself. 

## Get the right python packages

In [None]:
!pip install lsdviztools

In [None]:
import lsdviztools.lsdbasemaptools as bmt
from lsdviztools.lsdplottingtools import lsdmap_gdalio as gio
import lsdviztools.lsdmapwrappers as lsdmw
import pandas as pd
import geopandas as gpd
import cartopy as cp
import cartopy.crs as ccrs
import rasterio as rio
import matplotlib.pyplot as plt
import numpy as np

## Now grab some data

First, we need to grab some data. We use a tool in `lsdviztools.lsdbasemaptools` called the `ot_scraper` (`ot` is for opentopography.org).

You can tell this what sort of data you want (most people will use the SRTM 30 metre data), and you also give it the lower left and upper right corners of the area as latitude and longitude. You can get these from Google Earth by right clicking on the map and selecting "what's here". In the example below, I just get a small area near Sorbas, but you might expand your search area. 
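
If you do expand the search area, it can help to estimate how big the resulting DEM will be before downloading it. The sketch below is not part of `lsdviztools`: the function names are hypothetical, and it assumes a spherical approximation of about 111.32 km per degree of latitude, so treat the numbers as rough guides only.

```python
import math

def bbox_size_km(lon_w, lon_e, lat_s, lat_n):
    """Approximate east-west and north-south extent of a lat/lon box in km."""
    if lon_e <= lon_w or lat_n <= lat_s:
        raise ValueError("Expected lon_w < lon_e and lat_s < lat_n")
    km_per_degree = 111.32  # assumed approximate length of one degree of latitude
    mid_lat = math.radians(0.5 * (lat_s + lat_n))
    # Degrees of longitude shrink with the cosine of latitude
    width_km = (lon_e - lon_w) * km_per_degree * math.cos(mid_lat)
    height_km = (lat_n - lat_s) * km_per_degree
    return width_km, height_km

def approx_pixel_count(width_km, height_km, resolution_m=30.0):
    """Rough number of pixels for a grid of the given resolution."""
    cols = width_km * 1000.0 / resolution_m
    rows = height_km * 1000.0 / resolution_m
    return int(round(cols * rows))

# The Sorbas bounding box used in this notebook
width_km, height_km = bbox_size_km(-2.3, -2.0, 37.1, 37.25)
n_pixels = approx_pixel_count(width_km, height_km)
print(f"Roughly {width_km:.1f} km x {height_km:.1f} km, about {n_pixels:,} pixels at 30 m")
```

Doubling both sides of the box roughly quadruples the number of pixels, and the later analyses slow down accordingly.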

In [None]:
# If you want to modify the DEM, change the bounding latitude and longitude. 
Dataset_prefix = "Sorbas_v2"
Sorbas_DEM = bmt.ot_scraper(source = "SRTM30",longitude_W = -2.3, longitude_E = -2, 
                            latitude_S = 37.1, latitude_N = 37.25,prefix = Dataset_prefix)
Sorbas_DEM.print_parameters()
Sorbas_DEM.download_pythonic()

That just downloaded a .tif file, which you could look at in a GIS. 
You can also look at the raw data here in python using something called `rasterio`. 

But we will use some tools developed at the University of Edinburgh, called **lsdtopotools**, to look at the data. 

To do that, we need to convert the data into a format **lsdtopotools** can understand. 

The option below only works on systems with a recent version of a package called `proj`. The Noteable service has an old version, so we need to convert the data a different way. I include the lines below only for future reference. 

In [None]:
## IMPORTANT: This doesn't work in Noteable! It can't read the proj database
#DataDirectory = "./"
#RasterFile = Dataset_prefix+"_SRTM30.tif"
#gio.convert4lsdtt(DataDirectory, RasterFile,minimum_elevation=0.01,resolution=30)

The option below uses something called GDAL (short for Geospatial Data Abstraction Library). 
It has many tools for converting raster data formats and changing projections. It is also fast: it is much faster to convert or merge files using GDAL than in, say, ArcMap. QGIS has the GDAL tools built in. You can read more here: https://lsdtopotools.github.io/LSDTT_documentation/LSDTT_introduction_to_geospatial_data.html#translating-your-raster-into-something-that-can-be-used-by-lsdtopotoolbox

In the cell below, the main thing you would change is the zone: this is the UTM zone. You can look up your UTM zone here: http://www.dmap.co.uk/utmworld.htm
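
As an alternative to reading the zone off a map, the standard UTM zone can be estimated from the longitude of your area, since zones are 6 degrees wide and zone 1 starts at 180°W. The helper names below are hypothetical and this is only a sketch: it ignores the special zone exceptions around Norway and Svalbard, and it assumes you want the WGS84 datum as in the rest of this notebook.

```python
import math

def utm_zone_from_longitude(longitude):
    """Standard 6-degree UTM zone number (1 to 60) for a longitude in degrees."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("Longitude must be between -180 and 180 degrees")
    # Longitude exactly 180 is folded into zone 60
    return min(int(math.floor((longitude + 180.0) / 6.0)) + 1, 60)

def utm_proj_string(longitude, latitude):
    """Build a proj string like the one passed to gdalwarp in this notebook."""
    zone = utm_zone_from_longitude(longitude)
    proj = f"+proj=utm +zone={zone} +datum=WGS84"
    if latitude < 0:
        proj += " +south"  # southern hemisphere zones need this flag
    return proj

# Centre of the Sorbas bounding box used above
print(utm_proj_string(-2.15, 37.175))
```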

In [None]:
import subprocess

# Below is the resolution of your data in metres
res = "30"

# Change the below UTM zone to the correct zone for your dataset
pr = "+proj=utm +zone=30 +datum=WGS84"

# Don't change anything below this line
gd = "gdalwarp"
sr = "-t_srs"
of = "-of"
en = "ENVI"
ov = "-overwrite"
tr = "-tr"
r = "-r"
rm = "cubic"
infile = Dataset_prefix+"_SRTM30.tif"
outfile = Dataset_prefix+"_SRTM30_UTM.bil"

subprocess_list = [gd, sr, pr, of, en, ov, tr, res, res, r, rm, infile, outfile]

subprocess.run(subprocess_list)

# Below is the equivalent gdalwarp call. It is not used, but is here for future reference
#!gdalwarp -t_srs '+proj=utm +zone=30 +datum=WGS84' -of ENVI -overwrite -tr 30 30 -r cubic Sorbas_v2_SRTM30.tif Sorbas_v2_SRTM30_UTM.bil
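
Note that `subprocess.run` does not raise an error by default if `gdalwarp` fails, so a failed conversion can go unnoticed until a later cell cannot find the `.bil` file. The sketch below wraps the same `gdalwarp` options in two hypothetical helper functions (they are not part of `lsdviztools`) and uses `check=True` so that a failure stops the cell.

```python
import shutil
import subprocess

def build_gdalwarp_args(infile, outfile, proj_string, resolution_m, resampling="cubic"):
    """Assemble the same gdalwarp argument list used in this notebook."""
    res = str(resolution_m)
    return ["gdalwarp", "-t_srs", proj_string, "-of", "ENVI", "-overwrite",
            "-tr", res, res, "-r", resampling, infile, outfile]

def run_gdalwarp(args):
    """Run gdalwarp and raise an exception if it is missing or fails."""
    if shutil.which(args[0]) is None:
        raise FileNotFoundError(f"{args[0]} was not found on the PATH")
    # check=True raises CalledProcessError if gdalwarp exits with an error code
    return subprocess.run(args, check=True, capture_output=True, text=True)

args = build_gdalwarp_args("Sorbas_v2_SRTM30.tif", "Sorbas_v2_SRTM30_UTM.bil",
                           "+proj=utm +zone=30 +datum=WGS84", 30)
print(" ".join(args))
# To actually run the conversion: result = run_gdalwarp(args)
```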

## Now we do some topographic analysis and look at the data

We will extract some topographic metrics using `lsdtopotools`. 
This is already installed on the Noteable GeoScience Notebooks.

The `lsdtt_parameters` are the various parameters that you can use to run an analysis. We will discuss these later. For now, we will just follow this recipe. 

In [None]:
lsdtt_parameters = {"write_hillshade" : "true",  
                    "surface_fitting_radius" : "60",
                    "print_slope" : "true"}
r_prefix = Dataset_prefix+"_SRTM30_UTM"
w_prefix = Dataset_prefix+"_SRTM30_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()


In [None]:
lsdtt_drive.run_lsdtt_command_line_tool()

That will take a little while to run. Wait until there is a number in the brackets beside the `In` next to the cell: if it says `In [*]`, the cell is still running.

It will spit out some files. If you want to see what files are in this directory after the cell has finished you can run the following cell:

In [None]:
!ls
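
If the directory is cluttered, you can list only the files that belong to this dataset from python instead. This is a minimal sketch using the standard library: `list_outputs` is a hypothetical helper, and it assumes that every output file name starts with the write prefix.

```python
from pathlib import Path

def list_outputs(directory, prefix):
    """Return the sorted names of files in `directory` that start with `prefix`."""
    return sorted(p.name for p in Path(directory).glob(prefix + "*") if p.is_file())

# Show every file written for the dataset used in this notebook
for name in list_outputs("./", "Sorbas_v2_SRTM30_UTM"):
    print(name)
```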

## Plot some data

We are now going to do some simple plots using a mapping package that we put together. There are more general ways to visualise data, but this makes pretty pictures quickly.  

The `Base_file` on line 2 in the cell below is the prefix of the DEM you are using. That is, it is the filename without the `.bil` extension. 

In [None]:
%matplotlib inline
Base_file = Dataset_prefix+"_SRTM30_UTM"
DataDirectory = "./"
this_img = lsdmw.SimpleHillshade(DataDirectory,Base_file,cmap="gist_earth", 
                                 save_fig=False, size_format="geomorphology")

We can also plot the slope map.

In [None]:
Base_file = Dataset_prefix+"_SRTM30_UTM"
Drape_prefix = Dataset_prefix+"_SRTM30_UTM_SLOPE"
DataDirectory = "./"
img_name2 = lsdmw.SimpleDrape(DataDirectory,Base_file, Drape_prefix, 
                              cmap = "bwr", cbar_loc = "right", 
                              cbar_label = "Gradient (m/m)",
                              save_fig=False, size_format="geomorphology",
                              colour_min_max = [0,1.25])

## Get some channel profiles

Okay, we will now run a different analysis. We will get some channel profiles. 

In [None]:
lsdtt_parameters = {"print_basin_raster" : "true",
                    "print_chi_data_maps" : "true",
                    "minimum_basin_size_pixels" : "5000"}
r_prefix = Dataset_prefix+"_SRTM30_UTM"
w_prefix = Dataset_prefix+"_SRTM30_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()

In [None]:
lsdtt_drive.run_lsdtt_command_line_tool()

The analysis above writes the channel data to a csv file, which we are going to read using `geopandas`. 

These csv files can be read here, in the python environment, but they can also be loaded into a GIS by importing text data. You can make a file that is immediately readable by a GIS by adding the option `convert_csv_to_geojson` to the parameters, so the command two cells up would become:

    lsdtt_parameters = {"print_basin_raster" : "true",
                        "print_chi_data_maps" : "true",
                        "convert_csv_to_geojson" : "true",
                        "minimum_basin_size_pixels" : "5000"}
                        
The disadvantage of this is that geojson files are much larger than csv files. You really only need the geojson files if you are using a GIS: in python you will use the csv file. 
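
To see why geojson is bulkier, the sketch below writes the same three points in both formats using only the standard library. The rows are made-up values for illustration, and the conversion function is a simplified stand-in, not the code `lsdtopotools` uses: the point is only that geojson repeats the structure and the property names for every point, whereas csv states the column names once.

```python
import csv
import io
import json

# Hypothetical channel points; real columns and values come from lsdtopotools
rows = [
    {"latitude": 37.10, "longitude": -2.20, "elevation": 410.0},
    {"latitude": 37.11, "longitude": -2.21, "elevation": 415.5},
    {"latitude": 37.12, "longitude": -2.22, "elevation": 421.2},
]

def rows_to_csv(rows):
    """Write the rows as csv text with a single header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def rows_to_geojson(rows):
    """Write the rows as a GeoJSON FeatureCollection of points."""
    features = []
    for row in rows:
        properties = {k: v for k, v in row.items()
                      if k not in ("latitude", "longitude")}
        features.append({
            "type": "Feature",
            # GeoJSON coordinates are ordered longitude, then latitude
            "geometry": {"type": "Point",
                         "coordinates": [row["longitude"], row["latitude"]]},
            "properties": properties,
        })
    return json.dumps({"type": "FeatureCollection", "features": features})

csv_text = rows_to_csv(rows)
geojson_text = rows_to_geojson(rows)
print(len(csv_text), "characters as csv;", len(geojson_text), "characters as geojson")
```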

The routines above print data about the channels and about the basins. Each basin has its own number. The first cell below shows where the basins are. The cells after it load the csv file using `pandas` and then convert it into a `geopandas` object (which is `pandas` with georeferencing). 

In [None]:
DataDirectory = "./"
Base_file = Dataset_prefix+"_SRTM30_UTM"
basin_img = lsdmw.PrintBasins_Complex(DataDirectory,Base_file,
                                        use_keys_not_junctions = True, 
                                        show_colourbar = False,cmap = "jet", 
                                        colorbarlabel = "colourbar", size_format = "geomorphology",
                                        fig_format = "png", dpi = 250, 
                                        include_channels = False, label_basins = True,
                                        save_fig=False)

In [None]:
chi_data_map_name = w_prefix+"_chi_data_map.csv"
df = pd.read_csv(chi_data_map_name)
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.longitude, df.latitude))
gdf.crs = "EPSG:4326" 
print(gdf.head())

In [None]:
bounds = gdf.total_bounds
print(bounds)
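
One thing to watch: `total_bounds` returns values in the order (min x, min y, max x, max y), whereas cartopy's `set_extent` expects (x0, x1, y0, y1). The hypothetical helper below makes that reordering explicit and adds optional padding in degrees. It is only a sketch of the index shuffling, shown here with the bounding box used for the download rather than the actual channel bounds.

```python
def bounds_to_extent(bounds, pad_x=0.0, pad_y=0.0):
    """Convert (min_x, min_y, max_x, max_y) into a padded [x0, x1, y0, y1] extent."""
    min_x, min_y, max_x, max_y = bounds
    return [min_x - pad_x, max_x + pad_x, min_y - pad_y, max_y + pad_y]

# Example with the download bounding box from earlier in the notebook
extent = bounds_to_extent([-2.3, 37.1, -2.0, 37.25], pad_x=0.2, pad_y=0.1)
print(extent)
```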

We will plot these data on a map using something called cartopy. 

If you want to plot only one basin, uncomment line 26 of the cell below and select a basin.

In [None]:
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
import cartopy.crs as ccrs
import cartopy.io.img_tiles as cimgt
plt.rcParams['figure.figsize'] = [10, 10]

stamen_terrain = cimgt.Stamen('terrain-background')

fig = plt.figure()

# Create a GeoAxes in the tile's projection.
ax = fig.add_subplot(1, 1, 1, projection=stamen_terrain.crs)

# Limit the extent of the map to a small longitude/latitude range.
ax.set_extent([bounds[0]-0.2, bounds[2]+0.2, bounds[1]-0.1, bounds[3]+0.1], crs=ccrs.Geodetic())

# Add the Stamen data at zoom level 11.
ax.add_image(stamen_terrain, 11)

# Add the channel data
gdf2 = gdf.to_crs(epsg=3857)    # We have to convert the data to the same 
                               # system as the map tiles. It happens to be this one. 
                               # This epsg code is used for all map tiles (like google maps)

# IF YOU WANT TO PLOT ONE BASIN, UNCOMMENT THE LINE BELOW        
#gdf2 = gdf2[(gdf2['basin_key'] == 5)]

gdf2.plot(ax=ax, markersize=0.5, column='chi', zorder=10,cmap="jet")
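
The `to_crs(epsg=3857)` call in the cell above reprojects the points from latitude and longitude into the Web Mercator coordinates used by the map tiles. For intuition only, the spherical Web Mercator forward transform can be written by hand as below. The function name is hypothetical, and in practice you should keep using `to_crs`, which handles this for you.

```python
import math

# EPSG:3857 treats the Earth as a sphere with the WGS84 semi-major axis as radius
EARTH_RADIUS_M = 6378137.0

def lon_lat_to_web_mercator(longitude, latitude):
    """Convert longitude and latitude in degrees to Web Mercator x, y in metres."""
    x = EARTH_RADIUS_M * math.radians(longitude)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(latitude) / 2.0))
    return x, y

# Centre of the Sorbas bounding box
x, y = lon_lat_to_web_mercator(-2.15, 37.175)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

The resulting coordinates are in metres, which is why the points only line up with the tiles after the conversion.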

Now we will plot the channels in profile.

In [None]:
plt.rcParams['figure.figsize'] = [10, 10]

# First let's isolate just one of these basins. There are only basins 0 and 1
gdf_b1 = gdf[(gdf['basin_key'] == 0)]

# The main stem channel is the one with the minimum source key in this basin
min_source = np.amin(gdf_b1.source_key)
gdf_b2 = gdf_b1[(gdf_b1['source_key'] == min_source)]
#gdf_b2 = gdf_b1

# Now make channel profile plots
z = gdf_b2.elevation
x_locs = gdf_b2.flow_distance
chi = gdf_b2.chi

# Create two subplots and unpack the output array immediately
plt.clf()
f, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter(x_locs, z,s = 0.2)
ax2.scatter(chi, z,s = 0.2)


ax1.set_xlabel("Distance from outlet ($m$)")
ax1.set_ylabel("Elevation (m)")

# Use a raw string so that the backslash in \chi is not treated as an escape sequence
ax2.set_xlabel(r"$\chi$ ($m$)")
ax2.set_ylabel("Elevation (m)")

plt.tight_layout()

## Getting the full channel profile with steepness information

In lesson 2, we are going to work with a file that has the extension `_MChiSegmented.csv`. 
To generate this file, you need to run the following command.

**Warning**: This is quite computationally expensive so if you have a big area it will take a while for this routine to finish. 

If you want to play with this data in a GIS, you would change line 2 of the cell below to:

    lsdtt_parameters = {"print_segmented_M_chi_map_to_csv" : "true", 
                        "print_basin_raster" : "true",
                        "convert_csv_to_geojson" : "true"}

In [None]:
command_line_tool = "lsdtt-chi-mapping"
lsdtt_parameters = {"print_segmented_M_chi_map_to_csv" : "true", 
                    "print_basin_raster" : "true"}

r_prefix = Dataset_prefix+"_SRTM30_UTM"
w_prefix = Dataset_prefix+"_SRTM30_UTM"
lsdtt_drive = lsdmw.lsdtt_driver(command_line_tool,read_prefix = r_prefix,
                                 write_prefix= w_prefix,
                                 read_path = "./",
                                 write_path = "./",
                                 parameter_dictionary=lsdtt_parameters)
lsdtt_drive.print_parameters()
lsdtt_drive.run_lsdtt_command_line_tool()

We can use the `!ls` command to see if the file is there. The `!` tells the notebook to pass the command to the underlying Linux operating system, and `ls` (short for list) is a Linux command that lists the contents of the current directory.

In [None]:
!ls

Okay, the data is there. With the prefix used in this notebook it is called `Sorbas_v2_SRTM30_UTM_MChiSegmented.csv`. Let's load it with `geopandas`.

In [None]:
segmented_data_map_name = w_prefix+"_MChiSegmented.csv"
df = pd.read_csv(segmented_data_map_name)
gdf = gpd.GeoDataFrame(
    df, geometry=gpd.points_from_xy(df.longitude, df.latitude))
gdf.crs = "EPSG:4326" 
print(gdf.head())

Let's plot the points.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib.transforms import offset_copy
import cartopy.crs as ccrs
import cartopy.io.img_tiles as cimgt
plt.rcParams['figure.figsize'] = [10, 10]


bounds = gdf.total_bounds
stamen_terrain = cimgt.Stamen('terrain-background')

fig = plt.figure()

# Create a GeoAxes in the tile's projection.
ax = fig.add_subplot(1, 1, 1, projection=stamen_terrain.crs)

# Limit the extent of the map to a small longitude/latitude range.
ax.set_extent([bounds[0]-0.05, bounds[2]+0.05, bounds[1]-0.05, bounds[3]+0.05], crs=ccrs.Geodetic())

# Add the Stamen data at zoom level 11.
ax.add_image(stamen_terrain, 11)

# Add the channel data
gdf2 = gdf.to_crs(epsg=3857)    # We have to convert the data to the same 
                               # system as the map tiles. It happens to be this one. 
                               # This epsg code is used for all map tiles (like google maps)

# IF YOU WANT TO PLOT ONE BASIN, UNCOMMENT THE LINE BELOW        
#gdf2 = gdf2[(gdf2['basin_key'] == 0)]
gdf2.plot(ax=ax, markersize=0.5, column='m_chi', zorder=10,cmap="jet")