<a href="https://colab.research.google.com/github/laura-turnbull-lloyd/STDH_teaching/blob/main/STDH2324_Lecture7_Morakot_landslides.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Landslides following Typhoon Morakot and their effectiveness

Typhoon Morakot was the deadliest typhoon in the recorded history of Taiwan, and formed over the open Pacific Ocean within a monsoon trough on August 2, 2009.


## Generating a landslide inventory using GEE
First of all, you are going to use Google Earth Engine (GEE) to create your own landslide inventory (putting into practice what you learned in the first part of the module).

You will use ALDI (the Automatic Landslide Detection Index) (Milledge et al 2021) in GEE, which is based on detecting changes in vegetation cover using the Normalized Difference Vegetation Index. The GEE code to run ALDI is provided for you here: https://code.earthengine.google.com/4b2e372be0cdd10ba2b088bae2d97721

To run ALDI for Taiwan, you need to modify the code, making the changes and additions detailed below. First, I suggest you scroll through the script to get a sense of what the code does and which image collections it uses.


### Set the study area of interest.
This is done by specifying the coordinates (longitude, latitude) of the bottom left and upper right corners of a bounding box that encompasses the study area (lines 19 and 20 of the script).

Change the lower left coordinate to be ‘120.05, 21.6’ and change the upper right coordinate to be ‘121.4, 23.7’.
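
As a quick check on the coordinate order, the bounding box can be sketched in plain Python. This is an illustration only and is not part of the GEE script; the names `AOI_BOUNDS` and `in_bounds` and the test point are hypothetical.

```python
# Bounding box from the text, stored as longitude/latitude limits
AOI_BOUNDS = {"lon_min": 120.05, "lat_min": 21.6, "lon_max": 121.4, "lat_max": 23.7}

def in_bounds(lon, lat, b=AOI_BOUNDS):
    """Return True if the point (lon, lat) lies inside the bounding box."""
    return b["lon_min"] <= lon <= b["lon_max"] and b["lat_min"] <= lat <= b["lat_max"]

# A made-up point in southern Taiwan, given as (longitude, latitude)
print(in_bounds(120.7, 22.7))
# The same point with the coordinates swapped falls outside the box
print(in_bounds(22.7, 120.7))
```

Swapping longitude and latitude is an easy mistake to make, and a check like this shows how quickly it moves a point out of the study area.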



### Set the start and end of the trigger event.
You then have to set the start and end dates of the trigger event in the format YYYY-MM-dd (year-month-day) (lines 29 and 30 of the script):

	Start date: 2nd August 2009

	End date: 13th August 2009
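
If the script expects the dates as year-month-day strings with a four-digit year (an assumption worth checking against lines 29 and 30 of the script), the two strings can be sketched in Python as follows. The variable names are hypothetical.

```python
from datetime import date

# Trigger event dates from the text
start = date(2009, 8, 2)
end = date(2009, 8, 13)

# isoformat() gives year-month-day strings with a four-digit year
start_str = start.isoformat()
end_str = end.isoformat()

# Length of the trigger window in days
duration_days = (end - start).days

print(start_str, end_str, duration_days)
```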

The basic ALDI Google Earth Engine script produces a raster map of ALDI which you can download. The key thing we’re interested in is where landslides occurred, so we can add some extra routines to postprocess the data in Google Earth Engine, which will make it quicker to download the outputs of the analysis to your Google Drive.

We’re also only going to focus the analysis on the southern part of Taiwan (within the Region of Interest, specified by the bounding box at the start), where rainfall intensities were highest during Morakot. To do this, we need to clip the ALDI map to the bounding box. At the end of the script, paste the following:

```
var ALDIclip = ALDI.clip(AOI);
```
### Mask ALDI less than 0.085.
You’re going to map out landslides, assuming that an ALDI value of 0.085 or greater corresponds to a typhoon-driven landslide scar. To do this, you will mask out all ALDI values that are less than 0.085.
At the end of the script paste the following:
```
// Postprocessing of ALDI results
// Remap values
var ALDImasked = ALDIclip.updateMask(ALDIclip.gte(0.085));
```
These are real (decimal) values. To get a simple identifier of whether or not a landslide occurred, we can convert the masked values into integer (whole number) values, so that the whole area of a landslide scar has the same numeric identifier. To do this, append the following code to your script:
```
var floatToInt = ALDImasked.toInt()
```
And then to visualize the extracted landslides:
```
Map.addLayer(floatToInt, {color: 'FFFFFF'}, 'landslides085');
```
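
The effect of the masking and integer conversion can be sketched outside GEE with NumPy. The pixel values below are made up, and the sketch assumes that GEE's `toInt()` drops the decimal part in the same way as NumPy's integer cast; it illustrates the idea and does not replace the GEE code.

```python
import numpy as np

# Made-up ALDI values for six pixels (illustration only)
aldi = np.array([0.01, 0.05, 0.085, 0.12, 0.30, 0.07])

# Keep pixels with ALDI >= 0.085 and mask out the rest
masked = np.ma.masked_where(aldi < 0.085, aldi)

# Casting to integer truncates the decimals, so every kept pixel gets the same value
as_int = masked.astype(int)

# Number of pixels that survive the mask
n_landslide_pixels = int(masked.count())

print(masked)
print(as_int)
print(n_landslide_pixels)
```

After the cast, neighbouring landslide pixels share one value, which is what allows them to be merged into a single polygon in the next step.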
### Converting the landslide pixels into polygons:
The next step is to convert the pixels identified as landslides to a vector (feature class) and visualise the result on the map.
```
var vector = floatToInt.reduceToVectors({
  reducer: ee.Reducer.countEvery(),
  geometry: AOI,
  geometryType: 'polygon',
  scale: 30,
  maxPixels: 1e8
});
var result = ee.FeatureCollection(vector);
Map.addLayer(result, {color: 'red'}, 'landslide_polygons');
```
Finally, you will export the landslide features to your Google Drive. Paste the following code at the end of the script:

```
Export.table.toDrive({collection: result, description: 'landslides_Morakot', fileFormat: 'GeoJSON'});
```

**You can now run the script to detect landslides following Typhoon Morakot.**

To run the code, press ‘Run’. Any errors will be shown in the console window. Note that it might take a few minutes for the landslide polygons to display on the map.

Take the time to zoom into the map and toggle layers on and off to get a sense of how well the algorithm has detected areas that have had landslides. In particular, toggle the pre- and post-event Landsat 7 false colour composites on and off and zoom in to see where landslides have occurred. Then turn on the landslide polygons to see how well they’ve picked out the landslide areas that you can see.

Once the code has finished running, the layers will load, and files available to download will be shown in Tasks. Click on ‘Run’ next to ‘landslides_Morakot’ to save the file to your Google Drive (leave the name and format as they are).

You have now created the landslide inventory in GEE for Typhoon Morakot.

**How well do you think the ALDI algorithm has mapped out Typhoon Morakot landslides?**

**What are some pros and cons of this approach?**


## Importing your landslide inventory into Python
To import the landslide data to Google Colab, we first need to install and load the packages that we will use:

In [None]:
# install the packages used in this notebook
!pip install pandas fiona shapely pyproj rtree geopandas

import pandas as pd
# to make sure that pandas outputs values in an easy to read format, we can customise the output format:
pd.set_option('display.float_format', lambda x: '%.5f' % x)

import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
import shapely
import pyproj
#import rtree

import seaborn
from shapely.geometry import Polygon

Enable Google Colab to access your Google Drive (you will see a couple of pop-up windows asking you to allow Colab to access your Google Drive account).

In [None]:
from google.colab import drive
drive.mount('/content/drive')

### Opening and viewing the landslide data
To read in and view your Morakot landslide inventory:

In [None]:
# read in the landslide data that you just created (in geojson format)
LS = gpd.read_file("/content/drive/MyDrive/landslides_Morakot.geojson") # this is the landslide data that you created in GEE

#Let's inspect the data
LS

Each row in the data is one landslide polygon, so the number of rows (roughly 11,500 here) is the number of individual landslide polygons.

# Determining the coordinate system of the data
It's important to be aware of the coordinate system of the data you're working with. You can find it out using the ```crs``` attribute. It's easy to make mistakes in spatial data analysis when you aren't aware of the coordinate system of your data. This becomes even more important when you are working with more than one dataset.


In [None]:
LS.crs

We can see above that the landslide data is in a geographic coordinate system. As we will ultimately want to calculate landslide area and volume (for event effectiveness), we need to work with the data in a projected coordinate system (i.e. with units in meters rather than decimal degrees). Therefore, we need to reproject the data.

A sensible choice is a UTM coordinate system. Taiwan is in UTM zone 51N, and the EPSG code used here is EPSG:3829.

You can read more about UTM coordinate systems here:
https://www.usgs.gov/faqs/how-are-utm-coordinates-measured-usgs-topographic-maps
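
UTM zones are 6 degrees of longitude wide and are numbered eastwards from 180°W, so the zone for Taiwan can be checked with a short calculation. This is a minimal sketch for longitudes from -180 up to (but not including) 180 degrees; the function name `utm_zone` is hypothetical.

```python
import math

def utm_zone(lon_deg):
    """Return the UTM zone number for a longitude in decimal degrees (-180 <= lon < 180)."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) + 1

# Western and eastern longitudes of the study area bounding box
print(utm_zone(120.05), utm_zone(121.4))
```

Both edges of the bounding box fall in the same zone, so a single UTM projection covers the whole study area.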

To reproject the data:

In [None]:
#reproject the landslide data (the older {'init': 'epsg:3829'} syntax is deprecated)
LS_proj  = LS.to_crs(epsg=3829)

If we inspect the data again, we can see that the coordinates in the 'geometry' have now changed, reflecting the new coordinate system.

In [None]:
LS_proj

# Plotting spatial data
It's always useful to visually inspect spatial data. We can plot the raw spatial data using the code below. In this code, we undertake the following steps:

1. Create a figure named f with one axis named ax by using the command plt.subplots (part of the library matplotlib, which we have imported at the top of the notebook). Note how the method returns two elements, and we can assign each of them to objects with different names (f and ax) by simply listing them at the front of the line, separated by commas.

2. We plot the geographies and tell the function that we want it to draw the polygons on the axis we are passing, ax. This method returns the axis with the geographies in them, so we make sure to store it on an object with the same name, ax.

3. We draw the entire plot by calling plt.show().

For more information on matplotlib plotting conventions, see the matplotlib documentation.

You can also save figures outside of this notebook, using the plt.savefig command -- see below for an example. Any figures you save will be saved in the current working directory and therefore will need to be moved if you want to access them after the session has ended.

In [None]:
# Setup figure and axis
f, ax = plt.subplots(1, figsize=(16, 8))
# Plot layer of polygons on the axis
ax = LS_proj.plot(ax=ax)
# Remove axis frames
#ax.set_axis_off()

# Tidy up the layout before saving or displaying
plt.tight_layout()

# Save figure to a PNG file (do this before plt.show(), otherwise the saved image can be blank)
plt.savefig('Taiwan_landslides.png')

# For a very high resolution image we can add the dpi in the command, e.g.
#plt.savefig('Taiwan_landslides.png', dpi = 1080) # I've left this line commented, as you don't need to run it for now.

# Display
plt.show()


# Calculating the area of each landslide
Thus far, we don't know the area of each landslide polygon. We can calculate polygon area using the following commands:

In [None]:
LS_proj['LS_area'] = LS_proj['geometry'].area # units are in m^2

LS_proj # shows the dataframe below
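
A useful sanity check on the areas: since the polygons were vectorised at a 30 m scale in GEE, a single-pixel landslide would be expected to have an area of roughly 900 m². The sketch below computes the planar area of one idealised pixel with the shoelace formula. The function `polygon_area` is a hypothetical helper for illustration and is not how geopandas is called.

```python
def polygon_area(coords):
    """Planar polygon area from a list of (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A single idealised 30 m x 30 m pixel, in projected coordinates (meters)
pixel = [(0, 0), (30, 0), (30, 30), (0, 30)]
print(polygon_area(pixel))
```

If the smallest values in the LS_area column are far from this figure, it is worth checking that the data really were reprojected before the area was calculated.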

# Exploratory data analysis
As we've established over the previous weeks, it's always useful to undertake exploratory data analysis to determine key characteristics about the data (hazard) that we're working with.

We can use the pandas summary statistics (describe) function to be able to explore the data.

The describe function gives the:
> count

> mean

> standard deviation

> minimum

> 25th percentile

> 50th percentile (median)

> 75th percentile

> maximum

of all numeric columns in the dataset.

In [None]:
LS_proj.describe()


As part of the exploratory data analysis, let's see if the frequency-area distribution shows power-law characteristics.

Run the code below to view the landslide frequency-area distribution:

In [None]:
def plot_loghist(ax, x, bins):
  # use the linear histogram only to find the range of the data
  hist, edges = np.histogram(x, bins=bins)
  # build logarithmically spaced bin edges over the same range
  logbins = np.logspace(np.log10(edges[0]), np.log10(edges[-1]), len(edges))
  ax.hist(x, bins=logbins)
  ax.set_ylabel("Frequency")
  ax.set_xlabel("LS area [m^2]")
  ax.set_xscale('log')
  ax.set_yscale('log')

fig, ax = plt.subplots()

plot_loghist(ax, LS_proj["LS_area"], 15) # you can alter the number of bins by changing the number at the end of this line
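
To see why logarithmically spaced bins are used for this plot, it can help to look at a small set of bin edges. In the sketch below the range (900 to 9,000,000) is made up; the point is that each bin is a constant multiple of the previous one, so the bins look equally wide on a log axis even though their widths in m² grow rapidly.

```python
import numpy as np

# Five log-spaced bin edges over a made-up range of areas
edges = np.logspace(np.log10(900), np.log10(9e6), 5)

# Ratio between neighbouring edges (constant for log-spaced bins)
ratios = edges[1:] / edges[:-1]

# Bin widths in the original units (these increase with each bin)
widths = np.diff(edges)

print(edges)
print(ratios)
print(widths)
```

Because the bins widen with area, the counts in the larger bins cover a much wider range of landslide sizes than those in the smaller bins, which is worth keeping in mind when reading the histogram.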

**Do you think this plot shows power-law behaviour?**

**Based on this frequency-area distribution, which landslide areas do you expect would do most of the "geomorphic work"?**

# Calculating event effectiveness

To be able to determine effectiveness, we must be able to
  1.	convert landslide area to some measure of effect, and
  2.	calculate the effect of landslides in each size bin

A reasonable measure of landslide ‘effect’ is the landslide volume – this indicates the amount of erodible sediment that was created by the landslide, and which may have hazardous impacts as it is mobilized and moved downstream. To determine this, we need to use an empirical scaling relationship to convert area to volume.

Chen et al (2019) estimated that the scaling relation between landslide area and volume for landslides in Taiwan is:
\begin{equation}
    V = 0.106 A^{1.25}
\end{equation}
where *A* is area in m<sup>2</sup> and *V* is volume in m<sup>3</sup>.
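
As a worked example of the scaling relation, the sketch below evaluates it for a few made-up areas using plain Python numbers. The function name `landslide_volume` and the parameter names `alpha` and `gamma` are hypothetical labels for the coefficient and exponent.

```python
def landslide_volume(area_m2, alpha=0.106, gamma=1.25):
    """Volume in m^3 from area in m^2 using V = alpha * A**gamma."""
    return alpha * area_m2 ** gamma

# Made-up example areas in m^2
for area in (900.0, 10_000.0, 1_000_000.0):
    print(area, landslide_volume(area))
```

Because the exponent is greater than 1, volume grows faster than area: a landslide with 100 times the area has about 316 times the volume (100 raised to the power 1.25).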

We can then apply this area-volume scaling relationship to each individual landslide, and then sum up the individual volumes.

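Enter this equation into the code block below to calculate landslide volume for all landslides in the dataset. Store the result in a new column of LS_proj named 'volume', as the code further down expects a column with this name.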


In [None]:
# enter your code here


To calculate the effect of landslides in each size bin, you can use the following code to cut the landslide areas into logarithmically spaced bins, and then group the data by bin.

In [None]:
bin_edges = np.logspace(np.log10(LS_proj.LS_area.min()), np.log10(LS_proj.LS_area.max()), 30)
# The cut function converts continuous data into discrete bins; include_lowest=True keeps the smallest landslide in the first bin.
g = pd.cut(LS_proj['LS_area'], bins=bin_edges, include_lowest=True)
summaryLS = LS_proj.groupby(g, observed=True).agg(area_count=('LS_area','count'), volume_sum=('volume','sum'), area=('LS_area', 'mean'), volume=('volume', 'mean')).astype(float)
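
The cut-and-group pattern is easier to follow on a tiny table. The sketch below uses five made-up landslide areas and three hand-picked bins rather than the real inventory; the names `toy`, `groups` and `summary` are hypothetical.

```python
import pandas as pd

# Made-up landslide areas (m^2); volumes from the scaling relation in the text
toy = pd.DataFrame({"LS_area": [100.0, 200.0, 1500.0, 2000.0, 30000.0]})
toy["volume"] = 0.106 * toy["LS_area"] ** 1.25

# Three hand-picked bins: (10, 1000], (1000, 10000], (10000, 100000]
groups = pd.cut(toy["LS_area"], bins=[10, 1000, 10000, 100000])

# Count the landslides and sum their volumes within each bin
summary = toy.groupby(groups, observed=True).agg(
    area_count=("LS_area", "count"),
    volume_sum=("volume", "sum"),
)
print(summary)
```

In this toy table the single landslide in the largest bin accounts for most of the summed volume, even though the smaller bins contain more landslides.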


Now, plot the data:

In [None]:
fig, ax = plt.subplots()
ax.plot(summaryLS.area, summaryLS.area_count, color = 'red', label = 'frequency')
ax.plot(summaryLS.area, summaryLS.volume, color = 'green', label = 'volume')
ax.plot(summaryLS.area, summaryLS.volume_sum, color = 'blue', label = 'effectiveness (total sed mobilised)')
ax.set_xlabel("Area m^2")
ax.set_ylabel("Frequency/Volume m^3/Effectiveness m^3")
ax.set_yscale('log')
ax.set_xscale('log')
ax.legend()
plt.show()

**What do you observe about landslide effectiveness with an increase in landslide area?**

**What are some limitations/uncertainties of this analysis?**