## Summative Assignment 1: Natural Hazard Risk Assessment of Infrastructure

The assignment is divided into three sections: 

Section 1: We will look at the damage assessment of buildings and land-use classes subjected to windstorms.

Section 2: The students are expected to perform the following tasks: 

- a) Extract buildings, power infrastructure and roads from OSM (15%)
- b) Download flood and windstorm (specific storm) data from the Climate Data Store (15%)
- c) Flood damage assessment of roads (30%)
- d) Windstorm damage assessment of buildings / power infrastructure (30%)

The students can choose between either of the following two regions

Section 3: A brief assessment quiz (20%)



## Learning Objectives
<hr>

- To understand the use of **OSMnx** to extract geospatial data from OpenStreetMap.
- To know how to rasterize vector data using **Geocube**.
- To know how to visualise vector and raster data.
- To understand the basic functioning of **Matplotlib** to create a map.
- To know how one can generate routes between two points using **NetworkX**.
- To visualize networks on an interactive map.
- To know how to download data from the Copernicus Climate Data Store using the `cdsapi` and access it through Python.
- To be able to open and visualize this hazard data.
- To know how to access and open information from the Copernicus Land Monitoring Service, specifically the CORINE Land Cover data.
- To understand the basic approach of a natural hazard risk assessment.
- To be able to use the `DamageScanner` to do a damage assessment.
- To interpret and compare the damage estimates.

### Section 1: Windstorm damage assessment

## 1. Introducing the packages
<hr>

Within this tutorial, we are going to make use of the following packages: 

[**GeoPandas**](https://geopandas.org/) is a Python package that extends the datatypes used by pandas to allow spatial operations on geometric types.

[**OSMnx**](https://osmnx.readthedocs.io/) is a Python package that lets you download geospatial data from OpenStreetMap and model, project, visualize, and analyze real-world street networks and any other geospatial geometries. You can download and model walkable, drivable, or bikeable urban networks with a single line of Python code then easily analyze and visualize them. You can just as easily download and work with other infrastructure types, amenities/points of interest, building footprints, elevation data, street bearings/orientations, and speed/travel time.

[**xarray**](https://docs.xarray.dev/) is a Python package that allows for easy and efficient use of multi-dimensional arrays.

[**Matplotlib**](https://matplotlib.org/) is a comprehensive Python package for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.

*We will first need to install the missing packages in the cell below. Uncomment them to make sure we can pip install them.*
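Before installing anything, it can help to check which packages are actually missing. The following is a minimal sketch using only the standard library. The package list is an assumption based on the imports used in this notebook, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Assumed list, based on the imports used in this notebook
required = ["geopandas", "osmnx", "xarray", "rioxarray",
            "matplotlib", "networkx", "cdsapi", "tqdm"]

def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Anything printed here still needs to be installed with pip
print(missing_packages(required))
```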

In [1]:
import os
import cdsapi
import shapely 
import matplotlib
import urllib3
import pyproj

import osmnx as ox
import numpy as np
import xarray as xr
import rioxarray  # registers the .rio accessor used on xarray objects below
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx

from matplotlib.colors import ListedColormap
from zipfile import ZipFile
from io import BytesIO
from urllib.request import urlopen
from tqdm import tqdm

urllib3.disable_warnings()

## 2. Downloading and accessing natural hazard data
<hr>

We are going to perform a damage assessment using both windstorm data and flood data for Europe.

### Windstorm Data

The windstorm data will be downloaded from the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu/). As we have seen during the lecture, and as you can also see by browsing on this website, there is an awful lot of climate data available through this Data Store. As such, it is very valuable to understand how to access and download this information to use within an analysis. To keep things simple, we only download one dataset today: [A winter windstorm](https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-european-wind-storm-indicators?tab=overview). 

We will do so using an **API**, which is the acronym for application programming interface. It is a software intermediary that allows two applications to talk to each other. APIs are an accessible way to extract and share data within and across organizations. APIs are all around us. Every time you use a rideshare app, send a mobile payment, or change the thermostat temperature from your phone, you’re using an API.

However, before we can access this **API**, we need to take a few steps. Most importantly, we need to register on the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu/) portal, as explained in the video clip below:

<img src="https://github.com/ElcoK/BigData_AED/blob/main/_static/images/CDS_registration.gif?raw=1" class="bg-primary mb-1">
<br>

Now, the next step is to access the API. You can now log in on the website of the [Copernicus Climate Data Store](https://cds.climate.copernicus.eu/). After you log in, click on your name in the top right corner of the webpage (next to the login button). On the personal page that opens, you will find your user ID (**uid**) and your personal **API key**. You need to add those in the cell below to be able to download the windstorm.

As you can see in the cell below, we download a specific windstorm that occurred on the 28th of October 2013. This is storm [Carmen (also called St Jude)](https://en.wikipedia.org/wiki/St._Jude_storm). 

In [3]:
uid = 'XXX'     # replace XXX with your CDS user ID
apikey = 'XXX'  # replace XXX with your personal CDS API key

c = cdsapi.Client(key=f"{uid}:{apikey}", url="https://cds.climate.copernicus.eu/api/v2")

c.retrieve(
    'sis-european-wind-storm-indicators',
    {
        'variable': 'all',
        'format': 'zip',
        'product': 'windstorm_footprints',
        'year': '2013',
        'month': '10',
        'day': '28',
    },
    'Carmen.zip')
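If you prefer not to type your credentials into the notebook itself, one option is to read them from environment variables. The sketch below is an illustration only: the variable names `CDS_UID` and `CDS_APIKEY` and the helper `cds_key_from_env` are assumptions, not something the Climate Data Store prescribes.

```python
import os

def cds_key_from_env(uid_var="CDS_UID", key_var="CDS_APIKEY"):
    """Build the 'uid:apikey' string used for the client key in this notebook.

    The environment variable names are hypothetical; use whatever names you set yourself.
    """
    uid = os.environ.get(uid_var)
    apikey = os.environ.get(key_var)
    if not uid or not apikey:
        raise RuntimeError(f"Please set both {uid_var} and {key_var} first.")
    return f"{uid}:{apikey}"
```

The returned string could then be passed as the `key` argument when creating the client, in place of the f-string used above.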


## 3. Visualising the hazard data
<hr>

In [None]:
with ZipFile('Carmen.zip') as zf:
    
    # Let's get the filename first
    file = zf.namelist()[0]
    
    # And now we can open and select the file within Python
    with zf.open(file) as f:
        windstorm_europe = xr.open_dataset(f)

In [None]:
windstorm_europe['FX'].plot()

Unfortunately, our data does not have a proper coordinate system defined yet. As such, we will need to use the `rio.write_crs()` function to set the coordinate system to **EPSG:4326** (the standard global coordinate reference system). 

We also need to tell rioxarray which dimensions hold the spatial coordinates (longitude and latitude). It expects them to be named `x` and `y`, so we use the `rename()` function before we call the `set_spatial_dims()` function.

In [None]:
windstorm_europe.rio.write_crs(4326, inplace=True)
windstorm_europe = windstorm_europe.rename({'Latitude': 'y','Longitude': 'x'})
windstorm_europe.rio.set_spatial_dims(x_dim="x",y_dim="y", inplace=True)

In [None]:
windstorm_europe = windstorm_europe.rio.reproject(XXXX)

In [None]:
windstorm_map = windstorm_europe.rio.clip(area.envelope.values, area.crs)
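
Clipping to `area.envelope` keeps only the grid cells that fall inside the bounding box of the study area. The NumPy sketch below illustrates that idea on a made-up 5 x 5 grid with hypothetical bounds. It is not how rioxarray implements `rio.clip`, and `clip_to_bounds` is an illustrative helper.

```python
import numpy as np

# Made-up grid: longitudes increase eastwards, latitudes decrease from north to south
lon = np.arange(0.0, 5.0, 1.0)       # 0, 1, 2, 3, 4
lat = np.arange(54.0, 49.0, -1.0)    # 54, 53, 52, 51, 50
gust = np.arange(25, dtype=float).reshape(5, 5)  # placeholder values, not real wind speeds

def clip_to_bounds(values, lat, lon, bounds):
    """Keep only the cells whose coordinates lie within (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    rows = (lat >= miny) & (lat <= maxy)
    cols = (lon >= minx) & (lon <= maxx)
    return values[np.ix_(rows, cols)], lat[rows], lon[cols]

# Hypothetical bounding box of a study area
clipped, clipped_lat, clipped_lon = clip_to_bounds(gust, lat, lon, (1.0, 51.0, 3.0, 53.0))
print(clipped.shape)
```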

In [None]:
windstorm_map['FX']. XXXX

## 4. Extracting buildings from OpenStreetMap
<hr>

In [None]:
tags = {"building": True}
buildings = ox.features_from_place(place_name, tags)

There is a lot more data to extract from OpenStreetMap besides land-use information. Let's extract some building data. To do so, we use the *"building"* tag.

In [None]:
buildings.head()

## 5. Analyze and visualize building stock
<hr>

In [None]:
fig,ax = plt.subplots(1,1,figsize=(5,18))

building_year.plot(kind='barh',ax=ax)

ax.tick_params(axis='y', which='major', labelsize=7)

## 6. Windstorm damage
<hr>

To estimate the potential damage of our windstorm, we use the vulnerability curves developed by [Yamin et al. (2014)](https://www.sciencedirect.com/science/article/pii/S2212420914000466). Following their approach, we will apply a sigmoidal vulnerability function satisfying two constraints: (i) a minimum threshold for the occurrence of damage with an upper bound of 100% direct damage; (ii) a high power-law function for the slope, describing an increase in damage with increasing wind speeds. Because only a limited number of vulnerability curves are available for windstorm damage, we will use the damage curve that represents low-rise *reinforced masonry* buildings for all land-use classes that may contain buildings. This is a large oversimplification of the real world, but it should be sufficient for this exercise. When doing a proper stand-alone windstorm risk assessment, one should put more effort into collecting the right vulnerability curves for different building types.
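
To make the idea of a vulnerability curve concrete, here is a minimal sketch of how a tabulated curve can turn a wind speed into a damage fraction by linear interpolation, which is also how `DamageScanner` reads its curves. The curve values below are made up for illustration and are not the values from Yamin et al. (2014) or from `damage_curves.xlsx`. `damage_fraction` is a hypothetical helper.

```python
import numpy as np

# Made-up curve: wind speed (km/h) against damage fraction (0 = no damage, 1 = total loss)
curve_speed = np.array([0.0, 80.0, 120.0, 160.0, 200.0, 250.0])
curve_fraction = np.array([0.0, 0.0, 0.05, 0.30, 0.80, 1.00])

def damage_fraction(wind_speed_kmh):
    """Linearly interpolate the damage fraction for one or more wind speeds.

    np.interp holds the end values outside the tabulated range, so speeds below
    the damage threshold give 0 and speeds above the last point give 1.
    """
    return np.interp(wind_speed_kmh, curve_speed, curve_fraction)

print(damage_fraction(np.array([50.0, 100.0, 180.0, 300.0])))
```

Multiplying such a fraction by the maximum damage of a land-use class gives the damage estimate for one grid cell.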

In [None]:
wind_curves = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='wind_curves')
maxdam = pd.read_excel("https://github.com/ElcoK/BigData_AED/raw/main/week5/damage_curves.xlsx",sheet_name='maxdam')

In [None]:
landuse_map = CLC_region_wind['band_data'].to_numpy()[0,:,:]
wind_map = windstorm['FX'].to_numpy()[0,:,:]

In [None]:
wind_map.shape

In [None]:
wind_map_kmh = wind_map*XXX

In [None]:
wind_damage_CLC = DamageScanner(landuse_map,wind_map_kmh,wind_curves,maxdam)[1]

In [None]:
wind_damage_CLC

### Section 2: Tasks

### A. Extraction of buildings, roads and power infrastructure assets from OSM

### B. Hazard data extraction from CDS

### C. Flood damage assessment of roads

### D. Windstorm damage assessment of buildings

### Section 3: Quiz

#### Example 1: The `DamageScanner` function has been used extensively in this assignment. Explain in detail the sequential flow of this function (your explanation can be added as comments). What is the role of the `cellsize` parameter?

In [None]:
def DamageScanner(landuse_map,inun_map,curve_path,maxdam_path,cellsize=100):
        
    
    landuse = landuse_map.copy()
    
   
    inundation = inun_map.copy()
    
    inundation = np.nan_to_num(inundation)        

    
    if isinstance(curve_path, pd.DataFrame):
        curves = curve_path.values   
    elif isinstance(curve_path, np.ndarray):
        curves = curve_path

   
    if isinstance(maxdam_path, pd.DataFrame):
        maxdam = maxdam_path.values 
    elif isinstance(maxdam_path, np.ndarray):
        maxdam = maxdam_path
        
    
    inun = inundation * (inundation>=0) + 0
    inun[inun>=curves[:,0].max()] = curves[:,0].max()
    waterdepth = inun[inun>0]
    landuse = landuse[inun>0]

    
    numberofclasses = len(maxdam)
    alldamage = np.zeros(landuse.shape[0])
    damagebin = np.zeros((numberofclasses, 4,))
    for i in range(0,numberofclasses):
        n = maxdam[i,0]
        damagebin[i,0] = n
        wd = waterdepth[landuse==n]
        alpha = np.interp(wd,((curves[:,0])),curves[:,i+1])
        damage = alpha*(maxdam[i,1]*cellsize)
        damagebin[i,1] = sum(damage)
        damagebin[i,2] = len(wd)
        if len(wd) == 0:
            damagebin[i,3] = 0
        else:
            damagebin[i,3] = np.mean(wd)
        alldamage[landuse==n] = damage

    
    loss_df = pd.DataFrame(damagebin.astype(float),columns=['landuse','losses','area','avg_depth']).groupby('landuse').sum()
    
    
    return loss_df.sum().values[0],loss_df