# TPM034A Machine Learning for socio-technical systems 
## `Mini-project #2: Heatwave Hotspot Identification (HHI)`

**Delft University of Technology**<br>
**Q2 2022**<br>
**Module manager:** Dr. Sander van Cranenburgh <br>
**Instructors:** Dr. Sander van Cranenburgh, Dr. Nadia Metoui, Dr. Amir Pooyan Afghari <br>
**TAs:**  Francisco Garrido Valenzuela & Lucas Spierenburg <br>

## `Learning objectives:`
This mini-project addresses LO3, LO4, LO5 and LO6 in the course.

After the course, students can:
1. explain fundamental concepts of machine learning (ML).
2. conceptually explain the workings of a selected number of ML models and eXplainable AI (XAI) techniques, and apply these to empirical data.
3. **identify applications of ML and XAI techniques in real-world socio-technical systems**
4. **examine the impact of ML-based solutions and interventions on individuals, organisations, and society through XAI.**
5. **conduct an in-depth analysis of a real-world socio-technical challenge, by applying ML and XAI to empirical data.**
6. **reflect on the strengths and limitations of ML and XAI in real-world socio-technical systems.**

## `Project description`

### **Introduction**

Climate change has brought unprecedented challenges to cities and urban areas across the world. Since 2018, Dutch cities, where the average summer temperature is around 23 °C, have had six heatwaves. In 2019, they experienced the highest temperature ever registered, 39.1 °C. Such events have pushed many urban systems to the edge of their capacity, with a devastating consequence: an increase in mortality, with 400 extra deaths in 2019.

Emergency services are at the core of a thriving city: ambulances rush to health emergencies, firefighters put out fires, and the police help with car accidents. In such cases, it is critical to understand **where calls come from**, especially during a heatwave. Knowing where the **“heatwave hotspots”** (locations with a high number of calls) are can help emergency services allocate limited resources. There is empirical evidence, for example, that more ambulance calls are made from urban heat islands. It is also vital to understand **when help is needed**: during the day, when the temperature is highest, or at night, when the effect of the heat has accumulated. We aim to dive into these spatial and temporal aspects with open data.

The project's goal is to find the heatwave hotspots (spatially and temporally) in three cities of interest (Amsterdam, Rotterdam, and The Hague), and to explain the underlying factors. More specifically:
1. *What were the most affected wijken in Amsterdam, Rotterdam, and The Hague during the 2019 European heatwave?*
2. *At what time of day (daytime or night-time) were those hotspots most affected by the heatwave?*
3. *What factors contributed to the heatwave hotspots in 2019, and did those factors change during the 2020 heatwave?*
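
The spatial and temporal questions can be approached by counting calls per wijk inside a heatwave window and splitting them into daytime and night-time. The following is a minimal sketch on invented rows, not a prescribed method. The heatwave dates and the daytime hours (07:00 to 18:59) are assumptions that you should replace with values you can justify, for example from KNMI records.

```python
import pandas as pd

# Toy stand-in for the ambulance call records. The column names mirror the
# `calls` table, but the rows are invented for illustration.
calls = pd.DataFrame({
    "wk_code": ["WK036304", "WK036304", "WK059915",
                "WK059915", "WK059915", "WK051812"],
    "pmeTimeStamp": pd.to_datetime([
        "2019-07-24 14:05", "2019-07-25 02:30", "2019-07-24 13:10",
        "2019-07-26 23:45", "2019-07-26 15:20", "2019-08-10 12:00",
    ]),
})

# ASSUMPTION: placeholder heatwave window (end is exclusive) and daytime hours.
HEATWAVE_START = pd.Timestamp("2019-07-22")
HEATWAVE_END = pd.Timestamp("2019-07-28")
DAY_START_HOUR, DAY_END_HOUR = 7, 19

# Keep only the calls made during the heatwave window.
ts = calls["pmeTimeStamp"]
heatwave_calls = calls.loc[(ts >= HEATWAVE_START) & (ts < HEATWAVE_END)].copy()

# Label each call as "day" or "night" based on its hour.
hours = heatwave_calls["pmeTimeStamp"].dt.hour
is_day = (hours >= DAY_START_HOUR) & (hours < DAY_END_HOUR)
heatwave_calls["period"] = is_day.map({True: "day", False: "night"})

# Count calls per wijk and period, then rank wijken by total calls.
hotspots = (
    heatwave_calls.groupby(["wk_code", "period"]).size().unstack(fill_value=0)
)
hotspots["total"] = hotspots.sum(axis=1)
hotspots = hotspots.sort_values("total", ascending=False)
print(hotspots)
```

Raw counts favour large wijken, so it may be more informative to compare these counts with a non-heatwave baseline or to normalise them by population before calling a wijk a hotspot.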

### **Data**
You have access to the following datasets:
1. Ambulance call records from the P2000 network for 2018, 2019, and 2020;
2. Socio-economic and built environment data at wijk (district) and buurt (neighbourhood) level from CBS for 2018, 2019, and 2020;
3. Places of interest (e.g. supermarkets, schools, etc.) data from OpenStreetMap.

For an English translation of the column names, please check [this file](source/dictionary.csv).
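
The three datasets need to be combined into one table per wijk before modelling. The sketch below shows one possible way to do this with `pandas.DataFrame.merge`. All rows are invented, and the column names `population`, `share_65_plus` and `n_supermarkets` are hypothetical. It also assumes that every table carries a wijk code column named `wk_code`, which you should verify against the real data.

```python
import pandas as pd

# Per-wijk call counts (invented numbers).
call_counts = pd.DataFrame({
    "wk_code": ["WK036304", "WK059915", "WK051812"],
    "n_calls": [120, 95, 60],
})

# HYPOTHETICAL socio-demographic columns; the real CBS names differ.
socio_dem = pd.DataFrame({
    "wk_code": ["WK036304", "WK059915", "WK059910"],
    "population": [9800, 66000, 73000],
    "share_65_plus": [0.11, 0.13, 0.10],
})

# HYPOTHETICAL points-of-interest counts per wijk.
pois = pd.DataFrame({
    "wk_code": ["WK036304", "WK059915", "WK059910"],
    "n_supermarkets": [6, 14, 17],
})

# Start from the socio-demographic table so that wijken without any
# recorded call are kept. `validate` raises if a wijk code is duplicated.
features = socio_dem.merge(pois, on="wk_code", how="left", validate="one_to_one")
dataset = features.merge(call_counts, on="wk_code", how="left", validate="one_to_one")

# A wijk without a match in `call_counts` had no recorded calls.
dataset["n_calls"] = dataset["n_calls"].fillna(0).astype(int)

# Wijk codes that appear in the calls but not in the CBS table deserve a look.
unmatched_calls = set(call_counts["wk_code"]) - set(socio_dem["wk_code"])

# Normalise by population to make wijken of different sizes comparable.
dataset["calls_per_1000"] = 1000 * dataset["n_calls"] / dataset["population"]
print(dataset)
print("Call codes missing from socio_dem:", unmatched_calls)
```

A left join from the socio-demographic side is a design choice: it keeps zero-call wijken in the dataset, which matters if the model should also learn where calls do not occur.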

### **Tasks and grading**

There are 8 tasks in this project. In total, 10 points can be earned for these 8 tasks. The weight per task is shown below.

1. **Data preparation:** construct data from multiple data sources, separate training and testing data, handle the missing data, handle outliers. [1 point]
2. **Data discovery and visualisation:** investigate the distribution of variables, the correlation between variables, etc. [1 point]
3. **Selection and application of a proper analytical technique:** create a regression or ML model to predict the heatwave hotspots. [1 point]
4. **Model evaluation and output visualisation:** evaluate the predictive ability of the selected model and create heatmaps of the model predictions. [1 point]
5. **Model explanation:** identify the top 5 factors that contribute most to heatwave hotspots. [1 point]
6. **Reflection (a):** name two strengths and two limitations of using your selected model to predict the heatwave hotspots. [2 points]
7. **Reflection (b):** discuss the impact of these strengths and limitations on individuals, organisations, and society. [2 points]
8. **Reflection (c):** propose an alternative potential solution to mitigate the most severe limitation. [1 point]
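
To make tasks 1, 3, 4 and 5 concrete, here is a minimal sketch of one possible workflow: a train/test split, a random forest regressor, evaluation on held-out data, and a ranking of factors by permutation importance. The data are synthetic and the feature names are hypothetical. The random forest and permutation importance are illustrative choices, not the required techniques for this project.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# SYNTHETIC data: one row per wijk, with HYPOTHETICAL feature names.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "share_65_plus": rng.uniform(0.05, 0.30, n),
    "population_density": rng.uniform(500, 12000, n),
    "green_space_share": rng.uniform(0.0, 0.5, n),
    "median_income": rng.uniform(20000, 60000, n),
    "n_supermarkets": rng.integers(0, 20, n),
    "n_schools": rng.integers(0, 15, n),
})
# Synthetic target: driven mostly by the share of elderly residents,
# somewhat by density, plus noise. The other features are irrelevant.
y = (200 * X["share_65_plus"]
     + 0.002 * X["population_density"]
     + rng.normal(0, 2, n))

# Task 1: hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Task 3: fit a model (a random forest is only one possible choice).
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Task 4: evaluate predictive ability on the held-out data.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE: {mae:.2f}, R2: {r2:.2f}")

# Task 5: rank factors by how much shuffling each one degrades the score.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
importance = pd.Series(result.importances_mean, index=X.columns)
importance = importance.sort_values(ascending=False)
top_factors = importance.head(5)
print(top_factors)
```

Permutation importance is computed on the test set here so that the ranking reflects what the model relies on for unseen wijken. Other XAI techniques covered in the course can be substituted at this step.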


### **Grading criteria:**

For the first 5 tasks:
**Correctness of methods and techniques (45%)**
**Completeness (45%)**
**Coding skills (10%)**

For tasks 6, 7 and 8:
**Depth of critical thinking and creativity (60%)**
**Completeness (40%)**

### **Submission**
When you finish the project, please submit the Jupyter Notebook file of your work to Brightspace and prepare a final presentation (including the results of the tasks) to be delivered on the presentation day.

This is a group project, so each group must submit one Jupyter Notebook file. However, all members of the group are expected to contribute to the project. **Please include a short "Members' contributions" statement in your final presentation, outlining who did what in the project.**

The deadline for submission is **13/01/2023**.

In [1]:
# Getting the data

## You can use the following methods to get the datasets for this project
### get_emergency_calls() to get the entire database of emergency calls
### get_pois() to get the POIs at WK level
### get_socio_dem(agg_level, year, lang) to get the socio-demographic data
#### year: 2018, 2019, 2020
#### agg_level: 'GM', 'WK' or 'BU'
#### lang: 'EN' or 'ND'

import source.heatwaves as htwv

calls = htwv.get_emergency_calls()
pois = htwv.get_pois()
socio_dem = htwv.get_socio_dem(agg_level = 'WK', year = 2018, lang = 'EN')

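
The call records carry a `pmeTimeStamp` column that is printed in day/month/year order (for example `30/08/2020 19:15`). The sketch below assumes the column is stored as text in that format, which you should check with `calls.dtypes`. It parses the timestamps with an explicit format, so that `01/02/2019` is never misread as 2 January, and then derives the date and hour needed for temporal analysis. The three rows are copied from the `calls` table.

```python
import pandas as pd

# A few rows copied from the `calls` table, with timestamps as text.
calls = pd.DataFrame({
    "pmeId": [12284702, 12284715, 17060161],
    "pmeTimeStamp": ["01/01/2017 00:00", "01/01/2017 00:02", "30/08/2020 19:15"],
    "wk_code": ["WK051812", "WK059915", "WK036335"],
    "gm_naam": ["'s-Gravenhage", "Rotterdam", "Amsterdam"],
})

# Parse with an explicit day-first format instead of relying on inference.
calls["pmeTimeStamp"] = pd.to_datetime(
    calls["pmeTimeStamp"], format="%d/%m/%Y %H:%M"
)

# Derive the calendar date and the hour of day for later aggregation.
calls["date"] = calls["pmeTimeStamp"].dt.normalize()
calls["hour"] = calls["pmeTimeStamp"].dt.hour

# Daily number of calls per municipality.
daily = (
    calls.groupby(["gm_naam", "date"]).size().rename("n_calls").reset_index()
)
print(daily)
```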


In [4]:
calls

Unnamed: 0,pmeId,pmeTimeStamp,wk_code,wk_naam,gm_naam
0,12284702,01/01/2017 00:00,WK051812,Wijk 12 Bomen- en Bloemenbuurt,'s-Gravenhage
1,12284706,01/01/2017 00:00,WK036304,Nieuwmarkt/Lastage,Amsterdam
2,12284715,01/01/2017 00:02,WK059915,Charlois,Rotterdam
3,12284725,01/01/2017 00:04,WK059910,Feijenoord,Rotterdam
4,12284746,01/01/2017 00:07,WK059914,Prins Alexander,Rotterdam
...,...,...,...,...,...
784147,17060161,30/08/2020 19:15,WK036335,IJburg West,Amsterdam
784148,17060178,30/08/2020 19:21,WK036329,Dapperbuurt,Amsterdam
784149,17060179,30/08/2020 19:23,WK059910,Feijenoord,Rotterdam
784150,17060180,30/08/2020 19:23,WK051826,Wijk 26 Bezuidenhout,'s-Gravenhage
