# TPM034A Machine Learning for socio-technical systems 
## `Lab session 05: Explainable AI of a classification model for appliance usage prediction`

**Delft University of Technology**<br>
**Q2 2024**<br>
**Instructor:** Giacomo Marangoni <br>
**TAs:**  Francisco Garrido-Valenzuela & Lucas Spierenburg <br>

## `Instructions`

**Lab sessions aim to:**<br>
* Show and reinforce how models and ideas presented in class are used in practice.<br>
* Help you gather hands-on machine learning skills.<br>

**Lab sessions are:**<br>
* Learning environments where you work with Jupyter notebooks and where you can get support from TAs and fellow students.<br> 
* Not graded and do not have to be submitted. 

### `Use of AI tools`
AI tools, such as ChatGPT and Copilot, are great tools to assist with programming. Moreover, in your later careers you will work in a world where such tools are widely available. As such, we **encourage** you to use AI tools **effectively** (both in the lab sessions and assignments). However, be careful not to overestimate the capacity of AI tools! AI tools cannot replace you: you still have to conceptualise the problem, dissect it and structure it, to conduct proper analysis and modelling. We recommend being especially **cautious** about using AI tools for the more conceptual and reflection-oriented questions. 

### `Google Colab workspace set-up`

Uncomment the code lines in the following cell if you are running this notebook on Colab.

In [1]:
#!git clone https://github.com/TPM034A/Q2_2024
#!pip install -r Q2_2024/requirements_colab.txt
#!mv "/content/Q2_2024/Lab_sessions/lab_session_05/data" /content/data

# `Application: Explainable AI of a classification model for appliance usage prediction` <br>

#### **Introduction**

In this notebook you are going to train and explain a Random Forest Classifier model to predict the probability of using a given appliance in the next 24 hours.

#### **Data**

1. `data/devices.pkl`: A pickle file with a pandas.DataFrame of Wh hourly energy consumption by appliance within a household of the REFIT dataset, over a period of about two years.
2. `data/weather.pkl`: A pickle file with a pandas.DataFrame of normalized weather variables: `dwpt` is Dew Point (related to moisture), `rhum` is relative humidity, `temp` is temperature, `wdir` is wind direction, `wspd` is wind speed.
3. `data/price.pkl`: A pickle file with a pandas.Series with electricity day-ahead prices in GBP/MWh.


**Learning objectives**. After completing the following exercises you will be able to: <br>

- prepare and explore appliance-level smart meter data for training a ML model to predict whether an appliance will be used or not at a given hour;
- use XAI tools to gain insights on usage behaviour;
- reflect on practical and ethical implications.

### Data preparation

In [5]:
# Load 'data/devices.pkl', 'data/weather.pkl' and 'data/price.pkl'.
# Weather is the same as in the lab session.
# Devices contains Wh consumed by given devices at each timestamp.
# Price contains electricity prices for each timestamp.

In [7]:
# Add a column "Load" to devices as the sum of the consumption of all appliances

In [9]:
# Merge all the datasets in one dataframe
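
One possible way to build the `Load` column and merge the three datasets is sketched below. It assumes all three objects share the same hourly `DatetimeIndex`; the tiny frames, their values, the `price` column name and the choice of an inner join are illustrative assumptions, not part of the lab data.

```python
import pandas as pd

# Synthetic stand-ins for the three pickles (values are made up for illustration).
idx = pd.date_range("2014-03-10", periods=4, freq="h")
devices = pd.DataFrame(
    {"Television": [0, 50, 60, 0], "Kettle": [100, 0, 0, 0]}, index=idx
)
weather = pd.DataFrame({"temp": [0.1, 0.2, 0.3, 0.4]}, index=idx)
price = pd.Series([35.0, 36.0, 40.0, 38.0], index=idx)

# Sum over the appliance columns only, captured before "Load" is added.
appliance_cols = list(devices.columns)
devices["Load"] = devices[appliance_cols].sum(axis=1)

# Join on the shared timestamp index; a Series needs a name to become a column.
# An inner join (an assumption) keeps only timestamps present in all three.
df = devices.join(weather, how="inner").join(price.rename("price"), how="inner")
print(df)
```

Capturing the appliance column names before adding `Load` avoids accidentally including the total in later per-appliance computations.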

In [11]:
# Consider zeros in temperature ("temp" column) as NAs, and interpolate the resulting missing values linearly
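
A minimal sketch of the zero-to-NA step, using a made-up `temp` series rather than the lab data. It assumes that an exact zero always marks a missing reading and that the rows are evenly spaced in time, so that plain linear interpolation is appropriate.

```python
import pandas as pd

# Synthetic hourly temperatures; the zeros stand in for missing readings.
idx = pd.date_range("2014-03-10", periods=6, freq="h")
df = pd.DataFrame({"temp": [0.2, 0.0, 0.0, 0.8, 1.0, 0.9]}, index=idx)

# mask() turns the zeros into NaN; interpolate() then fills each gap
# linearly between its nearest valid neighbours.
df["temp"] = df["temp"].mask(df["temp"] == 0).interpolate(method="linear")
print(df)
```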

### Data exploration

In [13]:
# Plot NAs count per day over the whole time range. Are there any evident missing periods?

# Hint: use isna(), groupby(), index.date and sum()
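
The hint can be followed along these lines. This is a sketch on synthetic data with gaps inserted by hand; the column names and gap positions are invented for illustration.

```python
import numpy as np
import pandas as pd

# Three days of synthetic hourly data with a few gaps on the second day.
idx = pd.date_range("2014-03-10", periods=72, freq="h")
df = pd.DataFrame({"Television": 1.0, "Kettle": 2.0}, index=idx)
df.iloc[24:30, 0] = np.nan  # 6 missing Television readings on day 2
df.iloc[30:32, 1] = np.nan  # 2 missing Kettle readings on day 2

# isna() gives a boolean frame; grouping by calendar date and summing
# counts the missing values per column and per day.
na_per_day = df.isna().groupby(df.index.date).sum()

# Collapse across columns to get one count per day.
total_na_per_day = na_per_day.sum(axis=1)
print(total_na_per_day)
```

Calling `total_na_per_day.plot()` (with matplotlib installed) draws the daily counts, so extended missing periods show up as sustained plateaus.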

In [16]:
# In 1 year, from 2014-03-10 to 2015-03-09, which appliance cumulatively consumed the most energy?

In [19]:
# Which appliance was turned on for the highest number of hours (i.e. consumption > 1Wh)?

In [21]:
# Which appliance has the highest hourly consumption?

In [20]:
# Plot, for each hour of the day (x-axis), the fraction of days in the given year (y-axis) on which each appliance (color) is used
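
A boolean "in use" mask (consumption > 1 Wh, as defined in the questions above) supports both the hours-on count and the per-hour usage fraction. The sketch below uses two invented appliances over four synthetic days; it assumes one row per hour with no gaps, so that the mean of the mask within an hour-of-day group equals the fraction of days.

```python
import pandas as pd

# Four synthetic days: the TV runs every evening at 20:00, the kettle once at 08:00.
idx = pd.date_range("2014-03-10", periods=96, freq="h")
devices = pd.DataFrame({"Television": 0.0, "Kettle": 0.0}, index=idx)
devices.loc[devices.index.hour == 20, "Television"] = 60.0
devices.loc[pd.Timestamp("2014-03-10 08:00"), "Kettle"] = 100.0

# True wherever an appliance consumed more than 1 Wh in that hour.
used = devices > 1

# Number of hours each appliance was on, and the appliance with the most.
hours_on = used.sum()
most_used = hours_on.idxmax()

# Mean of the mask per hour of day = fraction of days the appliance was on.
fraction_by_hour = used.groupby(used.index.hour).mean()
print(most_used)
print(fraction_by_hour.loc[[8, 20]])
```

`fraction_by_hour.plot()` would then draw one line per appliance, with the hour of day on the x-axis.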

### Model training

In [24]:
# Prepare train and test datasets with the following characteristics:
# Train data period: from 2014-03-10 to 2015-03-09
# Test data period: 2015-03-10
# y feature: usage of television (i.e. consumption > 1)
# X features:
# - hour (int)
# - weekday (int)
# - weather variables
# - usage 24h before (1 if television was used at the same hour the day before)
# - activity 24h before (1 if any appliance was used at the same hour the day before)
# - usage yesterday (1 if television was used at least 1 hour during the whole day before)
# - price
# Drop NAs

# Hint for computing "usage yesterday": group usage by date, take the max, shift, then reindex to hourly using forward fill 
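
The lagged features can be sketched as follows on a synthetic television series. Two assumptions are worth flagging: `shift(24)` only means "same hour the day before" if the index is strictly hourly with no missing rows, and the variable names are illustrative.

```python
import pandas as pd

# Three synthetic days; the TV is on only at 20:00 on the first day.
idx = pd.date_range("2014-03-10", periods=72, freq="h")
tv = pd.Series(0.0, index=idx, name="Television")
tv.iloc[20] = 50.0

# Binary usage flag (consumption > 1).
usage = (tv > 1).astype(int)

# "Usage 24h before": the flag from 24 hourly rows earlier.
usage_24h_before = usage.shift(24)

# "Usage yesterday": daily max -> shift by one day -> forward-fill to hourly.
daily = usage.groupby(usage.index.date).max()
daily.index = pd.to_datetime(daily.index)  # dates -> midnight timestamps
usage_yesterday = daily.shift(1).reindex(usage.index, method="ffill")

print(usage_yesterday.iloc[[0, 24, 48]])
```

The first day has no "yesterday", so its rows are NaN and are removed by the final drop-NAs step.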

In [24]:
# Train a Random Forest Classifier according to the directions given above.
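
A minimal sketch of the training step with scikit-learn, on synthetic stand-in data. The reduced feature set, the invented "evening only" labelling rule, and the `n_estimators=100` and `random_state=0` settings are all illustrative assumptions; the exercise does not prescribe hyperparameters.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic features standing in for the prepared training set.
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "hour": rng.integers(0, 24, n),
    "weekday": rng.integers(0, 7, n),
    "price": rng.normal(40, 10, n),
})

# Made-up target: the TV is "used" between 18:00 and 22:00.
y = ((X["hour"] >= 18) & (X["hour"] <= 22)).astype(int)

# Hyperparameters are assumptions, chosen only for reproducibility.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Column 1 of predict_proba is the probability of class 1 (TV in use).
proba = model.predict_proba(X)[:, 1]
print(proba[:5])
```

Using `predict_proba` rather than `predict` keeps the output as a probability, which is what the explanation tools in the next section operate on.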

In [27]:
# Plot the actual test usage against the predicted usage

### Model explanation

In [1]:
# Use LIME to understand the prediction for 2015-03-10 (test day) at 14:00 and 20:00. What can you conclude?

In [103]:
# Compute the SHAP values for the 24 hours of the test dataset with a KernelExplainer.
# Use a sample of 100 rows from the train dataset as the background dataset.
# Slice the SHAP matrix returned by calling the explainer to keep only the values for the positive class (television in use, class = 1).

In [73]:
# Plot the SHAP values for each feature and test sample

In [76]:
# What are the 3 most predictive features?

In [93]:
# Explain the 14:00 and 20:00 predictions of the test day. How could you interpret the difference? Also compare the result with what you found with LIME.

In [83]:
# In what hours is the expected probability of watching television highest, across the train dataset? 
# Hint: use a partial dependence plot
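
To make the idea behind a partial dependence plot concrete, the sketch below computes it by hand instead of using a library helper: for each candidate hour, every row's `hour` is overwritten with that value and the predicted probabilities are averaged. The data, the labelling rule, the model settings and the helper name `partial_dependence_by_hand` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic training data with an invented evening-usage rule.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "hour": rng.integers(0, 24, n),
    "price": rng.normal(40, 10, n),
})
y = ((X["hour"] >= 18) & (X["hour"] <= 22)).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)


def partial_dependence_by_hand(model, X, feature, grid):
    """Average predicted P(class 1) with `feature` forced to each grid value."""
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[feature] = value  # overwrite the feature for every row
        averages.append(model.predict_proba(X_mod)[:, 1].mean())
    return pd.Series(averages, index=pd.Index(grid, name=feature))


pdp_hour = partial_dependence_by_hand(model, X, "hour", range(24))
peak_hour = int(pdp_hour.idxmax())
print(peak_hour)
```

Because the other features keep their observed values, each point on the curve is an average over the training distribution, not a prediction for any single household-hour.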

In [86]:
# Compare the partial dependence plot of expected probability of watching television (y-axis) by hour (x-axis) with
# a scatter plot of SHAP values (y-axis) for the 24 hours test samples, by hour (x-axis).
# Comment on their similarity/difference.

# Hint: use shap.plots.scatter for the latter

In [None]:
# What is the expected probability of watching television given the electricity price throughout the train dataset?
# How could one interpret this relationship?

### Reflections

In [None]:
# What strategies could you use to improve the accuracy of predicting TV usage? What are the implications for interpretability?

In [None]:
# What could be the benefits to a user of an XAI-informed model for predicting appliance usage? What could be the risks?