<a href="https://colab.research.google.com/github/davidsjohnson/xai_ac_sose25/blob/main/notebooks/exercise4a.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# XAI for Affective Computing (SoSe2025)
# Exercise 4a: Concept-Based Explanations of Facial Expression Recognition

In this notebook you will attempt to generate concept-based explanations for a facial expression recognition (FER) CNN trained on raw image data, using a subset of the [AffectNet dataset](http://mohammadmahoor.com/affectnet/).

We will take a concept-based approach to generating explanations in this notebook, using Concept Relevance Propagation (CRP), which we learned about in the paper ["From attribution maps to human-understandable explanations through Concept Relevance Propagation"](https://www.nature.com/articles/s42256-023-00711-8).

The documentation of the library is still limited, but the [CRP GitHub Repo](https://github.com/rachtibat/zennit-crp) has enough to get us started, so make sure to review the README.

To use this notebook, go through the cells step by step and review the code and comments along the way.

***NOTE**: This notebook runs faster on a GPU, if one is available.*

## Notebook Setup

Make sure to set the `colab` flag below before running the code, based on the environment you are using.

If you are running the notebook locally, make sure to update the Python packages by running `pip install -r requirements.txt` at the command line.

In [None]:
colab = False # set to True if running in Google Colab or False if running locally

In [None]:
if colab:
  !git clone https://github.com/davidsjohnson/xai_ac_sose25.git

In [None]:
import sys
import os

if colab:
  sys.path.append(os.path.realpath('xai_ac_sose25'))
else:
  sys.path.append(os.path.realpath('../'))

In [None]:
from pathlib import Path

import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.metrics import confusion_matrix, classification_report
from scipy.stats import randint, uniform

from PIL import Image

import torch
from torch.utils.data import TensorDataset, DataLoader
from torchvision import datasets, transforms
import torch.nn.functional as F
import torch.nn as nn

import seaborn as sns
import matplotlib.pyplot as plt

from skimage import io

import utils
import img_utils
import models
import evaluate

In [None]:
base_dir = Path('../data/') if not colab else Path('xai_ac_sose25/data/')

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

In [None]:
# download the AffectNet dataset extracted features and a sample set of images for visualization
affnet_dir = utils.download_file('https://uni-bielefeld.sciebo.de/s/EmfF9r93LG4jcT9/download',
                          file_name='affectnet_data.zip',
                          cache_dir=base_dir,
                          extract=True,
                          force_download=False,     # set to False if you have already downloaded the dataset
                          archive_folder='affectnet_data')
affnet_dir

## XAI for FER with Convolutional Neural Nets

### Set Up the PyTorch Data Loader

In [None]:
#class labels
class_names = ['Neutral', 'Happy', 'Sad', 'Surprise', 'Fear', 'Disgust', 'Anger', 'Contempt']

# Setup XAI Data from AffectNet Deep Learning Model
TRAIN_MEAN = [0.485, 0.456, 0.406]
TRAIN_STD = [0.229, 0.224, 0.225]

# transform to preprocess the images
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=TRAIN_MEAN, std=TRAIN_STD),
    transforms.Resize((224, 224))
])

# set up and load the dataset
data_dir = base_dir / 'affectnet_data/affectnet/val_class'
dataset = datasets.ImageFolder(root=data_dir, transform=test_transform)
dataloader = DataLoader(dataset, batch_size=80, shuffle=False)

# load the images for visualization
images = [Image.open(f[0]).convert('RGB').resize((224,224)) for f in dataset.imgs] # load images as PIL objects and resize them
images = [np.array(img) / 255.0 for img in images] # convert to numpy arrays and rescale for display

# get the true labels and class names
y_true = np.array([f[1] for f in dataset.imgs])
y_labels = [class_names[f[1]] for f in dataset.imgs]


### Load Pretrained Model

In [None]:
# download checkpoint
ckpt_link = 'https://uni-bielefeld.sciebo.de/s/0tAa2wPhGxSDjbM/download'
ckpt_path = utils.download_file(ckpt_link,
                                'affectnet.pth',
                                cache_dir= base_dir / 'affectnet/model',
                                extract=False,
                                force_download=False
                                )
ckpt_path

In [None]:
model = models.ResNet18(n_classes=len(class_names), pretrained=True)
model.to(device)
model.load_state_dict(torch.load(ckpt_path, map_location=device))
model.eval();

### Evaluation of Model

This model performs much better than the models trained on the AU dataset, reaching around $60\%$ accuracy. That is still not great, but it is close to the state of the art for the AffectNet dataset.
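An overall accuracy figure can hide large differences between emotion classes, so it can help to break it down per class. The following is a minimal, dependency-free sketch of that breakdown; the helper name `per_class_accuracy` and the toy labels are made up for illustration and are not part of the course `evaluate` module.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred, class_names):
    """Return a dict mapping each class name to the accuracy on its samples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return {class_names[c]: correct[c] / total[c] for c in sorted(total)}


# Toy labels (hypothetical, not real model output)
toy_names = ['Neutral', 'Happy', 'Sad']
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]

acc = per_class_accuracy(toy_true, toy_pred, toy_names)
overall = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(acc, overall)
```

With the notebook's own variables, the same helper could be called as `per_class_accuracy(y_true.tolist(), y_preds.tolist(), class_names)` once the evaluation cell has run.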

In [None]:
inverse_weights = torch.from_numpy(1.0/np.array([74874, 134415, 25459, 14090, 6378, 3803, 24882, 3750])).type(torch.float32).to(device)
loss = torch.nn.CrossEntropyLoss(weight=inverse_weights)
_, _, y_preds, probs = evaluate.evaluate_model(model, dataloader, loss, device=device)

y_preds = np.array(y_preds)
# compute accuracy: the fraction of predictions that match the true labels
(y_preds == y_true).mean()

## Generate Explanations

In [None]:
if colab:
  !pip install -q zennit-crp[fast_img]  

In [None]:
from crp.attribution import CondAttribution
from crp.concepts import ChannelConcept
from crp.helper import get_layer_names

from zennit.composites import EpsilonPlusFlat
from zennit.canonizers import SequentialMergeBatchNorm
from zennit.torchvision import ResNetCanonizer

from crp.visualization import FeatureVisualization
from crp.image import plot_grid, imgify

### Task 1 - Generate CRP Attribution Maps

[CRP GitHub Repo](https://github.com/rachtibat/zennit-crp)

Review the [Attributions Tutorial](https://github.com/rachtibat/zennit-crp/blob/master/tutorials/attributions.ipynb) for info on CRP and help with these tasks

**Task 1.1:** Generate a basic feature attribution map for one example from the test data.  The feature attribution should be conditioned on just the predicted class.  This will provide us with a standard saliency map and is equivalent to LRP.
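To make "conditioned on just the predicted class" concrete, here is a minimal, dependency-free sketch of the LRP epsilon rule on a single bias-free linear layer. It is a toy illustration only, not the zennit-crp implementation and not the composite used in this notebook; the function name `lrp_epsilon_linear` and all numbers are made up for the example.

```python
def lrp_epsilon_linear(activations, weights, relevance_out, eps=1e-6):
    """Redistribute output relevance to the inputs of a bias-free linear layer.

    weights[i][j] connects input i to output j.
    """
    n_in, n_out = len(activations), len(relevance_out)
    # forward pass: z_j = sum_i a_i * w_ij
    z = [sum(activations[i] * weights[i][j] for i in range(n_in))
         for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # the stabilizer keeps the denominator away from zero
        stabilizer = eps if z[j] >= 0 else -eps
        for i in range(n_in):
            contribution = activations[i] * weights[i][j]
            relevance_in[i] += contribution / (z[j] + stabilizer) * relevance_out[j]
    return relevance_in


activations = [1.0, 2.0, 0.5]
weights = [[0.5, -1.0], [0.25, 0.5], [-1.0, 2.0]]

# Conditioning on class 1: relevance starts only at that output, all others are zero
relevance_out = [0.0, 1.0]
relevance_in = lrp_epsilon_linear(activations, weights, relevance_out)
print(relevance_in)
```

Up to the small stabilizer, the input relevances sum to the relevance that was placed on the chosen class, which is the conservation property that makes the resulting heatmap interpretable as a decomposition of that class score.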


**Task 1.2:** Generate attribution maps for three randomly selected "concepts" from the last layer of the network.  You can use the `get_layer_names` function to find the name of the last layer of our model.  Each "concept" is defined as an individual feature map from that layer. In our model the last layer has 512 feature maps, so just choose 3 random feature maps to use in your conditions for generating attribution values.  Then visualize the attribution maps.


You can use the "Broadcast" functionality described in the tutorial to do this.

The generated attribution maps now show which pixels of the image are most important to each specific feature map.  (Note, however, that we do not yet know how important these feature maps are, since we selected them at random.)
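As a rough sketch of the random selection step, the snippet below draws three distinct channel ids and builds one condition dictionary per concept, following the dictionary format shown in the CRP README. The layer name `'last_conv_layer'` and the class id are placeholders, not values from our model; look up the real layer name with `get_layer_names` and use your own `cls`.

```python
import random

N_CHANNELS = 512                 # number of feature maps in the last layer (from the task text)
layer_name = 'last_conv_layer'   # placeholder, replace with the real layer name
predicted_class = 1              # placeholder, replace with your selected class

rng = random.Random(0)           # fixed seed so the selection is reproducible
concept_ids = rng.sample(range(N_CHANNELS), 3)

# one condition per concept: restrict relevance to a single channel and a single class
conditions = [{layer_name: [c], 'y': [predicted_class]} for c in concept_ids]
print(conditions)
```

Passing a list of several conditions together with a single input sample is, as far as the tutorial describes it, what triggers the broadcast behavior, so each condition yields its own heatmap.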

**Task 1.3:** Identify the top 5 concepts (i.e. feature maps) from the last layer of the network for your selected image's predicted class.  Then plot their corresponding feature maps.  
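The ranking step in this task can be sketched without any library: given one relevance value per feature map, normalize by the total absolute relevance and keep the channels with the largest values. The helper name `top_k_concepts` and the toy relevance values are hypothetical; in the notebook the per-channel relevances would come from the CRP attribution of the chosen layer.

```python
def top_k_concepts(channel_relevance, k=5):
    """Return (channel_id, normalized_relevance) pairs for the k most relevant channels."""
    total = sum(abs(r) for r in channel_relevance)
    if total == 0:
        normalized = [0.0 for _ in channel_relevance]
    else:
        normalized = [r / total for r in channel_relevance]
    # sort channel ids by normalized relevance, largest first
    ranked = sorted(range(len(normalized)), key=lambda c: normalized[c], reverse=True)
    return [(c, normalized[c]) for c in ranked[:k]]


# Toy relevance vector for a layer with five channels (made-up numbers)
toy_relevance = [0.1, -0.4, 0.3, 0.0, 0.2]
top = top_k_concepts(toy_relevance, k=2)
print(top)
```

Note that this sketch ranks by signed relevance, so a channel with strongly negative relevance (channel 1 in the toy data) is not selected even though its magnitude is the largest.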

**Preview the Dataset with Predictions**

The code below will display images from the XAI dataset.
- Try changing the value of `start` to get a new set of images (there are 10 images for each class; for example, the class happy will be at indices 10-19)
- Search through the images to find some that might be interesting to explain
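A small helper can compute the `start` value for a given class instead of counting by hand. This sketch assumes the 10-images-per-class layout described above and that the dataset labels follow the order of `class_names`; the function name `class_start_index` is made up for illustration.

```python
class_names = ['Neutral', 'Happy', 'Sad', 'Surprise', 'Fear', 'Disgust', 'Anger', 'Contempt']
IMAGES_PER_CLASS = 10  # assumption taken from the description above


def class_start_index(name):
    """Return the index of the first image of the given class."""
    return class_names.index(name) * IMAGES_PER_CLASS


print(class_start_index('Happy'))
print(class_start_index('Fear'))
```

For example, `start = class_start_index('Fear')` gives the same value as the default `start=40` used in the cell below.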

In [None]:
start=40
img_utils.display_nine_images(images, y_true, y_preds, start)

In [None]:
####### Select Your image #######
##################################

idx = 
cls = 

In [None]:
# get sample and visualize it
sample = images[idx]
sample = test_transform(sample)
sample = sample.unsqueeze(0)
sample = sample.to(torch.float).to(device)
imgify(sample[0])

In [None]:
###### Enter your Code Below ######
##################################




### Task 2 - Relevance Maximization

Relevance Maximization (RelMax) aims to identify the images that maximize the relevance score for a given concept. The idea is to find a subsample of the dataset that helps you visually understand which human-understandable concept the model has learned for that model concept.

The [Feature Visualization Tutorial Notebook](https://github.com/rachtibat/zennit-crp/blob/master/tutorials/feature_visualization.ipynb) will help you with this task.  
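The selection idea behind Relevance Maximization can be sketched in a few lines: for a given concept, rank all dataset samples by the relevance that concept received and keep the highest-scoring ones as reference images. This is a conceptual toy, not how `FeatureVisualization` is implemented; the function name `relevance_maximization` and the relevance table are made up.

```python
def relevance_maximization(relevance_per_sample, concept_id, n_refs=3):
    """Return the indices of the n_refs samples with the highest relevance for one concept.

    relevance_per_sample[s][c] is the relevance of concept c on sample s.
    """
    ranked = sorted(
        range(len(relevance_per_sample)),
        key=lambda s: relevance_per_sample[s][concept_id],
        reverse=True,
    )
    return ranked[:n_refs]


# Toy table: 4 samples x 2 concepts (made-up numbers)
toy_relevance = [
    [0.10, 0.70],
    [0.50, 0.20],
    [0.30, 0.10],
    [0.05, 0.90],
]

refs_concept_0 = relevance_maximization(toy_relevance, concept_id=0, n_refs=2)
refs_concept_1 = relevance_maximization(toy_relevance, concept_id=1, n_refs=2)
print(refs_concept_0, refs_concept_1)
```

The reference sets differ per concept, which is the point: looking at the images selected for one concept side by side is what lets you guess at its meaning.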

**Task 2.1:** Using the top concept ids identified in Task 1.3, use the `FeatureVisualization` class to find the images that maximize the relevance values of each concept. Then plot the images using the `plot_grid` function.

In [None]:
###### Enter your Code Below ######
##################################


### Task 3 - CRP Questions

Answer the questions below


**Q 1.1**  
Explain the purpose of the Composites and Canonizers from the Zennit package. Which composite did you select and why? Do the results look similar if you select another composite?

**Q 1.2**  
Try to semantically describe the identified concepts based on the images selected via RelMax.  Do you find clear "concepts" in the identified images for each of the top feature maps?

**Q 1.3**   
Is it as easy and straightforward to describe the concepts as the authors suggest in the original paper?

**Q 1.4:**  
How do these results compare with the saliency maps from SHAP and Integrated Gradients?

**Q 1.5:**  
How might you use CRP and RelMax to get a more detailed understanding of the different layers in the network?

**Q 1.6:**  
Can you think of an approach that would integrate facial action units into the identification of semantic concepts from RelMax?

Write your answer here...