Name | Matr.Nr. | Due Date
:--- | ---: | ---:
Firstname Lastname | 01234567 | 16.03.2025, 22:00

<h1 style="color:rgb(0,120,170)">Hands-on AI II</h1>
<h2 style="color:rgb(0,120,170)">Unit 1 &ndash; Recap Hands-on AI I (Assignment)</h2>

<b>Authors:</b> B. Schäfl, S. Lehner, J. Brandstetter, A. Schörgenhumer, S. Luukkonen, R. Dangl<br>
<b>Date:</b> 04.03.2025

This file is part of the "Hands-on AI II" lecture material. The following copyright statement applies to all code within this file.

<b>Copyright statement:</b><br>
This material, no matter whether in printed or electronic form, may be used for personal and non-commercial educational use only. Any reproduction of this material, no matter whether as a whole or in parts, no matter whether in printed or in electronic form, requires explicit prior acceptance of the authors.

<h3 style="color:rgb(0,120,170)">How to use this notebook</h3>

This notebook is designed to run from start to finish. There are different tasks (displayed in <span style="color:rgb(248,138,36)">orange boxes</span>) which require your contribution (in the form of code, plain text, ...). Most or all of the supplied functions are imported from the file <code>u1_utils.py</code>, which can be treated as a black box. For a deeper understanding, however, you can look at the implementations of the helper functions. To run this notebook, the packages imported at the beginning of <code>u1_utils.py</code> need to be installed.

<div class="alert alert-warning">
    <b>Important:</b> Set the random seed with <code>u1.set_seed(123)</code> to enable reproducible results in all tasks that incorporate randomness (e.g., t-SNE, splitting data into train and test sets, initializing weights of a neural network, running the model optimization with random batches, etc.). You must use <code>123</code> as seed.
</div>
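
For intuition, a seeding helper typically resets every random number generator in use. The sketch below is only an assumption about what such a helper might look like; `seed_everything` is a hypothetical name, and the actual `u1.set_seed` implementation may differ and remains the one to use in this notebook.

```python
import random

import numpy as np


def seed_everything(seed: int) -> None:
    """Hypothetical helper: reset Python's and NumPy's global RNGs."""
    random.seed(seed)
    np.random.seed(seed)


# Seeding twice with the same value reproduces the same random draws.
seed_everything(123)
first = (random.random(), np.random.rand(3))
seed_everything(123)
second = (random.random(), np.random.rand(3))

print(first[0] == second[0], np.array_equal(first[1], second[1]))
```

A PyTorch workflow would additionally call `torch.manual_seed(seed)` so that weight initialization and batch shuffling are reproducible as well.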

In [None]:
# Import pre-defined utilities specific to this notebook.
import u1_utils as u1

# Import additional utilities needed in this notebook.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd
import torch
from scipy import signal

from sklearn.cluster import KMeans, AgglomerativeClustering
from scipy.cluster.hierarchy import linkage

# Set default plotting style.
sns.set_theme()

# Setup Jupyter notebook (warning: this may affect all Jupyter notebooks running on the same Jupyter server).
u1.setup_jupyter()

# Check minimum versions.
u1.check_module_versions()

<h2>1. Tabular data</h2>

<p>In this exercise you'll be working with another famous data set, the <i>breast cancer</i> data set. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image [<a href="https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29">1</a>]. Publication:</p>

<center><cite>W.N. Street, W.H. Wolberg and O.L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis. IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.</cite></center>

<div class="alert alert-warning">
    <b>Exercise 1.1. [3 Points]</b>
    <ul>
        <li>Load the <i>breast cancer</i> data set using the appropriate function as supplied by us.</li>
        <li>Split the data set into the feature vector matrix and the label vector.</li>
        <li>Visualize the data set in tabular form (the whole data set with target column).</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.2. [3 Points]</b>
    <ul>
        <li>How many samples does the data set contain?</li>
        <li>How many features does the data set consist of (not counting the class label column <i>class</i>)?</li>
        <li>How many different classes are there?</li>
    </ul>
</div>

Your answer

<div class="alert alert-warning">
    <b>Exercise 1.3. [1 Point]</b>
    <ul>
        <li>Print the location measures (min, max, median, mean, etc.) for the features.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.4. [3 Points]</b>
    <ul>
        <li>Compute a pairplot of the data set with respect to all features that contain the <i>mean</i> value.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.5. [3 Points]</b>
    <ul>
        <li>Create boxplots for all features that contain the <i>mean</i> value, grouped by class.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.6. [3 Points]</b>
    <ul>
        <li>Create histograms for all features that contain the <i>mean</i> value, grouped by class.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.7. [8 Points]</b>
    <p>
    Compare and interpret the results:
    </p>
    <ul>
        <li>In the pairplot, which feature might indicate linear separability of the classes?</li>
        <li>What can be said about outliers in the boxplots?</li>
        <li>What can be said about the median values in the boxplots with respect to the two groups?</li>
        <li>Are the distributions of the values symmetric when looking at the boxplots and histograms? How can you determine this from the boxplots?</li>
    </ul>
</div>

Your answer

<div class="alert alert-warning">
    <b>Exercise 1.8. [1 Point]</b>
    <ul>
        <li>Use standard scaling on the feature vector matrix with <code>u1.standardize_data()</code>.</li>
    </ul>
</div>
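
As a reminder of what standard scaling does, each feature is shifted to zero mean and rescaled to unit standard deviation, i.e., $z = (x - \mu) / \sigma$ per column. The following is a minimal NumPy sketch on made-up toy data (`X_toy` is an assumption, not the breast cancer data); whether `u1.standardize_data()` computes exactly this is not confirmed here.

```python
import numpy as np

rng = np.random.default_rng(123)
# Toy data: 200 samples, 3 features on very different scales.
X_toy = rng.normal(loc=[10.0, -3.0, 0.5], scale=[5.0, 0.1, 2.0], size=(200, 3))

# Column-wise standardization: subtract the mean, divide by the standard deviation.
mean = X_toy.mean(axis=0)
std = X_toy.std(axis=0)
X_scaled = (X_toy - mean) / std

print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```

After scaling, all features contribute on a comparable scale, which matters for distance-based methods such as PCA, t-SNE and k-means.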

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.9. [3 Points]</b>
    <ul>
        <li>Reduce the dimensionality of the scaled data set using <i>PCA</i> with 2 components and visualize the downprojection.</li>
    </ul>
</div>
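
To recall the mechanics behind PCA, the sketch below projects centered toy data onto the two directions of largest variance using a singular value decomposition. It uses plain NumPy on random data (`X_toy` is an assumption) and is only an illustration of the technique, not the helper provided for this notebook.

```python
import numpy as np

rng = np.random.default_rng(123)
X_toy = rng.normal(size=(100, 5))

# PCA operates on centered data.
X_centered = X_toy - X_toy.mean(axis=0)

# Rows of Vt are the principal directions, sorted by decreasing variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:2]

# Project onto the first two principal components.
X_2d = X_centered @ components.T

# Fraction of the total variance captured by each retained component.
explained = (S[:2] ** 2) / np.sum(S ** 2)
print(X_2d.shape, explained)
```

The resulting two columns of `X_2d` are what a scatter plot of the downprojection would show.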

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.10. [3 Points]</b>
    <ul>
        <li>Reduce the dimensionality of the scaled data set using <i>t-SNE</i> with 2 components and visualize the downprojection.</li>
        <li>Choose a fitting perplexity.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.11. [6 Points]</b>
    <ul>
        <li>Create a k-means model using two clusters. Use <code>random_state=123</code>.</li>
        <li>Fit the model to the data set and predict the cluster labels.</li>
        <li>Visualize the data set with respect to the cluster labels on the PCA-downprojected data.</li>
    </ul>
</div>
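
The general fit-and-predict pattern of scikit-learn's `KMeans` can be sketched on toy data as follows. The two synthetic blobs and the explicit `n_init=10` are assumptions made for illustration; they are not part of the assignment.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(123)
# Two well-separated toy blobs with 50 points each.
blob_a = rng.normal(loc=0.0, scale=1.0, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=1.0, size=(50, 2))
X_toy = np.vstack([blob_a, blob_b])

# Fit k-means with two clusters and obtain one cluster label per sample.
model = KMeans(n_clusters=2, random_state=123, n_init=10)
labels = model.fit_predict(X_toy)

print(np.bincount(labels))
```

Note that cluster indices are arbitrary: cluster 0 in one run need not correspond to class 0 of the true labels.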

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.12. [4 Points]</b>
    <ul>
        <li>For hierarchical clustering: compute the linkage matrix (on scaled data) and plot the dendrogram.</li>
    </ul>
</div>
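
A linkage matrix records, for each merge step, which two clusters were joined, at what distance, and how many samples the new cluster contains. The sketch below builds one on toy data with SciPy; the choice of Ward linkage and the toy data are assumptions, since the task does not prescribe a linkage method.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(123)
X_toy = rng.normal(size=(20, 3))

# Each row of Z: [cluster index 1, cluster index 2, merge distance, new cluster size].
Z = linkage(X_toy, method="ward")

# no_plot=True only computes the dendrogram layout; omit it to draw with matplotlib.
info = dendrogram(Z, no_plot=True)

print(Z.shape)
print(info["leaves"])
```

For `n` samples the linkage matrix always has `n - 1` rows, because each merge reduces the number of clusters by one.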

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 1.13. [8 Points]</b>
    <ul>
        <li>Create a hierarchical clustering model with two clusters.</li>
        <li>Fit the model to the data set and predict the cluster labels.</li>
        <li>Visualize the data set with respect to the cluster labels on the PCA-downprojected data.</li>
        <li>Interpret: compare k-means and hierarchical clustering.</li>
    </ul>
</div>
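
One possible way to support a comparison of two clusterings with a number is the adjusted Rand index, which is 1 for identical partitions regardless of how the cluster indices are named. The sketch below applies it to toy data; the data, the Ward linkage and the use of this particular score are assumptions for illustration, not requirements of the task.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(123)
X_toy = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(50, 2)),
    rng.normal(loc=10.0, scale=1.0, size=(50, 2)),
])

kmeans_labels = KMeans(n_clusters=2, random_state=123, n_init=10).fit_predict(X_toy)
hier_labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(X_toy)

# Agreement between the two partitions, invariant to label permutation.
ari = adjusted_rand_score(kmeans_labels, hier_labels)
print(f"adjusted Rand index: {ari:.3f}")
```

On real data the two methods usually disagree on some samples, and plotting both labelings on the same downprojection shows where.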

In [None]:
# Your code

Your answer

<h2>2. Image data</h2>

<p>In this exercise you'll be working with a data set composed of various images. The data set distinguishes <i>ten</i> different classes, one for each object (bird, cat, deer, etc.). For curious minds, more information regarding this data set and the images it contains can be found at the following link: <a href="https://www.cs.toronto.edu/~kriz/cifar.html">CIFAR-10</a>.</p>

<center><cite>Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009.</cite></center>


<div class="alert alert-warning">
    <b>Exercise 2.1. [2 Points]</b>
    <ul>
        <li>Load the <i>cifar10</i> data set using the appropriate function as supplied by us.</li>
        <li>Visualize the CIFAR-10 data set in tabular form.</li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.2. [10 Points]</b>
    <ul>
        <li>Define the following two $3 \times 3$ filters (shown in the formulae below) and apply them to 12 random images $A$ from the above data set (with $*$ denoting the convolution and $\sigma{}$ the sigmoid function) to produce the following 4 outputs $G_x, G'_x, G_y, G'_y$:</li>
    </ul>
    <p>
        \begin{equation}G_x = \left(
            \begin{array}{rrr}
                -5 & 0 & 5 \\
                -5 & 0 & 5 \\
                -5 & 0 & 5
            \end{array}\right) * A
            \qquad
            G'_x = \sigma (G_x)
        \end{equation}
    </p>
    <p>
        \begin{equation}G_y = \left(
            \begin{array}{rrr}
                -5 & -5 & -5 \\
                 0 &  0 &  0 \\
                 5 &  5 &  5
            \end{array}\right) * A
            \qquad
            G'_y = \sigma (G_y)
        \end{equation}
    </p>
    <ul>
        <li>Hint: Make sure to exclude the class label column <i>item_type</i> before processing your data.</li>
    </ul>
</div>
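
The following sketch illustrates the two filters and the sigmoid on a single toy grayscale image with one vertical edge. It is written under several assumptions: `scipy.signal.convolve2d` is used with `mode="same"` and symmetric boundary handling, and the image is a made-up $8 \times 8$ array rather than a CIFAR-10 sample (which has three color channels and would need per-channel processing or a grayscale conversion).

```python
import numpy as np
from scipy import signal

# The two 3x3 filters from the formulae above; the second is the transpose of the first.
kernel_x = np.array([[-5, 0, 5],
                     [-5, 0, 5],
                     [-5, 0, 5]], dtype=float)
kernel_y = kernel_x.T


def sigmoid(z):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Toy image: dark left half, bright right half (a single vertical edge).
image = np.zeros((8, 8))
image[:, 4:] = 1.0

g_x = signal.convolve2d(image, kernel_x, mode="same", boundary="symm")
g_y = signal.convolve2d(image, kernel_y, mode="same", boundary="symm")
g_x_sig = sigmoid(g_x)
g_y_sig = sigmoid(g_y)

print(np.abs(g_x[0]))
print(g_y_sig[0])
```

The first filter responds only around the vertical edge, while the second one finds no horizontal edge and stays at zero, which the sigmoid maps to 0.5. Keep in mind that a true convolution flips the kernel, whereas deep learning layers such as `torch.nn.Conv2d` compute a cross-correlation; for these antisymmetric filters the two differ only in sign.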

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.3. [5 Points]</b>
    <ul>
        <li>Using the data of the 12 samples from above, create a plot with 5 rows (or 5 columns, choose what you like), where
        <ul>
            <li>(1) shows the original samples</li>
            <li>(2) shows the samples after the convolution using the first filter, i.e., $G_x$</li>
            <li>(3) shows the samples after the convolution using the first filter and after the application of sigmoid, i.e., $G'_x$</li>
            <li>(4) shows the samples after the convolution using the second filter, i.e., $G_y$</li>
            <li>(5) shows the samples after the convolution using the second filter and after the application of sigmoid, i.e., $G'_y$</li>
        </ul>
        </li>
    </ul>
</div>

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.4. [8 Points]</b>
    <ul>
        <li>Implement a class <code>FNN</code> that derives from <code>torch.nn.Module</code> with the following architecture:</li>
    </ul>
    <table style="text-align:center;vertical-align:middle">
        <tr>
            <th>Position</th>
            <th>Element</th>
            <th>Comment</th>
        </tr>
        <tr>
            <td>0</td>
            <td>input</td>
            <td>input size = $3\times{}32\times{}32$ (flattened)</td>
        </tr>
        <tr>
            <td>1</td>
            <td>fully connected</td>
            <td>$512$ output features</td>
        </tr>
        <tr>
            <td>2</td>
            <td>ReLU</td>
            <td>-</td>
        </tr>
        <tr>
            <td>3</td>
            <td>fully connected</td>
            <td>$512$ output features</td>
        </tr>
        <tr>
            <td>4</td>
            <td>ReLU</td>
            <td>-</td>
        </tr>
        <tr>
            <td>5</td>
            <td>fully connected</td>
            <td>$10$ output features</td>
        </tr>
    </table>
</div>
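
As a sanity check for an architecture like the one in the table, the number of trainable parameters can be worked out by hand: a fully connected layer has one weight per input-output pair plus one bias per output. The sketch below does this arithmetic in plain Python, assuming every fully connected layer uses a bias term (the default of `torch.nn.Linear`); `linear_params` is a hypothetical helper name.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Flattened input size followed by the output sizes of the three layers.
layer_sizes = [3 * 32 * 32, 512, 512, 10]

per_layer = [
    linear_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)

print(layer_sizes[0])  # 3072 input features after flattening
print(per_layer)       # [1573376, 262656, 5130]
print(total)           # 1841162
```

The same total should result from summing `p.numel()` over the parameters of a matching PyTorch module, which is a quick way to verify an implementation.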

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.5. [8 Points]</b>
    <ul>
        <li>Implement a class <code>CNN</code> that derives from <code>torch.nn.Module</code> with the following architecture:</li>
    </ul>
    <table style="text-align:center;vertical-align:middle">
        <tr>
            <th>Position</th>
            <th>Element</th>
            <th>Comment</th>
        </tr>
        <tr>
            <td>0</td>
            <td>input</td>
            <td>input size = $3\times{}32\times{}32$</td>
        </tr>
        <tr>
            <td>1</td>
            <td>2D convolution</td>
            <td>$32$ output channels and a kernel size of $3\times{}3$</td>
        </tr>
        <tr>
            <td>2</td>
            <td>ReLU</td>
            <td>-</td>
        </tr>
        <tr>
            <td>3</td>
            <td>max pooling</td>
            <td>kernel size of $2\times{}2$</td>
        </tr>
        <tr>
            <td>4</td>
            <td>fully connected</td>
            <td>$512$ output features</td>
        </tr>
        <tr>
            <td>5</td>
            <td>ReLU</td>
            <td>-</td>
        </tr>
        <tr>
            <td>6</td>
            <td>fully connected</td>
            <td>$10$ output features</td>
        </tr>
    </table>
</div>
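
The first fully connected layer needs to know how many features arrive from the pooling layer. The sketch below works out that number under the assumption of a convolution with stride 1 and no padding and a pooling stride equal to the pooling kernel size (the defaults of `torch.nn.Conv2d` and `torch.nn.MaxPool2d`); `conv_out` is a hypothetical helper name.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


height = width = 32

# 3x3 convolution, stride 1, no padding: 32 -> 30.
height, width = conv_out(height, 3), conv_out(width, 3)

# 2x2 max pooling with stride 2: 30 -> 15.
height, width = conv_out(height, 2, stride=2), conv_out(width, 2, stride=2)

# 32 output channels, each of size 15 x 15, flattened.
flat_features = 32 * height * width

print(height, width, flat_features)  # 15 15 7200
```

With a padding of 1 in the convolution the spatial size would stay at 32, the pooling would halve it to 16, and the flattened size would be $32 \cdot 16 \cdot 16 = 8192$ instead.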

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.6. [3 Points]</b>
    <ul>
        <li>Split the CIFAR-10 data set into a <i>training</i> set ($75\%$) and a <i>test</i> set ($25\%$).</li>
        <li>Print the size of the full data set, the training set and the test set.</li>
    </ul>
</div>
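
The idea behind a random 75/25 split can be sketched with a shuffled index array. The toy arrays and the local NumPy generator below are assumptions for illustration; in the notebook itself, randomness should be controlled via `u1.set_seed(123)`, and a supplied or library splitting function may be the more convenient choice.

```python
import numpy as np

rng = np.random.default_rng(123)
n_samples = 1000
X_toy = rng.normal(size=(n_samples, 4))
y_toy = rng.integers(0, 10, size=n_samples)

# Shuffle the sample indices once, then cut at 75 %.
indices = rng.permutation(n_samples)
n_train = int(0.75 * n_samples)
train_idx, test_idx = indices[:n_train], indices[n_train:]

X_train, y_train = X_toy[train_idx], y_toy[train_idx]
X_test, y_test = X_toy[test_idx], y_toy[test_idx]

print(len(X_toy), len(X_train), len(X_test))
```

Indexing features and labels with the same index arrays keeps every sample paired with its label, and the two index sets never overlap.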

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.7. [5 Points]</b>
    <ul>
        <li>Create a corresponding <tt>TensorDataset</tt> for the training as well as the test set.</li>
        <li>Wrap the previously defined <tt>TensorDataset</tt> instances in separate <tt>DataLoader</tt> instances with a batch size of $32$ (shuffle the training data set).</li>
    </ul>
</div>
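
The interplay of `TensorDataset` and `DataLoader` can be sketched on random tensors as follows. The tensor shapes mimic $3 \times 32 \times 32$ images, but the data itself is made up and only serves to show the pattern.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(123)
# Toy stand-in: 100 random "images" with integer labels in [0, 10).
features = torch.randn(100, 3, 32, 32)
labels = torch.randint(0, 10, (100,))

# A TensorDataset pairs features and labels sample by sample.
dataset = TensorDataset(features, labels)

# The DataLoader yields mini-batches; shuffle=True reorders samples every epoch.
loader = DataLoader(dataset, batch_size=32, shuffle=True)

batch_x, batch_y = next(iter(loader))
print(len(loader), tuple(batch_x.shape), tuple(batch_y.shape))
```

With 100 samples and a batch size of 32, the loader yields four batches per epoch, the last one containing only the remaining 4 samples.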

In [None]:
# Your code

<div class="alert alert-warning">
    <b>Exercise 2.8. [10 Points]</b>
    <ul>
        <li>Train one instance each of your <code>FNN</code> and <code>CNN</code> models from above for $5$ epochs. Print the training loss and accuracy per epoch and, afterwards, the final test set loss and accuracy.</li>
    </ul>
</div>
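
The structure of a per-epoch training loop with loss and accuracy tracking can be sketched as shown below. Everything in it is an assumption made for illustration: the random toy data, the single-layer placeholder model, the SGD optimizer with learning rate 0.01 and the two epochs. Supplied helper functions, if available for training, may be used instead.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(123)
# Toy stand-in data: 64 random "images" with 10 possible labels.
data = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

# Placeholder model, not the FNN or CNN from the exercises.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

history = []
for epoch in range(2):
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for inputs, targets in loader:
        optimizer.zero_grad()
        logits = model(inputs)
        loss = criterion(logits, targets)
        loss.backward()
        optimizer.step()
        # Accumulate sample-weighted loss and the number of correct predictions.
        total_loss += loss.item() * len(targets)
        correct += (logits.argmax(dim=1) == targets).sum().item()
        seen += len(targets)
    history.append((total_loss / seen, correct / seen))
    print(f"epoch {epoch + 1}: loss={history[-1][0]:.4f}, accuracy={history[-1][1]:.4f}")
```

Evaluating on a test set follows the same pattern without the optimizer steps, typically with `model.eval()` and inside a `torch.no_grad()` block.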

In [None]:
# Your code