Osnabrück University - Machine Learning (Summer Term 2024) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack, Lukas Niehaus

# Exercise Sheet 07

## Introduction

This week's sheet should be solved and handed in before the end of **Sunday, June 2nd, 2024**. If you need help (and Google and other resources were not enough), ask in the forum, contact your group's designated tutor, or whomever of us you run into first. Please upload your results to your group's Stud.IP folder.

## Assignment 0: Math recap (Hyperplanes) (0 points)

This exercise is supposed to be very easy and is voluntary. There will be a similar exercise on every sheet. It is intended to revise some basic mathematical notions that are assumed throughout this class and to allow you to check whether you are comfortable with them. Usually you should have no problem answering these questions offhand, but if you feel unsure, this is a good time to look them up again. You are always welcome to discuss questions with the tutors or in the practice session. Also, if you have a (math) topic you would like to recap, please let us know.

**a)** What is a *hyperplane*? What are the hyperplanes in $\mathbb{R}^2$ and $\mathbb{R}^3$? How are they usually described?

YOUR ANSWER HERE

**b)** What is the Hesse normal form? What is the intuition behind it? What are its advantages?

YOUR ANSWER HERE

**c)** Can you transform the standard form of a hyperplane into the Hesse normal form and vice versa?

YOUR ANSWER HERE

## Assignment 1: Local PCA (7 points)

In the lecture we learned that regular PCA is ill-suited for some special kinds of data. In this assignment we will take a look at local PCA, which is used for clustered data (ML-06, Slide 25). This is mostly a repetition of algorithms we have already used. Feel free to use the built-in functions for k-means clustering and PCA from the libraries (we have already included the right imports to set you on track). See:
* For K-Means
    * [scipy.cluster.vq.vq](https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.vq.vq.html)
    * [scipy.cluster.vq.kmeans](https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.vq.kmeans.html)
* For PCA    
    * [sklearn.decomposition.PCA](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
* To plot arrows
    * [matplotlib.pyplot.quiver](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.quiver.html)
* Numpy
    * [@ / np.matmul](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html#numpy.matmul)
    * [.T / np.transpose](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html#numpy.transpose)
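
The calling conventions of the functions linked above can be tried out on a tiny example first. The following is only a minimal sketch on hypothetical toy data (the array `toy` and the initial centroid guess are made up for illustration and are not part of the assignment). It assumes that passing an array of initial centroids to `kmeans` is acceptable here, which also makes the run deterministic.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq
from sklearn.decomposition import PCA

# Hypothetical toy data: two well-separated groups of 2-D points.
toy = np.array([[0.0, 0.0], [1.0, 0.1], [2.0, 0.2],
                [10.0, 5.0], [10.1, 6.0], [10.2, 7.0]])

# kmeans returns (codebook, distortion). Passing an array of initial
# centroids instead of a number k fixes the starting point.
initial_guess = np.array([[0.0, 0.0], [10.0, 5.0]])
centroids, distortion = kmeans(toy, initial_guess)

# vq assigns every observation to its nearest centroid.
labels, distances = vq(toy, centroids)

# Fit one PCA per cluster; each row of components_ is a unit-length axis.
local_axes = []
for k in range(len(centroids)):
    pca = PCA(n_components=2).fit(toy[labels == k])
    local_axes.append(pca.components_)

print(labels)
print(local_axes[0])
```

Note that the sign of a principal axis is arbitrary, so an arrow drawn with `quiver` may point in either direction along the same line.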

In [None]:
%matplotlib notebook

import numpy as np
import numpy.random as rnd
import matplotlib.pyplot as plt

from numpy.random import multivariate_normal as multNorm

from scipy.cluster.vq import kmeans, vq
from sklearn.decomposition import PCA

np.random.seed(42)

# Generate clustered data - you may plot the data to take a look at it
data = np.vstack((multNorm([2, 2], [[0.1, 0], [0, 1]], 100),
                  multNorm([-2, -4], [[1, 0], [0, 0.3]], 100)))

# TODO: Apply k-means to the data.
# YOUR CODE HERE

# TODO: Apply PCA for each cluster and store each two largest components.
# YOUR CODE HERE

# TODO: Plot the results of k-means and local PCA
# YOUR CODE HERE

## Assignment 2: Projection Pursuit (3 points)

**a)** Explain in your own words the idea of projection pursuit. Is it a linear or non-linear method for dimension reduction? Discuss why high variance, non-Gaussianity and clusters indicate an interesting feature. What is the relation to PCA?

YOUR ANSWER HERE

**b)** Explain how the different indices (Friedman-Tukey, Hermite, Natural Hermite, and Entropy) detect interesting features and discuss advantages and disadvantages.

YOUR ANSWER HERE

**c)** Explain the idea of projection pursuit for clustering (ML-06, slides 56-58). How is the index computed and why does maximizing that index yield good clusters?

YOUR ANSWER HERE

## Assignment 3: Hebbian Learning (6 points)

In the lecture (ML-07, Slides 10ff.) there is a simplified version of Ivan Pavlov's famous experiment on classical conditioning. In this exercise you will take a closer look at this simplified model and create your own conditionable dog with a simple Hebbian learning rule.

### a) Programming a Dog
To model the dog's salivation behavior, we need an unconditioned and a conditioned stimulus: food and bell. Their weights are stored in the lists `weight_food` and `weight_bell`. A single number per weight would suffice; the lists are only here to keep track of the history for a nice output. The current weight is the last item of the respective list, e.g. `weight_food[-1]`.

A list of trials and a condition database are already given. Each entry of `trials` is an index into `condition_db`. For example, to look up the value of the stimulus `food` in the second trial (which maps to condition `1`), one could use `condition_db[1]["food"]`.

Your task is to implement a `for` loop over all trials. In each iteration select the correct values for $x_1$ and $x_2$ from the condition database and retrieve the current weights $w_1$ and $w_2$. Then calculate the response of the dog with the threshold $\theta$:

$$
\begin{aligned}
r_t &= \Theta(x_{1,t} w_{1,t-1} + x_{2,t} w_{2,t-1})\\
\Theta(x) &= \begin{cases}1 & \text{if } x \geq \theta\\0 & \text{else}\end{cases}
\end{aligned}
$$

With this response, calculate both new weights $w_{n,t}$ ($n = 1, 2$) according to the Hebbian rule:

$$w_{n,t} = w_{n, t-1} + \epsilon \cdot r_t \cdot x_{n,t}$$
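
To see how the two formulas interact, a single trial can be worked through in isolation. This is only a sketch under assumptions: the helper name `hebbian_step` is hypothetical and not part of the notebook, and the loop over all trials as well as the bookkeeping in the lists is left out on purpose.

```python
def hebbian_step(x, w, epsilon, theta):
    """One trial: return (response, updated weights) for inputs x and weights w."""
    # Weighted sum of the stimuli, using the weights from the previous step.
    activation = sum(x_n * w_n for x_n, w_n in zip(x, w))
    # Threshold function: the dog salivates if the activation reaches theta.
    response = 1 if activation >= theta else 0
    # Hebbian rule: a weight only grows if its stimulus and the response co-occur.
    new_w = [w_n + epsilon * response * x_n for x_n, w_n in zip(x, w)]
    return response, new_w


# Food and bell presented together, starting from w_food = 1 and w_bell = 0.
response, weights = hebbian_step(x=[1, 1], w=[1.0, 0.0], epsilon=0.2, theta=0.5)
print(response, weights)
```

If the response is 0, or a stimulus is absent ($x_{n,t} = 0$), the corresponding weight stays unchanged.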

*Note: While you are still programming, the output might look a little messy; don't worry about it. Once you fill all three lists properly, it will look much like what is shown on ML-07, Slide 14.*

*Hint: The [list.append()](https://docs.python.org/3/tutorial/datastructures.html) function is probably rather useful.*

In [None]:
# Initialization
condition_db = [{"food": 1, "bell": 0}, 
                {"food": 0, "bell": 1},
                {"food": 1, "bell": 1}]

trials = [0, 1, 2, 1, 2, 1, 2, 1]

epsilon = 0.2
theta = 1/2

responses = []
weight_food = [1]
weight_bell = [0]

# TODO: For each trial, update the current weights of the US and CS and store
# the results in the respective lists. Also store the response.
# YOUR CODE HERE

# Output
print("| Food   |   |" + "|   |".join(["{:3d}".format(condition_db[trial]["food"]) for trial in trials]) + "|   |")
print("| Bell   |   |" + "|   |".join(["{:3d}".format(condition_db[trial]["bell"]) for trial in trials]) + "|   |")
print("| Saliva |   |" + "|   |".join(["{:3d}".format(response) for response in responses]) + "|   |")
print("| w_Food |" + "|   |".join(["{:3.1f}".format(w) for w in weight_food]) + "|")
print("| w_Bell |" + "|   |".join(["{:3.1f}".format(w) for w in weight_bell]) + "|")

### b) Parameter adjustment

In the above default setting of trials (`[0, 1, 2, 1, 2, 1, 2, 1]`, in case you changed it), how many learning steps, i.e. simultaneous presentations of the unconditioned and the conditioned stimulus, did you need until the dog started to produce saliva on the conditioned stimulus alone? What happens if you change the parameters $\epsilon$ and $\theta$? Try smaller and bigger values for each, or present different conditions to the dog.

YOUR ANSWER HERE

## Assignment 4: The Logic Perceptron (4 points)

### a) The Logic Perceptron

For the following two logical functions, sketch the weights of a perceptron after it has been trained. To do so, figure out when the perceptron should fire. Then come up with ideas of how you can achieve this. Remember that $w_0$, the bias, is used as a threshold and that there is a constant $x_0 = 1$. Provide the values for $w_0, w_1, w_2$ as well as some explanation.
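
A candidate set of weights can be checked against a truth table with a few lines of code. The sketch below uses a different function, $A \wedge B$, as an example. The helper `perceptron_fires` and the weights are hypothetical, and it assumes that the perceptron fires when the weighted sum including the bias is strictly greater than 0; adapt this if the lecture uses another convention.

```python
def perceptron_fires(weights, a, b):
    """Return 1 if the perceptron fires for binary inputs a and b, else 0."""
    w0, w1, w2 = weights
    x0 = 1  # constant bias input
    # Assumed firing convention: weighted sum strictly greater than 0.
    return 1 if w0 * x0 + w1 * a + w2 * b > 0 else 0


# Hypothetical weights for A AND B: both inputs are needed to exceed the bias.
and_weights = (-1.5, 1.0, 1.0)
truth_table = {(a, b): perceptron_fires(and_weights, a, b)
               for a in (0, 1) for b in (0, 1)}
print(truth_table)
```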

#### 1) $(A \wedge B) \vee (\neg A \wedge B)$

YOUR ANSWER HERE

#### 2) $(A \wedge B) \vee (\neg A \wedge B) \vee (A \wedge \neg B)$

YOUR ANSWER HERE