Osnabrück University - Machine Learning (Summer Term 2016) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack

# Exercise Sheet 08

## Introduction

This week's sheet should be solved and handed in before the end of **Sunday, June 12, 2016**. If you need help (and Google and other resources were not enough), feel free to contact your group's designated tutor or whichever of us you run into first. Please upload your results to your group's Stud.IP folder.

## Assignment 1: Multilayer Perceptron (MLP) [10 Points]

Last week you implemented a simple perceptron. This week we provide a basic perceptron implementation, which you will adjust and use as the building block of a network.

In [None]:
import numpy as np
class SimplePerceptron:
    """
    A simple perceptron implementation.
    """

    def __init__(self, dimensions=100, epsilon=0.03):
        """
        Initializes the perceptron. Creates dimensions + 1
        random weights (the additional weight is the bias.)

        Args:
            dimensions  the data dimensionality N
            epsilon     the learning rate
        """
        self.w = np.random.rand(dimensions + 1)
        self.epsilon = epsilon
        
    def activation(self, X):
        """
        The activation function. Prepends a 1 to X for the
        bias and calculates the activation function of the 
        perceptron.

        Args:
            X           the data point, should be a numpy
                        array or a 1xN numpy matrix

        Returns:
            True  if the activation of X is bigger than 0
            False otherwise
        """
        return np.append(1, X) @ self.w > 0

    def train(self, X, t):
        """
        Trains the perceptron. Adjusts the weights according to 
        the learning rate and the error between the activation and t.

        Args:
            X           the data point, should be a numpy
                        array or a 1xN numpy matrix
            t           the label for this data point should be
                        True or False
        """
        self.w += self.epsilon * (t - self.activation(X)) * np.append(1, X)

# Generate some data.
N = 1000
dim = 3
D = np.random.rand(N, dim)
# Label data: sum should be > 0.8 * dim
D = np.hstack((D, np.matrix(np.sum(D, 1) > 0.8 * dim).T))

# Instantiate a Perceptron.
perceptron = SimplePerceptron(D.shape[1] - 1)

# Train the perceptron for several epochs.
epochs = 20
sample_size = 100
for epoch in range(epochs):
    for sample in range(sample_size):
        sample_data = D[np.random.choice(range(N), replace=False),:]
        for data in sample_data:
            x = data[0,0:-1]
            t = data[0,-1]
            perceptron.train(x, t)

# Test the perceptron on all data.
error = 0
for data in D:
    error += np.abs(data[0,-1] - perceptron.activation(data[0,0:-1])) / N
print("The perceptron classifies {:.2%} of the data correctly.".format(1 - error))

In [None]:
class Perceptron:

    def __init__(self, dimensions=100, epsilon=0.03):
        self.w = np.random.rand(dimensions + 1)
        self.epsilon = epsilon
        self.y = None

    def activation(self, X):
        self.y = ...
        return self.y

    def train(self, X, t, delta):
        pass
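
The skeleton above leaves the activation and the update rule open. As a rough sketch only, and not the reference solution, one possible completion uses a sigmoid activation and a delta-based update. The class name `SigmoidPerceptron`, the optional `t`/`delta` arguments, the stored input `self.x` and the returned `local_delta` are assumptions made for this illustration.

```python
import numpy as np


class SigmoidPerceptron:
    """Hypothetical completion of the Perceptron skeleton (not the reference solution)."""

    def __init__(self, dimensions=100, epsilon=0.03):
        self.w = np.random.rand(dimensions + 1)
        self.epsilon = epsilon
        self.x = None  # last input, with the bias entry prepended
        self.y = None  # last output

    def activation(self, X):
        # Prepend a 1 for the bias and squash the weighted sum with a sigmoid.
        self.x = np.append(1, X)
        self.y = 1.0 / (1.0 + np.exp(-(self.x @ self.w)))
        return self.y

    def train(self, X, t=None, delta=None):
        # Assumption: output neurons receive a target t, hidden neurons an
        # error signal delta that was propagated back from the next layer.
        y = self.activation(X)
        if delta is None:
            delta = t - y
        # Scale the error by the sigmoid derivative y * (1 - y).
        local_delta = delta * y * (1.0 - y)
        self.w += self.epsilon * local_delta * self.x
        return local_delta
```

In a network, a layer could pass each neuron's `local_delta`, weighted by the connecting weights, back to the previous layer as that layer's `delta`. This is only one of several reasonable designs.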

In [None]:
class MultiLayerPerceptron:
    def __init__(self, neurons_per_layer):
        pass  # to be implemented

## Assignment 2: MLP and RBFN [10 Points]

This exercise aims to deepen your understanding of Radial Basis Function Networks and how they relate to Multilayer Perceptrons. Not all of the answers can be found directly in the slides. When answering the (more algorithmic) questions, first take a minute to think about how you would solve them, and if nothing comes to mind, search the internet for a bit. If you are interested in a real-life application of both algorithms and how they compare, take a look at this paper: [Comparison between Multi-Layer Perceptron and Radial Basis Function Networks for Sediment Load Estimation in a Tropical Watershed](http://file.scirp.org/pdf/JWARP20121000014_80441700.pdf)

![Schematic of a RBFN](RBFN.png)

We have prepared a small example that shows how radial basis function approximation works in Python. It is not an example implementation of a RBFN, but it illustrates what the hidden neurons do.

In [None]:
%matplotlib notebook

import numpy as np
from numpy.random import uniform

from scipy.interpolate import Rbf

import matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm


def func(x,y):
    '''
    This is the example function that should be fitted.
    Its shape could be described as two peaks close to
    each other - one going up, the other going down
    '''
    return (x + y) * np.exp(-4.0 * (x**2 + y**2))
 
x = uniform(-1.0, 1.0, size=50)
y = uniform(-1.0, 1.0, size=50)

# sample 50 random datapoints from the underlying function
fvals = func(x, y)

# get the approximation via RBF
new_func = Rbf(x, y, fvals)

# sample 100x100 values from the approximated function
x_new, y_new = np.mgrid[-1:1:100j, -1:1:100j]
f_new = new_func(x_new, y_new)

plt.figure("Original Function")
# This plot represents the original function
x_orig, y_orig = np.mgrid[-1:1:100j, -1:1:100j]
data_orig = func(x_orig, y_orig)
plt.imshow(data_orig, extent=[-1,1,-1,1], cmap=plt.cm.jet)

plt.figure("RBF Result")
# This plots the approximation of the original function by the RBF
# if the plot looks strange try to run it again, the sampling
# in the beginning is random
plt.imshow(f_new, extent=[-1,1,-1,1], cmap=plt.cm.jet)
plt.xlim(-1,1)
plt.ylim(-1,1)
# scatter the datapoints that have been used by the RBF
plt.scatter(x, y)

### Radial Basis Function Networks

#### What are radial basis functions?

Radial basis functions are all functions that fulfill the following criterion:

The value of the function at a point depends only on the distance of that point to the origin or to some other fixed center point. Formally:
$\phi (\mathbf {x} )=\phi (\|\mathbf {x} \|)$  or  $\phi (\mathbf {x} ,\mathbf {c} )=\phi (\|\mathbf {x} -\mathbf {c} \|)$. Note that the Euclidean norm is the most common measure of distance, but other distance measures can be used as well.
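
As a small illustration of this criterion, the sketch below uses a Gaussian, one common choice of radial basis function. The function name `gaussian_rbf` and the width parameter `beta` are assumptions made for this example. Two points at the same distance from the center get the same value, regardless of direction.

```python
import numpy as np


def gaussian_rbf(x, c, beta=1.0):
    """Gaussian radial basis function exp(-beta * ||x - c||^2)."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(c, dtype=float))
    return np.exp(-beta * r ** 2)


c = np.array([0.5, -0.5])
# Two points at distance 0.3 from the center, in different directions.
a = gaussian_rbf(c + np.array([0.3, 0.0]), c)
b = gaussian_rbf(c + np.array([0.0, -0.3]), c)
print(np.isclose(a, b))
```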

#### What is the structure of a RBFN? You may also use the notation from the picture included above.

RBFNs are networks that contain only one hidden layer. The input is connected to all the hidden units. Each hidden unit has its own radial basis function that is *sensitive* to a particular region of the input domain. The output is then a linear combination of the outputs of those functions.
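
The structure described above can be sketched as a forward pass. This is a minimal illustration that assumes Gaussian hidden units sharing one width `beta`. The names `rbfn_forward`, `centers`, `weights` and `bias` are chosen for this example and do not come from the lecture.

```python
import numpy as np


def rbfn_forward(X, centers, beta, weights, bias=0.0):
    """Forward pass of a RBFN with Gaussian hidden units and a linear output."""
    X = np.atleast_2d(X)
    # Squared distance of every input to every center: (n_samples, n_hidden).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    hidden = np.exp(-beta * sq_dist)
    # The output is a linear combination of the hidden activations.
    return hidden @ weights + bias


centers = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([1.0, -1.0])
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
out = rbfn_forward(X, centers, beta=2.0, weights=weights)
```

The third input lies halfway between the two centers, so both hidden units respond equally and their contributions cancel.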

#### How is a RBFN trained?

Note: all input data has to be normalized.

Training a RBFN is a two-step process. First, the functions in the hidden layer are initialized. This can be done either by sampling centers from the input data or by first performing a k-means clustering, where k is the number of hidden nodes that have to be initialized.

The second step fits a linear model with coefficients $w_{i}$ to the hidden layer's outputs with respect to some objective function. The objective function depends on the task: for a least squares objective the weights can be computed directly, or they can be adapted iteratively by gradient descent.
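
The two steps can be sketched as follows, under some assumptions: the centers are sampled from the input data (rather than found by k-means), all hidden units are Gaussians with the same fixed width `beta`, and the output weights are fitted by least squares. The function names and parameter values are chosen for this illustration, and the target function is the one used in the example cell above.

```python
import numpy as np


def design_matrix(X, centers, beta):
    # Hidden layer outputs for all inputs: (n_samples, n_hidden).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-beta * sq_dist)


def fit_rbfn(X, t, n_hidden=20, beta=4.0, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialize the hidden units by sampling centers from the data.
    centers = X[rng.choice(len(X), size=n_hidden, replace=False)]
    # Step 2: fit the linear output weights by least squares.
    Phi = design_matrix(X, centers, beta)
    weights, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return centers, weights


def predict_rbfn(X, centers, weights, beta=4.0):
    return design_matrix(X, centers, beta) @ weights


rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
t = (X[:, 0] + X[:, 1]) * np.exp(-4.0 * (X ** 2).sum(axis=1))
centers, weights = fit_rbfn(X, t)
mse = np.mean((predict_rbfn(X, centers, weights) - t) ** 2)
print("training MSE:", mse)
```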

### Comparison to the Multilayer Perceptron

#### What do both models have in common? Where do they differ?

|RBFN                 |MLP                  | 
|---------------------|---------------------|
| non-linear layered feedforward network|non-linear layered feedforward network| 
| hidden neurons use radial basis functions, output neurons use a linear function| hidden and output neurons typically all use the same kind of activation function| 
| universal approximator |   universal approximator |
| learning usually affects only one or a few RBFs | learning affects many weights throughout the network|
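
The last row of the table can be made concrete with a small one-dimensional sketch. With a linear output, the gradient of the output with respect to an output weight is the activation of the corresponding hidden unit, so units with an activation near zero are barely changed by a training sample. The centers, the widths and the threshold of 0.01 are arbitrary values chosen for this illustration.

```python
import numpy as np

centers = np.linspace(-1.0, 1.0, 9)  # nine hidden units along one input axis
x = 0.5                              # a single training input

# Gaussian RBF units respond only close to their center.
rbf_act = np.exp(-50.0 * (x - centers) ** 2)
# Sigmoid units with thresholds at the same positions respond on a half-line.
sig_act = 1.0 / (1.0 + np.exp(-10.0 * (x - centers)))

# Count the units whose activation is noticeably different from zero.
active_rbf = int(np.sum(rbf_act > 0.01))
active_sig = int(np.sum(sig_act > 0.01))
print(active_rbf, active_sig)
```

Only the RBF units next to the input are noticeably active, while most of the sigmoid units are.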

#### How can classification in both networks be visualized?

![Classification](Solution_Classification.png)

#### When would you use a RBFN instead of a Multilayer Perceptron?

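RBFNs tend to be more robust to noise, because each hidden unit only responds to a local region of the input space. They should therefore be considered when the data contains noisy samples such as false positives.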