Osnabrück University - Machine Learning (Summer Term 2016) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack

# Exercise Sheet 08

## Introduction

This week's sheet should be solved and handed in before the end of **Sunday, June 12, 2016**. If you need help (and Google and other resources were not enough), feel free to contact your groups' designated tutor or whomever of us you run into first. Please upload your results to your group's Stud.IP folder.

## Assignment 1: Multilayer Perceptron (MLP) [10 Points]

Last week you implemented a simple perceptron. This week we already provide some basic perceptron which you will adjust to build network from it.

In [1]:
import numpy as np

# Generate some data.
N = 1000
input_dim = 3
D = np.random.rand(N, input_dim)
# Label data: sum should be > 0.8 * dim.
T = (np.sum(D, 1) > 0.8 * input_dim) * 1

In [2]:
import sys
import numpy as np

def train_perceptron(perceptron, D, T, epochs, sample_size, verbose=True):
    """
    Trains the perceptron over epochs epochs with sample_size 
    random samples drawn from D. No replacement is done in one epoch.
    
    Args:
        perceptron  The perceptron. Must implement a function
                    adaption(X, t) where X is a row of D and t
                    is its label.
        D           The data of size N x d where N is the
                    number of samples and d is the number
                    of dimensions.
        T           The training labels. Iterable with
                    N x do elements where N is the number
                    of samples in D and do is the dimension
                    of the perceptrons output.    
        epochs      The number of training epochs.
        sample_size The number of random samples per epoch.
        verbose     Prints status messages if True (default).
    """
    if verbose:
        print('Training {}\nEpochs: {}\nSamples per Epoch: {}'.format(perceptron, epochs, sample_size))
    for epoch in range(epochs):
        if verbose:
            sys.stdout.write("\rEpoch {:5d}, {:7.2%}".format(epoch + 1, (epoch + 1) / epochs))
            sys.stdout.flush()
        sample_indices = np.random.choice(range(N), sample_size, replace=False)
        for index in sample_indices:
            x = D[index]
            t = T[index]
            perceptron.adaption(x, t)
    if verbose:
        print('\nFinished.')

def test_perceptron(perceptron, D, T, verbose=True):
    """
    Tests the perceptron on all provided data.
    
    Args:
        perceptron  The perceptron. Must implement a function
                    activation(X) where X is a row of D.
        D           The data of size N x d where N is the
                    number of samples and d is the number
                    of dimensions.
        T           The training labels. Iterable with
                    N x do elements where N is the number
                    of samples in D and do is the dimension
                    of the perceptrons output.
        verbose     Prints status messages if True (default).
    Returns:
        The absolute error per output component.
    """
    if verbose:
        print('Testing {}'.format(perceptron))
    error = 0
    for i, t in enumerate(T):
        error += np.abs(t - perceptron.activation(D[i])) / len(D)
    if verbose:
        print('Total error:', error)
    return error

In [3]:
import numpy as np

class BasePerceptron:
    """
    A simple perceptron implementation.
    """

    def __init__(self, dimensions=100, epsilon=0.03):
        """
        Initializes the perceptron. Creates dimensions + 1
        random weights (the additional weight is the bias.)

        Args:
            dimensions  the data dimensionality N
            epsilon     the learning rate
        """
        self.w = np.zeros(dimensions + 1)#np.random.rand(dimensions + 1)
        self.epsilon = epsilon

    def activation(self, X):
        """
        The activation function. Prepends a 1 to X for the
        bias and calculates the activation function of the 
        perceptron.

        Args:
            X   the data point, should be a numpy
                array or a 1xN numpy matrix
        Returns:
            1   if the activation of X is bigger than 0
            0   else
        """
        return 1 if np.append(1, X) @ self.w > 0 else 0

    def adaption(self, X, t):
        """
        Trains the perceptron. Adjusts the weights according to 
        the learning rate and the error between the activation and the 
        label (delta).

        Args:
            X   the data point, should be a numpy
                array or a 1xN numpy matrix
            t   the label
        """
        self.w += self.epsilon * (t - self.activation(X)) * np.append(1, X)

    def __repr__(self):
        return '{}({}, {})'.format(self.__class__.__name__, len(self.w) - 1, self.epsilon)

In [4]:
# Instantiate a perceptron.
epsilon = 0.03
perceptron = BasePerceptron(input_dim, epsilon)

# Train and test the perceptron.
train_perceptron(perceptron, D, T, epochs=20, sample_size=100)
_ = test_perceptron(perceptron, D, T)

Training BasePerceptron(3, 0.03)
Epochs: 20
Samples per Epoch: 100
Epoch    20, 100.00%
Finished.
Testing BasePerceptron(3, 0.03)
Total error: 0.011


In [5]:
class InputPerceptron(BasePerceptron):
    """
    InputPerceptron inherits all properties of the 
    BasePerceptron but implements a new activation function:
    Instead of just using a threshold, it ignores its 
    weights and just passes on its input.
    """

    def activation(self, X):
        """
        The activation function for input perceptrons is
        the identity function.

        Args:
            X           the data point
        Returns:
            X
        """
        return X

In [6]:
import numpy as np
import scipy.special

class ContinuousPerceptron(BasePerceptron):
    """
    ContinuousPerceptron inherits all properties of the 
    BasePerceptron but implements a new activation function:
    Instead of just using a threshold, the continuous perceptron
    uses a sigmoid function.
    """

    def activation(self, X):
        """
        The activation function. Prepends a 1 to X for the
        bias and calculates the activation function of the 
        perceptron.

        Args:
            X           the data point, should be a numpy
                        array or a 1xN numpy matrix
        Returns:
            1 / (1 + exp( -y ))
            where y is the dot product of the weights and 
            the padded input.
        """
        return scipy.special.expit(self.w @ np.append(1, X))

    def adaption(self, X, t):
        """
        Trains the perceptron. Adjusts the weights according to 
        the learning rate and the error between the activation and the 
        label (delta).

        Args:
            X   the data point, should be a numpy
                array or a 1xN numpy matrix
            t   the delta
        """
        self.w += self.epsilon * t * np.append(1, X)

In [29]:
import numpy as np

class MultiLayerPerceptron:
    def __init__(self, dimensions=[2, 1, 1], epsilon=0.03):
        self.layers = []
        self.epsilon = epsilon
        
        # Create input layer.
        self.layers.append([InputPerceptron(1, 0) for i in range(dimensions[0])])
        
        # Generate hidden and output layers.
        dim = len(self.layers[0])
        for N in dimensions[1:]:
            layer = [ContinuousPerceptron(dim, epsilon) for i in range(N)]
            self.layers.append(layer)
            dim = N
        
        # Initialize outputs and deltas.
        self.outputs = []
        self.deltas = []

    def activation(self, X):
        """
        Feed forward activation.
        """
        # Clear potentially stored outputs.
        self.outputs = []
        
        # Activate input layer and store its outputs.
        layer_outputs = np.array([self.layers[0][i].activation(x) for i, x in enumerate(X)])
        self.outputs.append(layer_outputs)

        # Activate all other layers with the outputs from before.
        for layer in self.layers[1:]:
            layer_outputs = np.array([layer[j].activation(layer_outputs) for j in range(len(layer))])
            self.outputs.append(layer_outputs)

        # Return the last outputs for the output layer.
        return np.copy(layer_outputs)

    def adaption(self, X, t):
        """
        Backpropagation adaption.
        """
        from pprint import pprint
        print()
        print('X',X)
        print('t',t)
        # Clear potentially stored deltas.
        self.deltas = []
        
        # Activate perceptron to figure out and store each
        # neuron's output.
        outputs = self.activation(X)
        
        # Compute error:
        error = (t - outputs)
        print('error',error)
        
        print('Outputs')
        pprint(self.outputs)
        
        # Calculate deltas for output layer and store them.
        layer_deltas = outputs * (1 - outputs) * error
        self.deltas.insert(0, layer_deltas)
        
        # Calculate other deltas.
        for k in range(len(self.layers) - 2, 0, -1):
            layer_deltas = []
            for i, neuron in enumerate(self.layers[k]):
                sigma = self.outputs[k][i] * (1 - self.outputs[k][i])
                delta = self.deltas[0]
                w = np.array([j.w[i] for j in self.layers[k + 1]])
                n_delta = sigma * w @ delta
                print(sigma, delta, w, n_delta)
                layer_deltas.append(n_delta)
            print(layer_deltas)
            self.deltas.insert(0, np.array(layer_deltas))

        print('deltas')
        pprint(self.deltas)
        # Adapt weights for hidden and output neurons.
        for k, layer in enumerate(self.layers[1:]):
            for i, neuron in enumerate(layer):
                neuron.adaption(self.outputs[k], self.deltas[k][i])
    
    def __repr__(self):
        return 'MultiLayerPerceptron({}, {})'.format([len(l) for l in self.layers], self.epsilon)

    def __str__(self):
        return '\n\t'.join([repr(self)] + [str(l) for l in self.layers])
        

In [30]:
# Instantiate a multilayer perceptron.
epsilon = 0.03
output_dim = 1
layers = [input_dim, 3, 2, output_dim]
perceptron = MultiLayerPerceptron(layers, epsilon)

# Train and test the perceptron.
# train_perceptron(perceptron, D, T, epochs=20, sample_size=100)
train_perceptron(perceptron, D, T, epochs=1, sample_size=2, verbose=False)
# _ = test_perceptron(perceptron, D, T)


X [ 0.09951099  0.40638386  0.61842743]
t 0
error [-0.5]
Outputs
[array([ 0.09951099,  0.40638386,  0.61842743]),
 array([ 0.5,  0.5,  0.5]),
 array([ 0.5,  0.5]),
 array([ 0.5])]
0.25 [-0.125] [ 0.] -0.0
0.25 [-0.125] [ 0.] -0.0
[-0.0, -0.0]
0.25 [-0. -0.] [ 0.  0.] 0.0
0.25 [-0. -0.] [ 0.  0.] 0.0
0.25 [-0. -0.] [ 0.  0.] 0.0
[0.0, 0.0, 0.0]
deltas
[array([ 0.,  0.,  0.]), array([-0., -0.]), array([-0.125])]
1 0
[ 0.09951099  0.40638386  0.61842743]
0.0
1 1
[ 0.09951099  0.40638386  0.61842743]
0.0
1 2
[ 0.09951099  0.40638386  0.61842743]
0.0
2 0
[ 0.5  0.5  0.5]
-0.0
2 1
[ 0.5  0.5  0.5]
-0.0
3 0
[ 0.5  0.5]
-0.125

X [ 0.90325242  0.67528833  0.29493246]
t 0
error [-0.49859375]
Outputs
[array([ 0.90325242,  0.67528833,  0.29493246]),
 array([ 0.5,  0.5,  0.5]),
 array([ 0.5,  0.5]),
 array([ 0.49859375])]
0.25 [-0.12464745] [-0.00375] 0.000116856986666
0.25 [-0.12464745] [-0.001875] 5.84284933329e-05
[0.00011685698666582227, 5.8428493332911134e-05]
0.25 [  1.16856987e-04   5.8428

## Assignment 2: MLP and RBFN [10 Points]