# COMP527 Data Mining & Visualization: Text Classification Using Binary Perceptron Algorithm

*Student 201442927. University of Liverpool.*

## Questions/Tasks

(1) Explain the Perceptron algorithm for the binary classification case, providing its pseudo code. (20 marks)

(2) Prove that for a linearly separable dataset, perceptron algorithm will converge. (10 marks)

(3) Implement a binary perceptron. (20 marks)

(4) Use the binary perceptron to train classifiers to discriminate between (a) class 1 and class 2,
(b) class 2 and class 3 and (c) class 1 and class 3. Report the train and test classification
accuracies for each of the three classifiers after 20 iterations. Which pair of classes is most
difficult to separate? (20 marks)

(5) For the classifier (a) implemented in part (3) above, which feature is the most discriminative? (5 marks)

(6) Extend the binary perceptron that you implemented in part (2) above to perform multi-class
classification using the 1-vs-rest approach. Report the train and test classification accuracies
for each of the three classes after training for 20 iterations. (15 marks)

(7) Add an $ \ell_{2} $ regularisation term to your multi-class classifier implemented in question (5). Set
the regularisation coefficient to 0.01, 0.1, 1.0, 10.0, 100.0 and compare the train and test
classification accuracy for each of the three classes. (10 marks)

## Task (1) 
> Explain the Perceptron algorithm for the binary classification case, providing its pseudo code. (20 marks)

The *Perceptron Algorithm*, first described by Frank Rosenblatt in 1958, is inspired by the idea of a biological neuron which is sensitive to a number of stimuli and is deterministically activated when the effect of those combined stimuli exceeds some activation threshold.

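Mathematically, we model this as the dot product of a vector of quantified stimuli $\mathbf{x}_{i}$ (the $i$-th example) and a vector of sensitivity weights $\mathbf{w}$, plus a bias term $b$. The activation is $a_{i} = \mathbf{w}^{\top}\mathbf{x}_{i} + b$, and the model assigns the example to the positive class when $a_{i} > 0$ and to the negative class otherwise.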

Given a dataset that can be described as a linearly separable collection of vectors, already divided into two disjoint labelled categories, the Perceptron Algorithm shows how to adjust our model's sensitivity weights and bias until the model correctly classifies all of our data.

In pseudo-code:

    Input: ProblemSize, InputPatterns, MaxIterations, LearningRate
    Output: Weights
    // initialize weights and bias
    Weights ← InitializeWeights(ProblemSize)
    For (i = 1 To MaxIterations)
        Pattern ← SelectInputPattern(InputPatterns)
        Activation ← ActivateNetwork(Pattern, Weights)
        Output ← TransferActivation(Activation)
        UpdateWeights(Pattern, Output, LearningRate)
    End
    Return (Weights)
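
The steps above can be sketched in Python as follows. This is a minimal illustration rather than the coursework implementation. The names `perceptron_train` and `perceptron_predict`, the label encoding (+1 / -1), the implicit learning rate of 1, the early stop once an epoch makes no mistakes, and the toy data are all assumptions made for the example.

```python
import numpy as np

def perceptron_train(X, y, max_iter=20):
    """Train a binary perceptron; labels in y are assumed to be +1 or -1."""
    w = np.zeros(X.shape[1])  # initialize weights
    b = 0.0                   # initialize bias
    for _ in range(max_iter):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # a non-positive margin y * (w.x + b) counts as a mistake
            if y_i * (np.dot(w, x_i) + b) <= 0:
                w += y_i * x_i  # move the boundary towards the example
                b += y_i
                mistakes += 1
        if mistakes == 0:  # every example is classified correctly
            break
    return w, b

def perceptron_predict(X, w, b):
    """Return +1 where the activation is positive, -1 otherwise."""
    return np.where(X @ w + b > 0, 1, -1)

# toy linearly separable data (hypothetical, for illustration only)
X_toy = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -3.0]])
y_toy = np.array([1, 1, -1, -1])
w, b = perceptron_train(X_toy, y_toy)
predictions = perceptron_predict(X_toy, w, b)
```

On this toy set the loop needs a single update before every example has a positive margin, so `predictions` matches `y_toy`.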

## Task (2) 
> Prove that for a linearly separable dataset, perceptron algorithm will converge. (10 marks)
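
One standard argument for this result (usually attributed to Novikoff) can be sketched as follows. The sketch assumes, beyond what the question states, that the bias is absorbed into $\mathbf{w}$ by appending a constant 1 to every $\mathbf{x}_{i}$, that labels are $y_{i} \in \{-1, +1\}$, that training starts from $\mathbf{w}_{0} = \mathbf{0}$ and applies the update $\mathbf{w}_{k} = \mathbf{w}_{k-1} + y_{i}\mathbf{x}_{i}$ on the $k$-th mistake, that $\lVert\mathbf{x}_{i}\rVert \le R$ for all $i$, and that linear separability is expressed as a unit vector $\mathbf{w}^{*}$ with margin $\gamma > 0$ such that $y_{i}\,\mathbf{w}^{*\top}\mathbf{x}_{i} \ge \gamma$ for all $i$.

$$
\begin{aligned}
\mathbf{w}^{*\top}\mathbf{w}_{k} &= \mathbf{w}^{*\top}\mathbf{w}_{k-1} + y_{i}\,\mathbf{w}^{*\top}\mathbf{x}_{i} \ge \mathbf{w}^{*\top}\mathbf{w}_{k-1} + \gamma
  &&\Rightarrow\quad \mathbf{w}^{*\top}\mathbf{w}_{k} \ge k\gamma, \\
\lVert\mathbf{w}_{k}\rVert^{2} &= \lVert\mathbf{w}_{k-1}\rVert^{2} + 2\,y_{i}\,\mathbf{w}_{k-1}^{\top}\mathbf{x}_{i} + \lVert\mathbf{x}_{i}\rVert^{2} \le \lVert\mathbf{w}_{k-1}\rVert^{2} + R^{2}
  &&\Rightarrow\quad \lVert\mathbf{w}_{k}\rVert^{2} \le kR^{2}, \\
k\gamma &\le \mathbf{w}^{*\top}\mathbf{w}_{k} \le \lVert\mathbf{w}^{*}\rVert\,\lVert\mathbf{w}_{k}\rVert \le \sqrt{k}\,R
  &&\Rightarrow\quad k \le \frac{R^{2}}{\gamma^{2}}.
\end{aligned}
$$

The second line uses the fact that an update only happens on a mistake, that is, when $y_{i}\,\mathbf{w}_{k-1}^{\top}\mathbf{x}_{i} \le 0$. The third line combines the two bounds through the Cauchy-Schwarz inequality. Since the number of updates $k$ is at most $R^{2}/\gamma^{2}$, the algorithm can make only finitely many mistakes, after which the weights stop changing.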

## Task (3) 
> Implement a binary perceptron. (20 marks)

In [21]:
import numpy as np

In [77]:
def load_data(filename, with_print=False):
    """
    Read in labelled data from file.

    Args:
        filename (str): Name of file in local directory.
        with_print (bool): If True, print each line as it is parsed.

    Returns:
        dict: Maps each line index to a (label, vector) tuple, where
            label is the class string and vector is a np.array of floats.
    """

    # open the file
    with open(filename, 'r') as f:
        file_data = f.read()

    # split lines
    split_data = file_data.split('\n')

    # create dict to store data
    data = {}
    for i, datum in enumerate(split_data):
        try:
            # split the data-vector from the class-label
            split = datum.split(',class-')
            label = split[1]
        except IndexError:
            # no ',class-' separator on this line
            if with_print:
                print(f'Could not split "{datum}".\nProbably this was the end of the file.')
            continue

        # convert the elements of the data-vector from text strings
        # to floating-point numbers, stored as a numpy array vector
        vector = np.array([float(string) for string in split[0].split(',')])

        # load the label and vector into the data dict
        data[i] = (label, vector)

        if with_print:
            print(f'Extracted "{datum}" to "{data[i]}".')

    return data

In [78]:
test = load_data('test.data')

In [79]:
train = load_data('train.data')

In [80]:
train

{0: ('1', array([5.1, 3.5, 1.4, 0.2])),
 1: ('1', array([4.9, 3. , 1.4, 0.2])),
 2: ('1', array([4.7, 3.2, 1.3, 0.2])),
 3: ('1', array([4.6, 3.1, 1.5, 0.2])),
 4: ('1', array([5. , 3.6, 1.4, 0.2])),
 5: ('1', array([5.4, 3.9, 1.7, 0.4])),
 6: ('1', array([4.6, 3.4, 1.4, 0.3])),
 7: ('1', array([5. , 3.4, 1.5, 0.2])),
 8: ('1', array([4.4, 2.9, 1.4, 0.2])),
 9: ('1', array([4.9, 3.1, 1.5, 0.1])),
 10: ('1', array([5.4, 3.7, 1.5, 0.2])),
 11: ('1', array([4.8, 3.4, 1.6, 0.2])),
 12: ('1', array([4.8, 3. , 1.4, 0.1])),
 13: ('1', array([4.3, 3. , 1.1, 0.1])),
 14: ('1', array([5.8, 4. , 1.2, 0.2])),
 15: ('1', array([5.7, 4.4, 1.5, 0.4])),
 16: ('1', array([5.4, 3.9, 1.3, 0.4])),
 17: ('1', array([5.1, 3.5, 1.4, 0.3])),
 18: ('1', array([5.7, 3.8, 1.7, 0.3])),
 19: ('1', array([5.1, 3.8, 1.5, 0.3])),
 20: ('1', array([5.4, 3.4, 1.7, 0.2])),
 21: ('1', array([5.1, 3.7, 1.5, 0.4])),
 22: ('1', array([4.6, 3.6, 1. , 0.2])),
 23: ('1', array([5.1, 3.3, 1.7, 0.5])),
 24: ('1', array([4.8, 3.4

In [81]:
def train_perceptron(training_dataset):
    """Train Perceptron on given training dataset.

    Args:
        training_dataset (dict): Dict with integer keys containing tuples
            with labels and np.array data vectors.

    Returns:
        int: Length of the data vectors (the training loop is not yet implemented).
    """
    
    # load training_dataset
    data = training_dataset
    
    # find length of dataset vectors
    vector_length = len(data[0][1])
    
    # initialize weight_vector
    weight_vector = np.zeros(vector_length + 1)
    
    return vector_length

In [82]:
train_perceptron(train)

4

In [86]:
np.zeros(3)

array([0., 0., 0.])


## Task (4) 
> Use the binary perceptron to train classifiers to discriminate between (a) class 1 and class 2, (b) class 2 and class 3 and (c) class 1 and class 3. Report the train and test classification accuracies for each of the three classifiers after 20 iterations. Which pair of classes is most difficult to separate? (20 marks)
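
As a hedged sketch of the data preparation this task needs, the helpers below extract a two-class subset from a dict shaped like the one `load_data` returns and compute classification accuracy. The names `select_pair` and `accuracy`, the mapping of the first class to +1 and the second to -1, and the three-row `toy` dataset are assumptions for illustration. No results for the real data are implied.

```python
import numpy as np

def select_pair(data, positive, negative):
    """Keep only two classes from a {index: (label, vector)} dict.

    Returns a feature matrix X and a label vector y with values +1 / -1.
    """
    rows = [(vec, 1 if label == positive else -1)
            for label, vec in data.values()
            if label in (positive, negative)]
    X = np.array([vec for vec, _ in rows])
    y = np.array([target for _, target in rows])
    return X, y

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

# hypothetical miniature dataset in the same shape load_data returns
toy = {
    0: ('1', np.array([5.1, 3.5, 1.4, 0.2])),
    1: ('2', np.array([7.0, 3.2, 4.7, 1.4])),
    2: ('3', np.array([6.3, 3.3, 6.0, 2.5])),
}
X_pair, y_pair = select_pair(toy, positive='1', negative='2')
```

The same call with `('2', '3')` or `('1', '3')` would give the inputs for classifiers (b) and (c).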

## Task (5) 
> For the classifier (a) implemented in part (3) above, which feature is the most discriminative? (5 marks)

## Task (6) 
> Extend the binary perceptron that you implemented in part (2) above to perform multi-class classification using the 1-vs-rest approach. Report the train and test classification accuracies for each of the three classes after training for 20 iterations. (15 marks)
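
One way the 1-vs-rest approach could be sketched: train one binary perceptron per class, treating that class as +1 and every other class as -1, then predict the class whose perceptron gives the largest activation. The function names, the fixed 20 passes with no early stopping, the argmax decision rule, and the one-hot toy data are assumptions for this example, not a statement of how the coursework solution is built.

```python
import numpy as np

def train_one_vs_rest(X, labels, classes, max_iter=20):
    """Train one binary perceptron per class (that class vs all others).

    Returns a dict mapping each class to its (weights, bias) pair.
    """
    models = {}
    for c in classes:
        y = np.where(labels == c, 1, -1)  # current class is +1, the rest -1
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(max_iter):
            for x_i, y_i in zip(X, y):
                if y_i * (np.dot(w, x_i) + b) <= 0:  # misclassified
                    w = w + y_i * x_i
                    b = b + y_i
        models[c] = (w, b)
    return models

def predict_one_vs_rest(X, models):
    """For each row, pick the class whose perceptron has the largest activation."""
    classes = list(models)
    scores = np.column_stack([X @ w + b for w, b in models.values()])
    return np.array(classes)[np.argmax(scores, axis=1)]

# hypothetical toy data: one point per class, trivially separable
X_toy = np.eye(3)
labels_toy = np.array(['1', '2', '3'])
models = train_one_vs_rest(X_toy, labels_toy, classes=['1', '2', '3'])
predictions = predict_one_vs_rest(X_toy, models)
```

Comparing raw activations across separately trained perceptrons is a simplification, since their scales are not calibrated against each other.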

## Task (7) 
> Add an $\ell_{2}$ regularisation term to your multi-class classifier implemented in question (5). Set the regularisation coefficient to 0.01, 0.1, 1.0, 10.0, 100.0 and compare the train and test classification accuracy for each of the three classes. (10 marks)
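
A common way to add an $\ell_{2}$ term, sketched here under stated assumptions, is to penalise $\frac{\lambda}{2}\lVert\mathbf{w}\rVert^{2}$, which shrinks the weights by a factor $(1 - \eta\lambda)$ whenever an update is made. The function name `l2_perceptron_step`, the learning rate `lr`, applying the shrinkage only on mistakes, and leaving the bias unregularised are all choices made for this example. Other formulations exist.

```python
import numpy as np

def l2_perceptron_step(w, b, x, y, lam, lr=1.0):
    """One L2-regularised perceptron step for a single example (y is +1 or -1).

    On a mistake the weights are scaled by (1 - lr * lam) and then receive
    the usual perceptron correction; the bias is left unregularised.
    """
    if y * (np.dot(w, x) + b) <= 0:
        w = (1.0 - lr * lam) * w + lr * y * x
        b = b + lr * y
    return w, b

# hypothetical values: this example is misclassified, so an update happens
w, b = np.array([1.0, -1.0]), 0.0
w_new, b_new = l2_perceptron_step(w, b, x=np.array([1.0, 2.0]), y=1, lam=0.1)
```

With `lr = 1.0`, coefficients above 1 make the factor `(1 - lr * lam)` negative, so the weights flip sign and grow at each update. That is worth keeping in mind when comparing the larger coefficients in the task.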