### Week 3 Linear Classification Models: Task 1
_Linear classification models on the Haberman dataset using the Perceptron model.
Recommended: Press the fast forward symbol to run the whole notebook, so that the imports and helper functions are defined before you run the experiments below._
##### Authors: Miss Katrina Jones and Dr. Aniko Ekart (ML Module, 2021)
_PLEASE NOTE: Try to avoid copying and pasting this code or text in your portfolio submissions; try to complete the task on your own, and compare the solution to your own. Use this as reference if you encounter any issues in completing the task by yourself, and let us know if you have any queries or spot any issues with this code._

In [1]:
# Import general modules
import pandas as pd # For reading the csv
from sklearn.linear_model import Perceptron # Our chosen model
from sklearn.metrics import confusion_matrix # For creation of the confusion matrix
from sklearn.metrics import classification_report # For creation of precision, recall and f1-measures
from sklearn.metrics import accuracy_score # For help in comparing data given using accuracy score vs confusion matrix
import warnings # Allows better control of warning messages

# Set various display options
pd.set_option('display.max_rows', 20) # This is just so that we're not printing every single row in one go.
pd.set_option('display.max_columns', 5) # If we were to expand the dataset, this is so that we're not printing every single column present.
warnings.simplefilter('ignore') # So that we can ignore deprecation warnings (note that this hides all other warnings too).

_Setting up and reading the dataset from the csv file._

In [2]:
features = ['Age', 'YearOfOp', '+AuxillaryNodes']
classes = ['SurvivalStatus']

hdata = pd.read_csv("haberman.csv", names=features+classes)
hdata.shape

(306, 4)

In [3]:
hdata.head()

,Age,YearOfOp,+AuxillaryNodes,SurvivalStatus
0,30,64,1,1
1,30,62,3,1
2,30,65,0,1
3,31,59,2,1
4,31,65,4,1


In [4]:
hdata.describe() # Useful here because there are too many rows to inspect by eye - we can see the range of the values in each column.

,Age,YearOfOp,+AuxillaryNodes,SurvivalStatus
count,306.0,306.0,306.0,306.0
mean,52.457516,62.852941,4.026144,1.264706
std,10.803452,3.249405,7.189654,0.441899
min,30.0,58.0,0.0,1.0
25%,44.0,60.0,0.0,1.0
50%,52.0,63.0,1.0,1.0
75%,60.75,65.75,4.0,2.0
max,83.0,69.0,52.0,2.0


#### Pre-Processing the Data: Grouping the data using the Year of Operation for the train-test split criteria
_We were asked to split the data so that there were 229 rows for the train set using the years 1958-1965, and 77 rows for the test data, using the years 1966-1969._

In [5]:
# Let's get the train set - so the data that resides between 1958-1965 (229 rows)
hdatatrain = hdata.loc[(hdata['YearOfOp'] <= 65) & (hdata['YearOfOp'] >= 58)] # I chose to split using boolean indexing, but Pandas groupby is a good alternative
hdatatrain.shape

(229, 4)

In [6]:
# Let's get the test set - so the data that resides between 1966-1969 (77 rows)
hdatatest = hdata.loc[(hdata['YearOfOp'] <= 69) & (hdata['YearOfOp'] >= 66)] # The same boolean indexing approach as for the train set
hdatatest.shape

(77, 4)
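
A quick way to sanity-check a split like this is to confirm that the two masks never select the same row and that together they cover the whole dataset. The sketch below is only an illustration: `toy` is a small made-up frame (not the Haberman file), and it uses `Series.between`, which is inclusive at both ends by default, as an alternative to chaining two comparisons.

```python
import pandas as pd

# Hypothetical toy data standing in for hdata; only the YearOfOp column matters here.
toy = pd.DataFrame({"Age": [30, 41, 52, 63, 74],
                    "YearOfOp": [58, 65, 66, 69, 61]})

train_mask = toy["YearOfOp"].between(58, 65)  # inclusive: 58 <= year <= 65
test_mask = toy["YearOfOp"].between(66, 69)   # inclusive: 66 <= year <= 69

toy_train = toy.loc[train_mask]
toy_test = toy.loc[test_mask]

# No row may be in both sets, and every row must be in one of them.
no_overlap = not (train_mask & test_mask).any()
full_cover = len(toy_train) + len(toy_test) == len(toy)
print(len(toy_train), len(toy_test), no_overlap, full_cover)
```

Applied to the real data, the same two checks should agree with the shapes shown above: 229 + 77 = 306.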

#### Creating the Train and Test datasets
_Divide the dataset into training and testing sets by year of operation: 1958-1965 (229 instances) for training and 1966-1969 (77 instances) for testing. Now that we have split the data in this way, we can do this simply:_

In [7]:
# Our train set...
x_train = hdatatrain[features].values
y_train = hdatatrain[classes].values.ravel() # Flatten to a 1-D array, which is the shape scikit-learn expects for labels

# Our test set...
x_test = hdatatest[features].values
y_test = hdatatest[classes].values.ravel()

#### Setting up the Perceptron 
_Apply a single perceptron using sklearn.linear_model.Perceptron._

In [8]:
p = Perceptron() # Initialise the model we are using
model = p.fit(x_train, y_train)
y_pred = model.predict(x_test)

#### Create a Confusion Matrix
_Use sklearn.metrics.confusion_matrix to obtain the confusion matrix of your classifier._

In [9]:
print(confusion_matrix(y_test, y_pred))

[[56  4]
 [13  4]]


#### Discussion Point #1: Confusion Matrix
- What is a confusion matrix? 
- What does the number 56 mean?
- What does the number on the top left mean?
- How many did it get correct?
- Could you explain what these results are showing us?
- Is this good... or bad..? How can you tell?
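
As a starting point for these questions, the numbers in the matrix printed above can be added up by hand. The sketch below is plain Python with no libraries; it assumes the scikit-learn layout, where rows are the actual classes and columns are the predicted classes.

```python
# Confusion matrix printed above: rows are the actual classes, columns the predicted classes.
cm = [[56, 4],
      [13, 4]]

total = sum(sum(row) for row in cm)   # all test instances
correct = cm[0][0] + cm[1][1]         # the main diagonal holds the correct predictions
accuracy = correct / total

# Row totals give the number of test instances that actually belong to each class.
actual_class_1 = sum(cm[0])
actual_class_2 = sum(cm[1])

print(total, correct, round(accuracy, 2))
print(actual_class_1, actual_class_2)
```

The row totals are worth noting: most of the test instances belong to class 1, which matters when judging whether the accuracy is good or bad.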

In [10]:
# This pattern is taken from the scikit-learn documentation; ravel() flattens the 2x2 matrix so that the four values can be unpacked separately.
# With the class labels 1 and 2, class 1 takes the "negative" position and class 2 takes the "positive" position.
# NOTE: This is different to the lecture formatting of a confusion matrix, be careful!
truenegatives, falsepositives, falsenegatives, truepositives = confusion_matrix(y_test, y_pred).ravel()

In [11]:
# This is just to clearly illustrate what they are.
print(f'TN: {truenegatives} || FP: {falsepositives} || FN: {falsenegatives} || TP: {truepositives}')

TN: 56 || FP: 4 || FN: 13 || TP: 4


In [12]:
# This is in case you wish to format the confusion matrix to have specific headings.
# The class labels in this dataset are 1 and 2, so the headings use those values.
rowlabels = ["Actual 1", "Actual 2"]
columnlabels = ["Predicted 1", "Predicted 2"]
pd.DataFrame(confusion_matrix(y_test, y_pred), index=rowlabels, columns=columnlabels)

,Predicted 1,Predicted 2
Actual 1,56,4
Actual 2,13,4


#### Create a Classification Report

_Use sklearn.metrics.classification_report to print out a report on precision, recall, f1-measure for both training and test data._

In [13]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           1       0.81      0.93      0.87        60
           2       0.50      0.24      0.32        17

    accuracy                           0.78        77
   macro avg       0.66      0.58      0.59        77
weighted avg       0.74      0.78      0.75        77



#### Discussion Point #2: Classification Report
- What is precision?
- What is recall?
- What is an f1-score?
- What is 'support'?
- What is accuracy?
- What is the macro average?
- What is the weighted average?
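
One way to work through these questions is to rebuild the test report by hand from the confusion matrix. The sketch below is plain Python and uses the standard definitions (precision = TP / predicted, recall = TP / actual, f1 = harmonic mean of the two); the `report` dictionary is just an illustrative container, not the scikit-learn API. It does not handle a class that is never predicted, where the precision would be a division by zero.

```python
# Test confusion matrix from above: rows are actual classes, columns are predicted classes.
cm = [[56, 4],
      [13, 4]]
labels = [1, 2]
total = sum(sum(row) for row in cm)

report = {}
for i, label in enumerate(labels):
    tp = cm[i][i]                                   # correctly predicted as this class
    predicted = sum(row[i] for row in cm)           # column total: everything predicted as this class
    support = sum(cm[i])                            # row total: everything that really is this class
    precision = tp / predicted
    recall = tp / support
    f1 = 2 * precision * recall / (precision + recall)
    report[label] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": support}

# Macro average: plain mean over the classes. Weighted average: mean weighted by support.
macro_f1 = sum(r["f1"] for r in report.values()) / len(report)
weighted_f1 = sum(r["f1"] * r["support"] for r in report.values()) / total

for label, r in report.items():
    print(label, round(r["precision"], 2), round(r["recall"], 2),
          round(r["f1"], 2), r["support"])
print("macro f1:", round(macro_f1, 2), "weighted f1:", round(weighted_f1, 2))
```

The rounded values should match the corresponding rows of the report printed above.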

In [14]:
y_pred_train = model.predict(x_train)
print(classification_report(y_train, y_pred_train))

              precision    recall  f1-score   support

           1       0.80      0.91      0.85       165
           2       0.64      0.42      0.51        64

    accuracy                           0.77       229
   macro avg       0.72      0.67      0.68       229
weighted avg       0.76      0.77      0.76       229



#### Discussion Point #3: Comparing Classification Reports
- Is this a good model?
- Why?
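
One rough yardstick for "good" is the accuracy a model would get by always predicting the most frequent class. The sketch below computes that baseline from the support counts in the two reports above; `majority_baseline_accuracy` is a hypothetical helper written for this illustration.

```python
# Support counts taken from the classification reports above.
test_support = {1: 60, 2: 17}
train_support = {1: 165, 2: 64}

def majority_baseline_accuracy(support):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(support.values()) / sum(support.values())

test_baseline = majority_baseline_accuracy(test_support)
train_baseline = majority_baseline_accuracy(train_support)
print(round(test_baseline, 2), round(train_baseline, 2))
```

Comparing these baselines with the accuracies in the reports, and then looking at the recall for class 2, gives a fuller picture than accuracy alone.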

#### Experiment with Different Parameter Settings (in a Systematic Way) 
_Note: Your experiments may be different to mine._

##### A helper function to keep the experiments simple.

In [15]:
def create_and_change_defaults_of_perceptron(penaltyvalue, maxitervalue):
    ''' A helper function that fits a Perceptron with the given settings to the training data and prints out the results on the test data. '''
    
    # Describe which non-default parameters were applied ('N/A' means the defaults were used)...
    parametersapplied = 'N/A'
    
    # Check which arguments were supplied, and create the matching Perceptron.
    if penaltyvalue is None and maxitervalue is None:
        percept = Perceptron()
    elif penaltyvalue is not None and maxitervalue is None:
        percept = Perceptron(penalty=penaltyvalue)
        parametersapplied = 'Penalty type: ' + penaltyvalue
    elif penaltyvalue is None and maxitervalue is not None:
        percept = Perceptron(max_iter=maxitervalue)
        parametersapplied = 'Maximum number of iteration(s): ' + str(maxitervalue)
    else:
        percept = Perceptron(penalty=penaltyvalue, max_iter=maxitervalue)
        parametersapplied = 'Penalty type: ' + penaltyvalue + ' Maximum number of iteration(s): ' + str(maxitervalue)
    
    # Apply the data to the model...
    emodel = percept.fit(x_train, y_train)
    ey_pred = emodel.predict(x_test)
    # Create a confusion matrix...
    cm = confusion_matrix(y_test, ey_pred)
    tn, fp, fn, tp = cm.ravel()
    ttpc = tn + tp
    ttpi = fp + fn
    
    # Print out statistics about train-test data split and amount of data used (will help us to understand the confusion matrix).
    print("Total amount of data (test and train): %d" % len(hdata))
    print("Total amount of training data: %d" % len(x_train))
    print("Total amount of testing data: %d \n" % len(x_test))
    print("The aribtary accuracy score: %.2f" % accuracy_score(y_test, ey_pred))
    # Print out our findings...
    print("The confusion matrix with a " + parametersapplied + " applied: \n")
    print(cm)
    print('TN: ' + str(tn), '|| FP: ' + str(fp), '|| FN: ' + str(fn), '|| TP: ' + str(tp))
    print('Total Number predicted correctly (true negatives/positives): ' + str(ttpc))
    print('Total Number predicted incorrectly: (false negatives/positives): ' + str(ttpi))
    print("\n --------------------- \n Classification Report: \n")
    # Print out our classification report...
    print(classification_report(y_test, ey_pred))
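
The helper above reports one setting at a time. For a more systematic comparison, the same idea can be looped over a grid of settings with the scores collected into one table. The sketch below is an illustration under stated assumptions: it uses synthetic data from `make_classification` rather than the Haberman data, a random `train_test_split` rather than the year-based split, and the names `penalties`, `max_iters` and `results` are made up for this example.

```python
import warnings
from itertools import product

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Very small max_iter values stop before convergence; silence only that warning.
warnings.simplefilter("ignore", category=ConvergenceWarning)

# Synthetic stand-in data (an assumption for this sketch, not the Haberman data).
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

penalties = [None, "l1", "l2", "elasticnet"]
max_iters = [1, 50]

rows = []
for penalty, max_iter in product(penalties, max_iters):
    clf = Perceptron(penalty=penalty, max_iter=max_iter).fit(X_tr, y_tr)
    rows.append({"penalty": str(penalty),
                 "max_iter": max_iter,
                 "train_acc": accuracy_score(y_tr, clf.predict(X_tr)),
                 "test_acc": accuracy_score(y_te, clf.predict(X_te))})

# One row per combination makes the settings easy to compare side by side.
results = pd.DataFrame(rows)
print(results)
```

Keeping both the training and the test score in the table makes it easier to compare the two for each setting.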

#### Experiment Part 1: Applying Different Types of Penalties to Perceptrons

##### Our default Perceptron, just for quick reference...

In [16]:
create_and_change_defaults_of_perceptron(None, None)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.78
The confusion matrix with a N/A applied: 

[[56  4]
 [13  4]]
TN: 56 || FP: 4 || FN: 13 || TP: 4
Total Number predicted correctly (true negatives/positives): 60
Total Number predicted incorrectly: (false negatives/positives): 17

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.81      0.93      0.87        60
           2       0.50      0.24      0.32        17

    accuracy                           0.78        77
   macro avg       0.66      0.58      0.59        77
weighted avg       0.74      0.78      0.75        77



In [17]:
create_and_change_defaults_of_perceptron('l2', None)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.51
The confusion matrix with a Penalty type: l2 applied: 

[[32 28]
 [10  7]]
TN: 32 || FP: 28 || FN: 10 || TP: 7
Total Number predicted correctly (true negatives/positives): 39
Total Number predicted incorrectly: (false negatives/positives): 38

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.76      0.53      0.63        60
           2       0.20      0.41      0.27        17

    accuracy                           0.51        77
   macro avg       0.48      0.47      0.45        77
weighted avg       0.64      0.51      0.55        77



In [18]:
create_and_change_defaults_of_perceptron('l1', None)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.78
The confusion matrix with a Penalty type: l1 applied: 

[[56  4]
 [13  4]]
TN: 56 || FP: 4 || FN: 13 || TP: 4
Total Number predicted correctly (true negatives/positives): 60
Total Number predicted incorrectly: (false negatives/positives): 17

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.81      0.93      0.87        60
           2       0.50      0.24      0.32        17

    accuracy                           0.78        77
   macro avg       0.66      0.58      0.59        77
weighted avg       0.74      0.78      0.75        77



In [19]:
create_and_change_defaults_of_perceptron('elasticnet', None)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.51
The confusion matrix with a Penalty type: elasticnet applied: 

[[32 28]
 [10  7]]
TN: 32 || FP: 28 || FN: 10 || TP: 7
Total Number predicted correctly (true negatives/positives): 39
Total Number predicted incorrectly: (false negatives/positives): 38

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.76      0.53      0.63        60
           2       0.20      0.41      0.27        17

    accuracy                           0.51        77
   macro avg       0.48      0.47      0.45        77
weighted avg       0.64      0.51      0.55        77

#### Experiment Part 2: Changing the Maximum Number of Iterations



In [20]:
create_and_change_defaults_of_perceptron(None, 1)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.78
The confusion matrix with a Maximum number of iteration(s): 1 applied: 

[[60  0]
 [17  0]]
TN: 60 || FP: 0 || FN: 17 || TP: 0
Total Number predicted correctly (true negatives/positives): 60
Total Number predicted incorrectly: (false negatives/positives): 17

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.78      1.00      0.88        60
           2       0.00      0.00      0.00        17

    accuracy                           0.78        77
   macro avg       0.39      0.50      0.44        77
weighted avg       0.61      0.78      0.68        77



In [21]:
create_and_change_defaults_of_perceptron(None, 50)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.78
The confusion matrix with a Maximum number of iteration(s): 50 applied: 

[[56  4]
 [13  4]]
TN: 56 || FP: 4 || FN: 13 || TP: 4
Total Number predicted correctly (true negatives/positives): 60
Total Number predicted incorrectly: (false negatives/positives): 17

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.81      0.93      0.87        60
           2       0.50      0.24      0.32        17

    accuracy                           0.78        77
   macro avg       0.66      0.58      0.59        77
weighted avg       0.74      0.78      0.75        77



In [22]:
create_and_change_defaults_of_perceptron('l1', 1)

Total amount of data (test and train): 306
Total amount of training data: 229
Total amount of testing data: 77 

The aribtary accuracy score: 0.78
The confusion matrix with a Penalty type: l1 Maximum number of iteration(s): 1 applied: 

[[60  0]
 [17  0]]
TN: 60 || FP: 0 || FN: 17 || TP: 0
Total Number predicted correctly (true negatives/positives): 60
Total Number predicted incorrectly: (false negatives/positives): 17

 --------------------- 
 Classification Report: 

              precision    recall  f1-score   support

           1       0.78      1.00      0.88        60
           2       0.00      0.00      0.00        17

    accuracy                           0.78        77
   macro avg       0.39      0.50      0.44        77
weighted avg       0.61      0.78      0.68        77



#### End of Task Discussion: Are the results as you expected? How can you explain them?

_For each solution, compare the results on training and testing data..._

- What are the parameter values of a default Perceptron?
- What are the differences?
- Can you explain them?
- How did you decide which model is better?
- What parameters led to the best model?
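
For the first question, the defaults can be read directly from an unfitted model rather than looked up. This is a minimal sketch using `get_params()`, which every scikit-learn estimator provides; the exact values printed depend on the installed scikit-learn version, and the selection of parameter names here is just the subset most relevant to the experiments above.

```python
from sklearn.linear_model import Perceptron

# get_params() returns a dict of every constructor parameter and its current value.
defaults = Perceptron().get_params()

# Print only the parameters that the experiments above touch or depend on.
for name in ("penalty", "alpha", "max_iter", "tol", "shuffle", "random_state"):
    print(name, "=", defaults[name])
```

Comparing these values with the settings passed to the helper function shows which experiments actually changed anything relative to the default model.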