# In-Class Challenge Assignment: Experimenting with the Perceptron
# Day 18 Extension
# CMSE 202

## Now that you have a working Perceptron Classifier... let's experiment with it a bit!

When building and testing your Perceptron Classifier, you used a simplified version of the iris dataset that had been reduced to just two features and two class labels.

### Will your Perceptron classifier work on a more complex dataset?

Another widely used dataset for experimenting with binary classification is the [sonar dataset](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks)).

A version of this dataset can be found here:

`https://raw.githubusercontent.com/msu-cmse-courses/cmse202-S22-data/main/data/sonar.csv`

Take a moment to read the [UC Irvine Machine Learning Repository page](https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+(Sonar,+Mines+vs.+Rocks)) to understand exactly what is in this dataset. In short, it is a collection of sonar measurements of rocks and "mines" (metal cylinders).
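
As a rough sketch of the first step, the snippet below reads a CSV with pandas and maps text labels to the +1/-1 values a Perceptron expects. It uses a tiny made-up sample held in memory rather than the real file. The `Class` column name and the `Rock`/`Mine` label strings follow the code later in this notebook and are assumptions about how this version of the dataset is laid out; `encode_labels` is a hypothetical helper name.

```python
import io
import pandas as pd

# Tiny stand-in for sonar.csv: two feature columns plus a text label.
# The real file has many more attribute columns; these values are made up.
SAMPLE_CSV = """attribute_1,attribute_2,Class
0.02,0.04,Rock
0.05,0.01,Mine
0.03,0.07,Rock
"""

def encode_labels(frame, column="Class", mapping=None):
    """Return a copy of frame with text labels replaced by +1/-1 (hypothetical helper)."""
    if mapping is None:
        mapping = {"Rock": 1, "Mine": -1}  # assumed label strings
    unknown = set(frame[column]) - set(mapping)
    if unknown:
        # fail loudly instead of silently producing missing values
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    encoded = frame.copy()
    encoded[column] = encoded[column].map(mapping)
    return encoded

sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
encoded = encode_labels(sample)
print(encoded["Class"].tolist())
```

`pd.read_csv` also accepts a URL string, so the address above can be passed to it directly instead of a local file name.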



In [1]:
# Do This: Load the sonar.csv file and inspect its columns and number of rows
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

df = pd.read_csv('sonar.csv')
print(df.columns)
len(df)



Index(['attribute_1', 'attribute_2', 'attribute_3', 'attribute_4',
       'attribute_5', 'attribute_6', 'attribute_7', 'attribute_8',
       'attribute_9', 'attribute_10', 'attribute_11', 'attribute_12',
       'attribute_13', 'attribute_14', 'attribute_15', 'attribute_16',
       'attribute_17', 'attribute_18', 'attribute_19', 'attribute_20',
       'attribute_21', 'attribute_22', 'attribute_23', 'attribute_24',
       'attribute_25', 'attribute_26', 'attribute_27', 'attribute_28',
       'attribute_29', 'attribute_30', 'attribute_31', 'attribute_32',
       'attribute_33', 'attribute_34', 'attribute_35', 'attribute_36',
       'attribute_37', 'attribute_38', 'attribute_39', 'attribute_40',
       'attribute_41', 'attribute_42', 'attribute_43', 'attribute_44',
       'attribute_45', 'attribute_46', 'attribute_47', 'attribute_48',
       'attribute_49', 'attribute_50', 'attribute_51', 'attribute_52',
       'attribute_53', 'attribute_54', 'attribute_55', 'attribute_56',
       'attribu

208

In [2]:
## Fill in the skeleton

class Perceptron():

    def __init__(self, labeled_data, iters, learning_rate):
        # attributes: data, weights, iterations, learning rate
        self.data = labeled_data
        # one bias weight plus one weight per feature (the last column is the label)
        self.weights = np.ones(shape=labeled_data.shape[1])
        self.iters = iters
        self.learning_rate = learning_rate
        self.error_count = 0

    def predict(self, feature_set):
        # weights[0] is the bias; the remaining weights pair with the features
        result = np.dot(feature_set, self.weights[1:]) + self.weights[0]
        # treat a result of exactly 0 as class -1 so a label is always returned
        return 1 if result > 0 else -1

    def fit(self):
        # for all iterations
        #    for each row in the data
        #        find the update value, change the weights (including bias weight)
        for i in range(self.iters):
            for row in self.data:
                predict = self.predict(row[:-1])
                actual = row[-1]
                if int(predict) != int(actual):
                    self.error_count += 1
                # the update is zero whenever the prediction is correct
                update = self.learning_rate * (actual - predict)
                self.weights[0] += update
                self.weights[1:] += update * row[:-1]

    def errors(self):
        # prints the number of misclassifications made during training,
        # accumulated over all iterations, followed by the final weights
        print(self.error_count)
        print(self.weights)
    

In [3]:
# map the text labels to the +1/-1 values the Perceptron expects
df["Class"] = df["Class"].map({'Rock': 1, 'Mine': -1})
data = df.values

p = Perceptron(data, 10, .07)
p.fit()
p.errors()

86
[-1.52      0.628496  0.552798  0.427498 -0.018164  0.11898   0.586202
  0.512884  0.89899   0.647704  0.315582  0.173202  0.374088  0.245386
 -0.053276 -0.475964 -0.224202 -0.199646 -0.232266 -0.163288 -0.04979
 -0.148784  0.219836  0.24603  -0.327956 -0.200738  0.218912  0.148296
 -0.30487   0.113408 -0.17775  -0.01283   0.124692  0.115592  0.101648
 -0.21478  -0.056146  0.234886  0.031522  0.094578 -0.06512  -0.586606
 -0.33987   0.067978  0.582212  0.25464   0.037304  0.411566  0.54115
  0.709094  0.935516  0.902294  0.874504  0.912976  0.92433   0.934004
  0.981814  0.962732  0.914754  0.895154  0.889078]
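
The error count above is tallied while the weights are still changing, so it does not say how the final weights perform. One option is to score every row once after fitting. This is a minimal sketch, assuming the weight layout used in the class above (bias first, then one weight per feature) and a data array whose last column is the +1/-1 label. `predict_all` and `accuracy` are hypothetical helper names, and the numbers are made up.

```python
import numpy as np

def predict_all(weights, features):
    """Classify every row at once: +1 where the weighted sum is positive, else -1."""
    scores = features @ weights[1:] + weights[0]
    return np.where(scores > 0, 1, -1)

def accuracy(weights, labeled_data):
    """Fraction of rows whose prediction matches the label in the last column."""
    features, labels = labeled_data[:, :-1], labeled_data[:, -1]
    return float(np.mean(predict_all(weights, features) == labels))

# Made-up example: two features, four rows, label in the last column.
toy = np.array([
    [2.0, 1.0, 1],
    [1.5, 2.0, 1],
    [-1.0, -0.5, -1],
    [0.5, -2.0, 1],   # this row is misclassified by the weights below
])
toy_weights = np.array([0.0, 1.0, 1.0])  # bias, w1, w2

print(accuracy(toy_weights, toy))
```

With the fitted object from the cell above, the equivalent call would be `accuracy(p.weights, data)`.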


In [4]:
# Metrics for evaluating the model: accuracy and F1-score
from sklearn.metrics import f1_score, accuracy_score

# Importing the Decision Tree from the scikit-learn library
from sklearn.tree import DecisionTreeClassifier

# For splitting the data into train and test sets
from sklearn.model_selection import train_test_split

# Note: this import replaces the from-scratch Perceptron class defined above
from sklearn.linear_model import Perceptron

# df already holds the sonar data with the class labels mapped to 1 and -1
data = df.drop(columns=['Class'])
data = data.values

datasets = train_test_split(data,
                            df['Class'],
                            test_size=0.25)

train_data, test_data, train_labels, test_labels = datasets


In [5]:
p = Perceptron(max_iter=40, shuffle=False)
p.fit(train_data, train_labels)


predictions_train = p.predict(train_data)
predictions_test = p.predict(test_data)
train_score = accuracy_score(predictions_train, train_labels)
print("score on train data: ", train_score)
test_score = accuracy_score(predictions_test, test_labels)
print("score on test data: ", test_score)

p.score(train_data, train_labels)


from sklearn.metrics import classification_report

# Note: classification_report expects (y_true, y_pred). The predictions are passed
# first here, so the reports swap precision and recall, and "support" counts predicted labels.
print(classification_report(p.predict(train_data), train_labels))
print(classification_report(p.predict(test_data), test_labels))

score on train data:  0.8269230769230769
score on test data:  0.6923076923076923
              precision    recall  f1-score   support

          -1       0.79      0.88      0.83        75
           1       0.88      0.78      0.82        81

    accuracy                           0.83       156
   macro avg       0.83      0.83      0.83       156
weighted avg       0.83      0.83      0.83       156

              precision    recall  f1-score   support

          -1       0.70      0.70      0.70        27
           1       0.68      0.68      0.68        25

    accuracy                           0.69        52
   macro avg       0.69      0.69      0.69        52
weighted avg       0.69      0.69      0.69        52
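
To make the report columns less opaque, here is a small sketch of how precision, recall, and F1 are computed for one class from true and predicted labels. The label arrays are made up, and `precision_recall_f1` is a hypothetical helper written for illustration, not part of scikit-learn.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred, positive):
    """Per-class precision, recall, and F1 for the label given as `positive`."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))  # correctly flagged
    fp = np.sum((y_pred == positive) & (y_true != positive))  # flagged by mistake
    fn = np.sum((y_pred != positive) & (y_true == positive))  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return float(precision), float(recall), float(f1)

# Made-up labels: 1 = rock, -1 = mine.
y_true = [1, 1, 1, -1, -1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1, 1, -1, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

The same harmonic-mean formula can be checked against the training report above: for the `-1` row, 2 × 0.79 × 0.88 / (0.79 + 0.88) ≈ 0.83, which matches the f1-score column.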




In [None]:
# import tensorflow as tf
# mnist = tf.keras.datasets.mnist

# (train_data, train_labels),(test_data, test_labels) = mnist.load_data()
# train_data, test_data = train_data / 255.0, test_data / 255.0

# model = tf.keras.models.Sequential([
#   tf.keras.layers.Flatten(input_shape=(28, 28)),
#   tf.keras.layers.Dense(128, activation='relu'),
#   tf.keras.layers.Dropout(0.2),
#   tf.keras.layers.Dense(10, activation='softmax')
# ])

# model.compile(optimizer='adam',
#               loss='sparse_categorical_crossentropy',
#               metrics=['accuracy'])

# model.fit(train_data, train_labels, epochs=5)
# model.evaluate(test_data, test_labels)

---
### Testing your new tool and exploring others

With any time that you have left in class, see if you can accomplish the following:

1. Load up the sonar dataset and change the class labels so that they can be used with the Perceptron classifier.

2. Use the Perceptron classifier you built from scratch to see how well you can do at distinguishing rocks from mines. You may need to make some modifications to your code if you didn't build it to be flexible enough to accept an arbitrary number of data features. Experiment with the learning rate and number of iterations to see how high an accuracy you can get with your classifier.

3. If you get your Perceptron classifier working, can you figure out how to use the Perceptron Classifier that is available in [scikit-learn](https://scikit-learn.org/stable/index.html)? You may need to do a bit of Google searching and exploration of the documentation to figure this out. How well does the scikit-learn version do compared to the one you built?

4. If you're feeling really ambitious, can you build a Perceptron classifier with [TensorFlow](https://www.tensorflow.org/)? Remember, the Perceptron is basically just a single-neuron, single-layer neural network.
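
For item 2, one way to organize the experiments is a small loop over learning rates and iteration counts. The sketch below uses a compact NumPy perceptron that sizes its weights from the data, so it accepts any number of features. The synthetic data, the function names `fit_perceptron` and `score`, and the grid values are all assumptions chosen for illustration; none of the numbers it prints are results for the sonar dataset.

```python
import numpy as np

def fit_perceptron(features, labels, iters, learning_rate):
    """Train with the perceptron update rule; returns weights with the bias first."""
    # sized from the data rather than hard-coded; starts from ones, as in the class above
    weights = np.ones(features.shape[1] + 1)
    for _ in range(iters):
        for x, y in zip(features, labels):
            predicted = 1 if x @ weights[1:] + weights[0] > 0 else -1
            update = learning_rate * (y - predicted)  # zero when the guess is right
            weights[0] += update
            weights[1:] += update * x
    return weights

def score(weights, features, labels):
    """Fraction of rows classified correctly."""
    predicted = np.where(features @ weights[1:] + weights[0] > 0, 1, -1)
    return float(np.mean(predicted == labels))

# Synthetic, linearly separable data with 5 features (illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5))
labels = np.where(features @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) + 0.3 > 0, 1, -1)

# Try every combination of learning rate and iteration count.
results = {}
for learning_rate in (0.01, 0.1, 1.0):
    for iters in (1, 10, 50):
        weights = fit_perceptron(features, labels, iters, learning_rate)
        results[(learning_rate, iters)] = score(weights, features, labels)

best = max(results, key=results.get)
print(best, results[best])
```

The scores here are measured on the same rows used for training, which flatters the model; scoring on a held-out split gives a fairer comparison between settings.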

&#169; Copyright Department of Computational Mathematics, Science and Engineering; Michigan State University