# Perceptron algorithm (exercise)

In this exercise, implement a single neuron (perceptron) that classifies two groups of flowers from the Iris dataset.

The dataset consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and width of the sepals and petals, in centimeters. Based on the combination of these four features, a linear discriminant model can be developed to distinguish the species from each other.

For the purpose of this exercise, you will only use two features (sepal length and sepal width) of two species of flowers (Setosa and Versicolor).

In [1]:
# allows inline plotting below each cell
%matplotlib inline

In [21]:
# import necessary libraries 
import numpy as np
import matplotlib.pyplot as plt 
import pandas as pd 

In [22]:
# Function that plots the data and a linear classifier.
# w1 and b are the slope and intercept of the line drawn in the
# (sepal length, sepal width) plane; each row of data is (x, y, class).

def plot_border(w1, b, data):
    plt.axis([0, 10, 0, 6])
    plt.grid()
    
    # scatter data: class 0 in blue, every other class in red
    for point in data:
        color = "b" if point[2] == 0 else "r"
        plt.scatter(point[0], point[1], c=color)

    # separation line
    x = np.linspace(0, 10, 100)
    plt.plot(x, w1*x+b, '-g', label='y=w1*x+b')
    plt.show()
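
plot_border draws a line given as slope and intercept, while the perceptron used later in this exercise computes z = w1*x + w2*y + b. Setting z = 0 and solving for y gives y = -(w1/w2)*x - b/w2. The helper below is a minimal sketch of that conversion; the name boundary_slope_intercept is hypothetical and not part of the exercise.

```python
def boundary_slope_intercept(w1, w2, b):
    """Return (slope, intercept) of the line w1*x + w2*y + b = 0.

    Hypothetical helper: it rewrites the perceptron boundary as
    y = slope*x + intercept so it can be drawn as a straight line.
    """
    if w2 == 0:
        # The boundary is vertical and cannot be written as y = slope*x + intercept.
        raise ValueError("w2 must be non-zero")
    return -w1 / w2, -b / w2


# Example with made-up weights (not trained values).
slope, intercept = boundary_slope_intercept(1.0, -2.0, 3.0)
# In the notebook: plot_border(slope, intercept, training_data)
```

The result can be passed to plot_border as its first two arguments.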

In [5]:
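# Note: iris.data has no header row, so read_csv treats the first sample
# (5.1, 3.5, 1.4, 0.2, Iris-setosa) as the column names, as the output shows.
data = pd.read_csv("iris.data")
data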

     5.1  3.5  1.4  0.2     Iris-setosa
0    4.9  3.0  1.4  0.2     Iris-setosa
1    4.7  3.2  1.3  0.2     Iris-setosa
2    4.6  3.1  1.5  0.2     Iris-setosa
3    5.0  3.6  1.4  0.2     Iris-setosa
4    5.4  3.9  1.7  0.4     Iris-setosa
..   ...  ...  ...  ...             ...
144  6.7  3.0  5.2  2.3  Iris-virginica
145  6.3  2.5  5.0  1.9  Iris-virginica
146  6.5  3.0  5.2  2.0  Iris-virginica
147  6.2  3.4  5.4  2.3  Iris-virginica
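
The first sample appears as the column header in the output above, which suggests the file has no header row. One way to keep that sample as data, sketched here under that assumption, is to pass header=None and supply column names. The names below are hypothetical, and the later cells in this notebook use the original column names, so they would need adjusting if this variant were adopted. A small in-memory sample stands in for the file.

```python
import io

import pandas as pd

# Hypothetical column names; the file itself does not provide any.
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# A few rows in the same layout as iris.data, standing in for the real file.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
)

# header=None tells pandas that the first line is data, not column names.
df = pd.read_csv(sample, header=None, names=columns)
# With the real file: pd.read_csv("iris.data", header=None, names=columns)
```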


In [15]:
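# each point is a tuple (sepal length, sepal width, flower type)
# keep the label column and encode it as 0 for Setosa and 1 otherwise,
# so that every row has the three values plot_border expects
train = data.drop(["1.4", "0.2"], axis=1)
train["Iris-setosa"] = np.where(train["Iris-setosa"] == "Iris-setosa", 0.0, 1.0)
training_data = np.array(train)
training_data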

array([[4.9, 3. , 0. ],
       [4.7, 3.2, 0. ],
       [4.6, 3.1, 0. ],
       [5. , 3.6, 0. ],
       [5.4, 3.9, 0. ],
       [4.6, 3.4, 0. ],
       [5. , 3.4, 0. ],
       [4.4, 2.9, 0. ],
       [4.9, 3.1, 0. ],
       [5.4, 3.7, 0. ],
       [4.8, 3.4, 0. ],
       [4.8, 3. , 0. ],
       [4.3, 3. , 0. ],
       [5.8, 4. , 0. ],
       [5.7, 4.4, 0. ],
       [5.4, 3.9, 0. ],
       [5.1, 3.5, 0. ],
       [5.7, 3.8, 0. ],
       [5.1, 3.8, 0. ],
       [5.4, 3.4, 0. ],
       [5.1, 3.7, 0. ],
       [4.6, 3.6, 0. ],
       [5.1, 3.3, 0. ],
       [4.8, 3.4, 0. ],
       [5. , 3. , 0. ],
       [5. , 3.4, 0. ],
       [5.2, 3.5, 0. ],
       [5.2, 3.4, 0. ],
       [4.7, 3.2, 0. ],
       [4.8, 3.1, 0. ],
       [5.4, 3.4, 0. ],
       [5.2, 4.1, 0. ],
       [5.5, 4.2, 0. ],
       [4.9, 3.1, 0. ],
       [5. , 3.2, 0. ],
       [5.5, 3.5, 0. ],
       [4.9, 3.1, 0. ],
       [4.4, 3. , 0. ],
       [5.1, 3.4, 0. ],
       [5. , 3.5, 0. ],
       [4.5, 2.3, 0. ],
       [4.4, 3.2

In [18]:
# types: 0 for Setosa & 1 for Versicolor
# read_csv used the first Setosa sample as the header, so rows 0-98 hold the
# remaining 49 Setosa and the 50 Versicolor samples; row 99 is already Virginica
test = data.iloc[0:99, 4].values
test_data = np.where(test == 'Iris-setosa', 0, 1)
test_data

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [23]:
# Prediction function: prints "Iris-setosa" when the perceptron output is class 0
# (z < 0) and "Iris-versicolor" when it is class 1 (z >= 0).
# z is the weighted sum of the inputs using w1, w2 and the bias b, which must be
# defined (by the training loop) before this function is called.
def guess_flower(SepalLength, SepalWidth):
    z = w1*SepalLength + w2*SepalWidth + b
    if z < 0:
        print("Iris-setosa")
    else:
        print("Iris-versicolor")
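
guess_flower reads w1, w2 and b from global variables and prints its answer. A variant that takes the parameters as arguments and returns the label can be easier to test. The sketch below shows one possible shape; classify_flower is a hypothetical name, and the weights in the example are made up rather than trained.

```python
def classify_flower(sepal_length, sepal_width, w1, w2, b):
    """Return the predicted species name for one flower.

    Hypothetical variant of guess_flower: same rule (z < 0 means Setosa),
    but the weights and bias are passed in and the label is returned.
    """
    z = w1 * sepal_length + w2 * sepal_width + b
    return "Iris-setosa" if z < 0 else "Iris-versicolor"


# Made-up parameters for illustration only (not the result of training).
example_w1, example_w2, example_b = 1.0, 0.0, -5.5
short_sepal = classify_flower(4.9, 3.0, example_w1, example_w2, example_b)
long_sepal = classify_flower(6.4, 3.2, example_w1, example_w2, example_b)
```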

In [35]:
### visualize training data in 2D ###

# x-axis: sepal length, y-axis: sepal width
# use 2 colors to visualize 2 different classes of data 
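
One possible way to draw this scatter plot is sketched below, assuming each row is (sepal length, sepal width, class) with class 0 or 1. The function name plot_training_data and the toy array are hypothetical: the class-0 rows are copied from the output above, and the class-1 rows are invented stand-ins. The colors follow plot_border (blue for class 0, red for class 1).

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_training_data(points, ax=None):
    """Scatter rows of (sepal length, sepal width, class), one color per class."""
    points = np.asarray(points, dtype=float)
    if ax is None:
        _, ax = plt.subplots()
    class0 = points[points[:, 2] == 0]
    class1 = points[points[:, 2] == 1]
    ax.scatter(class0[:, 0], class0[:, 1], c="b", label="Setosa (0)")
    ax.scatter(class1[:, 0], class1[:, 1], c="r", label="Versicolor (1)")
    ax.set_xlabel("sepal length (cm)")
    ax.set_ylabel("sepal width (cm)")
    ax.legend()
    return ax


# Toy rows: class 0 from the output above, class 1 invented for illustration.
toy_points = np.array([
    [4.9, 3.0, 0], [4.7, 3.2, 0], [4.6, 3.1, 0],
    [6.4, 3.2, 1], [6.9, 3.1, 1], [6.5, 2.8, 1],
])
ax = plot_training_data(toy_points)
# In the notebook: plot_training_data(training_data)
```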


In [4]:
### training loop ###

# pick a learning rate

# initialize weights randomly and set bias to zero

# write a loop of arbitrary n iterations

    # if a point is 0 and is misclassified as 1:
        # update the weights accordingly
        
    # if a point is 1 and is misclassified as 0:
        # update the weights accordingly

# plot the final result

# BONUS: plot the result after each iteration
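
The steps listed above can be sketched roughly as follows. This is one possible reading of the outline, not the reference solution: it uses the classic perceptron rule (add the input for a 1 misclassified as 0, subtract it for a 0 misclassified as 1) and stops early once a full pass makes no mistakes. The names train_perceptron and predict_label, the default learning rate, and the toy rows are assumptions; the class-0 rows come from the output above and the class-1 rows are invented.

```python
import numpy as np


def predict_label(x1, x2, w, b):
    """Return 1 if the weighted sum is non-negative, otherwise 0."""
    z = w[0] * x1 + w[1] * x2 + b
    return 1 if z >= 0 else 0


def train_perceptron(points, learning_rate=0.1, n_epochs=10000, seed=0):
    """Train on rows of (x1, x2, label) with label 0 or 1; return (w, b)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=2)  # random initial weights
    b = 0.0                             # bias starts at zero
    for _ in range(n_epochs):
        errors = 0
        for x1, x2, label in points:
            predicted = predict_label(x1, x2, w, b)
            if predicted == label:
                continue
            # 0 misclassified as 1 gives a negative step,
            # 1 misclassified as 0 gives a positive step.
            step = learning_rate * (label - predicted)
            w += step * np.array([x1, x2])
            b += step
            errors += 1
        if errors == 0:  # a full pass with no mistakes: stop early
            break
    return w, b


# Toy rows: class 0 from the output above, class 1 invented for illustration.
toy_points = np.array([
    [4.9, 3.0, 0], [4.7, 3.2, 0], [4.6, 3.1, 0], [5.0, 3.6, 0],
    [6.4, 3.2, 1], [6.9, 3.1, 1], [6.5, 2.8, 1], [6.0, 2.9, 1],
])
w, b = train_perceptron(toy_points)
```

Because the toy points are linearly separable, the loop ends before the epoch cap; on data that is not separable it simply runs for n_epochs passes.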

In [None]:
### evaluation ###

# perform prediction on the test dataset
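
A minimal sketch of the evaluation step, assuming the features are an (n, 2) array and the labels are 0 or 1. The names predict_labels and accuracy are hypothetical, the weights are made up rather than trained, and the last two feature rows are invented so that the example contains one mistake.

```python
import numpy as np


def predict_labels(features, w, b):
    """Vectorized prediction: 1 where features @ w + b >= 0, else 0."""
    z = np.asarray(features, dtype=float) @ np.asarray(w, dtype=float) + b
    return (z >= 0).astype(int)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


# Made-up parameters and a tiny example; the last two rows are invented.
example_w, example_b = np.array([1.0, 0.0]), -5.5
example_features = np.array([[4.9, 3.0], [4.7, 3.2], [6.4, 3.2], [5.2, 2.7]])
example_labels = np.array([0, 0, 1, 1])

example_predictions = predict_labels(example_features, example_w, example_b)
example_accuracy = accuracy(example_labels, example_predictions)
```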



In [None]:
### plot the evaluation result ###


### Can the accuracy be improved given the limitations (linear function) of the perceptron algorithm?

Type your answer here:

In [None]:
# BONUS: Create a confusion matrix with the type of classification errors
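
For two classes, a confusion matrix is a 2x2 table of counts with true classes as rows and predicted classes as columns. The sketch below builds it with plain NumPy; confusion_matrix_2x2 is a hypothetical name, and the labels in the example are made up.

```python
import numpy as np


def confusion_matrix_2x2(y_true, y_pred):
    """Count outcomes for labels 0/1: rows are true class, columns are predicted."""
    matrix = np.zeros((2, 2), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[int(true_label), int(predicted_label)] += 1
    return matrix


# Made-up labels for illustration.
example_true = [0, 0, 1, 1, 1]
example_pred = [0, 1, 1, 1, 0]
example_matrix = confusion_matrix_2x2(example_true, example_pred)
```

In this layout, entry [0, 1] counts Setosa samples predicted as Versicolor, and entry [1, 0] counts Versicolor samples predicted as Setosa.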