# DATA 4319 (Statistical & Machine Learning)
## Lecture 1. The Perceptron Learning Model

In this notebook we will implement the perceptron learning model to classify data from the [iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set). Our task is to predict the species of a flower from measurements of its sepal length and width. This task is often referred to as the "Hello World" of machine learning.

You will need to import the following packages:
- CSV [(documentation)](https://juliadata.github.io/CSV.jl/stable/)
- DataFrames [(documentation)](https://dataframes.juliadata.org/stable/)
- Plots [(documentation)](http://docs.juliaplots.org/latest/)



In [None]:
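using CSV, DataFrames
# Provided you have a valid .csv file saved in your current working directory, you may
# load it as a DataFrame using the following syntax. Recent versions of CSV.jl require
# the sink type (here DataFrame) as the second argument.
iris = CSV.read("iris_data.csv", DataFrame)
# Keep only the first 100 rows and the first 5 columns
iris = iris[1:100, 1:5]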

In [None]:
# We will only use the sepal length and width (columns 1 and 2) for our analysis,
# together with the species label (column 5)
data = [x for x in zip(iris[:, 1], iris[:, 2], iris[:, 5])]

Notice that the following cell and its associated plot show that the data are linearly separable.


In [None]:
using Plots
scatter([x[1:2] for x in data if x[3] == "setosa"], label = "setosa")
scatter!([x[1:2] for x in data if x[3] != "setosa"], label = "versicolor")
plot!(title = "Iris 2-D Data", xlabel = "Sepal Length", ylabel = "Sepal Width")                      

In [None]:
# Assign X: input data (sepal length and width)
# Assign Y: known labels (1 for setosa, -1 otherwise)
X, Y = [[x[1], x[2]] for x in data], [x[3] == "setosa" ? 1 : -1 for x in data]

In [None]:
# Assign random weights
w = rand(3)

# Perceptron Hypothesis Function 
function h(w, x)
    x_new = [1.0, x[1], x[2]]
    return w'x_new > 0 ? 1 : -1
end
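
To make the hypothesis function concrete, here is a small worked example. The weight vector `w_demo` and the two input points are illustrative values chosen by hand, not taken from the iris data. The first weight acts as the bias, which is why `h` prepends `1.0` to the input.

```julia
# Same hypothesis function as above, repeated so this snippet runs on its own
function h(w, x)
    x_new = [1.0, x[1], x[2]]
    return w'x_new > 0 ? 1 : -1
end

# Illustrative weights (an assumption for this example): bias, then one weight per feature
w_demo = [-1.0, 2.0, -1.0]

# -1.0 + 2.0*3.0 - 1.0*1.0 = 4.0 > 0, so the prediction is 1
p1 = h(w_demo, [3.0, 1.0])

# -1.0 + 2.0*1.0 - 1.0*4.0 = -3.0, which is not > 0, so the prediction is -1
p2 = h(w_demo, [1.0, 4.0])
```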

In [None]:
# Perceptron Learning Algorithm 
function PLA(w, x, y)
    if h(w, x) != y
        w += y*[1.0, x[1], x[2]]
    end
    return w
end
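
A single update can be traced by hand. The sketch below uses made-up weights (`w_demo`) and one made-up labeled point (`x_demo`, `y_demo`) to show that a misclassified point pulls the weights toward its correct side, while a correctly classified point leaves them unchanged. In this particular example one update is enough to fix the point; in general several updates may be needed.

```julia
# Definitions repeated from above so this snippet runs on its own
function h(w, x)
    x_new = [1.0, x[1], x[2]]
    return w'x_new > 0 ? 1 : -1
end

function PLA(w, x, y)
    if h(w, x) != y
        w += y*[1.0, x[1], x[2]]
    end
    return w
end

w_demo = [0.0, 1.0, 1.0]            # illustrative starting weights
x_demo, y_demo = [2.0, 1.0], -1     # illustrative point with true label -1

before = h(w_demo, x_demo)          # 0 + 2 + 1 = 3 > 0, predicts 1: misclassified
w_next = PLA(w_demo, x_demo, y_demo)   # w_demo - [1.0, 2.0, 1.0] = [-1.0, -1.0, 0.0]
after = h(w_next, x_demo)           # -1 - 2 + 0 = -3, predicts -1: now correct
unchanged = PLA(w_next, x_demo, y_demo) == w_next   # no update for a correct point
```

Note that `PLA` returns a new vector rather than modifying its argument, so `w_demo` still holds the starting weights afterwards.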

In [None]:
# Iterate the PLA 1000 times
for i = 1:1000
    # Choose a random data point and update the weights if it is misclassified
    j = rand(1:100)
    w = PLA(w, X[j], Y[j])
end
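
A fixed number of random updates does not guarantee that every point ends up classified correctly. One common alternative, sketched here as a variant rather than as the method used in this notebook, is to sweep over all points repeatedly and stop once a full pass makes no update. The helper name `train_until_converged`, the `max_passes` cap, and the toy data are assumptions made for illustration.

```julia
# Definitions repeated from above so this snippet runs on its own
function h(w, x)
    x_new = [1.0, x[1], x[2]]
    return w'x_new > 0 ? 1 : -1
end

function PLA(w, x, y)
    if h(w, x) != y
        w += y*[1.0, x[1], x[2]]
    end
    return w
end

# Hypothetical helper: sweep the data until a full pass changes nothing
function train_until_converged(w, X, Y; max_passes = 1000)
    for pass in 1:max_passes
        updated = false
        for (x, y) in zip(X, Y)
            w_new = PLA(w, x, y)
            if w_new != w
                updated = true
                w = w_new
            end
        end
        if !updated
            return w, pass      # clean pass: every point is classified correctly
        end
    end
    return w, max_passes        # cap reached without a clean pass
end

# Toy linearly separable data (made up for this sketch)
X_toy = [[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
Y_toy = [1, 1, -1, -1]

w_toy, passes = train_until_converged(zeros(3), X_toy, Y_toy)
n_errors = count(h(w_toy, x) != y for (x, y) in zip(X_toy, Y_toy))
```

The cap on passes matters because the sweep would never terminate on data that are not linearly separable.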

In [None]:
# Count the misclassified points
temp_error = 0
for i = 1:100
    if h(w, X[i]) != Y[i]
        temp_error += 1
    end
end
println("After 1000 iterations, we have $temp_error misclassified points")
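
The learned weights define the line `w[1] + w[2]*x1 + w[3]*x2 = 0` that separates the two predicted classes. As a sketch, the hypothetical helper `boundary` below solves this equation for the second feature, assuming `w[3]` is nonzero. The weights `w_demo` are illustrative and are not the result of the training loop above.

```julia
# Hypothetical helper: sepal width on the decision boundary for a given sepal length.
# Solves w[1] + w[2]*x1 + w[3]*x2 = 0 for x2; assumes w[3] != 0.
boundary(w, x1) = -(w[1] + w[2]*x1) / w[3]

w_demo = [-1.0, 2.0, -4.0]          # illustrative weights
x2 = boundary(w_demo, 2.5)          # -(-1.0 + 5.0) / -4.0 = 1.0

# A point on the boundary gives a zero inner product with the weights
on_line = w_demo'*[1.0, 2.5, x2]
```

With Plots loaded and a trained `w`, a call along the lines of `plot!(x -> boundary(w, x), 4, 7.5, label = "boundary")` should overlay this line on the scatter plot; the range 4 to 7.5 is a guess at the sepal lengths in the data.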
