# Perceptron

Neural networks are among the most powerful machine learning algorithms in use today.

Today we will implement the simplest neural network possible. It consists of a single neuron and is called the __Perceptron__.

The perceptron works well for two-class classification problems, particularly when the two classes are linearly separable. To solve more complex tasks, many perceptrons can be combined!

The perceptron algorithm draws inspiration from the way a single cell, called a "neuron", processes information. Each neuron accepts input data via its dendrites and passes it as a signal to its body. In a similar fashion, a perceptron takes the features of a training example as input signals and combines them in a linear equation called the _activation_, which is defined as follows:

$$ activation = bias + \sum_{i=1}^{N}(weight_i \cdot x_i) $$

The activation is then thresholded to output a value or prediction.

$$ prediction = \begin{cases} 1.0 & \text{if } activation \geq 0.0 \\ 0.0 & \text{otherwise} \end{cases} $$
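The two formulas above can be sketched in a few lines of Scala. This is only an illustration on plain `Double` values: the names `activationOf` and `stepPrediction` are hypothetical and are not part of the notebook's code, which works on its own `Data` type.

```scala
// Hypothetical helper names; plain Doubles stand in for the notebook's Data type.

// Weighted sum of the inputs plus the bias term.
def activationOf(bias: Double, weights: Vector[Double], inputs: Vector[Double]): Double = {
  require(weights.length == inputs.length, "one weight per input is expected")
  bias + weights.zip(inputs).map { case (w, x) => w * x }.sum
}

// Step function: 1.0 when the activation is non-negative, 0.0 otherwise.
def stepPrediction(activation: Double): Double =
  if (activation >= 0.0) 1.0 else 0.0

// Example with made-up numbers: 0.5 + (1.0 * 2.0) + (-2.0 * 3.0) = -3.5
val exampleActivation = activationOf(0.5, Vector(1.0, -2.0), Vector(2.0, 3.0))
val examplePrediction = stepPrediction(exampleActivation) // 0.0, since -3.5 < 0.0
```

Note that an activation of exactly zero falls on the positive side of the threshold, so it is predicted as class 1.0.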

To determine the _weights_ (just another name for _coefficients_) we use __Stochastic Gradient Descent__.

Let's start our implementation by loading the code and libraries we'll need. We will build our solution on top of the ones we implemented in the [previous notebook](https://github.com/jesus-a-martinez-v/toy-ml/blob/master/src/main/scala/notebooks/logistic_regression.ipynb).

In [1]:
import $ivy.`com.github.tototoshi::scala-csv:1.3.5`
import $file.^.datasmarts.ml.toy.scripts.LogisticRegression, LogisticRegression._
import scala.util.Random

## Data

We'll use the [Sonar](https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data) dataset. The task is to predict whether an object is a mine or a rock given the strength of sonar returns at various angles. It is a binary classification problem, which makes it a good fit for our perceptron.


Let's load the data:

In [2]:
val BASE_DATA_PATH = "../../resources/data"
val sonarPath = s"$BASE_DATA_PATH/10/sonar.all-data.csv"

val rawData = loadCsv(sonarPath)
val numberOfRows = rawData.length
val numberOfColumns = rawData.head.length
println(s"Number of rows in dataset: $numberOfRows")
println(s"Number of columns in dataset: $numberOfColumns")

// Convert every feature column from text to numbers, then encode the
// class label (the last column) as an integer.
val (data, lookUpTable) = {
    val dataWithNumericColumns = (0 until (numberOfColumns - 1)).toVector.foldLeft(rawData) { (d, i) => textColumnToNumeric(d, i)}
    categoricalColumnToNumeric(dataWithNumericColumns, numberOfColumns - 1)
}

Number of rows in dataset: 208
Number of columns in dataset: 61


BASE_DATA_PATH: String = "../../resources/data"
sonarPath: String = "../../resources/data/10/sonar.all-data.csv"
rawData: Vector[Vector[Data]] = Vector(
  Vector(
    Text(0.0200),
    Text(0.0371),
    Text(0.0428),
    Text(0.0207),
    Text(0.0954),
    Text(0.0986),
    Text(0.1539),
    Text(0.1601),
    Text(0.3109),
    Text(0.2111),
...
numberOfRows: Int = 208
numberOfColumns: Int = 61
data: Vector[Vector[Data]] = Vector(
  Vector(
    Numeric(0.02),
    Numeric(0.0371),
    Numeric(0.0428),
    Numeric(0.0207),
    Numeric(0.0954),
    Numeric(0.0986),
    Numeric(0.1539),
    Numeric(0.1601),
    Numeric(0.3109),
    Numeric(0.2111),
...
lookUpTable: Map[Data, Int] = Map(

## Making Predictions

Let's implement a function that makes a prediction for a row, given some fitted weights. It will be useful both during training and when evaluating on the test set.

In [3]:
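def predictWithWeights(row: Vector[Data], weights: Vector[Double]): Double = {
  // Every column except the last one (the label) is an input feature.
  val indices = row.indices.init

  // weights.head is the bias; weights(index + 1) pairs with the feature at index.
  val activation = indices.foldLeft(0.0) { (accumulator, index) =>
    accumulator + weights(index + 1) * getNumericValue(row(index)).get
  } + weights.head

  // Step function: threshold the activation at zero.
  if (activation >= 0.0) 1.0 else 0.0
}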

defined function predictWithWeights

Let's test it on a mock dataset:

In [4]:
val mockDataset = Vector(
    (2.7810836, 2.550537003, 0),
    (1.465489372, 2.362125076, 0),
    (3.396561688, 4.400293529, 0),
    (1.38807019, 1.850220317, 0),
    (3.06407232, 3.005305973, 0),
    (7.627531214, 2.759262235, 1),
    (5.332441248, 2.088626775, 1),
    (6.922596716, 1.77106367, 1),
    (8.675418651, -0.242068655, 1),
    (7.673756466, 3.508563011, 1)).map{ case (x1, x2, y) => Vector(Numeric(x1), Numeric(x2), Numeric(y)) }

// Bias first, then one weight per feature.
val mockWeights = Vector(-0.1, 0.20653640140000007, -0.23418117710000003)

mockDataset.foreach { case row @ Vector(Numeric(x1), Numeric(x2), Numeric(y)) => 
    val predicted = predictWithWeights(row, mockWeights)
    println(s"Expected=$y, Predicted=$predicted")
}

Expected=0.0, Predicted=0.0
Expected=0.0, Predicted=0.0
Expected=0.0, Predicted=0.0
Expected=0.0, Predicted=0.0
Expected=0.0, Predicted=0.0
Expected=1.0, Predicted=1.0
Expected=1.0, Predicted=1.0
Expected=1.0, Predicted=1.0
Expected=1.0, Predicted=1.0
Expected=1.0, Predicted=1.0
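
As a check by hand, the first and sixth mock rows can be plugged into the activation formula together with the mock weights. The inputs and weights are shown rounded to four decimals here, while the results come from the unrounded values:

$$ activation_1 \approx -0.1 + 0.2065 \cdot 2.7811 - 0.2342 \cdot 2.5505 \approx -0.1229 < 0 \Rightarrow prediction = 0.0 $$

$$ activation_6 \approx -0.1 + 0.2065 \cdot 7.6275 - 0.2342 \cdot 2.7593 \approx 0.8292 \geq 0 \Rightarrow prediction = 1.0 $$

Both agree with the expected classes printed above.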


mockDataset: Vector[Vector[Numeric]] = Vector(
  Vector(Numeric(2.7810836), Numeric(2.550537003), Numeric(0.0)),
  Vector(Numeric(1.465489372), Numeric(2.362125076), Numeric(0.0)),
  Vector(Numeric(3.396561688), Numeric(4.400293529), Numeric(0.0)),
  Vector(Numeric(1.38807019), Numeric(1.850220317), Numeric(0.0)),
  Vector(Numeric(3.06407232), Numeric(3.005305973), Numeric(0.0)),
  Vector(Numeric(7.627531214), Numeric(2.759262235), Numeric(1.0)),
  Vector(Numeric(5.332441248), Numeric(

Now we are ready to implement stochastic gradient descent to train the weights of the perceptron. Let's do it.

## Estimating Weights

Now that we have a prediction function in place, the next step is to implement a function that estimates the weights used later on in the pipeline.

We will do so with Stochastic Gradient Descent, which requires two parameters:

 - __Learning Rate__: It is used to control the amount of correction each parameter will receive at a time.
 - __Number of epochs__: Number of times the algorithm will loop over all the data, updating the weights.
 
The outline of the algorithm is as follows:

 1. Loop over each epoch.
 2. Loop over each row in the training set.
 3. Loop over each weight and update it for a row in an epoch.
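
Written out, the update applied in step 3 is the classic perceptron learning rule. The symbols below simply mirror the names used in this notebook, with $expected$ standing for the label of the current row:

$$ error = expected - predicted $$

$$ bias = bias + learningRate \cdot error $$

$$ weight_i = weight_i + learningRate \cdot error \cdot x_i $$

When a row is classified correctly the error is zero, so the bias and the weights are left unchanged for that row.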

In [7]:
def trainWeights(train: Dataset, learningRate: Double, numberOfEpochs: Int): Vector[Double] = {
  // One weight per feature plus the bias, which is stored at index 0.
  var weights = Vector.fill(train.head.length)(0.0)

  for {
    _ <- 1 to numberOfEpochs
    row <- train
  } {
    val predicted = predictWithWeights(row, weights)
    val actual = getNumericValue(row.last).get
    val error = actual - predicted

    // The bias has no associated input, so it is updated with the error alone.
    val bias = weights.head + learningRate * error
    val indices = row.indices.init

    // Each remaining weight moves in proportion to the error and its input value.
    val remainingWeights = indices.foldLeft(weights) { (w, index) =>
      val input = getNumericValue(row(index)).get
      updatedVector(w, w(index + 1) + learningRate * error * input, index + 1)
    }

    weights = Vector(bias) ++ remainingWeights.tail
  }

  weights
}

defined function trainWeights

Let's get the weights for our mock dataset:

In [8]:
trainWeights(mockDataset, 0.1, 5)

res7: Vector[Double] = Vector(-0.1, 0.20653640140000007, -0.23418117710000003)
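
To see how the algorithm arrives at these values, here is a self-contained sketch that repeats the same training on plain `Double` values and also counts the misclassified rows in each epoch. The names `Sample`, `predictSketch` and `trainSketch` are hypothetical and only serve this illustration; the notebook's own `trainWeights` remains the reference.

```scala
// Hypothetical, self-contained re-implementation on plain Doubles
// (the notebook's own trainWeights works on its Data type instead).
final case class Sample(inputs: Vector[Double], label: Double)

val mockSamples = Vector(
  Sample(Vector(2.7810836, 2.550537003), 0.0),
  Sample(Vector(1.465489372, 2.362125076), 0.0),
  Sample(Vector(3.396561688, 4.400293529), 0.0),
  Sample(Vector(1.38807019, 1.850220317), 0.0),
  Sample(Vector(3.06407232, 3.005305973), 0.0),
  Sample(Vector(7.627531214, 2.759262235), 1.0),
  Sample(Vector(5.332441248, 2.088626775), 1.0),
  Sample(Vector(6.922596716, 1.77106367), 1.0),
  Sample(Vector(8.675418651, -0.242068655), 1.0),
  Sample(Vector(7.673756466, 3.508563011), 1.0)
)

// weights.head is the bias; the tail holds one weight per input.
def predictSketch(weights: Vector[Double], inputs: Vector[Double]): Double = {
  val activation = weights.head + weights.tail.zip(inputs).map { case (w, x) => w * x }.sum
  if (activation >= 0.0) 1.0 else 0.0
}

// Applies the perceptron update rule row by row and records how many rows
// were misclassified during each epoch.
def trainSketch(samples: Vector[Sample], learningRate: Double, epochs: Int): (Vector[Double], Vector[Int]) = {
  val initial = Vector.fill(samples.head.inputs.length + 1)(0.0)
  (1 to epochs).foldLeft((initial, Vector.empty[Int])) { case ((startWeights, history), _) =>
    val (endWeights, mistakes) = samples.foldLeft((startWeights, 0)) { case ((w, count), sample) =>
      val error = sample.label - predictSketch(w, sample.inputs)
      val updated = (w.head + learningRate * error) +:
        w.tail.zip(sample.inputs).map { case (wi, xi) => wi + learningRate * error * xi }
      (updated, if (error != 0.0) count + 1 else count)
    }
    (endWeights, history :+ mistakes)
  }
}

val (sketchWeights, mistakesPerEpoch) = trainSketch(mockSamples, 0.1, 5)
// mistakesPerEpoch == Vector(2, 1, 0, 0, 0)
```

On this tiny dataset the weights stop changing after the second epoch, because every row is then classified correctly and the error term is zero.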

## Perceptron

We have all that we need to implement a simple perceptron algorithm:

In [11]:
def perceptron(train: Dataset, test: Dataset, parameters: Parameters) = {
  val learningRate = parameters("learningRate").asInstanceOf[Double]
  val numberOfEpochs = parameters("numberOfEpochs").asInstanceOf[Int]

  val weights = trainWeights(train, learningRate, numberOfEpochs)

  test.map { row =>
    Numeric(predictWithWeights(row, weights))
  }
}

defined function perceptron

Good. We just need to unpack the relevant parameters, use SGD to obtain the weights and then use them to make predictions on the test set.

Let's now test our new algorithm on the Sonar dataset.

We'll first run a baseline model on it, then our freshly implemented perceptron, and finally compare their performance.

As a baseline we will use a __zero rule classifier__, which ignores the features and always predicts the most common class in the training data.
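
For reference, a zero rule baseline can be sketched as follows. This assumes the notebook's `zeroRuleClassifier` follows the usual definition (predict the most frequent training label for every test row); the function name `zeroRuleSketch` and the use of plain `Double` labels are hypothetical.

```scala
// Hypothetical sketch: labels are plain Doubles rather than the notebook's Data type.
// Predicts the most frequent training label for every test row.
def zeroRuleSketch(trainLabels: Vector[Double], testSize: Int): Vector[Double] = {
  require(trainLabels.nonEmpty, "at least one training label is needed")
  val (mostCommonLabel, _) =
    trainLabels.groupBy(identity).maxBy { case (_, occurrences) => occurrences.length }
  Vector.fill(testSize)(mostCommonLabel)
}

// 1.0 appears three times and 0.0 once, so every prediction is 1.0.
val zeroRulePredictions = zeroRuleSketch(Vector(1.0, 0.0, 1.0, 1.0), testSize = 3)
```

Any model worth keeping should beat this baseline, since it learns nothing from the features.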

In [12]:
// Normalize data
val minMax = getDatasetMinAndMax(data)
val normalizedData = normalizeDataset(data, minMax)

val baselineAccuracy = evaluateAlgorithmUsingTrainTestSplit[Numeric](
        normalizedData, 
        (train, test, parameters) => zeroRuleClassifier(train, test), 
        Map.empty, 
        accuracy, 
        trainProportion=0.8)

println(s"Zero rule baseline accuracy: $baselineAccuracy")

Zero rule baseline accuracy: 0.5714285714285714


minMax: MinMaxData = Vector(
  Some((0.0015, 0.1371)),
  Some((6.0E-4, 0.2339)),
  Some((0.0015, 0.3059)),
  Some((0.0058, 0.4264)),
  Some((0.0067, 0.401)),
  Some((0.0102, 0.3823)),
  Some((0.0033, 0.3729)),
  Some((0.0055, 0.459)),
  Some((0.0075, 0.6828)),
  Some((0.0113, 0.7106)),
  Some((0.0289, 0.7342)),
...
normalizedData: Dataset = Vector(
  Vector(
    Numeric(0.1364306784660767),
    Numeric(0.15645092156022286),
    Numeric(0.13567674113009198),
    Numeric(0.03542558250118878),
    Numeric(0.22495561755008875),
    Numeric(0.2375705455522709),
    Numeric(0.4074675324675

In [14]:
val perceptronAccuracy = evaluateAlgorithmUsingTrainTestSplit[Numeric](
        normalizedData, 
        perceptron, 
        Map("learningRate" -> 0.01, "numberOfEpochs" -> 500), 
        accuracy, 
        trainProportion=0.8)

println(s"Perceptron accuracy: $perceptronAccuracy")

Perceptron accuracy: 0.7857142857142857


perceptronAccuracy: Double = 0.7857142857142857

The difference in performance is remarkable: the perceptron reaches an accuracy of about 0.79, compared to roughly 0.57 for the baseline.
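
The accuracy values printed above lie between 0 and 1. Assuming, as the name suggests, that the `accuracy` metric from the previous notebook is the fraction of test rows whose predicted label equals the actual one, it can be sketched like this (the name `accuracySketch` is hypothetical):

```scala
// Hypothetical sketch: fraction of positions where prediction and label agree.
def accuracySketch(actual: Vector[Double], predicted: Vector[Double]): Double = {
  require(actual.nonEmpty && actual.length == predicted.length,
    "inputs must be non-empty and of equal length")
  val correct = actual.zip(predicted).count { case (a, p) => a == p }
  correct.toDouble / actual.length
}

// 3 of 4 predictions match, so the result is 0.75.
val exampleAccuracy = accuracySketch(Vector(1.0, 0.0, 1.0, 1.0), Vector(1.0, 0.0, 0.0, 1.0))
```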

Like many algorithms in the _regression family_, the perceptron is very simple at its core, yet it is powerful when used in the right context.

Moreover, the perceptron is the basic building block of far more complex architectures that are pushing forward the boundaries of machine learning and AI in general.

Applications such as self-driving cars, flying cars, robot assistants and face recognition systems (and many more) rely on complex neural networks that, in the end, are composites of many units similar to the perceptron, each specialized in noticing a particular feature in the data.

Quite exciting, huh? ;)