# The Perceptron
Let us implement a perceptron.
As a reminder, our perceptron takes multiple inputs, weights each of them with a certain factor and checks if the sum is bigger than a threshold.
![Perceptron](./images/perceptron.png)
First we are going to collect the data to run this example:

In [None]:
! mkdir -p ~/data/workshop_data
! wget -c --retry-connrefused --tries=0 https://archive.ics.uci.edu/ml/machine-learning-databases/00357/occupancy_data.zip -O ~/data/workshop_data/occupancy_data.zip
! unzip ~/data/workshop_data/occupancy_data.zip -d ~/data/workshop_data/occupancy_data

In [None]:
# Let's start by importing the relevant packages
# matplotlib for plots
import matplotlib as mpl
from matplotlib import pyplot as plt
# pandas to read in some data
import pandas as pd
# numpy to build our first perceptron
import numpy as np
# train_test_split to validate our findings from the perceptron training
from sklearn.model_selection import train_test_split
# MinMaxScaler to normalise the data before inputting them to the perceptron
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline
mpl.rcParams['figure.figsize'] = (16, 9)
import os

home = os.path.expanduser("~")
data = home + '/data/workshop_data/occupancy_data/datatraining.txt'

## Occupancy Detection Dataset
For training the perceptron we will use the [occupancy detection dataset](https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+) from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets.html?task=&area=&type=ts&view=table). It contains experimental data for binary classification: predicting whether or not a person is in a room, given temperature, humidity, light and CO$_2$.
	


In [None]:
# Load the occupancy data so we have something to predict
df = pd.read_csv(data)
target = 'Occupancy'
# Let us drop the date for the time being, as it is not a cyclic feature: a given timestamp never occurs again
features = [col for col in df.columns if target not in col and 'date' not in col]
df.head()

In [None]:
print(df.min(), df.max())

## Split data into train and test
First we split the data so that we can validate what we learned on data the perceptron has not seen before. We will use [train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) from the sklearn package for this.
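
As a minimal sketch of what the split does, the toy example below uses made-up data. The names `toy_x` and `toy_y` and the `random_state` value are assumptions for illustration and are not part of this notebook. With its default settings, `train_test_split` shuffles the rows and holds out 25% of them for testing, keeping inputs and targets aligned.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 8 samples with 2 features each.
# Row i is [2*i, 2*i + 1] and its target is i % 2.
toy_x = np.arange(16).reshape(8, 2)
toy_y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# random_state is fixed here only to make the sketch reproducible.
tx_train, tx_test, ty_train, ty_test = train_test_split(
    toy_x, toy_y, random_state=0
)

print(tx_train.shape, tx_test.shape)
```

Each row of `tx_train` still belongs to the matching entry of `ty_train`, which is why both arrays are passed to the same call.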

In [None]:
# Pass the target as a pandas Series so that y_train.values works in the cells below
x_train, x_test, y_train, y_test = train_test_split(df[features], df[target])

## Normalize the data
We will normalize the data to lie in the range from 0 to 1. This puts all features on the same scale, so that the weights end up in the same order of magnitude. Otherwise, the perceptron would first need to learn the range of each feature and only then how to best separate the data.

Let's have a look at an example for this:
- The minimum of the light is 0 and its maximum is 1546.33. If we start with an initial weight between 0 and 1, its weighted contribution also lies between 0 and 1546.33.
- The minimum of the humidity ratio is 0.00267 and its maximum is 0.00647601. If we start with an initial weight between 0 and 1, its weighted contribution is at most 0.00647601.
- The issue is that these contributions are summed. In the worst case, the initial contribution of the light could be $1546.33/0.00267 \approx 580000$ times higher than that of the humidity ratio.
- The algorithm would then first need to learn to decrease the weight for the light a lot and to increase the one for the humidity ratio a lot.
- This can be avoided if we scale all of the features to lie between 0 and 1. This is easily done using sklearn's [MinMaxScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html).
- We have to make sure that we fit the scaler only on the training data and merely transform the test data with it. Otherwise, the two sets would be scaled differently and the learned weights of the perceptron would mean something different for the training set than for the test set.
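
To make the scaling concrete, here is a small sketch on made-up numbers. The arrays `toy_train` and `toy_test` are assumptions for illustration, not values from the dataset. It applies the min-max formula $(x - x_{min}) / (x_{max} - x_{min})$ by hand, using only the training minimum and maximum, and compares the result with `MinMaxScaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: one large-scale and one small-scale feature.
toy_train = np.array([[0.0, 0.0027], [500.0, 0.0040], [1500.0, 0.0065]])
toy_test = np.array([[750.0, 0.0050], [2000.0, 0.0020]])

# Minimum and maximum are taken from the training data only.
col_min = toy_train.min(axis=0)
col_max = toy_train.max(axis=0)
manual_train = (toy_train - col_min) / (col_max - col_min)
manual_test = (toy_test - col_min) / (col_max - col_min)

# The same statistics are learned by fit() and reused by transform().
toy_scaler = MinMaxScaler().fit(toy_train)
print(np.allclose(manual_train, toy_scaler.transform(toy_train)))
print(manual_test)
```

Note that test values outside the training range, like the second test row here, end up outside the interval from 0 to 1. This is expected when the scaler is fitted on the training data only.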

In [None]:
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

## Build the perceptron
To build and train a perceptron we repeat the following steps:
- Calculate the perceptron's output $\hat{y} = \left(\sum_i w_i X_i \geq 0\right)$ (this can be done in numpy using np.dot [docs](https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html))
- Determine the update for the weights using the error and the learning rate: $\Delta w_i = \alpha (y-\hat{y}) X_i$
- Calculate new weights as: $w_i \leftarrow w_i + \Delta w_i$
- Repeat the above steps until no more updates occur (we will iterate once over the dataset instead)
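
The notebook itself only makes a single pass over the data. As a hedged sketch of the full "repeat until no more updates occur" loop, the toy example below trains on the logical OR function. The data, the constant first input (so that its weight can act as the threshold), the learning rate and the `max_epochs` guard are all assumptions for illustration.

```python
import numpy as np

# Hypothetical toy data: logical OR of the last two inputs.
# The first input is a constant 1, an assumption made here so that
# its weight can shift the decision threshold.
toy_x = np.array([[1.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0]])
toy_y = np.array([0.0, 1.0, 1.0, 1.0])

toy_w = np.zeros(3)
toy_alpha = 1.0     # illustrative learning rate
max_epochs = 100    # safety guard in case the data is not separable

for epoch in range(max_epochs):
    n_updates = 0
    for x_i, y_i in zip(toy_x, toy_y):
        # Step 1: thresholded weighted sum
        y_hat = float(np.dot(toy_w, x_i) >= 0)
        # Step 2: weight update from error and learning rate
        delta_w = toy_alpha * (y_i - y_hat) * x_i
        # Step 3: apply the update if there is one
        if np.any(delta_w != 0):
            toy_w = toy_w + delta_w
            n_updates += 1
    # Step 4: stop once a full pass produced no updates
    if n_updates == 0:
        break

toy_predictions = (toy_x @ toy_w >= 0).astype(float)
print(toy_w, toy_predictions)
```

The loop only terminates by itself when the classes are linearly separable, which is why the sketch caps the number of passes.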

In [None]:
# initialize the weights
w = np.random.rand(len(features))
print("initial weights: {}".format(w))
# set a learning rate
alpha = 1e-2

In [None]:
def calculate_perceptron_output(w, x):
    # Calculate the perceptron's output using
    # np.dot(w, x) to calculate the sum and
    # thresholding the output using >=0
    # Hint: You will need to use .astype(float) to cast
    # the output to a float
    return (np.dot(w, x) >= 0).astype(float)

In [None]:
def calculate_weight_update(alpha, y, y_hat, x):
    # Calculate the update of the weights from the learning rate,
    # the error (y - y_hat) and the input
    return alpha * (y - y_hat) * x

In [None]:
def update_weights(w, delta_w):
    # Add the weight change to the current weights
    return w + delta_w

In [None]:
for x, y in zip(x_train, y_train.values):
    y_hat = calculate_perceptron_output(w, x)
    delta_w = calculate_weight_update(alpha, y, y_hat, x)
    w = update_weights(w, delta_w)

In [None]:
results = []
expected = []
for x in x_train:
    results.append(calculate_perceptron_output(w, x))
results = np.array(results)
expected = np.array(y_train.values)
print("final weights: {}".format(w))

In [None]:
print("accuracy: {}".format(np.mean(results == expected)))

## Let us change to PyTorch

To use PyTorch instead of numpy we need to replace all calls to numpy with calls to PyTorch.
- We need to import torch
- np.dot needs to be replaced with [torch.dot](https://pytorch.org/docs/stable/torch.html#torch.dot)
- numpy arrays need to be converted to torch tensors using [torch.from_numpy](https://pytorch.org/docs/stable/torch.html#torch.from_numpy)
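
A minimal sketch of these replacements on made-up values follows. The names `toy_w` and `toy_x` are assumptions for illustration. It also assumes a recent PyTorch version, in which a comparison returns a bool tensor that has to be cast before it can be used in arithmetic such as $y - \hat{y}$.

```python
import numpy as np
import torch

# torch.from_numpy keeps the numpy dtype (float64 here).
toy_w = torch.from_numpy(np.array([0.5, -0.25]))
toy_x = torch.from_numpy(np.array([1.0, 1.0]))

# torch.dot expects two 1-D tensors of the same dtype.
raw = torch.dot(toy_w, toy_x) >= 0

# Cast the comparison result back to the dtype of the weights.
toy_y_hat = raw.to(toy_w.dtype)
print(raw.dtype, toy_y_hat)
```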

In [None]:
import torch

In [None]:
def calculate_perceptron_output_torch(w, x):
    # Calculate the perceptron's output using
    # torch.dot instead of np.dot.
    # Hint: in recent PyTorch versions the comparison returns a bool
    # tensor, which cannot be subtracted from y, so cast it to w's dtype.
    return (torch.dot(w, x) >= 0).to(w.dtype)

In [None]:
w = np.random.rand(len(features))
alpha = np.array(alpha)
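# Convert w, alpha, x_train and y_train to torch tensors
w = torch.from_numpy(w)
alpha = torch.from_numpy(alpha)
x_ttrain = torch.from_numpy(x_train)
y_ttrain = torch.from_numpy(np.asarray(y_train))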
print("initial weights: {}".format(w))
for x, y in zip(x_ttrain, y_ttrain):
    # Use the new torch function to calculate the perceptron's output
    y_hat = calculate_perceptron_output_torch(w, x)
    # The weight update works as before as no specific calls
    # to numpy were made
    delta_w = calculate_weight_update(alpha, y, y_hat, x)
    w = update_weights(w, delta_w)
print("final weights: {}".format(w))

In [None]:
results = []
expected = []
for x, y in zip(x_ttrain, y_ttrain):
    result = calculate_perceptron_output_torch(w, x)
    expected.append(y)
    results.append(result)
results = torch.stack(results)
expected = torch.stack(expected)
print("weights: {}".format(w))
print("accuracy: {}".format((results == expected.byte()).float().mean()))

In [None]:
ax = df[df.Occupancy==1].plot(x='CO2', y='Light', ls='', marker='o', ms=3, color='r', label='occupied')
df[df.Occupancy==0].plot(x='CO2', y='Light', ls='', marker='o', ms=3, color='b', ax=ax, label='empty')
ax.set_ylabel('Light')