# Week 03 (practical 2)

In this practical we will implement different versions of the Perceptron model introduced in last week's lectures. As presented there, the simplest Perceptron is a model that implements the following function:
\begin{equation*}
y=\Theta(w_1x_1 + w_2x_2 + \dots + w_kx_k + b)=\Theta(wx+b)
\end{equation*}
where:
\begin{equation*}
\Theta(v)=
  \begin{cases}
   1 & \text{if } v \geq 0 \\
   0       & \text{if } v < 0 \end{cases}
\end{equation*}



## Task 1: Using a perceptron to implement logical functions
In this task you will use a perceptron to implement different logical functions: NOT, AND, OR and XOR. For a logical function, both the input and the output have only two possible states: 0 and 1 (i.e., False and True). The binary step activation function fits the purpose well since it produces a binary output.

In [1]:
import numpy as np

***
**T1.1 Implementing the step binary activation function**

In this task we implement the binary step function introduced during the lecture. The method takes one input. If the input value is greater than or equal to zero, the method returns one; otherwise it returns zero. Note that the method `step_binary` implemented below works for both scalar and vector inputs. You can test it by passing a scalar and a numpy array as input to the method.

In [2]:
def step_binary(v):
    # v >= 0 (not v > 0), so that an input of exactly zero maps to 1 as in the definition of Theta
    return 1 * (v >= 0)
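
As a quick check of the scalar and vector behaviour described above, the following minimal sketch redefines `step_binary` (using the $v \geq 0$ convention from the definition of $\Theta$) and applies it to one scalar and one numpy array. The input values are arbitrary choices made for illustration.

```python
import numpy as np

def step_binary(v):
    # 1 where v >= 0, otherwise 0 (same convention as Theta above)
    return 1 * (v >= 0)

# Scalar input: the comparison yields a bool, and 1 * bool gives an int
scalar_out = step_binary(-0.3)

# Vector input: the comparison is applied element-wise
vector_out = step_binary(np.array([-2.0, 0.0, 3.5]))

print(scalar_out)   # 0
print(vector_out)   # [0 1 1]
```

Note that the boundary value $0$ maps to $1$, which is what distinguishes `>=` from `>`.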

***
**T1.2 Implementing perceptron**

In this task we implement a method which performs the operation of a perceptron. It takes three inputs: `x`, `w` and `b`. `x` represents the input variables, `w` represents the weights of the perceptron and `b` stands for the bias. The method returns $\Theta(wx+b)$, where $\Theta$ is the activation function. Note that we multiply `x` and `w` using the `dot` function, rather than `@` as was done during the lectures. This is because in some of the exercises `x` and `w` will be scalars, and `@` only works with vectors/matrices. As discussed during the lectures, we can use either `@` or `dot`; we just need to ensure that `x` and `w` have appropriate shapes.

In [3]:
def perceptron(x,w,b):
    v = np.dot(x,w)+b 
    out = step_binary(v)
    return out
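
The difference between `dot` and `@` mentioned above can be illustrated with a short sketch. It assumes a reasonably recent numpy, where `np.matmul` (the function that implements `@` for arrays) rejects scalar operands; the numeric values are arbitrary.

```python
import numpy as np

# Scalars: np.dot falls back to ordinary multiplication
scalar_dot = np.dot(2.0, -1.0)

# 1-D vectors: np.dot and @ both compute the inner product
x = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])
dot_result = np.dot(x, w)
matmul_result = x @ w

# np.matmul does not accept scalar (0-dimensional) operands
try:
    np.matmul(2.0, -1.0)
    scalar_matmul_ok = True
except (ValueError, TypeError):
    scalar_matmul_ok = False

print(scalar_dot, dot_result, matmul_result, scalar_matmul_ok)
```

This is why `np.dot` is the more convenient choice when the same `perceptron` function has to serve both the one-input NOT gate and the two-input gates.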

***
**T1.3 Implementing NOT logical function**

$NOT(x)$ is a one-variable function, which means that we have one input at a time (number of input variables $k=1$). In this task we implement a method which takes a single value as input (we can skip validation of the input and assume that it is always correct: zero or one) and returns its negation using the perceptron method with appropriate values of the parameters `w` and `b`. The main challenge of this task is finding values of `w` and `b` that allow the perceptron to implement the NOT logical function. There are many different solutions that we could use; in this example we use $w=-1$ and $b = 0.5$.

In [4]:
def NOT_(x):
    return perceptron(x, -1, 0.5)

Call the method, passing $0$ or $1$ as the parameter, to check whether it works correctly (i.e. returns the negation of the input).

In [5]:
NOT_(0),NOT_(1)

(1, 0)
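
Since many parameter choices work for NOT, it can help to test candidates against the truth table. The sketch below is plain Python; the helper `implements_not` and the candidate list are hypothetical, introduced only for illustration. With the $v \geq 0$ convention, NOT requires $b \geq 0$ (so that input 0 gives 1) and $w + b < 0$ (so that input 1 gives 0).

```python
def step_binary(v):
    # Scalar-only version of the binary step function
    return 1 if v >= 0 else 0

def implements_not(w, b):
    # True if the perceptron with parameters (w, b) negates both 0 and 1
    return all(step_binary(w * x + b) == 1 - x for x in (0, 1))

# Hypothetical candidate (w, b) pairs
candidates = [(-1, 0.5), (-2, 1), (-1, 1.5), (1, -0.5)]
results = {c: implements_not(*c) for c in candidates}

for (w, b), ok in results.items():
    print(f"w={w}, b={b}: {'works' if ok else 'fails'}")
```

The first two pairs satisfy both conditions; the third has a bias that is too large, and the fourth has a weight with the wrong sign.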

***
**T1.4 Implementing AND logical function**

The AND logical function is a two-variable function, $AND(x_1, x_2)$, with binary inputs and output.

0 AND 1 = 0

1 AND 0 = 0

0 AND 0 = 0

1 AND 1 = 1

In this task we implement a method which takes two values as input and returns the conjunction (AND) of the two boolean values. This time the perceptron is associated with the following computation: $y = \Theta(w_1x_1 + w_2x_2 + b)$. As in the previous task, we need to figure out which values of the parameters $w_1$, $w_2$ and $b$ make the perceptron behave as the AND logical operator. Let's try $w_1=1, w_2=1$ and $b=-1.5$. Hint: note that in this case we are dealing with a vector of weights rather than a single weight as in the case of the NOT function.

In [6]:
def AND_(x):
  return perceptron(x,(1,1),-1.5)

Call the method with the following examples in order to test whether it behaves as expected.

In [7]:
x1 = np.array([0,1])
x2 = np.array([1,0])
x3 = np.array([0,0])
x4 = np.array([1,1])

In [8]:
AND_(x1),AND_(x2),AND_(x3),AND_(x4)

(0, 0, 0, 1)
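
To see why these parameters work, the pre-activation $v = w_1x_1 + w_2x_2 + b$ can be worked out by hand for each of the four inputs, using $w_1 = w_2 = 1$ and $b = -1.5$:
\begin{equation*}
\begin{array}{cc|c|c}
x_1 & x_2 & v = x_1 + x_2 - 1.5 & \Theta(v) \\
\hline
0 & 1 & -0.5 & 0 \\
1 & 0 & -0.5 & 0 \\
0 & 0 & -1.5 & 0 \\
1 & 1 & 0.5 & 1
\end{array}
\end{equation*}
Only the input $(1, 1)$ pushes $v$ above zero, which matches the AND truth table.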

***
**T1.5 Implementing OR logical function**

Following the same idea, implement a method which performs the OR operation as follows:

0 OR 1 = 1

1 OR 0 = 1

0 OR 0 = 0

1 OR 1 = 1

Can you guess the values of the parameters that will make the perceptron operate as the OR logical function?

In [9]:
def OR_(x):
  return perceptron(x,(1,1),-0.5)

In [10]:
OR_(x1),OR_(x2),OR_(x3),OR_(x4)

(1, 1, 0, 1)
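
Instead of guessing, candidate parameters can also be checked systematically. The following plain-Python sketch tries every combination from a small, arbitrarily chosen grid and keeps those that reproduce the OR truth table. The names `matches_truth_table`, `OR_TABLE` and `grid` are hypothetical helpers, not part of the practical.

```python
from itertools import product

def step_binary(v):
    # Scalar-only version of the binary step function
    return 1 if v >= 0 else 0

def matches_truth_table(w1, w2, b, table):
    # True if Theta(w1*x1 + w2*x2 + b) equals the target for every row
    return all(
        step_binary(w1 * x1 + w2 * x2 + b) == target
        for (x1, x2), target in table.items()
    )

OR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}

# Arbitrary grid of candidate values for w1, w2 and b
grid = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if matches_truth_table(w1, w2, b, OR_TABLE)
]

print(len(solutions), "parameter sets found, e.g.", solutions[0])
```

Every solution has a negative bias (so that the input $(0, 0)$ gives 0) and weights large enough to overcome it; the choice $w_1 = w_2 = 1$, $b = -0.5$ used above is one of them.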

***
**T1.6 Implementing XOR logical function**

In the previous tasks you developed three fundamental logical perceptrons: NOT, AND and OR. Now you have to build a network of those perceptrons so that it implements the XOR function:

0 XOR 1 = 1

1 XOR 0 = 1

0 XOR 0 = 0

1 XOR 1 = 0

The solution is as follows:
\begin{equation*}
XOR(x_1,x_2) = AND(NOT(AND(x_1,x_2)),OR(x_1,x_2))
\end{equation*}

Using the above formula, implement a method which performs the XOR operation.

In [11]:
def XOR_(x):
  return AND_([NOT_(AND_(x)),OR_(x)])

In [12]:
XOR_(x1),XOR_(x2),XOR_(x3),XOR_(x4)

(1, 1, 0, 0)
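
To make the network structure explicit, the sketch below traces the intermediate values $NOT(AND(x_1,x_2))$ and $OR(x_1,x_2)$ before the final AND combines them. It is a self-contained plain-Python version using the same weights and biases as the gates above; the function name `xor_trace` is hypothetical.

```python
def step_binary(v):
    # Scalar-only version of the binary step function
    return 1 if v >= 0 else 0

def perceptron(x, w, b):
    # Weighted sum of the inputs plus bias, passed through the step function
    return step_binary(sum(xi * wi for xi, wi in zip(x, w)) + b)

def xor_trace(x1, x2):
    and_out = perceptron((x1, x2), (1, 1), -1.5)             # AND(x1, x2)
    nand_out = perceptron((and_out,), (-1,), 0.5)            # NOT(AND(x1, x2))
    or_out = perceptron((x1, x2), (1, 1), -0.5)              # OR(x1, x2)
    xor_out = perceptron((nand_out, or_out), (1, 1), -1.5)   # final AND
    return nand_out, or_out, xor_out

for x1, x2 in [(0, 1), (1, 0), (0, 0), (1, 1)]:
    print((x1, x2), xor_trace(x1, x2))
```

The final AND outputs 1 only when both intermediate values are 1, that is, when at least one input is 1 but not both.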

## Task 2: Training Perceptron with Simple Update Rule

In this task we will implement a simple perceptron and train it by applying the Simple Update Rule introduced during one of our lectures. Note that training a perceptron means searching for the optimal values of its parameters `w` and `b` (instead of guessing them as we did in the previous task).

***
**T2.1 Loading dataset**

1. We will first load one of the toy datasets for binary classification from the scikit-learn library. Familiarise yourself with the format of the `Breast Cancer` dataset.
2. We will split the dataset into input and output variables.
3. We will scale the input variables using the min-max scaling method.

In [13]:
from sklearn import datasets
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn import metrics

bc = datasets.load_breast_cancer()

#extracting input and output variables
x = bc.data
y = bc.target

#scaling the input data
x = preprocessing.MinMaxScaler().fit_transform(x)
print(x.shape)
print(y.shape)

(569, 30)
(569,)
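
Min-max scaling maps each feature (column) to the range $[0, 1]$ via $(x - x_{min}) / (x_{max} - x_{min})$, which is what `MinMaxScaler` computes with its default settings. A minimal numpy sketch on a made-up array (the values are arbitrary, and every column is assumed to be non-constant so the denominator is non-zero):

```python
import numpy as np

# Hypothetical toy data: 3 samples, 2 features on very different scales
toy = np.array([[1.0, 200.0],
                [2.0, 400.0],
                [4.0, 300.0]])

# Per-column minimum and maximum
col_min = toy.min(axis=0)
col_max = toy.max(axis=0)

# Rescale each column independently to [0, 1]
scaled = (toy - col_min) / (col_max - col_min)

print(scaled)
```

After scaling, both columns span the same range, so no single feature dominates the weighted sum inside the perceptron purely because of its units.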


***
**T2.2 Implementing the Update Rule for training perceptron**

You need to implement a method (the perceptron update rule) which updates the parameters of the perceptron as per the following pseudocode:

__Input:__ training set $x=\{x^1, \dots, x^n\}, y=\{y^1,\dots,y^n\}$, learning rate $\alpha$ (set it to 0.01)

1. Initialize weights $w_1,\dots,w_k$ and $b$ with random values. Use a [uniform distribution](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.uniform.html) to assign values between -1 and 1 to each parameter. Note that the weights vector should have the same dimension as the input and the bias is just a scalar.

2. For each $x^j= \langle x_1, \dots, x_k\rangle$ from x:

   - Pass $x^j$ as input to the perceptron with weights $w_1,\dots,w_k$ and bias $b$
   - Compute the error $e$ as the difference between the expected value and the output of the perceptron: $e = y^j-\bar{y}^j$
   - Adjust the parameters according to the formulas: $w_{i}^{new}=w_i+\alpha e x_i^j$     $\hspace{10mm} b^{new}=b+\alpha e$

In [14]:
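def update_perceptron(x,y,lr,iterations):
  # inputs and labels must contain the same number of samples
  if len(x) != len(y):
    raise ValueError('x and y must contain the same number of samples')
  # one random weight per input feature and a scalar bias, all drawn from U(-1, 1)
  weights = np.random.uniform(-1,1,len(x[0]))
  b = np.random.uniform(-1,1)
  for epoch in range(iterations):
    # one pass over the training set, updating after every sample
    for i in range(len(x)):
      xj = x[i]
      ypred = perceptron(xj,weights,b)
      e = y[i] - ypred  # error: expected minus predicted
      weights = weights + lr*e*xj
      b = b + lr*e
    # accuracy on the whole training set after this pass
    preds = perceptron(x,weights,b)
    acc = sum(preds==y)/len(y)
    print('Iteration: %s\t Accuracy: %s' %(epoch,acc))

  return(weights, b)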

In [15]:
update_perceptron(x,y,0.01,1)

Iteration: 0	 Accuracy: 0.8295254833040422


(array([ 0.55000653,  0.64141169, -0.05691036, -0.80462096,  0.23542635,
         0.8691089 , -0.35925376, -0.96611106, -0.87011833, -0.33600459,
        -0.63515532,  0.2374898 ,  0.71294102, -0.98673566,  0.22740413,
        -0.42993118,  0.69421344, -0.49636165,  0.50704678,  0.66676978,
        -0.58601742,  0.38801433, -0.23818426,  0.70839039, -0.42289283,
        -0.39783356,  0.75031452, -0.67098686,  0.85667087, -0.24817206]),
 0.42569116361394466)

***
**T2.3 Evaluating the perceptron model** 

Evaluate the perceptron before (parameters assigned with random values) and after training (parameters updated via the update rule). At this point we can just use our training data for the evaluation. Since we are dealing with a classification problem (we are predicting labels: 1 or 0) you can also use the [accuracy](https://scikit-learn.org/stable/modules/model_evaluation.html) metric (mean absolute/squared error metrics are also fine).
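
For 0/1 labels the three metrics mentioned above are closely related: every wrong prediction contributes exactly 1 to both the absolute and the squared error, so the mean absolute error and the mean squared error both equal one minus the accuracy. A minimal numpy sketch with made-up labels and predictions:

```python
import numpy as np

# Hypothetical true labels and predictions
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])

accuracy = np.mean(y_true == y_pred)      # fraction of correct predictions
mae = np.mean(np.abs(y_true - y_pred))    # mean absolute error
mse = np.mean((y_true - y_pred) ** 2)     # mean squared error

print(accuracy, mae, mse)
```

The same accuracy value can be obtained with `metrics.accuracy_score(y_true, y_pred)` from scikit-learn, which is imported at the start of this task.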

Note that the weights are assigned randomly at the beginning, which means that you can obtain a different result each time unless you set the seed of the random number generator (`np.random.seed(0)`). Run your program a number of times to see whether the perceptron improves each time as a result of the training.
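
The effect of seeding can be checked directly: resetting the seed makes the generator reproduce the same sequence of draws, while drawing again without resetting continues the sequence. A short sketch (the variable names are arbitrary):

```python
import numpy as np

np.random.seed(0)
first = np.random.uniform(-1, 1, 3)

# Resetting the seed restarts the sequence, so the same values come out
np.random.seed(0)
second = np.random.uniform(-1, 1, 3)

# Without resetting, the generator simply continues with new values
third = np.random.uniform(-1, 1, 3)

print(np.array_equal(first, second))   # True
print(np.array_equal(second, third))   # False
```

One consequence is that weights drawn right after `np.random.seed(0)` are not the same as weights drawn by a later call, unless the seed is reset in between.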

In [16]:
ypreds = []
np.random.seed(0)
weights = np.random.uniform(-1,1,len(x[1]))
b = np.random.uniform(-1,1)
for i in range(len(x)):
  ypreds.append(perceptron(x[i],weights,b))

(weights,b) = update_perceptron(x,y,0.01,1)
newPreds = []
for i in range(len(x)):
  newPreds.append(perceptron(x[i],weights,b))

Iteration: 0	 Accuracy: 0.7100175746924429


***
**T2.4 Increasing the time of training**

As you could observe, applying the Update Rule only once to the training dataset may not be enough to obtain a good model, in particular when the initialization of the parameters is unfavorable. Increase the training of the model by applying the Update Rule multiple times to the training dataset (using a for loop) to see if you can get more consistent results.
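
The idea of repeated passes can be sketched on a much smaller problem: learning the AND truth table from Task 1 from data. This is an illustrative setup, not the breast cancer experiment. To keep the run deterministic it assumes zero initialization and a learning rate of 0.5 instead of the random initialization and $\alpha = 0.01$ used in this task, and the function name `train` is hypothetical.

```python
import numpy as np

def step_binary(v):
    return 1 * (v >= 0)

def perceptron(x, w, b):
    return step_binary(np.dot(x, w) + b)

def train(x, y, lr, epochs):
    # Zero initialization (an assumption made here for reproducibility)
    w = np.zeros(x.shape[1])
    b = 0.0
    history = []
    for _ in range(epochs):
        # One pass of the update rule over all samples
        for xj, yj in zip(x, y):
            e = yj - perceptron(xj, w, b)
            w = w + lr * e * xj
            b = b + lr * e
        # Training accuracy after this pass
        history.append(np.mean(perceptron(x, w, b) == y))
    return w, b, history

# AND truth table used as a tiny, linearly separable training set
x_toy = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_toy = np.array([0, 0, 0, 1])

w, b, history = train(x_toy, y_toy, lr=0.5, epochs=100)
print("after first pass:", history[0], "after last pass:", history[-1])
```

A single pass is not enough here, but because the data is linearly separable the repeated updates eventually stop making mistakes, after which the parameters no longer change.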

In [17]:
ypreds = []
np.random.seed(0)
weights = np.random.uniform(-1,1,len(x[1]))
b = np.random.uniform(-1,1)
for i in range(len(x)):
  ypreds.append(perceptron(x[i],weights,b))

(weights,b) = update_perceptron(x,y,0.01,50)
newPreds = []
for i in range(len(x)):
  newPreds.append(perceptron(x[i],weights,b))

Iteration: 0	 Accuracy: 0.7100175746924429
Iteration: 1	 Accuracy: 0.8084358523725835
Iteration: 2	 Accuracy: 0.8488576449912126
Iteration: 3	 Accuracy: 0.8664323374340949
Iteration: 4	 Accuracy: 0.8840070298769771
Iteration: 5	 Accuracy: 0.8998242530755711
Iteration: 6	 Accuracy: 0.9068541300527241
Iteration: 7	 Accuracy: 0.9121265377855887
Iteration: 8	 Accuracy: 0.9191564147627417
Iteration: 9	 Accuracy: 0.9209138840070299
Iteration: 10	 Accuracy: 0.9226713532513181
Iteration: 11	 Accuracy: 0.929701230228471
Iteration: 12	 Accuracy: 0.9349736379613357
Iteration: 13	 Accuracy: 0.9332161687170475
Iteration: 14	 Accuracy: 0.9367311072056239
Iteration: 15	 Accuracy: 0.9437609841827768
Iteration: 16	 Accuracy: 0.945518453427065
Iteration: 17	 Accuracy: 0.9437609841827768
Iteration: 18	 Accuracy: 0.9507908611599297
Iteration: 19	 Accuracy: 0.9525483304042179
Iteration: 20	 Accuracy: 0.9490333919156415
Iteration: 21	 Accuracy: 0.9578207381370826
Iteration: 22	 Accuracy: 0.945518453427065
I