# Perceptron, Adaline and Logistic Regression

We are interested in implementing the perceptron algorithm (Rosenblatt, 68), Adaline (Widrow and Hoff, 60) and logistic regression (Cox, 66), whose pseudocode is given below:

Perceptron:
`input: Train, eta, MaxEp
init: w
epoch = 0
err = 1
m = len(Train)
while epoch <= MaxEp and err != 0
    err = 0
    for i in 1:m
        (x, y) <- Train[i]
        h <- w * x
        if (y * h <= 0)
            w <- w + eta * y * x
            err <- err + 1
    epoch <- epoch + 1
output: w`

Adaline:
`input: Train, eta, MaxEp
init: w
epoch = 0
err = 1
m = len(Train)
while epoch <= MaxEp and err != 0
    err = 0
    for i in 1:m
        (x, y) <- Train[i]
        h <- w * x
        if (y * h <= 0)
            err <- err + 1
        w <- w + eta * (y - h) * x
    epoch <- epoch + 1
output: w`
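
The Adaline update can be read as one stochastic gradient step on the squared error of the linear output. A short derivation, assuming the per-example loss $\ell(\mathbf{w}) = \frac{1}{2}(y - h)^2$ with $h = \mathbf{w}\cdot\mathbf{x}$ (this loss is not written in the pseudocode above; it is the usual interpretation):

$$
\begin{aligned}
\ell(\mathbf{w}) &= \tfrac{1}{2}\,(y - h)^2, \qquad h = \mathbf{w}\cdot\mathbf{x} \\
\nabla_{\mathbf{w}}\,\ell &= -(y - h)\,\mathbf{x} \\
\mathbf{w} &\leftarrow \mathbf{w} - \eta\,\nabla_{\mathbf{w}}\,\ell = \mathbf{w} + \eta\,(y - h)\,\mathbf{x}
\end{aligned}
$$

Unlike the perceptron, the weights therefore move on every example, including those that are already correctly classified.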

Logistic Regression:
`input: Train, eta, MaxEp
init: w
epoch = 0
err = 1
m = len(Train)
while epoch <= MaxEp and err != 0
    err = 0
    for i in 1:m
        choose an example (x, y) from Train at random
        h <- w * x
        if (y * h <= 0)
            err <- err + 1
        w <- w + eta * y * (1 - sigm(y * h)) * x
    epoch <- epoch + 1
output: w`
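
The logistic regression update appears to be a stochastic gradient step on the logistic loss $\log(1 + e^{-y\,\mathbf{w}\cdot\mathbf{x}})$. The sketch below checks this numerically with central finite differences. The loss itself, the helper names (`dot`, `logistic_loss`, `update_direction`, `numerical_gradient`) and the sample values are assumptions made for illustration; they are not part of the pseudocode.

```python
from math import exp, log

def sigm(z):
    return 1.0 / (1.0 + exp(-z))

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def logistic_loss(w, x, y):
    # Assumed per-example loss: log(1 + exp(-y * <w, x>))
    return log(1.0 + exp(-y * dot(w, x)))

def update_direction(w, x, y):
    # Direction used in the pseudocode: y * (1 - sigm(y * h)) * x
    h = dot(w, x)
    return [y * (1.0 - sigm(y * h)) * xj for xj in x]

def numerical_gradient(w, x, y, eps=1e-6):
    # Central finite differences, one coordinate at a time
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_plus[j] += eps
        w_minus = list(w)
        w_minus[j] -= eps
        diff = logistic_loss(w_plus, x, y) - logistic_loss(w_minus, x, y)
        grad.append(diff / (2 * eps))
    return grad

# Hypothetical values; the first component of x plays the role of the bias input
w = [0.2, -0.4, 0.1]
x = [1.0, 0.5, -2.0]
y = -1

direction = update_direction(w, x, y)
grad = numerical_gradient(w, x, y)

# If the update is a gradient step, direction == -grad up to numerical error
max_gap = max(abs(d + g) for d, g in zip(direction, grad))
print(max_gap < 1e-6)
```

If the check holds, `w <- w + eta * direction` is the same as `w <- w - eta * grad`, i.e. plain gradient descent on that loss for one example.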

1. Create a list of 4 elements called `Train`, corresponding to the logical AND example:
$Train=\{((+1,+1),+1),((-1,+1),-1),((-1,-1),-1),((+1,-1),-1)\}$

Each element of the list is itself a list whose last entry is the class of the example and whose preceding entries are its coordinates.

    

In [3]:
Train=[[+1,+1,+1],[-1,+1,-1],[-1,-1,-1],[+1,-1,-1]] # each row: x1, x2, class

2. Code the Perceptron, Adaline and LR (Logistic regression) programs

Hint: You can write a function that calculates the dot product between an example $\mathbf{x} = (x_1, \ldots, x_d)$ and the weight vector $\mathbf{w} = (w_0, w_1, \ldots, w_d)$:
$h(\mathbf{x},\mathbf{w}) = w_0 + \sum_{j=1}^d w_j x_j$.
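
As a worked instance of this formula, take the hypothetical values $\mathbf{w} = (0.5, 1, 2)$ and $\mathbf{x} = (1, -1)$, chosen only for illustration:

$$
h(\mathbf{x},\mathbf{w}) = 0.5 + 1 \cdot 1 + 2 \cdot (-1) = -0.5
$$

Since $h < 0$, a model with these weights would assign $\mathbf{x}$ to class $-1$.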


In [34]:
from math import exp
def h(x,w):
    # The prediction of the model
    Pred=w[0]
    d=len(x)
    for j in range(d):
        Pred+= w[j+1]*x[j]
    return Pred


def Perceptron(Train,eta,MaxEp):
    # Perceptron Algorithm 
    d=len(Train[0])-1
    m=len(Train)
    W=[0.0 for i in range(d+1)]
    epoque=0
    err=1
    while epoque <= MaxEp and err != 0:
        err = 0
        for Obs in Train:
            x,y = Obs[:-1], Obs[-1]
            h_w = h(x, W)
            if y * h_w <= 0: 
                W = [W[0] + eta*y] + [wj + eta*y*xj for wj, xj in zip(W[1:], x)]
                err += 1
        epoque += 1
    return W

    
def Adaline(Train,eta,MaxEp):
    # Adaline Algorithm 
    d=len(Train[0])-1
    m=len(Train)
    W=[0.0 for i in range(d+1)]
    err=1
    epoque=0
    while epoque <= MaxEp and err != 0:
        err = 0
        for Obs in Train:
            x,y = Obs[:-1], Obs[-1]
            h_w = h(x, W)
            if y * h_w <= 0: 
                err += 1
            W = [W[0] +  eta*(y - h_w)] + [wj + eta*(y - h_w)*xj for wj, xj in zip(W[1:], x)]
        epoque += 1
    return W

def sigmoid(z):
    return (1.0/(1.0+exp(-z)))

def LR(Train,eta,MaxEp):
    # Logistic Regression Algorithm 
    d=len(Train[0])-1
    m=len(Train)
    W=[0.0 for i in range(d+1)]
    err=1
    epoque=0
    while epoque <= MaxEp and err != 0:
        err = 0
        for Obs in Train:
            x,y = Obs[:-1], Obs[-1]
            h_w = h(x, W)
            s   = (1.0-sigmoid(y*h_w))
            if y * h_w <= 0: 
                err += 1
            W = [W[0] + eta*y*s] + [wj + eta*y*s*xj for wj, xj in zip(W[1:], x)]
        epoque += 1
    return W
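
The `sigmoid` function above calls `exp(-z)`, which raises `OverflowError` in Python when `z` is very negative (roughly below -709 in double precision). With normalized inputs and small learning rates this may never happen, but a numerically safer variant can be sketched as follows. The name `stable_sigmoid` is hypothetical and is not used elsewhere in this notebook.

```python
from math import exp

def stable_sigmoid(z):
    # For z >= 0, exp(-z) is at most 1, so it cannot overflow
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z));
    # exp(z) is below 1 here, so it cannot overflow either
    ez = exp(z)
    return ez / (1.0 + ez)

print(stable_sigmoid(0.0))
print(stable_sigmoid(-1000.0))
print(stable_sigmoid(1000.0))
```

Both branches compute the same mathematical function; only the order of operations differs.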


3. Apply the three learning models to the logical AND example, and calculate each model's error rate on this training set.

Hint: You can write a function that takes a weight vector $\mathbf{w}$ and a set of examples $(\mathbf{x},y)$ and calculates the error rate of the model with weight $\mathbf{w}$ on that set.

In [35]:
def EmpiricalRisk(Test,W):
    E=0.0
    m=len(Test)
    # The empirical error of a model with weight W on a test set of size m
    for Obs in Test:
        y = Obs[-1]
        h_w = h(Obs[:-1], W)
        if (y*h_w <= 0):
            E+=1.0
    return E/float(m)

WP=Perceptron(Train,0.1,4)
print(f"ErrPerceptron={EmpiricalRisk(Train,WP):.1f}")
WA=Adaline(Train,0.1,4)
print(f"ErrAdaline={EmpiricalRisk(Train,WA):.1f}")
WLR=LR(Train,0.1,4)
print(f"ErrLR={EmpiricalRisk(Train,WLR):.1f}")


ErrPerceptron=0.0
ErrAdaline=0.0
ErrLR=0.0
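
A zero error is plausible here because the logical AND set is linearly separable. As a sanity check that does not depend on training, the sketch below evaluates two hand-picked weight vectors on the same four examples. The names `and_examples`, `score` and `error_rate`, and both weight vectors, are assumptions chosen for illustration; they are not outputs of the algorithms above.

```python
# The four logical AND examples: x1, x2, class
and_examples = [[+1, +1, +1], [-1, +1, -1], [-1, -1, -1], [+1, -1, -1]]

def score(x, w):
    # w[0] is the bias, w[1:] are the feature weights
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def error_rate(examples, w):
    # Fraction of examples with y * score <= 0 (same criterion as the training loops)
    errors = sum(1 for obs in examples if obs[-1] * score(obs[:-1], w) <= 0)
    return errors / len(examples)

w_good = [-1.0, 1.0, 1.0]  # hand-picked separator: x1 + x2 - 1
w_bad = [0.0, 1.0, 0.0]    # looks at x1 only

print(error_rate(and_examples, w_good))  # 0.0
print(error_rate(and_examples, w_bad))   # 0.25
```

The second vector misclassifies only $(+1,-1)$, which shows that the error rate counts each of the four examples as one quarter.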


4. We now turn to the behavior of the three models on four UCI collections: http://archive.ics.uci.edu/ml/datasets/connectionist+bench+(sonar,+mines+vs.+rocks), https://archive.ics.uci.edu/ml/datasets/spambase, https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Original%29, https://archive.ics.uci.edu/ml/datasets/Ionosphere. These files are in the current repository under the names `sonar.txt`, `spam.txt`, `wdbc.txt` and `ionosphere.txt`. The following `ReadCollection` function reads each file and returns it in the same format as the training set above: a list of examples, each ending with its class label in $\{-1,+1\}$.

In [28]:
from math import sqrt
import pandas as pd
import random
from sklearn.model_selection import train_test_split

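def Normalize(x):
    # Rescale x in place to unit Euclidean (L2) norm
    norm=sqrt(sum(e**2 for e in x))
    if norm==0.0:
        # An all-zero vector cannot be rescaled; return it unchanged
        return x
    for i in range(len(x)):
        x[i]/=norm
    return x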

def ReadCollection(filename):
    tag_df=pd.read_table(filename,sep=',',header=None)
    if("wdbc" in filename):
        Dic={'M': -1, 'B': +1}
    elif("sonar" in filename):
        Dic={'R': -1, 'M': +1}
    elif("iono" in filename):
        Dic={'g': -1, 'b': +1}
    elif("spam" in filename):
        Dic={0:-1, 1:+1}
        
    X=[]
    for e in range(len(tag_df)):
        x=list(tag_df.loc[e,:])
        if("wdbc" in filename):
            x.pop(0)
            cls=x.pop(0)
        else:
            cls=x.pop()
        x=Normalize(x)
        x.insert(len(x),Dic[cls])
        X.append(x)
    random.shuffle(X)

    return X

wdbc_col = ReadCollection("wdbc.txt")
sonar_col = ReadCollection("sonar.txt")
iono_col = ReadCollection("ionosphere.txt")
spam_col = ReadCollection("spam.txt")
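
`ReadCollection` rescales every feature vector to unit Euclidean norm before appending the class label. The effect of that step on a single vector can be sketched as follows; `l2_normalize` is a hypothetical, non-in-place stand-in for `Normalize`, and the input vector is chosen only for illustration.

```python
from math import sqrt

def l2_normalize(x):
    # Return a copy of x divided by its Euclidean norm
    norm = sqrt(sum(e * e for e in x))
    if norm == 0.0:
        # Assumption: an all-zero vector is returned unchanged
        return list(x)
    return [e / norm for e in x]

v = l2_normalize([3.0, 4.0])
print(v)  # [0.6, 0.8]
```

After this step all examples lie on the unit sphere, so features with large raw magnitudes no longer dominate the dot product $h(\mathbf{x},\mathbf{w})$.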


 2. Run the three models on these files with $\eta=0.01$ and $\eta=0.1$ and `MaxEp=500`.
 
 3. Report in the tables below the average error rate on the test set, repeating each experiment 20 times. 
 
 <br>
 <br>
 
 
 <center> $\eta=0.01$, MaxEp$=500$ </center>
    
    
  | Collection | Perceptron | Adaline |    LR    |
  |------------|------------|---------|----------|
  | WDBC | 0.1171 | 0.0871 | 0.09965 |
  | Ionosphere | 0.1028 | 0.1023 | 0.0767 |
  | Sonar | 0.2856 | 0.2385 | 0.2577 |
  | Spam | 0.2295 | 0.2615 | 0.2311 |
 
 <br><br>
  
  <center> $\eta=0.1$, MaxEp$=500$ </center>
    
    
  | Collection | Perceptron | Adaline |    LR    |
  |------------|------------|---------|----------|
  | WDBC | 0.1350 | 0.1042 | 0.09301 |
  | Ionosphere | 0.1068 | 0.1131 | 0.08352 |
  | Sonar | 0.2913 | 0.2673 | 0.226 |
  | Spam | 0.2298 | 0.2736 | 0.1646 |
  
  Hint: you can use the following code

In [33]:
for eta in [0.01,0.1]:
    print(f'eta={eta}\n-------')
    for name, X in zip(["WDBC", "Ionosphere", "Sonar", "Spam"],[wdbc_col,iono_col,sonar_col,spam_col]):
        errP=errA=errL=0.0
        for i in range(20):
            x_train ,x_test = train_test_split(X,test_size=0.25)
            WLP=Perceptron(x_train,eta,500)
            errP+=EmpiricalRisk(x_test,WLP)
            WLA=Adaline(x_train,eta,500)
            errA+=EmpiricalRisk(x_test,WLA)
            WLR=LR(x_train,eta,500)
            errL+=EmpiricalRisk(x_test,WLR)
    
        print(f"| {name} | {errP/float(20):.4f} | {errA/float(20):.4f} | {errL/float(20):.4f} |")
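
The loop above reports only the mean over the 20 random splits. To judge whether a difference between two models is meaningful, it may help to keep the individual error rates and also report their spread. A minimal sketch using the standard `statistics` module follows; the helper name `summarize` is hypothetical and the numbers in `run_errors` are placeholders, not measured results.

```python
from statistics import mean, stdev

def summarize(errors):
    # Mean and sample standard deviation of a list of per-run error rates
    return mean(errors), stdev(errors)

# Placeholder values standing in for the error rates of repeated runs
run_errors = [0.10, 0.12, 0.08, 0.10]

avg, spread = summarize(run_errors)
print(f"{avg:.4f} +/- {spread:.4f}")
```

In the experiment loop, this would amount to appending each `EmpiricalRisk(x_test, ...)` value to a list per model instead of accumulating a running sum.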