# Linear Classification and Linear Predictors
---

## Linear Classification: Perceptron algorithm

In the next code cell I implement the Perceptron algorithm (for halfspaces), using the following notation:
- $\mathbf{x}\in\mathbb{R}^d$: the feature vector of each sample (_input_)
- $\mathbf{x'}\in \mathbb{R}^{d+1}$: the same sample in homogeneous coordinates: 
    $$
    \mathbf{x}\rightarrow \mathbf{x'} = (1, \mathbf{x})
    $$
- $y \in \{-1, 1\}$: the label of each sample, in the case of binary classification (_output_) $\rightarrow$ 0-1 loss 
- $\mathbf{w}$: weights vector ($b$: bias); in homogeneous coordinates $\mathbf{w'} = (b, \mathbf{w})$
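
The notation above can be made concrete with a small sketch. The helper names (`to_homogeneous`, `predict`, `zero_one_loss`) and the example weights and samples are hypothetical, chosen only for illustration. The sketch also assumes that a score of exactly 0 is predicted as $-1$, which is a convention rather than something fixed by the definitions.

```python
import numpy as np

def to_homogeneous(x):
    """Prepend a constant 1 to every sample: x -> x' = (1, x)."""
    x = np.asarray(x, dtype=float)
    return np.hstack((np.ones((x.shape[0], 1)), x))

def predict(w_prime, x):
    """Halfspace prediction: +1 where <w', x'> > 0, otherwise -1 (assumed convention)."""
    scores = to_homogeneous(x) @ np.asarray(w_prime, dtype=float)
    return np.where(scores > 0, 1, -1)

def zero_one_loss(w_prime, x, y):
    """Fraction of samples whose predicted label differs from the true label."""
    return float(np.mean(predict(w_prime, x) != np.asarray(y)))

# hypothetical weights w' = (b, w) and samples, for illustration only
w_example = [-1.0, 1.0, 1.0]
x_example = [[2, 1], [0, 0], [1, 1]]
y_example = [1, -1, -1]

print(predict(w_example, x_example))
print(zero_one_loss(w_example, x_example, y_example))
```

Folding the bias into $\mathbf{w'}$ means a single dot product handles both $\mathbf{w}$ and $b$, which keeps the update rule of the algorithm uniform.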

In [14]:
import numpy as np
import random

def perceptron(S, maxcycles=None, lam = 1 , w=None):
    # S is the training set (x_1, y_1), ..., (x_m, y_m), given as a pair of arrays: (xvec, yvec)
    # maxcycles is the maximum number of cycles that the algorithm can make (to avoid an infinite loop)
    # lam is the step size that scales each update of w
    # w is the initial solution, it is an optional argument. It is w' for the homogeneous linear function: w'=(b, w)
    
    if maxcycles is None:
        print("You haven't specified the maximum number of cycles. Risk of having an infinite loop if the data are not linearly separable.")
    
    x = np.asarray(S[0])
    y = np.asarray(S[1])
    
    m = len(x)             # number of samples
    f = x.shape[1] + 1     # number of features (+1 for homogeneous coordinates)
    
    # homogeneous coordinates: prepend a constant 1 to every sample.
    # Keeping the input dtype avoids truncating real-valued features to integers.
    x = np.hstack((np.ones((m, 1), dtype=x.dtype), x))
    
    if w is None:
        w = np.zeros(f, dtype=x.dtype)
    
    # perceptron algorithm
    cycles = 0
    
    while True:
        # indices of the samples misclassified by the current w
        wrong_samples = [i for i in range(m) if y[i]*np.dot(w, x[i]) <= 0]
        
        # an empty list is falsy: every sample is classified correctly
        if not wrong_samples:
            return w, 0
        
        if maxcycles is not None and cycles >= maxcycles:
            # 0-1 loss of the returned w on the training set
            loss = len(wrong_samples)/m
            print('Exceeded maximum number of cycles allowed')
            return w, loss
        
        # update w with one misclassified sample chosen at random
        selected_i = random.choice(wrong_samples)
        w = w + lam*y[selected_i]*x[selected_i]
        cycles += 1

xtrain = np.array([[1, 2, 3], [0, 2, 2], [0, 2, 0], [0, 2, 2]])
ytrain = np.array([1, -1, -1, -1])
trainSet = [xtrain, ytrain]

perceptron(trainSet)


You haven't specified the maximum number of cycles. Risk of having an infinite loop if the data are not linearly separable.


(array([-2,  3, -4,  3]), 0)

The perceptron function returns the weights vector $w'=(b, w)$ in homogeneous coordinates, together with the 0-1 loss on the training set. To get a graphical representation of the problem, I could plot the data and the separating hyperplane (in the 2D and 3D cases).
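
As a starting point for such a plot, the hyperplane $b + \langle \mathbf{w}, \mathbf{x}\rangle = 0$ can be solved for the last coordinate, which gives points on the line (2D) or plane (3D) that a plotting library can draw. The sketch below is only one possible approach. The function name `boundary_last_coordinate` and the grid values are hypothetical, and the weights are the ones from the output shown above (a different run may return different weights, since the misclassified sample is chosen at random).

```python
import numpy as np

def boundary_last_coordinate(w_prime, other_coords):
    """Solve b + <w, x> = 0 for the last coordinate of x, given the other coordinates."""
    w_prime = np.asarray(w_prime, dtype=float)
    other_coords = np.asarray(other_coords, dtype=float)
    b, w_rest, w_last = w_prime[0], w_prime[1:-1], w_prime[-1]
    if w_last == 0:
        raise ValueError("the hyperplane is parallel to the last axis")
    return -(b + other_coords @ w_rest) / w_last

# weights from the output above: w' = (b, w) = (-2, 3, -4, 3)
w_prime = [-2, 3, -4, 3]

# hypothetical (x1, x2) values; each row gets the x3 that lies on the plane
grid = np.array([[0.0, 0.0], [1.0, 2.0], [0.0, 2.0]])
x3 = boundary_last_coordinate(w_prime, grid)

# stack into full 3D points lying on the separating plane
plane_points = np.hstack((grid, x3[:, None]))
print(plane_points)
```

In the 2D case the same function returns $x_2$ for each $x_1$, so the two columns can be drawn directly as a line over a scatter plot of the data.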

## Linear Classification with Logistic Regression
> Logistic regression is already implemented in Python libraries (for example, `LogisticRegression` in scikit-learn).
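
To show what such a library function computes, here is a minimal NumPy sketch of logistic regression trained by plain gradient descent on the logistic loss $\log(1 + e^{-y\langle \mathbf{w'}, \mathbf{x'}\rangle})$. It is not the library implementation. The function name, the step size `lr`, the number of `steps`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def logistic_regression_sketch(x, y, lr=0.1, steps=200):
    """Gradient descent on the mean logistic loss, with labels y in {-1, 1}."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xprime = np.hstack((np.ones((x.shape[0], 1)), x))  # homogeneous coordinates
    w = np.zeros(xprime.shape[1])
    for _ in range(steps):
        margins = y * (xprime @ w)
        # gradient of mean(log(1 + exp(-margin))) with respect to w'
        grad = -(xprime * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w = w - lr * grad
    return w

# hypothetical, linearly separable 1D data
x_toy = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_toy = np.array([-1, -1, 1, 1])

w_fit = logistic_regression_sketch(x_toy, y_toy)
scores = np.hstack((np.ones((4, 1)), x_toy)) @ w_fit
predictions = np.where(scores > 0, 1, -1)
print(w_fit, predictions)
```

Unlike the perceptron, this procedure does not stop once the data are separated, because the logistic loss keeps decreasing as the margins grow. That is why the sketch runs for a fixed number of steps.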

## Linear Regression: Least Squares algorithm

The next code cell is where I implement the Least Squares algorithm.

In [3]:
def leastSquares(S):
    pass
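
The function above is still a stub. One possible way to fill it in, given here only as a sketch under assumptions, is to reuse the homogeneous coordinates from the perceptron and let `np.linalg.lstsq` minimize the squared error $\sum_i (\langle \mathbf{w'}, \mathbf{x'}_i\rangle - y_i)^2$. The name `least_squares_sketch` and the toy data are hypothetical. The sketch assumes `S` has the same `(xvec, yvec)` layout as in the perceptron cell, with real-valued targets.

```python
import numpy as np

def least_squares_sketch(S):
    """Return w' = (b, w) minimizing the sum of squared errors on S = (xvec, yvec)."""
    x = np.asarray(S[0], dtype=float)
    y = np.asarray(S[1], dtype=float)
    xprime = np.hstack((np.ones((x.shape[0], 1)), x))  # homogeneous coordinates
    # lstsq solves the least-squares problem and also copes with rank-deficient inputs
    w_prime, _, _, _ = np.linalg.lstsq(xprime, y, rcond=None)
    return w_prime

# hypothetical data lying exactly on the line y = 1 + 2x
S_toy = (np.array([[0.0], [1.0], [2.0]]), np.array([1.0, 3.0, 5.0]))
w_ls = least_squares_sketch(S_toy)
print(w_ls)
```

Using `lstsq` instead of explicitly inverting $X^\top X$ avoids failures when the features are linearly dependent.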