# Lab 6 - Logistic Regression

In this lab we implement and use logistic regression for binary classification problems.

We start by including some libraries and functions already seen in the previous labs (or slight variations of them). Have a look at them and verify that you understand their purpose.

<b>READ all the text parts very carefully, as you will find instructions on how to proceed.</b>

In [1]:
# import libraries
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import os
from scipy.interpolate import griddata

In [53]:
def mixGauss(means, sigmas, n):

    means = np.array(means)
    sigmas = np.array(sigmas)

    d = means.shape[1]
    num_classes = sigmas.size
    data = np.full((n * num_classes, d), np.inf)
    labels = np.zeros(n * num_classes)

    for idx, sigma in enumerate(sigmas):
        data[idx * n:(idx + 1) * n] = np.random.multivariate_normal(mean=means[idx], cov=np.eye(d) * sigmas[idx] ** 2,
                                                                    size=n)
        labels[idx * n:(idx + 1) * n] = idx 
        
    if(num_classes == 2):
        labels[labels==0] = -1

    return data, labels

In [54]:
def flipLabels(Y, perc):

    if perc < 1 or perc > 100:
        print("perc should be a percentage value between 1 and 100.")
        return -1

    if np.any(np.abs(Y) != 1):
        print("The values of Y should be +1 or -1.")
        return -1

    Y_noisy = np.copy(np.squeeze(Y))
    if Y_noisy.ndim > 1:
        print("Please supply a label array with only one dimension")
        return -1

    n = Y_noisy.size
    n_flips = int(np.floor(n * perc / 100))
    idx_to_flip = np.random.choice(n, size=n_flips, replace=False)
    Y_noisy[idx_to_flip] = -Y_noisy[idx_to_flip]

    return Y_noisy

In [61]:
def separatingFLR(Xtr, Ytr, Ypred, w):
    
    xi = np.linspace(Xtr[:, 0].min(), Xtr[:, 0].max(), 200)
    yi = np.linspace(Xtr[:, 1].min(), Xtr[:, 1].max(), 200)
    X, Y = np.meshgrid(xi,yi)
    
    zi = griddata(Xtr, Ypred, (X,Y), method='linear')
    
    # draw only the zero level set of the interpolated predictions (the separating curve)
    plt.contour(xi, yi, zi, levels=[0], linewidths=2, colors='k')
    # plot data points.
    plt.scatter(Xtr[:,0], Xtr[:,1], c=Ytr, marker='o', s=100, zorder=10, alpha=0.8)
    plt.xlim(Xtr[:,0].min(), Xtr[:,0].max())
    plt.ylim(Xtr[:,1].min(), Xtr[:,1].max())


### Estimating the function on the training set

We now define the function <b>linearLRTrain(Xtr, Ytr, reg_par)</b> to estimate the classification function on the training set. It is used as follows:
##### w, errs = linearLRTrain(Xtr, Ytr, reg_par)
where
- <b>Xtr</b> is the n x D matrix of training set inputs
- <b>Ytr</b> is the n vector of training set outputs (labels +1 or -1)
- <b>reg_par</b> is the value of the regularization parameter lambda
- <b>w</b> is the D vector of the estimated function parameters
- <b>errs</b> is the vector of the errors made, at each iteration, in the function estimation


In [51]:
def linearLRTrain(Xtr, Ytr, reg_par):
  
    epsilon = 1e-6
    iter = 10
    
    # size of the training input
    n, D = np.shape(Xtr)
    
    # initialization of the vector w
    w = np.zeros(D)
    
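    # estimation of the step size gamma (fixed)
    # Xtr^T Xtr is symmetric and has the same largest eigenvalue as Xtr Xtr^T,
    # so eigvalsh on the D x D matrix is cheaper and always returns real values
    eigvals = np.linalg.eigvalsh(np.dot(np.transpose(Xtr), Xtr))
    L = np.max(eigvals) / n + reg_par
    gamma = 1/L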
    
    # initialization of some supporting variables
    j=0
    f_old = 0
    f = float("inf")
    
    errs = np.zeros(iter+1)
    
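    while j < iter and abs(f - f_old) >= epsilon:
        
        # keep the previous objective value for the stopping criterion
        f_old = f
        j = j + 1
        
        w = # ... fill here ...
        
        f = # ... fill here ...
        
        #print("iter:"+str(j)+" err:"+str(abs(f-f_old)))
        errs[j] = abs(f - f_old)
        
    # errs[0] is never written: the j recorded values are errs[1], ..., errs[j]
    return w, errs[1:j+1]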
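
One possible way to fill in the two missing lines is plain gradient descent on a regularized logistic loss. The sketch below is only an illustration under stated assumptions, not the official solution: it assumes the objective is the mean logistic loss plus `reg_par * ||w||^2`, and the names `logistic_objective`, `logistic_gradient` and `gradient_descent_sketch` are hypothetical. The step size follows the same rule as in the cell above.

```python
import numpy as np

def logistic_objective(w, X, y, reg_par):
    """Assumed objective: mean logistic loss plus reg_par * ||w||^2."""
    margins = y * (X @ w)
    # log(1 + exp(-m)) computed in a numerically stable way
    return np.mean(np.logaddexp(0.0, -margins)) + reg_par * np.dot(w, w)

def logistic_gradient(w, X, y, reg_par):
    """Gradient of the assumed objective with respect to w."""
    n = X.shape[0]
    margins = y * (X @ w)
    # 1 / (1 + exp(m)), computed stably as exp(-log(1 + exp(m)))
    weights = np.exp(-np.logaddexp(0.0, margins))
    return -(X.T @ (y * weights)) / n + 2.0 * reg_par * w

def gradient_descent_sketch(X, y, reg_par, max_iter=100, tol=1e-6):
    """Fixed-step gradient descent; returns w and the objective history."""
    n, d = X.shape
    w = np.zeros(d)
    # fixed step size: 1 / (largest eigenvalue of X^T X / n + reg_par)
    gamma = 1.0 / (np.linalg.eigvalsh(X.T @ X).max() / n + reg_par)
    f_old = logistic_objective(w, X, y, reg_par)
    history = [f_old]
    for _ in range(max_iter):
        w = w - gamma * logistic_gradient(w, X, y, reg_par)
        f = logistic_objective(w, X, y, reg_par)
        history.append(f)
        # stop when the objective changes by less than tol
        if abs(f - f_old) < tol:
            break
        f_old = f
    return w, np.array(history)
```

In `linearLRTrain` the update of `w` would correspond to the line inside the loop, and `f` to the value of the objective at the new `w`. If your course notes define the regularizer differently (for example with a factor 1/2), the gradient term `2.0 * reg_par * w` changes accordingly.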


### Evaluating the function on the test set

Here we define the function that evaluates the trained function on a set of samples. Use it as follows:
##### Ypred, Ppred = linearLRTest(w, Xte)
where
- <b>w</b> is the D vector of the estimated function parameters
- <b>Xte</b> is the matrix of input points in the test set
- <b>Ypred</b> is the vector of predictions
- <b>Ppred</b> is a confidence associated with each prediction

In [52]:
def linearLRTest(w, Xte):
    
    Ypred = # ... fill here ...
    
    # Try and understand what it does, deriving the formula
    Ppred = np.divide(np.exp(Ypred), (1 + np.exp(Ypred)))
    
    return Ypred, Ppred
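
As a hint for the derivation: under the logistic model, a linear score s = x·w is mapped to the probability of class +1 by e^s / (1 + e^s), which is the same as 1 / (1 + e^(-s)). The sketch below illustrates this under the assumption that `Ypred` holds the raw linear scores (so that the predicted label is their sign). The name `predict_sketch` and the choice to return the labels separately are hypothetical, not part of the lab.

```python
import numpy as np

def predict_sketch(w, X):
    """Return raw scores, estimated P(y = +1 | x), and +1/-1 labels."""
    scores = X @ w
    # 1 / (1 + exp(-s)), computed stably as exp(-log(1 + exp(-s)))
    probs = np.exp(-np.logaddexp(0.0, -scores))
    # a score of exactly 0 is assigned to class +1 by convention here
    labels = np.where(scores >= 0, 1, -1)
    return scores, probs, labels

w = np.array([1.0, -1.0])
X = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
scores, probs, labels = predict_sketch(w, X)
# the stable form agrees with the formula used for Ppred
print(np.allclose(probs, np.exp(scores) / (1 + np.exp(scores))))
```

The direct formula `np.exp(s) / (1 + np.exp(s))` can overflow for large positive scores, which is why the sketch uses `np.logaddexp`.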

### Evaluation of the quality of the prediction

We need a way to quantify how good the estimated function and its predictions are. We thus define the following:

##### class_err = calcError(Ypred, Y)
where
- <b>Ypred</b> is the vector of predictions obtained with linearLRTest
- <b>Y</b> is the vector of true labels
- <b>class_err</b> is the percentage of misclassified samples

In [57]:
def calcError(Ypred, Y):
    
    class_err = # ... fill here ...
    return class_err
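
A minimal sketch of one way to compute the classification error is given below. It assumes that the predictions are real-valued scores whose sign gives the predicted label, that the true labels are +1 or -1, and that the error is reported on a 0 to 100 scale. The name `classification_error_sketch` is hypothetical.

```python
import numpy as np

def classification_error_sketch(scores, y_true):
    """Percentage of samples whose predicted sign differs from the true label."""
    y_pred = np.where(np.asarray(scores) >= 0, 1, -1)
    return 100.0 * np.mean(y_pred != np.asarray(y_true))

# one of the four predictions has the wrong sign
print(classification_error_sketch([2.0, -1.0, 0.5, -3.0], [1, 1, 1, -1]))  # 25.0
```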

### Analysis

A possible guideline is the following:

- Build two binary classification datasets, for training and test
- Visualize them (see examples in the notebook of the first lab if you do not remember the syntax)
- Pick a reasonable value for the lambda parameter (e.g. reg_par = 0.1, 0.01,...)
- Run the training
- Plot the errors associated with each iteration
- Show the separating curve corresponding to the obtained w (...what do you expect?)
- Evaluate the estimated function on the test set
- Compute and show training and test classification errors


In [None]:
# ... fill here ...

### What are we still missing?

Think about (and possibly implement) how you would select an appropriate value for lambda!
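
One common answer is K-fold cross-validation: split the training set into K parts, train on K-1 of them for each candidate lambda, measure the error on the held-out part, and keep the lambda with the lowest average validation error. The sketch below is a self-contained illustration under assumptions, not the lab's reference solution: `train_sketch`, `error_percent` and `kfold_select_lambda` are hypothetical names, and the trainer assumes a mean logistic loss plus `reg_par * ||w||^2` minimized by fixed-step gradient descent.

```python
import numpy as np

def train_sketch(X, y, reg_par, n_iter=200):
    """Assumed trainer: gradient descent on the regularized logistic loss."""
    n, d = X.shape
    w = np.zeros(d)
    gamma = 1.0 / (np.linalg.eigvalsh(X.T @ X).max() / n + reg_par)
    for _ in range(n_iter):
        margins = y * (X @ w)
        # 1 / (1 + exp(m)) computed stably
        weights = np.exp(-np.logaddexp(0.0, margins))
        grad = -(X.T @ (y * weights)) / n + 2.0 * reg_par * w
        w = w - gamma * grad
    return w

def error_percent(w, X, y):
    """Percentage of samples misclassified by the sign of X @ w."""
    return 100.0 * np.mean(np.where(X @ w >= 0, 1, -1) != y)

def kfold_select_lambda(X, y, candidates, k=5, seed=0):
    """Return the candidate with the lowest mean validation error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), k)
    mean_errors = []
    for reg_par in candidates:
        fold_errors = []
        for i in range(k):
            val_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            w = train_sketch(X[train_idx], y[train_idx], reg_par)
            fold_errors.append(error_percent(w, X[val_idx], y[val_idx]))
        mean_errors.append(np.mean(fold_errors))
    best = candidates[int(np.argmin(mean_errors))]
    return best, np.array(mean_errors)
```

The test set should not be used for this selection: it is only for the final evaluation once lambda has been chosen. Plotting the mean validation error against the candidate values (on a logarithmic axis) usually helps in checking that the grid covers a sensible range.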

In [None]:
# ... fill here ...