## Quiz #0501

### "Logistic Regression and Gradient Descent Algorithm"

#### Answer the following questions by providing Python code:
#### Objectives:
- Code a logistic regression class using only the NumPy library.
- Implement the sigmoid function in Python.
- Implement the gradient of the negative log-likelihood in Python.
- Implement the gradient descent algorithm in Python.

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import warnings

warnings.filterwarnings('ignore')

#### Read in data:

In [2]:
# Load data.
data = load_breast_cancer()
# Explanatory variables.
X = data['data']
# Relabel so that 0 = benign and 1 = malignant.
Y = 1 - data['target']

In [3]:
# Split the dataset into training and testing.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.4, random_state=1234)

1). Define the 'sigmoid' and 'gradient' functions to produce the output shown below:

In [4]:
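def sigmoid(x):
    # Logistic function: maps any real value into (0, 1).
    return 1.0/(1.0 + np.exp(-x))

def gradient(X, Y, beta):
    # Gradient of the negative log-likelihood, with labels Y coded as -1/+1.
    # X: (n, k) with a leading column of ones, Y: (n, 1), beta: (1, k).
    z = np.dot(X,beta.T)*Y
    # Per-observation gradient, shape (n, k).
    ds = -Y*(1-sigmoid(z))*X
    # Sum over observations, shape (k,).
    return ds.sum(axis=0)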
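
One way to sanity-check `gradient` is to compare it against a central finite-difference approximation of the loss it is meant to differentiate. The sketch below assumes that loss is the negative log-likelihood with labels coded as -1/+1, i.e. the sum of log(1 + exp(-y·xβ)); the helpers `neg_log_likelihood` and `numerical_gradient` and the toy data are hypothetical names introduced only for this check, not part of the quiz.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gradient(X, Y, beta):
    # Same definition as in the cell above.
    z = np.dot(X, beta.T) * Y
    ds = -Y * (1 - sigmoid(z)) * X
    return ds.sum(axis=0)

def neg_log_likelihood(X, Y, beta):
    # Hypothetical helper: sum of log(1 + exp(-y * x.beta)), labels in {-1, +1}.
    z = np.dot(X, beta.T) * Y
    return np.logaddexp(0.0, -z).sum()

def numerical_gradient(X, Y, beta, eps=1e-6):
    # Hypothetical helper: central differences, one coefficient at a time.
    grad = np.zeros(beta.shape[1])
    for j in range(beta.shape[1]):
        step = np.zeros_like(beta)
        step[0, j] = eps
        loss_plus = neg_log_likelihood(X, Y, beta + step)
        loss_minus = neg_log_likelihood(X, Y, beta - step)
        grad[j] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Small synthetic problem (assumed data, unrelated to the breast cancer set).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 3))
Y_toy = rng.choice([-1.0, 1.0], size=(20, 1))
beta_toy = rng.normal(size=(1, 3))

analytic = gradient(X_toy, Y_toy, beta_toy)
numeric = numerical_gradient(X_toy, Y_toy, beta_toy)
max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)
```

If the two gradients agree to several decimal places, the analytic formula is consistent with that loss under the stated label coding.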

2). Define the 'LogisticRegression' class to produce the output shown below:

In [9]:
class LogisticRegression:
    def __init__(self, learn_rate):
        self.rate = learn_rate
        
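    def train(self, input_X, input_Y, n_epochs):
        # One coefficient per feature plus the intercept.
        self.n_nodes = input_X.shape[1] + 1
        # Random initial coefficients, stored as a row vector of shape (1, k).
        self.beta = np.random.normal(0.0,1.0,(1,self.n_nodes))
        # Prepend a column of ones so that beta[0, 0] acts as the intercept.
        ones_column = np.ones((input_X.shape[0],1))
        X = np.concatenate((ones_column,input_X),axis=1)
        # Recode the labels from 0/1 to -1/+1, as expected by gradient().
        Y = (2*input_Y - 1).reshape(-1,1)
        # Batch gradient descent: one full-data update per epoch.
        for n in range(n_epochs):
            self.beta = self.beta - self.rate*gradient(X,Y,self.beta)
        return self.beta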
    
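    def query(self, input_X, prob=True, cutoff=0.5):
        # Add the intercept column, exactly as in train().
        ones_column = np.ones((input_X.shape[0],1))
        X = np.concatenate((ones_column,input_X),axis=1)
        z = np.dot(X,(self.beta).T)
        # Predicted probability of class 1, shape (n, 1).
        p = sigmoid(z)
        if prob:
            return p
        else:
            # Hard 0/1 labels obtained by thresholding at the cutoff.
            return (p > cutoff).astype('int')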
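
The update loop inside `train` can be looked at in isolation on a small synthetic problem, tracking the loss at every epoch. This is a minimal sketch under assumptions: the toy data, the labeling rule, and the helper `neg_log_likelihood` are hypothetical and chosen only for illustration, while the learning rate and the number of epochs simply mirror the values used in this notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gradient(X, Y, beta):
    # Same definition as in question 1.
    z = np.dot(X, beta.T) * Y
    ds = -Y * (1 - sigmoid(z)) * X
    return ds.sum(axis=0)

def neg_log_likelihood(X, Y, beta):
    # Hypothetical helper: the loss that gradient() differentiates.
    z = np.dot(X, beta.T) * Y
    return np.logaddexp(0.0, -z).sum()

# Assumed toy data: two features, labels given by a fixed linear rule.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
labels01 = (features[:, 0] + 0.5 * features[:, 1] > 0).astype(int)

# Same preprocessing as in train(): intercept column and -1/+1 labels.
X_toy = np.concatenate((np.ones((200, 1)), features), axis=1)
Y_toy = (2 * labels01 - 1).reshape(-1, 1)

beta = rng.normal(0.0, 1.0, (1, 3))
rate = 0.001
losses = []
for epoch in range(2000):
    losses.append(neg_log_likelihood(X_toy, Y_toy, beta))
    beta = beta - rate * gradient(X_toy, Y_toy, beta)

# Training accuracy with the default 0.5 cutoff.
predicted = (sigmoid(np.dot(X_toy, beta.T)) > 0.5).astype(int).ravel()
train_accuracy = (predicted == labels01).mean()
print(losses[0], losses[-1], train_accuracy)
```

Because `gradient` sums over observations rather than averaging, the effective step size grows with the sample size, which is one reason a small learning rate such as 0.001 is a sensible choice here.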

#### Sample run:

In [16]:
# Set the hyperparameter.
learning_rate = 0.001

In [17]:
# Train and predict.
LR = LogisticRegression(learning_rate)
LR.train(X_train, Y_train, 2000)
Y_pred = LR.query(X_test,prob=False,cutoff=0.5)

In [18]:
acc = (Y_pred == Y_test.reshape(-1,1)).mean()
print('Accuracy : {}'.format(np.round(acc,3)))

Accuracy : 0.912
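
The `reshape(-1,1)` in the accuracy cell matters because `query` returns a column of shape (n, 1) while `Y_test` has shape (n,). Comparing the two directly would broadcast to an (n, n) matrix and the mean would no longer be an accuracy. The following sketch uses small made-up arrays (not the quiz data) to show the difference.

```python
import numpy as np

# Made-up values for illustration only.
y_pred = np.array([[1], [0], [1], [1]])   # shape (4, 1), like the output of query()
y_true = np.array([1, 0, 0, 1])           # shape (4,), like Y_test

# Without the reshape, NumPy broadcasts (4, 1) against (4,) to (4, 4).
wrong = (y_pred == y_true)
# With the reshape, the comparison is element by element, shape (4, 1).
right = (y_pred == y_true.reshape(-1, 1))

wrong_shape = wrong.shape
right_shape = right.shape
accuracy = right.mean()
print(wrong_shape, right_shape, accuracy)  # (4, 4) (4, 1) 0.75
```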
