# Logistic Regression with Newton's Method

This is a response to Siraj Raval's [Coding Challenge](https://github.com/llSourcell/Second_Order_Optimization_Newtons_Method).
In this notebook, we build a logistic regression model and fit it with Newton's method, a second-order optimization method, instead of the usual gradient descent.

We will use the Kaggle [Breast Cancer Wisconsin](https://www.kaggle.com/uciml/breast-cancer-wisconsin-data) data set to classify whether a tumour is malignant or benign, based on 30 features such as the mean radius.

References:  
[Logistic Regression Newton's Method](https://github.com/llSourcell/logistic_regression_newtons_method) by Siraj Raval  
[CS229 - Logistic Regression](http://cs229.stanford.edu/notes/cs229-notes1.pdf) by Andrew Ng

In [3]:
## Dependencies
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Data Preparation
We begin with some simple data cleaning and preparation before delving into the formulas for logistic regression and Newton's method.

In [4]:
data = pd.read_csv('data.csv')

## Dropping unused fields
fields_to_drop = ['id', 'Unnamed: 32'] 
data = data.drop(fields_to_drop, axis=1)

## Converting diagnosis to int: 1 for malignant, 0 for benign
d = {'M': 1, 'B': 0}
data['diagnosis'] = data['diagnosis'].map(d)

## Visualising the data set
data.head()

Unnamed: 0,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,1,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,1,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,1,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,1,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


## Splitting Data into Train and Test Sets

In [5]:
## Using 10% of dataset for testing
split_idx = int(data.shape[0]*0.9) 
test_data = data[split_idx:]
data = data[:split_idx]

## Separating data to features and targets
train_Y, train_X = data['diagnosis'], data.drop('diagnosis', axis=1)
test_Y, test_X = test_data['diagnosis'], test_data.drop('diagnosis', axis=1)
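The split above takes the last 10% of rows in file order. If the row order of the file is not random, a shuffled split may give a more representative test set. The following is a minimal sketch of that idea on a toy frame; the values in `toy` and the names `toy_train` and `toy_test` are made up for illustration, and shuffling would change the accuracies reported later in this notebook.

```python
import pandas as pd

# Toy frame standing in for the real data (hypothetical values)
toy = pd.DataFrame({
    'diagnosis': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    'radius_mean': [20.1, 19.5, 18.2, 17.9, 21.0, 11.2, 12.5, 10.9, 13.1, 12.0],
})

# Shuffle the rows reproducibly, then keep the last 10% for testing
shuffled = toy.sample(frac=1, random_state=0).reset_index(drop=True)
split_idx = int(shuffled.shape[0] * 0.9)
toy_train, toy_test = shuffled[:split_idx], shuffled[split_idx:]

print(len(toy_train), len(toy_test))
```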

In [9]:
train_X.head()

Unnamed: 0,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,symmetry_mean,fractal_dimension_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


## Logistic Regression
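Logistic regression is a machine learning method for classification problems. It passes a linear combination of the input features through the logistic (sigmoid) function to obtain a probability between 0 and 1, which is then thresholded to output a discrete class. For example, given the tumour size as input, the model can classify the tumour as malignant (1) or benign (0).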

In [6]:
def sigmoid(x):
    return 1/(1+np.exp(-x))
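As a quick illustration of how the sigmoid turns a real-valued score into a class label, the sketch below applies it to a few made-up log-odds values and thresholds the result at 0.5. The values in `z` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical log-odds values, chosen only for illustration
z = np.array([-4.0, 0.0, 4.0])

probs = sigmoid(z)                   # probabilities strictly between 0 and 1
labels = (probs > 0.5).astype(int)   # 1 = malignant, 0 = benign

print(probs)
print(labels)
```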

## Newton's Method
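Newton's method is a second-order optimization algorithm: it uses the Hessian (the matrix of second derivatives) in addition to the gradient. This can help us find the best weights for our logistic function in fewer iterations than batch gradient descent, although each iteration is more expensive because a linear system involving the Hessian must be solved.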
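For reference, the update implemented in `newton_step` can be written as follows, assuming no intercept term and no regularisation (as in this notebook). Here $X$ is the feature matrix, $y$ the vector of 0/1 labels, $\beta$ the weights and $\sigma$ the sigmoid; $X^\top W X$ is the negative of the Hessian of the log-likelihood.

$$
p = \sigma(X\beta), \qquad W = \mathrm{diag}\big(p_i (1 - p_i)\big), \qquad \nabla \ell(\beta) = X^\top (y - p)
$$

$$
\beta_{\text{new}} = \beta + \left(X^\top W X\right)^{-1} X^\top (y - p)
$$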

In [78]:
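def newton_step(curr, y, X, lam=None):
    ## lam is accepted but not used: no regularisation is applied
    ## Predicted probabilities as an (n, 1) column vector
    p = np.array(sigmoid(X.dot(curr[:,0])), ndmin=2).T
    ## Diagonal weight matrix W with entries p*(1-p)
    W = np.diag((p*(1-p))[:,0])
    ## Negative Hessian of the log-likelihood, X^T W X
    hessian = X.T.dot(W).dot(X)
    ## Gradient of the log-likelihood
    grad = X.T.dot(y-p)
    
    ## Newton update: beta = curr + (X^T W X)^-1 grad
    step = np.dot(np.linalg.inv(hessian), grad)
    beta = curr + step
    
    return beta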
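To see how quickly the iteration can converge, here is a minimal sketch on synthetic data, not on the breast cancer data set. It is a variant of the step above rather than the notebook's own function: `newton_step_solve`, `true_beta` and the generated data are hypothetical names and values, and the sketch uses `np.linalg.solve` instead of an explicit inverse and avoids building the full n x n matrix `W`.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def newton_step_solve(beta, y, X):
    """One Newton update for unregularised logistic regression (NumPy arrays)."""
    p = sigmoid(X @ beta)        # predicted probabilities, shape (n, 1)
    w = p * (1 - p)              # diagonal entries of W, shape (n, 1)
    hessian = X.T @ (w * X)      # X^T W X without forming the n x n matrix W
    grad = X.T @ (y - p)         # gradient of the log-likelihood
    # Solve the linear system instead of inverting the Hessian explicitly
    return beta + np.linalg.solve(hessian, grad), np.linalg.norm(grad)

# Synthetic, noisy (non-separable) data, for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_beta = np.array([[1.0], [-2.0]])
y = (rng.random((200, 1)) < sigmoid(X @ true_beta)).astype(float)

beta = np.zeros((2, 1))
grad_norms = []
for _ in range(10):
    beta, g = newton_step_solve(beta, y, X)
    grad_norms.append(g)     # gradient norm measured before each update

print(grad_norms)
```

The printed gradient norms should shrink rapidly once the iterates are close to the optimum, which is the behaviour that motivates using Newton's method here.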

In [93]:
def test_model(X, y, beta):
    ## Working with plain arrays avoids pandas label alignment in the comparison
    logodds = np.asarray(X.dot(beta))
    odds = np.exp(logodds)
    
    ## Converting odds to prediction, >1 = True, <1 = False
    pred = odds > 1
    accuracy = np.mean(pred == np.asarray(y)) * 100
    print('Test Accuracy: {}%'.format(accuracy))
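The decision rule in `test_model` thresholds the odds at 1. As a small sanity check, the sketch below shows that this gives the same predictions as thresholding the log-odds at 0 or the probability at 0.5. The values in `logodds` are made up for illustration.

```python
import numpy as np

# Hypothetical log-odds values, for illustration only
logodds = np.array([[-2.0], [-0.1], [0.3], [5.0]])

odds = np.exp(logodds)
prob = 1 / (1 + np.exp(-logodds))

by_odds = odds > 1         # rule used in test_model
by_logodds = logodds > 0   # same rule without computing the exponential
by_prob = prob > 0.5       # same rule expressed on probabilities

print(by_odds.ravel(), by_logodds.ravel(), by_prob.ravel())
```

Comparing the log-odds with 0 directly skips the call to `np.exp`, which can overflow for very large log-odds.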

In [97]:
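## beta_old starts as a placeholder of ones, so the accuracy printed at
## iteration 0 is for all-ones weights; iteration k (k >= 1) reports the
## weights after k-1 Newton steps
beta_old, beta = np.ones((30,1)), np.zeros((30,1))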
iter_count = 0

while iter_count < 20:
    print('Iteration: {}'.format(iter_count))
    test_model(test_X, test_Y.to_frame(), beta_old)
    beta_old = beta
    beta = newton_step(beta, train_Y.to_frame(), train_X, None)
    iter_count += 1
    
    
print('Iterations: {}'.format(iter_count))
#print('Beta: {}'.format(beta))
test_model(test_X, test_Y.to_frame(), beta)

Iteration: 0
Test Accuracy: 24.561403508771928%
Iteration: 1
Test Accuracy: 75.43859649122807%
Iteration: 2
Test Accuracy: 96.49122807017544%
Iteration: 3
Test Accuracy: 96.49122807017544%
Iteration: 4
Test Accuracy: 96.49122807017544%
Iteration: 5
Test Accuracy: 96.49122807017544%
Iteration: 6
Test Accuracy: 96.49122807017544%
Iteration: 7
Test Accuracy: 98.24561403508771%
Iteration: 8
Test Accuracy: 100.0%
Iteration: 9
Test Accuracy: 100.0%
Iteration: 10
Test Accuracy: 98.24561403508771%
Iteration: 11
Test Accuracy: 94.73684210526315%
Iteration: 12
Test Accuracy: 94.73684210526315%
Iteration: 13
Test Accuracy: 94.73684210526315%
Iteration: 14
Test Accuracy: 94.73684210526315%
Iteration: 15
Test Accuracy: 94.73684210526315%
Iteration: 16
Test Accuracy: 94.73684210526315%
Iteration: 17
Test Accuracy: 94.73684210526315%
Iteration: 18
Test Accuracy: 94.73684210526315%
Iteration: 19
Test Accuracy: 94.73684210526315%
Iterations: 20
Test Accuracy: 94.73684210526315%


