## LOGISTIC REGRESSION FROM SCRATCH

This notebook implements logistic regression from scratch on students' exam scores and predicts whether a particular student will be admitted.

The data comes from a file named 'ex2data1.txt'. Its first two columns hold each student's marks in two exams, and its third column holds the admission verdict as 0 or 1 (0 = failure, 1 = success).

In [3]:
# importing the required libraries
%matplotlib notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.optimize as op # scipy minimizes the cost for us (TNC is a truncated Newton method, not plain gradient descent)

In [4]:
# function to add a column of ones to the matrix
def add_ones(X):
    m = len(X)
    X = np.c_[np.ones(m), X]
    return X

In [5]:
# function to perform feature scaling
def scaling(arr):
    arr = arr.astype('float64')
    n = len(arr[0])
    mu = np.mean(arr, axis=0).reshape(1,n)
    sigma = np.std(arr, axis=0).reshape(1,n)
    for x in range(n):
        arr[:,x]-= mu[0,x]
        arr[:,x]/= sigma[0,x]
    return (mu,sigma,arr)
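The column loop in `scaling` can also be written with NumPy broadcasting. The following is only a minimal sketch: `scale_features` and the sample marks are hypothetical and are not part of the original notebook. It also illustrates why `mu` and `sigma` are returned, since a new sample has to be scaled with the statistics of the training data.

```python
import numpy as np

def scale_features(arr):
    """Vectorized sketch: standardize each column to zero mean and unit std."""
    arr = arr.astype('float64')
    mu = arr.mean(axis=0, keepdims=True)     # shape (1, n)
    sigma = arr.std(axis=0, keepdims=True)   # shape (1, n), population std as in np.std
    return mu, sigma, (arr - mu) / sigma

# Made-up marks for illustration only (not taken from ex2data1.txt)
marks = np.array([[30.0, 40.0], [50.0, 80.0], [70.0, 60.0], [90.0, 100.0]])
mu, sigma, scaled = scale_features(marks)

# A new sample is scaled with the training mu and sigma, not its own
new_student = (np.array([[45.0, 85.0]]) - mu) / sigma
```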

In [6]:
# function to calculate sigmoid of a value
def sigmoid(z):
    sig = 1/(1+np.exp(-z))
    return sig
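For very negative inputs, `np.exp(-z)` overflows to `inf` and NumPy emits a warning, although the result still rounds to 0. As a hedged alternative, SciPy ships `scipy.special.expit`, which computes the same logistic function. The sketch below only checks that the two agree on a few moderate values; it is not part of the original notebook.

```python
import numpy as np
from scipy.special import expit  # SciPy's logistic sigmoid

def sigmoid(z):
    # Same formula as the notebook's sigmoid
    return 1 / (1 + np.exp(-z))

z = np.array([-5.0, 0.0, 5.0])
same = np.allclose(sigmoid(z), expit(z))  # both implementations agree here
mid_value = sigmoid(np.array([0.0]))[0]   # sigmoid(0) is exactly 0.5
```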

In [7]:
# calculates the loss function for a given theta vector
def cost_func(t, X, y):
    m,n = X.shape
    t = t.reshape((n,1))
    y = y.reshape((m,1))
    hypo = sigmoid(np.dot(X,t))  # calculating hypothesis
    cost = np.multiply(y,np.log(hypo)) + np.multiply(1-y,np.log(1-hypo))  # element wise cost calculation
    J = (-1/m) * np.sum(cost)  # total cost summation
    return J

In [8]:
# calculating values of gradients for all theta values simultaneously
def gradient(t,X,y):
    m,n = X.shape
    t = t.reshape((n,1))
    y = y.reshape((m,1))
    temp = sigmoid(np.dot(X,t)) - y
    grad = np.dot(X.T, temp) / m
    return grad
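For reference, `cost_func` and `gradient` correspond to the standard logistic regression formulas. They are written here under the assumption that $X$ already contains the leading column of ones, $m$ is the number of rows, and $\theta$ is the vector passed as `t`:

$$
h_\theta(x) = \sigma(\theta^{T} x) = \frac{1}{1 + e^{-\theta^{T} x}}
$$

$$
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\big(x^{(i)}\big) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta\big(x^{(i)}\big)\big) \right]
$$

$$
\nabla_\theta J(\theta) = \frac{1}{m} X^{T} \big( \sigma(X\theta) - y \big)
$$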

In [9]:
df = pd.read_csv('ex2data1.txt', names = ['exam1', 'exam2', 'status'])
X = df[['exam1','exam2']].values
y = df['status'].values.reshape(len(X),1)
X = add_ones(X)
theta = np.zeros((X.shape[1],1))
print(cost_func(theta, X, y))
print(gradient(theta,X,y))

0.6931471805599453
[[ -0.1       ]
 [-12.00921659]
 [-11.26284221]]
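The cost printed above equals ln 2 ≈ 0.6931, which is expected: with theta set to zero every hypothesis value is 0.5, so each sample contributes -log(0.5). A second sanity check is to compare the analytic gradient with a finite-difference estimate. The sketch below does this on synthetic data (made up for illustration, not from 'ex2data1.txt'), with 1-D arrays and hypothetical helper names.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(t, X, y):
    h = sigmoid(X @ t)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(t, X, y):
    return X.T @ (sigmoid(X @ t) - y) / len(y)

def numerical_grad(t, X, y, eps=1e-6):
    """Central-difference estimate of the gradient, one coordinate at a time."""
    g = np.zeros_like(t)
    for j in range(len(t)):
        step = np.zeros_like(t)
        step[j] = eps
        g[j] = (cost(t + step, X, y) - cost(t - step, X, y)) / (2 * eps)
    return g

# Synthetic data: 20 samples, intercept column plus two random features
rng = np.random.default_rng(0)
X = np.c_[np.ones(20), rng.normal(size=(20, 2))]
y = (rng.random(20) > 0.5).astype(float)
t = rng.normal(size=3) * 0.1

max_diff = np.max(np.abs(grad(t, X, y) - numerical_grad(t, X, y)))
cost_at_zero = cost(np.zeros(3), X, y)  # should equal ln 2 for any labels
```

A small `max_diff` suggests the analytic gradient matches the cost function.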


In [11]:
initial_theta = np.zeros(len(theta))
Result = op.minimize(fun=cost_func,
                     x0=initial_theta,
                     args=(X, y),
                     method='TNC',
                     jac=gradient)
theta = Result.x
theta

array([-25.16131853,   0.20623159,   0.20147149])
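The fitted theta defines a linear decision boundary: the predicted probability is 0.5 wherever theta0 + theta1 * exam1 + theta2 * exam2 = 0. As a rough sketch (the helper `boundary_exam2` is hypothetical, and theta is copied from the output above), the exam 2 score on the boundary for a given exam 1 score can be computed like this:

```python
import numpy as np

# Values copied from the optimizer output printed above
theta = np.array([-25.16131853, 0.20623159, 0.20147149])

def boundary_exam2(exam1, theta):
    """Exam 2 score at which the predicted probability is exactly 0.5."""
    return -(theta[0] + theta[1] * exam1) / theta[2]

exam1 = 45.0
exam2_needed = boundary_exam2(exam1, theta)

# Check: a point on the boundary should give a probability of 0.5
z = theta @ np.array([1.0, exam1, exam2_needed])
prob_on_boundary = 1 / (1 + np.exp(-z))
```

With these values, a student scoring 45 in the first exam sits on the boundary at roughly 78.8 in the second exam.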

Now let's compute the probability of admission for a student who scored 45 and 85 in the two exams.

In [15]:
# vector containing the two scores, with a leading 1 for the intercept term
test = [1,45,85]

In [17]:
probability = sigmoid(np.dot(test,theta))
probability

0.7762906220893772

In [21]:
print(f"So there is a {probability*100:.2f}% chance of the student getting admission.")

So there is a 77.63% chance of the student getting admission.
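To turn a probability into a 0/1 verdict, a common choice is to threshold at 0.5. The following is a minimal sketch under that assumption; `predict` is a hypothetical helper, theta is copied from the optimizer output above, and the second student's scores are made up for illustration.

```python
import numpy as np

# Fitted values copied from the optimizer output reported above
theta = np.array([-25.16131853, 0.20623159, 0.20147149])

def predict(scores, theta, threshold=0.5):
    """Return admission probabilities and 0/1 labels for rows of (exam1, exam2)."""
    scores = np.atleast_2d(scores)
    X = np.c_[np.ones(len(scores)), scores]   # add the intercept column
    prob = 1 / (1 + np.exp(-(X @ theta)))
    return prob, (prob >= threshold).astype(int)

# First row is the student from the text; the second row is a made-up example
prob, label = predict([[45, 85], [45, 60]], theta)
```

Here `label[0]` is 1 for the student with scores 45 and 85, matching the probability above 0.5 computed earlier.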
