# Logistic Regression

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Logistic-Regression" data-toc-modified-id="Logistic-Regression-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Logistic Regression</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#Cross-Entropy-Loss" data-toc-modified-id="Cross-Entropy-Loss-1.0.1"><span class="toc-item-num">1.0.1&nbsp;&nbsp;</span>Cross Entropy Loss</a></span></li></ul></li></ul></li><li><span><a href="#Scikit-learn-implementation-of-Logistic-Regression" data-toc-modified-id="Scikit-learn-implementation-of-Logistic-Regression-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Scikit-learn implementation of Logistic Regression</a></span></li></ul></div>

In [3]:
from IPython.display import IFrame
IFrame('https://www.youtube.com/embed/yIYKR4sgzI8',560,315)

The cost function for linear regression: $J_{\theta} = \frac{1}{m}\sum_{1}^{m}\cdot\frac{1}{2}\cdot(h_{\theta}(x^{(i)}) - y^{(i)})^2$

The new predictor: $h_{\theta}(x^{(i)}) = \frac{1}{1 + e^{-\theta^T{x(i)}}}$

The problem is with the new predictor $J_{\theta}$ is no longer convex and thus we can't optimize it.

### Cross Entropy Loss

$$cost(h_{\theta}(x),y)= \left\{   \begin{array}{lr}        -log(h_{\theta}(x)) & y= 1 \\       -log(1 - h_{\theta}(x)) & y = 0       \end{array}\right.$$

We can solve this with gradient descent.

$$\theta_{j} := \theta_{j} - \alpha\frac{\partial{J}}{\partial\theta_{j}}J_{\theta}$$

Where the partial is $$\frac{\partial{J}}{\partial\theta_{j}} = \frac{1}{m}\sum_{i=1}^m(h_{\theta}(x^i) - y^i)*x^i_j$$

Using this we can implement logistic regression using numpy as follows

In [15]:
import pandas as pd
import numpy as np

In [16]:
data = pd.read_csv("scaled.csv")
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(
    data.drop("species", axis=1), data.species)

In [17]:
def cross_entropy_loss(y_pred,target):
    return -np.mean((target*np.log1p(y_pred)+(1-target)*np.log1p(1-y_pred)))

In [18]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

In [66]:
theta = np.zeros(X_train.shape[1])
print("Shape of Theta is",theta.shape)
b = np.random.uniform(0,1,1)
print("Shape of Bias is",b.shape)

lr = 0.1
for _ in range(10000):
    prediction = sigmoid(X_train @ theta + b)
    partial = (prediction - y_train) @ X_train
    theta = theta - lr * partial / len(X_train)
    b = b - lr * np.average((prediction - y_train))
    if _ % 5000 == 0:
        loss = cross_entropy_loss(prediction,y_train)
        print(loss)

Shape of Theta is (5,)
Shape of Bias is (1,)
-0.4030869648865438


  result = getattr(ufunc, method)(*inputs, **kwargs)


-0.6890725364375996


In [67]:
theta.values

array([-0.51582478,  6.97084089, 11.92544902, -2.16618833, -4.0722729 ])

In [68]:
b

array([14.17139349])

In [69]:
from sklearn.metrics import log_loss

preds = sigmoid(X_test @ theta + b)
pred = preds.round()
log_loss(y_test,preds)
print(classification_report(y_test,pred))

# Scikit-learn implementation of Logistic Regression

In [31]:
from sklearn.linear_model import LogisticRegression

In [32]:
clf = LogisticRegression()
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
print(classification_report(y_test,preds))


              precision    recall  f1-score   support

           0       1.00      1.00      1.00        27
           1       1.00      1.00      1.00        11

    accuracy                           1.00        38
   macro avg       1.00      1.00      1.00        38
weighted avg       1.00      1.00      1.00        38



In [33]:
clf.coef_

array([[-0.97314805, -0.04223688,  0.00869855, -0.08623715, -0.0964776 ]])

In [34]:
cross_entropy_loss(preds,y_test)

-0.6931471805599451

In [35]:
log_loss(y_test,preds)

9.992007221626415e-16