# Logistic Regression Implementation


## 1. Information
Logistic regression is a machine learning algorithm for binary classification problems.

Logistic regression is similar to linear regression: predictions still start from a line equation. The result is then passed through a sigmoid activation function, which converts real values to probabilities.

The probability is the model's estimate that an instance belongs to the positive class. These probabilities are then converted into class labels by comparing them with a threshold value.
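
As a small illustration of the thresholding step, the sketch below converts a handful of probabilities into class labels. The probability values and the helper name `to_classes` are made up for this example; the strict `>` comparison mirrors the `predict` method implemented later in this notebook.

```python
import numpy as np

# Hypothetical predicted probabilities for five instances (illustrative values only).
probabilities = np.array([0.05, 0.40, 0.50, 0.73, 0.99])

def to_classes(probabilities, threshold=0.5):
    """Label an instance 1 when its probability is strictly above the threshold."""
    return (probabilities > threshold).astype(int)

default_classes = to_classes(probabilities)       # threshold = 0.5
strict_classes = to_classes(probabilities, 0.8)   # a stricter, assumed threshold

print(default_classes)  # [0 0 0 1 1]
print(strict_classes)   # [0 0 0 0 1]
```

Raising the threshold makes the model more conservative about predicting the positive class, which trades recall for precision.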

#### 1.2 The math
We start from a line equation:
$$z=wx+b$$
The output of the line equation is passed through a sigmoid (logistic) function to obtain the predicted probability:
$$\hat{y}=S(z)=\frac{1}{1+e^{-z}}$$
The sigmoid function takes any real value and maps it to a value between zero and one, which can be read as a probability.
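
To make the mapping concrete, here is a minimal sketch that evaluates the sigmoid on a few inputs. The input values are arbitrary and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic function: maps any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

values = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
probs = sigmoid(values)

# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of 0 maps to exactly 0.5.
print(np.round(probs, 4))
```

The function is symmetric around zero: `sigmoid(-x)` equals `1 - sigmoid(x)`, so the decision threshold of 0.5 corresponds to the line equation evaluating to zero.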

As a cost function, we’ll use binary cross entropy (BCE), shown in the following formula:
$$BCE = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i\log\hat{y}_i+(1-y_i)\log(1-\hat{y}_i)\right]$$
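
The following sketch evaluates binary cross entropy in its standard form, where the second term is weighted by one minus the label, on three sets of made-up predictions. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions for this example, not part of the notebook's class.

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-15):
    """Mean BCE; predictions are clipped so that log(0) never occurs."""
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1, 0, 1, 0])

# Illustrative predictions: mostly right, mostly wrong, and uninformative.
loss_right = binary_cross_entropy(y, np.array([0.9, 0.1, 0.8, 0.2]))
loss_wrong = binary_cross_entropy(y, np.array([0.1, 0.9, 0.2, 0.8]))
loss_uninformative = binary_cross_entropy(y, np.full(4, 0.5))

print(loss_right, loss_uninformative, loss_wrong)
```

Confident correct predictions give a loss close to zero, predicting 0.5 everywhere gives a loss of ln 2 (about 0.693), and confident wrong predictions are penalized heavily.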

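To update the weights and bias iteratively, the optimization process needs the gradients of this cost function with respect to each of them:
$$\partial_w = \frac{1}{n}\sum_{i=1}^{n}x_i(\hat{y}_i-y_i)$$
$$\partial_b = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i-y_i)$$

The implementation in section 1.3 multiplies both gradients by an extra constant factor of 2. This does not change the direction of the update; it is equivalent to doubling the learning rate.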
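
One way to check a gradient formula is to compare it with finite differences of the cost function. The sketch below does this for the mean binary cross entropy with a sigmoid output, whose analytic gradient is the average of x_i(ŷ_i − y_i) for the weights and the average of (ŷ_i − y_i) for the bias, with no additional constant factor. The random data, the seed, and all function names are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(w, b, X, y):
    y_hat = sigmoid(X @ w + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def analytic_gradients(w, b, X, y):
    """Gradients of the mean BCE with respect to w and b."""
    error = sigmoid(X @ w + b) - y
    return X.T @ error / len(y), np.mean(error)

def numeric_gradients(w, b, X, y, h=1e-6):
    """Central finite differences, one parameter at a time."""
    grad_w = np.zeros_like(w)
    for j in range(len(w)):
        step = np.zeros_like(w)
        step[j] = h
        grad_w[j] = (bce(w + step, b, X, y) - bce(w - step, b, X, y)) / (2 * h)
    grad_b = (bce(w, b + h, X, y) - bce(w, b - h, X, y)) / (2 * h)
    return grad_w, grad_b

# Small random problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.3

gw_a, gb_a = analytic_gradients(w, b, X, y)
gw_n, gb_n = numeric_gradients(w, b, X, y)
print(np.max(np.abs(gw_a - gw_n)), abs(gb_a - gb_n))  # both should be tiny
```

If a constant factor such as 2 is added to the analytic gradients, the check fails, although gradient descent still converges because the factor only rescales the step size.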

Gradient descent update rules, where α is the learning rate:
$$w=w-\alpha \partial_w$$
$$b=b-\alpha \partial_b$$
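
The update rules can be sketched on a tiny one-dimensional problem. The data, the learning rate of 0.1, and the 100 iterations below are arbitrary choices for illustration, and the gradients use the plain mean of the prediction error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, y_hat):
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Toy data: larger x values belong to class 1 (made-up values).
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

w, b, alpha = 0.0, 0.0, 0.1
losses = []
for _ in range(100):
    y_hat = sigmoid(w * x + b)
    losses.append(bce(y, y_hat))   # loss before this update
    error = y_hat - y
    w -= alpha * np.mean(x * error)  # w = w - alpha * dw
    b -= alpha * np.mean(error)      # b = b - alpha * db

print(losses[0], losses[-1], w, b)
```

The loss starts at ln 2, because all predictions are 0.5 when the parameters are zero, and decreases as the weight grows in the direction that separates the two classes.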

#### 1.3 NumPy Implementation 

In [45]:
import numpy as np

In [46]:
class LogisticRegression:
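    def __init__(self, alpha=0.1, n=1000):
        self.alpha = alpha  # learning rate
        self.n = n          # number of gradient descent iterations
        self.w, self.b = None, None

    def _sigmoid(self, x):
        # Map real values to probabilities between 0 and 1
        return 1 / (1 + np.exp(-x))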
    
    def bce(self, y, y_hat):
        # Clip predictions away from 0 and 1 so the logarithms stay finite
        eps = 1e-15
        y = np.asarray(y)
        y_hat = np.clip(y_hat, eps, 1 - eps)
        return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    
    def fit(self, X, y):
        self.w = np.zeros(X.shape[1])
        self.b = 0

        # Gradient descent
        for _ in range(self.n):
            linear_pred = np.dot(X, self.w) + self.b
            probability = self._sigmoid(linear_pred)

            # Calculate derivatives. The factor of 2 is not part of the BCE
            # gradient; it only rescales the step, as if alpha were doubled.
            partial_w = (1 / X.shape[0]) * (2 * np.dot(X.T, (probability - y)))
            partial_b = (1 / X.shape[0]) * (2 * np.sum(probability - y))

            # Update the coefficients
            self.w -= self.alpha * partial_w
            self.b -= self.alpha * partial_b

    def predict_proba(self, X):
        linear_pred = np.dot(X, self.w) + self.b
        return self._sigmoid(linear_pred)
    
    def predict(self, X, threshold=0.5):
        probabilities = self.predict_proba(X)
        return [1 if i > threshold else 0 for i in probabilities]

In [47]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)
preds = model.predict(X_test)



In [48]:
from sklearn.metrics import accuracy_score, confusion_matrix

print(accuracy_score(y_test, preds))
print(confusion_matrix(y_test, preds))

0.9473684210526315
[[42  1]
 [ 5 66]]
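
The accuracy printed above can be recovered from the confusion matrix, which is a quick sanity check on the two outputs. The sketch below copies the matrix from the output and assumes scikit-learn's layout for binary problems (rows are true labels, columns are predicted labels, so the flattened order is TN, FP, FN, TP). Precision and recall are added only as an illustration of what else the matrix contains.

```python
import numpy as np

# Confusion matrix copied from the output above.
cm = np.array([[42, 1],
               [5, 66]])

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()

tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found

print(accuracy)  # 108 / 114, matching the accuracy_score output
print(precision, recall)
```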


In [49]:
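# Imported under an alias so it does not shadow the NumPy class defined above
from sklearn.linear_model import LogisticRegression as SklearnLogisticRegression

lr_model = SklearnLogisticRegression()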
lr_model.fit(X_train, y_train)
lr_preds = lr_model.predict(X_test)

print(accuracy_score(y_test, lr_preds))
print(confusion_matrix(y_test, lr_preds))

0.956140350877193
[[39  4]
 [ 1 70]]


STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
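
The warning suggests scaling the data. A minimal NumPy sketch of standardization is shown below: each feature is shifted and scaled using statistics computed on the training split only, and the same statistics are reused for the test split. The helper name `standardize` and the synthetic data are assumptions for this example; scikit-learn's `StandardScaler` performs the equivalent transformation.

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale features to zero mean and unit variance using training statistics."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (X_train - mean) / std, (X_test - mean) / std

# Synthetic features on very different scales (illustrative values only).
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=[1000.0, 0.01], scale=[250.0, 0.002], size=(100, 2))
X_test_demo = rng.normal(loc=[1000.0, 0.01], scale=[250.0, 0.002], size=(20, 2))

X_train_scaled, X_test_scaled = standardize(X_train_demo, X_test_demo)

print(X_train_scaled.mean(axis=0).round(6))  # approximately 0 for each feature
print(X_train_scaled.std(axis=0).round(6))   # approximately 1 for each feature
```

With features on a comparable scale, the linear term stays in a moderate range, which tends to help both gradient descent in the NumPy class and the solver used by scikit-learn.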
