# **Logistic Regression**

## Theory

**What is logistic regression?**

Logistic regression models the probability of a dependent variable belonging to a particular class. It’s primarily used for binary classification, where the output is either 0 or 1.

The algorithm uses the sigmoid function to map any real-valued input to the range (0, 1). The output of the sigmoid function is interpreted as the probability of the positive class. The decision boundary is defined by selecting a threshold, typically 0.5.
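
As a minimal sketch of the idea above, the snippet below applies the sigmoid to a few made-up scores and then thresholds the result at 0.5. The `scores` values and the standalone `sigmoid` helper are illustrative assumptions, not part of the implementation later in this notebook.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear scores for five samples (illustrative values only).
scores = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(scores)

# Apply the usual 0.5 threshold to turn probabilities into class labels.
labels = (probs >= 0.5).astype(int)

print(np.round(probs, 3))
print(labels)  # [0 0 1 1 1]
```

A score of exactly 0 maps to a probability of 0.5, so whether that sample is labelled 0 or 1 depends on how the comparison is written (`>=` here).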

**Advantages of Logistic Regression**
* Simple and Interpretable: Logistic regression is easy to implement and provides clear interpretability of coefficients, showing the relationship between features and the target variable.
* Efficient: It works well when the classes are roughly linearly separable and is computationally cheaper than more complex models.
* Probabilistic Output: Provides probabilities as outputs, which can be useful for understanding uncertainty.
* Regularization: Techniques like L1 (Lasso) and L2 (Ridge) regularization can prevent overfitting by penalizing large coefficients.
* Baseline Model: It often serves as a strong baseline in machine learning tasks due to its simplicity and efficiency.
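
To make the regularization point concrete, here is a rough sketch of gradient descent with an L2 (Ridge) penalty added to the weight gradient. The function name `fit_logistic`, the parameter name `lam`, and the toy data are assumptions made for illustration; L1 (Lasso) is not shown.

```python
import numpy as np

def fit_logistic(X, y, lam=0.0, alpha=0.1, n_iter=500):
    """Gradient descent for logistic regression with an optional L2 penalty.

    lam is a hypothetical name for the penalty strength; lam=0 disables it.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        # Gradient of the mean log loss plus the gradient of (lam / 2) * ||w||^2.
        dw = X.T @ (p - y) / n_samples + lam * w
        db = np.mean(p - y)  # the bias is conventionally left unpenalized
        w -= alpha * dw
        b -= alpha * db
    return w, b

# Tiny, perfectly separable toy data (illustrative only).
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w_plain, _ = fit_logistic(X, y, lam=0.0)
w_l2, _ = fit_logistic(X, y, lam=0.5)

print("no penalty:", w_plain)
print("L2 penalty:", w_l2)
```

On this toy data the penalized weight should end up smaller than the unpenalized one, which is the shrinkage effect the bullet describes.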

**Disadvantages of Logistic Regression**
* Linear Decision Boundary: It assumes a linear relationship between the features and the log-odds of the target, which may not hold for complex datasets.
* Outlier Sensitivity: Logistic regression is sensitive to outliers, which can skew the model.
* Multicollinearity: High correlation among independent variables makes coefficient estimates unstable and reduces model interpretability.
* Not Suitable for Complex Relationships: For non-linear or intricate patterns in data, logistic regression fails to capture underlying trends without significant feature engineering.
* Data Imbalance: It struggles with highly imbalanced datasets, often predicting the majority class.
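
The linear-boundary limitation can be illustrated with XOR-style data, which no straight line separates. The sketch below is an assumption-laden toy: `fit_predict` is a hypothetical helper, and the four data points are made up. It compares the raw features against the same features plus an engineered product term.

```python
import numpy as np

def fit_predict(X, y, alpha=0.5, n_iter=200):
    """Train logistic regression by gradient descent and return training predictions."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= alpha * (X.T @ (p - y)) / len(y)
        b -= alpha * np.mean(p - y)
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return (p >= 0.5).astype(int)

# XOR-style toy data: the class is 1 only when the two features have opposite signs.
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])

acc_linear = np.mean(fit_predict(X, y) == y)

# Add the product x1 * x2 as an engineered feature.
X_eng = np.column_stack([X, X[:, 0] * X[:, 1]])
acc_engineered = np.mean(fit_predict(X_eng, y) == y)

print(acc_linear, acc_engineered)
```

With only the raw features the model cannot do better than chance on this data, while the product feature makes the classes linearly separable in the expanded space.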

**Assumptions of Logistic Regression**
* Binary or Multinomial Response Variable: For binary logistic regression, the target variable must have exactly two classes (0 and 1); targets with more than two classes call for multinomial logistic regression.
* Independence of Observations: Each observation must be independent of others.
* Linearity of Log-Odds: The relationship between independent variables and the log-odds of the target variable is assumed to be linear.
* No Multicollinearity: Independent variables should not be highly correlated.
* Large Sample Size: Logistic regression requires a sufficiently large dataset for stable and reliable coefficient estimates.
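
One simple way to screen for the multicollinearity assumption is to look at pairwise feature correlations. The sketch below uses synthetic features and an assumed cutoff of 0.9; both are illustrative choices, and a pairwise check will not catch every form of multicollinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: x2 is almost a copy of x1, x3 is independent.
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.05, size=500)
x3 = rng.normal(size=500)
X = np.column_stack([x1, x2, x3])

# Pairwise Pearson correlations between feature columns.
corr = np.corrcoef(X, rowvar=False)

# Flag feature pairs whose absolute correlation exceeds an assumed cutoff of 0.9.
cutoff = 0.9
n_features = X.shape[1]
pairs = [(i, j)
         for i in range(n_features)
         for j in range(i + 1, n_features)
         if abs(corr[i, j]) > cutoff]

print(pairs)  # [(0, 1)]
```

Flagged pairs are candidates for dropping one feature or combining the two before fitting.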

**Extensions of logistic regression**
1. Multinomial logistic regression
2. Regularized logistic regression
3. Logistic regression with feature engineering
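
For the multinomial extension, the sigmoid is commonly replaced by the softmax function, which turns one score per class into probabilities that sum to 1. The following is a minimal sketch under that assumption; the weights `W`, bias `b`, and inputs `X` are made-up values rather than a trained model.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Hypothetical weights for 2 features and 3 classes (illustrative values only).
W = np.array([[1.0, -1.0, 0.0],
              [0.5, 0.5, -1.0]])  # shape (n_features, n_classes)
b = np.zeros(3)

X = np.array([[2.0, 1.0],
              [-1.0, 0.0]])

probs = softmax(X @ W + b)        # one probability per class, per sample
predicted = probs.argmax(axis=1)  # most probable class for each sample

print(probs.sum(axis=1))
print(predicted)  # [0 1]
```

With two classes, softmax reduces to the sigmoid applied to the difference of the two scores, so the binary model is a special case.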

**Important topics in logistic regression**

* Linear equation <br>
z = WX + b

* Sigmoid function <br>
Converts the linear output z into a probability p between 0 and 1: p = 1 / (1 + e^(-z))

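* Cost function (loss function) <br>
Formula: -(y * ln(p) + (1 - y) * ln(1 - p)) <br>
It measures how closely the predicted probability p matches the true class y. The cost J(w,b) is the average of this loss over all training samples.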

* Gradient Descent <br>
An algorithm used to minimize the cost function, min J(w,b), by finding suitable values of w and b.<br>
The gradient tells us how the loss function changes with respect to the parameters (w, b). Each parameter is updated by taking a step in the direction opposite to its gradient.



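* Learning rate <br>
Controls the size of each update step and therefore how quickly the algorithm converges: a value that is too small makes training slow, while a value that is too large can overshoot the minimum.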
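
The topics above can be tied together in a short sketch: compute the cost, compare the analytic gradient with a finite-difference estimate, and see how the learning rate changes the cost reached after a fixed number of steps. The helper names (`cost`, `gradients`, `train`), the toy data, and the chosen learning rates are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """Mean binary cross-entropy J(w, b)."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradients(w, b, X, y):
    """Analytic gradients of J with respect to w and b."""
    error = sigmoid(X @ w + b) - y
    return (X.T @ error) / len(y), np.mean(error)

def train(X, y, alpha, n_iter=100):
    """Plain gradient descent starting from zeros; returns the final cost."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        dw, db = gradients(w, b, X, y)
        w -= alpha * dw
        b -= alpha * db
    return cost(w, b, X, y)

# Toy data (illustrative only).
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# Check the analytic gradient against a central finite difference.
w0, b0, eps = np.array([0.3]), -0.2, 1e-6
dw, db = gradients(w0, b0, X, y)
dw_numeric = (cost(w0 + eps, b0, X, y) - cost(w0 - eps, b0, X, y)) / (2 * eps)
print(dw[0], dw_numeric)

# Compare final costs for a few learning rates after the same number of steps.
for alpha in (0.001, 0.1, 1.0):
    print(alpha, train(X, y, alpha))
```

On this small, well-behaved problem a larger learning rate reaches a lower cost in the same number of steps. On other data a rate that is too large can make the cost oscillate or grow instead.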

## Implementation

Implement logistic regression from scratch

Steps:
* Initialize the weights and bias
* Compute the predicted probability (linear output passed through the sigmoid)
* Calculate the loss
* Gradient descent (update the weights and bias)
* Repeat until convergence

In [75]:
import numpy as np

Implementing logistic regression

In [76]:
class LogesticRegression:
    '''Binary logistic regression trained with batch gradient descent'''

    def __init__(self, n_iter=100, alpha=0.01) -> None:
        self.W = None
        self.b = 0
        self.n_iter = n_iter  # maximum number of gradient descent steps
        self.alpha = alpha  # learning rate
        self.threshold = 0.1  # stop early once the cost drops below this value

    def sigmoid(self, pred):
        '''Convert regression values into probabilities using the sigmoid function'''
        return 1 / (1 + np.exp(-pred))

    def _cost_function(self, pred):
        '''Calculate the mean log loss (how well the predicted probabilities match the actual labels)'''
        pred = np.clip(pred, 1e-15, 1 - 1e-15)  # keep probabilities away from 0 and 1 to avoid log(0)
        return (np.sum(-(self.y * np.log(pred) + ((1 - self.y) * np.log(1 - pred))))) / self.n_sample

    def _gradient_descent(self):
        '''Calculate the gradient of the cost function with respect to W and b, then update both using the learning rate'''
        z = np.dot(self.X, self.W) + self.b  # regression value
        pred = self.sigmoid(z)  # convert regression value into probability
        cost = self._cost_function(pred)  # cost before this update is applied
        dw = np.dot(self.X.T, (pred - self.y)) / self.n_sample  # partial derivative of cost with respect to W
        db = np.mean(pred - self.y)  # partial derivative of cost with respect to b
        self.W -= self.alpha * dw  # update weights using dw
        self.b -= self.alpha * db  # update bias using db
        return cost

    def fit(self, X, y):
        '''Fit the model to the training data and return the last computed cost'''
        self.X = np.asarray(X)
        self.y = np.asarray(y)
        self.n_features = self.X.shape[1]
        self.n_sample = self.X.shape[0]
        self.W = np.zeros(shape=self.n_features)  # initialize weights with zeros
        self.b = 0  # reset the bias so repeated calls to fit start fresh
        for i in range(self.n_iter):
            cost = self._gradient_descent()
            if cost < self.threshold:
                break
        return cost

    def predict(self, test):
        '''Predict the class for new unseen values'''
        p = self.predict_proba(test)
        if not isinstance(p, np.ndarray):
            return 0 if p < 0.5 else 1
        return [0 if i < 0.5 else 1 for i in p]

    def predict_proba(self, test):
        '''Predict the probability of the positive class for new unseen values'''
        z = np.dot(test, self.W) + self.b
        return self.sigmoid(z)

Test the logistic regression model

In [77]:
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [78]:
X, y = make_classification(n_samples=1000, n_features=4, n_informative=2, n_classes=2, random_state=42)
# X = np.random.randn(1000, 1)
# y = (X[:, 0] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [80]:
lr = LogesticRegression()
lr.fit(X_train,y_train)

0.5079247995155056

In [81]:
lr1 = LogisticRegression()
lr1.fit(X_train,y_train)

In [82]:
pred = lr1.predict(X_test)
accuracy_score(y_test, pred)

0.885

In [83]:
pred = lr.predict(X_test)
accuracy_score(y_test, pred)

0.865

In [84]:
def test(model, x):
    print(f"Value of x: {x}")
    print(f"Probability of x is {model.predict_proba(x)}")
    print(f"Class is {model.predict(x)}")

In [88]:
for i in range(5):
    test(lr,[X_test[i]])

Value of x: [array([-2.4766318 ,  0.81176843,  0.46168007,  0.23398858])]
Probability of x is [0.51453084]
Class is [1]
Value of x: [array([-0.41921964, -0.19445516,  0.83584285, -0.36593445])]
Probability of x is [0.59948564]
Class is [1]
Value of x: [array([ 2.08923409, -0.40319141, -1.03239735,  0.14673053])]
Probability of x is [0.40821425]
Class is [0]
Value of x: [array([ 0.30463008,  0.41722807, -1.23735285,  0.6030946 ])]
Probability of x is [0.3522295]
Class is [0]
Value of x: [array([-0.66971816,  0.53869068, -0.60388303,  0.45331218])]
Probability of x is [0.41198324]
Class is [0]


In [87]:
for i in range(5):
    test(lr1,[X_test[i]])

Value of x: [array([-2.4766318 ,  0.81176843,  0.46168007,  0.23398858])]
Probability of x is [[0.17460786 0.82539214]]
Class is [1]
Value of x: [array([-0.41921964, -0.19445516,  0.83584285, -0.36593445])]
Probability of x is [[0.13701538 0.86298462]]
Class is [1]
Value of x: [array([ 2.08923409, -0.40319141, -1.03239735,  0.14673053])]
Probability of x is [[0.90263491 0.09736509]]
Class is [0]
Value of x: [array([ 0.30463008,  0.41722807, -1.23735285,  0.6030946 ])]
Probability of x is [[0.904523 0.095477]]
Class is [0]
Value of x: [array([-0.66971816,  0.53869068, -0.60388303,  0.45331218])]
Probability of x is [[0.69929556 0.30070444]]
Class is [0]
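
The two sets of outputs above differ in shape: the from-scratch `predict_proba` returns only the probability of class 1, while scikit-learn's returns one column per class. As a small sketch, the positive-class probabilities can be stacked into the two-column layout for a side-by-side comparison. The three values below are copied from the first three from-scratch outputs above; the variable names are illustrative.

```python
import numpy as np

# Positive-class probabilities, shaped like the from-scratch output.
p_positive = np.array([0.51453084, 0.59948564, 0.40821425])

# Stack into the two-column layout [P(class 0), P(class 1)] used by scikit-learn.
proba_two_col = np.column_stack([1.0 - p_positive, p_positive])

print(proba_two_col.shape)           # (3, 2)
print(proba_two_col.argmax(axis=1))  # [1 1 0]
```

Taking the `argmax` over the two columns gives the same labels as thresholding the positive-class probability at 0.5, except in the exact tie case.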
