***
$\mathbf{\text{Logistic Regression}}$<br>
***

Logistic Regression is a supervised statistical technique for estimating the probability that a binary dependent variable belongs to a given class. It models the relationship between the dependent variable and the independent variables by predicting the probability of occurrence. A sigmoid function maps the output of a linear model to a probability between 0 and 1, and a threshold (commonly 0.5) then converts that probability into a binary class label used for prediction.

From linear regression, we know that the equation of the linear function is given as:
\begin{align}
f(w,b) = wx + b
\end{align}

However, in logistic regression we do not want unbounded continuous values; we want probabilities. To get them, we apply the sigmoid function to the output of the linear model.

\begin{align}
s(x) = \frac{1}{1 + e^{-(wx+b)}}
\end{align}
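
To make the mapping concrete, the following minimal sketch applies the sigmoid to the output of a one-dimensional linear model. The values of `w`, `b` and `x` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weight, bias and inputs, chosen only for illustration.
w, b = 2.0, -1.0
x = np.array([-3.0, 0.5, 4.0])

z = w * x + b        # linear model output: [-7., 0., 7.]
probs = sigmoid(z)   # every value lies strictly between 0 and 1
print(probs)         # the middle entry is 0.5 because sigmoid(0) = 0.5
```

Large negative scores are pushed towards 0 and large positive scores towards 1, which is what lets the output be read as a probability.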

##### Cost Function

To find the optimal values of the weights (w) and bias (b), we use gradient descent. Gradient descent minimizes a cost function, so let us first define it. In logistic regression, the cost function is the cross-entropy function, given as:

\begin{align}
J(w,b) = J(θ) = -\frac{1}{N} \sum_{i=1}^{N} [y_{i} \log(h_{θ}(x_{i})) + (1 - y_{i}) \log(1 - h_{θ}(x_{i}))]
\end{align}

Here $h_{θ}(x_{i})$ is the predicted probability for sample $i$ (the sigmoid output), $y_{i}$ is its true label, and $N$ is the number of samples.
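
As a small illustration, the cross-entropy cost can be computed as the mean negative log-likelihood of the true labels under the predicted probabilities. This is a sketch under stated assumptions: the function name `cross_entropy`, the clipping constant `eps` (added to avoid `log(0)`) and the toy arrays are not part of the original notebook.

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; eps is an assumed clipping constant to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

# Toy labels and two sets of predicted probabilities (illustrative values).
y_true = np.array([1.0, 0.0, 1.0, 0.0])
confident = np.array([0.9, 0.1, 0.8, 0.2])    # mostly correct predictions
uninformed = np.array([0.5, 0.5, 0.5, 0.5])   # no information about the class

print(cross_entropy(y_true, confident))
print(cross_entropy(y_true, uninformed))      # equals log(2) for all-0.5 predictions
```

Predictions that assign high probability to the correct class give a lower cost than uninformed ones, which is why minimizing this quantity improves the classifier.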

The update rules for w and b are given below, where α is the learning rate and dw and db are the derivatives of J with respect to w and b:
\begin{align}
w = w - α dw
\end{align}
\begin{align}
b = b - α db
\end{align}

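The derivatives of the cross-entropy cost, with $\hat{y}_{i} = h_{θ}(x_{i})$ the predicted probability for sample $i$, are given as:

\begin{align}
\frac{dJ}{dw} = \frac{1}{N} \sum_{i=1}^{N} x_{i}(\hat{y}_{i} - y_{i})
\end{align}

\begin{align}
\frac{dJ}{db} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_{i} - y_{i})
\end{align}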
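
One way to sanity-check the gradient formulas is to compare them with a numerical approximation. The sketch below assumes the cross-entropy gradients $\frac{1}{N} \sum x_{i}(\hat{y}_{i} - y_{i})$ and $\frac{1}{N} \sum (\hat{y}_{i} - y_{i})$, checks them against central finite differences on synthetic data, and then applies one update step. The helper names, the random data, the step size `h` and the learning rate `alpha` are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """Cross-entropy cost J(w, b) averaged over the samples."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def gradients(w, b, X, y):
    """Analytic gradients dJ/dw and dJ/db of the cross-entropy cost."""
    error = sigmoid(X @ w + b) - y
    return X.T @ error / len(y), np.mean(error)

# Synthetic data, generated only for this check.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w, b = rng.normal(size=3), 0.1

dw, db = gradients(w, b, X, y)

# Central finite differences along each weight direction and the bias.
h = 1e-6
dw_num = np.array([
    (cost(w + h * e, b, X, y) - cost(w - h * e, b, X, y)) / (2 * h)
    for e in np.eye(3)
])
db_num = (cost(w, b + h, X, y) - cost(w, b - h, X, y)) / (2 * h)
print(np.max(np.abs(dw - dw_num)), abs(db - db_num))  # both should be tiny

# One gradient descent step with an assumed learning rate.
alpha = 0.1
w_new, b_new = w - alpha * dw, b - alpha * db
print(cost(w, b, X, y), cost(w_new, b_new, X, y))     # cost before and after
```

If the analytic and numerical gradients agree and the cost drops after the step, the update rules are behaving as intended.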

In [1]:
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [2]:
class LogisticRegressions():
    
    def __init__(self, learning_rate = 0.001, n_iters = 1000):
        self.learning_rate = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None
        
    def fit(self, X, y):
        
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        
        # gradient descent
        for _ in range(self.n_iters):
            # implementing the linear function equation
            linear_function = np.dot(X, self.weights) + self.bias
            
            # calculating y predicted by applying the sigmoid function
            y_predicted = self._sigmoid(linear_function)
            
            dw = (1/n_samples) * np.dot(X.T, (y_predicted - y))
            db = (1/n_samples) * np.sum(y_predicted - y)
            
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db
    
    def predict(self, X):
        
        linear_function = np.dot(X, self.weights) + self.bias
        y_predicted = self._sigmoid(linear_function)
        # threshold the predicted probabilities at 0.5 to obtain class labels
        y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
        return np.array(y_predicted_cls)
    
    def _sigmoid(self, x):
        return 1/(1 + np.exp(-x))

In [3]:
if __name__ == "__main__":
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    
    def accuracy(y_true, y_pred):
        # fraction of test samples whose predicted label matches the true label
        return np.sum(y_true == y_pred) / len(y_true)
    
    bc = load_breast_cancer()
    
    X = bc.data
    y = bc.target
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)
    
    log_reg = LogisticRegressions(learning_rate = 0.0001, n_iters = 1000)
    log_reg.fit(X_train, y_train)
    
    predictions = log_reg.predict(X_test)
    
    print("The accuracy score is: ", accuracy(y_test, predictions))

The accuracy score is:  0.9298245614035088
