## Logistic Regression

Logistic regression is a binary classification model: it estimates the probability that a given input belongs to the positive class (label 1) and assigns a class by thresholding that probability.

### Model Equation
The predicted probability $\hat{y}$ is computed by applying the sigmoid function to a linear combination of the input features $x$, with weights $w$ and bias $b$:

$$
\hat{y} = \frac{1}{1 + e^{-(w \cdot x + b)}}
$$
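
As a quick numeric check of the equation above, here is a minimal sketch that evaluates it for one input. The weights, bias, and input are hypothetical values chosen only for illustration; they do not come from the notebook.

```python
import numpy as np

def sigmoid(z):
    # Squash a real-valued score into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and a single two-feature input (illustration only).
w = np.array([0.5, -0.25])
b = 0.1
x = np.array([2.0, 4.0])

z = np.dot(w, x) + b   # linear score: 0.5*2 - 0.25*4 + 0.1
y_hat = sigmoid(z)     # predicted probability of the positive class
print(z, y_hat)
```

A score of exactly 0 maps to a probability of 0.5, so positive scores favor class 1 and negative scores favor class 0.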

### Steps in Training

1. **Initialization:**
   - Initialize weights as zero.
   - Initialize bias as zero.

2. **For Each Iteration:**
   - **Prediction:**  
     Compute the predicted probability for every training sample using the sigmoid function:
     $$
     \hat{y} = \frac{1}{1 + e^{-(w \cdot x + b)}}
     $$
   - **Error Calculation:**  
     Calculate the error between the predicted probability and the actual label, $\hat{y} - y$.
   - **Parameter Update:**  
     Use gradient descent to update the weights and bias: move each parameter against its gradient, scaled by the learning rate.
   - **Iteration:**  
     Repeat the process for a fixed number of iterations. The implementation in this section runs all `n_iters` iterations and does not test for convergence.
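
The error and update steps can be written out explicitly. The text does not name the loss function; the sketch below assumes the mean binary cross-entropy over $N$ training samples, whose gradients match the ones used in the implementation in this section. The symbols $N$ and $\alpha$ (the learning rate, `lr` in the code) are notation introduced here.

$$
L(w, b) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right],
\qquad \hat{y}_i = \frac{1}{1 + e^{-(w \cdot x_i + b)}}
$$

$$
\frac{\partial L}{\partial w} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)\, x_i,
\qquad
\frac{\partial L}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)
$$

$$
w \leftarrow w - \alpha \frac{\partial L}{\partial w},
\qquad
b \leftarrow b - \alpha \frac{\partial L}{\partial b}
$$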


In [2]:
import numpy as np

In [23]:
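# global functions
def sigmoid(x):
    """Map real-valued input(s) to the interval (0, 1)."""
    # For large negative x, np.exp(-x) overflows to inf and NumPy emits a
    # RuntimeWarning; the result is still the correct limit, 0.
    return 1 / (1 + np.exp(-x))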
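
The `sigmoid` above evaluates `np.exp(-x)` directly, which overflows for large negative `x` (NumPy emits a RuntimeWarning, although the result still rounds to the correct value, 0). One common way to avoid the overflow is to pick, per element, the algebraically equivalent form that only exponentiates non-positive numbers. A minimal sketch follows; `stable_sigmoid` is a hypothetical name and is not part of the original notebook.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid for array input that never exponentiates a positive number."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) is at most 1, so 1 / (1 + exp(-x)) is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x)); exp(x) < 1.
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

Both forms are the same function mathematically, so swapping this in should change only the warning behavior, not the model.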

In [24]:
class LogisticRegression:
    def __init__(self, lr=0.001, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

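        # Batch gradient descent: every iteration uses all training samples.
        for _ in range(self.n_iters):
            # Forward pass: predicted probabilities for all samples.
            linear_pred = np.dot(X, self.weights) + self.bias
            predictions = sigmoid(linear_pred)

            # Gradients of the mean cross-entropy loss w.r.t. weights and bias.
            dw = (1 / n_samples) * np.dot(X.T, (predictions - y))
            db = (1 / n_samples) * np.sum(predictions - y)

            # Step against the gradient, scaled by the learning rate.
            self.weights = self.weights - self.lr * dw
            self.bias = self.bias - self.lr * db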

    def predict(self, X):
        linear_pred = np.dot(X, self.weights) + self.bias
        y_pred = sigmoid(linear_pred)

        # Threshold at 0.5: probabilities above it are labeled 1, the rest 0.
        class_pred = [0 if p <= 0.5 else 1 for p in y_pred]
        return class_pred

In [6]:
# Train

from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

In [25]:
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

In [28]:
clf = LogisticRegression(lr=0.01)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)  
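
The breast cancer features are on very different scales, so with raw inputs the linear term `np.dot(X, self.weights) + self.bias` can grow large enough for `np.exp` to overflow during training. A common remedy, not used in the original notebook, is to standardize each feature using statistics from the training split only. The sketch below shows the idea on a small hypothetical array; `standardize` and the `*_demo` arrays are illustrative names, not part of the notebook.

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale features to zero mean and unit variance using training statistics only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled instead of dividing by zero
    return (X_train - mean) / std, (X_test - mean) / std

# Hypothetical data with two features on very different scales.
X_train_demo = np.array([[1000.0, 0.1], [2000.0, 0.2], [3000.0, 0.3]])
X_test_demo = np.array([[1500.0, 0.15]])

X_train_s, X_test_s = standardize(X_train_demo, X_test_demo)
print(X_train_s.mean(axis=0), X_train_s.std(axis=0))
print(X_test_s)
```

In the notebook's setting, the same function would be applied to `X_train` and `X_test` before calling `clf.fit`. The effect on the reported accuracy is not measured here.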


In [29]:
def accuracy(y_pred, y_test):
    # Fraction of test samples whose predicted label matches the true label.
    return np.sum(y_pred == y_test) / len(y_test)

acc = accuracy(y_pred, y_test)
print(acc)

0.9210526315789473
