## Logistic regression

Logistic regression is a widely used machine learning algorithm for binary classification, where the goal is to predict the probability of a binary outcome (e.g., 0 or 1, yes or no). Despite its name, it is used for classification: it models the probability of the positive class as a function of a linear combination of the input features.

Logistic regression estimates this probability by passing a weighted sum of the input features, plus a bias, through the logistic function, also known as the sigmoid function. The sigmoid maps any real number to a value strictly between 0 and 1, which is interpreted as the probability that the output variable is 1.
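
In symbols, using notation assumed here rather than taken from the text above ($x$ for a feature vector, $w$ for the weights, $b$ for the bias, $\hat{p}$ for the predicted probability), the model can be written as:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \hat{p} = P(y = 1 \mid x) = \sigma(w^\top x + b)
$$

Since $\sigma(0) = 0.5$, thresholding the probability at 0.5 amounts to checking the sign of $w^\top x + b$.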

The loss function used in logistic regression is the cross-entropy loss (also called log loss), which measures how far the predicted probabilities are from the true labels and penalizes confident wrong predictions most heavily. Training minimizes this loss by adjusting the weights and bias of the model; the implementation below does so with gradient descent.
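
As a minimal sketch of the loss described above, the following computes the mean binary cross-entropy and its gradient with respect to the weights and bias. The names `binary_cross_entropy`, `gradients`, and the `eps` clipping parameter are illustrative assumptions, not part of the notebook's class; the clipping is only there to avoid `log(0)`.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps real numbers to (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    # Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples.
    # eps is an assumed safeguard that keeps p away from exactly 0 or 1.
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob)
                    + (1.0 - y_true) * np.log(1.0 - y_prob))

def gradients(X, y, weights, bias):
    # Gradient of the mean cross-entropy for a sigmoid model:
    # dL/dw = X^T (p - y) / n and dL/db = sum(p - y) / n.
    p = sigmoid(X @ weights + bias)
    n = X.shape[0]
    dw = X.T @ (p - y) / n
    db = np.sum(p - y) / n
    return dw, db

# Confident, correct predictions give a small loss...
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # about 0.105
# ...while uninformative predictions of 0.5 give log(2).
print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # about 0.693
```

The gradient expressions in `gradients` have the same form as the `dw` and `db` updates used in the `fit` method below.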

In [None]:
import numpy as np

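def _sigmoid(x):
    # Logistic function. Clip the input so np.exp cannot overflow
    # when the linear prediction is a large negative number.
    x = np.clip(x, -500, 500)
    return 1 / (1 + np.exp(-x))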
    
class LogisticRegression():
    
    def __init__(self, lr=0.01, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None
        
    def fit(self, X, y):
        n_samples, n_features = X.shape
        # Start from all-zero parameters.
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters):
            # Forward pass: predicted probability of class 1 for every sample.
            linear_pred = np.dot(X, self.weights) + self.bias
            predictions = _sigmoid(linear_pred)

            # Gradients of the mean cross-entropy loss.
            dw = (1 / n_samples) * np.dot(X.T, (predictions - y))
            db = (1 / n_samples) * np.sum(predictions - y)

            # Gradient descent step.
            self.weights = self.weights - self.lr * dw
            self.bias = self.bias - self.lr * db

    def predict(self, X):
        # Predicted probabilities, thresholded at 0.5 to give class labels.
        linear_pred = np.dot(X, self.weights) + self.bias
        y_pred = _sigmoid(linear_pred)
        return np.where(y_pred >= 0.5, 1, 0)
        

In [None]:
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
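# Fix the seed so the split (and the reported accuracy) is reproducible;
# the value 42 is an arbitrary choice.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)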

clf = LogisticRegression()
clf.fit(X_train,y_train)
y_pred = clf.predict(X_test)

def accuracy(y_pred, y_test):
    # Fraction of test samples whose predicted label matches the true label.
    return np.mean(np.asarray(y_pred) == np.asarray(y_test))

acc = accuracy(y_pred, y_test)
print(acc)