# Logistic regression

## numpy

With logistic regression, the model outputs a probability between 0 and 1.
* Approximation:
$$ f(w, b) = wx + b $$
$$ \hat{y} = h_{\Theta}(x) = \frac{1}{1+e^{-(wx+b)}} $$
* Sigmoid function:
$$ s(x) = \frac{1}{1+e^{-x}} $$

* Cost function (cross entropy):
$$ J(w, b) = J(\Theta) = -\frac{1}{N}\sum_{i=1}^{N}[y^{i}\log(h_{\Theta}(x^{i}))+(1-y^{i})\log(1-h_{\Theta}(x^{i}))] $$

* Update rules:
$$ w = w - \alpha \cdot dw $$
$$ b = b - \alpha \cdot db $$
where:
$$ \alpha - \text{learning rate}, \quad dw = \frac{dJ}{dw}, \quad db = \frac{dJ}{db} $$

* Derivative:
$$ J'(\Theta) = \begin{bmatrix} \frac{dJ}{dw}\\ \frac{dJ}{db}\end{bmatrix} = [....] = \begin{bmatrix} \frac{1}{N}\sum x^{i}(\hat{y}^{i} - y^{i}) \\ \frac{1}{N} \sum (\hat{y}^{i} - y^{i}) \end{bmatrix} $$
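
As a sanity check on the gradient, the sketch below compares the analytic form $\frac{1}{N}\sum x^{i}(\hat{y}^{i} - y^{i})$ (the form the NumPy code in this notebook uses) against a central finite-difference estimate of the cross-entropy cost, taken here as the negative mean log-likelihood. The random data and the helper names `cost` and `gradients` are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    # Cross-entropy: negative mean log-likelihood of the labels.
    y_hat = sigmoid(X @ w + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def gradients(w, b, X, y):
    # Analytic gradient: mean of x_i * (y_hat_i - y_i), and mean of (y_hat_i - y_i).
    err = sigmoid(X @ w + b) - y
    return X.T @ err / len(y), err.mean()

# Small synthetic problem (illustrative data, not from the notebook).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.1

dw, db = gradients(w, b, X, y)

# Central finite differences, one coordinate at a time.
eps = 1e-6
num_dw = np.array([
    (cost(w + eps * e, b, X, y) - cost(w - eps * e, b, X, y)) / (2 * eps)
    for e in np.eye(3)
])
num_db = (cost(w, b + eps, X, y) - cost(w, b - eps, X, y)) / (2 * eps)

print("dw matches:", np.allclose(dw, num_dw, atol=1e-6))
print("db matches:", np.isclose(db, num_db, atol=1e-6))
```

If the two estimates disagree, the usual suspects are a missing minus sign in the cost or a stray constant factor in the gradient.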


Logistic regression is a machine learning algorithm that is used for classification problems, where the goal is to predict a categorical target variable. It is called "logistic" because it uses a logistic function to make predictions.

The logistic function is a sigmoid function that maps any real-valued input to a value between 0 and 1, which can be interpreted as a probability. For example, if the logistic function outputs 0.7 for a given input, the model estimates a probability of 0.7 that the input belongs to the positive class; with the 0.5 threshold used below, it is classified as positive.
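
The mapping from a raw score to a probability and then to a class can be sketched with the standard library alone. The input values are chosen for illustration; `math.log(0.7 / 0.3)` is the score whose sigmoid is 0.7, matching the example in the text.

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative raw scores; the third one corresponds to a probability of 0.7.
for z in (-4.0, 0.0, math.log(0.7 / 0.3), 4.0):
    p = sigmoid(z)
    label = 1 if p > 0.5 else 0  # same 0.5 threshold as the notebook code
    print(f"z={z:+.4f}  p={p:.3f}  class={label}")
```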

To perform logistic regression, you need a dataset with a set of input features and a binary target variable (i.e., a variable that has only two possible values). You can then use a variety of techniques, such as gradient descent or Newton's method, to find the optimal coefficients for the input features. Once you have found the optimal coefficients, you can use the logistic regression model to make predictions on new data by plugging in the input features and using the learned coefficients to predict the probability of the target variable.
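
The rest of this notebook uses gradient descent. For comparison, here is a minimal sketch of Newton's method for the same model. The function name `fit_newton`, the small `ridge` term added to the Hessian for numerical safety, and the toy dataset are all assumptions made for illustration; this is not how scikit-learn's solvers are implemented.

```python
import numpy as np

def fit_newton(X, y, n_iters=15, ridge=1e-8):
    """Fit logistic regression coefficients with Newton's method (illustrative sketch)."""
    # Prepend a column of ones so the bias is learned as the first coefficient.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))        # predicted probabilities
        grad = Xb.T @ (p - y)                         # gradient of the (summed) cross-entropy
        hess = Xb.T @ (Xb * (p * (1 - p))[:, None])   # Hessian: X^T diag(p(1-p)) X
        hess += ridge * np.eye(hess.shape[0])         # assumed safeguard against singularity
        theta -= np.linalg.solve(hess, grad)          # Newton step
    return theta

# Toy 1-D data with overlapping classes, so a finite optimum exists.
X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_toy = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

theta = fit_newton(X_toy, y_toy)
print("bias, weight:", theta)
```

Newton's method typically needs far fewer iterations than gradient descent, at the cost of building and solving a linear system with the Hessian at every step. On perfectly separable data the optimum is not finite, so the iteration would need regularization or an early stop.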

Logistic regression is a simple and widely used machine learning algorithm that is well-suited for binary classification problems. It is a good choice for many applications, including predicting whether an email is spam, whether a customer will churn, or whether a patient has a certain disease.

In [1]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets

## very detailed

In [2]:
def sigmoid_func(x):
    return 1 / (1 + np.exp(-x))

In [3]:
# Some sample data from sklearn
bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)

In [4]:
print(X_train.shape)
print(y_train.shape)

(455, 30)
(455,)


In [5]:
# init parameters
learning_rate = 0.0001
n_iters = 1000
# use the training split here: the gradients below are averaged over X_train
n_samples, n_features = X_train.shape

weights = np.zeros(n_features)
bias = 0

In [6]:
# gradient descent
for _ in range(n_iters):
    # approximate y with linear combination of weights and x, plus bias
    linear_model = np.dot(X_train, weights) + bias

    # apply sigmoid function
    y_predicted = sigmoid_func(linear_model)

    # compute gradients
    dw = (1 / n_samples) * np.dot(X_train.T, (y_predicted - y_train))
    db = (1 / n_samples) * np.sum(y_predicted - y_train)

    # update parameters
    weights -= learning_rate * dw
    bias -= learning_rate * db

In [7]:
# predict
linear_model = np.dot(X_test, weights) + bias
y_predicted = sigmoid_func(linear_model)
y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
predictions = np.array(y_predicted_cls)

In [8]:
# accuracy
accuracy = np.sum(y_test == predictions) / len(y_test)

print("LR classification accuracy:", accuracy)

LR classification accuracy: 0.9298245614035088


## clean version

In [9]:
class LogisticRegression:
    def __init__(self, learning_rate=0.001, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # init parameters
        self.weights = np.zeros(n_features)
        self.bias = 0

        # gradient descent
        for _ in range(self.n_iters):
            # approximate y with linear combination of weights and x, plus bias
            linear_model = np.dot(X, self.weights) + self.bias
            # apply sigmoid function
            y_predicted = self._sigmoid(linear_model)

            # compute gradients
            dw = (1 / n_samples) * np.dot(X.T, (y_predicted - y))
            db = (1 / n_samples) * np.sum(y_predicted - y)
            # update parameters
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        linear_model = np.dot(X, self.weights) + self.bias
        y_predicted = self._sigmoid(linear_model)
        y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
        return np.array(y_predicted_cls)

    def _sigmoid(self, x):
        return 1 / (1 + np.exp(-x))
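
With unscaled features, the linear model can produce very large negative values, and `np.exp(-x)` may then overflow and emit a runtime warning. The sketch below shows one common way to avoid this by splitting on the sign of the input; the name `stable_sigmoid` is hypothetical and it is not part of the class above.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that avoids overflow in np.exp for large-magnitude array inputs."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the usual form is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, rewrite as exp(x) / (1 + exp(x)) so the exponent stays negative.
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

If SciPy is available, `scipy.special.expit` computes the same function.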


In [10]:
# Testing
if __name__ == "__main__":
    # Imports
    from sklearn.model_selection import train_test_split
    from sklearn import datasets

    def accuracy(y_true, y_pred):
        # fraction of predictions that match the true labels
        return np.sum(y_true == y_pred) / len(y_true)

    bc = datasets.load_breast_cancer()
    X, y = bc.data, bc.target

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=1234
    )

    regressor = LogisticRegression(learning_rate=0.0001, n_iters=1000)
    regressor.fit(X_train, y_train)
    predictions = regressor.predict(X_test)

    print("LR classification accuracy:", accuracy(y_test, predictions))

LR classification accuracy: 0.9298245614035088


## sklearn

In [11]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import datasets

In [12]:
bc_ = datasets.load_breast_cancer()
X_, y_ = bc_.data, bc_.target

X_train, X_test, y_train, y_test = train_test_split(X_, y_, test_size=0.2, random_state=1234)

In [13]:
clf = LogisticRegression(random_state=0)
clf.fit(X_train, y_train)
y_hat = clf.predict(X_test)

clf.predict_proba(X_test)
clf.score(X_test, y_test)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


0.9385964912280702

In [14]:
accuracy_ = np.sum(y_test == y_hat) / len(y_test)

print("LR classification accuracy:", accuracy_)

LR classification accuracy: 0.9385964912280702
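
The convergence warning above suggests either raising `max_iter` or scaling the data. A minimal sketch of the scaling route follows, assuming a `StandardScaler` placed in front of the classifier with `make_pipeline`; the variable name `scaled_clf` is hypothetical, and the resulting accuracy is not reported here because it was not part of the original run.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same dataset and split as in the cells above.
X, y = datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1234
)

# Standardize each feature (fit on the training split only), then fit the classifier.
scaled_clf = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
scaled_clf.fit(X_train, y_train)

print("LR accuracy with scaling:", scaled_clf.score(X_test, y_test))
```

Putting the scaler inside the pipeline means the test data is transformed with statistics learned from the training split, which avoids leaking test information into the fit.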
