<a href="https://colab.research.google.com/github/parvvaresh/Logistic-Regression-from-scratch/blob/main/LogisticRegression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Logistic regression



---



**Sigmoid Function (Logistic Function)**

Like linear regression, the logistic regression algorithm uses a linear equation of the independent predictors to compute a value. That value can lie anywhere between negative infinity and positive infinity. However, we need the output of the algorithm to be a class variable, i.e. 0 (no) or 1 (yes). We therefore squash the output of the linear equation into the range (0, 1), and to do so we use the sigmoid function.





We take the output z of the linear equation and pass it to the function g(z), which returns a squashed value h that lies strictly between 0 and 1. To understand how the sigmoid function squashes values into this range, let’s look at its graph.


As the graph shows, the sigmoid function approaches the asymptote y=1 as x becomes large and positive, and approaches the asymptote y=0 as x becomes large and negative.

![image](https://user-images.githubusercontent.com/89921883/222824767-a82f2052-6efe-46fa-9c21-c001fc89e476.png)
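The squashing behavior described above can be checked numerically. The following is a minimal sketch: the standalone `sigmoid` function and the sample inputs are illustrative choices, not part of the class defined later.

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: large negative, small, zero, and large positive values.
z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
h = sigmoid(z)

# Outputs rise monotonically toward 1 and fall toward 0, with g(0) = 0.5.
print(h)
```

Inputs far from zero land very close to 0 or 1, which is why the curve flattens toward its asymptotes.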

---



**Cost Function**

Since we are trying to predict class values, we cannot reuse the cost function from the linear regression algorithm. Instead, we use a logarithmic loss function (log loss), which penalizes a prediction more heavily the further the predicted probability is from the true class.

![image](https://user-images.githubusercontent.com/89921883/222825691-a3695da9-6cfa-4305-b38a-38a38c564b86.png)
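To make the cost concrete, here is a small sketch that assumes the standard binary log loss, i.e. the negative mean of `y*log(h) + (1-y)*log(1-h)`. The function name `log_loss`, the clipping constant `eps`, and the sample arrays are assumptions made for illustration.

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 0])
confident_right = np.array([0.9, 0.1, 0.9, 0.1])  # probabilities close to the labels
confident_wrong = np.array([0.1, 0.9, 0.1, 0.9])  # probabilities far from the labels

loss_right = log_loss(y_true, confident_right)  # about 0.105
loss_wrong = log_loss(y_true, confident_wrong)  # about 2.303
print(loss_right, loss_wrong)
```

Confidently wrong predictions cost roughly twenty times more here than confidently right ones, which is the behavior we want from a classification loss.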

---


**Calculating Gradients**

We take the partial derivatives of the cost function with respect to each parameter (theta_0, theta_1, …) to obtain the gradients. With the help of these gradients, we can update the values of theta_0, theta_1, … at each step. Understanding the equations below requires some calculus.

![image](https://user-images.githubusercontent.com/89921883/222826236-eb01297b-4943-4b9e-b7e1-ebac1d4eac34.png)
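One way to build confidence in the gradient formulas without redoing the calculus is to compare them against finite differences. The sketch below assumes the standard log loss and the gradient expressions used in the `fit` method further down (`dw = X.T @ (h - y) / m`, `db = sum(h - y) / m`). The helper names and the random data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    """Mean binary log loss for weights w and bias b."""
    h = sigmoid(X @ w + b)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradients(w, b, X, y):
    """Analytic gradients of the cost with respect to w and b."""
    m = X.shape[0]
    error = sigmoid(X @ w + b) - y
    return X.T @ error / m, np.sum(error) / m

# Small random problem, used only for the check.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.1

dw, db = gradients(w, b, X, y)

# Central finite differences, one parameter at a time.
step = 1e-6
numeric_dw = np.zeros_like(w)
for j in range(w.size):
    shift = np.zeros_like(w)
    shift[j] = step
    numeric_dw[j] = (cost(w + shift, b, X, y) - cost(w - shift, b, X, y)) / (2 * step)
numeric_db = (cost(w, b + step, X, y) - cost(w, b - step, X, y)) / (2 * step)

# Both differences should be tiny if the analytic gradients are right.
print(np.max(np.abs(dw - numeric_dw)), abs(db - numeric_db))
```

If the analytic and numeric values agree to several decimal places, the derivative expressions are very likely correct.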



In [1]:
import numpy as np

class LogisticRegression:

  def __init__(self, learning_rate=1e-4, n_iters=10_000):
    self.learning_rate = learning_rate
    # range() requires an integer, so coerce float inputs such as 10e3
    self.n_iters = int(n_iters)
    self.weights = None
    self.bias = None

  def sigmoid(self, x):
    return 1 / (1 + np.exp(-x))

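  def fit(self, x, y):
    # Start from zero weights and zero bias
    number_samples, number_features = x.shape
    self.weights = np.zeros(number_features)
    self.bias = 0

    # Batch gradient descent on the log loss
    for repeat in range(self.n_iters):
      # Predicted probabilities for the current parameters
      y_predicted = self.sigmoid(np.dot(x, self.weights) + self.bias)

      # Gradients of the cost with respect to the weights and the bias
      dw = (1 / number_samples) * np.dot(x.T, (y_predicted - y))
      db = (1 / number_samples) * np.sum(y_predicted - y)

      # Step against the gradient
      self.weights -= self.learning_rate * dw
      self.bias -= self.learning_rate * db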

  def predict(self, x):
    # Predicted probability of class 1 for each sample
    y_predicted = self.sigmoid(np.dot(x, self.weights) + self.bias)
    # Threshold at 0.5 to obtain hard 0/1 class labels
    return (y_predicted > 0.5).astype(int)

  def accuracy(self, y, y_predicted):
    # Percentage of samples whose predicted label matches the true label
    return (np.sum(y == y_predicted) / len(y)) * 100

In [2]:
from sklearn.model_selection import train_test_split
from sklearn import datasets


bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target

X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=1234
    )

model = LogisticRegression(learning_rate=0.0001, n_iters=10000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

print("LR classification accuracy:", model.accuracy(y_test, predictions))

LR classification accuracy: 90.35087719298247
