## 1. Logistic Regression

**The Similarities between Linear Regression and Logistic Regression**
- Linear regression and logistic regression are both supervised machine learning algorithms.
- Both are parametric models: each bases its prediction on a linear combination of the input features, governed by a fixed set of coefficients.

**The Differences between Linear Regression and Logistic Regression**

- Linear regression handles regression problems, whereas logistic regression handles classification problems.
- Linear regression produces a continuous output, whereas logistic regression produces a probability between 0 and 1 that is thresholded into a discrete class label.
- Linear regression finds the best-fitting line; logistic regression goes one step further and passes the line's values through the sigmoid curve.
- Linear regression is typically fitted by minimizing the mean squared error, whereas logistic regression is fitted by maximum likelihood estimation, which is equivalent to minimizing the log loss (cross-entropy).
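
To make the contrast concrete, here is a minimal sketch of the two pieces that separate logistic regression from linear regression: the sigmoid applied to the linear score, and the log loss that maximum likelihood minimizes. The coefficients and the four data points are made up for illustration, and the helper names `sigmoid` and `log_loss` are assumptions of this sketch, not part of the notebook.

```python
import math

def sigmoid(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y_true, y_prob):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical one-feature model: score = beta_0 + beta_1 * x
beta_0, beta_1 = -1.0, 2.0
xs = [-2.0, 0.0, 0.5, 3.0]
ys = [0, 0, 1, 1]

scores = [beta_0 + beta_1 * x for x in xs]       # linear part, unbounded
probs = [sigmoid(s) for s in scores]             # squashed into (0, 1)
labels = [1 if p >= 0.5 else 0 for p in probs]   # discrete class output
loss = log_loss(ys, probs)                       # quantity minimized during fitting

print(scores, probs, labels, loss)
```

The linear score is the same quantity a linear regression would output; only the sigmoid and the choice of loss differ.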

__Logistic regression__ is a classification algorithm traditionally limited to only two-class classification problems.

If you have more than two classes then __Linear Discriminant Analysis__ is the preferred linear classification technique.

__Limitations of Logistic Regression__

Logistic regression is a simple and powerful linear classification algorithm. It also has limitations that suggest the need for alternative linear classification algorithms.

- __Two-Class Problems__: Logistic regression is intended for two-class or binary classification problems. It can be extended for multi-class classification, but is rarely used for this purpose.
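- __Unstable With Well Separated Classes__: Logistic regression can become unstable when the classes are well separated. If a hyperplane separates the classes perfectly, the likelihood keeps improving as the coefficients grow, so the estimates diverge instead of settling on finite values.
- __Unstable With Few Examples__: Logistic regression can become unstable when there are few examples from which to estimate the parameters.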
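
The instability with well-separated classes can be seen in a small experiment. The sketch below assumes a made-up one-feature dataset in which every negative `x` belongs to class 0 and every positive `x` to class 1, and fits it with plain batch gradient descent; `fit_1d` is a hypothetical helper written for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_1d(xs, ys, iterations, learning_rate=0.5):
    """Batch gradient descent on the log loss for one feature plus an intercept."""
    b0, b1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y
            g0 += err / m
            g1 += err * x / m
        b0 -= learning_rate * g0
        b1 -= learning_rate * g1
    return b0, b1

# Toy data (made up): perfectly separated at x = 0.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# The slope after 100, 1,000 and 10,000 iterations.
slopes = [fit_1d(xs, ys, n)[1] for n in (100, 1000, 10000)]
print(slopes)
```

On this data the slope keeps increasing with the number of iterations rather than converging, which is the behavior the bullet describes. Regularization is a common remedy, although it is outside the scope of this section.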

In [3]:
def logistic_regression(x,y,iterations=100,learning_rate=0.01):
  m,n = len(x), len(x[0])
  beta_0,beta_other = initialize_params(n)
  for _ in range(iterations):
    gradient_beta_0,gradient_beta_other = compute_gradients(x,y,
                                                            beta_0,
                                                            beta_other,
                                                            m,n)
    beta_0, beta_other = update_params(beta_0,beta_other,
                                       gradient_beta_0,
                                       gradient_beta_other,
                                       learning_rate)
  return beta_0, beta_other

In [4]:
import random
def initialize_params(n):
  beta_0 = 0
  beta_other = [random.random() for _ in range(n)]
  return beta_0, beta_other

In [5]:
def compute_gradients(x,y,beta_0,beta_other,m,n):
  # Accumulate the average gradient of the log loss over all m points
  gradient_beta_0 = 0
  gradient_beta_other = [0] * n

  for i, point in enumerate(x):
    # Predicted probability of class 1 for this point
    pred = logistic_function(point,beta_0,beta_other)

    for j, feature in enumerate(point):
      gradient_beta_other[j] += (pred - y[i]) * feature/m
    gradient_beta_0 += (pred - y[i])/m
  return gradient_beta_0, gradient_beta_other
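
The accumulation in the loop above matches the gradient of the average log loss. As a short derivation sketch, writing $\hat{p}_i$ for the predicted probability of point $i$ and $x_{ij}$ for its $j$-th feature (notation introduced here, not used elsewhere in the notebook):

$$
\hat{p}_i = \sigma\Big(\beta_0 + \sum_{j=1}^{n} \beta_j x_{ij}\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
$$

$$
J(\beta) = -\frac{1}{m} \sum_{i=1}^{m} \Big[ y_i \log \hat{p}_i + (1 - y_i) \log (1 - \hat{p}_i) \Big]
$$

$$
\frac{\partial J}{\partial \beta_0} = \frac{1}{m} \sum_{i=1}^{m} (\hat{p}_i - y_i), \qquad
\frac{\partial J}{\partial \beta_j} = \frac{1}{m} \sum_{i=1}^{m} (\hat{p}_i - y_i)\, x_{ij}
$$

The simple form $(\hat{p}_i - y_i)$ follows from the identity $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$, which cancels the denominators produced by differentiating the logarithms.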

In [6]:
import numpy as np
def logistic_function(point, beta_0,beta_other):
  # np.dot accepts plain lists as well as NumPy arrays for point and beta_other
  return 1/(1+np.exp(-(beta_0 + np.dot(point, beta_other))))

In [7]:
def update_params(beta_0, beta_other, gradient_beta_0, gradient_beta_other,
                  learning_rate):
  beta_0 -= gradient_beta_0 * learning_rate
  for i in range(len(beta_other)):
    beta_other[i] -= gradient_beta_other[i] * learning_rate
  return beta_0, beta_other
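
The training loop can be exercised end to end on a small synthetic dataset. The sketch below restates the same gradient-descent procedure compactly so that it runs on its own; it uses `math.exp` instead of NumPy only to stay dependency-free. The names `fit` and `predict_proba`, the random seed, the 10% label noise and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(point, beta_0, beta_other):
    """Probability of class 1 for one point."""
    return sigmoid(beta_0 + sum(b * f for b, f in zip(beta_other, point)))

def fit(x, y, iterations=500, learning_rate=0.5):
    """Batch gradient descent on the average log loss."""
    m, n = len(x), len(x[0])
    beta_0, beta_other = 0.0, [0.0] * n
    for _ in range(iterations):
        grad_0, grad_other = 0.0, [0.0] * n
        for point, label in zip(x, y):
            err = predict_proba(point, beta_0, beta_other) - label
            grad_0 += err / m
            for j, feature in enumerate(point):
                grad_other[j] += err * feature / m
        beta_0 -= learning_rate * grad_0
        beta_other = [b - learning_rate * g for b, g in zip(beta_other, grad_other)]
    return beta_0, beta_other

# Synthetic data: class 1 when x1 + x2 > 0, with about 10% of labels flipped
# so that the classes are not perfectly separable.
random.seed(0)
x = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
y = []
for x1, x2 in x:
    label = 1 if x1 + x2 > 0 else 0
    if random.random() < 0.1:
        label = 1 - label
    y.append(label)

beta_0, beta_other = fit(x, y)
predictions = [1 if predict_proba(p, beta_0, beta_other) >= 0.5 else 0 for p in x]
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print(beta_0, beta_other, accuracy)
```

Accuracy here is measured on the training data, so it only checks that the fit moved in the right direction; a held-out set would be needed to estimate generalization.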