### Logistic Regression

1) It is a supervised learning algorithm.<br>
2) It is a classification algorithm: it assigns observations to a discrete set of classes.<br>
3) It is used when the dependent variable (target) is categorical.<br>
4) Logistic regression transforms its output using the <b>logistic or sigmoid function</b> to return a probability value, which can then be mapped to two or more discrete classes.<br>
5) In its basic form, it is the appropriate algorithm when the dependent variable has exactly two categories (binary).<br>
6) It is a generalized linear model: instead of modelling the categorical target directly, it models the log of odds of the event as a linear function of the inputs. Logistic Regression predicts
<b>
the probability of occurrence of a binary event using the sigmoid (logistic) function, which is the inverse of the logit.<br>
7) Odds = P(event happening) / (1 - P(event happening)) <br>
8) Log of odds (logit) = log(P(event happening) / (1 - P(event happening))) = log(p/(1-p)) = ax + b
</b>
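
To make the odds and log-of-odds definitions concrete, here is a minimal sketch that evaluates both for a few probabilities. The helper names `odds` and `log_odds` are illustrative and do not come from any library.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)

def log_odds(p):
    """Natural log of the odds, also called the logit of p."""
    return math.log(odds(p))

for p in (0.2, 0.5, 0.8):
    print(f"p={p:.1f}  odds={odds(p):.3f}  log-odds={log_odds(p):+.3f}")
```

A probability of 0.5 gives odds of 1 and log-odds of 0, and probabilities p and 1 - p give log-odds of equal size and opposite sign, which is why a straight line ax + b can be used on the log-odds scale.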

### Sigmoid Function

1) In order to map predicted values to probabilities, we use the sigmoid function. The function maps any real value into a value between 0 and 1.<br>
2) The sigmoid function, also called the logistic function, gives an ‘S’-shaped curve. As the input z goes to positive infinity, the predicted y approaches 1, and as z goes to negative infinity, the predicted y approaches 0. If the output of the sigmoid function is more than 0.5, we can classify the outcome as 1 or YES, and if it is less than 0.5, we can classify it as 0 or NO.

<img src="log_reg1.png" align="left">
<img src="log_reg2.png" align="middle">

 where z = ax + b
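
The mapping and the 0.5 cut-off described above can be sketched in a few lines of NumPy. The sample values of z are arbitrary, and sending an output of exactly 0.5 to class 1 is a choice made here, not something the text prescribes.

```python
import numpy as np

def sigmoid(z):
    """Map any real value z to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
probs = sigmoid(z)
labels = (probs >= 0.5).astype(int)  # 0.5 cut-off; a tie goes to class 1 in this sketch

print(probs)
print(labels)
```

Large negative inputs land very close to 0, large positive inputs very close to 1, and z = 0 sits exactly on the 0.5 threshold.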

#### Derivation of Sigmoid Function

<b>log (p/(1-p)) = ax + b</b><br>
=> p / (1-p) = e^(ax + b)<br>
=> p = e^(ax + b) - p* e^(ax + b)<br> 
=> p + p* e^(ax + b) = e^(ax + b) <br>
=> p*(1 + e^(ax + b)) = e^(ax + b)<br>
=> p = e^(ax + b) / (1 + e^(ax + b))<br>
=> Dividing numerator and denominator of RHS by e^(ax + b)<br>
<b>=> p = 1 / (1 + e^-(ax + b)) <br>
=> Logistic or Sigmoid Function (the inverse of the logit)</b>
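
As a quick numerical check of the derivation, the sketch below confirms that the two forms of p agree and that applying the sigmoid to log(p/(1-p)) recovers p. The function names are illustrative only.

```python
import numpy as np

def logit(p):
    # log of odds: log(p / (1 - p))
    return np.log(p / (1 - p))

def sigmoid_form_a(z):
    # p = e^z / (1 + e^z)
    return np.exp(z) / (1 + np.exp(z))

def sigmoid_form_b(z):
    # p = 1 / (1 + e^-z), obtained by dividing numerator and denominator by e^z
    return 1 / (1 + np.exp(-z))

z = np.linspace(-5, 5, 11)
forms_agree = np.allclose(sigmoid_form_a(z), sigmoid_form_b(z))

p = np.array([0.1, 0.25, 0.5, 0.9])
round_trip_ok = np.allclose(sigmoid_form_b(logit(p)), p)

print(forms_agree, round_trip_ok)
```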

#### Pros/Advantages

1) Logistic regression is easy to implement and interpret, and efficient to train.<br>
2) It provides good accuracy for many simple data sets and it <b>performs well when the dataset is linearly separable.</b><br>
3) Because it fits only one weight per feature plus an intercept, logistic regression usually trains far faster than more complex algorithms.

#### Cons/Disadvantages

1) Only important and relevant features should be used to build a model; otherwise the probabilistic predictions made by the model may be incorrect and the model's predictive value may degrade.<br>
2) Doesn't perform well with <b>non-linearly separable data</b>
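
A small illustration of the second point, assuming scikit-learn is installed: the XOR pattern below cannot be split by a single straight line, so no linear classifier can label more than 3 of the 4 points correctly. The dataset is a textbook toy example, not part of the insurance data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# XOR: the label is 1 only when exactly one of the two features is 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

clf = LogisticRegression().fit(X, y)
xor_accuracy = clf.score(X, y)  # accuracy on the same four points
print("Training accuracy on XOR:", xor_accuracy)
```

Adding a non-linear feature (for example the product of the two inputs) is one common way to make such data separable again.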

#### For every Classification Model

1) The model only accepts int or float values as input for x<br>
2) Certain models can accept strings/objects as input for y<br>
3) The model will not accept any null values<br>
4) x has to be a DataFrame, a 2D numpy array, or a list of lists<br>
5) y has to be a Series, a 1D numpy array, or a list
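
The requirements above can be sketched with pandas. The `raw` frame, its `city` column, and its values are hypothetical, and dropping rows with nulls is only one option (imputing them is another).

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one string feature and one missing value.
raw = pd.DataFrame({
    "age": [22, 35, np.nan, 51],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "bought_insurance": ["no", "no", "yes", "yes"],
})

clean = raw.dropna()  # remove rows that contain nulls

# Turn the string feature into 0/1 indicator columns so x is fully numeric.
x = pd.get_dummies(clean[["age", "city"]], columns=["city"], dtype=int)
y = clean["bought_insurance"]  # string labels are acceptable for y in many models

print(x.dtypes)
print(x.shape, y.shape)
```

Here x is a 2D DataFrame and y is a 1D Series, matching the shapes listed above.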

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [4]:
# In Jupyter
# !pip install pandas

# In CMD
# pip install pandas

In [5]:
df = pd.read_excel('insurance_data.xlsx')
df.head()

Unnamed: 0,age,bought_insurance
0,22,0
1,25,0
2,47,1
3,52,0
4,46,1


#### Problem Statement - Based on age, predict whether the person bought the insurance or not

In [6]:
df.shape

(27, 2)

In [7]:
df.isnull().sum()

age                 0
bought_insurance    0
dtype: int64

In [20]:
x = df[['age']]
y = df['bought_insurance']
print(type(x))
print(type(y))
print(x.shape)
print(y.shape)

<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.series.Series'>
(27, 1)
(27,)


In [21]:
from sklearn.model_selection import train_test_split

In [38]:
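# conventional (train, test) splits in %: (70,30), (75,25), (80,20), (85,15)
# no random_state is passed, so the rows chosen (and the scores below) change on every run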
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.25)
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(20, 1)
(7, 1)
(20,)
(7,)


In [39]:
# 0.25*27 = 6.75, which train_test_split rounds up to 7 test rows
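
The split sizes, and the effect of fixing the random seed, can be checked on a stand-in dataset. The `demo` frame below is synthetic (27 rows, same column names) and `random_state=42` is an arbitrary choice; neither comes from the original notebook.

```python
import math
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the insurance data (27 rows).
demo = pd.DataFrame({"age": np.arange(20, 47), "bought_insurance": [0, 1, 1] * 9})
x_demo, y_demo = demo[["age"]], demo["bought_insurance"]

expected_test_rows = math.ceil(0.25 * len(demo))  # 6.75 rounds up to 7

# The same random_state gives the same rows in train and test on every call.
split_a = train_test_split(x_demo, y_demo, test_size=0.25, random_state=42)
split_b = train_test_split(x_demo, y_demo, test_size=0.25, random_state=42)

print(expected_test_rows, split_a[0].shape, split_a[1].shape)
print(split_a[1].index.equals(split_b[1].index))
```

Without a fixed `random_state`, the accuracy figures of a model trained on such a small dataset can shift noticeably from run to run.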

In [40]:
from sklearn.linear_model import LogisticRegression

In [41]:
m1 = LogisticRegression()
m1.fit(x_train,y_train)

In [42]:
# Accuracy
print('Training Score',m1.score(x_train,y_train))
print('Testing Score',m1.score(x_test,y_test))

Training Score 0.9
Testing Score 0.8571428571428571


In [43]:
ypred_m1 = m1.predict(x_test)
print(ypred_m1)

[0 1 1 1 1 1 0]


In [47]:
from sklearn.metrics import confusion_matrix,classification_report,accuracy_score

In [46]:
cm = confusion_matrix(y_test,ypred_m1)
print(cm)
print(classification_report(y_test,ypred_m1))
# rows = actual class, columns = predicted class; with 1 as the positive class:
# [TN=2, FP=1]
# [FN=0, TP=4]

[[2 1]
 [0 4]]
              precision    recall  f1-score   support

           0       1.00      0.67      0.80         3
           1       0.80      1.00      0.89         4

    accuracy                           0.86         7
   macro avg       0.90      0.83      0.84         7
weighted avg       0.89      0.86      0.85         7
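
The figures in the report can be reproduced by hand from the confusion matrix printed above. This is a sketch with NumPy only, taking rows as actual classes and columns as predicted classes.

```python
import numpy as np

# Confusion matrix from the output above: rows = actual, columns = predicted.
cm = np.array([[2, 1],
               [0, 4]])

precision = np.diag(cm) / cm.sum(axis=0)  # correct / everything predicted as that class
recall = np.diag(cm) / cm.sum(axis=1)     # correct / everything actually in that class
f1 = 2 * precision * recall / (precision + recall)
accuracy = np.trace(cm) / cm.sum()

for label in (0, 1):
    print(label, round(precision[label], 2), round(recall[label], 2), round(f1[label], 2))
print("accuracy", round(accuracy, 2))
```

The per-class values agree with the precision, recall, and f1-score columns of the report.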



In [49]:
print('Testing Score',m1.score(x_test,y_test))
print('Accuracy Score',accuracy_score(y_test,ypred_m1))

Testing Score 0.8571428571428571
Accuracy Score 0.8571428571428571


In [50]:
(2+4)/(2+4+0+1)

0.8571428571428571

In [51]:
m = m1.coef_
c = m1.intercept_
print('Coefficient or slope',m)
print('Intercept or constant',c)

Coefficient or slope [[0.14344089]]
Intercept or constant [-5.29000379]
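
Since the model is linear in the log of odds, the coefficient can be read as a multiplier on the odds. The sketch below uses the slope printed above; the variable names are illustrative.

```python
import numpy as np

coef = 0.14344089  # slope reported above

# One extra year of age adds coef to the log-odds, i.e. multiplies the odds by e^coef.
odds_multiplier = np.exp(coef)

print(f"Each extra year of age multiplies the odds by about {odds_multiplier:.3f}")
print(f"Ten extra years multiply the odds by about {np.exp(10 * coef):.2f}")
```

This multiplier applies to the odds, not to the probability itself, which changes by a different amount depending on where on the S-curve the age falls.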


In [54]:
# p = 1 / (1 + e^-(m*x + c))
def sigmoid(x,m,c):
    # sigmoid of the linear score m*x + c: the predicted probability of class 1
    prob = 1/(1 + np.exp(-(m*x+c)))
    print(prob)

#### Predict whether the person bought the insurance or not when
1) age = 59<br>
2) age = 27

In [56]:
ypred_59 = m1.predict([[59]])
print(ypred_59)
sigmoid(59,m,c)

[1]
[[0.95980581]]




In [57]:
ypred_27 = m1.predict([[27]])
print(ypred_27)
sigmoid(27,m,c)

[0]
[[0.19511664]]
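
The two predictions can also be reproduced without the fitted object, using only the coefficient and intercept printed earlier. The sketch also computes the age at which the predicted probability crosses 0.5; `predict_proba_age` and `boundary_age` are illustrative names.

```python
import numpy as np

coef, intercept = 0.14344089, -5.29000379  # values reported by the fitted model above

def predict_proba_age(age):
    """P(bought_insurance = 1) for a given age under the fitted line."""
    return 1.0 / (1.0 + np.exp(-(coef * age + intercept)))

# The probability is exactly 0.5 where coef * age + intercept = 0.
boundary_age = -intercept / coef

for age in (27, 59):
    p = predict_proba_age(age)
    print(age, round(float(p), 4), int(p >= 0.5))
print("decision boundary age:", round(boundary_age, 2))
```

With the fitted model at hand, `m1.predict_proba([[59]])[:, 1]` should return the same class-1 probability directly.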


