### Classification
1) The dependent variable is categorical, e.g. Diabetic or Non-diabetic, Yes or No, True or False.

### Logistic Regression

1) It is a supervised learning algorithm used for classification.<br>
2) Logistic regression is a classification algorithm used to assign observations to a discrete set of classes.<br>
3) It is used when the dependent variable (target) is categorical.<br>
4) Logistic regression transforms its output using the <b>logistic or sigmoid function</b> to return a probability value which can then be mapped to two or more discrete classes.<br>
5) It is the appropriate algorithm when the dependent variable is categorical and has two categories (binary).<br>
6) It is closely related to linear regression (both are generalized linear models), but the target variable is categorical. The linear combination of the inputs models the log of odds rather than the target itself. Logistic Regression predicts
<b>
the probability of occurrence of a binary event by passing that linear combination through the sigmoid (logistic) function, which is the inverse of the logit.<br>
7) Odds = P(event happening) / (1 - P(event happening)) <br>
8) Log of odds = log(P(event happening) / (1 - P(event happening))) = log(p/(1-p)) = y = ax + b
</b>
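As a quick worked example of items 7 and 8, the sketch below computes the odds and the log of odds for a few probabilities. The helper names `odds` and `log_odds` are illustrative only and are not part of any library.

```python
import math

def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1 - p)

def log_odds(p):
    """Log of odds of an event with probability p (requires 0 < p < 1)."""
    return math.log(odds(p))

for p in (0.2, 0.5, 0.8):
    print(f"p={p}: odds={odds(p):.2f}, log of odds={log_odds(p):.2f}")
```

A probability of 0.5 gives odds of 1 and a log of odds of 0. Probabilities above 0.5 give a positive log of odds and probabilities below 0.5 give a negative one.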

### Sigmoid Function

1) To map predicted values to probabilities, we use the sigmoid function, which maps any real value to a value between 0 and 1.<br>
2) The sigmoid function, also called the logistic function, gives an ‘S’ shaped curve. As the input z goes to positive infinity, the predicted y approaches 1, and as z goes to negative infinity, the predicted y approaches 0. If the output of the sigmoid function is more than 0.5, we classify the outcome as 1 or YES, and if it is less than 0.5, we classify it as 0 or NO.

<img src="log_reg1.png" align="left">
<img src="log_reg2.png" align="middle">

where z = ax + b
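The mapping and the 0.5 threshold described above can be sketched in a few lines. This is a minimal illustration using only the standard library. The name `classify` and its `threshold` parameter are assumptions made for this example.

```python
import math

def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form e^z / (1 + e^z) to avoid overflow
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(z, threshold=0.5):
    """Return 1 (YES) if sigmoid(z) is above the threshold, else 0 (NO)."""
    return 1 if sigmoid(z) > threshold else 0

for z in (-10, -1, 0, 1, 10):
    print(z, round(sigmoid(z), 4), classify(z))
```

Large negative inputs land close to 0, large positive inputs land close to 1, and z = 0 maps to exactly 0.5.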

#### Derivation of Sigmoid Function

<b>log (p/(1-p)) = ax + b</b><br>
=> p / (1-p) = e^(ax + b)<br>
=> p = e^(ax + b) - p* e^(ax + b)<br> 
=> p + p* e^(ax + b) = e^(ax + b) <br>
=> p*(1 + e^(ax + b)) = e^(ax + b)<br>
=> p = e^(ax + b) / (1 + e^(ax + b))<br>
=> Dividing numerator and denominator of RHS by e^(ax + b)<br>
<b>=> p = 1 / (1 + e^-(ax + b)) </b><br>
=> This is the logistic (sigmoid) function. Its inverse, log(p/(1-p)), is the logit function.
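The derivation can be checked numerically: computing p from the last line and feeding it back into the first line should recover ax + b. The values of `a`, `b`, and `x` below are hypothetical, chosen only for illustration.

```python
import math

def p_from_line(a, b, x):
    """Last line of the derivation: p = 1 / (1 + e^-(ax + b))."""
    return 1 / (1 + math.exp(-(a * x + b)))

def logit(p):
    """First line of the derivation: log(p / (1 - p))."""
    return math.log(p / (1 - p))

a, b = 0.5, -2.0  # hypothetical slope and intercept

for x in (-3.0, 0.0, 4.0, 10.0):
    p = p_from_line(a, b, x)
    print(f"x={x}: ax+b={a * x + b:.4f}, p={p:.4f}, log(p/(1-p))={logit(p):.4f}")
```

For every x, the first and last printed values agree, which is what the algebra above predicts.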

#### Pros/Advantages

1) Logistic regression is easy to implement and interpret, and very efficient to train.<br>
2) It provides good accuracy for many simple data sets and performs well when the classes are linearly separable.<br>
3) Because the model is simple (one weight per feature plus an intercept), its training time is far less than that of most more complex algorithms.

#### Cons/Disadvantages

1) Only important and relevant features should be used to build the model; otherwise the probabilistic predictions may be incorrect and the model's predictive value may degrade.<br>
2) It doesn't perform well on data that is not linearly separable.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
df = pd.read_excel('insurance_data.xlsx')
df.head() # top 5 rows

Unnamed: 0,age,bought_insurance
0,22,0
1,25,0
2,47,1
3,52,0
4,46,1


#### Problem Statement - Based on a person's age, predict whether the person bought insurance or not
Make predictions when age is <br>
a) 59<br>
b) 25<br>

In [5]:
df.shape
# (number of rows, number of columns)

(27, 2)

In [6]:
df['bought_insurance'].value_counts()

1    14
0    13
Name: bought_insurance, dtype: int64

In [7]:
df.isnull().sum()

age                 0
bought_insurance    0
dtype: int64

In [10]:
x = df[['age']]             # dataframe
y = df['bought_insurance']  # series
print(type(x))
print(type(y))

<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.series.Series'>


In [11]:
x.head()

Unnamed: 0,age
0,22
1,25
2,47
3,52
4,46


In [12]:
y.head()

0    0
1    0
2    1
3    0
4    1
Name: bought_insurance, dtype: int64

In [16]:
from sklearn.model_selection import train_test_split

In [15]:
print(x.shape)
print(y.shape)

(27, 1)
(27,)


#### Training data - the ML model is trained on the training data
#### Testing data - the trained ML model is used to generate predictions on the test data

In [23]:
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.25)
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

(20, 1)
(7, 1)
(20,)
(7,)
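Because the split is random, the rows that end up in the training and test sets (and therefore the fitted coefficients and metrics) can differ from run to run. One possible way to make the split repeatable is sketched below on hypothetical stand-in data with the same number of rows. The `demo` frame, the value `random_state=42`, and the use of `stratify` are choices made for this illustration, not something the notebook does.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data with 27 rows (not the insurance file)
ages = np.arange(20, 47)
demo = pd.DataFrame({"age": ages, "bought_insurance": (ages > 33).astype(int)})

X_demo = demo[["age"]]
y_demo = demo["bought_insurance"]

# random_state fixes the shuffle; stratify keeps the class ratio similar in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42, stratify=y_demo
)
print(X_tr.shape, X_te.shape)
```

With a fixed `random_state`, re-running the cell selects the same rows each time, which makes the results that follow reproducible.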


In [20]:
x_train.head()

Unnamed: 0,age
21,26
17,58
24,50
5,56
15,55


#### Applying Logistic Regression Model

In [24]:
from sklearn.linear_model import LogisticRegression

In [26]:
m1 = LogisticRegression()
m1.fit(x_train,y_train)

# ML model will be trained on training data

LogisticRegression()

In [28]:
ypred_m1 = m1.predict(x_test)
print(ypred_m1)
# ML model trained on training data is used to generate the predictions on the test data

[1 0 0 0 0 1 0]


In [29]:
from sklearn.metrics import confusion_matrix,classification_report

In [30]:
cm = confusion_matrix(y_test,ypred_m1)
print(cm)
print(classification_report(y_test,ypred_m1))

[[4 0]
 [1 2]]
              precision    recall  f1-score   support

           0       0.80      1.00      0.89         4
           1       1.00      0.67      0.80         3

    accuracy                           0.86         7
   macro avg       0.90      0.83      0.84         7
weighted avg       0.89      0.86      0.85         7
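The numbers in the report can be reproduced by hand from the confusion matrix, where rows are the actual classes and columns are the predicted classes. The sketch below uses the matrix printed above. The variable names are illustrative.

```python
# Confusion matrix from the output above: rows = actual, columns = predicted
cm_values = [[4, 0],
             [1, 2]]

tn, fp = cm_values[0]  # actual 0: predicted 0, predicted 1
fn, tp = cm_values[1]  # actual 1: predicted 0, predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision_1 = tp / (tp + fp)  # of those predicted 1, how many were actually 1
recall_1 = tp / (tp + fn)     # of those actually 1, how many were predicted 1
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={accuracy:.2f}")
print(f"class 0: precision={precision_0:.2f}, recall={recall_0:.2f}")
print(f"class 1: precision={precision_1:.2f}, recall={recall_1:.2f}, f1={f1_1:.2f}")
```

The single error is a false negative: one person who bought insurance was predicted as 0, which is why recall for class 1 is 0.67 while its precision is 1.00.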



In [32]:
# y = mx + c = log(p/(1-p))
m = m1.coef_
c = m1.intercept_
print('Slope or Coefficient',m)
print('Intercept',c)

Slope or Coefficient [[0.15533134]]
Intercept [-6.48249443]
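The slope and intercept also give the age at which the model switches from predicting 0 to predicting 1. The probability is 0.5 exactly when the log of odds is 0, so the boundary is where m * age + c = 0. The sketch below plugs in the values printed above. These values come from this particular random split and would change with a different one.

```python
slope = 0.15533134       # m1.coef_[0][0] from the output above
intercept = -6.48249443  # m1.intercept_[0] from the output above

# p = 0.5 exactly when slope * age + intercept = 0
boundary_age = -intercept / slope
print(round(boundary_age, 1))
```

The boundary comes out at about 41.7 years, so this fitted model predicts 1 for ages above that and 0 for ages below it.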


#### Q) Make predictions when age is
a) 59<br>
b) 25

In [33]:
def sigmoid(x,m,c):
    # m*x + c is the log of odds; the sigmoid converts it to P(bought_insurance = 1)
    logistic = 1/(1 + np.exp(-(m*x + c)))
    print(logistic)  # between 0 and 1

In [35]:
ypred59 = m1.predict([[59]])
print(ypred59)
sigmoid(59,m,c)

[1]
[[0.9359594]]


In [36]:
ypred25 = m1.predict([[25]])
print(ypred25)
sigmoid(25,m,c)

[0]
[[0.06918923]]
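scikit-learn can also return these probabilities directly through `predict_proba`, whose second column holds the probability of class 1. The sketch below checks that it agrees with the manual sigmoid calculation. It fits a separate model on small hypothetical data (`ages`, `bought`, and `demo_model` are made up for this example), so the printed numbers will not match the ones above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in data (not the insurance file): age and a 0/1 label
ages = np.array([[22], [25], [28], [31], [35], [40], [45], [47], [52], [56], [59], [62]])
bought = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1])

demo_model = LogisticRegression()
demo_model.fit(ages, bought)

query = np.array([[59], [25]])

# Column 1 of predict_proba is P(class 1)
proba = demo_model.predict_proba(query)[:, 1]

# Same quantity computed by hand from the fitted slope and intercept
slope = demo_model.coef_[0, 0]
intercept = demo_model.intercept_[0]
manual = 1 / (1 + np.exp(-(slope * query[:, 0] + intercept)))

print(proba)
print(manual)
```

The two printed arrays should match, which confirms that for a binary problem `predict` is simply thresholding the sigmoid output at 0.5.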
