# Logistic Regression
Logistic regression is a classification algorithm used to estimate the probability that an
event succeeds or fails. It is used when the dependent variable is binary (0/1, True/False,
Yes/No). Logistic regression can be thought of as the counterpart of linear regression for a
categorical outcome variable, and it belongs to a larger class of models known as
Generalized Linear Models (GLMs), which R fits with the `glm()` function.

**Mathematically:**
In linear regression we have $y = a_0 + a_1 x$.<br>
Then $p = \frac{e^y}{1+e^y}$ is the probability of success.<br>
Since $\frac{p}{1-p}=e^y$, the `logit()` function is given by $\ln\left(\frac{p}{1-p}\right)=y$. The logit is the link function for logistic regression. Its inverse is the sigmoid function (shown
below), which keeps the predicted probabilities between 0 and 1.
![Screenshot (83).png](attachment:a7c0cedc-251e-456e-8649-d45f804f0385.png)
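
The two formulas above are inverses of each other, which a few lines of Python can confirm. This is a minimal sketch, not part of the R workflow: the function names `sigmoid` and `logit` and the values of `a0`, `a1`, and `x` are assumptions chosen only for illustration.

```python
import math

def sigmoid(y):
    """Map a linear predictor y to a probability p = e^y / (1 + e^y)."""
    return 1.0 / (1.0 + math.exp(-y))

def logit(p):
    """Map a probability p in (0, 1) back to the log-odds ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients a0, a1 and input x, chosen only for illustration
a0, a1, x = -1.0, 0.5, 3.0
y = a0 + a1 * x      # linear predictor
p = sigmoid(y)       # probability of success, always strictly between 0 and 1
print(y, p, logit(p))
```

Applying `logit` to the probability recovers the linear predictor, which is why the model is linear on the log-odds scale.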

### Types of Logistic Regression
Logistic regression can be divided into the following types:
1. **Binary or Binomial:** The dependent variable has only two possible values, 1 or 0.
For example, these values may represent success or failure, yes or no, win or loss, etc.
2. **Multinomial:** The dependent variable has 3 or more possible unordered categories,
that is, categories with no quantitative significance. For example, the categories may be
"Type A", "Type B", or "Type C".
3. **Ordinal:** The dependent variable has 3 or more possible ordered categories, that is,
categories with a quantitative significance. For example, the categories may be "poor",
"good", "very good", and "excellent", and each category can be given a score such as
0, 1, 2, 3.

### Accuracy of Logistic Regression Model
To evaluate the performance of a logistic regression model, look at:
1. **AIC (Akaike Information Criterion):** AIC plays a role in logistic regression analogous to adjusted R² in linear regression.
It is a measure of fit that penalizes the model for its number of coefficients. When comparing models fitted
to the same data, we therefore prefer the one with the lowest AIC.
2. **Null Deviance and Residual Deviance:** Null deviance measures how well the response is predicted by a model
with nothing but an intercept. Residual deviance measures how well the response is predicted once the
independent variables are added. Lower deviance means a better fit, so a residual deviance well below the null deviance indicates that the predictors improve the model.
3. **Confusion Matrix:** A tabular representation of actual vs. predicted values. It helps
us to find the accuracy of the model, and computing it on held-out data helps to detect overfitting.
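
The deviance and AIC figures can be reproduced by hand for a binary (0/1) response, where the deviance is minus twice the log-likelihood. The sketch below is in Python and rests on assumptions: the outcomes and fitted probabilities are made up (they do not come from an actual fit), and `n_coef` is a hypothetical coefficient count.

```python
import math

def log_likelihood(y_true, probs):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(y_true, probs))

# Hypothetical outcomes and fitted probabilities, for illustration only
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
fitted = [0.8, 0.3, 0.7, 0.6, 0.2, 0.4, 0.9, 0.1]
n_coef = 2  # assumed: an intercept plus one predictor

# The null (intercept-only) model predicts the overall success rate for everyone
p_null = sum(y_true) / len(y_true)
null_deviance = -2.0 * log_likelihood(y_true, [p_null] * len(y_true))
residual_deviance = -2.0 * log_likelihood(y_true, fitted)

# AIC adds a penalty of 2 per coefficient to -2 * log-likelihood
aic = residual_deviance + 2 * n_coef

print(round(null_deviance, 3), round(residual_deviance, 3), round(aic, 3))
```

The penalty term is what lets AIC prefer a smaller model when extra coefficients reduce the deviance only slightly.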

### Assumptions of Logistic Regression
1. Logistic regression does not require a linear relationship between the dependent and
independent variables.
2. It requires that the independent variables are linearly related to the log odds.
3. The error terms (residuals) do not need to be normally distributed.
4. Homoscedasticity is not required.
5. The dependent variable in logistic regression is not measured on an interval or ratio scale.

#### Fisher Scoring Iterations
This is the number of iterations the algorithm needed to fit the model. Logistic
regression is fitted with an iterative maximum likelihood algorithm. For logistic regression,
Fisher scoring is equivalent to iteratively reweighted least squares (IRLS). The reported value
is the number of iterations performed before the algorithm converged, not a tuning parameter.
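
To make the iteration count concrete, here is a minimal Python sketch of Fisher scoring for a model with an intercept and a single predictor. It is not R's `glm()` implementation: the function name `fisher_scoring`, the data, and the stopping rule based on step size are all assumptions made for illustration (R checks the change in deviance instead).

```python
import math

def fisher_scoring(x, y, tol=1e-8, max_iter=25):
    """Fit p = sigmoid(b0 + b1*x) by Fisher scoring; return (b0, b1, iterations)."""
    b0, b1 = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]  # IRLS weights
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Fisher information matrix [[i00, i01], [i01, i11]]
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system: information * step = score
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if max(abs(step0), abs(step1)) < tol:
            return b0, b1, iteration
    return b0, b1, max_iter

# Hypothetical, non-separable data for illustration only
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, n_iter = fisher_scoring(x, y)
print(b0, b1, n_iter)
```

The third returned value corresponds to the "Number of Fisher Scoring iterations" line in the R summary: it reports how long convergence took.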

### Creating Logistic Regression Model
We use the `glm()` function to create a logistic regression model in R.

In [6]:
import numpy as np

In [7]:
import pandas as pd 

In [9]:
pip install ISLP

In [None]:
#Installing the required packages for the data
install.packages("ISLR")
library("ISLR")


In [None]:
#Loading the data
attach(Smarket)
head(Smarket)

In [None]:
# Fitting the model
LR = glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume, data = Smarket, family = binomial)

# Summary
summary(LR)

**Note:** The option `family = binomial` tells R that we want to fit a logistic regression and
that the dependent variable is binary.

### Observations
1. The summary reports the estimate, standard error, z-score, and p-value for each coefficient. It looks like
none of the coefficients are significant here.
2. It also reports the null deviance (the deviance of the intercept-only model) and the residual deviance
(the deviance of the model with all the predictors). There is only a very small difference between the two,
at a cost of 6 degrees of freedom.<br>
**Remark:** Evaluation of the regression. The null deviance and the residual deviance are used to test
whether the independent variables provide a statistically significant explanation. A chi-square test on the
difference between the two deviances, with degrees of freedom equal to the number of added predictors, indicates the overall significance of the model.
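
The chi-square test in the remark can be sketched in Python with the standard library alone. The closed form used here only holds for an even number of degrees of freedom, which covers the 6 degrees of freedom mentioned above. The helper name `chi2_sf_even_df` and both deviance values are hypothetical; they are not taken from the `Smarket` output.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term of the series
    for k in range(1, df // 2):
        term *= half / k            # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

# Hypothetical deviances, for illustration only
null_deviance = 100.0
residual_deviance = 96.4
df_difference = 6                   # number of predictors added to the null model

statistic = null_deviance - residual_deviance
p_value = chi2_sf_even_df(statistic, df_difference)
print(round(statistic, 2), round(p_value, 4))
```

A large p-value, as with these made-up numbers, would mean the predictors do not improve significantly on the intercept-only model.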

### Confusion Matrix
The R function `table()` can be used to produce a confusion matrix in order to determine
how many observations were correctly or incorrectly classified. It compares the observed and the
predicted outcome values and shows the number of correct and incorrect predictions, broken down
by type of outcome.<br>
Confusion matrix (counts of cases):<br>
`table(observed.classes, predicted.classes)`

![Screenshot (84).png](attachment:965f4302-b70c-4220-bdf5-79b45aed4269.png)

The diagonal elements of the confusion matrix indicate correct predictions, while the off-diagonal
elements represent incorrect predictions. So, the correct classification rate is the sum of the numbers on the
diagonal divided by the sample size in the test data. In our example, that is (48 + 15)/78 ≈ 81%.
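
The same bookkeeping can be written out in Python to show what the table and the accuracy formula compute. This is a sketch with made-up labels: the function names `confusion_matrix` and `accuracy` and the two label lists are assumptions, and only the last line uses the figures quoted above.

```python
from collections import Counter

def confusion_matrix(observed, predicted):
    """Count each (observed, predicted) pair, like a two-way table of counts."""
    return Counter(zip(observed, predicted))

def accuracy(matrix):
    """Correct classification rate: diagonal counts divided by the total."""
    correct = sum(n for (obs, pred), n in matrix.items() if obs == pred)
    return correct / sum(matrix.values())

# Hypothetical observed and predicted classes, for illustration only
observed = ["Up", "Up", "Down", "Up", "Down", "Down", "Up", "Down"]
predicted = ["Up", "Down", "Down", "Up", "Up", "Down", "Up", "Down"]

matrix = confusion_matrix(observed, predicted)
for pair in sorted(matrix):
    print(pair, matrix[pair])
print("accuracy:", accuracy(matrix))

# The rate quoted in the text: diagonal counts 48 and 15 out of 78 cases
print(round((48 + 15) / 78, 2))
```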

### Making Predictions using Logistic Regression

**Example 1:**

In [None]:
library("ISLR")
head(Smarket)

In [None]:
# Fitting the model
LGR = glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume, data = Smarket, family = binomial)

# Predicted probabilities (on the training data) that the market goes "Up"
Prob = predict(LGR, type = "response")

# Class predictions: Pred is a vector of labels, "Up" if Prob is greater than 0.5 and "Down" otherwise
Pred = ifelse(Prob > 0.5, "Up", "Down")

# Confusion matrix of predicted vs. observed direction
table(Pred, Smarket$Direction)

In [None]:
# Proportion of predictions that match the observed direction, i.e. the prediction accuracy
mean(Pred == Smarket$Direction)

**Observation:**<br>
From the table, the instances on the diagonal are the correct classifications, and those off
the diagonal are the mistakes. The mean gives a proportion of 0.52, which is the correct
classification rate on the training data: the prediction accuracy is about 52% and the
misclassification error rate is about 48%.

**Example 2:**

In [None]:
#Loading the data
input =  mtcars[c("am","cyl","hp","wt")]

#Fitting the model
LGR = glm(am ~ cyl + hp + wt, data=input, family=binomial)

#Model summary
summary(LGR)

**Note:** Coefficients. This part of the output shows the coefficients, their standard errors, the z-statistic (sometimes called a Wald z-statistic), and the associated p-values. The z-statistics play the same role as the t-statistics in linear regression. For information about P-values

**Observation:** The p-value in the last column is more than 0.05 for the variables "cyl" and "hp",
so we consider them to be insignificant in contributing to the value of the variable "am". Only
weight (wt) has a significant effect on the "am" value in this regression model.
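
The Wald z-statistic and p-value in the coefficient table follow directly from each estimate and its standard error. The Python sketch below shows the arithmetic under the usual standard normal approximation. The function name `wald_test` and the input values are hypothetical and are not taken from the `mtcars` fit.

```python
import math

def wald_test(estimate, std_error):
    """Return the Wald z-statistic and its two-sided p-value under N(0, 1)."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimate and standard error, for illustration only
z, p = wald_test(estimate=1.2, std_error=0.5)
print(round(z, 3), round(p, 4))
```

A coefficient is flagged as significant at the 5% level when this p-value falls below 0.05, which is the rule applied in the observation above.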

**Example 3:**

In [None]:
# Loading the data
install.packages("TH.data")
library("TH.data")
head(GBSG2)

In [None]:
# Fitting the model
fit = glm(cens~pnodes*horTh, data=GBSG2,family=binomial)

# Model summary
summary(fit)

In [None]:
# Visualizing the predictions
library(ggiraphExtra)
ggPredict(fit, se=TRUE, digits=3)
#Set interactive=TRUE for interactive visuals
