# Day 11: Logistic Regression (Binary Classification)

Today, we will explore Logistic Regression, one of the most popular algorithms for binary classification problems. Unlike regression models that predict continuous values, logistic regression predicts the probability of a binary outcome.

## Topics Covered:
- Introduction to Binary Classification
- Theory of Logistic Regression
- Differences Between Logistic and Linear Regression
- Key Concepts: Odds Ratios and Probabilities
- Example: Customer Churn Prediction
- Evaluating Model Performance: Accuracy, Precision, Recall, F1-Score, ROC-AUC

## 1. Introduction to Binary Classification

Binary classification is a type of classification where the target variable can take one of two possible outcomes (often referred to as "classes"). Logistic Regression is particularly useful here because it predicts the probability that an observation falls into a specific class.

### Examples of Binary Classification:


- Spam Detection: Classifying emails as "spam" or "not spam".
- Customer Churn: Predicting whether a customer will leave a service ("churn") or stay.
- Fraud Detection: Detecting if a transaction is "fraudulent" or "legitimate".

In binary classification, the two classes are typically labeled as 0 and 1. Logistic regression estimates the probability that the target is 1, given the input features.

## 2. Theory Behind Logistic Regression

Unlike linear regression, logistic regression does not predict continuous outcomes but rather the probability of a certain class. 

It uses the logistic (sigmoid) function to convert the output of the linear model into a probability between 0 and 1.

### Logistic (Sigmoid) Function

$$
P(y=1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n)}}
$$


Where:

- $P(y=1 \mid X)$ is the probability of the positive class (e.g., customer churning).
- $\beta_0, \beta_1, \dots, \beta_n$ are the coefficients of the model.
- $X_1, X_2, \dots, X_n$ are the features (input variables).
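The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the helper names `sigmoid` and `predict_probability`, the two churn features, and every coefficient value are made up for the example.

```python
import math

def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, coefficients, intercept):
    """Return P(y=1 | X) for a single observation."""
    # Linear part: beta_0 + beta_1*X_1 + ... + beta_n*X_n
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    # The sigmoid squashes the linear score into a probability
    return sigmoid(z)

# Hypothetical churn model with two features
# (monthly charges in $100s, years as a customer); values are illustrative only.
p = predict_probability(features=[0.8, 2.0], coefficients=[1.5, -0.6], intercept=-0.5)
print(round(p, 3))  # 0.378
```

Here the linear score is $-0.5 + 1.5 \cdot 0.8 - 0.6 \cdot 2.0 = -0.5$, and the sigmoid turns it into a churn probability of roughly 0.38. Note that this simple version can overflow for very large negative scores; numerical libraries use more careful formulations.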

### Log Odds

In logistic regression, the linear combination of the features models the log-odds of the positive class. Applying the sigmoid function to the log-odds maps them back to a probability:

$$
\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n
$$


This makes logistic regression a linear classifier at its core: the log-odds are linear in the features, but the output is a probability between 0 and 1 rather than an unbounded value.
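To see that the log-odds and the sigmoid are two views of the same relationship, the sketch below applies the sigmoid to a few linear scores and then takes the log-odds of the result. The function names `sigmoid` and `logit` and the sample scores are chosen for illustration only.

```python
import math

def sigmoid(z):
    """Convert a log-odds value z into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Convert a probability p (0 < p < 1) into log-odds."""
    return math.log(p / (1.0 - p))

# Round trip: linear score -> probability -> log-odds
for z in [-2.0, 0.0, 0.5, 3.0]:
    p = sigmoid(z)
    print(f"z = {z:+.1f}  ->  p = {p:.4f}  ->  log-odds = {logit(p):+.4f}")
```

Each line of output recovers the original score (up to floating-point rounding), which is why the coefficients can be read as effects on the log-odds.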

## 3. Differences Between Logistic and Linear Regression

| Logistic Regression                            | Linear Regression                               |
|------------------------------------------------|------------------------------------------------|
| Used for binary classification problems (two outcomes: 0 or 1). | Used for regression problems (predicting continuous values). |
| Outputs probabilities between 0 and 1.         | Outputs continuous numeric values.              |
| Uses the sigmoid function to map the predictions to probabilities. | Uses a straight line to fit the data points.    |
| Predicted probability is a non-linear (S-shaped) function of the features; only the log-odds are linear. | Predictions are a linear function of the features. |
| Loss function: Log Loss (Cross-Entropy).       | Loss function: Mean Squared Error (MSE).        |
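
The two loss functions in the last row of the table can be compared with a small sketch. The function names and the toy labels and predictions are assumptions made for this example; the log loss here requires probabilities strictly between 0 and 1, whereas libraries usually clip them to avoid taking the log of zero.

```python
import math

def log_loss(y_true, p_pred):
    """Average binary cross-entropy; each p must satisfy 0 < p < 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.9, 0.1]  # toy predictions, mostly correct
confident_wrong = [0.1, 0.9, 0.1, 0.9]  # toy predictions, confidently wrong

print(log_loss(y_true, confident_right), log_loss(y_true, confident_wrong))
print(mean_squared_error(y_true, confident_right), mean_squared_error(y_true, confident_wrong))
```

On this toy data the log loss grows from about 0.105 to about 2.303 when the predictions are confidently wrong, a much steeper penalty than the squared error gives.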


## 4. Key Concepts: Odds Ratios and Probabilities

### Odds Ratio:

An odds ratio compares the odds of an event under two conditions. In logistic regression, $e^{\beta_j}$ is the odds ratio for a one-unit increase in $X_j$, with the other features held fixed.

In logistic regression, we model the logarithm of the odds (also called the log-odds).

Odds: The ratio of the probability of an event happening to the probability of it not happening.

$$ \text{Odds} = \frac{P}{1-P} $$

Where $P$ is the probability of the event happening.

#### Example:

If $P = 0.8$ (an 80% chance that the event happens), then the odds are

$$ \text{Odds} = \frac{0.8}{1-0.8} = 4 $$

meaning the event is 4 times as likely to happen as not to happen.
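
The arithmetic above, and the standard reading of a coefficient as an odds ratio, can be checked with a short sketch. The one-feature model below, including the values of `beta_0` and `beta_1`, is hypothetical and only serves to illustrate that increasing the feature by one unit multiplies the odds by $e^{\beta_1}$.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 <= p < 1)."""
    return p / (1.0 - p)

# The worked example: P = 0.8 gives odds of 4 (up to floating-point rounding)
print(round(odds(0.8), 6))

# Hypothetical one-feature model: log-odds = beta_0 + beta_1 * x
beta_0, beta_1 = -1.0, 0.7

def prob(x):
    """P(y=1 | x) under the hypothetical model."""
    return 1.0 / (1.0 + math.exp(-(beta_0 + beta_1 * x)))

# Odds ratio for a one-unit increase in x, compared with exp(beta_1)
ratio = odds(prob(3.0)) / odds(prob(2.0))
print(round(ratio, 4), round(math.exp(beta_1), 4))
```

The two numbers on the last line agree, and the same ratio results for any pair of inputs that differ by one unit.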

### Probability

Logistic regression outputs a probability between 0 and 1, representing the likelihood of an event happening. Using the usual decision threshold of 0.5, if the probability is:

- Greater than 0.5 → Predicts class 1 (event happens).
- Less than 0.5 → Predicts class 0 (event does not happen).
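
The decision rule above can be sketched as a small function. The name `classify` and the sample probabilities are made up for illustration, and treating a probability exactly equal to the threshold as class 1 is an assumption of this sketch, since the rule above leaves that case open.

```python
def classify(probability, threshold=0.5):
    """Return 1 if probability is at or above the threshold, otherwise 0.

    Assigning ties to class 1 is a convention chosen for this sketch.
    """
    return 1 if probability >= threshold else 0

probs = [0.92, 0.50, 0.31, 0.07]  # illustrative predicted probabilities

print([classify(p) for p in probs])                 # [1, 1, 0, 0]
print([classify(p, threshold=0.3) for p in probs])  # [1, 1, 1, 0]
```

Lowering the threshold flags more observations as class 1, which is one way to trade precision for recall when missing a positive case is costly.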

### In simpler terms:

- Logistic regression models the log-odds as a linear function of the features, then converts the result into a probability.
- Odds tell you how much more likely something is to happen compared to it not happening.
- The probability (output of the logistic model) is what you use to make the final prediction:
    - If it's over 0.5, the event is predicted to occur.
    - If it's under 0.5, the event is predicted not to occur.
