# Logistic Regression

## Overview

- **Type:** Classification (binary; extends to multiclass via multinomial or one-vs-rest formulations)
- **Purpose:** Predict the probability that an observation belongs to a class
- **Algorithm:** Models the log-odds of the positive class as a linear function of the input features

## Key Equation

$$
P(y=1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}
$$

- Output: Probability between 0 and 1
- Threshold rule (default):
  - If \( P \geq 0.5 \): predict class 1
  - If \( P < 0.5 \): predict class 0
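
As a minimal sketch of the equation and threshold rule above, the snippet below computes the probability for a single sample and converts it to a label. The function names and the coefficient values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def predict_proba(x, beta0, betas):
    """Return P(y=1 | x) for one sample under the logistic model."""
    # Linear predictor (the log-odds): beta0 + sum_i beta_i * x_i
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(x, beta0, betas, threshold=0.5):
    """Apply the threshold rule: class 1 if P >= threshold, else class 0."""
    return 1 if predict_proba(x, beta0, betas) >= threshold else 0

# Hypothetical coefficients, for illustration only (not fitted to any data)
beta0, betas = -1.0, [2.0, -0.5]

# z = -1 + 2*1 - 0.5*2 = 0, so the probability sits exactly on the threshold
p = predict_proba([1.0, 2.0], beta0, betas)
label = predict_label([1.0, 2.0], beta0, betas)
```

The threshold is left as a parameter because 0.5 is only a default; a different cut-off can be preferable when false positives and false negatives have different costs.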

## Sigmoid Function

- Maps any real value to the open interval (0, 1):
  $$
  \sigma(z) = \frac{1}{1 + e^{-z}}
  $$
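
A direct translation of the sigmoid formula can overflow for large negative inputs, because `math.exp(-z)` becomes too large to represent. The sketch below is one common way around that, assuming plain Python floats: it switches to an algebraically equivalent form when `z` is negative.

```python
import math

def sigmoid(z):
    """Sigmoid that avoids calling exp() on a large positive argument."""
    if z >= 0:
        # exp(-z) is at most 1 here, so this form is safe
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form e^z / (1 + e^z)
    ez = math.exp(z)
    return ez / (1.0 + ez)

mid = sigmoid(0.0)        # the midpoint of the curve
low = sigmoid(-1000.0)    # underflows towards 0 instead of raising OverflowError
high = sigmoid(1000.0)    # saturates towards 1
```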

## Loss Function

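- **Binary Cross-Entropy (Log Loss):**
  $$
  \text{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y_i \log(\hat{p}_i) + (1-y_i)\log(1-\hat{p}_i) \Big]
  $$
  where \( N \) is the number of samples, \( y_i \in \{0, 1\} \) is the true label, and \( \hat{p}_i \) is the predicted probability \( P(y_i = 1 \mid x_i) \).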
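
The loss can be sketched in a few lines of Python. The clipping constant `eps` is an assumption added here to keep `log(0)` from occurring when a predicted probability is exactly 0 or 1; it is not part of the formula itself.

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-15):
    """Average log loss over all samples; labels must be 0 or 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip to (eps, 1 - eps) so the logarithms stay finite (assumed safeguard)
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

# An uninformative model (p = 0.5 everywhere) scores log(2) per sample
uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])

# Confident, correct predictions give a lower loss than hesitant ones
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
hesitant = binary_cross_entropy([1, 0], [0.6, 0.4])
```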

## Model Coefficients

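- **Intercept (\( \beta_0 \))**: log-odds of class 1 when all inputs are 0
- **Coefficients (\( \beta_i \))**: change in the log-odds for a one-unit increase in \( x_i \), holding the other features fixed; \( e^{\beta_i} \) is the corresponding odds ratio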
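
Because the coefficients live on the log-odds scale, exponentiating them is a common way to make them easier to read. The sketch below uses hypothetical coefficient values and feature names (`age`, `smoker`) purely for illustration; they do not come from any fitted model.

```python
import math

# Hypothetical fitted values, for illustration only
beta0 = -1.5
betas = {"age": 0.04, "smoker": 0.9}

# Intercept: odds and probability of class 1 when every feature is 0
baseline_odds = math.exp(beta0)
baseline_prob = baseline_odds / (1.0 + baseline_odds)

# exp(beta_i): factor by which the odds are multiplied when x_i increases
# by one unit and the other features are held fixed
odds_ratios = {name: math.exp(b) for name, b in betas.items()}
```

An odds ratio above 1 means the feature raises the odds of class 1, below 1 means it lowers them, and exactly 1 means no effect on the odds.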

## Performance Metrics

| Metric           | Description                                                     |
|------------------|-----------------------------------------------------------------|
| Accuracy         | Share of all predictions that are correct: (TP + TN) / total    |
| Precision        | Share of positive predictions that are correct: TP / (TP + FP)  |
| Recall           | Share of actual positives that are identified: TP / (TP + FN)   |
| F1-score         | Harmonic mean of precision and recall                           |
| ROC-AUC          | Area under the ROC curve; how well predicted probabilities rank positives above negatives |
| Confusion Matrix | Counts of TP, FP, TN, FN                                        |
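
The threshold-based metrics in the table all follow from the four confusion-matrix counts. A minimal sketch, assuming binary labels encoded as 0 and 1 (the helper names and the example labels are made up for illustration; ROC-AUC is left out because it needs predicted scores rather than hard labels):

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels encoded as 0/1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1; ratios with a zero denominator are set to 0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels: 3 TP, 1 FP, 3 TN, 1 FN
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
```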

## Assumptions

- The relationship between features and the log-odds is linear
- Little or no multicollinearity among features
- Independent observations

## Pros and Cons

**Advantages:**
- Interpretable, simple to implement
- Provides probabilities, not just labels
- Fast to train and use

**Limitations:**
- Assumes linearity in log-odds
- Can struggle with complex or highly non-linear boundaries

## Application Steps

1. Data Exploration and Preprocessing  
2. Splitting Data (Train/Test)  
3. Fitting Logistic Regression Model  
4. Evaluating Performance (metrics above)  
5. Predicting New Labels/Probabilities
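
The steps above can be sketched end to end with only the standard library. Everything here is an assumption made for illustration: the synthetic one-feature dataset, the 80/20 split, and the plain batch gradient descent used for fitting with an arbitrary learning rate and epoch count. In practice a library implementation would usually replace the `fit` function.

```python
import math
import random

def sigmoid(z):
    """Overflow-safe sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(X, beta0, betas):
    """Predicted P(y=1 | x) for every row of X."""
    return [sigmoid(beta0 + sum(b * v for b, v in zip(betas, x))) for x in X]

def fit(X, y, lr=0.1, epochs=500):
    """Minimise the log loss with batch gradient descent (illustrative settings)."""
    beta0, betas = 0.0, [0.0] * len(X[0])
    for _ in range(epochs):
        probs = predict_proba(X, beta0, betas)
        errors = [p - t for p, t in zip(probs, y)]
        # Gradient of the average log loss w.r.t. intercept and coefficients
        grad0 = sum(errors) / len(X)
        grads = [sum(e * x[j] for e, x in zip(errors, X)) / len(X)
                 for j in range(len(betas))]
        beta0 -= lr * grad0
        betas = [b - lr * g for b, g in zip(betas, grads)]
    return beta0, betas

# 1. Data: synthetic, one feature; the label is 1 when x plus noise is positive
random.seed(0)
X = [[random.uniform(-3.0, 3.0)] for _ in range(200)]
y = [1 if x[0] + random.gauss(0.0, 0.5) > 0 else 0 for x in X]

# 2. Train/test split (80/20; the rows are already in random order)
split = 160
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# 3. Fit the model on the training set only
beta0, betas = fit(X_train, y_train)

# 4. Evaluate on the held-out test set
test_preds = [1 if p >= 0.5 else 0 for p in predict_proba(X_test, beta0, betas)]
accuracy = sum(p == t for p, t in zip(test_preds, y_test)) / len(y_test)

# 5. Predict probabilities and labels for new observations
new_probs = predict_proba([[-2.0], [2.0]], beta0, betas)
new_labels = [1 if p >= 0.5 else 0 for p in new_probs]
```

Evaluation uses only the held-out rows, so the accuracy reflects how the model handles data it did not see during fitting.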

