# Logistic Regression

## Classification
- **Definition:** Predicting a discrete category rather than a continuous value.
- **Examples:**
  - **Spam detection:** Is an email spam? (**Yes/No**)
  - **Fraud detection:** Is a transaction fraudulent? (**Yes/No**)
  - **Medical diagnosis:** Is a tumor malignant? (**Yes/No**)


### Binary Classification
- **Only two possible outcomes:** 
  - **0 (Negative class)** → Absence of a property  
  - **1 (Positive class)** → Presence of a property  


### Logistic Regression

**Logistic Regression** is one of the most widely used classification algorithms. It is often applied in medical diagnostics, spam detection, and online advertising. Unlike linear regression, which predicts an unbounded continuous value, logistic regression predicts a probability, which is then mapped to a discrete class label (0 or 1).  

- **Linear regression:** Predicts continuous values.
- **Logistic regression:** Predicts probabilities.
- **Despite its name, Logistic Regression is used for Classification.**  
- **Output:** Probability of the input data belonging to a certain category.



## Sigmoid Function

The **Sigmoid Function** is used in Logistic Regression to map predictions to probabilities. It is an S-shaped curve that maps any real value to the open interval (0, 1); the endpoints 0 and 1 are approached but never reached. The function is defined as:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Where:
- $z$ is the input to the function: a linear combination of the input features.
- $\sigma(z)$ is the output, which is the probability of the input data belonging to the positive class.

### 📉 Properties of Sigmoid  
| **Value of $z$**  | **$\sigma(z)$ Output** |
|--------------------|----------------|
| $z \to +\infty$  | $\sigma(z) \to 1$  |
| $z = 0$  | $\sigma(z) = 0.5$ |
| $z \to -\infty$  | $\sigma(z) \to 0$  |

The sigmoid function **compresses** any input $z$ into a probability range of **(0,1)**.
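
The limits in the table can be checked numerically. The following is a minimal sketch in plain Python; the function name `sigmoid` and the two-branch form (chosen so that `math.exp` never receives a large positive argument) are illustrative assumptions, not part of the original notes.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the algebraically equivalent form e^{z} / (1 + e^{z}),
    # so math.exp is only ever called with a non-positive argument.
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Evaluate at a large negative value, at zero, and at a large positive value.
for z in (-10.0, 0.0, 10.0):
    print(f"z = {z:+.1f} -> sigma(z) = {sigmoid(z):.5f}")
```

At $z = 0$ the function returns exactly 0.5, and the two outer values come out very close to 0 and 1 respectively, matching the table.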



## Logistic Regression Model

The Logistic Regression model follows a 2-step process:
1. **Linear Combination:** Compute the linear combination of the input features and weights.

$$
z = w \cdot x + b
$$



2. **Sigmoid Activation:** Apply the sigmoid function to the linear combination to get the probability.

$$
f(x) = \sigma(z) = \frac{1}{1 + e^{-z}}
$$

Therefore, the Logistic Regression model can be represented as:

$$
f(x) = \frac{1}{1 + e^{-(w \cdot x + b)}}
$$

Where:
- $f(x)$ is the predicted probability of the input data belonging to the positive class.
- $w$ is the weight vector.
- $x$ is the input feature vector.
- $b$ is the bias term.

The output of logistic regression, $f(x)$, represents the **probability** of a class label being **1**:  
$$
P(y = 1 \mid x) = f(x)
$$
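
The two steps above can be sketched in a few lines of plain Python. The function name `predict_proba` and the example weights, bias, and input are hypothetical values chosen only for illustration; they do not come from a trained model.

```python
import math
from typing import Sequence


def predict_proba(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Return f(x) = sigma(w . x + b), the estimated P(y = 1 | x)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    # Step 1: linear combination z = w . x + b
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    # Step 2: sigmoid activation (plain form; adequate for moderate |z|)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters and input, for illustration only.
w = [0.8, -0.4]
b = -0.2
x = [1.5, 2.0]
print(f"P(y = 1 | x) = {predict_proba(x, w, b):.4f}")
```

Here $z = 0.8 \cdot 1.5 - 0.4 \cdot 2.0 - 0.2 = 0.2$, so the printed probability is slightly above 0.5.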

## Classification Decision: Choosing $\hat{y}$

To convert the probability into a class label, we apply a **threshold**:

$$
\hat{y} =
\begin{cases} 
1 & \text{if } f(x) \geq 0.5 \\
0 & \text{if } f(x) < 0.5
\end{cases}
$$

- If $f(x) \geq 0.5$, predict **$y = 1$**.
- If $f(x) < 0.5$, predict **$y = 0$**.

This thresholding mechanism allows logistic regression to separate data into distinct classes.
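
As a small illustration, the rule can be written as a function that takes a probability and returns a label. The name `predict_label` and the adjustable `threshold` parameter are assumptions added for this sketch; the notes themselves use a fixed threshold of 0.5.

```python
def predict_label(probability: float, threshold: float = 0.5) -> int:
    """Map a predicted probability f(x) to a class label (0 or 1)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # A probability exactly equal to the threshold is assigned to class 1.
    return 1 if probability >= threshold else 0


for p in (0.3, 0.5, 0.7):
    print(f"f(x) = {p} -> predicted label = {predict_label(p)}")
```

Note that a probability exactly at the threshold is assigned to class 1, which follows from the $\geq$ in the rule above.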

## Decision Boundary

The decision boundary is the line that separates the classes in a classification problem. In logistic regression, it is the set of points where the model is **equally confident** about classifying a point as either class 0 or class 1, that is, where $f(x) = 0.5$.

Because $\sigma(z) \geq 0.5$ exactly when $z \geq 0$, the model predicts **$y = 1$** whenever:

$$
w \cdot x + b \geq 0
$$

and **$y = 0$** whenever:

$$
w \cdot x + b < 0
$$


## 🔍 Example: Visualizing a Linear Decision Boundary

Consider a dataset with **two features** ($x_1$, $x_2$). The logistic regression model computes:

$$
z = w_1 x_1 + w_2 x_2 + b
$$

If we assume:

$$
w_1 = 1, \quad w_2 = 1, \quad b = -3
$$

Then, the decision boundary occurs where:

$$
x_1 + x_2 - 3 = 0
$$

which simplifies to:

$$
x_1 + x_2 = 3
$$

🔹 **Interpretation**:
- **Points where** $x_1 + x_2 \geq 3$ → Predict $y = 1$ (points exactly on the line have $f(x) = 0.5$ and fall in this class)
- **Points where** $x_1 + x_2 < 3$ → Predict $y = 0$
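
The interpretation can be verified on a few sample points. In this sketch the parameters $w_1 = 1$, $w_2 = 1$, $b = -3$ come from the example, while the function names and the test points are made up for illustration.

```python
def linear_score(x1: float, x2: float,
                 w1: float = 1.0, w2: float = 1.0, b: float = -3.0) -> float:
    """Compute z = w1*x1 + w2*x2 + b with the example's parameters as defaults."""
    return w1 * x1 + w2 * x2 + b


def classify(x1: float, x2: float) -> int:
    """Predict 1 when z >= 0 (equivalently f(x) >= 0.5), otherwise 0."""
    return 1 if linear_score(x1, x2) >= 0 else 0


# Illustrative points: below the line, on the line, above it, below it.
points = [(1.0, 1.0), (2.0, 1.0), (3.0, 3.0), (0.0, 2.5)]
for x1, x2 in points:
    print(f"({x1}, {x2}): z = {linear_score(x1, x2):+.1f}, "
          f"prediction = {classify(x1, x2)}")
```

The point $(2, 1)$ lies exactly on the line $x_1 + x_2 = 3$, so $z = 0$ and the sketch assigns it to class 1, consistent with the $\geq$ in the threshold rule.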



# Logistic Regression and Decision Boundaries

## 1. Logistic Regression Recap

### Model Overview
- **Objective:** Estimate the probability that $y=1$ given input features $\mathbf{x}$.
- **Two-step Process:**
  1. **Linear Combination:**
     - Compute:

$$z = \mathbf{w} \cdot \mathbf{x} + b$$

  2. **Activation via Sigmoid Function:**
     - Apply the Sigmoid (or logistic) function:

$$f(x) = g(z) = \frac{1}{1 + e^{-z}}$$

- **Interpretation:**  
  $f(x)$ is interpreted as the probability $\Pr(y=1 \mid \mathbf{x}; \mathbf{w}, b)$, yielding a value between 0 and 1 (e.g., 0.7 or 0.3).

---

## 2. Prediction Threshold and Decision Rule

### How to Decide the Class?
- **Thresholding:**  
  A common threshold is $0.5$.  
  - If $f(x) \geq 0.5$, **predict** $\hat{y} = 1$.
  - If $f(x) < 0.5$, **predict** $\hat{y} = 0$.

### Mathematical Explanation
- Since $f(x) = g(z)$ and the sigmoid function satisfies:
  $$g(z) \geq 0.5 \quad \text{if and only if} \quad z \geq 0,$$
  the prediction rule becomes:
  - **Predict 1:** when 
    $$\mathbf{w} \cdot \mathbf{x} + b \geq 0$$
  - **Predict 0:** when 
    $$\mathbf{w} \cdot \mathbf{x} + b < 0$$

> **Note:** The condition $\mathbf{w} \cdot \mathbf{x} + b = 0$ defines the **decision boundary** — the point of neutrality between the two classes.
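
The equivalence between $g(z) \geq 0.5$ and $z \geq 0$ follows directly from the definition of the sigmoid. A short derivation, using only $g(z) = \frac{1}{1 + e^{-z}}$:

$$
g(z) \geq \frac{1}{2}
\iff 1 + e^{-z} \leq 2
\iff e^{-z} \leq 1
\iff -z \leq 0
\iff z \geq 0
$$

The first step takes reciprocals of two positive quantities, which reverses the inequality. The third step uses the fact that the exponential function is strictly increasing, with $e^{0} = 1$.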

---

## 3. Visualizing Decision Boundaries with Two Features

### Example: Linear Decision Boundary
- **Scenario:** Classification problem with features $x_1$ and $x_2$.
- **Training Data:**  
  - **Positive examples ($y=1$):** Red crosses.
  - **Negative examples ($y=0$):** Blue circles.
  
- **Logistic Regression Function:**
  $$f(x) = g(z), \quad \text{with} \quad z = w_1 x_1 + w_2 x_2 + b$$
  
- **Given Parameters:**
  - $w_1 = 1$, $w_2 = 1$, $b = -3$
  
- **Decision Boundary Calculation:**
  - Set $z = 0$:
    $$w_1 x_1 + w_2 x_2 + b = 0 \quad \Rightarrow \quad x_1 + x_2 - 3 = 0$$
  - **Boundary Line:**  
    $$x_1 + x_2 = 3$$
  
- **Interpretation:**
  - **Above and to the right of the line ($x_1 + x_2 \geq 3$, i.e. $\mathbf{w} \cdot \mathbf{x} + b \geq 0$):** Predict $y = 1$.
  - **Below and to the left of the line ($x_1 + x_2 < 3$, i.e. $\mathbf{w} \cdot \mathbf{x} + b < 0$):** Predict $y = 0$.
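
To draw the boundary on a plot, it helps to solve $w_1 x_1 + w_2 x_2 + b = 0$ for $x_2$. The helper below is a hypothetical sketch (the name `boundary_x2` is not from the notes) and assumes $w_2 \neq 0$; it returns the $x_2$ coordinate of the boundary for a given $x_1$.

```python
def boundary_x2(x1: float, w1: float, w2: float, b: float) -> float:
    """Solve w1*x1 + w2*x2 + b = 0 for x2 (requires w2 != 0)."""
    if w2 == 0:
        raise ValueError("w2 must be non-zero to solve for x2")
    return (-b - w1 * x1) / w2


# Parameters from the example: w1 = 1, w2 = 1, b = -3.
for x1 in (0.0, 1.5, 3.0):
    x2 = boundary_x2(x1, w1=1.0, w2=1.0, b=-3.0)
    print(f"boundary passes through ({x1}, {x2})")
```

With the example parameters the line passes through $(0, 3)$ and $(3, 0)$, which are the intercepts of $x_1 + x_2 = 3$.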

---

## 4. Non-Linear Decision Boundaries with Polynomial Features

### Extending Logistic Regression
- **Idea:** Incorporate polynomial features to allow for non-linear decision boundaries.
- **Example Function:**
  $$z = w_1 x_1^2 + w_2 x_2^2 + b$$
  
- **Given Parameters:**
  - $w_1 = 1$, $w_2 = 1$, $b = -1$
  
- **Decision Boundary Calculation:**
  - Set $z = 0$:
    $$x_1^2 + x_2^2 - 1 = 0 \quad \Rightarrow \quad x_1^2 + x_2^2 = 1$$
  - **Boundary Curve:**  
    A circle of radius 1 centered at the origin.
  
- **Interpretation:**
  - **Outside the Circle ($x_1^2 + x_2^2 \geq 1$):** Predict $y = 1$.
  - **Inside the Circle ($x_1^2 + x_2^2 < 1$):** Predict $y = 0$.
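
The circular boundary can be checked with a few points. This sketch uses the parameters from the example ($w_1 = 1$, $w_2 = 1$, $b = -1$); the function names and the sample points are illustrative assumptions.

```python
def circle_score(x1: float, x2: float,
                 w1: float = 1.0, w2: float = 1.0, b: float = -1.0) -> float:
    """Compute z = w1*x1^2 + w2*x2^2 + b using squared features."""
    return w1 * x1 ** 2 + w2 * x2 ** 2 + b


def classify(x1: float, x2: float) -> int:
    """Predict 1 on or outside the unit circle (z >= 0), 0 inside it."""
    return 1 if circle_score(x1, x2) >= 0 else 0


# Illustrative points: the center, an interior point, a point on the
# circle, and a point well outside it.
points = [(0.0, 0.0), (0.5, 0.5), (1.0, 0.0), (2.0, 0.0)]
for x1, x2 in points:
    print(f"({x1}, {x2}): z = {circle_score(x1, x2):+.2f}, "
          f"prediction = {classify(x1, x2)}")
```

The model is still linear in its parameters; only the features $x_1^2$ and $x_2^2$ are non-linear in the original inputs, which is what bends the boundary into a circle.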

> **Tip:** By incorporating higher-order polynomial terms (e.g., $x_1x_2$, $x_1^2$, $x_2^2$, etc.), logistic regression can model complex decision boundaries such as ellipses or even more intricate shapes.

---

## 5. Complex Decision Boundaries

### Combining Multiple Polynomial Terms
- **General Form Example:**
  $$z = w_1 x_1 + w_2 x_2 + w_3 x_1^2 + w_4 (x_1x_2) + w_5 x_2^2 + b$$
- **Result:**  
  This can yield highly non-linear decision boundaries, which might take forms like ellipses or other irregular shapes.
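
One way to work with the general form is to map each input to its five polynomial features and then apply an ordinary linear score. In the sketch below, the function names, the weights, and the bias are hypothetical values picked so that the boundary is the ellipse $x_1^2 + 4 x_2^2 = 4$; a bias term $b$ is included as in the earlier models.

```python
from typing import List, Sequence


def quadratic_features(x1: float, x2: float) -> List[float]:
    """Map (x1, x2) to the features [x1, x2, x1^2, x1*x2, x2^2]."""
    return [x1, x2, x1 ** 2, x1 * x2, x2 ** 2]


def classify(x1: float, x2: float, w: Sequence[float], b: float) -> int:
    """Predict 1 when z = w . phi(x) + b >= 0, otherwise 0."""
    phi = quadratic_features(x1, x2)
    z = sum(w_j * phi_j for w_j, phi_j in zip(w, phi)) + b
    return 1 if z >= 0 else 0


# Hypothetical parameters: z = x1^2 + 4*x2^2 - 4, an elliptical boundary.
w = [0.0, 0.0, 1.0, 0.0, 4.0]
b = -4.0

for x1, x2 in [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0), (3.0, 0.0)]:
    print(f"({x1}, {x2}) -> prediction = {classify(x1, x2, w, b)}")
```

Changing the weights changes the shape: a non-zero $w_4$ rotates the ellipse, and non-zero $w_1$ or $w_2$ shifts its center.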

### Key Insight
- **Without Polynomial Features:**  
  Logistic regression will always produce a **linear** decision boundary: a straight line with two features, and a plane or hyperplane with more.
- **With Polynomial Features:**  
  The model is capable of fitting **complex boundaries** to better separate classes.

