# Logistic Regression

## Classification
- **Definition:** Predicting a discrete category rather than a continuous value.
- **Examples:**
  - **Spam detection:** Is an email spam? (**Yes/No**)
  - **Fraud detection:** Is a transaction fraudulent? (**Yes/No**)
  - **Medical diagnosis:** Is a tumor malignant? (**Yes/No**)


### Binary Classification
- **Only two possible outcomes:** 
  - **0 (Negative class)** → Absence of a property  
  - **1 (Positive class)** → Presence of a property  


### Logistic Regression

**Logistic Regression** is one of the most widely used classification algorithms. It is often applied in medical diagnostics, spam detection, and online advertising. Unlike linear regression, which predicts a continuous value directly, logistic regression predicts a probability and then maps that probability to a discrete class label (0 or 1).  

- **Linear regression:** Predicts continuous values.
- **Logistic regression:** Predicts probabilities.
- **Despite its name, Logistic Regression is used for Classification.**  
- **Output:** Probability of the input data belonging to a certain category.



## Sigmoid Function

The **Sigmoid Function** is used in Logistic Regression to map predictions to probabilities. It is an S-shaped curve that maps any real value into the open interval (0, 1). The function is defined as:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Where:
- $z$ is the input to the function: a linear combination of the input features.
- $\sigma(z)$ is the output, interpreted as the probability of the input data belonging to the positive class.

### 📉 Properties of Sigmoid  
| **Value of $z$**  | **$\sigma(z)$ Output** |
|--------------------|----------------|
| $z \to +\infty$  | $\sigma(z) \to 1$  |
| $z = 0$  | $\sigma(z) = 0.5$ |
| $z \to -\infty$  | $\sigma(z) \to 0$  |

The sigmoid function **compresses** any input $z$ into a probability range of **(0,1)**.
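
The limits in the table above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `sigmoid` and the branch on the sign of $z$ (added to avoid floating-point overflow) are illustrative choices, not part of the original notes.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, e^{-z} can overflow, so use the equivalent
    # form e^{z} / (1 + e^{z}) instead.
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Large negative z gives values near 0, z = 0 gives 0.5,
# and large positive z gives values near 1.
for z in (-10.0, 0.0, 10.0):
    print(z, sigmoid(z))
```

Both branches compute the same mathematical function; the split only matters for inputs with very large magnitude.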



## Logistic Regression Model

The Logistic Regression model follows a 2-step process:
1. **Linear Combination:** Compute the linear combination of the input features and weights.

$$
z = w \cdot x + b
$$



2. **Sigmoid Activation:** Apply the sigmoid function to the linear combination to get the probability.

$$
f(x) = \sigma(z) = \frac{1}{1 + e^{-z}}
$$

Therefore, the Logistic Regression model can be represented as:

$$
f(x) = \frac{1}{1 + e^{-(w \cdot x + b)}}
$$

Where:
- $f(x)$ is the predicted probability of the input data belonging to the positive class.
- $w$ is the weight vector.
- $x$ is the input feature vector.
- $b$ is the bias term.

The output of logistic regression, $f(x)$, is the **probability** that the class label is **1** given the input $x$:  
$$
P(y = 1 \mid x) = f(x)
$$
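
The two steps (linear combination, then sigmoid) can be sketched in plain Python as follows. The function name `predict_proba` and the weight, bias, and input values are hypothetical, chosen only to illustrate the formula; in practice $w$ and $b$ would be learned from data.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x: list[float], w: list[float], b: float) -> float:
    """Return f(x) = sigmoid(w . x + b), the estimate of P(y = 1 | x)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    # Step 1: linear combination z = w . x + b
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    # Step 2: sigmoid activation
    return sigmoid(z)


# Hypothetical parameters and input (not from any trained model).
w = [0.5, -0.25]
b = 0.1
x = [2.0, 4.0]
print(predict_proba(x, w, b))  # here z = 1.0 - 1.0 + 0.1 = 0.1
```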

## Classification Decision: Choosing $\hat{y}$

To convert the probability into a class label, we apply a **threshold**:

$$
\hat{y} =
\begin{cases} 
1 & \text{if } f(x) \geq 0.5 \\
0 & \text{if } f(x) < 0.5
\end{cases}
$$

- If $f(x) \geq 0.5$, predict **$y = 1$**.
- If $f(x) < 0.5$, predict **$y = 0$**.

This thresholding mechanism allows logistic regression to separate data into distinct classes.
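
As a small illustration, the thresholding rule can be written as a one-line function. The name `predict_label` and the adjustable `threshold` parameter are assumptions for this sketch; the notes above use a fixed threshold of 0.5.

```python
def predict_label(probability: float, threshold: float = 0.5) -> int:
    """Map a predicted probability f(x) to a class label (0 or 1)."""
    # A probability exactly equal to the threshold is assigned to class 1,
    # matching the >= in the rule above.
    return 1 if probability >= threshold else 0


for p in (0.2, 0.5, 0.9):
    print(p, predict_label(p))
```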

## Decision Boundary

The decision boundary is the boundary that separates the classes in a classification problem. In Logistic Regression, it is the set of points where $f(x) = 0.5$, that is, where the model is **equally confident** about classifying a point as either class 0 or class 1.

Because $\sigma(z) \geq 0.5$ exactly when $z \geq 0$, the model predicts **$y = 1$** whenever:

$$
w \cdot x + b \geq 0
$$

and **$y = 0$** whenever:

$$
w \cdot x + b < 0
$$


## 🔍 Example: Visualizing a Linear Decision Boundary

Consider a dataset with **two features** ($x_1$, $x_2$). The logistic regression model computes:

$$
z = w_1 x_1 + w_2 x_2 + b
$$

If we assume:

$$
w_1 = 1, \quad w_2 = 1, \quad b = -3
$$

Then, the decision boundary occurs where:

$$
x_1 + x_2 - 3 = 0
$$

which simplifies to:

$$
x_1 + x_2 = 3
$$

🔹 **Interpretation**:
- **Points where** $x_1 + x_2 \geq 3$ → Predict $y = 1$
- **Points where** $x_1 + x_2 < 3$ → Predict $y = 0$
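
The example can be checked with a short script that uses the same values $w_1 = 1$, $w_2 = 1$, $b = -3$. The function names and the three test points are chosen here for illustration: one point below the boundary, one on it, and one above it.

```python
import math

# Parameters from the example above.
W1, W2, B = 1.0, 1.0, -3.0


def probability(x1: float, x2: float) -> float:
    """f(x) = sigmoid(w1*x1 + w2*x2 + b) for the example parameters."""
    z = W1 * x1 + W2 * x2 + B
    # The simple form is fine here because z stays small in magnitude.
    return 1.0 / (1.0 + math.exp(-z))


def predict(x1: float, x2: float) -> int:
    """Apply the 0.5 threshold to the predicted probability."""
    return 1 if probability(x1, x2) >= 0.5 else 0


# (1, 1) lies below the line x1 + x2 = 3, (1.5, 1.5) lies on it,
# and (2, 2) lies above it.
for point in [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0)]:
    print(point, round(probability(*point), 3), predict(*point))
```

A point exactly on the line has $z = 0$, so its probability is 0.5 and the $\geq$ in the threshold rule assigns it to class 1.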



## Polynomial Features

In practice, the decision boundary is not always linear. Logistic Regression can be extended to capture non-linear relationships by adding **polynomial features**. This allows the model to capture more complex patterns in the data.

For example, consider a dataset with a single feature $x$. The logistic regression model with polynomial features can be represented as:

$$
f(x) = \sigma(w_1 x + w_2 x^2 + w_3 x^3 + b)
$$

The model is still linear in the weights $w_1$, $w_2$, $w_3$ and the bias $b$; only the decision boundary, expressed in terms of the original feature $x$, becomes non-linear.
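
To make this concrete, here is a minimal sketch of the polynomial model for a single feature. The helper `polynomial_features` and the weights are hypothetical: they are chosen so that $z = x^2 - 4$, which gives a decision boundary at $x = \pm 2$ that no single linear threshold on $x$ could produce.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def polynomial_features(x: float, degree: int) -> list[float]:
    """Expand a single feature x into [x, x^2, ..., x^degree]."""
    return [x ** k for k in range(1, degree + 1)]


def predict_proba(x: float, w: list[float], b: float) -> float:
    """f(x) = sigmoid(w_1 x + w_2 x^2 + ... + b), one weight per power of x."""
    features = polynomial_features(x, degree=len(w))
    z = sum(w_k * f_k for w_k, f_k in zip(w, features)) + b
    return sigmoid(z)


# Hypothetical weights: z = x^2 - 4, so z >= 0 exactly when |x| >= 2.
w, b = [0.0, 1.0, 0.0], -4.0
for x in (-3.0, 0.0, 3.0):
    p = predict_proba(x, w, b)
    print(x, round(p, 3), 1 if p >= 0.5 else 0)
```

With these weights, both $x = -3$ and $x = 3$ fall in class 1 while $x = 0$ falls in class 0, so the positive region is split into two parts of the number line.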
