#### 1. **Introduction to Logistic Regression**

**Logistic regression** is a supervised learning algorithm used for binary classification problems, where the target variable has two possible outcomes (e.g., "yes" or "no," "0" or "1").
Unlike linear regression, which predicts a continuous value, logistic regression predicts the probability that an observation belongs to a certain class using a logistic function (also called the sigmoid function).

**Examples of Logistic Regression Use Cases**:
- Predicting whether an email is spam or not spam.
- Determining if a customer will buy a product (yes/no).
- Classifying if a patient has a disease (positive/negative).

---

#### 2. **Logistic Function (Sigmoid Function)**

The logistic function is used to map the predicted values to a probability between 0 and 1. It has the following form:

$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Where:
- $ z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + ... + \beta_n X_n $
- $ \beta_0 $ is the intercept, $ \beta_1, \beta_2, ..., \beta_n $ are the coefficients, and $ X_1, X_2, ..., X_n $ are the feature values.

The output of the sigmoid function is interpreted as the probability that the observation belongs to the positive class (1). For example, if $ \sigma(z) = 0.8 $, it means there is an 80% chance that the observation belongs to the positive class.
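As a minimal sketch of the formula above, the following Python snippet computes $ z $ and passes it through the sigmoid. The function names `sigmoid` and `linear_score` are hypothetical helpers introduced only for illustration.

```python
import math

def sigmoid(z: float) -> float:
    """Map a real number z to a value strictly between 0 and 1."""
    # Note: math.exp can overflow for extremely negative z; fine for a sketch.
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(intercept, coefficients, features):
    """Compute z = beta_0 + beta_1*X_1 + ... + beta_n*X_n."""
    return intercept + sum(b * x for b, x in zip(coefficients, features))

# Large negative z gives a probability near 0, z = 0 gives exactly 0.5,
# and large positive z gives a probability near 1.
for z in (-4.0, 0.0, 4.0):
    print(f"sigma({z:+.1f}) = {sigmoid(z):.4f}")
```

The loop illustrates the S-shape: the output never leaves the interval (0, 1), which is what allows it to be read as a probability.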

---

#### 3. **Decision Boundary**

In logistic regression, the **decision boundary** is conventionally set at a predicted probability of 0.5. If the predicted probability $ P(y=1 | X) $ is greater than 0.5, the model classifies the observation as class 1; otherwise, the observation is classified as class 0. For a single feature $ X $, the predicted probability is:

$$
P(y = 1 | X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}
$$
Thus, the predicted class $ \hat{y} $ is:
- $ \hat{y} = 1 $, if $ P(y=1) > 0.5 $
- $ \hat{y} = 0 $, if $ P(y=1) \leq 0.5 $
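The rule above can be sketched in a few lines of Python. The helper name `classify` and its `threshold` parameter are assumptions made for illustration; the default of 0.5 matches the rule stated here. Because $ \sigma(0) = 0.5 $, the same rule is equivalent to checking whether $ z > 0 $.

```python
def classify(probability: float, threshold: float = 0.5) -> int:
    """Return 1 when the probability exceeds the threshold, otherwise 0."""
    return 1 if probability > threshold else 0

probabilities = [0.20, 0.50, 0.80]
labels = [classify(p) for p in probabilities]
print(labels)  # [0, 0, 1] (a probability of exactly 0.5 falls in class 0)
```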

---

#### 4. **Log-Loss (Cost Function)**

The cost function in logistic regression is **log-loss** (or **binary cross-entropy**). It penalizes incorrect classifications, especially those with high confidence.

$$
Cost(h_\theta(x), y) = -[y \log(h_\theta(x)) + (1 - y) \log(1 - h_\theta(x))]
$$

Where:
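- $ h_\theta(x) $ is the predicted probability for class 1, i.e. the sigmoid output $ \sigma(z) $ from Section 2 (here $ \theta $ denotes the parameters written as $ \beta $ elsewhere).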
- $ y $ is the actual label (0 or 1).

Logistic regression finds the best-fit parameters (coefficients) by minimizing this cost, averaged over all training examples.
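To make the penalty concrete, here is a small sketch of the cost for a single example and its average over a dataset. The function names and the `eps` clipping value (used to avoid taking the logarithm of 0) are assumptions for illustration.

```python
import math

def log_loss_single(y: int, p: float, eps: float = 1e-15) -> float:
    """Log-loss for one example with true label y and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mean_log_loss(labels, probabilities):
    """Average log-loss over a set of examples."""
    losses = [log_loss_single(y, p) for y, p in zip(labels, probabilities)]
    return sum(losses) / len(losses)

# True label is 1 in every case; only the predicted probability changes.
for p in (0.9, 0.6, 0.1):
    print(f"y=1, p={p:.1f} -> loss {log_loss_single(1, p):.3f}")
# y=1, p=0.9 -> loss 0.105
# y=1, p=0.6 -> loss 0.511
# y=1, p=0.1 -> loss 2.303
```

The confident wrong prediction (p = 0.1 when the label is 1) costs roughly twenty times more than the confident correct one (p = 0.9).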

---

### 5. **Step-by-Step Example**

Consider a dataset where we want to predict whether a person will buy a product (Buy = 1) or not (Buy = 0) based on their age and income.

| Age  | Income  | Buy (y) |
|------|---------|---------|
| 25   | 50000   | 0       |
| 45   | 100000  | 1       |
| 35   | 75000   | 1       |
| 30   | 60000   | 0       |
| 50   | 120000  | 1       |

We’ll use this small dataset to demonstrate how logistic regression works, focusing on calculating the probabilities, making predictions, and understanding the decision boundary.

### **Step-by-Step Process**

#### **Step 1: Hypothesis Function**
Logistic regression models the probability that $ y = 1 $ (i.e., the person buys the product) using the logistic function:

$$
P(y = 1 | X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2)}}
$$

Where:
- $ \beta_0 $ is the intercept,
- $ \beta_1 $ and $ \beta_2 $ are the coefficients for age and income,
- $ X_1 $ and $ X_2 $ represent the values for age and income, respectively.

#### **Step 2: Train the Model**

The logistic regression model will find the best values for $ \beta_0 $, $ \beta_1 $, and $ \beta_2 $ by minimizing the cost function (log-loss).

For this small dataset, the model might output something like:

$$
P(y = 1 | X) = \frac{1}{1 + e^{-(2 + 0.04 \cdot Age + 0.00005 \cdot Income)}}
$$

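These numbers are the intercept (2) and the coefficients for age (0.04) and income (0.00005). They are illustrative values chosen to keep the arithmetic simple rather than a fit to the five rows above (with them, even the rows labeled 0 score above 0.99).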
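One way to carry out the minimization in this step is batch gradient descent. The sketch below is a minimal illustration under stated assumptions, not necessarily how a library solves the problem: the feature standardization, the learning rate (0.5), the step count (5000), and all function names are choices made for this example.

```python
import numpy as np

# The five rows from the table above: Age, Income, and the Buy label.
X = np.array([[25, 50000], [45, 100000], [35, 75000],
              [30, 60000], [50, 120000]], dtype=float)
y = np.array([0, 1, 1, 0, 1], dtype=float)

# Assumption: standardize features so one learning rate suits both columns.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, p, eps=1e-15):
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def fit_gradient_descent(X, y, learning_rate=0.5, n_steps=5000):
    """Minimize mean log-loss with plain batch gradient descent."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)  # coefficients beta_1..beta_n
    beta_0 = 0.0                 # intercept
    for _ in range(n_steps):
        error = sigmoid(X @ beta + beta_0) - y      # predicted minus actual
        beta -= learning_rate * (X.T @ error) / n_samples
        beta_0 -= learning_rate * error.mean()
    return beta_0, beta

initial_loss = log_loss(y, sigmoid(np.zeros(len(y))))  # all parameters at 0
beta_0, beta = fit_gradient_descent(X_std, y)
final_loss = log_loss(y, sigmoid(X_std @ beta + beta_0))
print(f"log-loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Because the features are standardized here, the fitted coefficients are on a different scale from the values 2, 0.04, and 0.00005 used in the worked example. This tiny dataset is also perfectly separable, so without regularization the loss keeps shrinking as the number of steps grows.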

#### **Step 3: Predict Probabilities**

Let’s calculate the probability that a 35-year-old person with an income of 75,000 will buy the product:

$$
P(y = 1 | X) = \frac{1}{1 + e^{-(2 + 0.04 \cdot 35 + 0.00005 \cdot 75000)}}
$$
First, calculate the linear combination:

$$
z = 2 + (0.04 \cdot 35) + (0.00005 \cdot 75000) = 2 + 1.4 + 3.75 = 7.15
$$

Now apply the logistic function:

$$
P(y = 1 | X) = \frac{1}{1 + e^{-7.15}} \approx \frac{1}{1 + 0.0008} \approx 0.9992
$$

The model predicts that there is a 99.92% chance that this person will buy the product.
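The arithmetic above can be checked with a few lines of Python. The variable names are hypothetical; the numeric values are the illustrative intercept and coefficients from Step 2.

```python
import math

# Illustrative parameters from Step 2 (not fitted values).
intercept, coef_age, coef_income = 2.0, 0.04, 0.00005
age, income = 35, 75000

# Linear combination, then the logistic function.
z = intercept + coef_age * age + coef_income * income
probability = 1.0 / (1.0 + math.exp(-z))

print(f"z = {z:.2f}")                    # z = 7.15
print(f"P(y=1|X) = {probability:.4f}")   # P(y=1|X) = 0.9992
```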

#### **Step 4: Make the Final Prediction**

Based on the calculated probability, we apply the decision threshold from Section 3 (0.5 by default). Since 0.9992 is greater than 0.5, the model predicts **Class 1** (the person will buy the product).

---

#### 6. **Python Code Example**

Here's how you can implement logistic regression using Python and the `scikit-learn` library:

```python
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Step 1: Create a dataset
data = {'Age': [25, 45, 35, 30, 50],
        'Income': [50000, 100000, 75000, 60000, 120000],
        'Buy': [0, 1, 1, 0, 1]}

df = pd.DataFrame(data)

# Step 2: Define features (Age, Income) and target (Buy)
X = df[['Age', 'Income']]  # Features
y = df['Buy']  # Target variable

# Step 3: Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 4: Initialize the Logistic Regression model
log_reg = LogisticRegression()

# Step 5: Train the model
log_reg.fit(X_train, y_train)

# Step 6: Make predictions on the test set
y_pred = log_reg.predict(X_test)

# Step 7: Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Step 8: Predict for a new observation (Age = 40, Income = 85000)
new_data = pd.DataFrame({'Age': [40], 'Income': [85000]})
prediction = log_reg.predict(new_data)
print(f'Predicted class for new observation: {prediction[0]}')
```

Output:

```
Accuracy: 1.00
Predicted class for new observation: 1
```


**Explanation**:
- **Step 1**: We create a dataset with age, income, and whether the person buys the product.
- **Step 2**: We define features (age and income) and the target variable (buy).
- **Step 3**: The dataset is split into training and testing sets.
- **Step 4**: We initialize the logistic regression model.
- **Step 5**: We train the model on the training set.
- **Step 6**: We use the trained model to make predictions on the test set.
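- **Step 7**: We evaluate the accuracy of the model. With five rows and `test_size=0.3`, the test set holds only two rows, so this figure is not a reliable estimate of performance.
- **Step 8**: We predict the class for a new data point (Age = 40, Income = 85000).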
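To obtain the probability behind a class prediction, scikit-learn's `LogisticRegression` provides `predict_proba`. The sketch below is a hedged illustration: it refits a model on all five rows (an assumption made to keep the example short and self-contained, so its numbers may differ from the run above) and then applies several thresholds to the same probability.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({'Age': [25, 45, 35, 30, 50],
                   'Income': [50000, 100000, 75000, 60000, 120000],
                   'Buy': [0, 1, 1, 0, 1]})
X = df[['Age', 'Income']]
y = df['Buy']

# Assumption: fit on all five rows (no train/test split) for brevity.
model = LogisticRegression().fit(X, y)

new_data = pd.DataFrame({'Age': [40], 'Income': [85000]})
proba = model.predict_proba(new_data)  # one row; columns follow model.classes_
p_buy = proba[0, 1]                    # probability of class 1 (Buy)
print(f'P(Buy = 1) = {p_buy:.3f}')

# Apply different thresholds to the same probability.
for threshold in (0.5, 0.6, 0.7):
    label = int(p_buy > threshold)
    print(f'threshold {threshold}: class {label}')
```

Raising the threshold makes the model more conservative about predicting class 1, which is the effect the homework below asks you to explore.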

---

#### 7. **Conclusion**

Logistic regression is a widely used algorithm for binary classification tasks. By modeling the probability of a binary outcome using the logistic function, it provides a powerful and interpretable solution to classification problems. 

**Homework**:  
Train a logistic regression model on a larger dataset with multiple features, and analyze the results. Try adjusting the threshold from 0.5 to 0.6 or 0.7 and observe how it affects the classification results.