# Logistic Regression

### 1. What is Logistic Regression, and how does it differ from Linear Regression?

### 🔹 What is Logistic Regression?

**Logistic Regression** is a supervised learning algorithm used for **classification tasks** 🎯 — most commonly binary classification (i.e., classifying data into two categories: yes/no, 0/1, spam/not spam, etc.).

* It models the **probability** that a given input belongs to a particular category.
* It uses the **logistic (sigmoid) function** to map predicted values to probabilities between 0 and 1:

$$P(y = 1 | x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n)}}$$

* Based on a **threshold** (typically 0.5), it classifies the input.

---

### 🔹 What is Linear Regression?

**Linear Regression** is a supervised learning algorithm used for **regression tasks**, where the output is a **continuous numeric value** 📈.

* It tries to model the **linear relationship** between the independent variables (features) and the dependent variable (target):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon$$

* The goal is to minimize the error between the predicted values and the actual values, usually using **least squares**.
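
As a minimal sketch of least squares with a single feature, the slope and intercept can be computed in closed form. The data points and the helper name `fit_line` are made up for illustration.

```python
# Ordinary least squares for one feature, using the closed-form solution.
# The data below are hypothetical and lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

beta0, beta1 = fit_line(xs, ys)
print(f"beta_0 = {beta0:.2f}, beta_1 = {beta1:.2f}")
```

Because the points sit exactly on a line, the fitted coefficients recover that line; with noisy data the same formulas return the line with the smallest squared error.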

---

### 🔸 Key Differences

| Feature | Linear Regression | Logistic Regression |
| :--- | :--- | :--- |
| **Type of Problem** | Regression (predicting continuous values) | Classification (predicting categories) |
| **Output** | Real number | Probability (0 to 1) |
| **Function Used** | Linear function | Sigmoid/logistic function |
| **Prediction** | Direct numerical value | Probability → classified into 0 or 1 |
| **Loss Function** | Mean Squared Error (MSE) | Log Loss / Binary Cross-Entropy |



### 2. Explain the role of the Sigmoid function in Logistic Regression

### 🔍 Role of the Sigmoid Function in Logistic Regression

The **sigmoid function** plays a central role in Logistic Regression by converting the output of a linear model into a **probability score**, which is crucial for classification tasks.

---

### ⚙️ What is the Sigmoid Function?

The sigmoid (or logistic) function is a mathematical function that maps any real-valued number to a value between 0 and 1.

[Image of Sigmoid function]


$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$

Where $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$ is the linear combination of inputs.

---

### 🎯 Why is it Used in Logistic Regression?

Logistic Regression first computes a linear combination of the input features, like in Linear Regression, but instead of outputting that value directly, it passes it through the sigmoid function to get a probability:

$$
P(y = 1 | x) = \sigma(z) = \frac{1}{1 + e^{-z}}
$$

This ensures the output is always between 0 and 1 and is interpretable as a probability.

---

### ✅ Classification Using the Sigmoid Output

After calculating the probability $P(y=1|x)$, we can make predictions based on a threshold (typically 0.5):

* If $P \ge 0.5$, predict class 1.
* If $P < 0.5$, predict class 0.

You can change this threshold depending on the specific application's needs.
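
The two steps above (sigmoid, then threshold) can be sketched in a few lines of plain Python. The coefficients and the helper names `sigmoid` and `predict` are hypothetical, and the function is not hardened against overflow for very large negative `z`.

```python
import math

def sigmoid(z):
    """Map any real number z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias, threshold=0.5):
    """Return (probability, class label) for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = sigmoid(z)
    return p, int(p >= threshold)

# Hypothetical coefficients, chosen only for illustration.
weights = [0.8, -0.4]
bias = -0.2

p, label = predict([1.0, 0.5], weights, bias)
print(f"P(y=1|x) = {p:.3f}, predicted class = {label}")

# Raising the threshold makes the model more conservative about class 1.
_, strict_label = predict([1.0, 0.5], weights, bias, threshold=0.7)
print(f"predicted class at threshold 0.7 = {strict_label}")
```

Here the same probability (about 0.6) is labeled class 1 at the default threshold and class 0 at the stricter one, which is the trade-off a threshold change controls.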

---

### 📊 Visual Intuition

The sigmoid curve has an "S" shape. As the linear output $z$ gets larger, the function's value approaches 1. As $z$ gets smaller, the value approaches 0. At $z=0$, the value is exactly 0.5. This smooth transition makes it ideal for modeling probabilities in binary classification.

---

### 🧠 Summary

| Feature | Description |
| :--- | :--- |
| **Function** | $\sigma(z) = \frac{1}{1 + e^{-z}}$ |
| **Purpose** | Converts linear output to a probability |
| **Output Range** | Between 0 and 1 |
| **Used For** | Binary classification decisions |
| **Why Not Linear Output?**| Linear output can exceed [0,1], which is not valid as a probability. |

### 3. What is Regularization in Logistic Regression and why is it needed?

### 🔐 What is Regularization in Logistic Regression?

**Regularization** is a technique used in logistic regression to **prevent overfitting** by penalizing large coefficients (weights) in the model. This encourages simpler models that generalize better to new, unseen data.

---

### 🚨 Why is Regularization Needed?

In real-world datasets, models can become too complex and might fit the training data too well, even capturing noise as if it were a meaningful pattern. This is known as **overfitting**, and it leads to poor performance on new data. Regularization combats this by controlling the model's complexity, discouraging it from assigning overly large weights to specific features.

---

### 🔧 How Regularization Works

Regularization adds a penalty term to the model's loss function. The model then tries to minimize this new, combined loss function.

$$
\text{Loss} = \text{Log Loss} + \lambda \cdot \text{Regularization Term}
$$

* **Log Loss:** The original loss function of logistic regression, which measures the model's prediction error.
* **Regularization Term:** The penalty for large coefficients.
* **$\lambda$ (Lambda):** A hyperparameter that controls the strength of the regularization. A larger $\lambda$ means a stronger penalty, leading to smaller coefficients.
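
To make the formula concrete, here is a small sketch that evaluates the combined loss for a few values of $\lambda$, using an L2 penalty as the regularization term. The labels, probabilities, and coefficients are invented for illustration, and `log_loss` and `l2_penalty` are local helpers rather than library functions.

```python
import math

def log_loss(y_true, probs):
    """Average binary cross-entropy over the samples."""
    total = 0.0
    for y, p in zip(y_true, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def l2_penalty(weights):
    """Sum of squared coefficients (the intercept is usually left out)."""
    return sum(w ** 2 for w in weights)

# Hypothetical labels, predicted probabilities, and coefficients.
y_true = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.7, 0.4]
weights = [2.0, -1.0]

lambdas = [0.0, 0.1, 1.0]
losses = [log_loss(y_true, probs) + lam * l2_penalty(weights) for lam in lambdas]
for lam, loss in zip(lambdas, losses):
    print(f"lambda={lam}: loss={loss:.4f}")
```

With the predictions held fixed, a larger $\lambda$ raises the loss only through the size of the coefficients, so the optimizer is pushed toward smaller weights. Note that scikit-learn's `LogisticRegression` expresses this strength through `C`, the inverse of the regularization strength, so a smaller `C` means a stronger penalty.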

---

### 📘 Types of Regularization

| Type | Formula | Effect on Weights | Common Use |
| :--- | :--- | :--- | :--- |
| **L1 (Lasso)** | $\sum \lvert\beta_j\rvert$ | Can shrink some coefficients to exactly zero, performing **feature selection**. | When you want a simpler, more interpretable model with fewer features. |
| **L2 (Ridge)** | $\sum \beta_j^2$ | Shrinks all coefficients toward zero, but doesn't make them exactly zero. | When all features are relevant and you want a more stable model. |
| **Elastic Net** | Combination of L1 and L2 | Provides a balance between L1's sparsity and L2's stability. | In cases where you have many features and some may be correlated. |
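
One rough way to see why L1 produces exact zeros while L2 does not is to compare a single shrinkage (proximal) step under each penalty. This is an illustration on made-up coefficients, not a full training loop; `l1_shrink`, `l2_shrink`, and the step size `t` are names introduced here.

```python
def l1_shrink(beta, t):
    """Soft-thresholding: the proximal step for the penalty t * |beta|."""
    if beta > t:
        return beta - t
    if beta < -t:
        return beta + t
    return 0.0  # anything within [-t, t] is set to exactly zero

def l2_shrink(beta, t):
    """Proximal step for the penalty t * beta**2: rescales toward zero."""
    return beta / (1.0 + 2.0 * t)

# Hypothetical coefficients; the two small ones stand in for weak features.
coefs = [3.0, -0.2, 0.05, -1.5]
t = 0.25

l1_result = [l1_shrink(b, t) for b in coefs]
l2_result = [l2_shrink(b, t) for b in coefs]
print("L1:", l1_result)
print("L2:", [round(v, 4) for v in l2_result])
```

The L1 step zeroes the two small coefficients and subtracts a fixed amount from the large ones, while the L2 step scales every coefficient by the same factor and leaves none at zero.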

### 4. What are some common evaluation metrics for classification models, and why are they important?

### 📊 Common Evaluation Metrics for Classification Models and Why They Matter

Evaluation metrics are essential for understanding how well a classification model performs, especially when dealing with imbalanced datasets or different costs for different types of errors.

***

### ✅ 1. Accuracy

* **Definition:** The ratio of correctly predicted observations to the total observations.
$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$

* **Use When:** Classes are roughly balanced (similar numbers of 0s and 1s).
* **Limitation:** Can be misleading if the data is imbalanced (e.g., 95% of one class).

***

### 🔁 2. Precision

* **Definition:** Out of all predicted positives, how many were actually positive?
    $$
    \text{Precision} = \frac{TP}{TP + FP}
    $$
* **Use When:** False positives are costly (e.g., a spam filter that must not send legitimate email to the spam folder).

***

### 🔎 3. Recall (Sensitivity or True Positive Rate)

* **Definition:** Out of all actual positives, how many did the model correctly identify?
    $$
    \text{Recall} = \frac{TP}{TP + FN}
    $$
* **Use When:** False negatives are costly (e.g., fraud detection, medical tests).

***

### ⚖️ 4. F1 Score

* **Definition:** The harmonic mean of precision and recall. It balances both.
    $$
    \text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    $$
* **Use When:** You need a balance between precision and recall, especially when the dataset is imbalanced.

***

### 📈 5. ROC Curve and AUC (Area Under the Curve)

* **ROC Curve:** Plots the True Positive Rate vs. the False Positive Rate at various classification thresholds.
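* **AUC:** The area under the ROC curve, summarized as a single number. It can be read as the probability that the model scores a randomly chosen positive higher than a randomly chosen negative.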
    $$
    \text{AUC} = 1.0 \text{ (Perfect)} \quad \text{AUC} = 0.5 \text{ (Random guessing)}
    $$
* **Use When:** Comparing different models or evaluating performance across various thresholds.
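
AUC can also be computed directly as the fraction of (positive, negative) pairs that the model ranks in the right order, which is equivalent to the area under the ROC curve. The sketch below uses that pairwise definition on made-up scores; `roc_auc` is a local helper, and the quadratic loop is only meant for small examples.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores.
y_true = [0, 0, 1, 1]
auc_mixed = roc_auc(y_true, [0.1, 0.4, 0.35, 0.8])    # one of four pairs misordered
auc_perfect = roc_auc(y_true, [0.1, 0.2, 0.8, 0.9])   # every positive outranks every negative
auc_constant = roc_auc(y_true, [0.5, 0.5, 0.5, 0.5])  # constant scores carry no information
print(auc_mixed, auc_perfect, auc_constant)
```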

***

### 🔢 6. Confusion Matrix

A 2x2 matrix that shows:
* True Positives (TP)
* True Negatives (TN)
* False Positives (FP)
* False Negatives (FN)
* **Purpose:** Helps visualize and derive all other metrics.
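
As a sketch of how the four counts feed the metrics defined earlier, the code below derives accuracy, precision, recall, and F1 from raw labels. The data are hypothetical (95 negatives, 5 positives), and `confusion_counts`, `safe_div`, and `metrics` are helper names introduced here.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def metrics(y_true, y_pred):
    """Compute the standard metrics from the confusion-matrix counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
    }

# Hypothetical imbalanced data: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# A model that always predicts the majority class.
majority = metrics(y_true, [0] * 100)
print(majority)

# A model that finds 4 of the 5 positives at the cost of 6 false alarms.
better = metrics(y_true, [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1)
print(better)
```

The always-negative model reaches 95% accuracy with zero recall, while the second model has lower accuracy (93%) but actually finds most positives, which is why accuracy alone is misleading on imbalanced data.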

***

### 🎯 Why Are These Metrics Important?

| Metric | Tells You About | Importance When... |
| :--- | :--- | :--- |
| **Accuracy** | Overall correctness | Classes are balanced. |
| **Precision** | False positive control | The cost of a false alarm is high. |
| **Recall** | False negative control | Missing a positive case is dangerous. |
| **F1 Score** | Precision/Recall balance | You need a trade-off or have class imbalance. |
| **AUC-ROC** | Model discrimination | Comparing models and evaluating threshold behavior. |


### ✅ Example Use Cases:

* **Medical Diagnosis:** Prioritize **Recall** ⚕️. This is because a false negative (failing to detect a disease) can have severe consequences. It is better to have a few false positives (flagging a healthy person as sick) and run further tests than to miss a true case.
* **Email Spam Filter:** Prioritize **Precision** 📧. A false positive in this case means a legitimate email is marked as spam. Users find it more frustrating to have important emails sent to the spam folder than to have a few spam emails get into their inbox.
* **Search Ranking:** Use **AUC** or **F1** 🔎. In this scenario, you need a balanced metric that considers both relevant and irrelevant results. The AUC provides an overall measure of a model's ability to discriminate between classes, while the F1 score offers a balance between precision and recall, which is often crucial for search relevance.

### 5. Write a Python program that loads a CSV file into a Pandas DataFrame, splits it into train/test sets, trains a Logistic Regression model, and prints its accuracy. (Use Dataset from sklearn package)

In [None]:
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load dataset from sklearn
data = load_breast_cancer()

# Convert to Pandas DataFrame
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

# Features and target
X = df.drop('target', axis=1)
y = df['target']

# Split into train and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Logistic Regression model
model = LogisticRegression(max_iter=10000)  # Increased max_iter for convergence
model.fit(X_train, y_train)

# Predict and calculate accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Accuracy of Logistic Regression model: {accuracy:.4f}")

Accuracy of Logistic Regression model: 0.9561


### 6. Write a Python program to train a Logistic Regression model using L2 regularization (Ridge) and print the model coefficients and accuracy. (Use Dataset from sklearn package)

In [None]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target)

# Convert to a binary classification problem (e.g., classifying if species is 'setosa')
# 'setosa' = 0, others = 1
y_binary = (y != 0).astype(int)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_binary, test_size=0.2, random_state=42)

# Train logistic regression model with L2 regularization (default)
model = LogisticRegression(penalty='l2', solver='lbfgs', max_iter=1000)
model.fit(X_train, y_train)

# Print model coefficients
print("Model Coefficients:")
for feature, coef in zip(X.columns, model.coef_[0]):
    print(f"{feature}: {coef:.4f}")

# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy on test set: {accuracy:.4f}")

Model Coefficients:
sepal length (cm): 0.4276
sepal width (cm): -0.8877
petal length (cm): 2.2147
petal width (cm): 0.9161

Accuracy on test set: 1.0000


### 7. Write a Python program to train a Logistic Regression model for multiclass classification using multi_class='ovr' and print the classification report. (Use Dataset from sklearn package)

In [None]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.multiclass import OneVsRestClassifier

# Load Iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Wrap Logistic Regression with OneVsRestClassifier
model = OneVsRestClassifier(LogisticRegression(solver='lbfgs', max_iter=1000))
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Report
print("Classification Report:\n")
print(classification_report(y_test, y_pred, target_names=iris.target_names))


Classification Report:

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        10
  versicolor       1.00      0.89      0.94         9
   virginica       0.92      1.00      0.96        11

    accuracy                           0.97        30
   macro avg       0.97      0.96      0.97        30
weighted avg       0.97      0.97      0.97        30



### 8. Write a Python program to apply GridSearchCV to tune C and penalty hyperparameters for Logistic Regression and print the best parameters and validation accuracy. (Use Dataset from sklearn package)

In [None]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets (optional if only tuning)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define parameter grid
param_grid = {
    'estimator__C': [0.01, 0.1, 1, 10, 100],
    'estimator__penalty': ['l1', 'l2']
}

# Base model with solver that supports both L1 and L2
base_model = LogisticRegression(solver='liblinear', max_iter=1000)

# Wrap with OneVsRestClassifier for multi-class
model = OneVsRestClassifier(base_model)

# Grid search with 5-fold cross-validation
grid_search = GridSearchCV(estimator=model,
                           param_grid=param_grid,
                           cv=5,
                           scoring='accuracy',
                           n_jobs=-1)

# Fit GridSearchCV
grid_search.fit(X_train, y_train)

# Output best parameters and accuracy
print("✅ Best Parameters:", grid_search.best_params_)
print(f"✅ Best Cross-Validation Accuracy: {grid_search.best_score_:.4f}")


✅ Best Parameters: {'estimator__C': 10, 'estimator__penalty': 'l1'}
✅ Best Cross-Validation Accuracy: 0.9583


### 9. Write a Python program to standardize the features before training Logistic Regression and compare the model's accuracy with and without scaling. (Use Dataset from sklearn package)

In [None]:
import warnings
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Silence all warnings to keep the output short (this also hides any ConvergenceWarning)
warnings.filterwarnings('ignore')

# Load dataset
data = load_wine()
X, y = data.data, data.target

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ------------------------------
# Model WITHOUT Scaling
# ------------------------------
model_no_scaling = LogisticRegression(max_iter=1000, solver='lbfgs')  # multi_class omitted: 'auto' is the default and the argument is deprecated in recent scikit-learn
model_no_scaling.fit(X_train, y_train)
y_pred_no_scaling = model_no_scaling.predict(X_test)
accuracy_no_scaling = accuracy_score(y_test, y_pred_no_scaling)

# ------------------------------
# Model WITH Scaling
# ------------------------------
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model_scaled = LogisticRegression(max_iter=1000, solver='lbfgs')  # multi_class omitted: 'auto' is the default and the argument is deprecated in recent scikit-learn
model_scaled.fit(X_train_scaled, y_train)
y_pred_scaled = model_scaled.predict(X_test_scaled)
accuracy_scaled = accuracy_score(y_test, y_pred_scaled)

# ------------------------------
# Results
# ------------------------------
print(f"🔸 Accuracy without scaling: {accuracy_no_scaling:.4f}")
print(f"✅ Accuracy with scaling   : {accuracy_scaled:.4f}")


🔸 Accuracy without scaling: 1.0000
✅ Accuracy with scaling   : 1.0000


### 10. Imagine you are working at an e-commerce company that wants to predict which customers will respond to a marketing campaign. Given an imbalanced dataset (only 5% of customers respond), describe the approach you'd take to build a Logistic Regression model, including data handling, feature scaling, balancing classes, hyperparameter tuning, and evaluating the model for this real-world business use case.

# Logistic Regression for Imbalanced Marketing Response Prediction

Imagine you work at an e-commerce company predicting which customers will respond to a marketing campaign. The dataset is highly imbalanced: only 5% respond. Here's a recommended approach:

---

## 1️⃣ Data Understanding & Preprocessing

- Explore the dataset and target distribution (5% responders = heavy imbalance).  
- Handle missing values, outliers, and inconsistent records.  
- Feature engineering: e.g., customer recency, frequency, monetary value, demographics, past campaign interactions.

---

## 2️⃣ Feature Scaling

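- Use `StandardScaler` or `MinMaxScaler` on numeric features, fitting the scaler on the training set only and reusing it on the test set.  
- Scaling helps the solver converge faster and lets the regularization penalty act on all coefficients comparably, since the penalty depends on coefficient magnitude.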
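
As a small sketch of what standardization does, the code below applies $z = (x - \text{mean}) / \text{std}$ to one made-up feature, learning the statistics from the training values and reusing them on the test values. The helper names `fit_scaler` and `transform` are hypothetical; `StandardScaler` applies the same formula per column.

```python
import statistics

def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    return statistics.fmean(column), statistics.pstdev(column)

def transform(column, mean, std):
    """Standardize values with previously learned statistics."""
    return [(x - mean) / std for x in column]

# Hypothetical "order value" feature, split into train and test portions.
train_values = [20.0, 40.0, 60.0, 80.0]
test_values = [50.0, 100.0]

# Statistics come from the training values only ...
mean, std = fit_scaler(train_values)
train_scaled = transform(train_values, mean, std)
# ... and are reused unchanged on the test values to avoid leakage.
test_scaled = transform(test_values, mean, std)

print(f"mean={mean:.2f}, std={std:.2f}")
print([round(v, 3) for v in train_scaled])
print([round(v, 3) for v in test_scaled])
```

After scaling, the training values have mean 0 and standard deviation 1, while the test values are expressed on the same scale without having influenced it.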

---

## 3️⃣ Handling Class Imbalance

- **Resampling Techniques:**  
  - Oversample the minority class with SMOTE or `RandomOverSampler` (both from the imbalanced-learn package), on the training data only.  
  - Undersample majority class (caution: may lose info).  
  - Combine oversampling and undersampling if needed.

- **Class Weights:**  
  - Use `class_weight='balanced'` in Logistic Regression to penalize errors on minority class more.  
  - Helps without altering data distribution.
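
For intuition, the `'balanced'` setting weights each class by `n_samples / (n_classes * class_count)`. The sketch below reproduces that formula in plain Python for a hypothetical 5% response rate; `balanced_class_weights` is a local helper, not a library function.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical campaign data: 950 non-responders (0) and 50 responders (1).
labels = [0] * 950 + [1] * 50
weights = balanced_class_weights(labels)
print(weights)
```

Each responder ends up weighted 19 times more heavily than a non-responder (10.0 versus about 0.53), so both classes contribute equally to the total loss.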

---

## 4️⃣ Train/Test Split

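- Use **stratified splitting** to keep the 5% response rate consistent across the train and test sets. Split before any resampling or scaler fitting, so the test set keeps the real class distribution.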

---

## 5️⃣ Model Building & Hyperparameter Tuning

- Train Logistic Regression with regularization (L1 or L2).  
- Tune hyperparameters via `GridSearchCV` or `RandomizedSearchCV`:  
  - `C` (inverse regularization strength)  
  - `penalty` (`l1` or `l2`)  
  - Solver compatible with chosen penalty  
  - Optionally, `class_weight` if not using resampling.

- Tune classification **thresholds** since default 0.5 may not be optimal.
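
Threshold tuning can be sketched as a simple search over candidate cut-offs on a validation set. The example below picks the threshold with the best F1; the labels, probabilities, and candidate list are invented, and in practice the criterion might be expected profit rather than F1.

```python
def f1_at_threshold(y_true, probs, threshold):
    """F1 score when predicting class 1 for probabilities >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, probs):
        pred = int(p >= threshold)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the highest F1 (first one on ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, probs, t))

# Hypothetical validation labels and predicted probabilities.
y_val = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
p_val = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]
candidates = [0.2, 0.3, 0.4, 0.5]

for t in candidates:
    print(f"threshold={t}: F1={f1_at_threshold(y_val, p_val, t):.3f}")
chosen = best_threshold(y_val, p_val, candidates)
print("best threshold:", chosen)
```

On this toy data the default 0.5 misses two of the three responders, and a lower cut-off of 0.3 gives the best F1. The threshold should be chosen on validation data, not on the final test set.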

---

## 6️⃣ Evaluation Metrics

> Accuracy is misleading with imbalanced data. Use:  

- **Precision, Recall, F1-Score** (balance false positives vs false negatives).  
- **ROC-AUC** (overall separability).  
- **Precision-Recall Curve / AUPRC** (focus on minority class performance).  
- **Confusion Matrix** (detailed TP, FP, TN, FN).  
- **Business metrics** like expected profit or lift from targeting predicted responders.

---

## 7️⃣ Model Interpretation

- Analyze coefficients to understand feature impact.  
- Use SHAP or LIME for explainability if needed.

---

## 8️⃣ Deployment & Monitoring

- Deploy model in production pipeline.  
- Monitor real-time performance and retrain as customer behavior changes.  
- Track campaign ROI to ensure business value.

---

## Summary Table

| Step                  | Description                                           |
|-----------------------|-----------------------------------------------------|
| Data preprocessing    | Clean, feature engineer, scale                       |
| Handling imbalance    | Class weighting and/or resampling                    |
| Model tuning          | Grid search `C`, `penalty`, `class_weight`          |
| Metrics & evaluation  | Precision, recall, F1, AUC, PR curves                |
| Business alignment    | Connect predictions to marketing KPIs and profit    |
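
The steps in the table can be sketched end to end with scikit-learn. This is an illustration under stated assumptions, not a production recipe: the data are synthetic (generated with `make_classification` to have roughly 5% positives), class weighting is used instead of resampling, and hyperparameter tuning is left out for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the campaign data: roughly 5% positives.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=42,
)

# Stratify so both splits keep roughly the same responder rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The pipeline fits the scaler on the training data only,
# and class_weight="balanced" up-weights the rare responders.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Score with ranking metrics instead of accuracy.
probs = model.predict_proba(X_test)[:, 1]
roc_auc = roc_auc_score(y_test, probs)
avg_precision = average_precision_score(y_test, probs)
print(f"ROC-AUC: {roc_auc:.3f}")
print(f"Average precision (AUPRC): {avg_precision:.3f}")
```

Wrapping the scaler and the model in one pipeline also makes it safe to pass the whole object to `GridSearchCV`, since the scaler is then refit inside each cross-validation fold.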

