In [None]:
#1 What is Logistic Regression, and how does it differ from Linear Regression.
#ans **Logistic Regression vs. Linear Regression**  

### **1. Logistic Regression**  
Logistic Regression is a statistical method used for **classification problems** where the output is categorical (e.g., Yes/No, 0/1, Spam/Not Spam). Instead of predicting a continuous value, it predicts the probability that a given input belongs to a particular class using the **Sigmoid function**.  

- **Formula:**  
  \[
  P(Y=1) = \frac{1}{1 + e^{-(b_0 + b_1X_1 + b_2X_2 + ... + b_nX_n)}}
  \]
- The output is a probability between **0 and 1**, which is then converted into a class (e.g., if \( P(Y) > 0.5 \), classify as 1).  

### **2. Linear Regression**  
Linear Regression is used for **regression problems**, where the output is a continuous value (e.g., predicting salary, house price). It finds the best-fit line that minimizes the difference between actual and predicted values.  

- **Formula:**  
  \[
  Y = b_0 + b_1X_1 + b_2X_2 + ... + b_nX_n
  \]
- The output is a continuous numeric value.  

### **Key Differences:**  

| Feature                | Logistic Regression      | Linear Regression  |
|------------------------|------------------------|--------------------|
| **Type of Problem**    | Classification (Yes/No, 0/1) | Regression (Continuous output) |
| **Output**             | Probability (0 to 1) → Categorical | Continuous numerical value |
| **Mathematical Function** | Sigmoid Function | Linear Equation |
| **Error Function**     | Log Loss (Cross-Entropy Loss) | Mean Squared Error (MSE) |
| **Interpretation**     | Predicts the probability of a class | Predicts exact numerical value |


In [None]:
#2 What is the mathematical equation of Logistic Regression.
#ans.The mathematical equation of **Logistic Regression** is derived from the **linear regression equation** but passed through a **sigmoid function** to constrain the output between 0 and 1.

### **1. Linear Regression Equation**  
A linear model is given by:  
\[
Z = b_0 + b_1X_1 + b_2X_2 + ... + b_nX_n
\]
where:  
- \( Z \) is the linear combination of input features.
- \( X_1, X_2, ..., X_n \) are independent variables (features).
- \( b_0 \) is the intercept (bias).
- \( b_1, b_2, ..., b_n \) are coefficients (weights).

### **2. Applying the Sigmoid Function**  
To convert the linear regression output into a probability between **0 and 1**, we apply the **Sigmoid function**:

\[
P(Y=1) = \frac{1}{1 + e^{-Z}}
\]

Substituting \( Z \):

\[
P(Y=1) = \frac{1}{1 + e^{-(b_0 + b_1X_1 + b_2X_2 + ... + b_nX_n)}}
\]

### **3. Decision Boundary**  
- If \( P(Y=1) > 0.5 \), classify as **1**  
- If \( P(Y=1) \leq 0.5 \), classify as **0**
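
The three steps above (linear combination, sigmoid, threshold) can be sketched in a few lines of NumPy. The intercept `b0`, the weights `b`, and the feature rows in `X` are made-up values chosen only for illustration, not fitted parameters.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters (assumption): intercept b0 and weights b1, b2.
b0 = -1.0
b = np.array([2.0, -0.5])

# Three example rows, each holding the features X1 and X2.
X = np.array([
    [0.5, 1.0],
    [2.0, 0.0],
    [0.0, 3.0],
])

z = b0 + X @ b                 # linear combination Z
p = sigmoid(z)                 # P(Y=1) for each row
y_hat = (p > 0.5).astype(int)  # classify as 1 only if P(Y=1) > 0.5

print("Z:", z)
print("P(Y=1):", p.round(4))
print("Predicted class:", y_hat)
```

Because the sigmoid equals 0.5 exactly at \( Z = 0 \), thresholding the probability at 0.5 is the same as checking the sign of \( Z \).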


In [None]:
#3 Why do we use the Sigmoid function in Logistic Regression.
#ans. ### **Why Do We Use the Sigmoid Function in Logistic Regression?**  

The **Sigmoid function** is used in Logistic Regression because it helps transform any real-valued number into a **probability between 0 and 1**, which is essential for classification tasks.  

### **1. Definition of Sigmoid Function**  
The Sigmoid function is given by:  

\[
S(Z) = \frac{1}{1 + e^{-Z}}
\]

where \( Z = b_0 + b_1X_1 + b_2X_2 + ... + b_nX_n \) is the linear equation.

### **2. Why is it Used?**  
✅ **Probability Output:**  
   - The Sigmoid function converts raw scores into probabilities between **0 and 1**, making it suitable for binary classification.  

✅ **Decision Boundary for Classification:**  
   - If \( S(Z) > 0.5 \), classify as **1**  
   - If \( S(Z) \leq 0.5 \), classify as **0**  

✅ **Differentiability (Gradient Descent Friendly):**  
   - The function is **smooth and differentiable**, making it easy to optimize using **gradient descent**.  

✅ **Bounded Output:**  
   - For extreme values of \( Z \) (very large or very small), the output saturates near **1** or **0** instead of growing without bound, so every prediction remains a valid probability.

### **3. Graphical Representation**  
- The Sigmoid curve is **S-shaped**, with values approaching **0** for large negative inputs and **1** for large positive inputs.  
- At \( Z = 0 \), \( S(Z) = 0.5 \), which acts as a natural threshold.  
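
As a rough numerical check of the smoothness and saturation properties described above, the sketch below evaluates the sigmoid and its derivative \( S'(Z) = S(Z)(1 - S(Z)) \) at a few points and compares the derivative with a finite-difference estimate. The sample points and the step size `h` are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Analytic derivative: S'(Z) = S(Z) * (1 - S(Z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
values = sigmoid(z)
grads = sigmoid_grad(z)

# Central finite difference as an independent check of the derivative.
h = 1e-5
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

for zi, vi, gi in zip(z, values, grads):
    print(f"Z={zi:6.1f}  S(Z)={vi:.5f}  S'(Z)={gi:.5f}")
```

The gradient is largest at \( Z = 0 \) and shrinks toward zero in both tails, which is why very confident predictions change slowly during gradient descent.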


In [None]:
#4. What is the cost function of Logistic Regression?
#### **Cost Function of Logistic Regression**  

In Logistic Regression, we use the **Log Loss (Logistic Loss)** or **Binary Cross-Entropy Loss** as the cost function instead of Mean Squared Error (MSE), because the Sigmoid function is non-linear, and MSE would lead to a non-convex loss function, making optimization difficult.

---

### **1. Binary Cross-Entropy (Log Loss) Formula**  
For a binary classification problem where \( Y \in \{0,1\} \) and the predicted probability is \( \hat{Y} = P(Y=1) \), the cost function is:

\[
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ Y^{(i)} \log \hat{Y}^{(i)} + (1 - Y^{(i)}) \log (1 - \hat{Y}^{(i)}) \right]
\]

where:
- \( m \) = number of training examples  
- \( Y^{(i)} \) = actual class label (0 or 1) for the \( i \)th sample  
- \( \hat{Y}^{(i)} \) = predicted probability for class 1  
- The logarithm gives confident correct predictions a cost near zero and penalizes confident wrong predictions heavily  
- The negative sign makes the cost non-negative, because the log of a probability is never positive  

---

### **2. Intuition Behind Log Loss**  
- If **\( Y = 1 \)**: The loss simplifies to \( -\log (\hat{Y}) \), meaning high probability predictions (\( \hat{Y} \approx 1 \)) result in lower loss.  
- If **\( Y = 0 \)**: The loss simplifies to \( -\log (1 - \hat{Y}) \), meaning lower probability predictions for class 1 (\( \hat{Y} \approx 0 \)) result in lower loss.  

---

### **3. Why Not Mean Squared Error (MSE)?**  
- With the sigmoid inside, MSE becomes a **non-convex function** of the weights, so gradient descent can stall in flat regions or poor local minima.  
- Log Loss is **convex**, ensuring efficient optimization with **Gradient Descent**.  
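
To make the intuition concrete, here is a minimal NumPy sketch of the binary cross-entropy formula. The function name `binary_cross_entropy`, the clipping constant `eps`, and the example labels and probabilities are all assumptions made for illustration.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-15):
    """Average log loss; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y = [1, 0, 1, 0]

confident_right = binary_cross_entropy(y, [0.9, 0.1, 0.9, 0.1])
unsure = binary_cross_entropy(y, [0.5, 0.5, 0.5, 0.5])
confident_wrong = binary_cross_entropy(y, [0.1, 0.9, 0.1, 0.9])

print(f"confident and right: {confident_right:.4f}")
print(f"always 0.5:          {unsure:.4f}")
print(f"confident and wrong: {confident_wrong:.4f}")
```

The loss grows quickly as a prediction becomes both confident and wrong, which is the behavior the two bullet cases above describe.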


In [None]:
#5.What is Regularization in Logistic Regression? Why is it needed.
#ans.### **Regularization in Logistic Regression**  

**Regularization** is a technique used to **prevent overfitting** in Logistic Regression by adding a penalty term to the cost function. It helps control the complexity of the model by discouraging large coefficients (weights), ensuring better generalization to unseen data.

---

### **1. Why is Regularization Needed?**  
✅ **Prevents Overfitting:** If the model is too complex (too many features or large weights), it memorizes the training data instead of generalizing well to new data.  

✅ **Reduces Variance:** Regularization reduces model variance, ensuring it performs well on both training and test data.  

✅ **Controls Large Coefficients:** Without regularization, the logistic regression model might assign excessively high values to coefficients, leading to unstable predictions.

---

### **2. Types of Regularization**  
Regularization is applied by adding a penalty term to the **Log Loss function**.

#### **A. L2 Regularization (Ridge Regression)**
- Adds a **squared sum of weights** to the cost function.
- Helps in **shrinking coefficients** without making them exactly zero.

\[
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ Y^{(i)} \log \hat{Y}^{(i)} + (1 - Y^{(i)}) \log (1 - \hat{Y}^{(i)}) \right] + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2
\]

✅ **Used by default in Logistic Regression (e.g., in Scikit-learn)**  
✅ **Prevents large coefficients while keeping all features**  

---

#### **B. L1 Regularization (Lasso Regression)**
- Adds an **absolute sum of weights** to the cost function.
- Helps in **feature selection** by shrinking some coefficients to **exactly zero**.

\[
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ Y^{(i)} \log \hat{Y}^{(i)} + (1 - Y^{(i)}) \log (1 - \hat{Y}^{(i)}) \right] + \frac{\lambda}{m} \sum_{j=1}^{n} |\theta_j|
\]

✅ **Used when feature selection is needed**  
✅ **Removes less important features automatically**  

---

### **3. Choosing Between L1 and L2**
- **L2 (Ridge):** If all features are important but should have controlled impact.  
- **L1 (Lasso):** If you want **automatic feature selection** by removing irrelevant features.  
- **Elastic Net:** A mix of **L1 and L2** regularization, useful when features are correlated.

---

### **4. Hyperparameter \( \lambda \) (Regularization Strength)**
- **High \( \lambda \)** → More regularization (simpler model, avoids overfitting).  
- **Low \( \lambda \)** → Less regularization (complex model, risk of overfitting).  
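
The two penalized cost functions above can be sketched directly in NumPy. This is an illustrative implementation under stated assumptions: `theta[0]` is treated as the intercept and left unpenalized (matching the sums that start at \( j = 1 \)), and the data and parameter values are invented.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(theta, X, y, lam=0.0, penalty="l2"):
    """Log loss plus an L1 or L2 penalty on the weights (not the intercept)."""
    m = len(y)
    p = np.clip(sigmoid(theta[0] + X @ theta[1:]), 1e-15, 1 - 1e-15)
    data_loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    w = theta[1:]
    if penalty == "l2":
        reg = lam / (2 * m) * np.sum(w ** 2)
    elif penalty == "l1":
        reg = lam / m * np.sum(np.abs(w))
    else:
        raise ValueError("penalty must be 'l1' or 'l2'")
    return data_loss + reg

# Made-up data and parameters, for illustration only.
X = np.array([[1.0, 2.0], [0.5, -1.0], [-1.5, 0.5], [2.0, 1.0]])
y = np.array([0, 1, 0, 1])
theta = np.array([0.0, 1.0, -2.0])

base = regularized_cost(theta, X, y, lam=0.0)
with_l2 = regularized_cost(theta, X, y, lam=1.0, penalty="l2")
with_l1 = regularized_cost(theta, X, y, lam=1.0, penalty="l1")

print(f"no penalty: {base:.4f}")
print(f"L2 penalty: {with_l2:.4f}")
print(f"L1 penalty: {with_l1:.4f}")
```

For the same weights, a larger `lam` adds more to the cost, so the optimizer is pushed toward smaller coefficients.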


In [None]:
#6. Explain the difference between Lasso, Ridge, and Elastic Net regression
#ans.### **Difference Between Lasso, Ridge, and Elastic Net Regression**  

Regularization techniques like **Lasso, Ridge, and Elastic Net** are used to **prevent overfitting** by adding a penalty term to the cost function, controlling the size of model coefficients. The key difference lies in **how they penalize** the coefficients.  

---

## **1. Ridge Regression (L2 Regularization)**
✅ **Adds the sum of squared coefficients as a penalty**  

\[
J(\theta) = \text{MSE} + \lambda \sum_{j=1}^{n} \theta_j^2
\]

📌 **Key Features:**  
- Shrinks **all coefficients** but does not make any exactly **zero**.  
- Helps when **all features are important** but need controlled influence.  
- Works well when features are **highly correlated** (prevents overfitting).  

📌 **Use Case:**  
- When **all variables contribute** and you don’t want to remove any features.  

---

## **2. Lasso Regression (L1 Regularization)**
✅ **Adds the sum of absolute values of coefficients as a penalty**  

\[
J(\theta) = \text{MSE} + \lambda \sum_{j=1}^{n} |\theta_j|
\]

📌 **Key Features:**  
- Shrinks some coefficients **completely to zero**, effectively performing **feature selection**.  
- Good for datasets with **many irrelevant features**.  
- Can struggle with **highly correlated features** (randomly picks one).  

📌 **Use Case:**  
- When you want **automatic feature selection** and a **sparse model** (some coefficients = 0).  

---

## **3. Elastic Net Regression (L1 + L2 Regularization)**
✅ **Combines L1 and L2 regularization**  

\[
J(\theta) = \text{MSE} + \lambda_1 \sum_{j=1}^{n} |\theta_j| + \lambda_2 \sum_{j=1}^{n} \theta_j^2
\]

📌 **Key Features:**  
- **Balances Ridge and Lasso**, allowing both **feature selection** and **weight shrinking**.  
- Works well when features are **highly correlated**.  
- Prevents Lasso’s issue of selecting only one feature among correlated features.  

📌 **Use Case:**  
- When you need both **regularization and feature selection** but have correlated features.  

---

## **Comparison Table**

| Feature           | Ridge Regression (L2) | Lasso Regression (L1) | Elastic Net (L1 + L2) |
|------------------|--------------------|--------------------|-------------------|
| **Penalty Term**  | \( \sum \theta^2 \) (Squared) | \( \sum \lvert\theta\rvert \) (Absolute) | Combination of both |
| **Effect on Coefficients** | Shrinks but keeps all | Shrinks, some become **zero** | Shrinks, some become **zero** |
| **Feature Selection?** | ❌ No | ✅ Yes | ✅ Yes (more stable than Lasso with correlated features) |
| **Handles Multicollinearity?** | ✅ Yes | ❌ No (tends to keep one feature from a correlated group, arbitrarily) | ✅ Yes |
| **Use Case** | When all features are useful | When feature selection is needed | When features are correlated and selection is needed |

---

### **Conclusion**
- Use **Ridge (L2)** if **all features matter** and you just want to **control their impact**.  
- Use **Lasso (L1)** if you want **automatic feature selection**.  
- Use **Elastic Net** if you need **feature selection + multicollinearity handling**.  
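
The difference in sparsity can be seen on synthetic data with scikit-learn's `Ridge`, `Lasso`, and `ElasticNet`. This is only a sketch: the dataset and the penalty strengths are arbitrary assumptions, scikit-learn's `alpha` plays the role of \( \lambda \), and the `alpha` values are not directly comparable across the three estimators because their objectives are scaled differently.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data (assumption): 20 features, only 5 of them informative.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

models = {
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=1.0),
    "elastic_net": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    # Count coefficients that were driven exactly to zero.
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name:12s} zero coefficients: {zero_counts[name]} of {X.shape[1]}")
```

Ridge keeps every coefficient non-zero, while the L1 term in Lasso removes some features entirely.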


In [None]:
#7.When should we use Elastic Net instead of Lasso or Ridge.
Elastic Net is a combination of **Lasso (L1 regularization)** and **Ridge (L2 regularization)** and is useful in situations where neither Lasso nor Ridge alone performs optimally. Here’s when to use Elastic Net instead of just Lasso or Ridge:

### 1. **When Features are Highly Correlated**
   - Lasso tends to select only one feature among a group of correlated features and ignores the rest.
   - Ridge spreads the weights across all correlated features but does not perform feature selection.
   - **Elastic Net combines both approaches**, selecting groups of correlated features while also shrinking coefficients to prevent overfitting.

### 2. **When Lasso is Too Aggressive in Feature Selection**
   - Lasso may force some coefficients to zero, removing important features.
   - If you suspect that multiple features contribute to the outcome, Elastic Net ensures **better stability** in feature selection.

### 3. **When You Have More Features Than Observations (High-Dimensional Data)**
   - In cases where the number of features **(p) exceeds the number of samples (n)** (e.g., genetics, text data), Lasso can select at most **n** features, and Ridge performs no feature selection at all.
   - **Elastic Net performs well in high-dimensional settings** by balancing feature selection and coefficient shrinkage.

### 4. **When You Want a Balance Between Lasso and Ridge**
   - Elastic Net introduces a mixing parameter **α (alpha)** to control the ratio between Lasso and Ridge penalties (scikit-learn calls this parameter `l1_ratio` and uses `alpha` for the overall penalty strength).
   - If **α = 1**, it behaves like Lasso.
   - If **α = 0**, it behaves like Ridge.
   - You can **tune α** to find the best mix for your dataset.
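
The effect of the mixing parameter can be checked with scikit-learn's `ElasticNet`. In the sketch below the data and the penalty strength are assumptions; it shows that `l1_ratio=1.0` reproduces Lasso and that a mixed penalty gives a different set of coefficients.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data (assumption): 30 features, only 5 of them informative.
X, y = make_regression(
    n_samples=100, n_features=30, n_informative=5, noise=5.0, random_state=0
)

lasso = Lasso(alpha=1.0).fit(X, y)
enet_as_lasso = ElasticNet(alpha=1.0, l1_ratio=1.0).fit(X, y)  # pure L1
enet_mixed = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)     # half L1, half L2

same_as_lasso = np.allclose(lasso.coef_, enet_as_lasso.coef_)
print("l1_ratio=1.0 matches Lasso:", same_as_lasso)
print("non-zero coefficients (Lasso):      ", int(np.sum(lasso.coef_ != 0)))
print("non-zero coefficients (l1_ratio=0.5):", int(np.sum(enet_mixed.coef_ != 0)))
```

In practice the mixing parameter and the overall strength are usually tuned together with cross-validation.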


In [None]:
#8. What is the impact of the regularization parameter (λ) in Logistic Regression.
#ans.In **Logistic Regression**, the regularization parameter **λ (lambda)** controls the strength of regularization applied to the model. It impacts the model’s performance, complexity, and generalization ability. Here’s how **λ** affects Logistic Regression:

### **1. Controlling Overfitting and Underfitting**
- **Large λ (High Regularization)**  
  - Increases the penalty on large coefficients, forcing them to be smaller.  
  - Prevents overfitting by reducing model complexity.  
  - Can lead to **underfitting** if λ is too high, making the model too simple and less accurate.

- **Small λ (Low Regularization)**  
  - Allows the model to learn more from the data.  
  - If too small, the model may overfit, capturing noise rather than general trends.  

### **2. Impact on Coefficients**
- **Higher λ → Smaller Coefficients:** Shrinks feature weights closer to zero.  
- **Lower λ → Larger Coefficients:** Allows higher variance in feature weights.  
- **Extreme λ → Zero Coefficients (L1 Regularization):** Lasso (L1) can shrink some coefficients **exactly to zero**, effectively performing feature selection.

### **3. Model Generalization**
- A well-tuned **λ** ensures the model generalizes well to unseen data.  
- **Cross-validation** is often used to find the optimal λ that minimizes validation error.

### **4. Types of Regularization in Logistic Regression**
- **L1 Regularization (Lasso)**: Encourages sparsity (some coefficients become zero).  
- **L2 Regularization (Ridge)**: Shrinks all coefficients toward zero without setting any exactly to zero, preventing extreme weights.  
- **Elastic Net**: A mix of L1 and L2 regularization.
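
A quick way to see the effect of regularization strength is to fit the same model with several values of scikit-learn's `C` (the inverse of λ) and compare the size of the coefficient vector. The dataset and the specific `C` values below are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)

norms = {}
for C in [0.01, 1.0, 100.0]:
    # Small C means large lambda, i.e. strong L2 regularization.
    model = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    norms[C] = float(np.linalg.norm(model.coef_))
    print(f"C={C:<6}  ||coef|| = {norms[C]:.3f}")
```

The coefficient norm grows as `C` increases, which corresponds to λ decreasing.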


In [None]:
#9 What are the key assumptions of Logistic Regression.
#ans.Logistic Regression is a widely used classification algorithm, but it relies on certain key assumptions for optimal performance. Here are the key assumptions:

### **1. The Log-Odds Are Linear in the Independent Variables**  
   - Logistic Regression assumes that the **log-odds (logit)** of the outcome, \( \log\frac{P(Y=1)}{1 - P(Y=1)} \), is a **linear function** of the independent variables.  
   - Unlike Linear Regression, it does **not assume a linear relationship** between the independent variables and the outcome itself.

### **2. No Multicollinearity**  
   - Independent variables should **not be highly correlated** with each other.  
   - High correlation (multicollinearity) can make coefficient estimation unstable.  
   - **Solution:** Detect multicollinearity with the **Variance Inflation Factor (VIF)**, then drop or combine correlated features, for example with **Principal Component Analysis (PCA)**.

### **3. Independent Observations**  
   - Observations should be **independent of each other** (i.e., no hidden relationships in the data).  
   - If observations are correlated (e.g., time-series data), methods like **Generalized Estimating Equations (GEE)** or **Mixed Effects Models** should be considered.

### **4. No Strong Outliers**  
   - Logistic Regression is sensitive to outliers, as they can distort coefficient estimates.  
   - **Solution:** Use techniques like **IQR (Interquartile Range), Boxplots, or Winsorization** to detect and handle outliers.

### **5. Large Sample Size (Especially for Maximum Likelihood Estimation - MLE)**  
   - Logistic Regression uses **Maximum Likelihood Estimation (MLE)** for parameter estimation, which works best with **a large number of observations**.  
   - Small sample sizes can lead to unreliable coefficient estimates.

### **6. No Perfect Separation**  
   - If the classes are perfectly separable, the likelihood keeps increasing as the coefficients grow, so **MLE does not converge** to finite estimates.  
   - **Solution:** Use **Regularization (L1/L2), Collect More Data, or Use a Different Model (e.g., Decision Trees, SVM).**

### **7. The Dependent Variable is Binary (For Standard Logistic Regression)**  
   - Logistic Regression assumes the target variable is **binary (0 or 1)** for standard applications.  
   - For multi-class classification, **Multinomial Logistic Regression** should be used.
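
The multicollinearity check from assumption 2 can be sketched without extra libraries. The example below uses the identity that the VIF of feature \( j \), \( 1 / (1 - R_j^2) \), equals the \( j \)-th diagonal entry of the inverse correlation matrix. The three synthetic features are assumptions: `x2` is deliberately built as a noisy copy of `x1`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)  # nearly a copy of x1 (strongly correlated)
x3 = rng.normal(size=n)             # unrelated feature
X = np.column_stack([x1, x2, x3])

# VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix.
corr = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(corr))

for name, value in zip(["x1", "x2", "x3"], vif):
    print(f"VIF({name}) = {value:.1f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity; here `x1` and `x2` are far above that range while `x3` stays near 1.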


In [None]:
#10 What are some alternatives to Logistic Regression for classification tasks.
#ans.If Logistic Regression isn't the best fit for your classification problem, several alternative models can be considered, depending on factors like data size, feature relationships, and interpretability. Here are some key alternatives:

---

### **1. Decision Trees**
   - **Why?** Handles **nonlinear relationships** and is easy to interpret.
   - **Pros:** No need for feature scaling, works well with categorical data.
   - **Cons:** Prone to overfitting (unless pruned or regularized).
   - **Best for:** Small to medium-sized datasets, interpretability-focused applications.

---

### **2. Random Forest**
   - **Why?** An ensemble of Decision Trees that improves generalization.
   - **Pros:** Reduces overfitting, works well with missing data.
   - **Cons:** Less interpretable, computationally expensive.
   - **Best for:** Complex datasets with many features and nonlinear relationships.

---

### **3. Support Vector Machines (SVM)**
   - **Why?** Finds an optimal decision boundary (hyperplane).
   - **Pros:** Works well with high-dimensional data, robust to outliers.
   - **Cons:** Computationally expensive for large datasets, sensitive to hyperparameters.
   - **Best for:** Small to medium-sized datasets with **clear class separation**.

---

### **4. k-Nearest Neighbors (k-NN)**
   - **Why?** A simple, non-parametric method that classifies based on similarity.
   - **Pros:** Almost no training cost (the model just stores the data), intuitive.
   - **Cons:** Computationally expensive at inference, sensitive to irrelevant features.
   - **Best for:** Small datasets with well-defined clusters.

---

### **5. Naïve Bayes**
   - **Why?** A probabilistic model based on **Bayes’ Theorem**.
   - **Pros:** Works well with text classification, fast, and requires little data.
   - **Cons:** Assumes feature independence (which is often unrealistic).
   - **Best for:** **Text classification (spam detection, sentiment analysis)**.

---

### **6. Gradient Boosting (XGBoost, LightGBM, CatBoost)**
   - **Why?** Boosting algorithms that sequentially improve predictions.
   - **Pros:** Highly accurate, works with both structured and unstructured data.
   - **Cons:** Computationally expensive, requires careful tuning.
   - **Best for:** Large datasets with complex relationships.

---

### **7. Neural Networks (Deep Learning)**
   - **Why?** Captures complex patterns and works well with big data.
   - **Pros:** Handles **nonlinearity** and large feature sets.
   - **Cons:** Requires large amounts of data, less interpretable.
   - **Best for:** Image recognition, speech processing, NLP.

---

### **How to Choose the Right Model?**
| Scenario | Best Alternative |
|-----------|----------------|
| High-dimensional data | SVM, Gradient Boosting |
| Nonlinear relationships | Random Forest, XGBoost |
| Interpretability required | Decision Tree, Logistic Regression |
| Small dataset | k-NN, Naïve Bayes |
| Large dataset | Neural Networks, XGBoost |
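
The table gives rules of thumb only; in practice the candidates are usually compared empirically. The sketch below runs 5-fold cross-validation for a few of the models listed above on synthetic data. The dataset and the default hyperparameters are assumptions, and the ranking will differ on real data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "naive_bayes": GaussianNB(),
}

# Mean accuracy over 5 folds for each candidate model.
mean_scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in sorted(mean_scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {score:.3f}")
```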



In [None]:
#11. What are Classification Evaluation Metrics.
### **Classification Evaluation Metrics**  
When evaluating a classification model, we use different metrics to measure its performance. The choice of the right metric depends on the problem (e.g., balanced vs. imbalanced classes). Below are key classification evaluation metrics:

---

### **1. Accuracy**  
📌 **Formula:**  
\[
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
\]  
✅ **Good for:** Balanced datasets.  
❌ **Not ideal for:** Imbalanced datasets (e.g., detecting rare diseases).  

---

### **2. Precision (Positive Predictive Value - PPV)**  
📌 **Formula:**  
\[
Precision = \frac{TP}{TP + FP}
\]  
✅ **Good for:** When **false positives (FP)** need to be minimized (e.g., spam detection).  
❌ **Not ideal for:** When false negatives are more critical.  

---

### **3. Recall (Sensitivity, True Positive Rate - TPR)**  
📌 **Formula:**  
\[
Recall = \frac{TP}{TP + FN}
\]  
✅ **Good for:** When **false negatives (FN)** are costly (e.g., cancer detection).  
❌ **Not ideal for:** Situations where precision is more important.  

---

### **4. F1-Score (Harmonic Mean of Precision and Recall)**  
📌 **Formula:**  
\[
F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}
\]  
✅ **Good for:** Imbalanced datasets (balances precision and recall).  
❌ **Not ideal for:** Cases where one metric (precision or recall) is much more important.  

---

### **5. Specificity (True Negative Rate - TNR)**  
📌 **Formula:**  
\[
Specificity = \frac{TN}{TN + FP}
\]  
✅ **Good for:** When correctly identifying **negative cases** is important (e.g., fraud detection).  

---

### **6. ROC Curve (Receiver Operating Characteristic Curve)**  
📌 **What it Shows:**  
- Plots **True Positive Rate (Recall) vs. False Positive Rate (FPR)**.  
- Helps visualize the model's ability to distinguish between classes.  

✅ **Good for:** Comparing models and tuning classification thresholds.  

---

### **7. AUC-ROC (Area Under the ROC Curve)**  
📌 **What it Represents:**  
- AUC = **1.0** → Perfect model  
- AUC = **0.5** → Random guessing  
- AUC closer to 1 means a **better classifier**.  

✅ **Good for:** Evaluating model performance across different thresholds.  

---

### **8. PR Curve (Precision-Recall Curve)**  
📌 **What it Shows:**  
- Plots **Precision vs. Recall** at different thresholds.  
- More useful for **imbalanced datasets** than ROC.  

✅ **Good for:** When focusing on precision-recall trade-offs.  

---

### **9. Log Loss (Logarithmic Loss / Cross-Entropy Loss)**  
📌 **Formula:**  
\[
LogLoss = -\frac{1}{N} \sum_{i=1}^{N} \left[y_i \log(p_i) + (1 - y_i) \log(1 - p_i)\right]
\]  
✅ **Good for:** Models that output probabilities (e.g., logistic regression, neural networks).  
❌ **Not ideal for:** Hard classification tasks (where only labels are needed, not probabilities).  

---

### **Choosing the Right Metric**
| **Scenario** | **Best Metric** |
|-------------|---------------|
| Balanced dataset | Accuracy |
| Imbalanced dataset | Precision, Recall, F1-Score, PR Curve |
| Minimize false positives | Precision (e.g., spam detection) |
| Minimize false negatives | Recall (e.g., disease detection) |
| Model comparison | AUC-ROC, PR Curve |
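
The formulas above can be verified on a small hand-made example. The labels and predictions below are invented so that the confusion matrix is easy to read off: TP = 3, FN = 1, FP = 1, TN = 5.

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels {0, 1}, ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"specificity={specificity:.3f} f1={f1:.3f}")

# The library functions give the same values as the manual formulas.
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred), f1_score(y_true, y_pred))
```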


In [None]:
#12. How does class imbalance affect Logistic Regression.
### **How Class Imbalance Affects Logistic Regression**  

Class imbalance occurs when one class significantly outnumbers the other in a dataset (e.g., fraud detection, rare disease diagnosis). This imbalance can negatively impact **Logistic Regression** in several ways:

---

### **1. Biased Model Predictions**  
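- Logistic Regression does not assume balanced classes, but with few minority examples the fitted intercept pushes predicted probabilities toward the majority class, so the default 0.5 threshold **favors the majority class**.  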
- This leads to high accuracy but poor performance on the minority class.  

🔹 **Example:**  
- Suppose a dataset has **95% class A** and **5% class B**.  
- If the model always predicts class A, it gets **95% accuracy**, but class B is completely ignored.  

---

### **2. Poor Precision and Recall for the Minority Class**  
- The model **struggles to detect the minority class**, leading to:  
  - **Low Recall (High False Negatives)** → Misses many minority class instances.  
  - **Low Precision (High False Positives)** → Predicts some majority class instances as the minority class incorrectly.  

---

### **3. Misleading Accuracy Metric**  
- Accuracy becomes **misleading** because it is dominated by the majority class.  
- **Solution:** Use **F1-score, Precision-Recall (PR) Curve, or AUC-ROC** instead of accuracy.  

---

### **4. Poor Probability Estimates**  
- The predicted probabilities for the minority class tend to be **small**, because they reflect its low base rate.  
- **Maximum Likelihood Estimation (MLE)** weights every sample equally, so the fit is dominated by the majority class.  
- As a result, the default 0.5 threshold is often a **poor cut-off** for the minority class.  

---

## **How to Handle Class Imbalance in Logistic Regression?**  

### ✅ **1. Resampling Techniques**  
- **Oversampling (e.g., SMOTE - Synthetic Minority Over-sampling Technique)**  
  - Generates synthetic samples for the minority class to balance the dataset.  
- **Undersampling**  
  - Reduces the number of majority class samples.  
- **Hybrid Approach (SMOTE + Undersampling)**  
  - A mix of both techniques to avoid overfitting.  

---

### ✅ **2. Adjusting Class Weights**  
- Set **higher weights for the minority class** during training.  
- In **Scikit-learn**, use:  
  ```python
  from sklearn.linear_model import LogisticRegression
  model = LogisticRegression(class_weight='balanced')
  ```

---

### ✅ **3. Changing Decision Threshold**  
- By default, Logistic Regression classifies based on a **0.5 threshold**.  
- Lowering the threshold can **increase recall** for the minority class.  
- **How to find the best threshold?**  
  - Use the **Precision-Recall curve** or **ROC curve**.  

---

### ✅ **4. Alternative Algorithms**  
- **Tree-based models** (Random Forest, XGBoost) sometimes cope better with imbalance, though they usually still benefit from class weights or resampling.  
- **Anomaly detection models** work well when the minority class is extremely rare.  
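
Two of the remedies above, class weights and a lower decision threshold, can be compared on synthetic imbalanced data. This is a sketch under assumptions: the 95/5 class split, the threshold of 0.2, and the dataset itself are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data with roughly 95% class 0 and 5% class 1 (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0
)

plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_train, y_train)

# Predicted probability of the minority class from the unweighted model.
proba = plain.predict_proba(X_test)[:, 1]

recall_default = recall_score(y_test, (proba > 0.5).astype(int))
recall_low_thr = recall_score(y_test, (proba > 0.2).astype(int))
recall_weighted = recall_score(y_test, weighted.predict(X_test))

print(f"recall, threshold 0.5:    {recall_default:.3f}")
print(f"recall, threshold 0.2:    {recall_low_thr:.3f}")
print(f"recall, balanced weights: {recall_weighted:.3f}")
```

Lowering the threshold can only keep or raise recall, but it typically costs precision, so both should be checked before settling on a cut-off.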


In [None]:
#13. What is Hyperparameter Tuning in Logistic Regression.
#ans.### **Hyperparameter Tuning in Logistic Regression**  

**Hyperparameter tuning** in Logistic Regression means searching for the training settings that give the best validation performance. Unlike model parameters (such as the coefficients), which are learned from the data, hyperparameters are set **before** training.

---

## **Key Hyperparameters in Logistic Regression**
### ✅ **1. Regularization Parameter (λ or C)**
- Controls the strength of **L1 (Lasso) or L2 (Ridge) regularization**.
- In **Scikit-learn**, the hyperparameter is **C**, which is the **inverse** of λ:
  \[
  C = \frac{1}{\lambda}
  \]
- **Higher C (low λ) → Less regularization** (risk of overfitting).
- **Lower C (high λ) → More regularization** (risk of underfitting).

🔹 **Example:**
```python
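from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption); replace with your own X and y.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Candidate values for C (inverse regularization strength), on a log scale.
param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100]}
log_reg = LogisticRegression(penalty='l2', solver='liblinear')
grid_search = GridSearchCV(log_reg, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
print("Best C:", grid_search.best_params_['C'])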
```

---

### ✅ **2. Regularization Type (`penalty`)**
- **L1 (Lasso) → Feature selection (sparse model)**.
- **L2 (Ridge) → Prevents large coefficients (no feature elimination)**.
- **Elastic Net (Mix of L1 and L2) → Balances feature selection and shrinkage**.

🔹 **Example:**
```python
param_grid = {'penalty': ['l1', 'l2'], 'C': [0.01, 0.1, 1, 10]}
log_reg = LogisticRegression(solver='liblinear')
grid_search = GridSearchCV(log_reg, param_grid, cv=5)
grid_search.fit(X_train, y_train)
```

---

### ✅ **3. Solver (`solver`)**
- Different optimization algorithms for training:
  - `'liblinear'` → Good for small datasets (supports L1 & L2).
  - `'lbfgs'` → Default, good for large datasets (L2 only).
  - `'saga'` → Best for large datasets with L1/L2 or Elastic Net.

🔹 **Example:**
```python
param_grid = {'solver': ['liblinear', 'lbfgs', 'saga']}
```

---

### ✅ **4. Class Weight (`class_weight`)**
- Handles **class imbalance** by assigning different weights to classes.
- `'balanced'` automatically adjusts weights based on class distribution.

🔹 **Example:**
```python
log_reg = LogisticRegression(class_weight='balanced')
```

---

### ✅ **5. Maximum Iterations (`max_iter`)**
- Controls the number of iterations for convergence.
- Increase if the model **fails to converge**.

🔹 **Example:**
```python
log_reg = LogisticRegression(max_iter=500)
```

---

## **Hyperparameter Tuning Methods**
### 🔹 **1. Grid Search (Exhaustive Search)**
- Tests all possible combinations.
- **Computationally expensive** for large datasets.

```python
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.001, 0.01, 0.1, 1, 10], 'penalty': ['l1', 'l2']}
grid_search = GridSearchCV(LogisticRegression(solver='liblinear'), param_grid, cv=5)
grid_search.fit(X_train, y_train)

print("Best Parameters:", grid_search.best_params_)
```

---

### 🔹 **2. Randomized Search (Faster Alternative)**
- Randomly selects hyperparameter values instead of testing all combinations.
- **Faster than Grid Search**, but may miss the best combination.

```python
from sklearn.model_selection import RandomizedSearchCV
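from scipy.stats import loguniform

# Sample C on a log scale, since useful values span several orders of magnitude.
param_dist = {'C': loguniform(1e-3, 1e2)}
random_search = RandomizedSearchCV(LogisticRegression(solver='liblinear'), param_dist, n_iter=10, cv=5, random_state=42)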
random_search.fit(X_train, y_train)

print("Best Parameters:", random_search.best_params_)
```

---

### 🔹 **3. Bayesian Optimization**
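- Builds a probabilistic model of how the validation score depends on the hyperparameters and uses it to choose the next values to try.
- Usually needs **fewer evaluations** than Grid or Random Search, at the cost of a more complex setup.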


In [None]:
#14 What are different solvers in Logistic Regression? Which one should be used.
#ans.### **Different Solvers in Logistic Regression & When to Use Them**  

In **Logistic Regression**, solvers are optimization algorithms used to find the best model parameters. Choosing the right solver depends on **dataset size, feature count, regularization type, and computational efficiency**.

---

## **1. List of Solvers in Scikit-learn Logistic Regression**
| **Solver**  | **Best For** | **Supports L1?** | **Supports L2?** | **Supports Elastic Net?** | **Multiclass?** |
|------------|-------------|------------------|------------------|------------------------|------------------|
| **'liblinear'**  | Small datasets | ✅ Yes | ✅ Yes | ❌ No | ❌ No (One-vs-Rest only) |
| **'lbfgs'**      | Large datasets | ❌ No | ✅ Yes | ❌ No | ✅ Yes (Multinomial) |
| **'newton-cg'**  | Large datasets | ❌ No | ✅ Yes | ❌ No | ✅ Yes (Multinomial) |
| **'sag'**        | Large datasets, sparse data | ❌ No | ✅ Yes | ❌ No | ✅ Yes (Multinomial) |
| **'saga'**       | Very large datasets, L1/L2, Elastic Net | ✅ Yes | ✅ Yes | ✅ Yes | ✅ Yes (Multinomial) |

---

## **2. Explanation of Each Solver**
### 🔹 **1. 'liblinear' (Library for Large Linear Classification)**
- **Best for**: Small datasets (<10,000 samples), binary classification.
- **Supports**: L1 (Lasso), L2 (Ridge) regularization.
- **Limitation**: Does **not** support multinomial classification.
- **Use Case**: When interpretability & feature selection (L1) are needed.

🔹 **Example:**
```python
LogisticRegression(solver='liblinear', penalty='l1')  # L1 Regularization
```

---

### 🔹 **2. 'lbfgs' (Limited-memory BFGS)**
- **Best for**: Large datasets with many features.
- **Supports**: L2 regularization only.
- **Limitation**: No L1 or Elastic Net support.
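- **Use Case**: General-purpose default solver; for **multiclass classification** it fits a multinomial (softmax) model (`multi_class='multinomial'`). Note that the `multi_class` parameter is deprecated since scikit-learn 1.5, where multinomial is already the behavior for multiclass targets.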

🔹 **Example:**
```python
LogisticRegression(solver='lbfgs', multi_class='multinomial')
```

---

### 🔹 **3. 'newton-cg' (Newton Conjugate Gradient)**
- **Best for**: Large datasets with high-dimensional features.
- **Supports**: L2 regularization.
- **Limitation**: No L1 or Elastic Net support.
- **Use Case**: Similar to **'lbfgs'**; it uses second-order (Hessian) information, which can mean **fewer iterations** at a higher cost per iteration.

🔹 **Example:**
```python
LogisticRegression(solver='newton-cg', multi_class='multinomial')
```

---

### 🔹 **4. 'sag' (Stochastic Average Gradient)**
- **Best for**: Very large datasets, sparse data.
- **Supports**: L2 regularization.
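- **Limitation**: No L1 or Elastic Net support; fast convergence is only guaranteed when features are on approximately the same scale, so standardize the inputs first.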
- **Use Case**: When dealing with **sparse datasets** (e.g., NLP, text data).

🔹 **Example:**
```python
LogisticRegression(solver='sag')
```

---

### 🔹 **5. 'saga' (Stochastic Average Gradient with L1/L2 Support)**
- **Best for**: **Very large datasets (millions of samples)**, Elastic Net.
- **Supports**: L1, L2, Elastic Net regularization.
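- **Limitation**: Can need many iterations to converge, especially on unscaled features; standardize the inputs and raise `max_iter` if a convergence warning appears.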
- **Use Case**: When L1 (feature selection) or Elastic Net (combination of L1 & L2) is needed.

🔹 **Example:**
```python
LogisticRegression(solver='saga', penalty='elasticnet', l1_ratio=0.5)
```
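
Since `'sag'` and `'saga'` converge best on features with similar scales, a common pattern is to put a scaler in front of the model. The sketch below is one way to do that; the breast cancer dataset, the pipeline layout, and `max_iter=5000` are illustrative assumptions rather than anything prescribed above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example dataset (assumption): binary target, 30 features on very different scales
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Standardize first, then fit an Elastic Net model with saga
pipe = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver='saga', penalty='elasticnet', l1_ratio=0.5, max_iter=5000),
)
pipe.fit(X_train, y_train)

test_accuracy = pipe.score(X_test, y_test)
print(f"Elastic Net (saga) test accuracy: {test_accuracy:.4f}")
```

Keeping the scaler inside the pipeline means it is fitted on the training split only, so no test-set statistics leak into training.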

---

## **3. Which Solver Should You Use?**
| **Scenario** | **Best Solver** |
|-------------|----------------|
| Small dataset (binary classification) | `'liblinear'` |
| Large dataset (>10,000 samples) | `'lbfgs'` or `'newton-cg'` |
| Multiclass classification | `'lbfgs'` or `'newton-cg'` |
| Sparse datasets (text, NLP) | `'sag'` or `'saga'` |
| L1 regularization (feature selection) | `'liblinear'` or `'saga'` |
| Elastic Net regularization | `'saga'` |
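
The penalty support listed in the solver table can be captured in a small lookup. This is only a sketch: `SUPPORTED_PENALTIES` and `solvers_for` are hypothetical helper names, not part of scikit-learn, and the mapping simply restates the table from section 1.

```python
# Penalties each solver accepts, restated from the table in section 1
SUPPORTED_PENALTIES = {
    'liblinear': {'l1', 'l2'},
    'lbfgs': {'l2'},
    'newton-cg': {'l2'},
    'sag': {'l2'},
    'saga': {'l1', 'l2', 'elasticnet'},
}

def solvers_for(penalty):
    """Return, in alphabetical order, the solvers that accept the given penalty."""
    return sorted(s for s, penalties in SUPPORTED_PENALTIES.items() if penalty in penalties)

print(solvers_for('l1'))          # ['liblinear', 'saga']
print(solvers_for('elasticnet'))  # ['saga']
print(solvers_for('l2'))          # all five solvers
```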


In [None]:
#15 how is Logistic Regression extended for multiclass classification.
#ans.### **How Logistic Regression is Extended for Multiclass Classification**  

By default, **Logistic Regression is designed for binary classification** (i.e., two classes). However, it can be extended to **multiclass classification** (three or more classes) using two main approaches:  

---

## **1. One-vs-Rest (OvR) / One-vs-All (OvA)**
🔹 **Concept:**  
- Trains **one logistic regression model per class**.
- Each model treats one class as **positive** and all others as **negative**.  
- The model with the **highest probability** is chosen.  

🔹 **Example (3 classes: A, B, C)**  
- Model 1: A vs (B + C)  
- Model 2: B vs (A + C)  
- Model 3: C vs (A + B)  

🔹 **When to Use:**  
✅ Works well for **most datasets**.  
✅ **Efficient for large datasets**.  
❌ Training cost grows with the **number of classes**, because one model is fitted per class (the fits are independent, so they can run in parallel).  

🔹 **Scikit-learn Implementation:**  
```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(multi_class='ovr', solver='liblinear')
model.fit(X_train, y_train)
```
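
Because the `multi_class` parameter is deprecated in recent scikit-learn releases, an alternative way to get the same one-model-per-class behavior is the `OneVsRestClassifier` wrapper. The sketch below assumes the Iris dataset purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Iris has 3 classes, so OvR should fit 3 binary models
X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)

n_models = len(ovr.estimators_)   # one fitted binary classifier per class
train_accuracy = ovr.score(X, y)
print("Binary models trained:", n_models)
print(f"Training accuracy: {train_accuracy:.4f}")
```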

---

## **2. Multinomial Logistic Regression (Softmax Regression)**
🔹 **Concept:**  
- Uses the **softmax function** to assign probabilities to each class.  
- **Single model** learns all class probabilities simultaneously.  
- Class with the **highest probability** is predicted.  

🔹 **Softmax Formula:**  
\[
P(y = k | X) = \frac{e^{\theta_k^T X}}{\sum_{j=1}^{K} e^{\theta_j^T X}}
\]
where \( k \) is the class label, and \( K \) is the total number of classes.
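
As a numeric illustration of the formula above, the sketch below applies softmax to three class scores. The scores stand in for \( \theta_k^T X \) and are made-up values; subtracting the maximum is a standard numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax(scores):
    """Convert a vector of class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # avoids overflow in exp; result is unchanged
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical scores theta_k^T X for K = 3 classes
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
predicted_class = int(np.argmax(probs))

print("Probabilities:", probs)
print("Sum of probabilities:", probs.sum())
print("Predicted class index:", predicted_class)
```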

🔹 **When to Use:**  
✅ Better for **balanced datasets**.  
✅ **More stable than OvR** when the number of classes is small.  
❌ Can be **computationally expensive** for large datasets.  

🔹 **Scikit-learn Implementation:**  
```python
model = LogisticRegression(multi_class='multinomial', solver='lbfgs')
model.fit(X_train, y_train)
```
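
To see the "single model" behavior, the sketch below fits a default `LogisticRegression` on Iris (an assumed example dataset) without passing `multi_class`, then inspects the learned weights and probabilities. With the `'lbfgs'` solver and a multiclass target, recent scikit-learn versions fit the multinomial model by default.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# One model holding a weight vector per class
softmax_model = LogisticRegression(solver='lbfgs', max_iter=1000)
softmax_model.fit(X, y)

proba = softmax_model.predict_proba(X[:5])
print("Coefficient matrix shape (classes, features):", softmax_model.coef_.shape)
print("Row sums of predicted probabilities:", proba.sum(axis=1))
```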

---

## **3. Choosing Between OvR and Multinomial Logistic Regression**
| **Criteria** | **One-vs-Rest (OvR)** | **Multinomial (Softmax)** |
|-------------|----------------------|--------------------------|
| **Computational Cost** | Lower | Higher |
| **Interpretability** | Easier | More complex |
| **Performance** | Good for imbalanced datasets | Good for balanced datasets |
| **Scalability** | Better for large class counts | Efficient for small class counts |

---

## **Conclusion**
- **One-vs-Rest (OvR)** trains multiple binary classifiers and is efficient for large datasets.  
- **Multinomial (Softmax)** directly optimizes for multiple classes and works well for smaller, balanced datasets.  
- In **Scikit-learn**, `'lbfgs'` or `'newton-cg'` solvers are preferred for **multinomial logistic regression**.

In [None]:
#16.what are the advantages and disadvantages of Logistic Regression.
#ans.Advantages:

    Simple to implement and interpret.
    Works well with linearly separable data.
    Outputs probability scores.
    Computationally efficient.

Disadvantages:

    Assumes linear decision boundary.
    Struggles with complex relationships.
    Sensitive to outliers.
    Requires feature engineering for better performance.
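
The linear-boundary limitation and the benefit of feature engineering can be sketched on data that no straight line can separate. The concentric-circles dataset and the degree-2 polynomial features below are assumptions chosen for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Two concentric rings: the true boundary is a circle, not a line
X, y = make_circles(n_samples=500, noise=0.1, factor=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Plain logistic regression can only draw a straight boundary
linear_acc = LogisticRegression().fit(X_train, y_train).score(X_test, y_test)

# Adding squared and interaction terms lets the same model fit a circular boundary
poly_model = make_pipeline(PolynomialFeatures(degree=2), LogisticRegression())
poly_acc = poly_model.fit(X_train, y_train).score(X_test, y_test)

print(f"Accuracy with raw features: {linear_acc:.4f}")
print(f"Accuracy with degree-2 features: {poly_acc:.4f}")
```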

In [None]:
#17 What are some use cases of Logistic Regression.
#ans.Common use cases:

    Medical diagnosis (e.g., predicting disease presence).
    Credit scoring (e.g., loan approval).
    Marketing (e.g., customer churn prediction).
    Spam detection.
    Fraud detection.

In [None]:
#18.What is the difference between Softmax Regression and Logistic Regression.
#ans.Logistic Regression: Used for binary classification; outputs a probability for one class.
Softmax Regression: Used for multiclass classification; assigns probabilities to all classes, summing to 1.
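
The two are closely related: with only two classes, softmax reduces to the sigmoid. The sketch below checks this numerically; the score `z = 1.7` is an arbitrary made-up value and the helper functions are written here for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function used by binary logistic regression."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(scores):
    """Softmax over a vector of class scores."""
    exp_scores = np.exp(scores - np.max(scores))
    return exp_scores / exp_scores.sum()

z = 1.7  # arbitrary score for the positive class; the other class is fixed at 0

p_sigmoid = sigmoid(z)
p_softmax = softmax(np.array([0.0, z]))[1]

print("Sigmoid probability:", p_sigmoid)
print("Two-class softmax probability:", p_softmax)
```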

In [None]:
#19 How do we choose between One-vs-Rest (OvR) and Softmax for multiclass classification.
#ans.OvR: Trains one binary classifier per class; the fits are independent, so it stays practical for large class counts and works with binary-only solvers such as `'liblinear'`.
Softmax: Preferred when the classes are mutually exclusive and a single set of probabilities that sums to 1 across all classes is needed.

In [None]:
#20. How do we interpret coefficients in Logistic Regression?
#ans.Each coefficient is the change in the log-odds of the positive class for a one-unit increase in the predictor variable, holding the other predictors fixed. Exponentiating a coefficient gives the odds ratio.
A positive coefficient means increasing the predictor increases the probability of the positive class.
A negative coefficient means increasing the predictor decreases the probability of the positive class.
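
A minimal sketch of this interpretation, assuming a synthetic dataset with four features: taking `exp` of each coefficient gives the multiplicative change in the odds of the positive class per one-unit increase in that feature.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary dataset (assumption for illustration)
X, y = make_classification(n_samples=1000, n_features=4, n_informative=2,
                           n_redundant=0, random_state=42)

model = LogisticRegression().fit(X, y)

coefficients = model.coef_[0]        # change in log-odds per unit of each feature
odds_ratios = np.exp(coefficients)   # multiplicative change in the odds

for i, (coef, ratio) in enumerate(zip(coefficients, odds_ratios)):
    direction = "raises" if coef > 0 else "lowers"
    print(f"Feature {i}: coefficient={coef:+.3f}, odds ratio={ratio:.3f} ({direction} the odds)")
```

An odds ratio above 1 corresponds to a positive coefficient and below 1 to a negative one.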

In [None]:
                                               PRACTICAL QUESTIONS

In [None]:
#1. Write a Python program that loads a dataset, splits it into training and testing sets, applies Logistic Regression, and prints the model accuracy
#ans.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris

# Load dataset
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Train model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print("Model Accuracy:", accuracy_score(y_test, y_pred))


In [None]:
#2. Write a Python program to apply L1 regularization (Lasso) on a dataset using LogisticRegression(penalty='l1') and print the model accuracy

model = LogisticRegression(penalty='l1', solver='liblinear')
model.fit(X_train, y_train)
print("L1 Regularization Accuracy:", accuracy_score(y_test, model.predict(X_test)))

In [None]:
#3. Write a Python program to train Logistic Regression with L2 regularization (Ridge) using LogisticRegression(penalty='l2'). Print model accuracy and coefficients

model = LogisticRegression(penalty='l2', solver='lbfgs')
model.fit(X_train, y_train)
print("L2 Regularization Accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("Coefficients:", model.coef_)

In [None]:
#4. Write a Python program to train Logistic Regression with Elastic Net Regularization (penalty='elasticnet')

# saga can need many iterations on unscaled features, so max_iter is raised
model = LogisticRegression(penalty='elasticnet', solver='saga', l1_ratio=0.5, max_iter=5000)
model.fit(X_train, y_train)
print("Elastic Net Accuracy:", accuracy_score(y_test, model.predict(X_test)))

In [None]:
#5. Write a Python program to train a Logistic Regression model for multiclass classification using multi_class='ovr'

model = LogisticRegression(multi_class='ovr')
model.fit(X_train, y_train)
print("Multiclass (OvR) Accuracy:", accuracy_score(y_test, model.predict(X_test)))

In [None]:
#6. Write a Python program to apply GridSearchCV to tune the hyperparameters (C and penalty) of Logistic Regression. Print the best parameters and accuracy

from sklearn.model_selection import GridSearchCV

params = {'C': [0.1, 1, 10], 'penalty': ['l1', 'l2']}
# saga supports both penalties; max_iter is raised because it can converge slowly on unscaled features
grid = GridSearchCV(LogisticRegression(solver='saga', max_iter=5000), param_grid=params, cv=5)
grid.fit(X_train, y_train)

print("Best Parameters:", grid.best_params_)
print("Best Accuracy:", grid.best_score_)

In [None]:
#7. Write a Python program to evaluate Logistic Regression using Stratified K-Fold Cross-Validation. Print the average accuracy

from sklearn.model_selection import StratifiedKFold, cross_val_score

kf = StratifiedKFold(n_splits=5)
scores = cross_val_score(model, X_train, y_train, cv=kf)

print("Average Accuracy:", scores.mean())

In [None]:
#8. Write a Python program to load a dataset from a CSV file, apply Logistic Regression, and evaluate its accuracy

import pandas as pd

# Load dataset
df = pd.read_csv("dataset.csv")  # Replace with actual CSV file
X = df.iloc[:, :-1]  # Features
y = df.iloc[:, -1]   # Target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict and evaluate
print("Model Accuracy:", accuracy_score(y_test, model.predict(X_test)))

In [None]:
#9. Write a Python program to apply RandomizedSearchCV for tuning hyperparameters (C, penalty, solver) in Logistic Regression. Print the best parameters and accuracy
#ans.
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Define parameter grid
param_dist = {
    'C': [0.01, 0.1, 1, 10, 100],  # Regularization strength
    'penalty': ['l1', 'l2'],       # Type of regularization
    'solver': ['liblinear', 'saga'] # Solvers that support both l1 and l2
}

# Initialize logistic regression model
model = LogisticRegression(max_iter=500)

# Apply RandomizedSearchCV
random_search = RandomizedSearchCV(model, param_distributions=param_dist, n_iter=10, cv=5, random_state=42)
random_search.fit(X_train, y_train)

# Get best parameters and accuracy
best_model = random_search.best_estimator_
y_pred = best_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print("Best Parameters:", random_search.best_params_)
print("Best Model Accuracy:", accuracy)


In [None]:
#10. Write a Python program to implement One-vs-One (OvO) Multiclass Logistic Regression and print accuracy
#ans.
from sklearn.multiclass import OneVsOneClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Initialize Logistic Regression with One-vs-One (OvO)
ovo_model = OneVsOneClassifier(LogisticRegression(max_iter=500))
ovo_model.fit(X_train, y_train)

# Make predictions
y_pred = ovo_model.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("One-vs-One (OvO) Accuracy:", accuracy)


In [None]:
#11. Write a Python program to train a Logistic Regression model and visualize the confusion matrix for binary classification
#ans.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Compute confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Visualize confusion matrix
plt.figure(figsize=(6, 4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=['Negative', 'Positive'], yticklabels=['Negative', 'Positive'])
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()


In [None]:
#12. Write a Python program to train a Logistic Regression model and evaluate its performance using Precision, Recall, and F1-Score
#ans.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, f1_score, classification_report
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate performance
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

# Print results
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
print("\nClassification Report:\n", classification_report(y_test, y_pred))


In [None]:
#13. Write a Python program to train a Logistic Regression model on imbalanced data and apply class weights to improve model performance
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.datasets import make_classification
import numpy as np

# Generate an imbalanced dataset
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train Logistic Regression model without class weights (Baseline)
model_baseline = LogisticRegression()
model_baseline.fit(X_train, y_train)
y_pred_baseline = model_baseline.predict(X_test)

# Train Logistic Regression model with class weights
model_weighted = LogisticRegression(class_weight='balanced')
model_weighted.fit(X_train, y_train)
y_pred_weighted = model_weighted.predict(X_test)

# Evaluate models
print("Baseline Model Performance:")
print(classification_report(y_test, y_pred_baseline))

print("\nWeighted Model Performance:")
print(classification_report(y_test, y_pred_weighted))


In [None]:
#14. Write a Python program to train Logistic Regression on the Titanic dataset, handle missing values, and evaluate performance
#ans.
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Load the Titanic dataset through seaborn
# (assumption: seaborn downloads it on first use, so this needs internet access)
df = sns.load_dataset("titanic")

# Keep the target and a small set of features
df = df[["survived", "pclass", "sex", "age", "sibsp", "parch", "fare", "embarked"]].copy()

# Handle missing values: median for age, most frequent value for embarked
df["age"] = df["age"].fillna(df["age"].median())
df["embarked"] = df["embarked"].fillna(df["embarked"].mode()[0])

# Encode the categorical columns as indicator variables
df = pd.get_dummies(df, columns=["sex", "embarked"], drop_first=True)

X = df.drop(columns="survived")
y = df["survived"]

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train Logistic Regression model
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate performance
y_pred = model.predict(X_test)
print(f"Model Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print(classification_report(y_test, y_pred))


In [None]:
#15. Write a Python program to apply feature scaling (Standardization) before training a Logistic Regression model. Evaluate its accuracy and compare results with and without scaling
#ans.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification

# Generate a dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression without scaling
model_no_scaling = LogisticRegression()
model_no_scaling.fit(X_train, y_train)
y_pred_no_scaling = model_no_scaling.predict(X_test)
accuracy_no_scaling = accuracy_score(y_test, y_pred_no_scaling)

# Apply Standardization (Feature Scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train Logistic Regression with scaling
model_with_scaling = LogisticRegression()
model_with_scaling.fit(X_train_scaled, y_train)
y_pred_with_scaling = model_with_scaling.predict(X_test_scaled)
accuracy_with_scaling = accuracy_score(y_test, y_pred_with_scaling)

# Print results
print(f"Accuracy without Scaling: {accuracy_no_scaling:.4f}")
print(f"Accuracy with Scaling: {accuracy_with_scaling:.4f}")


In [None]:
#16 Write a Python program to train Logistic Regression and evaluate its performance using ROC-AUC score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve, auc
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict probabilities for the positive class
y_proba = model.predict_proba(X_test)[:, 1]

# Compute ROC-AUC score
roc_auc = roc_auc_score(y_test, y_proba)
print(f"ROC-AUC Score: {roc_auc:.4f}")

# Compute ROC curve
fpr, tpr, _ = roc_curve(y_test, y_proba)

# Plot ROC curve
plt.figure(figsize=(6, 4))
plt.plot(fpr, tpr, label=f"ROC Curve (AUC = {roc_auc:.4f})", color="blue")
plt.plot([0, 1], [0, 1], linestyle="--", color="gray")  # Diagonal line
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend(loc="lower right")
plt.show()


In [None]:
#17 Write a Python program to train Logistic Regression using a custom learning rate (C=0.5) and evaluate accuracy
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification

# Generate a dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression with C=0.5 (in scikit-learn, C is the inverse regularization strength, not a learning rate)
model = LogisticRegression(C=0.5, max_iter=500)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Model Accuracy with C=0.5: {accuracy:.4f}")


In [None]:
#18. Write a Python program to train Logistic Regression and identify important features based on model coefficients
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Generate a dataset with feature names
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)
feature_names = [f'Feature {i}' for i in range(1, 11)]

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Get feature importance (coefficients)
coefficients = model.coef_[0]

# Create a DataFrame to display feature importance
feature_importance = pd.DataFrame({'Feature': feature_names, 'Coefficient': coefficients})
feature_importance['Absolute Coefficient'] = np.abs(feature_importance['Coefficient'])
feature_importance = feature_importance.sort_values(by='Absolute Coefficient', ascending=False)

# Print feature importance
print("Feature Importance based on Logistic Regression Coefficients:")
print(feature_importance[['Feature', 'Coefficient']])


In [None]:
#19. Write a Python program to train Logistic Regression and evaluate its performance using Cohen's Kappa Score
#ans.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score, accuracy_score
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate performance using Cohen's Kappa Score
kappa_score = cohen_kappa_score(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

# Print results
print(f"Model Accuracy: {accuracy:.4f}")
print(f"Cohen's Kappa Score: {kappa_score:.4f}")


In [None]:
#20. Write a Python program to train Logistic Regression and visualize the Precision-Recall Curve for binary classification
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, auc
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict probabilities
y_proba = model.predict_proba(X_test)[:, 1]

# Compute Precision-Recall values
precision, recall, _ = precision_recall_curve(y_test, y_proba)
pr_auc = auc(recall, precision)

# Plot Precision-Recall curve
plt.figure(figsize=(6, 4))
plt.plot(recall, precision, label=f"PR Curve (AUC = {pr_auc:.4f})", color="blue")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall Curve")
plt.legend(loc="lower left")
plt.grid()
plt.show()


In [None]:
#21. Write a Python program to train Logistic Regression with different solvers (liblinear, saga, lbfgs) and compare their accuracy
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define solvers to compare
solvers = ['liblinear', 'saga', 'lbfgs']
accuracies = {}

# Train and evaluate Logistic Regression with different solvers
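for solver in solvers:
    # max_iter is raised so that saga has room to converge
    model = LogisticRegression(solver=solver, max_iter=1000)
    model.fit(X_train, y_train)
    accuracies[solver] = accuracy_score(y_test, model.predict(X_test))

# Print the accuracy obtained with each solver
for solver, accuracy in accuracies.items():
    print(f"Solver: {solver}, Accuracy: {accuracy:.4f}")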


In [None]:
#22. Write a Python program to train Logistic Regression and evaluate its performance using Matthews Correlation Coefficient (MCC)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef, accuracy_score
from sklearn.datasets import make_classification

# Generate a binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Compute MCC
mcc_score = matthews_corrcoef(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)

# Print results
print(f"Model Accuracy: {accuracy:.4f}")
print(f"Matthews Correlation Coefficient (MCC): {mcc_score:.4f}")



In [None]:
#23. Write a Python program to train Logistic Regression on both raw and standardized data. Compare their accuracy to see the impact of feature scaling
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification

# Generate a dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression on raw data
model_raw = LogisticRegression()
model_raw.fit(X_train, y_train)
y_pred_raw = model_raw.predict(X_test)
accuracy_raw = accuracy_score(y_test, y_pred_raw)

# Apply Standardization (Feature Scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train Logistic Regression on standardized data
model_scaled = LogisticRegression()
model_scaled.fit(X_train_scaled, y_train)
y_pred_scaled = model_scaled.predict(X_test_scaled)
accuracy_scaled = accuracy_score(y_test, y_pred_scaled)

# Print comparison results
print(f"Accuracy on Raw Data: {accuracy_raw:.4f}")
print(f"Accuracy on Standardized Data: {accuracy_scaled:.4f}")

# Compare results
if accuracy_scaled > accuracy_raw:
    print("Feature scaling improved the model performance.")
elif accuracy_scaled < accuracy_raw:
    print("Feature scaling reduced the model performance.")
else:
    print("Feature scaling had no impact on the model performance.")


In [None]:
#24. Write a Python program to train Logistic Regression and find the optimal C (regularization strength) using cross-validation
#ans.
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score

# Generate a dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define hyperparameter grid for C
param_grid = {'C': [0.01, 0.1, 1, 10, 100]}

# Perform Grid Search with cross-validation
grid_search = GridSearchCV(LogisticRegression(max_iter=500), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Get the best C value
best_C = grid_search.best_params_['C']
print(f"Optimal C value: {best_C}")

# Train Logistic Regression with the best C
best_model = LogisticRegression(C=best_C, max_iter=500)
best_model.fit(X_train, y_train)

# Predict and evaluate accuracy
y_pred = best_model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy with Optimal C: {accuracy:.4f}")


In [None]:
#25. Write a Python program to train Logistic Regression, save the trained model using joblib, and load it again to make predictions
#ans.
import joblib
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score

# Generate a dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42, n_classes=2)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Save the trained model
joblib.dump(model, "logistic_model.pkl")
print("Model saved successfully!")

# Load the saved model
loaded_model = joblib.load("logistic_model.pkl")
print("Model loaded successfully!")

# Make predictions with the loaded model
y_pred = loaded_model.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.4f}")
