## Question 1

1. Output:
   - Linear Regression: Predicts a continuous value (e.g., house price).
   - Logistic Regression: Predicts a categorical outcome, typically binary (e.g., yes/no, 1/0).

2. Mathematical Approach:
   - Linear: Uses a linear equation, $Y = \beta_0 + \beta_1 X$.
   - Logistic: Passes the same kind of linear combination through the sigmoid function to produce a probability between 0 and 1.

3. Loss Function:
   - Linear: Minimizes Mean Squared Error (MSE).
   - Logistic: Minimizes Log-Loss (Cross-Entropy).

4. Assumptions:
   - Linear: Assumes a linear relationship and normally distributed errors.
   - Logistic: Assumes the log-odds of the outcome are linearly related to the features.

Example for Logistic Regression
- Predicting if a patient has a disease (yes/no) based on diagnostic data is better suited for logistic regression due to the binary nature of the outcome.
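
The difference in output can be sketched in a few lines. This is a minimal illustration, not a fitted model: the coefficients `b0` and `b1` are hypothetical values chosen only to show that the linear output is unbounded while the logistic output stays between 0 and 1.

```python
import math

def linear_predict(x, b0, b1):
    """Linear regression: the output is the raw linear combination."""
    return b0 + b1 * x

def logistic_predict(x, b0, b1):
    """Logistic regression: the linear combination passed through the sigmoid."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.5

for x in (-10.0, 2.0, 10.0):
    # The linear output can be any real number; the logistic output is a probability.
    print(x, linear_predict(x, b0, b1), round(logistic_predict(x, b0, b1), 4))
```

At `x = 2.0` the linear combination is zero, so the sigmoid returns exactly 0.5, which is the usual decision boundary.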

## Question 2

The cost function used in logistic regression is log-loss (cross-entropy loss), defined as:

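$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$

- $h_\theta(x^{(i)})$ is the predicted probability for sample $i$,
- $y^{(i)}$ is the actual label, and
- $m$ is the number of samples.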

Optimization:
The cost has no closed-form minimizer, so it is typically minimized using gradient descent (or variants like stochastic gradient descent), which updates the weights iteratively in the direction that reduces the cost.
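
As a rough sketch of how the cost and its optimization fit together, the following pure-Python example computes the log-loss and applies batch gradient descent. The toy data, the learning rate `lr=0.1`, and the 200 iterations are assumptions made for illustration; a real project would normally rely on a library implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(theta, X, y):
    """Average cross-entropy J(theta) over the m samples."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / m

def gradient_step(theta, X, y, lr):
    """One batch update: theta_j -= lr * (1/m) * sum((h - y) * x_j)."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        for j, v in enumerate(xi):
            grad[j] += (h - yi) * v / m
    return [t - lr * g for t, g in zip(theta, grad)]

# Toy data (hypothetical): the first column is the intercept term, always 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]

theta = [0.0, 0.0]
initial_cost = log_loss(theta, X, y)   # equals log(2) when all weights are zero
for _ in range(200):
    theta = gradient_step(theta, X, y, lr=0.1)
final_cost = log_loss(theta, X, y)

print(initial_cost, final_cost)
```

With all weights at zero every prediction is 0.5, so the starting cost is log(2); each update then lowers the cost on this data.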

## Question 3

Regularization in logistic regression adds a penalty term to the cost function to prevent overfitting, which occurs when the model fits the training data too closely and performs poorly on unseen data.

Types of Regularization:
1. L2 Regularization (Ridge): Adds a penalty proportional to the square of the coefficients (weights) to the cost function:
    - $J(\theta) = -\frac{1}{m}\sum \ldots + \lambda\sum_{j=1}^{n}\theta_j^2$

2. L1 Regularization (Lasso): Adds a penalty proportional to the absolute value of the coefficients:
    - $J(\theta) = -\frac{1}{m}\sum \ldots + \lambda\sum_{j=1}^{n}\lvert\theta_j\rvert$

How It Helps Prevent Overfitting:
Regularization discourages large coefficient values, thus simplifying the model and reducing variance. This helps the model generalize better to new data by avoiding overfitting to the noise or complex patterns in the training set. The parameter λ controls the strength of the regularization.
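
The two penalty terms can be sketched as small helper functions. The function names and the example weights are hypothetical. The sketch also assumes that the intercept `theta[0]` is left unpenalized, which matches the sums above starting at j = 1.

```python
def l2_penalty(theta, lam):
    """Ridge term: lam * sum(theta_j ** 2), skipping the intercept theta[0]."""
    return lam * sum(t * t for t in theta[1:])

def l1_penalty(theta, lam):
    """Lasso term: lam * sum(|theta_j|), skipping the intercept theta[0]."""
    return lam * sum(abs(t) for t in theta[1:])

# Hypothetical weight vectors: one with large coefficients, one with small ones.
large_weights = [0.5, 3.0, -4.0]
small_weights = [0.5, 0.3, -0.4]

for lam in (0.0, 0.1, 1.0):
    # The penalty is added to the log-loss, so large weights cost more as lam grows.
    print(lam, l2_penalty(large_weights, lam), l2_penalty(small_weights, lam))
    print(lam, l1_penalty(large_weights, lam), l1_penalty(small_weights, lam))
```

With λ = 0 the penalty vanishes and the cost reduces to the plain log-loss. As λ grows, weight vectors with large coefficients are penalized more heavily than those with small ones.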

## Question 4

The ROC (Receiver Operating Characteristic) curve is a graphical tool used to evaluate the performance of a binary classification model, like logistic regression.

Components of the ROC Curve:
1. True Positive Rate (TPR): Also known as sensitivity or recall, it measures the proportion of actual positives correctly predicted:
    - $\text{TPR} = \frac{TP}{TP + FN}$

2. False Positive Rate (FPR): It measures the proportion of actual negatives incorrectly predicted as positives:
    - $\text{FPR} = \frac{FP}{FP + TN}$

The ROC curve plots TPR against FPR at various threshold settings of the model.

How It’s Used:
- Evaluation: The closer the ROC curve is to the top-left corner, the better the model performs.
- AUC (Area Under the Curve): A single-number summary of the ROC curve. An AUC of 1 indicates a perfect model, while 0.5 indicates random guessing.

The ROC curve helps assess the model's ability to distinguish between classes across different thresholds.
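
To make the threshold sweep concrete, here is a minimal sketch that builds the (FPR, TPR) points by hand and integrates them with the trapezoidal rule. The labels and scores are made-up values, and the helper names `roc_points` and `auc` are hypothetical. The sketch also assumes both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs from sweeping the threshold over each distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up labels and predicted probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)
print(auc(points))
```

On this data one negative example (score 0.4) is ranked above one positive example (score 0.35), so the AUC falls short of 1.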

## Question 5

Common Techniques for Feature Selection in Logistic Regression:

1. Recursive Feature Elimination (RFE):
    - Iteratively fits the model and removes the least important features, ranked by coefficient magnitude or importance score.
    - Helps by simplifying the model, improving interpretability, and reducing overfitting.

2. L1 Regularization (Lasso):
    - Penalizes the absolute values of coefficients, driving some to zero.
    - Helps automatically select the most important features by eliminating irrelevant ones.

3. Univariate Statistical Tests (e.g., Chi-Square, ANOVA):
    - Selects features based on the statistical significance of their relationship with the target variable.
    - Improves performance by choosing features that have the strongest relationship with the outcome.

4. Principal Component Analysis (PCA):
    - Strictly a feature extraction technique rather than feature selection: it reduces dimensionality by transforming features into principal components that capture the most variance.
    - Helps by reducing noise and improving computational efficiency.

5. Correlation Matrix with Threshold:
    - Removes highly correlated features (multicollinearity) that provide redundant information.
    - Improves stability and prevents inflated coefficients in the model.

How These Techniques Improve Performance:
1. Reduce Overfitting: By eliminating irrelevant or redundant features, the model generalizes better to new data.
2. Improve Interpretability: Fewer features make the model easier to understand and interpret.
3. Enhance Efficiency: Reducing the number of features can lower computational cost and training time.
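
As one concrete illustration, the correlation-threshold technique can be sketched in pure Python. The feature names, values, and the 0.9 threshold are hypothetical, and the greedy "keep the first, drop later duplicates" rule is only one of several reasonable ways to choose which feature of a correlated pair to remove.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |correlation| with every kept feature is <= threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical data: height_in is nearly a rescaled copy of height_cm.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 22.0, 51.0, 29.0, 45.0],
}

print(drop_correlated(features))
```

Here `height_in` is dropped because it carries almost the same information as `height_cm`, while `age` is kept.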

## Question 6

Handling imbalanced datasets in logistic regression is crucial for preventing bias toward the majority class. Here are some common strategies to address class imbalance:

1. Resampling Techniques:
    - Oversampling the Minority Class: Increase the number of instances in the minority class (e.g., using SMOTE - Synthetic Minority Over-sampling Technique).
    - Undersampling the Majority Class: Reduce the number of instances in the majority class to balance the dataset.
    - Combination of Both: A balanced mix of oversampling the minority and undersampling the majority class.

2. Class Weighting:
Assign higher class weights to the minority class in the logistic regression model. This forces the model to pay more attention to the minority class by penalizing misclassification of minority class examples more heavily.

3. Anomaly Detection:
When the minority class is rare (e.g., fraud detection), treat it as an anomaly detection problem instead of a standard classification problem.

4. Threshold Tuning:
Instead of using the default 0.5 probability threshold, adjust the decision threshold to better balance the precision and recall for the minority class.

5. Use Evaluation Metrics for Imbalanced Data:
Focus on metrics like Precision, Recall, F1-score, and AUC-ROC instead of accuracy, which can be misleading for imbalanced datasets.

How These Strategies Help:
They ensure that the model pays appropriate attention to the minority class, improving its ability to correctly classify minority class instances without being overwhelmed by the majority class, thus improving overall performance and fairness in predictions.
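
Two of these strategies, class weighting and threshold tuning, can be sketched briefly. The labels and probabilities below are made up, and the weighting formula `n_samples / (n_classes * class_count)` is one common "balanced" heuristic, assumed here for illustration.

```python
def balanced_class_weights(y_true):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n = len(y_true)
    classes = sorted(set(y_true))
    return {c: n / (len(classes) * y_true.count(c)) for c in classes}

def precision_recall(y_true, probs, threshold):
    """Precision and recall for the positive class at a given decision threshold."""
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, probs) if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up imbalanced data: 8 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.05, 0.1, 0.1, 0.2, 0.2, 0.3, 0.35, 0.45, 0.4, 0.7]

print(balanced_class_weights(y_true))       # minority class gets the larger weight
print(precision_recall(y_true, probs, 0.5))  # default threshold
print(precision_recall(y_true, probs, 0.4))  # lowered threshold
```

Lowering the threshold from 0.5 to 0.4 recovers the second positive example at the cost of one false positive, trading precision for recall.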

## Question 7

1. Multicollinearity:
    - Problem: Correlated independent variables cause unstable coefficients.
    - Solution: Use the variance inflation factor (VIF) to identify and remove highly correlated features, apply L2 regularization, or use PCA.

2. Overfitting:
    - Problem: The model fits noise in the data.
    - Solution: Use regularization, apply feature selection, and validate with cross-validation.

3. Imbalanced Data:
    - Problem: The model may favor the majority class.
    - Solution: Use oversampling/undersampling, apply class weights, and focus on precision, recall, or AUC-ROC.

4. Non-linearity:
    - Problem: Logistic regression assumes a linear relationship between the features and the log-odds of the outcome.
    - Solution: Add polynomial features, interaction terms, or consider non-linear models.

5. Outliers:
    - Problem: Outliers can skew predictions.
    - Solution: Remove outliers or apply robust scaling techniques.

6. Convergence Issues:
    - Problem: The optimizer fails to converge, often because features are on very different scales or the classes are perfectly separable.
    - Solution: Scale features, reduce feature count, or use regularization.
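
Feature scaling, mentioned above as a fix for convergence issues, can be sketched as simple standardization. The column values are hypothetical, and the helper assumes a non-constant column (a constant column would have zero standard deviation).

```python
import math

def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / std for v in column]

# Hypothetical feature on a much larger scale than typical 0-1 features.
income = [20000.0, 35000.0, 50000.0, 65000.0, 80000.0]

scaled = standardize(income)
print(scaled)
```

After scaling, all features share a comparable range, which tends to make gradient-based optimizers take more balanced steps.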
