## Que 1

**Overfitting**: The model fits noise and quirks of the training data instead of the underlying pattern, so it performs well on training data but poorly on new data.

**Underfitting**: Model is too simple, performs poorly on both training and test data.

**Consequences**: Overfitting leads to poor generalization to unseen data. Underfitting means the model fails to learn the patterns even in the training data.

**Mitigation (Overfitting)**: Simplify the model, regularize, use cross-validation for model selection, stop training early, select features, use ensembles.

**Mitigation (Underfitting)**: Engineer or add relevant features, use a more complex model, tune hyperparameters (e.g. weaken regularization), use ensembles.
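
The contrast between the two failure modes can be seen on a small synthetic problem. The sketch below is illustrative only: the data (noisy samples of a sine curve), the noise level, and the polynomial degrees are assumptions chosen for the demonstration, not part of the original answer.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(2*pi*x) on [0, 1] (synthetic, for illustration)."""
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit; a higher degree means a more flexible model.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={results[degree][0]:.3f}  "
          f"test MSE={results[degree][1]:.3f}")
```

In a run like this, the degree-1 fit would be expected to show high error on both sets (underfitting), while the degree-12 fit would be expected to show a clear gap between training and test error (overfitting).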

## Que 2

Reduce overfitting by:
1. **Simplifying Model**: Use simpler algorithms or fewer features.
2. **Regularization**: Penalize complex models (Ridge, Lasso regression).
3. **Cross-Validation**: Evaluate on several train-validation splits to estimate generalization and choose a model that does not overfit.
4. **Early Stopping**: Stop training when validation error stops improving.
5. **Feature Selection**: Keep only relevant features, remove noise.
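
As one concrete case from the list above, early stopping can be sketched as a small loop over per-epoch validation losses. The function name `early_stopping_epoch`, the `patience` parameter, and the loss values are hypothetical; real training loops usually also save the model weights at the best epoch.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses.

    Training is considered stopped once the loss has failed to improve
    for `patience` consecutive epochs (an assumed, commonly used rule).
    """
    best_loss = float("inf")
    best_epoch = -1
    stopped_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        stopped_epoch = epoch
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, stopped_epoch

# Hypothetical validation losses: improving at first, then rising.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.56, 0.60, 0.65]
best_epoch, stopped_epoch = early_stopping_epoch(val_losses, patience=3)
print(best_epoch, stopped_epoch)
```

Using a patience window rather than stopping at the first increase is a design choice that tolerates small, noisy fluctuations in validation error.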


## Que 3

Underfitting occurs when a model is too simple to capture underlying patterns in data, resulting in poor performance on both training and test data.

Scenarios where underfitting can occur:
1. **Insufficient Complexity**: Using a linear model for highly nonlinear data.
2. **Few Features**: Using too few relevant features.
3. **Ignoring Interactions**: Not modeling interactions between features.
4. **Too Much Regularization**: An overly strong penalty (e.g. in Ridge) shrinks coefficients so much that the model cannot fit the data.
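
The last scenario can be illustrated with a minimal sketch of ridge regression in closed form. The synthetic data, the true weights, and the penalty values are assumptions made for this example; the point is only that training error grows as the penalty `lam` grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))
true_w = np.array([2.0, -1.0, 0.5])          # assumed ground-truth weights
y = X @ true_w + rng.normal(0.0, 0.1, n)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y (no intercept)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

train_mse = {}
for lam in (0.0, 1.0, 100.0, 10000.0):
    w = ridge_fit(X, y, lam)
    train_mse[lam] = float(np.mean((X @ w - y) ** 2))
    print(f"lam={lam:8.1f}  train MSE={train_mse[lam]:.4f}  w={np.round(w, 3)}")
```

With a very large `lam` the coefficients are pushed toward zero, and the model underfits even the data it was trained on.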

## Que 4

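**Bias-Variance Tradeoff**: Simple models tend to have high bias and low variance, while complex models tend to have low bias and high variance. The tradeoff is choosing a level of complexity that minimizes total error on unseen data.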

**Bias**: Error from overly simple assumptions about the data, leads to underfitting.

**Variance**: Error from sensitivity to small fluctuations in the training set, causes overfitting.

**High Bias**: Underfitting, poor on training and test.

**High Variance**: Overfitting, good on training, poor on test.

**Optimal Tradeoff**: The model complexity at which the combined error from bias and variance is lowest, giving the best generalization.
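
For squared-error loss, the tradeoff is often written as a decomposition of expected prediction error. The form below is the standard textbook statement, given here under the assumptions that $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, and that the expectation is taken over training sets and noise at a fixed input $x$. The symbols $f$, $\hat{f}$, and $\sigma^2$ are notation introduced for this sketch.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Reducing one of the first two terms by changing model complexity typically increases the other, while the noise term cannot be reduced by any model.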

## Que 5

**Detecting Overfitting**:
1. **Validation Curves**: Plotting training and validation error against model complexity.
2. **Learning Curves**: Plotting training and validation error against training data size.
3. **Cross-Validation**: Evaluating model on multiple train-test splits.
4. **Comparing Train and Test Error**: If train error is much lower than test error, it's likely overfitting.
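
Cross-validation from the list above can be sketched with plain NumPy. The helper `k_fold_mse`, the synthetic sine data, and the candidate degrees are hypothetical choices for illustration; libraries such as scikit-learn provide ready-made equivalents.

```python
import numpy as np
from numpy.polynomial import Polynomial

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Mean validation MSE of a degree-`degree` polynomial over k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))       # shuffle before splitting
    folds = np.array_split(indices, k)
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = Polynomial.fit(x[train_idx], y[train_idx], degree)
        scores.append(np.mean((model(x[val_idx]) - y[val_idx]) ** 2))
    return float(np.mean(scores))

# Synthetic data (an assumption for this example).
data_rng = np.random.default_rng(42)
x = data_rng.uniform(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + data_rng.normal(0.0, 0.2, 60)

cv_scores = {degree: k_fold_mse(x, y, degree) for degree in (1, 3, 5, 9)}
best_degree = min(cv_scores, key=cv_scores.get)
print(cv_scores, "best degree:", best_degree)
```

A model whose cross-validated error is much higher than its training error is a candidate for overfitting; comparing scores across complexities also gives a rough validation curve.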

**Detecting Underfitting**:
1. **Training and Test Error**: If both errors are high, the model might be too simple.
2. **Visual Inspection**: Plotting data and model predictions can reveal underfitting patterns.
3. **Learning Curves**: If training and validation error are both high and close to each other, even as more data is added, it indicates underfitting.

**Determining Overfitting/Underfitting**:
- If training error is low, but validation/test error is high, it's likely overfitting.
- If both training and validation/test error are high, it's likely underfitting.
- A balanced model has reasonably low training and validation/test error.
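
The three rules above can be written as a small helper. This is only a rough heuristic: the function name `diagnose_fit` and the thresholds `high_error` and `max_gap` are hypothetical and would need to be chosen for the metric and problem at hand.

```python
def diagnose_fit(train_error, val_error, high_error=0.30, max_gap=0.10):
    """Label a model from its training and validation error.

    `high_error` and `max_gap` are illustrative thresholds, not standard values.
    """
    if train_error >= high_error:
        # The model cannot even fit the training data well.
        return "underfitting"
    if val_error - train_error > max_gap:
        # Low training error but much higher validation error.
        return "overfitting"
    return "balanced"

print(diagnose_fit(0.02, 0.40))
print(diagnose_fit(0.45, 0.48))
print(diagnose_fit(0.08, 0.11))
```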

## Que 6

**Bias**:
- **Definition**: Error from overly simplistic assumptions about the relationship in the data.
- **Effect**: Underfitting, poor on train and test.
- **Example**: Linear regression on complex data.

**Variance**:
- **Definition**: Error from the model's sensitivity to the particular training sample it was fit on.
- **Effect**: Overfitting, good on train, poor on test.
- **Example**: High-degree polynomial regression.

**High Bias Model**:
- Linear regression on nonlinear data.
- Performance: Poor on both train and test.

**High Variance Model**:
- High-degree polynomial regression on limited data.
- Performance: Good on train, poor on test.
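
Both quantities can be estimated empirically by refitting the same model on many freshly sampled training sets. The sketch below assumes a known true function (a sine curve), a fixed noise level, and two polynomial degrees chosen to stand in for the high-bias and high-variance models; all of these are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def true_f(x):
    """Assumed ground-truth function for the simulation."""
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0.0, 1.0, 20)   # fixed training inputs
x_eval = np.linspace(0.0, 1.0, 50)    # points where predictions are compared

def bias_variance(degree, n_trials=200, noise_sd=0.3):
    """Estimate squared bias and variance by refitting on fresh noisy samples."""
    preds = np.empty((n_trials, x_eval.size))
    for t in range(n_trials):
        y_train = true_f(x_train) + rng.normal(0.0, noise_sd, x_train.size)
        preds[t] = Polynomial.fit(x_train, y_train, degree)(x_eval)
    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - true_f(x_eval)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

estimates = {degree: bias_variance(degree) for degree in (1, 9)}
for degree, (bias_sq, variance) in estimates.items():
    print(f"degree={degree}  bias^2={bias_sq:.4f}  variance={variance:.4f}")
```

The linear model would be expected to show the larger squared bias, and the degree-9 polynomial the larger variance.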

## Que 7

**Regularization**: Adding a penalty term to the loss function to control model complexity and prevent overfitting.

**Usage for Preventing Overfitting**:
- **High Model Complexity**: Regularization discourages overly complex models.
- **Reduces Overfitting**: Penalty on complexity makes model generalize better.
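
The penalty term can be written in a general form. The notation here is an assumption for illustration: $L(w)$ is the unpenalized loss, $w$ the model coefficients, $\Omega(w)$ the complexity penalty, and $\lambda \ge 0$ the regularization strength.

$$
J(w) = L(w) + \lambda\,\Omega(w),
\qquad
\Omega_{\text{L1}}(w) = \sum_j |w_j|,
\qquad
\Omega_{\text{L2}}(w) = \sum_j w_j^2
$$

Setting $\lambda = 0$ recovers the unregularized model, while larger $\lambda$ trades training fit for smaller coefficients.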

**Common Regularization Techniques**:
1. **L1 Regularization (Lasso)**:
   - Adds the sum of the absolute values of the coefficients, scaled by a strength parameter, to the loss.
   - Encourages sparsity: some coefficients become exactly zero.

2. **L2 Regularization (Ridge)**:
   - Adds the sum of the squared coefficients, scaled by a strength parameter, to the loss.
   - Penalizes large coefficients, but does not usually make them exactly zero.
   
3. **Elastic Net Regularization**:
   - Combination of L1 and L2 regularization.
   - Balances between sparsity (L1) and coefficient size control (L2).

4. **Dropout**:
   - Neural network technique where randomly selected neurons are temporarily dropped (their outputs set to zero) at each training step.
   - Helps prevent co-adaptation of neurons, reducing overfitting.

5. **Early Stopping**:
   - Stop training when validation error starts increasing.
   - Prevents the model from becoming too specialized to the training data.

Regularization techniques constrain the model's freedom to fit the training data too closely, helping it generalize better to new data.
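
The difference between the L1 and L2 penalties can be seen in a minimal NumPy sketch. The lasso solver below uses coordinate descent with soft-thresholding, which is one common approach and not the only one; the function names, the synthetic data, and the penalty strengths are assumptions for this example.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; return 0 if it falls inside."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iters):
        for j in range(d):
            # Residual with feature j's current contribution removed.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w

def ridge_closed_form(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||^2 in closed form (no intercept)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
true_w = np.array([3.0, 0.0, 0.0, -2.0, 0.0])   # only two informative features
y = X @ true_w + rng.normal(0.0, 0.1, n)

w_lasso = lasso_coordinate_descent(X, y, lam=0.1)
w_ridge = ridge_closed_form(X, y, lam=10.0)
print("lasso:", np.round(w_lasso, 3))
print("ridge:", np.round(w_ridge, 3))
```

On data like this, the lasso coefficients for the uninformative features would be expected to come out exactly zero, while the ridge coefficients are shrunk but remain nonzero.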