**Q1: Define overfitting and underfitting in machine learning.**

*Overfitting*:
- When a model captures not just the underlying trend in the data, but also the noise. It performs very well on the training data but poorly on unseen data (like test data).

*Underfitting*:
- When a model is too simple to capture the underlying trend in the data. It performs poorly both on the training data and unseen data.

*Consequences*:
- Overfitting: Good performance on training data but poor generalization to new, unseen data.
- Underfitting: Poor performance on both training and test data.

*Mitigation*:
- Overfitting: Use simpler models, apply regularization, use more data, or use cross-validation to choose model complexity and hyperparameters.
- Underfitting: Use more complex models, add more features, or decrease regularization.
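
As a rough illustration of the definitions above, the sketch below fits three toy models to synthetic data. Everything in it is an assumption made for the example: the target `y = 2x + noise`, the helper names (`fit_constant`, `fit_line`, `fit_memorizer`), and the sample sizes. It uses only the Python standard library.

```python
import random

random.seed(0)

def make_data(n):
    """Sample n points from y = 2x + Gaussian noise (an assumed toy target)."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def mse(model, xs, ys):
    """Mean squared error of a callable model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Too simple: ignores x and always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Matches the trend: ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_memorizer(xs, ys):
    """Too flexible: copies the label of the closest training point, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"{name:10s} train MSE = {results[name][0]:7.3f}   test MSE = {results[name][1]:7.3f}")
```

On a typical run the constant model shows high error on both sets (underfitting), the memorizer shows zero training error but a clearly higher test error (overfitting), and the line does comparably well on both.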

**Q2: How can we reduce overfitting?**

- Use more data: Increasing the training set can help the model generalize better.
- Simplify the model: Use fewer parameters or features.
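- Use cross-validation: It does not reduce overfitting by itself, but it gives a more reliable estimate of performance on unseen data, which helps when choosing model complexity and hyperparameters.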
- Apply regularization: Techniques like L1 (Lasso) and L2 (Ridge) add penalty terms that prevent coefficients from becoming too large.
- Prune decision trees: Remove branches that have little power to predict.
- Use dropout in neural networks: Randomly set a fraction of a layer's units to 0 at each training update. Dropout is disabled at inference time.
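
The dropout item above can be sketched in a few lines. This is a minimal version of the common "inverted dropout" variant, written with the standard library only; the function name `dropout` and its parameters are hypothetical, not the API of any particular framework.

```python
import random

def dropout(activations, rate, training=True, rng=None):
    """Inverted dropout on a list of activations (illustrative sketch).

    During training each unit is zeroed with probability `rate`, and the
    surviving units are scaled by 1 / (1 - rate) so the expected value of
    each activation is unchanged. At inference the input is returned as-is.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

hidden = [1.0] * 10
train_out = dropout(hidden, rate=0.5, rng=random.Random(0))   # some units zeroed
eval_out = dropout(hidden, rate=0.5, training=False)          # unchanged
print(train_out)
print(eval_out)
```

Because a different random subset of units is dropped at every update, the network cannot rely on any single unit, which tends to discourage memorizing the training set.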

**Q3: Explain underfitting. List scenarios where underfitting can occur in ML.**

*Underfitting*:
- It occurs when the model fails to capture the underlying trend of the data.

*Scenarios*:
- Using a linear model for non-linear data.
- Having too few features in the training set.
- Using a small neural network for a complex problem.
- Over-constraining a model, for example with too much regularization or an overly pruned decision tree.
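
The first two scenarios can be reproduced with a small, noise-free example. The sketch below is an illustration under assumed data (`y = x^2` on five points) with hypothetical helper names: it fits a straight line to the raw feature and then fits the same estimator after adding a squared feature.

```python
def fit_line(zs, ys):
    """Ordinary least squares for y = a*z + b on a single feature z."""
    n = len(zs)
    mean_z, mean_y = sum(zs) / n, sum(ys) / n
    szz = sum((z - mean_z) ** 2 for z in zs)
    szy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    a = szy / szz
    b = mean_y - a * mean_z
    return a, b

def mse(a, b, zs, ys):
    """Mean squared error of the fitted line a*z + b."""
    return sum((a * z + b - y) ** 2 for z, y in zip(zs, ys)) / len(zs)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]            # non-linear target, no noise

# Linear model on the raw feature: the best line is flat (slope 0).
a, b = fit_line(xs, ys)
linear_mse = mse(a, b, xs, ys)

# Same estimator, but with the squared feature the data actually depends on.
squared = [x ** 2 for x in xs]
a2, b2 = fit_line(squared, ys)
quadratic_mse = mse(a2, b2, squared, ys)

print(f"linear model:    slope={a:.2f} intercept={b:.2f} train MSE={linear_mse:.2f}")
print(f"with x^2 feature: slope={a2:.2f} intercept={b2:.2f} train MSE={quadratic_mse:.2f}")
```

Here the straight line ends up with a training MSE of 2.8 even though the data contains no noise at all, which is the signature of underfitting. Once the missing feature is added, the same estimator fits the data exactly.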

**Q4: Explain the bias-variance tradeoff in machine learning.**

- **Bias**: Error due to overly simplistic assumptions in the learning algorithm. High bias can cause the model to miss relevant relations between features and target outputs.
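- **Variance**: Error due to the model's sensitivity to small fluctuations in the training set, typically a result of too much complexity. High variance can cause the model to fit noise, which leads to overfitting.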

*Relationship*:
- Increasing a model's complexity might decrease bias but increase variance, and vice-versa.
- Ideally, one aims for a good balance between bias and variance, ensuring minimal total error.
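
For squared-error loss, the "total error" mentioned above has a standard decomposition. The notation here is an assumption introduced for illustration: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model fitted on a random training set, and the expectation is taken over training sets and noise.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Only the first two terms depend on the model, so the tradeoff is about choosing a complexity at which their sum is smallest.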

**Q5: Common methods for detecting overfitting and underfitting.**

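- **Training vs. Test Error**: A model that overfits will have a low training error but a high test error. An underfit model will have a high error on both, so a high training error is the telltale sign.
- **Cross-Validation**: Compare training and validation errors across folds. A low training error with a consistently higher validation error indicates overfitting; a high error on both indicates underfitting.
- **Learning Curves**: Plot training and validation errors against training set size (or training iterations). A large, persistent gap between a low training error and a higher validation error suggests overfitting; both curves levelling off at a high error suggests underfitting.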
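
The training-versus-validation comparison can be sketched with a hand-rolled k-fold loop. This is an illustrative example using only the standard library; the data, the two toy models (`fit_mean`, `fit_nearest`), and the helper `k_fold_errors` are all assumptions made for the demonstration.

```python
import random

def mse(model, xs, ys):
    """Mean squared error of a callable model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean(xs, ys):
    """Very rigid model: always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_nearest(xs, ys):
    """Very flexible model: copies the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def k_fold_errors(xs, ys, fit, k=5):
    """Return (mean training MSE, mean validation MSE) over k folds."""
    indices = list(range(len(xs)))
    folds = [indices[i::k] for i in range(k)]   # assumes the data is already shuffled
    train_errs, val_errs = [], []
    for fold in folds:
        held_out = set(fold)
        tr = [i for i in indices if i not in held_out]
        x_tr, y_tr = [xs[i] for i in tr], [ys[i] for i in tr]
        x_va, y_va = [xs[i] for i in fold], [ys[i] for i in fold]
        model = fit(x_tr, y_tr)
        train_errs.append(mse(model, x_tr, y_tr))
        val_errs.append(mse(model, x_va, y_va))
    return sum(train_errs) / k, sum(val_errs) / k

rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]   # assumed toy target

cv = {}
for name, fit in [("mean", fit_mean), ("nearest", fit_nearest)]:
    cv[name] = k_fold_errors(xs, ys, fit)
    print(f"{name:8s} train MSE = {cv[name][0]:7.3f}   validation MSE = {cv[name][1]:7.3f}")
```

The nearest-point model reports zero training error but a nonzero validation error (the overfitting pattern), while the mean model reports a high error on both (the underfitting pattern).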

**Q6: Compare and contrast bias and variance in machine learning.**

- **Bias**:
  - *Description*: The model makes simplifying assumptions about the target function that do not hold.
  - *High Bias Example*: Assuming data is linear when it has a non-linear relationship.
  - *Performance*: Tends to have high error on training and test data.
  
- **Variance**:
  - *Description*: The model is highly sensitive to fluctuations in the training data.
  - *High Variance Example*: A high-degree polynomial regression model on a simple dataset.
  - *Performance*: Performs well on training data but poorly on test data.
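
The contrast can also be estimated empirically by retraining a model on many freshly sampled training sets and looking at its predictions at one fixed input. The sketch below is a simulation under assumed conditions (target `f(x) = 2x`, Gaussian noise, query point `x0 = 9`), and the two predictors are hypothetical stand-ins for a rigid and a flexible model.

```python
import random

def true_f(x):
    return 2.0 * x   # assumed ground-truth function for this toy setup

def sample_training_set(rng, n=100, noise_sd=2.0):
    """Draw a fresh noisy training set from the assumed target."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def predict_mean(xs, ys, x0):
    """Rigid model: ignores x0 and predicts the training mean."""
    return sum(ys) / len(ys)

def predict_nearest(xs, ys, x0):
    """Flexible model: copies the label of the closest training point."""
    i = min(range(len(xs)), key=lambda j: abs(xs[j] - x0))
    return ys[i]

def bias_and_variance(predict, x0, trials=300, seed=0):
    """Estimate squared bias and variance of `predict` at x0 by retraining."""
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = sample_training_set(rng)
        preds.append(predict(xs, ys, x0))
    mean_pred = sum(preds) / trials
    bias_sq = (mean_pred - true_f(x0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    return bias_sq, variance

x0 = 9.0
mean_bias_sq, mean_var = bias_and_variance(predict_mean, x0)
near_bias_sq, near_var = bias_and_variance(predict_nearest, x0)
print(f"mean model:    bias^2 = {mean_bias_sq:7.3f}   variance = {mean_var:7.3f}")
print(f"nearest model: bias^2 = {near_bias_sq:7.3f}   variance = {near_var:7.3f}")
```

In this setup the mean model is consistently wrong in the same way (high bias, low variance), whereas the nearest-point model is right on average but its prediction changes with every training set (low bias, high variance).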

**Q7: What is regularization in machine learning?**

- **Regularization**: A technique that adds a penalty on model complexity to the loss function, which discourages overfitting.
- **Common techniques**:
  - **L1 Regularization (Lasso)**: Adds a penalty proportional to the sum of the absolute values of the coefficients. Can shrink some coefficients to exactly zero, effectively performing feature selection.
  - **L2 Regularization (Ridge)**: Adds a penalty proportional to the sum of the squared coefficients. It shrinks coefficients toward zero but generally does not make them exactly zero.
  - **Elastic Net**: Combines the L1 and L2 penalties, with a mixing parameter that controls their relative weight.

By adding these penalties, regularization keeps the model simpler and less sensitive to noise in the training data, which usually improves generalization to unseen data.
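
The different behaviour of the L1 and L2 penalties is easiest to see in the simplest possible case: one feature, no intercept. The sketch below uses the closed-form minimizers for that case. The function names, the data, and the penalty scaling (`lam` multiplying the penalty directly, with an unscaled sum of squared errors) are assumptions for illustration; libraries often scale the loss differently.

```python
import math

def _sums(xs, ys):
    """Return sum(x*x) and sum(x*y), the only statistics the 1-D fits need."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxx, sxy

def ridge_weight(xs, ys, lam):
    """Minimizer of sum((y - w*x)^2) + lam * w^2."""
    sxx, sxy = _sums(xs, ys)
    return sxy / (sxx + lam)

def lasso_weight(xs, ys, lam):
    """Minimizer of sum((y - w*x)^2) + lam * |w| (soft-thresholding)."""
    sxx, sxy = _sums(xs, ys)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx

xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]   # exactly y = 0.5 * x, so the unpenalized weight is 0.5

for lam in (0.0, 6.0, 20.0):
    rw = ridge_weight(xs, ys, lam)
    lw = lasso_weight(xs, ys, lam)
    print(f"lambda={lam:5.1f}   ridge w={rw:.4f}   lasso w={lw:.4f}")
```

With these numbers the unpenalized weight is 0.5. At `lam = 20` the ridge weight has shrunk to about 0.206 but is still nonzero, while the lasso weight is exactly 0.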