## Question 1

#### Overfitting
Overfitting occurs when a model learns the training data too well, capturing noise and random fluctuations rather than the underlying patterns. As a result, the model performs poorly on new, unseen data.

The consequences of overfitting are:
1. High training accuracy but poor generalization to new data.
2. Sensitivity to noise and outliers in the training data.
3. Overly complex models that may not generalize well.


Mitigation:
1. Regularization: Adding a penalty term to the model's cost function discourages overly complex solutions.
2. Cross-validation: Use techniques like k-fold cross-validation to assess model performance on multiple subsets of the data, which exposes a model that only fits one particular split well.
3. Feature Selection: Select only the most relevant features to reduce model complexity.
4. Early Stopping: Monitor performance on a validation set during training and stop when performance starts to degrade.
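
The early-stopping rule in item 4 can be sketched in a few lines. This is a minimal illustration, not a library API: the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation losses.

    Training "stops" once the loss has failed to improve for `patience`
    consecutive epochs; the model from `best_epoch` is the one to keep.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never triggered: training ran through every epoch supplied.
    return best_epoch, len(val_losses) - 1


# Hypothetical validation losses: they fall, then rise as the model starts to overfit.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
best_epoch, stop_epoch = early_stopping_epoch(losses, patience=2)
print(best_epoch, stop_epoch)  # 3 5
```

A larger `patience` tolerates noisy validation curves at the cost of training a little longer than necessary.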


#### Underfitting

Underfitting occurs when a model is too simple to capture the underlying patterns in the training data. The model may fail to learn the relationships between features and target outputs, leading to poor performance on both training and new data.

Consequences:
1. Low training accuracy and poor generalization to new data.
2. Failure to capture important patterns in the data.
3. Inability to make accurate predictions.

Mitigation:

1. Use a more complex model with more parameters if the current one is too simple.
2. Add more relevant features to the model.
3. Reduce the regularization strength if it is too high, since excessive regularization can itself cause underfitting.
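
As a rough illustration of adding a relevant feature, the sketch below fits the same one-variable least-squares learner twice: once on the raw input and once on an engineered squared input. The toy data (a purely quadratic relationship) and the helper names `fit_line` and `mse` are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(xs, ys, a, b):
    """Mean squared error of the line a + b * x on the given data."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data with a purely quadratic relationship (an assumption for illustration).
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

# Too simple: a straight line in x cannot follow the curve.
a1, b1 = fit_line(x, y)
mse_linear = mse(x, y, a1, b1)

# Same learner, but with the engineered feature x**2 as its input.
x_sq = [v ** 2 for v in x]
a2, b2 = fit_line(x_sq, y)
mse_quadratic = mse(x_sq, y, a2, b2)

print(mse_linear, mse_quadratic)
```

The straight line ends up predicting the mean of `y` everywhere, so its training error stays high, while the model given the squared feature fits the data essentially exactly.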
  

## Question 2

Overfitting occurs when a model learns the training data too well, capturing noise and random fluctuations rather than the underlying patterns. As a result, the model performs poorly on new, unseen data. To reduce overfitting, we can use regularization techniques such as Ridge and Lasso. We can also apply cross-validation, for example k-fold cross-validation, where different subsets of the data are used in turn to train the model and to assess its accuracy. Feature selection is another helpful step: it reduces the number of inputs, so the model doesn't become too complex.
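
One simple way to do feature selection is a correlation filter, sketched below. This is only one of many possible approaches and it only detects linear relationships; the function names, the `threshold` value, and the toy columns `size` and `noise` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.5):
    """Keep the names of features whose |correlation| with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]


# Hypothetical toy data: `size` tracks the target, `noise` does not.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
columns = {
    "size": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "noise": [2.0, -1.0, 0.0, 0.0, -1.0, 2.0],
}
selected = select_features(columns, target)
print(selected)  # ['size']
```

Dropping `noise` leaves the model with fewer inputs to fit spurious patterns to.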

## Question 3

Underfitting occurs when a model is too simple to capture the underlying patterns in the training data. The model may fail to learn the relationships between features and target outputs, leading to poor performance on both training and new data. Several scenarios can lead to underfitting:

1. Insufficient model complexity can lead to underfitting, for example using a linear model to solve a problem with complex nonlinear relationships.


2. If our model has too few features, it can also underfit, because it may fail to capture important relationships with the dependent variable and so generalizes poorly.

3. If we use excessive regularization, such as a very strong L1 or L2 penalty, the model can become too simplistic because it is penalized heavily for being complex.

4. Terminating training too early, for example by using too few epochs, may also lead to underfitting.

5. Failing to scale the data properly, especially for algorithms that are sensitive to feature scale, can also lead to sub-optimal performance.
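
Scenario 3 (excessive regularization) is easy to see on a toy problem. The sketch below uses the closed-form ridge solution for a single feature with no intercept, minimizing `sum((y - w*x)**2) + lam * w**2`; the data (generated from `y = 2x`) and the `lam` values are assumptions for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge solution for y ~ w * x (one feature, no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(xs, ys, w):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data generated from y = 2x (an assumption for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

for lam in (0.0, 10.0, 1000.0):
    w = ridge_slope(x, y, lam)
    print(f"lambda={lam:7.1f}  w={w:.3f}  training MSE={mse(x, y, w):.3f}")
```

As `lam` grows, the slope is pulled towards zero and the error on the training data itself rises, which is the underfitting signature described above.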


## Question 4

The bias-variance trade-off is a fundamental concept in machine learning that describes the relationship between bias, variance, and model complexity.

#### Bias:

Bias refers to the error introduced by approximating a real-world problem, which may be complex, by a simplified model. High bias can lead to the model consistently underpredicting or overpredicting the true values. High bias models are often too simplistic and fail to capture the underlying patterns in the data. Models with high bias tend to have poor performance on both the training and test sets.


#### Variance:

Variance measures the model's sensitivity to small fluctuations in the training data. High variance can lead to the model being too flexible and capturing noise in the training data. High variance models are often overly complex, fitting the training data closely but failing to generalize well to new, unseen data.

The bias-variance trade-off refers to the delicate balance between bias and variance. As you decrease bias (make the model more complex), variance tends to increase, and vice versa. Since both usually cannot be minimized at once, the goal is to find the level of model complexity that minimizes their combined contribution to the error, leading to the best generalization to new, unseen data.
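
For squared-error loss, this trade-off can be written down explicitly. The decomposition below is the standard textbook form, under the assumption (not stated above) that targets are generated as $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and that the expectation is taken over training sets and noise. Here $\hat{f}$ denotes the fitted model.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2
+ \mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]
+ \sigma^2
$$

The first term is the squared bias, the second is the variance, and $\sigma^2$ is irreducible noise that no choice of model complexity removes.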



## Question 5

Detecting overfitting and underfitting in machine learning models is crucial for building models that generalize well to new, unseen data.

1. Learning curves:

Plot learning curves that show the model's performance on the training and validation sets over time. Overfitting is indicated by a large gap between training and validation performance: high training accuracy but poor validation accuracy.

For an underfitted model, the learning curves show poor performance on both the training and validation sets (low training and validation accuracy), suggesting the model is too simple.

2. Performance Metrics: 

Compare training and test set performance metrics, such as accuracy, precision, recall, or F1 score. Overfitting is indicated by a significant drop in performance on the test set relative to the training set.

Consistently low performance metrics on both training and test sets suggest underfitting.
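
The comparison of training and test metrics can be turned into a small rule of thumb. The sketch below is only illustrative: the function `diagnose_fit` and its thresholds `min_score` and `max_gap` are assumptions, not standard values, and sensible settings depend on the task and the metric.

```python
def diagnose_fit(train_score, test_score, min_score=0.7, max_gap=0.1):
    """Label a model from its training and test scores (higher is better).

    `min_score` and `max_gap` are illustrative thresholds, not standard values.
    """
    if train_score < min_score:
        return "underfitting"   # poor even on the data it was trained on
    if train_score - test_score > max_gap:
        return "overfitting"    # good on training data, much worse on test data
    return "reasonable fit"


print(diagnose_fit(0.99, 0.72))  # overfitting
print(diagnose_fit(0.58, 0.55))  # underfitting
print(diagnose_fit(0.88, 0.85))  # reasonable fit
```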

3. Validation Set Performance:

Monitor performance on a separate validation set during training. A significant drop in validation set performance while training performance keeps improving may indicate overfitting.

Poor performance on the training set may suggest underfitting.

4. Cross-Validation:

Use techniques like k-fold cross-validation to assess model performance on multiple subsets of the data. If performance varies widely across folds, it may indicate overfitting.

If cross-validation consistently shows poor performance on every fold, it suggests underfitting: the model is not capturing the underlying patterns in the data.
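
A minimal k-fold loop might look like the sketch below. It assigns sample `i` to fold `i % k` as a deterministic stand-in for the shuffled folds normally used, and it evaluates a deliberately trivial model that always predicts the mean training target. All names and the toy data are assumptions for illustration.

```python
import statistics


def k_fold_indices(n_samples, k):
    """Assign sample i to fold i % k (a deterministic stand-in for shuffled folds)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def cross_validate(xs, ys, k, fit, predict):
    """Return one validation MSE per fold."""
    scores = []
    for val_idx in k_fold_indices(len(xs), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in val_idx]
        scores.append(sum(errors) / len(errors))
    return scores


# A deliberately trivial "model": always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


# Toy data following y = 2x + 1 (an assumption for illustration).
x = list(range(12))
y = [2.0 * v + 1.0 for v in x]
fold_mse = cross_validate(x, y, k=4, fit=fit_mean, predict=predict_mean)
print("per-fold MSE:", [round(s, 2) for s in fold_mse])
print("mean:", round(statistics.mean(fold_mse), 2),
      "std:", round(statistics.stdev(fold_mse), 2))
```

Here the error is high on every fold, which matches the underfitting pattern: the mean predictor is too simple for data with a clear trend. A large spread between folds would instead point towards a model that is sensitive to which samples it was trained on.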

5. Feature Importance Analysis:

Analyze feature importance to identify whether the model is giving too much weight to specific features. High feature importance for irrelevant features may indicate overfitting.

If no feature contributes significantly to the model's predictions, the model may be too simplistic (underfitting).

## Question 6

Comparison between bias and variance:

1. Performance on Training Data:

High bias models tend to have low accuracy on the training data because they are too simplistic.

High variance models can have high accuracy on the training data, fitting it closely.

2. Performance on Test Data (Generalization):

High bias models may generalize poorly to new, unseen data, leading to systematic errors.

High variance models often generalize poorly to new data, leading to erratic errors.

3. Sensitivity to Noise:

High bias models are less sensitive to noise, as the model is too simple to be influenced significantly by individual data points.

High variance models are highly sensitive to noise, as the model may fit the noise in the training data.


Example for high bias model: Using a linear regression model to predict the price of houses based on a highly nonlinear relationship between features and prices.


Example for high variance model: Using a high-degree polynomial regression model to predict house prices when a simpler model would be sufficient.
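
The contrast between the two kinds of model can also be measured by simulation. The sketch below does not use the regression models named in the examples above; it swaps in two simpler stand-ins so the code stays short: a constant predictor (high bias) and a model that just memorizes the noisy training target (high variance). The true function `x * x`, the input grid, the noise level, and the seed are all assumptions.

```python
import random
import statistics


def true_f(x):
    return x * x


def simulate(n_trials=2000, noise_sd=1.0, seed=0):
    """Estimate bias^2 and variance at x0 = 2 for two toy models.

    Each trial draws a fresh noisy training set on a fixed grid of inputs.
    """
    rng = random.Random(seed)
    grid = [-2, -1, 0, 1, 2]
    x0 = 2
    preds_constant, preds_memorizer = [], []
    for _ in range(n_trials):
        ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in grid]
        # High-bias model: ignore x and predict the mean training target.
        preds_constant.append(sum(ys) / len(ys))
        # High-variance model: return the (noisy) training target stored for x0.
        preds_memorizer.append(ys[grid.index(x0)])
    results = {}
    for name, preds in (("constant", preds_constant), ("memorizer", preds_memorizer)):
        bias_sq = (statistics.mean(preds) - true_f(x0)) ** 2
        variance = statistics.pvariance(preds)
        results[name] = (bias_sq, variance)
    return results


results = simulate()
for name, (bias_sq, variance) in results.items():
    print(f"{name:10s} bias^2={bias_sq:.3f}  variance={variance:.3f}")
```

In theory the constant model has squared bias 4 and variance 0.2 at this point, while the memorizer has squared bias 0 and variance 1. The simulated estimates should land close to those values.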


Different trade-offs between bias and variance:

1. High Bias, Low Variance:

These models may oversimplify the data, leading to systematic errors (underfitting). They perform poorly on both training and test sets. 

2. Low Bias, High Variance:

These models may fit the training data very closely but fail to generalize, performing well on the training set but poorly on the test set (overfitting).

3. Low Bias, Low Variance:

These models strike a balance, capturing essential patterns in the data without being overly sensitive to noise. They tend to perform well on both training and test sets, which is what a well-generalized model looks like.


## Question 7

Regularization in machine learning is a technique used to prevent overfitting by adding a penalty term to the objective function. The primary purpose of regularization is to encourage the model to be less complex, preventing it from fitting the training data too closely and improving its generalization to new, unseen data. Regularization is commonly applied to linear regression, logistic regression, and neural networks.

There are two main types of regularization: L1 regularization (Lasso) and L2 regularization (Ridge).

1. L1 Regularization (Lasso):

The regularization term is the sum of the absolute values of the model coefficients multiplied by a regularization parameter lambda. L1 regularization tends to produce sparse models by driving some coefficients to exactly zero, effectively performing feature selection. This regularization technique is used when there is a belief that many features are irrelevant or when feature selection is crucial.

2. L2 Regularization (Ridge):

The regularization term is the sum of the squared values of the model coefficients multiplied by a regularization parameter lambda. L2 regularization encourages the model to distribute the weight more evenly across all features, preventing any single feature from dominating. This regularization is used when all features are expected to contribute to the model, and no feature selection is required.
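
The different behaviour of the two penalties shows up clearly in a simplified one-coefficient view. The sketch below applies the exact minimizers of `0.5 * (w - z)**2 + lam * abs(w)` (L1) and `0.5 * (w - z)**2 + 0.5 * lam * w**2` (L2) to each coefficient separately. This per-coefficient picture is an assumption that holds only in simplified settings such as a single feature or orthonormal features; general Lasso and Ridge fits do not reduce to it. The coefficient values and `lam` are hypothetical.

```python
def l1_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def l2_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2: proportional shrinkage."""
    return z / (1.0 + lam)


# Hypothetical unregularized coefficients for four features.
coefficients = [3.0, -0.4, 0.2, -2.0]
lam = 0.5

lasso_like = [l1_shrink(z, lam) for z in coefficients]
ridge_like = [l2_shrink(z, lam) for z in coefficients]
print("L1:", lasso_like)
print("L2:", [round(w, 3) for w in ridge_like])
```

The L1 rule sets the two small coefficients to exactly zero, while the L2 rule shrinks every coefficient by the same factor and leaves none at zero.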



There is also another type of regularization known as Elastic Net, which is a combination of both Ridge and Lasso regularization. It balances the benefits of the two.
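
One common way to write the Elastic Net objective is shown below. The symbols are assumptions introduced for this illustration: $\mathbf{w}$ is the coefficient vector, $\text{Loss}(\mathbf{w})$ is the unregularized training loss, and $\lambda_1, \lambda_2 \ge 0$ are the strengths of the L1 and L2 penalties.

$$
J(\mathbf{w}) = \text{Loss}(\mathbf{w}) + \lambda_1 \sum_{j} |w_j| + \lambda_2 \sum_{j} w_j^2
$$

Setting $\lambda_2 = 0$ recovers Lasso and setting $\lambda_1 = 0$ recovers Ridge. Libraries often parameterize the same penalty differently, for example as one overall strength plus a mixing ratio between the L1 and L2 parts.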