## Overfitting / Underfitting

- **Bias**: Error from erroneous assumptions (e.g., assuming a non-linear relationship is a straight line).
- **Variance**: Error from sensitivity to small fluctuations in the training data.
- **Irreducible Error**: The "noise" in the data itself that no model can ever fix.

#### The Bias-Variance Tradeoff
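The goal is to find the "sweet spot" in model complexity where total error is lowest. Bias falls and variance rises as the model becomes more flexible, so the minimum of their combined contribution (squared bias plus variance) sits between the two extremes.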
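
For squared-error loss, the standard decomposition of expected test error makes the three terms explicit. The symbols are introduced here for illustration: $f$ is the true function, $\hat{f}$ the fitted model (random because it depends on the training sample), and $\sigma^2$ the noise variance.

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Irreducible Error}}$$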

<img src="https://media.geeksforgeeks.org/wp-content/uploads/20251209153305748438/420046946.webp" width="500">

- **Underfitting**: A straight line fitted to a curved dataset cannot capture the data's pattern, so it performs poorly on both the training and test sets.
- **Overfitting**: A squiggly curve that passes through every training point fits the training data well but fails to generalize, so it performs poorly on test data.
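
The same contrast can be reproduced in a few lines. This is a minimal sketch, not the setup behind the figure: it swaps the line and curve for k-nearest-neighbour regression (so it needs only the standard library), and the synthetic sine data, noise level, and values of `k` are all assumptions chosen for illustration.

```python
import math
import random

def make_data(n, rng):
    """Noisy samples of y = sin(2*pi*x) on [0, 1] (synthetic, for illustration)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

rng = random.Random(0)
train_x, train_y = make_data(40, rng)
test_x, test_y = make_data(200, rng)

results = {}
# k = 40 uses every training point, so it always predicts the mean (too simple).
# k = 1 copies the nearest training target, so it memorizes the training set.
for k in (40, 5, 1):
    results[k] = (mse(train_x, train_y, train_x, train_y, k),
                  mse(train_x, train_y, test_x, test_y, k))
    print(f"k={k:>2}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

With `k=40` the error should be high on both sets (underfitting). With `k=1` the training error is exactly zero while the test error is not (overfitting). An intermediate `k` would typically land between the two.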

### Underfitting
Occurs when the model is too simple to capture the real patterns in the data.
- It makes strong assumptions about the shape of the relationship
- It misses real patterns in the data
- Variance is low because the model returns similar outputs even when the training data changes

Happens due to:
- Model being too simple
- Regularization that is too strong (the penalty is applied to the model, not the data)
- Weak or missing features
- Not enough training (too few epochs or iterations)

**Underfitting = High Bias + Low Variance**

#### How to reduce underfitting:
- **Use a more complex model**: e.g. move from Linear Regression to a Random Forest
- **Feature engineering**: Add more relevant input features or combine existing ones
- **Reduce regularization**: Constraints might be too strict, preventing the model from learning
- **Train for more epochs** (training iterations)
- **Scale features properly**
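
As a small illustration of the feature-engineering point, the sketch below fits the same one-feature linear model twice on toy quadratic data: once on the raw input and once on a squared version of it. The data and the helper names `fit_line` and `mse` are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

def mse(xs, ys, slope, intercept):
    """Mean squared error of the line slope * x + intercept."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # the true relationship is quadratic

# Raw feature: a straight line in x cannot bend, so it underfits.
raw_mse = mse(xs, ys, *fit_line(xs, ys))

# Engineered feature: the same linear model, fed x**2 instead of x.
squared = [x ** 2 for x in xs]
engineered_mse = mse(squared, ys, *fit_line(squared, ys))

print(raw_mse, engineered_mse)  # → 12.0 0.0
```

The model class did not change. Only the input representation did, which is often the cheapest way to reduce bias.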

### Overfitting
Occurs when the model learns not just the underlying pattern but also the noise and random quirks in the training data (it effectively memorizes the training set). It performs very well on training data but poorly on test data.

Overfitting happens due to:
- Model being too complex
- Too many features relative to the amount of data
- Too little training data

**Overfitting = Low Bias + High Variance**

#### How to reduce overfitting:
- **Collect more training data**: helps the model distinguish noise from signal
- **Reduce model complexity**: Decrease the depth of a tree or the number of layers in a NN
- **Regularization**: Add a penalty on large weights (L1/L2)
- **Apply dropout**: Randomly "shut off" neurons during training (Deep Learning)
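- **Early Stopping**: Stop training once validation error stops improving and starts to rise. Validation error is noisy, so in practice you wait a few epochs before stopping and keep the best checkpoint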
- **Clean noisy data**
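
Early stopping is easy to sketch in isolation. The function below is a hypothetical helper, not a library API. It assumes one validation loss per epoch and a `patience` parameter: the number of epochs without improvement to tolerate before stopping.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses.

    Training stops once the loss has gone `patience` epochs without
    improving; the weights from `best_epoch` are the ones to keep.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Never triggered: training ran to the end.
    return len(val_losses) - 1, best_epoch

val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.56, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)  # → 6 3
```

A larger `patience` tolerates noisier validation curves at the cost of a few extra epochs of training.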

#### Analogy:
- **Underfitting**: A student who only reads the table of contents and fails the exam.
- **Overfitting**: A student who memorizes specific practice questions but can't solve new ones.

### Regularization: L1 vs. L2
Regularization introduces a penalty for complexity. Normally the model only tries to minimize the loss (the error). Regularization changes the objective: instead of minimizing the error alone, the model minimizes:
$$\text{Total Cost} = \text{Loss (Error)} + \text{Penalty (Complexity)}$$
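
The formula above can be written out directly for the two penalties covered in this section. This is a sketch: `total_cost` and the strength parameter `lam` are illustrative names, and the example weights are made up.

```python
def total_cost(loss, weights, lam, kind="l2"):
    """Loss plus a complexity penalty scaled by the strength `lam`."""
    if kind == "l1":
        penalty = sum(abs(w) for w in weights)   # sum of absolute values
    elif kind == "l2":
        penalty = sum(w ** 2 for w in weights)   # sum of squares
    else:
        raise ValueError(f"unknown penalty: {kind}")
    return loss + lam * penalty

weights = [3.0, -0.5, 0.0, 2.0]
print(total_cost(1.0, weights, lam=0.1, kind="l1"))  # 1.0 + 0.1 * 5.5
print(total_cost(1.0, weights, lam=0.1, kind="l2"))  # 1.0 + 0.1 * 13.25
```

Setting `lam` to 0 recovers the plain loss. Raising it makes large weights more expensive, pushing the model toward simpler solutions.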

#### L1 Regularization (Lasso):
Adds the sum of the absolute values of the weights, scaled by a regularization strength, to the loss function.
- Effect: It can push some weights to exactly zero.
- Use case: Great for feature selection when you have many features and suspect only a few are actually useful.

#### L2 Regularization (Ridge):
Adds the sum of the squared weights, scaled by a regularization strength, to the loss function.
- Effect: It shrinks all weights toward zero, but they rarely become exactly zero.
- Use case: Good for general stability and when you want to keep all features but reduce their individual influence.
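
The simplest case shows why L1 produces exact zeros while L2 only shrinks. Take a single weight whose unregularized optimum is `w0`, with the quadratic loss `0.5 * (w - w0)**2`. This one-dimensional setup is an assumption made for illustration. The functions below are the closed-form minimizers of that loss plus `lam * abs(w)` and plus `0.5 * lam * w**2` respectively.

```python
def l1_solution(w0, lam):
    """Minimizer of 0.5*(w - w0)**2 + lam*abs(w): soft-thresholding."""
    if abs(w0) <= lam:
        return 0.0                      # small weights are zeroed out entirely
    return w0 - lam if w0 > 0 else w0 + lam

def l2_solution(w0, lam):
    """Minimizer of 0.5*(w - w0)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return w0 / (1.0 + lam)

unregularized = [3.0, -0.4, 0.2, -2.0]
lam = 0.5
l1_weights = [l1_solution(w, lam) for w in unregularized]
l2_weights = [l2_solution(w, lam) for w in unregularized]

print(l1_weights)  # → [2.5, 0.0, 0.0, -1.5]
print(l2_weights)  # every weight is scaled by 1 / 1.5; none becomes zero
```

L1 subtracts a fixed amount from every weight, so anything smaller than `lam` is removed entirely. L2 scales weights proportionally, so a nonzero weight stays nonzero.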

**Weights**: the coefficients that the model learns to determine the "importance" or "influence" of each input feature.

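**Interviewer: "Your model has a 99% training accuracy but 70% validation accuracy." - They are describing Overfitting (High Variance). The large gap between training and validation accuracy means the model has memorized the training data instead of learning patterns that generalize.**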

**Interviewer: "I'm looking at my learning curves. Both the training loss and the validation loss have flattened out, but they are both very high. Is this an overfitting or underfitting problem, and how do I solve it?" - That is a clear sign of Underfitting (High Bias). Because both losses are high, the model hasn't learned the basic structure of the data. Because both have already flattened out, more epochs are unlikely to help: use a more complex model, engineer better features, or reduce regularization.**
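
Both interview scenarios follow the same decision rule, sketched below for accuracy-like scores where higher is better. The function `diagnose` and its thresholds `good_enough` and `max_gap` are hypothetical and chosen only for illustration. Sensible values depend on the task, and for losses the comparisons flip.

```python
def diagnose(train_score, val_score, good_enough=0.85, max_gap=0.10):
    """Rough fit diagnosis from accuracy-like scores in [0, 1] (higher is better).

    `good_enough` and `max_gap` are illustrative thresholds, not standard values.
    """
    if train_score < good_enough:
        return "underfitting"   # cannot even fit the training data: high bias
    if train_score - val_score > max_gap:
        return "overfitting"    # fits training data, fails to generalize: high variance
    return "good fit"

print(diagnose(0.99, 0.70))  # → overfitting
print(diagnose(0.62, 0.60))  # → underfitting
```

The order of the checks matters: a model that scores poorly on its own training data is underfitting regardless of how small the train/validation gap is.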