# Tackling overfitting via regularization

If a model suffers from overfitting, we also say that the model has high variance, which can be caused by having too many parameters, leading to a model that is too complex for the underlying data. Similarly, a model can suffer from underfitting (high bias), which means that it is not complex enough to capture the pattern in the training data and therefore also performs poorly on unseen data.

* **Variance**: Variance measures the consistency (or variability) of the model prediction for a particular sample instance if we were to retrain the model multiple times, for example, on different subsets of the training dataset. A high-variance model is sensitive to the *randomness* in the training data.
* **Bias**: Bias measures how far off the predictions are from the correct values in general if we rebuild the model multiple times on different training datasets; bias is the measure of the *systematic error* that is not due to randomness.
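
The two definitions above can be made concrete with a small simulation. The following is a minimal sketch, not a definitive recipe: the sine-shaped target, the noise level, the polynomial degrees, and the helper names `true_fn` and `bias_and_variance` are all assumptions chosen for illustration. It retrains a simple and a flexible model many times on freshly drawn training sets and looks at the predictions for one fixed sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical "correct" relationship, used only in this toy example
    return np.sin(2 * np.pi * x)

def bias_and_variance(degree, x_query=0.25, n_train=30, noise=0.5, n_repeats=500):
    """Retrain a polynomial model on many random training sets and
    summarize its predictions for one fixed sample x_query."""
    preds = np.empty(n_repeats)
    for i in range(n_repeats):
        # Draw a new noisy training set each time
        x = rng.uniform(0.0, 1.0, n_train)
        y = true_fn(x) + rng.normal(0.0, noise, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[i] = np.polyval(coefs, x_query)
    return {
        # Systematic error: average prediction minus the correct value
        "bias": float(preds.mean() - true_fn(x_query)),
        # Spread of the predictions across retrainings
        "variance": float(preds.var()),
    }

results = {degree: bias_and_variance(degree) for degree in (1, 9)}
for degree, stats in results.items():
    print(f"degree={degree}: bias={stats['bias']:+.3f}, variance={stats['variance']:.3f}")
```

In this setup, the straight line (degree 1) should be consistently wrong in the same direction (high bias, low variance), while the degree-9 polynomial should be right on average but change noticeably from one training set to the next (low bias, high variance).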

## L2 regularization

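Regularization tackles overfitting by penalizing large weights. The most common form is L2 regularization (sometimes also called L2 shrinkage or weight decay), which adds the following penalty term over the $m$ weights:

$$\frac{\lambda}{2}\Vert\mathbf{w}\Vert^2 = \frac{\lambda}{2}\sum^m_{j=1}w^2_j$$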

Here, $\lambda$ is the so-called regularization parameter.  
To apply regularization, we just need to add the regularization term to the cost function that we defined for logistic regression, which shrinks the weights:  
$$J(\mathbf{w}) = \sum^n_{i=1}\left[-y^{(i)}\log\left(\phi(z^{(i)})\right)-(1-y^{(i)})\log\left(1-\phi(z^{(i)})\right)\right]+\frac{\lambda}{2}\Vert\mathbf{w}\Vert^2$$
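
To see how the penalty enters the computation, here is a minimal NumPy sketch of the regularized cost. The function names `sigmoid`, `l2_penalty`, and `regularized_cost` are hypothetical, and two choices are assumptions not stated in the formula: the bias unit is kept as a separate argument `b` and is not penalized, and the predicted probabilities are clipped to avoid `log(0)`.

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid phi(z)
    return 1.0 / (1.0 + np.exp(-z))

def l2_penalty(w, lam):
    # (lambda / 2) * ||w||^2
    return 0.5 * lam * np.sum(w ** 2)

def regularized_cost(w, b, X, y, lam):
    """Logistic regression cost J(w) plus the L2 penalty.

    Assumption: the bias b is passed separately and is not regularized.
    """
    z = X @ w + b
    # Clip probabilities so that log() never receives exactly 0
    phi = np.clip(sigmoid(z), 1e-12, 1.0 - 1e-12)
    log_loss = np.sum(-y * np.log(phi) - (1.0 - y) * np.log(1.0 - phi))
    return log_loss + l2_penalty(w, lam)

# Tiny made-up dataset: 4 samples, 2 features
X = np.array([[0.5, 1.0], [1.5, -0.5], [-1.0, 2.0], [-2.0, -1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w = np.array([3.0, -4.0])

cost_unregularized = regularized_cost(w, 0.0, X, y, lam=0.0)
cost_regularized = regularized_cost(w, 0.0, X, y, lam=2.0)
print(cost_unregularized, cost_regularized)
```

With the same weights, the two costs differ by exactly the penalty term, so large weights are only worth keeping if they reduce the log-loss by more than the penalty they incur.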

Via the regularization parameter $\lambda$, we can then control how closely we fit the training data while keeping the weights small. By increasing the value of $\lambda$, we increase the regularization strength.

The parameter `C` that is implemented for the `LogisticRegression` class in scikit-learn comes from a convention in support vector machines. `C` is directly related to the regularization parameter $\lambda$: it is its inverse, so decreasing `C` increases the regularization strength.
$$C = \frac{1}{\lambda}$$
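
The effect of `C` can be checked empirically. The sketch below is illustrative only: the synthetic dataset from `make_classification`, its settings, and the three values of `C` are assumptions. It relies on `LogisticRegression` applying an L2 penalty by default and compares the length of the learned weight vector for weak and strong regularization.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumed for illustration)
X, y = make_classification(
    n_samples=200,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    random_state=1,
)

weight_norms = {}
for C in (100.0, 1.0, 0.01):
    # Smaller C corresponds to larger lambda, i.e. stronger regularization
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X, y)
    weight_norms[C] = float(np.linalg.norm(clf.coef_))
    print(f"C={C:<6} lambda={1.0 / C:<6} ||w||={weight_norms[C]:.4f}")
```

As `C` decreases, the norm of the weight vector should shrink toward zero, which is the weight-shrinking behavior that the penalty term is meant to produce.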

**Question:** Did we mention that we could also use L2 regularization to handle overfitting in the linear regression class?