# 1. Introduction
## 1.1 Definition
## Bias
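**Bias**, in the bias-variance sense, is the systematic error that arises when a model's assumptions are too simple to represent the true relationship in the data. It is the gap between the model's average prediction, taken over many possible training sets, and the true value. This is distinct from bias in the fairness sense, where skewed training data can lead to unfair or discriminatory outcomes when the model is applied to real-world situations.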

* **High bias:** When a model is too simple, it may make overly strong assumptions about the data. This often leads to underfitting, where the model fails to capture the underlying patterns in the data. For example, a linear model might have high bias if it's used to approximate a non-linear relationship.

* **Low bias:** A more complex model may have low bias, meaning it can capture intricate patterns in the data, but this comes at the cost of higher variance (it may overfit the data and perform poorly on unseen data).
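The linear-model example above can be sketched with NumPy. This is a minimal illustration, not part of the original notebook: the quadratic target, the grid of points, and the helper `training_mse` are all assumptions chosen for clarity. The data is noise-free, so any error that remains comes from the model's assumptions alone.

```python
import numpy as np

# Noise-free quadratic target (assumed for illustration):
# any remaining error is due to the model's assumptions, not noise.
x = np.linspace(-3, 3, 50)
y = x ** 2

def training_mse(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = np.polyfit(x, y, deg=degree)
    predictions = np.polyval(coeffs, x)
    return float(np.mean((predictions - y) ** 2))

linear_mse = training_mse(1)     # straight line: too simple for a parabola
quadratic_mse = training_mse(2)  # matches the true functional form

print(f"Linear model training MSE:    {linear_mse:.4f}")
print(f"Quadratic model training MSE: {quadratic_mse:.4f}")
```

The straight line cannot bend to follow the parabola, so its error stays large no matter how much data it sees. The quadratic fit drives the error to essentially zero.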

## Variance
**Variance** refers to the model's sensitivity to small fluctuations in the training data. A model with high variance pays too much attention to the noise or details of the training data, which can lead to **overfitting.**

**Overfitting** occurs when the model performs well on the training data but poorly on new, unseen data because it has memorized the specific patterns (and noise) in the training set rather than learning generalizable patterns.

Here's a breakdown of how variance works:

* **High variance:** When a model is too complex, it can fit the training data almost perfectly, capturing not only the true underlying patterns but also the noise or random fluctuations. This results in overfitting, where the model has poor generalization to new data because it is too tuned to the specificities of the training data.

* **Low variance:** A model with low variance makes smoother and more stable predictions that change little when the training data changes. Low variance alone is not a problem, but it usually comes from a simpler model, which may have high bias and underfit the data, failing to capture important details.
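One way to see variance directly is to retrain two models on many noisy versions of the same training set and measure how much their predictions at a fixed point move. The sketch below is an illustration under stated assumptions: the sine target, the noise level, and the two toy predictors (the training mean and a 1-nearest-neighbour lookup) are not from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 25)
x_query = 0.5      # fixed point at which predictions are compared
noise_std = 0.5    # assumed noise level

mean_preds, nn_preds = [], []
for _ in range(1000):
    # Same inputs, fresh noise: each round simulates a new training sample.
    y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, noise_std, x_train.size)
    # Low-variance model: always predict the training mean.
    mean_preds.append(y_train.mean())
    # High-variance model: copy the label of the single nearest neighbour.
    nearest = np.argmin(np.abs(x_train - x_query))
    nn_preds.append(y_train[nearest])

mean_var = float(np.var(mean_preds))
nn_var = float(np.var(nn_preds))

print(f"Variance of mean predictor: {mean_var:.4f}")
print(f"Variance of 1-NN predictor: {nn_var:.4f}")
```

The nearest-neighbour prediction inherits the full noise of a single label, while the mean averages the noise over all 25 points, so its variance is much smaller.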

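## Bias-Variance Tradeoff
The **bias-variance trade-off** is a fundamental concept in machine learning: as model complexity increases, **bias** (error from overly simple assumptions) tends to fall, while **variance** (sensitivity to the particular training sample) tends to rise. The goal is a level of complexity that minimizes their combined contribution to the error on new data.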
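For squared-error loss the trade-off can be written down exactly. The notation here is an assumption introduced for illustration: data is generated as $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, $\hat{f}$ is the model fitted on a random training set, and expectations are taken over training sets and noise.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

This additive form is specific to squared error. For 0-1 loss, bias and variance are defined differently, and the two terms need not sum exactly to the expected loss.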

### Key Tradeoff:
* **High bias, low variance:** A simple model with high bias might not fit the training data well (underfitting). Its predictions are stable across different training sets, so its training and test errors stay close to each other, but both tend to be high.
* **Low bias, high variance:** A complex model with low bias might fit the training data almost perfectly (overfitting), but it tends to perform poorly on new data because it’s too sensitive to variations in the training data.
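The two regimes above can be reproduced in a small simulation. This is a sketch under stated assumptions rather than part of the original notebook: the sine target, the noise level, the query point `x0`, and the helper `decompose` are all hypothetical choices. It refits polynomials of different degrees on many noisy training sets and estimates squared bias and variance of the prediction at one point.

```python
import numpy as np

rng = np.random.default_rng(42)
x_train = np.linspace(0, 1, 30)
x0 = 0.25          # query point (assumed)
noise_std = 0.3    # assumed noise level

def true_f(x):
    """Assumed ground-truth function."""
    return np.sin(2 * np.pi * x)

def decompose(degree, n_rounds=2000):
    """Estimate squared bias, variance and reducible error at x0."""
    preds = np.empty(n_rounds)
    for i in range(n_rounds):
        y_train = true_f(x_train) + rng.normal(0, noise_std, x_train.size)
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        preds[i] = np.polyval(coeffs, x0)
    bias_sq = float((preds.mean() - true_f(x0)) ** 2)
    variance = float(preds.var())
    reducible = float(np.mean((preds - true_f(x0)) ** 2))
    return bias_sq, variance, reducible

results = {degree: decompose(degree) for degree in (1, 3, 9)}
for degree, (bias_sq, variance, reducible) in results.items():
    print(f"degree={degree}: bias^2={bias_sq:.4f}, "
          f"variance={variance:.4f}, bias^2+variance={reducible:.4f}")
```

With these settings the degree-1 fit should show the largest squared bias and the smallest variance, and the degree-9 fit the reverse. The reducible error (error against the noise-free target) equals squared bias plus variance.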

**Thanks to:** [Bias and Variance in Machine Learning](https://www.bmc.com/blogs/bias-variance-machine-learning/)

# References
- [bias_variance_decomp](https://rasbt.github.io/mlxtend/api_subpackages/mlxtend.evaluate/#bias_variance_decomp)
- [iris_data](https://rasbt.github.io/mlxtend/api_subpackages/mlxtend.data/#iris_data)
- [sklearn.model_selection.train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html)
- [sklearn.tree.DecisionTreeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html)
- [sklearn.ensemble.BaggingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html)

## Decision Tree example

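### [Issue while training: AttributeError: module 'numpy' has no attribute 'int'](https://github.com/WongKinYiu/yolov7/issues/1280)

NumPy 1.24 removed the deprecated `np.int` alias, so library code that still references it raises this error. Pinning NumPy below 1.24 is a workaround.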

In [None]:
!pip install "numpy<1.24.0"



In [None]:
from mlxtend.evaluate import bias_variance_decomp
from mlxtend.data import iris_data
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

In [None]:
# Get the iris flower data set
X, y = iris_data()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                              random_state=123, shuffle=True, stratify=y)

# Define Algorithm
tree = DecisionTreeClassifier(random_state=123)

# Get Bias and Variance - bias_variance_decomp function
avg_exp_loss, avg_bias, avg_var = bias_variance_decomp(tree, X_train, y_train,
            X_test, y_test, loss='0-1_loss', random_seed=123, num_rounds=10000)

# Display Bias and Variance
print(f'Average Expected Loss: {round(avg_exp_loss, 4)}')
print(f'Average Bias: {round(avg_bias, 4)}')
print(f'Average Variance: {round(avg_var, 4)}')

Average Expected Loss: 0.063
Average Bias: 0.0222
Average Variance: 0.0416
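To make the three reported numbers easier to interpret, here is a sketch of how bias and variance are commonly defined for 0-1 loss, where the "main prediction" for each test point is the most frequent class across bootstrap rounds. This illustrates the idea and is not the library's actual code. The helper `zero_one_decomposition` and the tiny prediction matrix are hypothetical.

```python
import numpy as np

def zero_one_decomposition(all_pred, y_true):
    """all_pred: (rounds, n_test) array of integer class predictions."""
    # Main prediction: the most frequent class predicted for each test point.
    main_pred = np.array([np.bincount(col).argmax() for col in all_pred.T])
    # Expected loss: misclassification rate over all rounds and test points.
    avg_loss = float((all_pred != y_true).mean())
    # Bias: how often the main prediction itself is wrong.
    avg_bias = float((main_pred != y_true).mean())
    # Variance: how often a single round disagrees with the main prediction.
    avg_var = float((all_pred != main_pred).mean())
    return avg_loss, avg_bias, avg_var

# Hypothetical predictions from 4 bootstrap rounds on 3 test points.
all_pred = np.array([[0, 1, 2],
                     [0, 1, 1],
                     [0, 2, 1],
                     [0, 1, 1]])
y_true = np.array([0, 1, 2])

loss, bias, var = zero_one_decomposition(all_pred, y_true)
print(f"loss={loss:.4f}, bias={bias:.4f}, variance={var:.4f}")
```

In this toy case the loss and bias are both 1/3 while the variance is 1/6, which shows that under 0-1 loss the expected loss is not simply bias plus variance.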


## Bagging Classifier example

In [None]:
# Get the iris flower data set
X, y = iris_data()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                              random_state=123, shuffle=True, stratify=y)

# Define Algorithm
tree = DecisionTreeClassifier(random_state=123)
bag = BaggingClassifier(estimator=tree, n_estimators=100, random_state=123)

# Get Bias and Variance - bias_variance_decomp function
avg_exp_loss, avg_bias, avg_var = bias_variance_decomp(bag, X_train, y_train,
            X_test, y_test, loss='0-1_loss', random_seed=123, num_rounds=10000)

# Display Bias and Variance
print(f'Average Expected Loss: {round(avg_exp_loss, 4)}')
print(f'Average Bias: {round(avg_bias, 4)}')
print(f'Average Variance: {round(avg_var, 4)}')

Average Expected Loss: 0.0468
Average Bias: 0.0222
Average Variance: 0.0248
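Comparing the two runs, the average bias is unchanged at 0.0222 while the variance and expected loss both drop, which is the behaviour bagging is designed to produce. The snippet below only does arithmetic on the numbers printed above. The variable names are introduced here for illustration.

```python
# Values copied from the two outputs above.
tree_loss, tree_bias, tree_var = 0.063, 0.0222, 0.0416
bag_loss, bag_bias, bag_var = 0.0468, 0.0222, 0.0248

# Relative reductions achieved by bagging compared with the single tree.
var_reduction = (tree_var - bag_var) / tree_var
loss_reduction = (tree_loss - bag_loss) / tree_loss

print(f"Bias change:        {bag_bias - tree_bias:.4f}")
print(f"Variance reduction: {var_reduction:.1%}")
print(f"Loss reduction:     {loss_reduction:.1%}")
```

On these figures, bagging cuts the variance by roughly 40% and the expected loss by about 26%, with no change in bias.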
