# 04. Machine Learning with Tree-Based Models in Python

## 1. Classification and Regression Trees

### Decision tree for classification

#### Train your first classification tree

In [None]:
# Import DecisionTreeClassifier from sklearn.tree
from sklearn.tree import DecisionTreeClassifier

# Set a seed for reproducibility (assumed value; SEED is not defined in these notes)
SEED = 1

# Instantiate a DecisionTreeClassifier 'dt' with a maximum depth of 6
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)

# Fit dt to the training set
dt.fit(X_train, y_train)

# Predict test set labels
y_pred = dt.predict(X_test)
print(y_pred[0:5])

#### Evaluate the classification tree

In [None]:
# Import accuracy_score
from sklearn.metrics import accuracy_score

# Predict test set labels
y_pred = dt.predict(X_test)

# Compute test set accuracy
acc = accuracy_score(y_test, y_pred)
print("Test set accuracy: {:.2f}".format(acc))
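
The two cells above assume that `X_train`, `X_test`, `y_train` and `y_test` already exist. As a minimal end-to-end sketch, the same steps can be run on scikit-learn's built-in breast cancer dataset; the dataset choice, the 80/20 split and the seed value are assumptions made for illustration, not necessarily the setup used in the course.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # assumed seed, used only for reproducibility

# Stand-in data: scikit-learn's built-in breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the rows, stratified on the label
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED)

# Train a tree limited to depth 6, as in the cell above
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)
dt.fit(X_train, y_train)

# Predict and score on the held-out rows
y_pred = dt.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print("Test set accuracy: {:.2f}".format(acc))
```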

#### Logistic regression vs classification tree


In [None]:
# Import LogisticRegression from sklearn.linear_model
from sklearn.linear_model import LogisticRegression

# Instantiate logreg
logreg = LogisticRegression(random_state=1)

# Fit logreg to the training set
logreg.fit(X_train, y_train)

# Define a list called clfs containing the two classifiers logreg and dt
clfs = [logreg, dt]

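# Review the decision regions of the two classifiers
# (plot_labeled_decision_regions is a helper assumed to be defined elsewhere;
# it is not part of scikit-learn)
plot_labeled_decision_regions(X_test, y_test, clfs)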

![img](./Logistic_regression_vs_classification_tree.svg)

### Classification tree learning


#### Growing a classification tree


In the video, you saw that the growth of an unconstrained classification tree followed a few simple rules. Which of the following is not one of these rules?

1. The existence of a node depends on the state of its predecessors. (True)
2. The impurity of a node can be determined using different criteria such as entropy and the gini index. (True)
3. When the information gain resulting from splitting a node is null, the node is declared a leaf. (True)
4. When an internal node is split, the split is performed in such a way that information gain is minimized. (False)


* Statement 4 is the odd one out: an internal node is always split so that information gain is maximized, not minimized.
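
To make the rules above concrete, here is a small sketch that computes information gain by hand, as the parent's impurity minus the size-weighted impurity of the two children. The helper names (`entropy`, `gini`, `information_gain`) and the toy label arrays are hypothetical and exist only for this example; scikit-learn performs the equivalent computation internally.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (base 2) of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def gini(labels):
    """Gini impurity of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def information_gain(parent, left, right, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Toy node with 5 positives and 5 negatives
parent = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

# A useful split: each child is mostly one class
ig_entropy = information_gain(parent, np.array([1, 1, 1, 1, 0]),
                              np.array([1, 0, 0, 0, 0]), entropy)
ig_gini = information_gain(parent, np.array([1, 1, 1, 1, 0]),
                           np.array([1, 0, 0, 0, 0]), gini)

# A useless split: both children keep the parent's 50/50 mix
ig_null = information_gain(parent, np.array([1, 1, 0, 0]),
                           np.array([1, 1, 1, 0, 0, 0]), entropy)

print(f'Gain (entropy): {ig_entropy:.3f}')
print(f'Gain (gini):    {ig_gini:.3f}')
print(f'Gain (useless split): {abs(ig_null):.3f}')
```

The tree picks the candidate split with the largest gain (rule 4 with "maximized"), and a node whose best split has zero gain, like the last one here, becomes a leaf (rule 3).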

#### Using entropy as a criterion


In [None]:
# Import DecisionTreeClassifier from sklearn.tree
from sklearn.tree import DecisionTreeClassifier

# Instantiate dt_entropy, set 'entropy' as the information criterion
dt_entropy = DecisionTreeClassifier(max_depth=8, criterion='entropy', random_state=1)

# Fit dt_entropy to the training set
dt_entropy.fit(X_train, y_train)

#### Entropy vs Gini index


In [None]:
# Import accuracy_score from sklearn.metrics
from sklearn.metrics import accuracy_score

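# Use dt_entropy to predict test set labels
y_pred = dt_entropy.predict(X_test)

# Evaluate accuracy_entropy
accuracy_entropy = accuracy_score(y_test, y_pred)

# Train a comparison tree with the gini index
# (assumed setup mirroring dt_entropy; accuracy_gini is not computed in these notes)
dt_gini = DecisionTreeClassifier(max_depth=8, criterion='gini', random_state=1)
dt_gini.fit(X_train, y_train)

# Evaluate accuracy_gini
accuracy_gini = accuracy_score(y_test, dt_gini.predict(X_test))

# Print accuracy_entropy
print(f'Accuracy achieved by using entropy: {accuracy_entropy:.3f}')

# Print accuracy_gini
print(f'Accuracy achieved by using the gini index: {accuracy_gini:.3f}')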

### Decision tree for regression


#### Train your first regression tree


In [None]:
# Import DecisionTreeRegressor from sklearn.tree
from sklearn.tree import DecisionTreeRegressor

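# Instantiate dt; a float min_samples_leaf is read as a fraction of the
# training samples, so each leaf must hold at least 13% of them
dt = DecisionTreeRegressor(max_depth=8,
                           min_samples_leaf=0.13,
                           random_state=3)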

# Fit dt to the training set
dt.fit(X_train, y_train)

#### Evaluate the regression tree


In [None]:
# Import mean_squared_error from sklearn.metrics as MSE
from sklearn.metrics import mean_squared_error as MSE

# Compute y_pred
y_pred = dt.predict(X_test)

# Compute mse_dt
mse_dt = MSE(y_test, y_pred)

# Compute rmse_dt
rmse_dt = mse_dt ** (1/2)

# Print rmse_dt
print("Test set RMSE of dt: {:.2f}".format(rmse_dt))

#### Linear regression vs regression tree


In [None]:
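# Import LinearRegression and fit lr
# (assumed setup; lr is not defined anywhere in these notes)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train, y_train)

# Predict test set labels
y_pred_lr = lr.predict(X_test)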

# Compute mse_lr
mse_lr = MSE(y_test, y_pred_lr)

# Compute rmse_lr
rmse_lr = mse_lr ** (1/2)

# Print rmse_lr
print('Linear Regression test set RMSE: {:.2f}'.format(rmse_lr))

# Print rmse_dt
print('Regression Tree test set RMSE: {:.2f}'.format(rmse_dt))
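
The regression cells above rely on data and a fitted `lr` from the course environment. The sketch below reruns the same comparison on synthetic data so that it is self-contained; the noisy sine target, the sample size and the split are assumptions chosen for illustration, so the resulting numbers say nothing about the course dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic, non-linear data (assumed for illustration): y = sin(x) + noise
rng = np.random.default_rng(3)
X = rng.uniform(0, 6, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=3)

# Same tree settings as in the cell above
dt = DecisionTreeRegressor(max_depth=8, min_samples_leaf=0.13, random_state=3)
dt.fit(X_train, y_train)

# Linear baseline
lr = LinearRegression()
lr.fit(X_train, y_train)

# RMSE is the square root of the mean squared error
rmse_dt = MSE(y_test, dt.predict(X_test)) ** 0.5
rmse_lr = MSE(y_test, lr.predict(X_test)) ** 0.5

print('Linear Regression test set RMSE: {:.2f}'.format(rmse_lr))
print('Regression Tree test set RMSE: {:.2f}'.format(rmse_dt))
```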

## 2. The Bias-Variance Tradeoff


### Generalization Error


#### Complexity, bias and variance


#### Overfitting and underfitting


### Diagnose bias and variance problems


#### Instantiate the model


#### Evaluate the 10-fold CV error


#### Evaluate the training error


#### High bias or high variance?
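
The four steps named in this subsection (instantiate the model, evaluate the 10-fold CV error, evaluate the training error, compare them) can be sketched as follows. The synthetic dataset from `make_regression` and the hyperparameter values are assumptions for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

SEED = 1  # assumed seed

# Synthetic regression data (assumed for illustration)
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=SEED)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=SEED)

# Instantiate the model (hyperparameter values are illustrative)
dt = DecisionTreeRegressor(max_depth=4, min_samples_leaf=0.26, random_state=SEED)

# 10-fold CV error: scikit-learn reports the negated MSE, so flip the sign
neg_mse_cv = cross_val_score(dt, X_train, y_train, cv=10,
                             scoring='neg_mean_squared_error')
rmse_cv = (-neg_mse_cv.mean()) ** 0.5

# Training error: fit on the full training set and score on that same set
dt.fit(X_train, y_train)
rmse_train = MSE(y_train, dt.predict(X_train)) ** 0.5

print('CV RMSE: {:.2f}'.format(rmse_cv))
print('Train RMSE: {:.2f}'.format(rmse_train))
```

As a rough guide, a CV error well above the training error points to high variance (overfitting), while CV and training errors that are both high and close to each other point to high bias (underfitting).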


### Ensemble Learning


#### Define the ensemble


#### Evaluate individual classifiers


#### Better performance with a Voting Classifier
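
A minimal sketch of the three steps in this subsection: define a few different classifiers, evaluate each one, then combine them with `VotingClassifier`. The dataset, the choice of ensemble members and their hyperparameters are assumptions for illustration; whether voting actually beats the individual models depends on the data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # assumed seed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=SEED)

# Define the ensemble members; the two scale-sensitive models get a scaler
classifiers = [
    ('Logistic Regression',
     make_pipeline(StandardScaler(), LogisticRegression(random_state=SEED))),
    ('K Nearest Neighbours',
     make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=27))),
    ('Classification Tree',
     DecisionTreeClassifier(min_samples_leaf=0.13, random_state=SEED)),
]

# Evaluate each classifier on its own
scores = {}
for name, clf in classifiers:
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print('{:s} : {:.3f}'.format(name, scores[name]))

# Combine them; VotingClassifier uses hard (majority) voting by default
vc = VotingClassifier(estimators=classifiers)
vc.fit(X_train, y_train)
scores['Voting Classifier'] = accuracy_score(y_test, vc.predict(X_test))
print('Voting Classifier : {:.3f}'.format(scores['Voting Classifier']))
```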


## 3. Bagging and Random Forests


### Bagging


* Bootstrap Aggregation

#### Define the bagging classifier


#### Evaluate Bagging performance


### Out of Bag Evaluation


#### Prepare the ground


#### OOB Score vs Test Set Score
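
A sketch of bagging with out-of-bag (OOB) evaluation, covering the steps named in the Bagging and Out of Bag Evaluation subsections. Each base tree is trained on a bootstrap sample, and the rows left out of that sample serve as a built-in validation set. The dataset and hyperparameters are assumptions for illustration; the base estimator is passed positionally because its keyword name differs across scikit-learn versions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # assumed seed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=SEED)

# Base estimator: a single classification tree
dt = DecisionTreeClassifier(min_samples_leaf=8, random_state=SEED)

# Bag 50 trees and ask for the OOB accuracy to be computed during fit
bc = BaggingClassifier(dt, n_estimators=50, oob_score=True, random_state=SEED)
bc.fit(X_train, y_train)

# Compare the held-out test accuracy with the OOB estimate
acc_test = accuracy_score(y_test, bc.predict(X_test))
acc_oob = bc.oob_score_

print('Test set accuracy: {:.3f}, OOB accuracy: {:.3f}'.format(acc_test, acc_oob))
```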


### Random Forests (RF)


#### Train an RF regressor


#### Evaluate the RF regressor


#### Visualizing feature importances
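
A sketch of the three random forest steps listed above: train a regressor, evaluate its test RMSE, and inspect feature importances. It uses scikit-learn's built-in diabetes dataset as a stand-in, which is an assumption, and it prints a sorted list rather than drawing a bar chart so that it has no plotting dependency.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import train_test_split

# Stand-in data (assumed): the built-in diabetes regression dataset
data = load_diabetes()
X, y, feature_names = data.data, data.target, data.feature_names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2)

# Train a forest of 25 trees (illustrative size)
rf = RandomForestRegressor(n_estimators=25, random_state=2)
rf.fit(X_train, y_train)

# Evaluate the test set RMSE
rmse_test = MSE(y_test, rf.predict(X_test)) ** 0.5
print('Test set RMSE of rf: {:.2f}'.format(rmse_test))

# Impurity-based importances, listed from most to least important
importances = rf.feature_importances_
order = np.argsort(importances)[::-1]
for idx in order:
    print('{:<5s} {:.3f}'.format(feature_names[idx], importances[idx]))
```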


## 4. Boosting


### AdaBoost


#### Define the AdaBoost classifier


#### Train the AdaBoost classifier


#### Evaluate the AdaBoost classifier
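
The define, train and evaluate steps for AdaBoost can be sketched as below, scoring with ROC AUC on the predicted probability of the positive class. The dataset, the shallow base tree and the number of estimators are assumptions for illustration; the base estimator is passed positionally because its keyword name differs across scikit-learn versions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # assumed seed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED)

# Define: a shallow tree as the weak learner, boosted 50 times
dt = DecisionTreeClassifier(max_depth=2, random_state=SEED)
ada = AdaBoostClassifier(dt, n_estimators=50, random_state=SEED)

# Train
ada.fit(X_train, y_train)

# Evaluate: ROC AUC needs scores, so use the positive-class probability
y_pred_proba = ada.predict_proba(X_test)[:, 1]
ada_roc_auc = roc_auc_score(y_test, y_pred_proba)
print('ROC AUC score: {:.2f}'.format(ada_roc_auc))
```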


### Gradient Boosting (GB)


#### Define the GB regressor


#### Train the GB regressor


#### Evaluate the GB regressor


### Stochastic Gradient Boosting (SGB)


#### Regression with SGB


#### Train the SGB regressor


#### Evaluate the SGB regressor
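
A sketch covering both the Gradient Boosting and Stochastic Gradient Boosting subsections. In scikit-learn the stochastic variant is the same `GradientBoostingRegressor` with `subsample` below 1.0 (each tree sees a random fraction of the rows) and, optionally, `max_features` below 1.0 (each split considers a random fraction of the features). The dataset and all hyperparameter values are assumptions for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import train_test_split

SEED = 2  # assumed seed

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED)

# Plain gradient boosting: every tree sees all rows and all features
gb = GradientBoostingRegressor(max_depth=4, n_estimators=200, random_state=SEED)
gb.fit(X_train, y_train)
rmse_gb = MSE(y_test, gb.predict(X_test)) ** 0.5

# Stochastic gradient boosting: 90% of rows per tree, 75% of features per split
sgbr = GradientBoostingRegressor(max_depth=4, subsample=0.9, max_features=0.75,
                                 n_estimators=200, random_state=SEED)
sgbr.fit(X_train, y_train)
rmse_sgbr = MSE(y_test, sgbr.predict(X_test)) ** 0.5

print('Test set RMSE of gb: {:.3f}'.format(rmse_gb))
print('Test set RMSE of sgbr: {:.3f}'.format(rmse_sgbr))
```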


## 5. Model Tuning


### Tuning a CART's Hyperparameters


#### Tree hyperparameters


#### Set the tree's hyperparameter grid


#### Search for the optimal tree


#### Evaluate the optimal tree
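
The grid, search and evaluate steps for tuning a classification tree can be sketched with `GridSearchCV`. The dataset, the grid values, the `roc_auc` scoring choice and the 5-fold setting are assumptions for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # assumed seed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED)

dt = DecisionTreeClassifier(random_state=SEED)

# Set the tree's hyperparameter grid (illustrative values)
params_dt = {
    'max_depth': [2, 3, 4],
    'min_samples_leaf': [0.12, 0.14, 0.16, 0.18],
}

# Search for the optimal tree with 5-fold CV, scored by ROC AUC
grid_dt = GridSearchCV(estimator=dt, param_grid=params_dt,
                       scoring='roc_auc', cv=5)
grid_dt.fit(X_train, y_train)

# Evaluate the optimal tree; best_estimator_ is refit on the full training set
best_model = grid_dt.best_estimator_
y_pred_proba = best_model.predict_proba(X_test)[:, 1]
test_roc_auc = roc_auc_score(y_test, y_pred_proba)

print('Best hyperparameters:', grid_dt.best_params_)
print('Test set ROC AUC score: {:.3f}'.format(test_roc_auc))
```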


### Tuning a RF's Hyperparameters


#### Random forests hyperparameters


#### Set the hyperparameter grid of RF


#### Search for the optimal forest


#### Evaluate the optimal forest
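
The same grid search pattern applies to a random forest regressor. This sketch keeps the grid deliberately small so that it runs quickly; the dataset, the grid values and the 3-fold setting are assumptions for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error as MSE
from sklearn.model_selection import GridSearchCV, train_test_split

SEED = 2  # assumed seed

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED)

rf = RandomForestRegressor(random_state=SEED)

# Set the hyperparameter grid of rf (illustrative values)
params_rf = {
    'n_estimators': [25, 50],
    'max_features': ['sqrt', 'log2'],
    'min_samples_leaf': [2, 10],
}

# Search for the optimal forest; the score is negated MSE, so higher is better
grid_rf = GridSearchCV(estimator=rf, param_grid=params_rf,
                       scoring='neg_mean_squared_error', cv=3)
grid_rf.fit(X_train, y_train)

# Evaluate the optimal forest on the held-out test set
best_model = grid_rf.best_estimator_
rmse_test = MSE(y_test, best_model.predict(X_test)) ** 0.5

print('Best hyperparameters:', grid_rf.best_params_)
print('Test RMSE of best model: {:.3f}'.format(rmse_test))
```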
