# Decision Trees

**Decision trees** are supervised learning models used for classification and regression problems. Tree models are highly flexible, and that flexibility comes at a price: on one hand, trees can capture complex non-linear relationships; on the other hand, they are prone to memorizing the noise present in a dataset. By aggregating the predictions of trees that are trained differently, **ensemble methods** take advantage of the flexibility of trees while reducing their tendency to memorize noise. Ensemble methods are used across a variety of fields and have a proven track record in machine learning competitions.

In this notebook, you'll learn how to use Python to train **decision trees** and **tree-based models** with the user-friendly scikit-learn machine learning library. You'll see the advantages and shortcomings of trees and how ensembling can alleviate these shortcomings, all while practicing on real-world datasets. Finally, you'll learn how to tune the most influential hyperparameters to get the most out of your models.
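The trade-off described above can be illustrated with a minimal sketch. It is not part of the original exercises: the synthetic dataset from `make_classification`, the amount of label noise (`flip_y=0.2`), the choice of a random forest as the ensemble, and the seed are all assumptions made purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1  # arbitrary seed, used only to make the sketch reproducible

# Synthetic data (an assumption, not one of the notebook's datasets).
# flip_y=0.2 assigns random labels to about 20% of the samples, i.e. label noise.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=4,
                           flip_y=0.2, random_state=SEED)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED)

# With no depth limit, a single tree keeps splitting until its leaves are pure,
# so it fits the noisy training labels as well
tree = DecisionTreeClassifier(random_state=SEED).fit(X_train, y_train)

# A random forest aggregates many trees trained on bootstrap samples
forest = RandomForestClassifier(n_estimators=100,
                                random_state=SEED).fit(X_train, y_train)

tree_train_acc = tree.score(X_train, y_train)
tree_test_acc = tree.score(X_test, y_test)
forest_test_acc = forest.score(X_test, y_test)

print(f"Single tree   train: {tree_train_acc:.2f}  test: {tree_test_acc:.2f}")
print(f"Random forest test:  {forest_test_acc:.2f}")
```

The gap between the single tree's training and test accuracy is the memorization the text refers to. On data like this the forest usually scores higher on the test set, though the exact numbers depend on the data and the seed.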

In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

path = 'data/dc21/'

## Classification and Regression Trees (CART)

Classification and Regression Trees (CART) are a family of supervised learning models used for classification and regression problems. This section introduces the CART algorithm.

<img src="images/tree_class_01.png" alt="" style="width: 400px;"/>

<img src="images/tree_class_02.png" alt="" style="width: 400px;"/>

<img src="images/tree_class_03.png" alt="" style="width: 400px;"/>

<img src="images/tree_class_04.png" alt="" style="width: 400px;"/>


## Train your first classification tree

In this exercise, you'll work with the [Wisconsin Breast Cancer Dataset](https://www.kaggle.com/uciml/breast-cancer-wisconsin-data) from the UCI Machine Learning Repository. You'll predict whether a tumor is malignant or benign based on two features: the mean radius of the tumor (`radius_mean`) and its mean number of concave points (`concave points_mean`).

The dataset is already loaded in your workspace and is split into 80% train and 20% test. The feature matrices are assigned to `X_train` and `X_test`, and the label arrays are assigned to `y_train` and `y_test`, where class `1` corresponds to a malignant tumor and class `0` corresponds to a benign tumor. To obtain reproducible results, a variable called `SEED` is also defined and set to 1.
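If the workspace variables are not available, the setup described above can be approximated as follows. This is a sketch under stated assumptions, not the exercise's own loading code: it uses scikit-learn's bundled copy of the dataset (`load_breast_cancer`), where the two features are named `mean radius` and `mean concave points` and malignant tumors are encoded as `0`, so the labels are flipped to match the convention used here. The stratified split and `max_depth=6` are also illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1

# Assumption: scikit-learn's bundled copy stands in for the workspace dataset
data = load_breast_cancer(as_frame=True)
X = data.data[['mean radius', 'mean concave points']]

# scikit-learn encodes malignant as 0 and benign as 1;
# flip so that 1 = malignant and 0 = benign, as in the text
y = 1 - data.target

# 80% train / 20% test; stratifying on y is an illustrative choice
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED)

# max_depth=6 is an assumed value, not prescribed by the text
dt = DecisionTreeClassifier(max_depth=6, random_state=SEED)
dt.fit(X_train, y_train)

# Predict test set labels and measure accuracy
y_pred = dt.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"Test set accuracy: {acc:.2f}")
```

Limiting `max_depth` is one simple way to restrain the tree's tendency to memorize noise; other values may work better and are worth trying.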