#### Classification Tree

- A classification tree is structured as **"a sequence of if-else questions"** about individual features.
- The objective is to infer the class labels.
- Trees can capture **non-linear** relationships between features and labels.
- Trees **don't require feature scaling.**
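
The scaling claim above can be checked with a small sketch. The dataset here is synthetic (`make_classification`) and stands in for the notebook's own `X` and `y`, which are not shown; the names `X_demo`, `tree_raw` and `tree_scaled` are illustrative only. Since standardization keeps the ordering of values within each feature, the tree should pick the same split features with or without it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: the notebook's real X and y are not shown)
X_demo, y_demo = make_classification(n_samples=300, n_features=5,
                                     n_informative=3, random_state=1)
X_demo[:, 0] *= 1000.0  # put one feature on a much larger scale than the others

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2,
                                      stratify=y_demo, random_state=1)

# Fit one tree on the raw features and one on standardized features
scaler = StandardScaler().fit(Xtr)
tree_raw = DecisionTreeClassifier(max_depth=2, random_state=1).fit(Xtr, ytr)
tree_scaled = DecisionTreeClassifier(max_depth=2, random_state=1).fit(
    scaler.transform(Xtr), ytr)

# Compare which feature each node splits on, and the training-set predictions
same_features = np.array_equal(tree_raw.tree_.feature, tree_scaled.tree_.feature)
same_train_pred = np.array_equal(tree_raw.predict(Xtr),
                                 tree_scaled.predict(scaler.transform(Xtr)))

print('Same split features:', same_features)
print('Same training predictions:', same_train_pred)
```

Only the split thresholds change (they move to the scaled units); the questions asked and the resulting partition of the training data stay the same.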

In [None]:
#import decision tree classifier and other necessary packages
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

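#split the dataset into train/test sets
#(X holds the features and y the labels; both are assumed to be defined in an earlier cell)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    #keep the class proportions of y in both sets
                                                    stratify=y,
                                                    random_state=1)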

#instantiate the classifier
dtc = DecisionTreeClassifier(max_depth=2,
                             #maximum number of splits between the root and any leaf
                             random_state=1) #reproducibility

#fitting on the training split
dtc.fit(X_train, y_train)

#predict the labels of the test set
y_pred = dtc.predict(X_test)

#evaluate the test-set accuracy by comparing the predicted labels to the test split labels
accuracy_score(y_test, y_pred)
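
To see the "sequence of if-else questions" that a fitted tree encodes, the rules can be printed with `sklearn.tree.export_text`. This is a minimal sketch on synthetic data; `X_demo`, `y_demo`, `clf` and the feature names are made up for illustration and are not the notebook's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (assumption: replaces the notebook's X and y)
X_demo, y_demo = make_classification(n_samples=200, n_features=4,
                                     n_informative=2, random_state=1)
names = [f"feature_{i}" for i in range(X_demo.shape[1])]  # hypothetical names

clf = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X_demo, y_demo)

# Each indented line is one if-else question; "class:" lines are the leaves
rules = export_text(clf, feature_names=names)
print(rules)
```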

#### Decision Regions

- **Decision region**: a region in the feature space where all instances are assigned to one class label.
- Decision regions are separated by surfaces called **decision boundaries**.
- A classification tree produces **rectangular** decision regions in the feature space.
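
The rectangular shape follows from the fact that every split compares a single feature to a threshold. A rough sketch on a synthetic two-feature dataset (an assumption, not the notebook's data) lists the splits of a fitted tree through its `tree_` attribute:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic 2-D stand-in data so the regions live in a plane
X_demo, y_demo = make_classification(n_samples=300, n_features=2,
                                     n_informative=2, n_redundant=0,
                                     n_clusters_per_class=1, random_state=1)
tree = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X_demo, y_demo)

t = tree.tree_
is_split = t.children_left != -1  # leaves have no children (index -1)

# Every split has the form x[feature] <= threshold: a line parallel to an axis
splits = [(int(f), float(th))
          for f, th in zip(t.feature[is_split], t.threshold[is_split])]
for f, th in splits:
    print(f"x[{f}] <= {th:.3f}")

# Each leaf corresponds to one axis-aligned (rectangular) region
n_regions = tree.get_n_leaves()
print('Number of decision regions:', n_regions)
```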

Example - ![image.png](attachment:image.png)

#### How a Decision Tree Learns from Data

A decision tree is a hierarchy of **nodes**. Each node is either a decision fork (a question about a feature) or, at a leaf, a prediction of the label.

##### Other Terminology
- **Root**: no parent node | gives rise to two child nodes
- **Internal Node**: one parent node | gives rise to two child nodes
- **Leaf**: one parent node | no child nodes, produces a prediction
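
These node types can be counted on a fitted scikit-learn tree. The sketch below assumes synthetic data and relies on the `tree_` attribute, where node 0 is the root and a leaf is marked by a child index of -1.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the notebook's dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=4,
                                     n_informative=2, random_state=1)
tree = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X_demo, y_demo)

t = tree.tree_
is_leaf = t.children_left == -1        # leaf: no child nodes
n_leaves = int(is_leaf.sum())
n_internal = t.node_count - n_leaves - 1  # everything that is neither root nor leaf

print('Root nodes: 1')
print('Internal nodes:', n_internal)
print('Leaves:', n_leaves)
```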

##### Using Entropy as Criterion

In [None]:
from sklearn.tree import DecisionTreeClassifier
dt_entropy = DecisionTreeClassifier(max_depth=8, 
                                    criterion='entropy', random_state=1)
dt_entropy.fit(X_train, y_train)
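
For intuition about what the `criterion` argument measures, here is a small hand-rolled sketch of the two standard impurity measures, entropy, $-\sum_k p_k \log_2 p_k$, and the gini index, $1 - \sum_k p_k^2$, together with the impurity reduction of a split. The helper names (`entropy`, `gini`, `information_gain`) and the toy labels are illustrative; this is not scikit-learn's internal implementation.

```python
import numpy as np

def class_proportions(labels):
    """Fraction of samples belonging to each class present in `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()

def entropy(labels):
    """Entropy in bits: -sum(p_k * log2(p_k))."""
    p = class_proportions(labels)
    return float(-(p * np.log2(p)).sum())

def gini(labels):
    """Gini index: 1 - sum(p_k ** 2)."""
    p = class_proportions(labels)
    return float(1.0 - (p ** 2).sum())

def information_gain(parent, left, right, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    return (impurity(parent)
            - len(left) / n * impurity(left)
            - len(right) / n * impurity(right))

# Toy node with 4 samples of each class, split into two children
parent = [0, 0, 0, 0, 1, 1, 1, 1]
left, right = [0, 0, 0, 1], [0, 1, 1, 1]

print('Entropy of parent:', entropy(parent))
print('Gini index of parent:', gini(parent))
print('Information gain of the split:', information_gain(parent, left, right))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, so a tree grown with either criterion prefers splits that make the children purer.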

##### Entropy vs. Gini index

In [None]:
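from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

#instantiate and fit a tree that uses the gini index as its split criterion
dt_gini = DecisionTreeClassifier(max_depth=8,
                                 criterion='gini', random_state=1)
dt_gini.fit(X_train, y_train)

#test-set accuracy of the entropy-based tree (dt_entropy from the previous cell)
y_pred_entropy = dt_entropy.predict(X_test)
accuracy_entropy = accuracy_score(y_test, y_pred_entropy)

#test-set accuracy of the gini-based tree
y_pred_gini = dt_gini.predict(X_test)
accuracy_gini = accuracy_score(y_test, y_pred_gini)

print('Accuracy achieved by using entropy: ', accuracy_entropy)
print('Accuracy achieved by using the gini index: ', accuracy_gini)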

### Limitations of CARTs

- A classification tree can only produce orthogonal (axis-aligned) decision boundaries.
- Sensitive to small variations in the training set.
- High variance: unconstrained CARTs may overfit the training set.
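
The overfitting point can be illustrated by comparing an unconstrained tree with a depth-limited one. This is only a sketch on synthetic, deliberately noisy data (`flip_y=0.1` randomizes about 10% of the labels); the variable names are illustrative and the exact accuracies will depend on the data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy stand-in data (assumption: not the notebook's dataset)
X_demo, y_demo = make_classification(n_samples=400, n_features=10,
                                     n_informative=4, flip_y=0.1,
                                     random_state=1)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2,
                                      stratify=y_demo, random_state=1)

# Unconstrained tree: grows until every training leaf is pure
deep = DecisionTreeClassifier(random_state=1).fit(Xtr, ytr)
# Constrained tree: depth limited to 3
shallow = DecisionTreeClassifier(max_depth=3, random_state=1).fit(Xtr, ytr)

deep_train = accuracy_score(ytr, deep.predict(Xtr))
deep_test = accuracy_score(yte, deep.predict(Xte))
shallow_train = accuracy_score(ytr, shallow.predict(Xtr))
shallow_test = accuracy_score(yte, shallow.predict(Xte))

print('Unconstrained: depth', deep.get_depth(),
      '| train', deep_train, '| test', deep_test)
print('max_depth=3:   depth', shallow.get_depth(),
      '| train', shallow_train, '| test', shallow_test)
```

A large gap between the training and test accuracy of the unconstrained tree is the usual sign that it has memorized noise in the training set.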

Solution: **Ensemble Learning**