## Decision Tree

In [None]:
from sklearn import datasets
import numpy as np
iris = datasets.load_iris()
X = iris.data[:, [2, 3]]
y = iris.target

In [None]:
from sklearn.model_selection import train_test_split
# train_test_split shuffles the data before splitting;
# stratify=y keeps the class proportions equal in both subsets
X_train, X_test, y_train, y_test = train_test_split(
  X, y, test_size=0.3, random_state=1, stratify=y)

In [None]:
# Standardization: fit the scaler on the training data only,
# then apply the same transformation to both subsets
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
sc.fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

In [None]:
# Helper to visualize the decision regions of a fitted classifier
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt

def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.02):
  # setup marker generator and color map
  markers = ('o', 's', '^', 'v', '<')
  colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
  cmap = ListedColormap(colors[:len(np.unique(y))])

  # plot the decision surface
  x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
  x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
  xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),np.arange(x2_min, x2_max, resolution))
  lab = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
  lab = lab.reshape(xx1.shape)
  plt.contourf(xx1, xx2, lab, alpha=0.3, cmap=cmap)
  plt.xlim(xx1.min(), xx1.max())
  plt.ylim(xx2.min(), xx2.max())

  # plot class examples
  for idx, cl in enumerate(np.unique(y)):
    plt.scatter(x=X[y == cl, 0],
      y=X[y == cl, 1],
      alpha=0.8,
      c=colors[idx],
      marker=markers[idx],
      label=f'Class {cl}',
      edgecolor='black')

  # highlight test examples
  if test_idx is not None:
    # circle the test examples
    X_test, y_test = X[test_idx, :], y[test_idx]
    plt.scatter(X_test[:, 0], X_test[:, 1],
      c='none', edgecolor='black', alpha=1.0,
      linewidth=1, marker='o',
      s=100, label='Test set')

Decision trees can build complex decision boundaries by dividing the feature space into rectangles.
However, we have to be careful since the deeper the decision tree, the more complex the decision
boundary becomes, which can easily result in overfitting. Using scikit-learn, we will now train a
decision tree with a maximum depth of 4, using the Gini impurity as a criterion for impurity.
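
As a rough illustration of the impurity criterion named above, the Gini impurity of a node can be computed as 1 minus the sum of the squared class proportions. The helper `gini_impurity` below is a hypothetical name introduced for this sketch and is not part of scikit-learn.

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity of a 1-D array of class labels: 1 - sum_i p_i**2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()  # class proportions in the node
    return 1.0 - np.sum(p ** 2)

pure = gini_impurity(np.array([0, 0, 0, 0]))    # a single class
mixed = gini_impurity(np.array([0, 0, 1, 1]))   # two classes, evenly split
print(pure)   # 0.0
print(mixed)  # 0.5
```

A pure node has impurity 0, and the impurity grows as the classes in a node become more evenly mixed, which is why a split that lowers it is preferred.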

In [None]:
from sklearn.tree import DecisionTreeClassifier
tree_model = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
tree_model.fit(X_train, y_train)
X_combined = np.vstack((X_train, X_test))
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
plot_decision_regions(X_combined,y_combined,classifier=tree_model,test_idx=range(105, 150))
plt.xlabel('Petal length [cm]')
plt.ylabel('Petal width [cm]')
plt.legend(loc='upper left')
plt.tight_layout()
plt.show()

A nice feature in scikit-learn is that it allows us to readily visualize the decision tree model after
training via the following code:

In [None]:
from sklearn import tree
# The tree was trained on columns 2 and 3 only, so only the two petal features are named
feature_names = ['Petal length', 'Petal width']
tree.plot_tree(tree_model,feature_names=feature_names,filled=True)
plt.show()
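
When a figure is inconvenient, the same tree can also be inspected as plain text with `sklearn.tree.export_text`. The sketch below is self-contained and, as an assumption made for brevity, refits a tree on the full petal data rather than on the training split; `X_petal` and `tree_demo` are names introduced here.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris_demo = load_iris()
X_petal = iris_demo.data[:, [2, 3]]  # petal length and petal width

# Same hyperparameters as above, fitted on all 150 examples for illustration
tree_demo = DecisionTreeClassifier(criterion='gini', max_depth=4, random_state=1)
tree_demo.fit(X_petal, iris_demo.target)

# One line per split or leaf, indented by depth
rules = export_text(tree_demo, feature_names=['Petal length', 'Petal width'])
print(rules)
```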

Ensemble methods have gained huge popularity in applications of machine learning during the last
decade due to their good classification performance and robustness toward overfitting. While we
are going to cover different ensemble methods, including bagging and boosting, later in Chapter 7,
Combining Different Models for Ensemble Learning, let’s discuss the decision tree-based random forest
algorithm, which is known for its good scalability and ease of use. A random forest can be considered
as an ensemble of decision trees. The idea behind a random forest is to average multiple (deep)
decision trees that individually suffer from high variance to build a more robust model that has a better
generalization performance and is less susceptible to overfitting. The random forest algorithm can
be summarized in four simple steps:

1. Draw a random bootstrap sample of size n (randomly choose n examples from the training
   dataset with replacement).
2. Grow a decision tree from the bootstrap sample. At each node:
   a. Randomly select d features without replacement.
   b. Split the node using the feature that provides the best split according to the objective
      function, for instance, maximizing the information gain.
3. Repeat steps 1-2 k times.
4. Aggregate the prediction by each tree to assign the class label by majority vote.
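
The four steps above can be sketched by hand with `DecisionTreeClassifier`. This is a minimal illustration under stated assumptions, not how `RandomForestClassifier` is implemented: `fit_forest_sketch` and `predict_majority` are hypothetical helpers, d is passed through `max_features`, and class labels are assumed to be non-negative integers so that `np.bincount` can count the votes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def fit_forest_sketch(X, y, k=25, d='sqrt', seed=1):
    """Steps 1-3: grow k trees, each on its own bootstrap sample."""
    rng = np.random.RandomState(seed)
    n = X.shape[0]
    trees = []
    for _ in range(k):
        # Step 1: draw n example indices with replacement
        idx = rng.randint(0, n, size=n)
        # Step 2: max_features=d makes each node consider a random feature subset
        tree = DecisionTreeClassifier(
            max_features=d, random_state=rng.randint(0, 2**31 - 1))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_majority(trees, X):
    """Step 4: assign each example the label most trees voted for."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)  # (k, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])

iris_sketch = load_iris()
X_sketch = iris_sketch.data[:, [2, 3]]
y_sketch = iris_sketch.target
sketch_trees = fit_forest_sketch(X_sketch, y_sketch, k=25)
sketch_pred = predict_majority(sketch_trees, X_sketch)
sketch_accuracy = (sketch_pred == y_sketch).mean()
print(f'Training accuracy of the sketch: {sketch_accuracy:.3f}')
```

Note that scikit-learn's own random forest combines the trees by averaging their predicted class probabilities rather than by counting hard votes, so its predictions can differ slightly from this sketch.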

In [None]:
from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators=25,random_state=1,n_jobs=2)
forest.fit(X_train, y_train)
plot_decision_regions(X_combined, y_combined,classifier=forest, test_idx=range(105,150))
plt.xlabel('Petal length [cm]')
plt.ylabel('Petal width [cm]')
plt.legend(loc='upper left')
plt.tight_layout()
plt.show()

Using the preceding code, we trained a random forest from 25 decision trees via the n_estimators
parameter. By default, it uses the Gini impurity measure as a criterion to split the nodes. Although
we are growing a very small random forest from a very small training dataset, we used the n_jobs
parameter for demonstration purposes, which allows us to parallelize the model training using multiple
cores of our computer (here, two cores). If you encounter errors with this code, your computer may
not support multiprocessing. You can omit the n_jobs parameter or set it to n_jobs=None.
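
The settings described above can be checked on a fitted model. The following self-contained sketch reproduces the same split, trains with `n_jobs=None` (no parallelism), and reads back the number of trees and the split criterion; the `_chk` names are introduced here only to avoid overwriting the notebook's variables.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris_chk = load_iris()
X_chk = iris_chk.data[:, [2, 3]]
y_chk = iris_chk.target
X_tr_chk, X_te_chk, y_tr_chk, y_te_chk = train_test_split(
    X_chk, y_chk, test_size=0.3, random_state=1, stratify=y_chk)

# n_jobs=None trains the trees sequentially, avoiding multiprocessing entirely
forest_chk = RandomForestClassifier(n_estimators=25, random_state=1, n_jobs=None)
forest_chk.fit(X_tr_chk, y_tr_chk)

n_trees = len(forest_chk.estimators_)   # one fitted tree per estimator
split_criterion = forest_chk.criterion  # default criterion, not set explicitly
test_accuracy = forest_chk.score(X_te_chk, y_te_chk)
print(n_trees, split_criterion, f'{test_accuracy:.3f}')
```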

## K-nearest neighbors

In [None]:
from sklearn.neighbors import KNeighborsClassifier
# With metric='minkowski', p=2 corresponds to the Euclidean distance
knn = KNeighborsClassifier(n_neighbors=5, p=2, metric='minkowski')
knn.fit(X_train_std, y_train)
plot_decision_regions(X_combined_std, y_combined,classifier=knn, test_idx=range(105,150))
plt.xlabel('Petal length [standardized]')
plt.ylabel('Petal width [standardized]')
plt.legend(loc='upper left')
plt.tight_layout()
plt.show()