## Decision Trees
source: Wikipedia

In statistics, decision tree learning uses a decision tree as a predictive model to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves). It is one of the predictive modeling approaches used in statistics, data mining and machine learning. Tree models where the target variable takes a discrete set of values are called classification trees; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable takes continuous values (typically real numbers) are called regression trees.

In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making).

https://en.wikipedia.org/wiki/Decision_tree_learning

In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn import tree

In [None]:
golf_data = pd.read_csv('golf_dataset.csv')

In [None]:
print('Dataset length:', len(golf_data))
print('Data shape:', golf_data.shape)

In [None]:
golf_data.head()

In [None]:
from sklearn import preprocessing

# Fit one LabelEncoder per column so each categorical value maps to an integer.
# Note: the integer codes follow the sorted order of the labels, not any
# meaningful ranking of the categories.
le_outlook = preprocessing.LabelEncoder()
le_outlook.fit(golf_data['outlook'].unique())
le_temp = preprocessing.LabelEncoder()
le_temp.fit(golf_data['temperature'].unique())
le_hum = preprocessing.LabelEncoder()
le_hum.fit(golf_data['humidity'].unique())
le_windy = preprocessing.LabelEncoder()
le_windy.fit(golf_data['windy'].unique())
le_play = preprocessing.LabelEncoder()
le_play.fit(golf_data['play'].unique())

# Build a new DataFrame holding the integer-encoded version of every column.
le_golf_data = pd.DataFrame()

le_golf_data['outlook'] = le_outlook.transform(golf_data['outlook'])
le_golf_data['temperature'] = le_temp.transform(golf_data['temperature'])
le_golf_data['humidity'] = le_hum.transform(golf_data['humidity'])
le_golf_data['windy'] = le_windy.transform(golf_data['windy'])
le_golf_data['play'] = le_play.transform(golf_data['play'])
le_golf_data

### Gini impurity

Used by the CART (classification and regression trees) algorithm for classification trees, Gini impurity measures how often a randomly chosen element from the set would be incorrectly labeled if it were labeled at random according to the distribution of labels in the subset.

The Gini impurity can be computed by summing the probability $p_{i}$ of an item with label $i$ being chosen times the probability $\sum_{k\neq i}p_{k}=1-p_{i}$ of a mistake in categorizing that item. It reaches its minimum (zero) when all cases in the node fall into a single target category.

To compute Gini impurity for a set of items with $J$ classes, suppose $i\in \{1,2,\ldots,J\}$, and let $p_{i}$ be the fraction of items labeled with class $i$ in the set.

$\operatorname{I}_{G}(p)=\sum_{i=1}^{J}p_{i}\sum_{k\neq i}p_{k}=\sum_{i=1}^{J}p_{i}(1-p_{i})=\sum_{i=1}^{J}(p_{i}-p_{i}^{2})=\sum_{i=1}^{J}p_{i}-\sum_{i=1}^{J}p_{i}^{2}=1-\sum_{i=1}^{J}p_{i}^{2}$
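
The final form of the formula, $1-\sum_{i}p_{i}^{2}$, is straightforward to check by hand. The following is a minimal sketch, not part of the original notebook: the helper name `gini_impurity` and the two label lists are hypothetical and are not read from `golf_dataset.csv`. Treating an empty node as having impurity zero is an assumption made here for convenience.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i ** 2), where p_i is the fraction of each class in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention: an empty node is treated as pure
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# Hypothetical label sets for illustration only
pure_node = ["yes"] * 6
mixed_node = ["yes"] * 9 + ["no"] * 5

print(gini_impurity(pure_node))             # 0.0
print(round(gini_impurity(mixed_node), 4))  # 0.4592
```

A node containing a single class gives zero, matching the minimum described above, while the 9-to-5 split gives $1-(9/14)^2-(5/14)^2 = 90/196 \approx 0.4592$. For two classes the value peaks at 0.5 when the classes are evenly split.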