# Decision Tree Algorithm
 
The decision tree algorithm belongs to the family of supervised learning algorithms. Like several other supervised learning methods, it can be used to solve both regression and classification problems.

The goal of using a decision tree is to create a model that can predict the class or value of the target variable by learning simple decision rules inferred from prior data (the training data).

To predict a class label for a record, we start at the root of the tree. We compare the value of the root attribute with the record’s value for that attribute. Based on the comparison, we follow the branch corresponding to that value and move to the next node, repeating the process until we reach a leaf.
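
The traversal described above can be sketched as follows. This is a minimal illustration, assuming the tree is stored as nested dictionaries of the form `{attribute: {value: subtree_or_label}}`; the `predict` helper and the toy tree are hypothetical and are not part of the original notebook.

```python
def predict(tree, record, default=None):
    """Walk a nested-dict tree until a leaf (a non-dict value) is reached."""
    node = tree
    while isinstance(node, dict):
        # Each internal node has exactly one key: the attribute it tests
        attribute = next(iter(node))
        branches = node[attribute]
        value = record.get(attribute)
        if value not in branches:
            # Attribute value not seen in this branch: no path to follow
            return default
        node = branches[value]
    return node


# Hypothetical toy tree for illustration, not the one learned from the car data
toy_tree = {"safety": {"low": "unacc",
                       "high": {"person": {"2": "unacc", "4": "acc"}}}}

print(predict(toy_tree, {"safety": "high", "person": "4"}))  # acc
print(predict(toy_tree, {"safety": "low", "person": "4"}))   # unacc
```

Returning a `default` for unseen attribute values is one possible design choice; falling back to the majority class of the training data would be another.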

# Importing Libraries

In [26]:
# Import the required libraries
import pandas as pd
import numpy as np

# Importing Given File

In [18]:
imported_data_csv = pd.read_csv('car.data.txt',encoding = 'utf-8',header = None)
imported_data_csv.head()


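|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 0 | vhigh | vhigh | 2 | 2 | small | low | unacc |
| 1 | vhigh | vhigh | 2 | 2 | small | med | unacc |
| 2 | vhigh | vhigh | 2 | 2 | small | high | unacc |
| 3 | vhigh | vhigh | 2 | 2 | med | low | unacc |
| 4 | vhigh | vhigh | 2 | 2 | med | med | unacc |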


# Putting Imported Data in Data Frame

In [20]:
# Map the positional columns of the raw file to descriptive names
data_frame_pd = {
    'buyin': list(imported_data_csv[0]),
    'maint': list(imported_data_csv[1]),
    'doors': list(imported_data_csv[2]),
    'person': list(imported_data_csv[3]),
    'lug-boot': list(imported_data_csv[4]),
    'safety': list(imported_data_csv[5]),
    'unacc': list(imported_data_csv[6]),
}
data_frame_pd = pd.DataFrame(data_frame_pd)
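
As an alternative sketch, the column names can be supplied while reading the file, using the `names` parameter of `pd.read_csv`. The example below reads a few rows from an in-memory string instead of `car.data.txt` so that it runs on its own; the `sample` and `column_names` variables are introduced here for illustration only.

```python
import io

import pandas as pd

# A few rows in the same layout as the file; the notebook itself reads 'car.data.txt'
sample = io.StringIO(
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "vhigh,vhigh,2,2,small,med,unacc\n"
    "vhigh,vhigh,2,2,small,high,unacc\n"
)
column_names = ['buyin', 'maint', 'doors', 'person', 'lug-boot', 'safety', 'unacc']

# dtype=str keeps 'doors' and 'person' as text; the tree shown later has keys such as '5more' and 'more'
data = pd.read_csv(sample, header=None, names=column_names, dtype=str)
print(data.shape)  # (3, 7)
print(data.head())
```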

In [21]:
data_frame_pd.shape

(1728, 7)

# Entropy and Information Gain

Information gain calculates the reduction in entropy, or surprise, from transforming a dataset in some way.

It is commonly used in the construction of decision trees from a training dataset: the information gain is evaluated for each variable, and the variable that maximizes it is selected. This in turn minimizes the entropy and best splits the dataset into groups for effective classification.
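
In symbols, these are the standard definitions of entropy and information gain. The notation is an assumption introduced here and does not come from the notebook: $S$ is the dataset, $C$ the set of classes, $p_c$ the fraction of records in class $c$, $A$ a candidate variable, and $S_v$ the subset of $S$ where $A = v$.

```latex
H(S) = -\sum_{c \in C} p_c \log_2 p_c
\qquad
IG(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)
```

The second term of $IG$ is the entropy that remains after the split, weighted by the size of each group, so the variable with the largest gain leaves the least entropy.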

Information gain can also be used for feature selection, by evaluating the gain of each variable in the context of the target variable. In this slightly different usage, the calculation is referred to as mutual information between the two random variables.
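
The link to mutual information can be checked numerically. The sketch below uses only the standard library and a small made-up dataset; the helper names `entropy`, `information_gain`, and `mutual_information` are illustrative and are not the notebook's functions.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(xs, ys):
    """H(Y) minus the weighted entropy of Y within each value of X."""
    n = len(xs)
    conditional = 0.0
    for x, count in Counter(xs).items():
        subset = [y for xi, y in zip(xs, ys) if xi == x]
        conditional += (count / n) * entropy(subset)
    return entropy(ys) - conditional


def mutual_information(xs, ys):
    """I(X;Y) computed directly from the joint and marginal frequencies."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


# Made-up toy data for illustration only
safety = ["low", "low", "high", "high", "med", "med"]
label = ["unacc", "unacc", "acc", "unacc", "unacc", "acc"]

print(information_gain(safety, label))
print(mutual_information(safety, label))
```

Both calls should print the same value up to floating-point rounding, since the two quantities are algebraically identical.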

In [28]:
def calculate_entropy(df_label):
    # Count how often each class occurs in the label column
    classes_variable, count_of_classes = np.unique(df_label, return_counts=True)
    # Shannon entropy in bits: -sum(p * log2(p)) over the class probabilities
    probabilities = count_of_classes / np.sum(count_of_classes)
    entropy_value = -np.sum(probabilities * np.log2(probabilities))
    return entropy_value

# What Is Information Gain?
Information Gain, or IG for short, measures the reduction in entropy or surprise by splitting a dataset according to a given value of a random variable.

A larger information gain suggests a lower entropy group or groups of samples, and hence less surprise.

You might recall that information quantifies how surprising an event is in bits. Lower probability events have more information, higher probability events have less information. Entropy quantifies how much information there is in a random variable, or more specifically its probability distribution. A skewed distribution has a low entropy, whereas a distribution where events have equal probability has a larger entropy.

In information theory, we like to describe the “surprise” of an event. Low-probability events are more surprising and therefore carry a larger amount of information. Likewise, probability distributions in which the events are equally likely are more surprising on average and have larger entropy.
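
A short numeric sketch of these statements, using only the standard library. The helper names `surprise_bits` and `entropy_bits` and the example probabilities are chosen here for illustration.

```python
from math import log2


def surprise_bits(p):
    """Information content of an event with probability p, in bits."""
    return -log2(p)


def entropy_bits(probabilities):
    """Entropy of a discrete distribution: the expected surprise, in bits."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# A rarer event carries more information
print(surprise_bits(0.5))    # 1.0
print(surprise_bits(0.125))  # 3.0

# A skewed distribution has lower entropy than a uniform one
skewed = [0.9, 0.1]
uniform = [0.5, 0.5]
print(round(entropy_bits(skewed), 3))  # 0.469
print(entropy_bits(uniform))           # 1.0
```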

In [23]:
# Define the calculate information gain function
def calculate_information_gain(dataset, feature, label):
    # Entropy of the label column before splitting
    dataset_entropy = calculate_entropy(dataset[label])
    values, feat_counts = np.unique(dataset[feature], return_counts=True)

    # Weighted average of the label entropy within each value of the feature
    weighted_feature_entropy = np.sum([
        (feat_counts[i] / np.sum(feat_counts))
        * calculate_entropy(dataset[dataset[feature] == values[i]][label])
        for i in range(len(values))
    ])
    feature_info_gain = dataset_entropy - weighted_feature_entropy
    return feature_info_gain

In [24]:
"""
Decision tree mining is a data mining technique used to build classification models.
It builds the model in the form of a tree-like structure, as its name suggests.
It is a supervised learning technique: the target result is already known for the training data.
Decision trees can be used for both categorical and numerical data.
Categorical data represent attributes such as gender or marital status, while numerical data represent attributes such as age or temperature.
"""
def create_decision_tree(dataset, df, features, label, parent):
    # Class counts in the full data (fallback) and in the current subset
    full_classes, full_counts = np.unique(df[label], return_counts=True)
    classes, counts = np.unique(dataset[label], return_counts=True)

    # No rows left: fall back to the majority class of the full data
    if len(dataset) == 0:
        return full_classes[np.argmax(full_counts)]

    # All rows share one class: return it as a leaf
    elif len(classes) <= 1:
        return classes[0]

    # No features left to split on: return the majority class of the parent node
    elif len(features) == 0:
        return parent

    else:
        # Majority class of the current subset, passed down as the fallback
        parent = classes[np.argmax(counts)]

        # Pick the feature with the highest information gain
        item_values = [calculate_information_gain(dataset, feature, label) for feature in features]
        optimum_feature = features[np.argmax(item_values)]
        decision_tree = {optimum_feature: {}}

        # Do not reuse the chosen feature further down this branch
        remaining_features = [f for f in features if f != optimum_feature]

        for value in np.unique(dataset[optimum_feature]):
            min_data = dataset[dataset[optimum_feature] == value]
            min_tree = create_decision_tree(min_data, df, remaining_features, label, parent)
            decision_tree[optimum_feature][value] = min_tree

        return decision_tree


features = list(data_frame_pd.columns[:-1])
label = 'unacc'
parent = None
decision_tree = create_decision_tree(data_frame_pd, data_frame_pd, features, label, parent)


In [25]:
print(decision_tree)

{'safety': {'high': {'person': {'2': 'unacc', '4': {'buyin': {'high': {'maint': {'high': 'acc', 'low': 'acc', 'med': 'acc', 'vhigh': 'unacc'}}, 'low': {'maint': {'high': {'lug-boot': {'big': 'vgood', 'med': {'doors': {'2': 'acc', '3': 'acc', '4': 'vgood', '5more': 'vgood'}}, 'small': 'acc'}}, 'low': {'lug-boot': {'big': 'vgood', 'med': {'doors': {'2': 'good', '3': 'good', '4': 'vgood', '5more': 'vgood'}}, 'small': 'good'}}, 'med': {'lug-boot': {'big': 'vgood', 'med': {'doors': {'2': 'good', '3': 'good', '4': 'vgood', '5more': 'vgood'}}, 'small': 'good'}}, 'vhigh': 'acc'}}, 'med': {'maint': {'high': 'acc', 'low': {'lug-boot': {'big': 'vgood', 'med': {'doors': {'2': 'good', '3': 'good', '4': 'vgood', '5more': 'vgood'}}, 'small': 'good'}}, 'med': {'lug-boot': {'big': 'vgood', 'med': {'doors': {'2': 'acc', '3': 'acc', '4': 'vgood', '5more': 'vgood'}}, 'small': 'acc'}}, 'vhigh': 'acc'}}, 'vhigh': {'maint': {'high': 'unacc', 'low': 'acc', 'med': 'acc', 'vhigh': 'unacc'}}}}, 'more': {'buyin':