<a href="https://colab.research.google.com/github/13286733-u/UTS_ML_2019/blob/master/A2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction



The decision tree is one of the best-known machine learning algorithms. As the name suggests, it uses a tree-like model to reach specific outcomes or decisions. Decision tree learning is also a “greedy” algorithm, which means that it makes the locally optimal choice at each stage rather than searching for the best overall tree. Although the algorithm is useful for both classification and regression, this report focuses on the implementation of a classification decision tree. In this report, I cover a basic implementation of the ID3 decision tree, also known as the “Iterative Dichotomiser 3” decision tree. This type of decision tree was invented by Ross Quinlan and is the predecessor of the C4.5 decision tree. When the ID3 decision tree in this report is used as a classifier, input data can be either numerical or categorical (numerical values are treated as discrete categories), and the outcome is a category or Boolean value.






# Exploration


Although I have previous programming experience, I was not entirely prepared to implement a machine learning algorithm from scratch. All of that experience was web or app-based, which was of little help for this project, so the first challenge was the language itself: to implement the ID3 algorithm correctly, I had to work through the Python documentation and teach myself as the project went on. The second challenge was the implementation itself. Even though I had a rough understanding of how the ID3 decision tree works, it was not straightforward to turn that understanding into code.
 
To properly implement the ID3 decision tree, two metrics would need to be calculated and utilized: Entropy and Information Gain. 

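Entropy, a concept borrowed from thermodynamics, is used in information theory to measure the impurity (uncertainty) of a set of values: it is 0 when every value is the same and largest when the values are evenly split. It is calculated with the formula below.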

![Wikipedia Entropy Description](https://miro.medium.com/max/3104/1*EoWJ8bxc-iqBS-dF-XxsBA.jpeg)

To visualize this, we can use the target "playing golf" in the figure below.

![Entropy Formula In Action](https://www.saedsayad.com/images/Entropy_3.png)
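As a rough numerical check of the entropy formula, the sketch below computes the entropy of a two-class target directly from class counts. The helper name `entropy_from_counts` is hypothetical, and the counts (9 "yes" rows and 5 "no" rows) are an assumption taken from the `Play` column of the mock data set used later in this report.

```python
import math

def entropy_from_counts(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    entropy = 0.0
    for count in counts:
        if count == 0:
            continue  # a class with no rows contributes nothing
        p = count / total
        entropy -= p * math.log2(p)
    return entropy

# Assumed counts: 9 "yes" rows and 5 "no" rows
golf_entropy = entropy_from_counts([9, 5])
print(round(golf_entropy, 3))  # 0.94
```

An evenly split target such as `[7, 7]` gives the maximum of 1 bit, while a pure target such as `[14, 0]` gives 0.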

Information Gain, derived from entropy calculations, measures how much the entropy of the target is reduced once the value of another attribute is known.

![Wikipedia Information Gain](https://miro.medium.com/max/3162/1*wQjVzx7zCVb87htqk46vUA.jpeg)
Source: Wikipedia
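To make the definition concrete, here is a small sketch that computes information gain from class counts alone. The function names are hypothetical, and the counts are an assumption read off the mock data set used later in this report: 9 "yes" and 5 "no" rows overall, split by `Outlook` into Wet (2 yes, 3 no), Sunny (4 yes, 0 no) and Cloudy (3 yes, 2 no).

```python
import math

def entropy_from_counts(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, groups):
    """parent_counts: class counts before the split.
    groups: class counts inside each branch after the split."""
    total = sum(parent_counts)
    # Weight each branch's entropy by the share of rows that fall into it
    weighted = sum(sum(g) / total * entropy_from_counts(g) for g in groups)
    return entropy_from_counts(parent_counts) - weighted

# [yes, no] counts per Outlook value: Wet, Sunny, Cloudy (assumed counts)
outlook_groups = [[2, 3], [4, 0], [3, 2]]
gain = information_gain([9, 5], outlook_groups)
print(round(gain, 3))  # 0.247
```

The pure Sunny branch contributes zero entropy, which is why splitting on this attribute removes a noticeable share of the uncertainty.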

Information gain needs to be calculated for every attribute, and the attribute with the highest gain is chosen as the split at each node of the decision tree.


![Entropy with two attributes](https://www.saedsayad.com/images/Entropy_2.png)


 


# Methodology

To assist in the implementation, I created a mock data set, shown in the screenshot below. The iris data set has also been loaded for evaluation purposes.

![Test Data Set](https://i.imgur.com/fixdTQ7.png)

Throughout this report there are code snippets that have been altered to run properly on Google Colab. The full script is also attached to this report and available on Git.

To start, the data needs to be loaded into an array. In real-world use, certain variables (the data set, the attributes and the target) need to be changed before running the algorithm.



In [0]:
# Imports and Setup
# Please run this once before everything else

import math
import os

In [133]:


# This is the same data as the picture above
# In the real program this would be loaded via csv

'''loaded_data = [ 
    [ 'Outlook', 'Temp', 'Humidity', 'Windy', 'Play' ],
    [ 'Wet', 'Hot', 'High', False, 'NO' ],
    [ 'Wet', 'Hot', 'High', True, 'NO' ],
    [ 'Sunny', 'Hot', 'High', False, 'YES' ],
    [ 'Cloudy', 'Mild', 'High', False, 'YES' ],
    [ 'Cloudy', 'Cool', 'Normal', False, 'YES' ],
    [ 'Cloudy', 'Cool', 'Normal', True, 'NO' ],
    [ 'Sunny', 'Cool', 'Normal', True, 'YES' ],
    [ 'Wet', 'Mild', 'High', False, 'NO' ],
    [ 'Wet', 'Cool', 'Normal', False, 'YES' ],
    [ 'Cloudy', 'Mild', 'Normal', False, 'YES' ],
    [ 'Wet', 'Mild', 'Normal', True, 'YES' ],
    [ 'Sunny', 'Mild', 'High', True, 'YES' ],
    [ 'Sunny', 'Hot', 'Normal', False, 'YES' ],
    [ 'Cloudy', 'Mild', 'High', True, 'NO' ] 
]'''

# THE IRIS DATA BELOW IS FOR EVALUATION PURPOSES ONLY.
# COMMENT IT OUT (WRAP IT IN ''') FOR OTHER USES.
# IF YOU COMMENT IT OUT YOU WILL ALSO NEED TO CHANGE
# THE ATTRIBUTES AND TARGET
loaded_data = [
[ 'sepal_length','sepal_width','petal_length','petal_width','species' ],
[ 5.1, 3.5, 1.4, 0.2, 'setosa' ],
[ 4.9, 3, 1.4, 0.2, 'setosa' ],
[ 4.7, 3.2, 1.3, 0.2, 'setosa' ],
[ 4.6, 3.1, 1.5, 0.2, 'setosa' ],
[ 5, 3.6, 1.4, 0.2, 'setosa' ],
[ 5.4, 3.9, 1.7, 0.4, 'setosa' ],
[ 4.6, 3.4, 1.4, 0.3, 'setosa' ],
[ 5, 3.4, 1.5, 0.2, 'setosa' ],
[ 4.4, 2.9, 1.4, 0.2, 'setosa' ],
[ 4.9, 3.1, 1.5, 0.1, 'setosa' ],
[ 5.4, 3.7, 1.5, 0.2, 'setosa' ],
[ 4.8, 3.4, 1.6, 0.2, 'setosa' ],
[ 4.8, 3, 1.4, 0.1, 'setosa' ],
[ 4.3, 3, 1.1, 0.1, 'setosa' ],
[ 5.8, 4, 1.2, 0.2, 'setosa' ],
[ 5.7, 4.4, 1.5, 0.4, 'setosa' ],
[ 5.4, 3.9, 1.3, 0.4, 'setosa' ],
[ 5.1, 3.5, 1.4, 0.3, 'setosa' ],
[ 5.7, 3.8, 1.7, 0.3, 'setosa' ],
[ 5.1, 3.8, 1.5, 0.3, 'setosa' ],
[ 5.4, 3.4, 1.7, 0.2, 'setosa' ],
[ 5.1, 3.7, 1.5, 0.4, 'setosa' ],
[ 4.6, 3.6, 1, 0.2, 'setosa' ],
[ 5.1, 3.3, 1.7, 0.5, 'setosa' ],
[ 4.8, 3.4, 1.9, 0.2, 'setosa' ],
[ 5, 3, 1.6, 0.2, 'setosa' ],
[ 5, 3.4, 1.6, 0.4, 'setosa' ],
[ 5.2, 3.5, 1.5, 0.2, 'setosa' ],
[ 5.2, 3.4, 1.4, 0.2, 'setosa' ],
[ 4.7, 3.2, 1.6, 0.2, 'setosa' ],
[ 4.8, 3.1, 1.6, 0.2, 'setosa' ],
[ 5.4, 3.4, 1.5, 0.4, 'setosa' ],
[ 5.2, 4.1, 1.5, 0.1, 'setosa' ],
[ 5.5, 4.2, 1.4, 0.2, 'setosa' ],
[ 4.9, 3.1, 1.5, 0.1, 'setosa' ],
[ 5, 3.2, 1.2, 0.2, 'setosa' ],
[ 5.5, 3.5, 1.3, 0.2, 'setosa' ],
[ 4.9, 3.1, 1.5, 0.1, 'setosa' ],
[ 4.4, 3, 1.3, 0.2, 'setosa' ],
[ 5.1, 3.4, 1.5, 0.2, 'setosa' ],
[ 5, 3.5, 1.3, 0.3, 'setosa' ],
[ 4.5, 2.3, 1.3, 0.3, 'setosa' ],
[ 4.4, 3.2, 1.3, 0.2, 'setosa' ],
[ 5, 3.5, 1.6, 0.6, 'setosa' ],
[ 5.1, 3.8, 1.9, 0.4, 'setosa' ],
[ 4.8, 3, 1.4, 0.3, 'setosa' ],
[ 5.1, 3.8, 1.6, 0.2, 'setosa' ],
# [ 4.6, 3.2, 1.4, 0.2, 'setosa' ],
[ 5.3, 3.7, 1.5, 0.2, 'setosa' ],
# [ 5, 3.3, 1.4, 0.2, 'setosa' ],
[ 7, 3.2, 4.7, 1.4, 'versicolor' ],
[ 6.4, 3.2, 4.5, 1.5, 'versicolor' ],
[ 6.9, 3.1, 4.9, 1.5, 'versicolor' ],
[ 5.5, 2.3, 4, 1.3, 'versicolor' ],
[ 6.5, 2.8, 4.6, 1.5, 'versicolor' ],
[ 5.7, 2.8, 4.5, 1.3, 'versicolor' ],
[ 6.3, 3.3, 4.7, 1.6, 'versicolor' ],
[ 4.9, 2.4, 3.3, 1, 'versicolor' ],
[ 6.6, 2.9, 4.6, 1.3, 'versicolor' ],
[ 5.2, 2.7, 3.9, 1.4, 'versicolor' ],
[ 5, 2, 3.5, 1, 'versicolor' ],
[ 5.9, 3, 4.2, 1.5, 'versicolor' ],
[ 6, 2.2, 4, 1, 'versicolor' ],
[ 6.1, 2.9, 4.7, 1.4, 'versicolor' ],
[ 5.6, 2.9, 3.6, 1.3, 'versicolor' ],
[ 6.7, 3.1, 4.4, 1.4, 'versicolor' ],
[ 5.6, 3, 4.5, 1.5, 'versicolor' ],
[ 5.8, 2.7, 4.1, 1, 'versicolor' ],
[ 6.2, 2.2, 4.5, 1.5, 'versicolor' ],
[ 5.6, 2.5, 3.9, 1.1, 'versicolor' ],
[ 5.9, 3.2, 4.8, 1.8, 'versicolor' ],
[ 6.1, 2.8, 4, 1.3, 'versicolor' ],
[ 6.3, 2.5, 4.9, 1.5, 'versicolor' ],
[ 6.1, 2.8, 4.7, 1.2, 'versicolor' ],
[ 6.4, 2.9, 4.3, 1.3, 'versicolor' ],
[ 6.6, 3, 4.4, 1.4, 'versicolor' ],
[ 6.8, 2.8, 4.8, 1.4, 'versicolor' ],
[ 6.7, 3, 5, 1.7, 'versicolor' ],
[ 6, 2.9, 4.5, 1.5, 'versicolor' ],
[ 5.7, 2.6, 3.5, 1, 'versicolor' ],
[ 5.5, 2.4, 3.8, 1.1, 'versicolor' ],
[ 5.5, 2.4, 3.7, 1, 'versicolor' ],
[ 5.8, 2.7, 3.9, 1.2, 'versicolor' ],
[ 6, 2.7, 5.1, 1.6, 'versicolor' ],
[ 5.4, 3, 4.5, 1.5, 'versicolor' ],
[ 6, 3.4, 4.5, 1.6, 'versicolor' ],
[ 6.7, 3.1, 4.7, 1.5, 'versicolor' ],
[ 6.3, 2.3, 4.4, 1.3, 'versicolor' ],
[ 5.6, 3, 4.1, 1.3, 'versicolor' ],
[ 5.5, 2.5, 4, 1.3, 'versicolor' ],
[ 5.5, 2.6, 4.4, 1.2, 'versicolor' ],
[ 6.1, 3, 4.6, 1.4, 'versicolor' ],
[ 5.8, 2.6, 4, 1.2, 'versicolor' ],
[ 5, 2.3, 3.3, 1, 'versicolor' ],
[ 5.6, 2.7, 4.2, 1.3, 'versicolor' ],
[ 5.7, 3, 4.2, 1.2, 'versicolor' ],
[ 5.7, 2.9, 4.2, 1.3, 'versicolor' ],
# [ 6.2, 2.9, 4.3, 1.3, 'versicolor' ],
[ 5.1, 2.5, 3, 1.1, 'versicolor' ],
# [ 5.7, 2.8, 4.1, 1.3, 'versicolor' ],
[ 6.3, 3.3, 6, 2.5, 'virginica' ],
[ 5.8, 2.7, 5.1, 1.9, 'virginica' ],
[ 7.1, 3, 5.9, 2.1, 'virginica' ],
[ 6.3, 2.9, 5.6, 1.8, 'virginica' ],
[ 6.5, 3, 5.8, 2.2, 'virginica' ],
[ 7.6, 3, 6.6, 2.1, 'virginica' ],
[ 4.9, 2.5, 4.5, 1.7, 'virginica' ],
[ 7.3, 2.9, 6.3, 1.8, 'virginica' ],
[ 6.7, 2.5, 5.8, 1.8, 'virginica' ],
[ 7.2, 3.6, 6.1, 2.5, 'virginica' ],
[ 6.5, 3.2, 5.1, 2, 'virginica' ],
[ 6.4, 2.7, 5.3, 1.9, 'virginica' ],
[ 6.8, 3, 5.5, 2.1, 'virginica' ],
[ 5.7, 2.5, 5, 2, 'virginica' ],
[ 5.8, 2.8, 5.1, 2.4, 'virginica' ],
[ 6.4, 3.2, 5.3, 2.3, 'virginica' ],
[ 6.5, 3, 5.5, 1.8, 'virginica' ],
[ 7.7, 3.8, 6.7, 2.2, 'virginica' ],
[ 7.7, 2.6, 6.9, 2.3, 'virginica' ],
[ 6, 2.2, 5, 1.5, 'virginica' ],
[ 6.9, 3.2, 5.7, 2.3, 'virginica' ],
[ 5.6, 2.8, 4.9, 2, 'virginica' ],
[ 7.7, 2.8, 6.7, 2, 'virginica' ],
[ 6.3, 2.7, 4.9, 1.8, 'virginica' ],
[ 6.7, 3.3, 5.7, 2.1, 'virginica' ],
[ 7.2, 3.2, 6, 1.8, 'virginica' ],
[ 6.2, 2.8, 4.8, 1.8, 'virginica' ],
[ 6.1, 3, 4.9, 1.8, 'virginica' ],
[ 6.4, 2.8, 5.6, 2.1, 'virginica' ],
[ 7.2, 3, 5.8, 1.6, 'virginica' ],
[ 7.4, 2.8, 6.1, 1.9, 'virginica' ],
[ 7.9, 3.8, 6.4, 2, 'virginica' ],
[ 6.4, 2.8, 5.6, 2.2, 'virginica' ],
[ 6.3, 2.8, 5.1, 1.5, 'virginica' ],
[ 6.1, 2.6, 5.6, 1.4, 'virginica' ],
[ 7.7, 3, 6.1, 2.3, 'virginica' ],
[ 6.3, 3.4, 5.6, 2.4, 'virginica' ],
[ 6.4, 3.1, 5.5, 1.8, 'virginica' ],
[ 6, 3, 4.8, 1.8, 'virginica' ],
[ 6.9, 3.1, 5.4, 2.1, 'virginica' ],
[ 6.7, 3.1, 5.6, 2.4, 'virginica' ],
[ 6.9, 3.1, 5.1, 2.3, 'virginica' ],
[ 5.8, 2.7, 5.1, 1.9, 'virginica' ],
[ 6.8, 3.2, 5.9, 2.3, 'virginica' ],
[ 6.7, 3.3, 5.7, 2.5, 'virginica' ],
[ 6.7, 3, 5.2, 2.3, 'virginica' ],
[ 6.3, 2.5, 5, 1.9, 'virginica' ],
# [ 6.5, 3, 5.2, 2, 'virginica' ],
[ 6.2, 3.4, 5.4, 2.3, 'virginica' ],
# [ 5.9, 3, 5.1, 1.8, 'virginica' ]
]
# Notify user of how many rows have been loaded
print("Successfully loaded {0} rows!".format(len(loaded_data)))

Successfully loaded 151 rows!
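The comments above mention that a real program would load the data from a CSV file. A minimal sketch of that step, using only the standard `csv` module, might look like the following. The helper names `parse_cell` and `load_rows` are hypothetical, the file name in the comment is an assumption, and an in-memory string stands in for the file so the snippet runs on its own.

```python
import csv
import io

def parse_cell(text):
    """Turn a CSV cell into a float when possible, otherwise keep the string."""
    try:
        return float(text)
    except ValueError:
        return text

def load_rows(file_obj):
    """Read CSV text into the same list-of-lists layout as loaded_data."""
    reader = csv.reader(file_obj)
    header = next(reader)  # the first row holds the column titles
    rows = [[parse_cell(cell) for cell in row] for row in reader if row]
    return [header] + rows

# In-memory stand-in for a file such as open("iris.csv", newline="")
sample = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)
csv_data = load_rows(sample)
print("Successfully loaded {0} rows!".format(len(csv_data) - 1))
```

Note that this sketch only converts numbers; a Boolean column such as `Windy` would stay as the strings `'True'` and `'False'` unless it is converted separately.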


After loading the dataset into our script, it is also necessary to configure it properly to train on the right attributes for the right target.

In [134]:
# Define the attributes to be used for testing (this provides
# extra freedom to the user instead of editing their dataset
# many times - and also if there are any IDs)

# In this scenario we want to use all our attributes
#attributes = [ 'Outlook', 'Temp', 'Humidity', 'Windy', 'Play' ]
attributes = ['sepal_length','sepal_width','petal_length','petal_width','species']

# We also need to define the class for the script to target
#target = "Play"
target = "species"

print ("There are {0} attributes to train on and we are targeting \"{1}\"".format(len(attributes), target))

There are 5 attributes to train on and we are targeting "species"


Now with the updated configuration, our script needs to apply the necessary changes to our datasets to ensure the right columns and rows are used.

In [135]:
# In our example dataset, the first row has no actual data;
# instead it is used to title each column. Let's separate
# them into two different objects for use later

header_row = loaded_data[0]
training_rows = loaded_data[1:]

# I come from a web developer background so using a JSON-like
# object (a Python dictionary) seems more familiar to me
jsond_rows = []
for i in range(0, len(training_rows)):
    json_data = {}
    row = training_rows[i]
    for j in range(0, len(header_row)):
        json_data[header_row[j]] = row[j]

    jsond_rows.append(json_data)

# Now we need to remove any columns that won't be used by the
# algorithm when training, there's probably a really easy
# way to map this properly but I haven't gone through the 
# documentation thoroughly enough
for i in range(0, len(jsond_rows)):
    json_data = {}
    row = jsond_rows[i]
    for j in range(0, len(attributes)):
        json_data[attributes[j]] = row[attributes[j]]

    jsond_rows[i] = json_data

# Remove the target from the attributes so we don't get mixed up later
if target in attributes:
    attributes.remove(target)   

print ("We've filtered out any unused attributes!")

We've filtered out any unused attributes!
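The code comment above wonders whether there is an easier way to do this mapping. One possible shortcut, sketched here under the assumption that every row has the same length as the header, combines `zip` with a dictionary comprehension to build and filter the dictionaries in one pass. The function name `rows_to_dicts` and the tiny example rows are hypothetical.

```python
def rows_to_dicts(header, rows, wanted):
    """Build one dict per row, keeping only the wanted columns."""
    return [
        {name: value for name, value in zip(header, row) if name in wanted}
        for row in rows
    ]

example_header = ['Outlook', 'Temp', 'Play']
example_rows = [['Wet', 'Hot', 'NO'], ['Sunny', 'Hot', 'YES']]
example_dicts = rows_to_dicts(example_header, example_rows, ['Outlook', 'Play'])
print(example_dicts)
# [{'Outlook': 'Wet', 'Play': 'NO'}, {'Outlook': 'Sunny', 'Play': 'YES'}]
```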


We also need to define a few utility functions so that we do not have to repeat the same code.

In [0]:
def getUniqueValues(data, attributeName):
    # Get all the possible values for our attribute
    attributeValues = []
    for row in data:
        val = row[attributeName]
        if val not in attributeValues:
            attributeValues.append(val)

    return attributeValues

# https://www.geeksforgeeks.org/python-find-most-frequent-element-in-a-list/
def most_frequent(List):
    # "counts" avoids shadowing the built-in dict type
    counts = {}
    count, itm = 0, ''
    for item in reversed(List):
        counts[item] = counts.get(item, 0) + 1
        if counts[item] >= count:
            count, itm = counts[item], item
    return itm
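As an alternative to counting by hand, the standard library's `collections.Counter` can find the most frequent value. This is only a sketch: the name `most_common_value` is hypothetical, and when several values are tied it may pick a different one than the function above.

```python
from collections import Counter

def most_common_value(values):
    """Return the value that occurs most often in a non-empty list."""
    return Counter(values).most_common(1)[0][0]

print(most_common_value(['YES', 'NO', 'YES', 'YES', 'NO']))  # YES
```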

Next we need to define the functions used to calculate entropy and information gain.

Entropy can be calculated with the formula above from the number of occurrences of each value within its column, as seen in the code snippet below.

In the code snippet, the function calculates the proportion of rows taken by each value in the given column (the target column, when it is called from the information gain function) and then uses those proportions to calculate the entropy.

In [0]:
# To calculate entropy we basically
# need to count each value's occurrences
# within its own column/class.
# The data object is still the whole set
# and the attribute name is the attribute
# for which we are calculating entropy

def calculateEntropy(data, attributeName):
    # Used as the denominator in the fraction
    numberOfRows = len(data)

    # This small patch of code basically counts
    # the occurrences of a certain value within
    # that column
    occurrs = {}
    for row in data: 
        val = row[attributeName]

        if val not in occurrs:
            occurrs[val] = 0
    
        occurrs[val] += 1

    entropy = 0
    for val in occurrs:
        # Calculates proportions
        # float() keeps this a true division (under Python 2, int / int would give 0)
        p = occurrs[val] / float(numberOfRows)
        entropy += - p * math.log(p, 2)

    return entropy

Information Gain, on the other hand, is a bit more complex and requires multiple entropy calculations. The rows are grouped by each value of another attribute, and the entropy of the target is calculated within each group. The weighted sum of these entropies is then subtracted from the overall entropy of the target.

In [0]:
# Pass the data, the target and the name of the attribute
# for which you want to calculate the information gain
def calculateIGain(data, target, attributeName):
    # Used as the denominator in the fraction
    numberOfRows = len(data)

    # We need the target's entropy to calculate info gain
    targetEntropy = calculateEntropy(data, target)

    # Get all the possible values for our attribute
    attributeValues = getUniqueValues(data, attributeName)

    # We want to group each attribute 
    # and then calculate their entropies
    sumOfEntropies = 0
    for attribute in attributeValues:
        # This function is used to group rows together based on the
        # value of an attribute
        def isGroup(row):
            if row[attributeName] == attribute:
                return True
            else:
                return False

        # Groups the rows
        matchedRows = list(filter(isGroup, data))

        entropy = len(matchedRows) / float(numberOfRows) * calculateEntropy(matchedRows, target)
        sumOfEntropies += entropy

    return targetEntropy - sumOfEntropies

Because the tree always splits on the attribute with the highest information gain, it is useful to calculate the information gain of every remaining attribute first and then pick the largest one before creating a node.

In [0]:
# This function gets the attribute with the biggest information gain
def getBiggestInfoGain(infoGain):
    attributeName = None
    biggestGain = None

    for attribute in infoGain.keys():
        if biggestGain is None or biggestGain < infoGain[attribute]:
            biggestGain = infoGain[attribute]
            attributeName = attribute 

    return attributeName

def getInfoGains(data, target, attributes):
    # Let's find out how much information gain each attribute has
    informationGain = {}
    for attr in attributes:
        # Skip the target attr
        if attr == target:
            continue
        iGain = calculateIGain(data, target, attr)
        informationGain[attr] = iGain
    
    return informationGain
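Picking the attribute with the largest gain can also be written with the built-in `max` and a `key` function. The sketch below uses a hypothetical name, `best_attribute`, and made-up gain values purely for illustration. Like the loop above, it keeps the first attribute when two gains are equal.

```python
def best_attribute(info_gains):
    """Return the key with the largest value, or None for an empty dict."""
    if not info_gains:
        return None
    return max(info_gains, key=info_gains.get)

# Made-up gains, only to show the call
example_gains = {'a': 0.1, 'b': 0.5, 'c': 0.3}
print(best_attribute(example_gains))  # b
```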

Now we can actually get on to creating the decision tree. Each node of our decision tree has up to three keys: attributeName, children and result. The attribute name is the attribute being tested at that node, the children are the sub-trees for each value of that attribute, and the result appears at a leaf, where the tree ends.

To create the decision tree, our script needs to filter the data at every stage, narrowing it like a funnel until an answer can be obtained. This requires many recursive calls of the createTree function, as seen below.

In [140]:
def createTree(data, target, attributes):
    # Create the tree object (this could also be a child)
    tree = {}

    # Get all the possible values for our target
    targetValues = getUniqueValues(data, target)

    # If there is only one possible value we can
    # end the tree
    if len(targetValues) == 1:
        tree['result'] = targetValues[0]
        return tree

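    # If there are no attributes left to train with,
    # use the most common target value among the remaining rows
    # (targetValues only holds unique values, so count over the rows)
    if len(attributes) == 0:
        tree['result'] = most_frequent([row[target] for row in data])
        return tree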

    # Calculate info gains for all the attributes
    infoGains = getInfoGains(data, target, attributes)
    # Get the biggest one 
    biggestInfoGain = getBiggestInfoGain(infoGains)

    # Set the attribute being tested at this stage
    tree['attributeName'] = biggestInfoGain
    tree['children'] = {}

    # Remove the attribute now
    updatedAttributes = attributes.copy() # using copy doesn't change the attributes variable in the global
    updatedAttributes.remove(biggestInfoGain)

    # Get all the possible values of the biggest info attribute
    possibleValues = getUniqueValues(data, biggestInfoGain)

    # For every possible value we need to create a branch 
    # with a more filtered data set and the remaining attributes
    for value in possibleValues:
        filteredData = []
        for row in data:
            if row[biggestInfoGain] == value:
                filteredData.append(row)

        tree['children'][value] = createTree(filteredData, target, updatedAttributes)

    return tree

decisionTree = createTree(jsond_rows, target, attributes)

print (decisionTree)

{'attributeName': 'petal_length', 'children': {1.4: {'result': 'setosa'}, 1.3: {'result': 'setosa'}, 1.5: {'result': 'setosa'}, 1.7: {'result': 'setosa'}, 1.6: {'result': 'setosa'}, 1.1: {'result': 'setosa'}, 1.2: {'result': 'setosa'}, 1: {'result': 'setosa'}, 1.9: {'result': 'setosa'}, 4.7: {'result': 'versicolor'}, 4.5: {'attributeName': 'sepal_length', 'children': {6.4: {'result': 'versicolor'}, 5.7: {'result': 'versicolor'}, 5.6: {'result': 'versicolor'}, 6.2: {'result': 'versicolor'}, 6: {'result': 'versicolor'}, 5.4: {'result': 'versicolor'}, 4.9: {'result': 'virginica'}}}, 4.9: {'attributeName': 'sepal_width', 'children': {3.1: {'result': 'versicolor'}, 2.5: {'result': 'versicolor'}, 2.8: {'result': 'virginica'}, 2.7: {'result': 'virginica'}, 3: {'result': 'virginica'}}}, 4: {'result': 'versicolor'}, 4.6: {'result': 'versicolor'}, 3.3: {'result': 'versicolor'}, 3.9: {'result': 'versicolor'}, 3.5: {'result': 'versicolor'}, 4.2: {'result': 'versicolor'}, 3.6: {'result': 'versicolo
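The printed dictionary is hard to read once the tree grows. A small recursive helper can turn the nested structure into indented text. This is a sketch only: `format_tree` is a hypothetical name, and the example tree is written by hand in the same `attributeName` / `children` / `result` layout rather than produced by `createTree`.

```python
def format_tree(tree, indent=0):
    """Return the tree as a list of indented text lines."""
    pad = '  ' * indent
    if 'result' in tree:
        # Leaf node: show the predicted value
        return [pad + '-> ' + str(tree['result'])]
    lines = []
    for value, child in tree['children'].items():
        lines.append('{0}{1} = {2}'.format(pad, tree['attributeName'], value))
        lines.extend(format_tree(child, indent + 1))
    return lines

# Hand-written example tree (not output of the algorithm)
example_tree = {
    'attributeName': 'Outlook',
    'children': {
        'Sunny': {'result': 'YES'},
        'Wet': {
            'attributeName': 'Humidity',
            'children': {'High': {'result': 'NO'}, 'Normal': {'result': 'YES'}},
        },
    },
}
print('\n'.join(format_tree(example_tree)))
```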

# Evaluation

As I was still new to Python, this algorithm contains a lot of shortcuts and bad coding practices. Ultimately the algorithm does build an ID3 decision tree model; however, it has yet to be tested.

To evaluate the performance of our algorithm, we will use the iris dataset that was loaded earlier, together with a new function that predicts individual results. The six rows that were commented out of the training data above serve as the test set.

In [138]:
def predict(data):
    tree = decisionTree
    while 'result' not in tree:
        child = data[tree['attributeName']]
        tree = tree['children'][child]

    return tree['result']

# This test data set will only work if you are using the iris dataset
# [ 'sepal_length','sepal_width','petal_length','petal_width','species' ],
test_data = [
    { 'sepal_length': 4.6, 'sepal_width': 3.2, 'petal_length': 1.4, 'petal_width': 0.2 }, # Setosa
    { 'sepal_length': 5, 'sepal_width': 3.3, 'petal_length': 1.4, 'petal_width': 0.2 }, # Setosa
    { 'sepal_length': 6.2, 'sepal_width': 2.9, 'petal_length': 4.3, 'petal_width': 1.3 }, # versicolor
    { 'sepal_length': 5.7, 'sepal_width': 2.8, 'petal_length': 4.1, 'petal_width': 1.3 }, # versicolor
    { 'sepal_length': 6.5, 'sepal_width': 3, 'petal_length': 5.2, 'petal_width': 2 }, # virginica
    { 'sepal_length': 5.9, 'sepal_width': 3, 'petal_length': 5.1, 'petal_width': 1.8 }, # virginica
]

for row in test_data:
    prediction = predict(row)
    print (row, prediction)


{'sepal_length': 4.6, 'sepal_width': 3.2, 'petal_length': 1.4, 'petal_width': 0.2} setosa
{'sepal_length': 5, 'sepal_width': 3.3, 'petal_length': 1.4, 'petal_width': 0.2} setosa
{'sepal_length': 6.2, 'sepal_width': 2.9, 'petal_length': 4.3, 'petal_width': 1.3} versicolor
{'sepal_length': 5.7, 'sepal_width': 2.8, 'petal_length': 4.1, 'petal_width': 1.3} versicolor
{'sepal_length': 6.5, 'sepal_width': 3, 'petal_length': 5.2, 'petal_width': 2} virginica
{'sepal_length': 5.9, 'sepal_width': 3, 'petal_length': 5.1, 'petal_width': 1.8} virginica


Although the testing was not extensive, the results above suggest that the model works as expected: all six held-out rows were classified correctly. Note, however, that because every distinct numeric value becomes its own branch, predict raises a KeyError for a row containing a value that was never seen during training. In terms of performance, converting the data into JSON-like objects (Python dictionaries) costs extra computation, especially for bigger datasets, so it would be more efficient to stick with the array format, especially since that is the default format when importing data from CSV files.
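A slightly more defensive way to evaluate the tree is sketched below: the prediction function returns a default instead of failing when a value has no matching branch, and a second helper reports the fraction of correct predictions. The names `predict_with_default` and `accuracy`, the toy tree and the labelled rows are all hypothetical and only follow the tree layout used in this report.

```python
def predict_with_default(tree, row, default=None):
    """Walk the tree; return default when a value has no matching branch."""
    node = tree
    while 'result' not in node:
        value = row[node['attributeName']]
        if value not in node['children']:
            return default
        node = node['children'][value]
    return node['result']

def accuracy(tree, rows, target):
    """Fraction of rows whose prediction equals the value stored under target."""
    if not rows:
        return 0.0
    correct = sum(
        1 for row in rows if predict_with_default(tree, row) == row[target]
    )
    return correct / len(rows)

# Hand-written toy tree and rows, for illustration only
toy_tree = {
    'attributeName': 'petal_length',
    'children': {1.4: {'result': 'setosa'}, 4.7: {'result': 'versicolor'}},
}
labelled_rows = [
    {'petal_length': 1.4, 'species': 'setosa'},
    {'petal_length': 4.7, 'species': 'versicolor'},
    {'petal_length': 9.9, 'species': 'virginica'},  # value never seen in the tree
]
print(accuracy(toy_tree, labelled_rows, 'species'))
```

Here the third row counts as a miss because its `petal_length` has no branch, so two of the three rows are predicted correctly.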

# Conclusion

To conclude, the ID3 decision tree is a lot easier to comprehend once it has been built out in code. In this form, it is simple to understand how the tree is created as well as the formulas used. To improve this script, using more libraries would help significantly, as many of the functions above could easily have been replaced by better-optimized versions. Datasets could also have been imported directly from libraries like scikit-learn instead of being hard-coded. Overall, it has been a great experience to learn Python and gain hands-on experience with the ID3 algorithm.

# Ethics

The ID3 algorithm is an incredibly useful and powerful tool when used correctly. With the wrong type of data, however, the resulting model can be skewed and biased, which may be detrimental depending on its use case. One example is feeding the algorithm disproportionate crime data: the model would inherit the skew and predict outcomes incorrectly. Also, as noted in the code above, when there is not enough information the ID3 algorithm falls back on the most common outcome, which could be a fatal decision in a high-stakes setting. Despite these disadvantages, these hurdles must be tested and overcome for a future with better machine learning algorithms and artificial intelligence for the greater good.

# Links

https://www.youtube.com/watch?v=HeMYgsp6Cqo&feature=youtu.be

https://github.com/13286733-u/MLA2

https://colab.research.google.com/drive/16E_NCBxcmSxP-ya0vR4sB5Icc93C_fnQ
