# DECISION TREES
## How to build a decision tree in Python
First, the libraries the code depends on are imported.
The data file zoo.csv is then loaded with explicit column names, and the first column (animal_name) is dropped so that only the predictor variables and the target column class remain.


In [None]:
import pandas as pd
import numpy as np
from pprint import pprint

dataset = pd.read_csv('/Documents/zoo.csv',
                      names=['animal_name','hair','feathers','eggs','milk',
                            'airborne','aquatic','predator','toothed','backbone',
                            'breathes','venomous','fins','legs','tail','domestic',
                            'catsize','class',]).drop('animal_name',axis=1)


The entropy function computes the entropy of the target column.

**Input:**
- target: the target variable, i.e. the column of class labels.

In [4]:
def entropy(target):
    
    # Count how often each distinct class label occurs
    elements,counts = np.unique(target,return_counts = True)
    # Turn the counts into probabilities and apply H = sum(-p * log2(p))
    probabilities = counts/np.sum(counts)
    return np.sum(-probabilities*np.log2(probabilities))
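
As a quick sanity check on the definition, the sketch below recomputes Shannon entropy, $H = -\sum_i p_i \log_2 p_i$, using only the standard library. The helper name `shannon_entropy` and the toy label lists are illustrative assumptions, not part of the notebook.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Entropy in bits of a sequence of class labels (hypothetical helper)."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over the distinct labels
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

balanced = shannon_entropy([0, 0, 1, 1])   # two classes, evenly split
pure = shannon_entropy([1, 1, 1, 1])       # a single class
skewed = shannon_entropy([0, 1, 1, 1])     # one class dominates
```

An evenly split binary target gives 1 bit, a pure target gives 0, and the skewed case falls between those two bounds.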

### Information gain algorithm
The InfoGain function computes the information gain obtained by splitting the dataset on a given attribute.

**Inputs:**
- data: the dataset.

- split_attribute_name: the name of the attribute whose information gain is computed.

- target_name: the name of the target variable (defaults to "class").

In [None]:
def InfoGain(data,split_attribute_name,target_name="class"):
    
    # Entropy of the target column over the whole dataset
    total_entropy = entropy(data[target_name])
    
    # Distinct values of the split attribute and how often each occurs
    vals,counts= np.unique(data[split_attribute_name],return_counts=True)
    
    # Weighted entropy: the entropy of each subset, weighted by the subset's share of the rows.
    # Boolean indexing selects each subset without the float cast and NaN-row loss of where().dropna()
    Weighted_Entropy = np.sum([(counts[i]/np.sum(counts))*entropy(data[data[split_attribute_name]==vals[i]][target_name]) for i in range(len(vals))])
    
    # Information gain: the reduction in entropy achieved by the split
    Information_Gain = total_entropy - Weighted_Entropy
    return Information_Gain
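
To make the calculation concrete, here is a dependency-free sketch of the same idea: the entropy of the target minus the weighted entropy of the subsets produced by a split. The helpers `shannon_entropy` and `information_gain` and the four toy rows are made up for illustration and are not taken from zoo.csv.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Entropy in bits of a sequence of class labels (hypothetical helper)."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target="class"):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    total_entropy = shannon_entropy([row[target] for row in rows])
    # Group the target labels by the value of the split attribute
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weight each subset's entropy by its share of the rows
    weighted = sum(len(g) / len(rows) * shannon_entropy(g) for g in groups.values())
    return total_entropy - weighted

# Toy rows: "feathers" separates the two classes perfectly, "tail" not at all
rows = [
    {"feathers": 1, "tail": 1, "class": "bird"},
    {"feathers": 1, "tail": 0, "class": "bird"},
    {"feathers": 0, "tail": 1, "class": "mammal"},
    {"feathers": 0, "tail": 0, "class": "mammal"},
]
gain_feathers = information_gain(rows, "feathers")
gain_tail = information_gain(rows, "tail")
```

On this toy data a perfectly separating attribute recovers the full 1 bit of target entropy, while an uninformative one yields a gain of 0, which is why ID3 prefers the attribute with the largest gain.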

### ID3 tree algorithm
This function implements the ID3 decision tree algorithm shown in **Figure 1**.
![image.png](attachment:image.png) <center>**Figure 1.** The decision tree learning algorithm. [1]</center>

**Inputs:**
- data: the data the algorithm operates on at the current node.

- originaldata: the original dataset, needed to compute the mode of the target feature when the current subset is empty.

- features: the feature space of the dataset. It is needed for the recursive call because the feature chosen at each node is removed from it as the tree grows.

- target_attribute_name: the name of the target variable.

- parent_node_class: the mode of the target feature at the parent node, returned when no features are left to split on.

In [None]:
def ID3(data,originaldata,features,target_attribute_name="class",parent_node_class = None):
    #Define the stopping criteria --> If one of these is satisfied, we want to return a leaf node#
    
    #If the dataset is empty, return the mode target feature value in the original dataset.
    #This check comes first because indexing the unique values of an empty dataset would raise an IndexError
    if len(data)==0:
        return np.unique(originaldata[target_attribute_name])[np.argmax(np.unique(originaldata[target_attribute_name],return_counts=True)[1])]
    
    #If all target_values have the same value, return this value
    elif len(np.unique(data[target_attribute_name])) <= 1:
        return np.unique(data[target_attribute_name])[0]
    
    #If the feature space is empty, return the mode target feature value of the direct parent node --> Note that
    #the direct parent node is that node which has called the current run of the ID3 algorithm and hence
    #the mode target feature value is stored in the parent_node_class variable.
    
    elif len(features) ==0:
        return parent_node_class
    
    #If none of the above holds true, grow the tree!
    
    else:
        #Set the default value for this node --> The mode target feature value of the current node
        parent_node_class = np.unique(data[target_attribute_name])[np.argmax(np.unique(data[target_attribute_name],return_counts=True)[1])]
        
        #Select the feature which best splits the dataset
        item_values = [InfoGain(data,feature,target_attribute_name) for feature in features] #Return the information gain values for the features in the dataset
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]
        
        #Create the tree structure. The root gets the name of the feature (best_feature) with the maximum information
        #gain in the first run
        tree = {best_feature:{}}
        
        
        #Remove the feature with the best information gain from the feature space
        features = [i for i in features if i != best_feature]
        
        #Grow a branch under the root node for each possible value of the root node feature
        
        for value in np.unique(data[best_feature]):
            #Split the dataset along the value of the feature with the largest information gain and therewith create sub_datasets
            sub_data = data[data[best_feature] == value]
            
            #Call the ID3 algorithm for each of those sub_datasets with the new parameters --> Here the recursion comes in!
            #Pass originaldata through (not the global dataset) so the function works on any DataFrame
            subtree = ID3(sub_data,originaldata,features,target_attribute_name,parent_node_class)
            
            #Add the sub tree, grown from the sub_dataset to the tree under the root node
            tree[best_feature][value] = subtree
            
        return tree
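
ID3 returns the tree as nested dictionaries of the form `{feature: {value: subtree_or_leaf}}`. The sketch below walks such a structure to classify one record. The `classify` helper, its `default` argument, and the hand-written `toy_tree` are hypothetical; they only mirror the shape built above and are not output from the zoo data.

```python
def classify(tree, record, default=None):
    """Follow the branches of a nested-dict tree until a leaf is reached (hypothetical helper)."""
    # Anything that is not a dict is a leaf, i.e. a class label
    while isinstance(tree, dict):
        feature = next(iter(tree))      # each internal node has exactly one feature key
        branches = tree[feature]
        value = record.get(feature)
        if value not in branches:       # value not covered by any branch of this node
            return default
        tree = branches[value]
    return tree

# Hand-written tree with the same shape as the ID3 output (illustrative only)
toy_tree = {"feathers": {1: "bird", 0: {"milk": {1: "mammal", 0: "reptile"}}}}

label_a = classify(toy_tree, {"feathers": 0, "milk": 1})
label_b = classify(toy_tree, {"feathers": 1, "milk": 0})
label_c = classify(toy_tree, {"feathers": 0, "milk": 2}, default="unknown")
```

Returning a caller-supplied default for an unseen branch value is one possible design choice; falling back to the majority class of the training data would be another.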