<div style="position: relative; text-align: center; padding: 30px;">
  <h1><strong>Comparasión de Clasificadores</strong></h1>
  <h3><strong>Walter y Heri</strong></h3>
</div>

**Objetivo:** El objetivo de esta actividad es que cada equipo cree una función que permita manejar diferentes formas de representación de clasificaciones.

**Instrucciones:**
- Construye una clase en Python que reciba una clasificación en representación vectorial, y construya la representación matricial y de conjunto de conjuntos de la clasificación. Haz un método para cada conversión:  
    - vectorial -> matricial.  
    - vectorial -> conjunto de conjuntos.  
- Guarda las tres representaciones como variables ocultas usando la nomenclatura "__parameter".  
- Crea tres métodos getter para devolver cada representación.  
- Construye un método para calcular los siguientes valores:  
    - Entropía de una clasificación.  
    - La entropía condicional de una clasificación dada otra clasificación.  
    - La información mutua de una clasificación dada otra clasificación.  
    - La variación de información de una clasificación dada otra clasificación.  

In [None]:
import numpy as np

class MyClass():
    def __init__(self, vector):
        self.__vector = vector
        self.__labels = self.get_labels()
        self.__probabilities = self.get_probabilities()
        self.__join_probabilities = self.get_join_probabilities()
        self.__matrix = self.create_matrix()
        self.__sets = self.create_sets()


    def create_matrix(self):
        # Create an empty matrix
        matrix = []
        for row in range(len(self.__vector)): # for each label in the vector
            matrix.append([0] * len(self.__labels)) # add a row of zeros

        # Fill the matrix
        for label in range(len(self.__vector)): # for each label in the vector

            matrix[label][self.__labels.index(self.__vector[label])] = 1 # set the value to 1
            # labels.index(self.__vector[label]) returns the index of the label in the labels list

        return matrix
    

    def create_sets(self):
        # Create a dictionary of sets        
        conjuntos = {} 
        for i, elemento in enumerate(self.__vector):  
            if elemento not in conjuntos:
                conjuntos[elemento] = set()
            conjuntos[elemento].add(i) 

        lista_de_conjuntos = list(conjuntos.values())

        return lista_de_conjuntos

    



    # ---
    # Crea tres métodos getter para devolver cada representación. 

    def get_vector(self):
        return self.__vector
    
    def get_matrix(self):
        return self.__matrix
    
    def get_sets(self):
        return self.__sets
    
    # Las usamos mucho, entonces las guardamos en variables
    def get_labels(self):
        labels = []
        for label in self.__vector:
            if label not in labels:
                labels.append(label)
        return labels
    
    def get_probabilities(self):
        probabilities = [] 
        for label in self.__labels:
            probability = self.__vector.count(label) / len(self.__vector) # |y_i| / N
            probabilities.append(probability)
    
    # Joint probabilities
    #
    # Y = {{}, {1, 2, 3, 4}, {   5, 6, 7, 8}}
    # Y'= {{}, {   2, 3, 4}, {1, 5, 6, 7, 8}}
    # 
    # P(Y'=1|Y=1) = 3/4
    # P(Y'=2|Y=1) = 1/4
    # 
    # La suma de los de arriba da 1
    #
    # P(Y'=1|Y=2) = 0
    # P(Y'=2|Y=2) = 1

    def get_join_probabilities(self, other):
        joint_probabilities = []
        for i in range(len(self.__labels)):
            for j in range(len(self.__labels)):
                joint_probability = 0
                for k in range(len(self.__vector)):
                    if self.__vector[k] == self.__labels[i] and other[k] == self.__labels[j]:
                        joint_probability += 1
                joint_probability /= len(self.__vector)
                joint_probabilities.append(joint_probability)
        return joint_probabilities

    

    # ---
    # Construye un método para calcular los siguientes valores:

    # - Entropía de una clasificación.
    def entropy(self):
        
        entropy = 0
        for prob in self.__probabilities:
            # lim_{p -> 0} p log_2(p) = 0
            if prob != 0:
                entropy += -prob * np.log2(prob)
            else:
                entropy += 0

        print("Probabilities:", self.__probabilities, "Entropy:", entropy)
        return entropy
    



    # - La entropía condicional de una clasificación dada otra clasificación.
    def conditional_entropy(self, other):
        # Intersection between the two classifications
        intersection = []
        for i in range(len(self.__vector)):
            if self.__vector[i] == other[i]:
                intersection.append(self.__vector[i])
        
        # Calculate the probabilities p(l, l')
        probabilities = [] # create a list of probabilities
        for label in self.__labels: # for each label in the intersection
            probability = intersection.count(label) / len(self.__vector) # count the number of times the label appears in the intersection
            probabilities.append(probability)

        
        # Cuantos elementos hay en la clase y 
        joint_probabilities = []
        for i in range(len(self.__labels)):
            for j in range(len(self.__labels)):
                joint_probability = 0
                for k in range(len(self.__vector)):
                    if self.__vector[k] == self.__labels[i] and other[k] == self.__labels[j]:
                        joint_probability += 1
                joint_probability /= len(self.__vector)
                joint_probabilities.append(joint_probability) # - Chat

        # Calculate the conditional entropy
        conditional_entropy = 0
        for prob in probabilities:
            # lim_{p -> 0} p log_2(p) = 0
            if prob != 0:
                joint = joint_probabilities[probabilities.index(prob)] / prob  # - Chat
                conditional_entropy += -prob * np.log2(joint)
            else:
                conditional_entropy += 0

        print("Intersection:", intersection, "Probabilities:", probabilities, "N:", len(self.__vector), "Conditional Entropy:", conditional_entropy)
        
        return conditional_entropy

    
    
    # - La información mutua de una clasificación dada otra clasificación.
    def mutual_information(self, other):
        # 2 equalit: I(y, x)= H(y) - H(y| x)
        mutual_information = self.entropy() - self.conditional_entropy(other)
        print("Mutual Information 1:", mutual_information)

        # 1 equality: I(y, x)= sum_{y in Y} sum_{x in X} p(y, x) log_2(p(y, x) / (p(y) * p(x)))

        # Calculate the joint probabilities
        joint_probabilities = []
        for i in range(len(self.__labels)):
            for j in range(len(self.__labels)):
                joint_probability = 0
                for k in range(len(self.__vector)):
                    if self.__vector[k] == self.__labels[i] and other[k] == self.__labels[j]:
                        joint_probability += 1
                joint_probability /= len(self.__vector)
                joint_probabilities.append(joint_probability) # - Chat

        # Calculate the mutual information
        mutual_information = 0
        for i in range(len(self.__labels)):
            for j in range(len(self.__labels)):
                if joint_probabilities[i * len(self.__labels) + j] != 0:
                    mutual_information += joint_probabilities[i * len(self.__labels) + j] * np.log2(joint_probabilities[i * len(self.__labels) + j] / (probabilities[i] * probabilities[j])) # - Chat

        print("Mutual Information 2:", mutual_information, "Probabilities:", probabilities, "Joint Probabilities:", joint_probabilities)
        
        return mutual_information
    
    # - La variación de información de una clasificación dada otra clasificación. 

    def variation_of_information(self, other):
        # VI(x, y) = H(y) + H(x) - 2I(y, x)
        variation_of_information = self.entropy() + self.entropy() - 2 * self.mutual_information(other)
        print("Variation of Information:", variation_of_information)

        return variation_of_information

In [15]:
vector = (0, 0, 0, 1, 1, 1, 2, 2, 2)
other =  (1, 1, 1, 1, 1, 1, 1, 1, 1)

In [16]:
my_class = MyClass(vector)


In [17]:
my_class.create_matrix()

[[1, 0, 0],
 [1, 0, 0],
 [1, 0, 0],
 [0, 1, 0],
 [0, 1, 0],
 [0, 1, 0],
 [0, 0, 1],
 [0, 0, 1],
 [0, 0, 1]]

In [18]:
my_class.get_matrix()

[[1, 0, 0],
 [1, 0, 0],
 [1, 0, 0],
 [0, 1, 0],
 [0, 1, 0],
 [0, 1, 0],
 [0, 0, 1],
 [0, 0, 1],
 [0, 0, 1]]

In [19]:
my_class.create_sets()

[{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]

In [20]:
vector = (0, 0, 0, 1, 1, 1, 2, 2, 2)
my_class.get_sets()

[{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]

In [21]:
my_class.entropy()

Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156


1.584962500721156

In [22]:
my_class.conditional_entropy(vector) 

Intersection: [0, 0, 0, 1, 1, 1, 2, 2, 2] Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] N: 9 Conditional Entropy: 0.0


0.0

In [23]:
my_class.conditional_entropy(other)

Intersection: [1, 1, 1] Probabilities: [0.0, 0.3333333333333333, 0.0] N: 9 Conditional Entropy: 0.0


0.0

In [24]:
my_class.mutual_information(vector)

Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156
Intersection: [0, 0, 0, 1, 1, 1, 2, 2, 2] Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] N: 9 Conditional Entropy: 0.0
Mutual Information 1: 1.584962500721156
Mutual Information 2: 1.5849625007211559 Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Joint Probabilities: [0.3333333333333333, 0.0, 0.0, 0.0, 0.3333333333333333, 0.0, 0.0, 0.0, 0.3333333333333333]


1.5849625007211559

In [25]:
my_class.mutual_information(other)

Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156
Intersection: [1, 1, 1] Probabilities: [0.0, 0.3333333333333333, 0.0] N: 9 Conditional Entropy: 0.0
Mutual Information 1: 1.584962500721156
Mutual Information 2: 1.5849625007211559 Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Joint Probabilities: [0.0, 0.3333333333333333, 0.0, 0.0, 0.3333333333333333, 0.0, 0.0, 0.3333333333333333, 0.0]


1.5849625007211559

In [26]:
my_class.variation_of_information(vector)

Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156
Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156
Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Entropy: 1.584962500721156
Intersection: [0, 0, 0, 1, 1, 1, 2, 2, 2] Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] N: 9 Conditional Entropy: 0.0
Mutual Information 1: 1.584962500721156
Mutual Information 2: 1.5849625007211559 Probabilities: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333] Joint Probabilities: [0.3333333333333333, 0.0, 0.0, 0.0, 0.3333333333333333, 0.0, 0.0, 0.0, 0.3333333333333333]
Variation of Information: 4.440892098500626e-16


4.440892098500626e-16