# Project 1 - Information measures

The goal of this first project is to get accustomed to information and uncertainty measures. We ask you to write a brief report (PDF format) collecting your answers to the different questions. All code must be written in Python inside this Jupyter Notebook. No other code file will be accepted. Note that you cannot change the content of locked cells or import any Python library other than the ones already imported (numpy and pandas).

## Implementation

In this project, you will need to use information measures to answer several questions. Therefore, in this first part, you are asked to write several functions that implement some of the main measures seen in the first theoretical lectures. Remember that you need to fill in this Jupyter Notebook to answer these questions. Pay particular attention to the required output format of each function.

In [1138]:
# [Locked Cell] You can not import any extra Python library in this Notebook.
import numpy as np
import pandas as pd

### Question 1

Write a function entropy that computes the entropy $\mathcal{H(X)}$ of a random variable $\mathcal{X}$ from its probability distribution $P_\mathcal{X} = (p_1, p_2, \ldots, p_n)$. Give the mathematical formula that you are using and explain the key parts of your implementation. Intuitively, what is measured by the entropy?

In [1139]:
def entropy(Px):
    """
    Computes the entropy from the marginal probability distribution. 
    Arguments:
    ----------
    - Px :  Marginal probability distribution of the random 
            variable X in a numpy array where Px[i]=P(X=i)
    Return:
    -------
    - The entropy of X (H(X)) as a number (integer, float or double).
    """
    
    H = 0
    for px in Px:
        if px > 0:
            H -= px * np.log2(px)
    return H
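
As a sanity check on the definition $H(X) = -\sum_i p_i \log_2 p_i$, the sketch below evaluates it on three distributions whose entropy is known in advance. The name `entropy_vec` is hypothetical and not part of the assignment; it is only a vectorised restatement of the same formula, with the usual convention that terms with $p_i = 0$ contribute nothing.

```python
import numpy as np

def entropy_vec(Px):
    """Hypothetical vectorised variant: H(X) = -sum_i p_i * log2(p_i)."""
    p = np.asarray(Px, dtype=float).ravel()
    p = p[p > 0]                      # convention: 0 * log2(0) = 0
    return float(-np.sum(p * np.log2(p)))

fair_coin = entropy_vec([0.5, 0.5])   # uniform over 2 values
certain = entropy_vec([1.0, 0.0])     # a single possible value
uniform4 = entropy_vec([0.25] * 4)    # uniform over 4 values
```

A uniform distribution over $n$ values reaches the maximum $\log_2 n$ bits, while a variable with a single possible value carries no uncertainty.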


### Question 2

Write a function joint_entropy that computes the joint entropy $\mathcal{H(X,Y)}$ of two discrete random variables $\mathcal{X}$ and $\mathcal{Y}$ from the joint probability distribution $P_\mathcal{X,Y}$. Give the mathematical formula that you are using and explain the key parts of your implementation. Compare the entropy and joint_entropy functions (and their corresponding formulas). What do you notice?

In [1141]:
def joint_entropy(Pxy):
    """
    Computes the joint entropy from the joint probability distribution.  
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The joint entropy H(X,Y) as a number (integer, float or double).
    """
    
    H = 0
    for i in range(Pxy.shape[0]):
        for j in range(Pxy.shape[1]):
            px_y = Pxy[i][j]
            if px_y > 0:
                H -= px_y * np.log2(px_y)
    return H
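
One way to see the link between the two formulas: the joint entropy is the entropy of the pair $(X, Y)$ treated as a single variable, so flattening the joint table and applying the 1-D formula should give the same number. The sketch below illustrates this under that reading; `entropy_flat` and both example tables are hypothetical.

```python
import numpy as np

def entropy_flat(P):
    """Hypothetical helper: entropy of any probability table, flattened."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Made-up joint table with Pxy[i][j] = P(X=i, Y=j)
Pxy = np.array([[0.25, 0.25],
                [0.50, 0.00]])
H_joint = entropy_flat(Pxy)

# Independent case: the joint table is the outer product of the marginals
Px, Py = np.array([0.5, 0.5]), np.array([0.25, 0.75])
H_joint_indep = entropy_flat(np.outer(Px, Py))
H_sum = entropy_flat(Px) + entropy_flat(Py)
```

For independent variables the joint entropy equals the sum of the marginal entropies; in general it cannot exceed that sum.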


### Question 3

Write a function conditional_entropy that computes the conditional entropy $\mathcal{H(X|Y)}$ of a discrete random variable $\mathcal{X}$ given another discrete random variable $\mathcal{Y}$ from the joint probability distribution $P_\mathcal{X,Y}$. Give the mathematical formula that you are using and explain the key parts of your implementation. Describe an equivalent way of computing that quantity.

In [1143]:
def conditional_entropy(Pxy):
    """
    Computes the conditional entropy from the joint probability distribution.
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The conditional entropy H(X|Y) as a number (integer, float or double)
    """
    
    Hy = entropy(np.sum(Pxy, axis=0))
    Hxy = joint_entropy(Pxy)
    
    return Hxy - Hy
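
The function above relies on the chain rule $H(X|Y) = H(X,Y) - H(Y)$. An equivalent route is the direct definition $H(X|Y) = -\sum_{i,j} P(x_i, y_j) \log_2 P(x_i | y_j)$. The sketch below compares both on a made-up table; `cond_entropy_direct` and `H` are hypothetical names.

```python
import numpy as np

def H(P):
    """Entropy in bits of a probability table of any shape."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def cond_entropy_direct(Pxy):
    """H(X|Y) from the definition, with Pxy[i][j] = P(X=i, Y=j)."""
    Pxy = np.asarray(Pxy, dtype=float)
    Py = Pxy.sum(axis=0)                       # P(Y=j)
    Py_grid = np.broadcast_to(Py, Pxy.shape)   # P(Y=j) repeated for every i
    mask = Pxy > 0                             # skip zero-probability cells
    return float(-np.sum(Pxy[mask] * np.log2(Pxy[mask] / Py_grid[mask])))

Pxy = np.array([[0.25, 0.25],
                [0.50, 0.00]])
direct = cond_entropy_direct(Pxy)
chain = H(Pxy) - H(Pxy.sum(axis=0))
```

Both values agree up to floating-point error, which is a convenient unit test for either implementation.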


### Question 4

Write a function mutual_information that computes the mutual information $\mathcal{I(X;Y)}$ between two discrete random variables $\mathcal{X}$ and $\mathcal{Y}$ from the joint probability distribution $P_\mathcal{X,Y}$. Give the mathematical formula that you are using and explain the key parts of your implementation. What can you deduce from the mutual information $\mathcal{I(X;Y)}$ on the relationship between $\mathcal{X}$ and $\mathcal{Y}$? Discuss.

In [1145]:
def mutual_information(Pxy):
    """
    Computes the mutual information I(X;Y) from joint probability distribution
    
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The mutual information I(X;Y) as a number (integer, float or double)
    """
    
    Px = np.sum(Pxy, axis=1)
    Py = np.sum(Pxy, axis=0)
    Hx = entropy(Px)
    Hy = entropy(Py)
    Hxy = joint_entropy(Pxy)
    
    return Hx + Hy - Hxy
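
To illustrate what the mutual information says about the relationship between two variables, the sketch below uses the equivalent form $I(X;Y) = \sum_{i,j} P(x_i, y_j) \log_2 \frac{P(x_i, y_j)}{P(x_i) P(y_j)}$ on two extreme cases. The name `mi_kl` and both example tables are hypothetical.

```python
import numpy as np

def mi_kl(Pxy):
    """I(X;Y) as the divergence between the joint and the product of marginals."""
    Pxy = np.asarray(Pxy, dtype=float)
    Px = Pxy.sum(axis=1, keepdims=True)   # column vector P(X=i)
    Py = Pxy.sum(axis=0, keepdims=True)   # row vector P(Y=j)
    prod = Px * Py                        # P(X=i) * P(Y=j) by broadcasting
    mask = Pxy > 0
    return float(np.sum(Pxy[mask] * np.log2(Pxy[mask] / prod[mask])))

independent = np.outer([0.5, 0.5], [0.25, 0.75])   # X and Y independent
identical = np.diag([0.5, 0.5])                    # Y is a copy of X

mi_indep = mi_kl(independent)
mi_ident = mi_kl(identical)
```

Independence gives zero bits, while a variable that fully determines the other gives $I(X;Y) = H(X)$, here one bit.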


### Question 5

Let $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ be three discrete random variables. Write the functions cond_joint_entropy and cond_mutual_information that respectively compute $\mathcal{H(X,Y|Z)}$ and $\mathcal{I(X;Y|Z)}$ of two discrete random variables $\mathcal{X}$, $\mathcal{Y}$ given another discrete random variable $\mathcal{Z}$ from their joint probability distribution $P_\mathcal{X,Y,Z}$. Give the mathematical formulas that you are using and explain the key parts of your implementation.
Suggestion: Observe the mathematical definitions of these quantities and think how you could derive them from the joint entropy and the mutual information.

In [1147]:
def cond_joint_entropy(Pxyz):
    """
    Computes the conditional joint entropy of X, Y knowing Z 
    from the joint probability distribution Pxyz
    Arguments:
    ----------
    - Pxyz: joint probability distribution of X, Y and Z
            in a 3-D array where Pxyz[i][j][k]=P(X=i,Y=j,Z=k)
    Return:
    -------
    - The conditional joint entropy H(X,Y|Z) as a number (integer, float or double)
    """
    
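    # Marginalise over Y to get P(X, Z), with Pxz[i][k] = P(X=i, Z=k)
    Pxz = np.sum(Pxyz, axis=1)

    # H(X|Z), with Z playing the role of the conditioning variable
    Hx_z = conditional_entropy(Pxz)

    # H(Y|X,Z) = -sum_{i,j,k} P(i,j,k) * log2(P(i,j,k) / P(X=i, Z=k)).
    # The denominator is indexed by (i, k): Pxz has no Y axis.
    Hy_xz = 0
    for i in range(Pxyz.shape[0]):
        for k in range(Pxyz.shape[2]):
            for j in range(Pxyz.shape[1]):
                if Pxyz[i][j][k] > 0:
                    Hy_xz -= Pxyz[i][j][k] * np.log2(Pxyz[i][j][k] / Pxz[i][k])

    # Chain rule: H(X,Y|Z) = H(X|Z) + H(Y|X,Z)
    return Hx_z + Hy_xz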
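
Following the suggestion in the question, $H(X,Y|Z)$ can also be derived from joint entropies alone: $H(X,Y|Z) = H(X,Y,Z) - H(Z)$. The sketch below checks this shortcut against the chain-rule form $H(X|Z) + H(Y|X,Z)$ on a randomly generated table. All names are hypothetical, and the test table is assumed to be strictly positive so that no division by zero occurs.

```python
import numpy as np

def H(P):
    """Entropy in bits of a probability table of any shape."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def cond_joint_entropy_alt(Pxyz):
    """H(X,Y|Z) = H(X,Y,Z) - H(Z)."""
    Pz = Pxyz.sum(axis=(0, 1))
    return H(Pxyz) - H(Pz)

def cond_joint_entropy_chain(Pxyz):
    """H(X,Y|Z) = H(X|Z) + H(Y|X,Z), with H(Y|X,Z) taken from its definition."""
    Pxz = Pxyz.sum(axis=1)                      # Pxz[i, k] = P(X=i, Z=k)
    Hx_z = H(Pxz) - H(Pxz.sum(axis=0))
    ratio = Pxyz / Pxz[:, None, :]              # P(Y=j | X=i, Z=k)
    Hy_xz = float(-np.sum(Pxyz * np.log2(ratio)))
    return Hx_z + Hy_xz

rng = np.random.default_rng(0)
Pxyz = rng.random((2, 3, 2))
Pxyz /= Pxyz.sum()                              # normalise to a distribution

alt = cond_joint_entropy_alt(Pxyz)
chain = cond_joint_entropy_chain(Pxyz)
```

The two routes should agree to machine precision; a mismatch usually points to a marginal indexed along the wrong axis.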


In [1149]:
def cond_mutual_information(Pxyz):
    """
    Computes the conditional mutual information of X, Y knowing Z 
    from joint probability distribution Pxyz
    Arguments:
    ----------
    - Pxyz: joint probability distribution of X, Y and Z
            in a 3-D array where Pxyz[i][j][k]=P(X=i,Y=j,Z=k)
    Return:
    -------
    - I(X;Y|Z): The conditional mutual information as a number (integer, float or double)
    """
    
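    # Marginalise over Y to get P(X, Z), with Pxz[i][k] = P(X=i, Z=k)
    Pxz = np.sum(Pxyz, axis=1)

    # H(X|Z)
    Hx_z = conditional_entropy(Pxz)

    # Marginalise over X once to get P(Y, Z), with Pyz[j][k] = P(Y=j, Z=k)
    Pyz = np.sum(Pxyz, axis=0)

    # H(X|Y,Z) = -sum_{i,j,k} P(i,j,k) * log2(P(i,j,k) / P(Y=j, Z=k))
    Hx_yz = 0
    for i in range(Pxyz.shape[0]):
        for k in range(Pxyz.shape[2]):
            for j in range(Pxyz.shape[1]):
                if Pxyz[i][j][k] > 0:
                    Hx_yz -= Pxyz[i][j][k] * np.log2(Pxyz[i][j][k] / Pyz[j][k])

    # I(X;Y|Z) = H(X|Z) - H(X|Y,Z)
    return Hx_z - Hx_yz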
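
The conditional mutual information can likewise be written purely in terms of joint entropies: $I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)$. The sketch below applies this form to a classic toy case, two independent fair bits $X$ and $Y$ with $Z = X \oplus Y$, where conditioning on $Z$ creates dependence that is absent unconditionally. Function names are hypothetical.

```python
import numpy as np

def H(P):
    """Entropy in bits of a probability table of any shape."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def cond_mi_alt(Pxyz):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    Pxz = Pxyz.sum(axis=1)
    Pyz = Pxyz.sum(axis=0)
    Pz = Pxyz.sum(axis=(0, 1))
    return H(Pxz) + H(Pyz) - H(Pxyz) - H(Pz)

# X, Y independent fair bits, Z = X XOR Y
Pxyz = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        Pxyz[x, y, x ^ y] = 0.25

Pxy = Pxyz.sum(axis=2)
mi_xy = H(Pxy.sum(axis=1)) + H(Pxy.sum(axis=0)) - H(Pxy)   # I(X;Y)
cmi_xy_z = cond_mi_alt(Pxyz)                                # I(X;Y|Z)
```

Here $I(X;Y) = 0$ while $I(X;Y|Z) = 1$ bit, so conditioning can increase mutual information as well as decrease it.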


In [1151]:
# [Locked Cell] Evaluation of your functions by the examiner. 
# You don't have access to the evaluation, this will be done by the examiner.
# Therefore, this cell will return nothing for the students.
import os
if os.path.isfile("private_evaluation.py"):
    from private_evaluation import unit_tests
    unit_tests(entropy, joint_entropy, conditional_entropy, mutual_information, cond_joint_entropy, cond_mutual_information)

### Football outcome

You may create cells below to answer the different questions related to football outcome. Unlike in the first part (Implementation), you are free to define as many cells as you need below to answer the different questions. Try to be structured and clear in your code (comment it if necessary). Note that you have to answer the questions in the pdf report, including the numbers you get!

In [1152]:
# import data into a pandas dataframe
data = pd.read_csv("data.csv", sep=",")


#### Question 6

Compute and report the entropy of each variable, and compare each value with its corresponding variable cardinality.

In [1153]:
name_column = data.columns.tolist()
n_vars = len(name_column)
entropies = np.zeros((n_vars,1))
cardinality = np.zeros((n_vars,1))

for i in range(0,n_vars):
    # Occurrence counts of each value taken by the variable
    counts = data[name_column[i]].value_counts()
    proba_distri = np.zeros((counts.shape[0],1))
    
    # Computing probability distribution for each variable
    # (.iloc selects by position, independent of the labels in the index)
    for j in range(0,counts.shape[0]):
        proba_distri[j] = counts.iloc[j]/data.shape[0]
        
    # Computing entropy for each variable    
    entropies[i] = entropy(proba_distri) 
    
    # Computing cardinality for each variable 
    cardinality[i] = counts.shape[0]
    
    print(name_column[i], ": entropy = ", np.round(entropies[i], 5))
    print(name_column[i], ": cardinality = ", cardinality[i])
    

outcome : entropy =  [1.33488]
outcome : cardinality =  [3.]
previous_outcome : entropy =  [1.483]
previous_outcome : cardinality =  [3.]
day : entropy =  [2.80656]
day : cardinality =  [7.]
time : entropy =  [0.93252]
time : cardinality =  [3.]
month : entropy =  [3.58263]
month : cardinality =  [12.]
wind_speed : entropy =  [1.58471]
wind_speed : cardinality =  [3.]
weather : entropy =  [1.76408]
weather : cardinality =  [4.]
location : entropy =  [0.99994]
location : cardinality =  [2.]
capacity : entropy =  [1.53391]
capacity : cardinality =  [4.]
stadium_state : entropy =  [0.63955]
stadium_state : cardinality =  [2.]
injury : entropy =  [0.99984]
injury : cardinality =  [2.]
match_type : entropy =  [0.9999]
match_type : cardinality =  [2.]
opponent_strength : entropy =  [1.58435]
opponent_strength : cardinality =  [3.]
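
To compare each entropy with its cardinality, a natural reference is $\log_2 n$, the entropy of a uniform distribution over $n$ values. The sketch below computes that bound and the ratio $H / \log_2 n$ from the numbers printed above, which are copied by hand into the `reported` dictionary (a hypothetical name) rather than recomputed from `data.csv`.

```python
import numpy as np

# (entropy in bits, cardinality), copied from the printed output above
reported = {
    "outcome": (1.33488, 3), "previous_outcome": (1.483, 3),
    "day": (2.80656, 7), "time": (0.93252, 3), "month": (3.58263, 12),
    "wind_speed": (1.58471, 3), "weather": (1.76408, 4),
    "location": (0.99994, 2), "capacity": (1.53391, 4),
    "stadium_state": (0.63955, 2), "injury": (0.99984, 2),
    "match_type": (0.9999, 2), "opponent_strength": (1.58435, 3),
}

ratios = {}
for name, (h, n) in reported.items():
    h_max = np.log2(n)            # entropy of a uniform variable with n values
    ratios[name] = h / h_max
    print(f"{name:18s} H = {h:.5f}  log2(n) = {h_max:.5f}  ratio = {ratios[name]:.3f}")
```

A ratio close to 1 indicates a nearly uniform variable; a clearly lower ratio indicates that some values are much more frequent than others.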


#### Question 7

Compute and report the conditional entropy of outcome given each of the other variables.

In [1154]:
# Calculate probability distribution of outcome
outcome_counts = data['outcome'].value_counts()
proba_distri = np.zeros((outcome_counts.shape[0],1))
for j in range(0,outcome_counts.shape[0]):
    # .iloc selects by position, independent of the labels in the index
    proba_distri[j] = outcome_counts.iloc[j]/data.shape[0]

# Calculate entropy of outcome
outcome_entropy = entropy(np.round(proba_distri, 5))
print("Entropy of outcome", ": ", outcome_entropy, "\n")

cond_entropy = {}
for col in data.columns:
    if col != 'outcome':
        # Create a contingency table
        contingency_table = pd.crosstab(data['outcome'], data[col], normalize=True)
        
        # Convert into a numpy array
        Pxy = np.array(contingency_table)
        
        # Calculate conditional entropy
        cond_entropy[col] = conditional_entropy(Pxy)

print("Conditional entropy of outcome given each of the other variables:\n")
for col, ce in cond_entropy.items():
    print(col, ": ", np.round(ce, 5))


Entropy of outcome :  [1.33487925] 

Conditional entropy of outcome given each of the other variables:

previous_outcome :  1.18148
day :  1.33349
time :  1.3338
month :  1.33036
wind_speed :  1.33473
weather :  1.33384
location :  1.33351
capacity :  1.33202
stadium_state :  1.33432
injury :  1.33024
match_type :  1.33483
opponent_strength :  0.93861


#### Question 8

Compute the mutual information between the variables month and capacity. What about the variables day and time?

month and capacity

In [1155]:
# Create a contingency table
contingency_table = pd.crosstab(data['month'], data['capacity'], normalize=True)

# Convert into numpy array
Pxy = np.array(contingency_table)

# Calculate mutual information
mutual_info = mutual_information(Pxy)

print("Mutual information between month and capacity:", np.round(mutual_info, 5))



Mutual information between month and capacity: 0.00607


day and time

In [1156]:
# Create a contingency table
contingency_table = pd.crosstab(data['day'], data['time'], normalize=True)

# Convert into a numpy array
Pxy = np.array(contingency_table)

# Calculate mutual information
mutual_info = mutual_information(Pxy)

print("Mutual information between day and time:", np.round(mutual_info, 5))


Mutual information between day and time: 0.50461


#### Question 9

Mutual Information between outcome and each of the other variables.

In [1157]:
mutual_info = {}
for col in data.columns:
    if col != 'outcome':
        # Create a contingency table
        contingency_table = pd.crosstab(data['outcome'], data[col], normalize=True)

        # Convert into a numpy array
        Pxy = np.array(contingency_table)
        
        # Calculate mutual information
        mutual_info[col] = mutual_information(Pxy)

print("Mutual Information between outcome and each of the other variables:")
for col, mi in mutual_info.items():
    print(col, ": ", np.round(mi, 5))


Mutual Information between outcome and each of the other variables:
previous_outcome :  0.1534
day :  0.00139
time :  0.00108
month :  0.00452
wind_speed :  0.00015
weather :  0.00104
location :  0.00137
capacity :  0.00286
stadium_state :  0.00055
injury :  0.00464
match_type :  5e-05
opponent_strength :  0.39627
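
These values can be cross-checked against Question 7 through the identity $I(X;Y) = H(X) - H(X|Y)$. The sketch below performs that check on the rounded numbers printed in this notebook, copied by hand; the dictionary names are hypothetical, and the tolerance only accounts for rounding to five decimals.

```python
# Values copied from the printed outputs of Questions 6, 7 and 9
H_outcome = 1.33488
cond_H = {
    "previous_outcome": 1.18148, "day": 1.33349, "time": 1.3338,
    "month": 1.33036, "wind_speed": 1.33473, "weather": 1.33384,
    "location": 1.33351, "capacity": 1.33202, "stadium_state": 1.33432,
    "injury": 1.33024, "match_type": 1.33483, "opponent_strength": 0.93861,
}
mi = {
    "previous_outcome": 0.1534, "day": 0.00139, "time": 0.00108,
    "month": 0.00452, "wind_speed": 0.00015, "weather": 0.00104,
    "location": 0.00137, "capacity": 0.00286, "stadium_state": 0.00055,
    "injury": 0.00464, "match_type": 0.00005, "opponent_strength": 0.39627,
}

# Largest deviation between H(X) - H(X|Y) and the reported I(X;Y)
max_gap = max(abs((H_outcome - cond_H[v]) - mi[v]) for v in mi)
```

The largest gap stays within rounding error, so the two sets of results are mutually consistent.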


#### Question 10

Mutual Information between outcome and each of the other variables given previous_outcome.

In [1158]:
cond_mutual_info = {}
for col in data.columns:
    if col != 'outcome' and col != 'previous_outcome':

        # Create a contingency table
        contingency_table = pd.crosstab(index=[data['outcome'], data[col]], columns = data['previous_outcome'], normalize=True)

        # Convert the contingency table to a 3-D array
        Pxyz = contingency_table.values.reshape((len(set(data['outcome'])), len(set(data[col])), len(set(data['previous_outcome']))))

        # Calculate conditional mutual information
        cond_mutual_info[col] = cond_mutual_information(Pxyz)

print("Mutual Information between outcome and each of the other variables given previous_outcome:")
for col, cmi in cond_mutual_info.items():
    print(col, ": ", np.round(cmi, 5))


Mutual Information between outcome and each of the other variables given previous_outcome:
day :  0.00474
time :  0.00347
month :  0.01369
wind_speed :  0.00212
weather :  0.00303
location :  0.00213
capacity :  0.00495
stadium_state :  0.00062
injury :  0.009
match_type :  0.00051
opponent_strength :  0.24459
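
The reshape used above assumes that every (outcome, variable) pair occurs at least once, since `pd.crosstab` only keeps observed combinations by default. As an alternative sketch, the joint table can be filled cell by cell from integer codes, which leaves unobserved combinations at zero. The helper `joint_distribution_3d` is hypothetical, the toy values (`"win"`, `"sun"`, ...) are made up for illustration, and the columns are assumed to contain no missing values.

```python
import numpy as np
import pandas as pd

def joint_distribution_3d(df, x, y, z):
    """Hypothetical helper: empirical P(X=i, Y=j, Z=k) as a 3-D array."""
    # factorize maps each column to integer codes 0..n-1 (sorted categories)
    codes = [pd.factorize(df[c], sort=True)[0] for c in (x, y, z)]
    shape = tuple(int(c.max()) + 1 for c in codes)
    counts = np.zeros(shape)
    np.add.at(counts, tuple(codes), 1)    # one count per row of df
    return counts / len(df)

# Made-up toy data, not taken from data.csv
toy = pd.DataFrame({
    "outcome": ["win", "loss", "win", "draw"],
    "weather": ["sun", "rain", "sun", "rain"],
    "previous_outcome": ["win", "win", "loss", "loss"],
})
P = joint_distribution_3d(toy, "outcome", "weather", "previous_outcome")
```

The resulting array always has one axis per variable with the full cardinality, so it can be passed to `cond_mutual_information` without reshaping.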


#### Question 11

Mutual information between stadium_state and weather when location is home

In [1159]:
home_data = data[data['location'] == 'home']

# Create a contingency table
contingency_table = pd.crosstab(home_data['stadium_state'], home_data['weather'], normalize=True)

# Convert into numpy array
Pxy = np.array(contingency_table)

# Calculate mutual information
mutual_info = mutual_information(Pxy)

print("Mutual information between stadium_state and weather when location is home:", mutual_info)


Mutual information between stadium_state and weather when location is home: 0.0
