# Project 1 - Information measures

The goal of this first project is to get accustomed to information and uncertainty measures. We ask you to write a brief report (PDF format) collecting your answers to the different questions. All code must be written in Python inside this Jupyter Notebook. No other code file will be accepted. Note that you cannot change the content of locked cells or import any Python library other than the ones already imported (numpy and pandas).

## Implementation

In this project, you will need to use information measures to answer several questions. Therefore, in this first part, you are asked to write several functions that implement some of the main measures seen in the first theoretical lectures. Remember that you need to fill in this Jupyter Notebook to answer these questions. Pay particular attention to the required output format of each function.

In [1]:
# [Locked Cell] You can not import any extra Python library in this Notebook.
import numpy as np
import pandas as pd

### Question 1

Write a function entropy that computes the entropy $H(\mathcal{X})$ of a random variable $\mathcal{X}$ from its probability distribution $P_\mathcal{X} = (p_1, p_2, \ldots, p_n)$. Give the mathematical formula that you are using and explain the key parts of your implementation. Intuitively, what is measured by the entropy?

In [2]:
def entropy(Px):
    """
    Computes the entropy from the marginal probability distribution. 
    Arguments:
    ----------
    - Px :  Marginal probability distribution of the random 
            variable X in a numpy array where Px[i]=P(X=i)
    Return:
    -------
    - The entropy of X (H(X)) as a number (integer, float or double).
    """
    # Accumulator named h to avoid shadowing the built-in sum()
    h = 0.0

    for Pi in Px:
        if Pi == 0:
            # Convention: 0 * log2(0) = 0
            continue
        h -= Pi * np.log2(Pi)

    return h
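
For reference, the function above evaluates $H(\mathcal{X}) = -\sum_i p_i \log_2 p_i$ with the convention $0 \log_2 0 = 0$. The same computation can be sketched without an explicit loop; the name `entropy_vectorized` is hypothetical and not part of the required interface.

```python
import numpy as np

def entropy_vectorized(Px):
    """Entropy in bits of a discrete distribution, skipping zero entries."""
    p = np.asarray(Px, dtype=float).ravel()
    p = p[p > 0]  # convention: 0 * log2(0) = 0
    return float(-np.sum(p * np.log2(p)))

# A fair coin carries one bit of uncertainty; four equally likely outcomes carry two.
print(entropy_vectorized([0.5, 0.5]))
print(entropy_vectorized([0.25, 0.25, 0.25, 0.25]))
```

Masking the zero entries before taking the logarithm plays the same role as the `continue` in the loop version.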

### Question 2

Write a function joint_entropy that computes the joint entropy $H(\mathcal{X},\mathcal{Y})$ of two discrete random variables $\mathcal{X}$ and $\mathcal{Y}$ from the joint probability distribution $P_{\mathcal{X},\mathcal{Y}}$. Give the mathematical formula that you are using and explain the key parts of your implementation. Compare the entropy and joint_entropy functions (and their corresponding formulas): what do you notice?

In [3]:
def joint_entropy(Pxy):
    """
    Computes the joint entropy from the joint probability distribution.  
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The joint entropy H(X,Y) as a number (integer, float or double).
    """
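    # Accumulator named h to avoid shadowing the built-in sum()
    h = 0.0

    # np.ndenumerate visits every cell, whatever the number of dimensions
    for _, Pij in np.ndenumerate(Pxy):
        if Pij == 0:
            continue
        h -= Pij * np.log2(Pij)

    return h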
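
The joint entropy $H(\mathcal{X},\mathcal{Y}) = -\sum_{i,j} p_{ij} \log_2 p_{ij}$ is the same formula as the entropy, applied to the cells of the joint table instead of a vector. A minimal sketch of that observation, using a made-up joint table and a hypothetical helper `entropy_bits`:

```python
import numpy as np

def entropy_bits(P):
    """Entropy in bits of any probability table, flattened to one dimension."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint table of two independent fair coins.
Pxy = np.array([[0.25, 0.25],
                [0.25, 0.25]])

h_joint = entropy_bits(Pxy)            # H(X,Y), computed on the flattened table
h_x = entropy_bits(Pxy.sum(axis=1))    # H(X) from the marginal of X
h_y = entropy_bits(Pxy.sum(axis=0))    # H(Y) from the marginal of Y
print(h_joint, h_x + h_y)
```

For this independent example the joint entropy equals $H(\mathcal{X}) + H(\mathcal{Y})$; for dependent variables it is smaller.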

### Question 3

Write a function conditional_entropy that computes the conditional entropy $H(\mathcal{X} \mid \mathcal{Y})$ of a discrete random variable $\mathcal{X}$ given another discrete random variable $\mathcal{Y}$ from the joint probability distribution $P_{\mathcal{X},\mathcal{Y}}$. Give the mathematical formula that you are using and explain the key parts of your implementation. Describe an equivalent way of computing that quantity.

In [4]:
def conditional_entropy(Pxy):
    """
    Computes the conditional entropy from the joint probability distribution.
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The conditional entropy H(X|Y) as a number (integer, float or double)
    """
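    # Accumulator named h to avoid shadowing the built-in sum()
    h = 0.0

    # Marginal of Y: sum the joint distribution over X (axis 0)
    Py = np.sum(Pxy, axis=0)

    for (_, j), Pij in np.ndenumerate(Pxy):
        if Pij == 0:
            continue
        h -= Pij * np.log2(Pij/Py[j])

    return h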
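
The function above uses $H(\mathcal{X} \mid \mathcal{Y}) = -\sum_{i,j} p_{ij} \log_2 (p_{ij}/p_j)$. An equivalent route is the chain rule $H(\mathcal{X} \mid \mathcal{Y}) = H(\mathcal{X},\mathcal{Y}) - H(\mathcal{Y})$. The sketch below compares both on a made-up joint table; `entropy_bits` and `conditional_entropy_direct` are hypothetical names.

```python
import numpy as np

def entropy_bits(P):
    """Entropy in bits of any probability table."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_entropy_direct(Pxy):
    """H(X|Y) = -sum_ij p_ij * log2(p_ij / p_j), with Pxy[i][j] = P(X=i, Y=j)."""
    Pxy = np.asarray(Pxy, dtype=float)
    Py = Pxy.sum(axis=0)  # marginal of Y
    total = 0.0
    for (_, j), pij in np.ndenumerate(Pxy):
        if pij > 0:
            total -= pij * np.log2(pij / Py[j])
    return total

# Hypothetical joint table: X agrees with Y 80% of the time.
Pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])

direct = conditional_entropy_direct(Pxy)
chain = entropy_bits(Pxy) - entropy_bits(Pxy.sum(axis=0))
print(direct, chain)
```

Both values agree up to floating-point rounding.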

### Question 4

Write a function mutual_information that computes the mutual information $I(\mathcal{X};\mathcal{Y})$ between two discrete random variables $\mathcal{X}$ and $\mathcal{Y}$ from the joint probability distribution $P_{\mathcal{X},\mathcal{Y}}$. Give the mathematical formula that you are using and explain the key parts of your implementation. What can you deduce from the mutual information $I(\mathcal{X};\mathcal{Y})$ on the relationship between $\mathcal{X}$ and $\mathcal{Y}$? Discuss.

In [5]:
def mutual_information(Pxy):
    """
    Computes the mutual information I(X;Y) from joint probability distribution
    
    Arguments:
    ----------
    - Pxy:  joint probability distribution of X and Y 
            in a 2-D numpy array where Pxy[i][j]=P(X=i,Y=j)
    Return:
    -------
    - The mutual information I(X;Y) as a number (integer, float or double)
    """

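    # Accumulator named mi to avoid shadowing the built-in sum()
    mi = 0.0

    # Marginals: sum over Y (axis 1) for X, over X (axis 0) for Y
    Px = np.sum(Pxy, axis=1)
    Py = np.sum(Pxy, axis=0)

    for (i, j), Pij in np.ndenumerate(Pxy):
        if Px[i] == 0 or Py[j] == 0 or Pij == 0:
            # Zero-probability cells contribute nothing
            continue
        mi += Pij * np.log2(Pij/(Px[i]*Py[j]))

    return mi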
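
The function above uses $I(\mathcal{X};\mathcal{Y}) = \sum_{i,j} p_{ij} \log_2 \frac{p_{ij}}{p_i\, p_j}$, which is also equal to $H(\mathcal{X}) + H(\mathcal{Y}) - H(\mathcal{X},\mathcal{Y})$. The sketch below uses that second form on two made-up extreme cases: independent variables (mutual information close to 0) and identical variables (mutual information equal to $H(\mathcal{X})$). The helper names are hypothetical.

```python
import numpy as np

def entropy_bits(P):
    """Entropy in bits of any probability table."""
    p = np.asarray(P, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information_from_entropies(Pxy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with Pxy[i][j] = P(X=i, Y=j)."""
    Pxy = np.asarray(Pxy, dtype=float)
    return (entropy_bits(Pxy.sum(axis=1))
            + entropy_bits(Pxy.sum(axis=0))
            - entropy_bits(Pxy))

# Independent variables: the joint table is the outer product of the marginals.
P_indep = np.outer([0.3, 0.7], [0.6, 0.4])
# Identical variables: all probability mass sits on the diagonal.
P_copy = np.diag([0.3, 0.7])

mi_indep = mutual_information_from_entropies(P_indep)
mi_copy = mutual_information_from_entropies(P_copy)
print(mi_indep, mi_copy, entropy_bits([0.3, 0.7]))
```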

### Question 5

Let $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ be three discrete random variables. Write the functions cond_joint_entropy and cond_mutual_information that respectively compute $H(\mathcal{X},\mathcal{Y} \mid \mathcal{Z})$ and $I(\mathcal{X};\mathcal{Y} \mid \mathcal{Z})$ of two discrete random variables $\mathcal{X}$ and $\mathcal{Y}$ given another discrete random variable $\mathcal{Z}$ from their joint probability distribution $P_{\mathcal{X},\mathcal{Y},\mathcal{Z}}$. Give the mathematical formulas that you are using and explain the key parts of your implementation.
Suggestion: Observe the mathematical definitions of these quantities and think how you could derive them from the joint entropy and the mutual information.

In [6]:
def cond_joint_entropy(Pxyz):
    """
    Computes the conditional joint entropy of X, Y knowing Z 
    from the joint probability distribution Pxyz
    Arguments:
    ----------
    - Pxyz: joint probability distribution of X, Y and Z
            in a 3-D array where Pxyz[i][j][k]=P(X=i,Y=j,Z=k)
    Return:
    -------
    - The conditional joint entropy H(X,Y|Z) as a number (integer, float or double)
    
    """
    Pz = np.sum(Pxyz, (0,1))

    return joint_entropy(Pxyz) - entropy(Pz)

In [7]:
def cond_mutual_information(Pxyz):
    """
    Computes the conditional mutual information of X, Y knowing Z 
    from joint probability distribution Pxyz
    Arguments:
    ----------
    - Pxyz: joint probability distribution of X, Y and Z
            in a 3-D array where Pxyz[i][j][k]=P(X=i,Y=j,Z=k)
    Return:
    -------
    - I(X;Y|Z): The conditional mutual information as a number (integer, float or double)
    
    """

    Pxz = np.sum(Pxyz, 1)
    Pyz = np.sum(Pxyz, 0)

    return conditional_entropy(Pxz) - joint_entropy(Pxyz) + joint_entropy(Pyz)
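
The two functions above rely on chain-rule identities, written out here with $X$, $Y$, $Z$ standing for $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$:

$$
H(X,Y \mid Z) = H(X,Y,Z) - H(Z)
$$

$$
I(X;Y \mid Z) = H(X \mid Z) - H(X \mid Y,Z) = H(X \mid Z) - H(X,Y,Z) + H(Y,Z)
$$

The last equality uses $H(X \mid Y,Z) = H(X,Y,Z) - H(Y,Z)$. In the code, `joint_entropy` can be applied to the 3-D array `Pxyz` because `np.ndenumerate` iterates over every cell regardless of the number of dimensions.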

In [8]:
# [Locked Cell] Evaluation of your functions by the examiner. 
# You don't have access to the evaluation, this will be done by the examiner.
# Therefore, this cell will return nothing for the students.
import os
if os.path.isfile("private_evaluation.py"):
    from private_evaluation import unit_tests
    unit_tests(entropy, joint_entropy, conditional_entropy, mutual_information, cond_joint_entropy, cond_mutual_information)

## Weather forecasting

You may create cells below to answer the different questions related to weather forecasting. Unlike in the first part (Implementation), you are free to define as many cells as you need below to answer the different questions. Try to be structured and clear in your code (comment it if necessary). Note that you have to answer the questions in the pdf report, including the numbers you get!

### Question 6

In [9]:
# Read the csv file
df = pd.read_csv('weather_data.csv')
#df.info

In [10]:
# Compute the probability distribution of each column, then compute the entropy
zero_data = np.zeros(shape=(3,len(df.columns)))
entropy_df = pd.DataFrame(zero_data, columns = df.columns, index=['entropy', 'cardinality', 'log(card)'])

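for col in df.columns:
    item_counts = df[col].value_counts()
    Px = item_counts/np.sum(item_counts)
    # Use .loc rather than chained indexing, which may not write through to the DataFrame
    entropy_df.loc['entropy', col] = entropy(Px)
    entropy_df.loc['cardinality', col] = item_counts.shape[0]
    entropy_df.loc['log(card)', col] = np.log2(item_counts.shape[0])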

#print(entropy_df)

# To latex

#print(entropy_df[['temperature', 'air_pressure', 'same_day_rain', 'next_day_rain', 'relative_humidity']].to_latex())
#print(entropy_df[['wind_direction', 'wind_speed', 'cloud_height', 'cloud_density', 'month']].to_latex())
#print(entropy_df[['day', 'daylight', 'lightning', 'air_quality']].to_latex())


### Question 7

In [11]:
cond_entropy = np.zeros(shape=(1,len(df.columns)-1))
col_without_next_day_rain = list(df.columns)
col_without_next_day_rain.remove("next_day_rain")
cond_entropy_df = pd.DataFrame(cond_entropy, columns = col_without_next_day_rain, index=['conditional entropy'])

for col in col_without_next_day_rain:
    Pxy = pd.crosstab(df["next_day_rain"], df[col])
    Pxy = Pxy.to_numpy()
    Pxy = Pxy/len(df)
    cond_entropy_df[col] = conditional_entropy(Pxy)

print(cond_entropy_df)

# To latex

#print(cond_entropy_df[['temperature', 'air_pressure', 'same_day_rain', 'relative_humidity', 'wind_direction']].to_latex())
#print(cond_entropy_df[['wind_speed', 'cloud_height', 'cloud_density', 'month', 'day']].to_latex())
#print(cond_entropy_df[['daylight', 'lightning', 'air_quality']].to_latex())

                     temperature  air_pressure  same_day_rain  \
conditional entropy     1.568101      0.939975       1.389486   

                     relative_humidity  wind_direction  wind_speed  \
conditional entropy           1.301055        1.567815    1.567767   

                     cloud_height  cloud_density    month       day  daylight  \
conditional entropy      1.566763        1.56659  1.56488  1.567157  1.568259   

                     lightning  air_quality  
conditional entropy   1.568233     1.567881  
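
The joint distribution in the cell above is built by dividing the contingency table by the number of rows. As a possible shortcut, `pd.crosstab` accepts `normalize=True`, which performs the same division. The sketch below uses a made-up toy table; the column names mirror the dataset but the values are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data: values are made up, not taken from weather_data.csv.
toy = pd.DataFrame({
    "next_day_rain": ["dry", "dry", "drizzle", "deluge", "dry", "drizzle"],
    "same_day_rain": ["dry", "dry", "drizzle", "drizzle", "deluge", "dry"],
})

# normalize=True divides every count by the grand total, giving P(X=i, Y=j).
Pxy = pd.crosstab(toy["next_day_rain"], toy["same_day_rain"], normalize=True).to_numpy()

# Rows index next_day_rain (X), columns index same_day_rain (Y).
print(Pxy.shape, Pxy.sum())
```

With this orientation, `conditional_entropy(Pxy)` returns the entropy of the row variable given the column variable.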


### Question 8

In [12]:
Pxy = pd.crosstab(df['relative_humidity'], df['wind_speed'])
Pxy = Pxy/len(df)
Pxy = Pxy.to_numpy()
#print(Pxy)
print(mutual_information(Pxy))



Pxy = pd.crosstab(df['month'], df['temperature'])
Pxy = Pxy/len(df)
Pxy = Pxy.to_numpy()
#print(Pxy)
print(mutual_information(Pxy))

0.00012439598067303954
0.5753467937246421


### Question 9

In [13]:
mutual_info = np.zeros(shape=(1,len(df.columns)-1))
col_without_next_day_rain = list(df.columns)
col_without_next_day_rain.remove("next_day_rain")
mutual_info_df = pd.DataFrame(mutual_info, columns = col_without_next_day_rain, index=['Mutual information'])

for col in col_without_next_day_rain:
    Pxy = pd.crosstab(df["next_day_rain"], df[col])
    Pxy = Pxy.to_numpy()
    Pxy = Pxy/len(df)
    mutual_info_df[col] = mutual_information(Pxy)

print(mutual_info_df)

                    temperature  air_pressure  same_day_rain  \
Mutual information     0.000555      0.628681       0.179171   

                    relative_humidity  wind_direction  wind_speed  \
Mutual information           0.267601        0.000841    0.000889   

                    cloud_height  cloud_density     month       day  daylight  \
Mutual information      0.001893       0.002066  0.003776  0.001499  0.000397   

                    lightning  air_quality  
Mutual information   0.000424     0.000775  
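
Since $I(\mathcal{X};\mathcal{Y}) = H(\mathcal{X}) - H(\mathcal{X} \mid \mathcal{Y})$, adding each mutual information printed above to the matching conditional entropy from Question 7 should give the same value, $H(\text{next\_day\_rain})$, for every variable. A quick consistency check on four of the printed columns:

```python
# Values copied from the printed tables of Questions 7 and 9.
cond = {"temperature": 1.568101, "air_pressure": 0.939975,
        "same_day_rain": 1.389486, "relative_humidity": 1.301055}
mi = {"temperature": 0.000555, "air_pressure": 0.628681,
      "same_day_rain": 0.179171, "relative_humidity": 0.267601}

# H(X|Y) + I(X;Y) should be the same for every Y, namely H(next_day_rain).
sums = {name: cond[name] + mi[name] for name in cond}
for name, value in sums.items():
    print(name, round(value, 6))
```

All four sums land at about 1.5687 bits, up to the rounding of the printed values.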


### Question 10

In [14]:
mutual_info = np.zeros(shape=(1,len(df.columns)-1))
col_without_next_day_rain = list(df.columns)
col_without_next_day_rain.remove("next_day_rain")
mutual_info_df = pd.DataFrame(mutual_info, columns = col_without_next_day_rain, index=['Mutual information'])

drizzluge_df = df[df["next_day_rain"] != "dry"]

for col in col_without_next_day_rain:
    Pxy = pd.crosstab(drizzluge_df["next_day_rain"], drizzluge_df[col])
    Pxy = Pxy.to_numpy()
    Pxy = Pxy/len(drizzluge_df)
    mutual_info_df[col] = mutual_information(Pxy)

print(mutual_info_df)

                    temperature  air_pressure  same_day_rain  \
Mutual information     0.000822      0.008075       0.158054   

                    relative_humidity  wind_direction  wind_speed  \
Mutual information           0.439192        0.000773    0.000701   

                    cloud_height  cloud_density     month       day  daylight  \
Mutual information      0.000474        0.00026  0.003855  0.000637  0.000347   

                    lightning  air_quality  
Mutual information   0.000412     0.001078  


### Question 11

In [15]:
mutual_info = np.zeros(shape=(1,len(df.columns)-2))
col_without_next_day_rain_temperature = list(df.columns)
col_without_next_day_rain_temperature.remove("next_day_rain")
col_without_next_day_rain_temperature.remove("temperature")
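# The column list must have 12 entries to match mutual_info; passing
# col_without_next_day_rain (13 entries) raised the ValueError recorded below.
mutual_info_df = pd.DataFrame(mutual_info, columns = col_without_next_day_rain_temperature, index=['Mutual information'])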

for col in col_without_next_day_rain_temperature:
    yzx = df[["next_day_rain", col, "temperature"]]
    yzx = yzx.to_numpy()
    Pyzx, edges = np.histogramdd(yzx)
    Pyzx = Pyzx/len(df)
    print(Pyzx)
    Pzx = pd.crosstab(df["next_day_rain"], df["temperature"])
    Pzx = Pzx.to_numpy()
    Pzx = Pzx/len(df)
    mutual_info_df[col] = mutual_information(Pzx) + cond_mutual_information(Pyzx)

print(mutual_info_df)

ValueError: Shape of passed values is (1, 12), indices imply (1, 13)
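
`np.histogramdd` bins numeric samples, so it may not apply directly if the columns hold category labels such as "dry". Below is a minimal sketch of one alternative way to build the 3-D joint distribution, assuming the three columns are categorical and contain no missing values. `joint_distribution_3d` is a hypothetical helper and the toy data is made up.

```python
import numpy as np
import pandas as pd

def joint_distribution_3d(frame, cols):
    """Empirical joint distribution of three categorical columns as a 3-D array."""
    # pd.factorize maps each distinct label to an integer code (0, 1, 2, ...).
    codes = [pd.factorize(frame[c])[0] for c in cols]
    shape = tuple(int(c.max()) + 1 for c in codes)
    counts = np.zeros(shape)
    # Add 1 to cell (i, j, k) for every row; np.add.at handles repeated indices.
    np.add.at(counts, tuple(codes), 1)
    return counts / len(frame)

# Hypothetical toy data, not taken from weather_data.csv.
toy = pd.DataFrame({
    "next_day_rain": ["dry", "rain", "dry", "rain"],
    "wind_speed":    ["low", "low", "high", "high"],
    "temperature":   ["cold", "cold", "cold", "hot"],
})

Pyzx = joint_distribution_3d(toy, ["next_day_rain", "wind_speed", "temperature"])
print(Pyzx.shape, Pyzx.sum())
```

The axis order follows the order of `cols`, so it has to match the `Pxyz[i][j][k]` convention expected by `cond_mutual_information`.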

### Question 12

In [16]:
letters_to_guess_nb = 5
possible_letters_nb = 26
Px = [1/possible_letters_nb for i in range(possible_letters_nb)]

# For each field
field_entropy = entropy(Px)

# Sum of each field
field_entropy_array = [field_entropy for i in range(letters_to_guess_nb)]
field_entropy_sum = sum(field_entropy_array)

# Game entropy
comb_nb = possible_letters_nb ** letters_to_guess_nb
p = 1/comb_nb

Px = [p for i in range(comb_nb)]

game_entropy = entropy(Px)

print('field_entropy : ', field_entropy)
print('field_entropy_sum : ', field_entropy_sum)
print('game_entropy : ', game_entropy)

field_entropy :  4.70043971814109
field_entropy_sum :  23.50219859070545
game_entropy :  23.502198594922692
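
For a uniform distribution over $n$ outcomes the entropy reduces to $\log_2 n$, so the values above can be cross-checked in closed form without building a list of $26^5$ probabilities. This is a sketch of that check, not a replacement for the required `entropy` function:

```python
import math

letters_to_guess_nb = 5
possible_letters_nb = 26

# Uniform distribution over n outcomes: H = log2(n).
field_entropy = math.log2(possible_letters_nb)

# The game is uniform over 26**5 words, and log2(26**5) = 5 * log2(26).
game_entropy = math.log2(possible_letters_nb ** letters_to_guess_nb)

print(field_entropy, letters_to_guess_nb * field_entropy, game_entropy)
```

The small gap between the two printed values above (around $4 \times 10^{-9}$) is most likely floating-point rounding accumulated over the roughly 11.9 million terms of the loop.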


### Question 13

In [17]:
letters_to_guess_nb = 4
possible_letters_nb = 22
Px = [1/possible_letters_nb for i in range(possible_letters_nb)]

# For each field
field_entropy = entropy(Px)

# Sum of each field
field_entropy_array = [field_entropy for i in range(letters_to_guess_nb)]
field_entropy_sum = sum(field_entropy_array)

# Game entropy
comb_nb = possible_letters_nb ** letters_to_guess_nb
p = 1/comb_nb

Px = [p for i in range(comb_nb)]

game_entropy = entropy(Px)

print('field_entropy : ', field_entropy)
print('field_entropy_sum : ', field_entropy_sum)
print('game_entropy : ', game_entropy)

field_entropy :  4.459431618637295
field_entropy_sum :  17.83772647454918
game_entropy :  17.83772647455847


### Question 14

In [44]:
# For field 1, 3 and 5
possible_letters_nb = 17
pG = 1/3
print('PG : ', pG)
p = (1-pG)/possible_letters_nb

Px = [p, 0, p, p, 0, p, pG, 0, p, p, p, 0, p, p, 0, p, p, 0, p, 0, 0, p, p, p, p, p]

entropy_1_3_5 = entropy(Px)
print(sum(Px))

# For field 4
possible_letters_nb = 17
p = 1/possible_letters_nb

Px = [p, 0, p, p, 0, p, 0, 0, p, p, p, 0, p, p, 0, p, p, 0, p, 0, 0, p, p, p, p, p]
print(sum(Px))
entropy_4 = entropy(Px)

# Sum of each field
field_entropy_sum = 3*entropy_1_3_5 + entropy_4

# Game entropy
comb_nb = (1*17*17*17) + (17*1*17*17) + (17*17*17*1) + \
        (1*1*17*17) + (17*1*17*1) + (1*17*17*1) + (1*1*17*1)

p = 1/comb_nb

Px = [p for i in range(comb_nb)]

game_entropy = entropy(Px)

print('entropy_1_3_5 : ', entropy_1_3_5)
print('entropy_4 : ', entropy_4)
print('field_entropy_sum : ', field_entropy_sum)
print('game_entropy : ', game_entropy)

PG :  0.3333333333333333
0.9999999999999997
1.0
entropy_1_3_5 :  3.6432710615547164
entropy_4 :  4.08746284125034
field_entropy_sum :  15.017276025914487
game_entropy :  13.931383892543712
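
The printed values can be cross-checked by hand. For fields 1, 3 and 5, the grouping property of entropy applies: first decide whether the letter is G (probability $1/3$), then, if it is not, pick uniformly among the 17 remaining candidates:

$$
H = H_b\!\left(\tfrac{1}{3}\right) + \tfrac{2}{3}\log_2 17 \approx 0.9183 + \tfrac{2}{3} \times 4.0875 \approx 3.6433
$$

where $H_b(p) = -p\log_2 p - (1-p)\log_2(1-p)$ is the binary entropy. Field 4 is uniform over 17 letters, so its entropy is $\log_2 17 \approx 4.0875$. The game entropy is $\log_2 15623 \approx 13.9314$, since the seven terms of `comb_nb` add up to $3 \cdot 17^3 + 3 \cdot 17^2 + 17 = 15623$.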
