# Multivariate Processes: Seasons and Temperatures

## Conditional Probabilities and Entropies 1

We consider a bivariate process. The four seasons $\mathbb{S} = \{ \text{spring}, \text{summer}, \text{autumn}, \text{winter} \}$ occur with the same probability $\Pr \{ {\cal S}=s \} = 0.25$. From experience, the seasons are correlated with average temperatures. For simplicity, we consider only three temperatures $\mathbb{T} = \{ -10°C, +10°C, +20°C \}$. The relationship between seasons and temperatures is described by the conditional probabilities given in the table below.


| ${\Pr\{ {\cal T} \mid {\cal S} \}}$ | spring | summer | autumn | winter |
| ----- | ----- | ----- | ----- | ----- | 
| -10°C |  0.1  |  0.0  |  0.1  |  0.8  |
| +10°C |  0.6  |  0.1  |  0.6  |  0.2  |
| +20°C |  0.3  |  0.9  |  0.3  |  0.0  |

Given a particular season, the three temperatures occur with different probabilities; they are not equally likely.

The conditional entropy $H({\cal T} \mid {\cal S})$ is defined by

\begin{align*}
    H({\cal T} \mid {\cal S})
    &= - \mathrm{E} \big\{ \log_2 \Pr \{ {\cal T}=t \mid {\cal S}=s \} \big\} \\
    &= - \sum_{s \in \mathbb{S}} \sum_{t \in \mathbb{T}} \Pr \{ {\cal T}=t, {\cal S}=s \} \cdot \log_2 \Pr \{ {\cal T}=t \mid {\cal S}=s \} \\
    &= - \sum_{s \in \mathbb{S}} \Pr\{ {\cal S} = s \} \cdot \sum_{t \in \mathbb{T}} \Pr \{ {\cal T}=t \mid {\cal S}=s \} \cdot \log_2 \Pr \{ {\cal T}=t \mid {\cal S}=s \} \\
    &= - \frac{1}{4} \cdot \sum_{s \in \mathbb{S}} \sum_{t \in \mathbb{T}} \Pr \{ {\cal T}=t \mid {\cal S}=s \} \cdot \log_2 \Pr \{ {\cal T}=t \mid {\cal S}=s \} .
\end{align*}

For $H({\cal T} \mid {\cal S})=0$, the temperature is completely determined by the season: there is no uncertainty about the temperature once the season is known. For $H({\cal T} \mid {\cal S})=H({\cal T})$, the conditional entropy reaches its maximum because knowing the season does not reduce the uncertainty about the temperature. In this case, seasons and temperatures are statistically independent of each other.
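
The two limiting cases can be checked numerically. The following is a minimal sketch, not part of the original computation: the helper names `entropy` and `conditional_entropy` and both example matrices are hypothetical and chosen only for illustration. The first matrix assigns exactly one temperature to every season, the second gives every season the same temperature distribution.

```python
import numpy as np

def entropy(p):
    """Entropy in bit of a probability vector; zero entries contribute nothing."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(1.0 / p[nz])))

def conditional_entropy(pr_t_given_s, pr_s):
    """H(T | S) for a matrix whose columns are the distributions Pr{T | S=s}."""
    per_season = np.array([entropy(col) for col in pr_t_given_s.T])
    return float(pr_s @ per_season)

pr_s = np.full(4, 0.25)  # four equally likely seasons

# Hypothetical case 1: each season fixes the temperature completely
deterministic = np.array([[0.0, 0.0, 0.0, 1.0],
                          [1.0, 0.0, 1.0, 0.0],
                          [0.0, 1.0, 0.0, 0.0]])

# Hypothetical case 2: every season has the same temperature distribution
pr_t = np.array([0.25, 0.375, 0.375])
independent = np.tile(pr_t[:, np.newaxis], (1, 4))

print("deterministic: H(T | S) = %g bit" % conditional_entropy(deterministic, pr_s))
print("independent:   H(T | S) = %g bit, H(T) = %g bit"
      % (conditional_entropy(independent, pr_s), entropy(pr_t)))
```

In the first case the conditional entropy vanishes, in the second it coincides with the entropy of the temperature alone.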

In [1]:
import numpy as np

# definition of temperature probabilities conditioned on seasons
# (rows: -10°C, +10°C, +20°C; columns: spring, summer, autumn, winter)
Pr_T_S = np.array([[0.1, 0.0, 0.1, 0.8],
                   [0.6, 0.1, 0.6, 0.2],
                   [0.3, 0.9, 0.3, 0.0]])

# Computation of conditional entropy H(T | S)  (zeros are not considered due to 0 * log(0) = 0)
H_T_S = 0.25 * Pr_T_S * np.log2(Pr_T_S, out=np.zeros_like(Pr_T_S), where=(Pr_T_S!=0))
H_T_S = - np.sum(H_T_S)

print("The conditional entropy amounts to H(T | S) = %g bit.\n" % (H_T_S))

The conditional entropy amounts to H(T | S) = 0.945462 bit.
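
The third line of the derivation reads the conditional entropy as an average of per-season entropies. As a rough sketch of that view (the names `H_T_given_s` and `H_T_S_check` are introduced here for illustration only), the entropy of each column of the table can be computed separately and then averaged over the equally likely seasons:

```python
import numpy as np

seasons = ["spring", "summer", "autumn", "winter"]

# Conditional probabilities Pr{T | S} from the table (rows: -10, +10, +20 degrees C)
Pr_T_S = np.array([[0.1, 0.0, 0.1, 0.8],
                   [0.6, 0.1, 0.6, 0.2],
                   [0.3, 0.9, 0.3, 0.0]])

# Every column is a probability distribution over temperatures and must sum to one
assert np.allclose(Pr_T_S.sum(axis=0), 1.0)

# Entropy of the temperature for each individual season (0 * log2(0) is taken as 0)
logs = np.log2(Pr_T_S, out=np.zeros_like(Pr_T_S), where=(Pr_T_S != 0))
H_T_given_s = -np.sum(Pr_T_S * logs, axis=0)

for name, h in zip(seasons, H_T_given_s):
    print("H(T | S=%s) = %g bit" % (name, h))

# Averaging over the equally likely seasons gives the conditional entropy
H_T_S_check = np.mean(H_T_given_s)
print("Average over seasons: H(T | S) = %g bit" % H_T_S_check)
```

Summer leaves the least uncertainty about the temperature, while spring and autumn leave the most.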



## Joint Probabilities and Joint Entropy

The joint probabilities can be computed by

\begin{equation}
    \Pr\{ {\cal T}=t, {\cal S}=s\} = \Pr\{ {\cal T}=t \mid {\cal S}=s\} \cdot \Pr\{ {\cal S}=s\}
\end{equation}

because the season probabilities are known. The corresponding joint entropy becomes

\begin{align*}
    H({\cal T}, {\cal S}) 
    &= - \mathrm{E} \big\{ \log_2 \Pr\{ {\cal T}=t, {\cal S}=s\} \big\} \\
    &= - \sum_{s \in \mathbb{S}} \sum_{t \in \mathbb{T}} \Pr \{ {\cal T}=t, {\cal S}=s \} \cdot \log_2 \Pr \{ {\cal T}=t, {\cal S}=s \} .
\end{align*}


In [2]:
# Computation of joint probabilities
Pr_TS = Pr_T_S * 0.25

print("The joint probabilities amount to \n", Pr_TS, "\n")

The joint probabilities amount to 
 [[0.025 0.    0.025 0.2  ]
 [0.15  0.025 0.15  0.05 ]
 [0.075 0.225 0.075 0.   ]] 



In [3]:
# Computation of joint entropy (zeros are skipped due to 0 * log(0) = 0)
H_TS = Pr_TS * np.log2(Pr_TS, out=np.zeros_like(Pr_TS), where=(Pr_TS!=0))
H_TS = - np.sum(H_TS)

print("The joint entropy becomes %g bit.\n" % (H_TS))

The joint entropy becomes 2.94546 bit.
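
Because the joint probabilities factor into $\Pr\{ {\cal T}=t \mid {\cal S}=s\} \cdot \Pr\{ {\cal S}=s\}$, the joint entropy splits into $H({\cal S}) + H({\cal T} \mid {\cal S})$ (the chain rule of entropy). The sketch below recomputes all three quantities from the table to check this numerically; the helper `safe_log2` is a hypothetical convenience function, not part of the cells above.

```python
import numpy as np

def safe_log2(p):
    """log2 that returns 0 where p == 0, so that 0 * log2(0) evaluates to 0."""
    return np.log2(p, out=np.zeros_like(p), where=(p != 0))

Pr_T_S = np.array([[0.1, 0.0, 0.1, 0.8],
                   [0.6, 0.1, 0.6, 0.2],
                   [0.3, 0.9, 0.3, 0.0]])
Pr_S = np.full(4, 0.25)

# Broadcasting scales column s of the conditional table by Pr{S=s}
Pr_TS = Pr_T_S * Pr_S

H_TS = -np.sum(Pr_TS * safe_log2(Pr_TS))     # joint entropy
H_S = -np.sum(Pr_S * safe_log2(Pr_S))        # entropy of the seasons
H_T_S = -np.sum(Pr_TS * safe_log2(Pr_T_S))   # conditional entropy

print("H(T,S)          = %g bit" % H_TS)
print("H(S) + H(T | S) = %g bit" % (H_S + H_T_S))
```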



## Marginal Probabilities and Entropies

The marginal probabilities can be computed using the relationships

\begin{align}
    \Pr\{ {\cal T}=t \} &= \sum_{ s \in \mathbb{S}} \Pr\{ {\cal T}=t, {\cal S}=s\} \\
    \Pr\{ {\cal S}=s \} &= \sum_{ t \in \mathbb{T}} \Pr\{ {\cal T}=t, {\cal S}=s\} . 
\end{align}

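Therefore, the temperature probabilities are obtained by summing the joint probabilities over all seasons, i.e. along each row of the joint-probability matrix computed above. Likewise, the season probabilities are obtained by summing over all temperatures, i.e. down each column of that matrix.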

In [4]:
# marginal probabilities: temperatures (sum over seasons) and seasons (sum over temperatures)
Pr_T = np.sum(Pr_TS, axis=1)
Pr_S = np.sum(Pr_TS, axis=0)

# Entropies for temperatures and seasons (all marginals are nonzero, so log2 is finite)
H_T = - Pr_T @ np.log2(Pr_T)
H_S = - Pr_S @ np.log2(Pr_S)

print("The temperature probabilities amount to \n", Pr_T)
print("The temperature entropy amounts to H(T)=%g bit. \n" %(H_T))

print("For validation: The season probabilities amount to \n", Pr_S)
print("The season entropy amounts to H(S)=%g bit. \n" %(H_S))

The temperature probabilities amount to 
 [0.25  0.375 0.375]
The temperature entropy amounts to H(T)=1.56128 bit. 

For validation: The season probabilities amount to 
 [0.25 0.25 0.25 0.25]
The season entropy amounts to H(S)=2 bit. 
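
The entropy computation above applies `np.log2` directly to the marginals, which works here because no marginal probability is zero. If an outcome never occurred, `np.log2(0)` would yield `-inf` and the product `0 * -inf` would turn the result into `nan`. A possible zero-safe variant is sketched below; the helper `entropy_bits` and the extra never-occurring temperature are assumptions made purely for illustration.

```python
import numpy as np

def entropy_bits(p):
    """Entropy in bit of a 1-D probability vector, using 0 * log2(0) = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(1.0 / p[nz])))

Pr_TS = 0.25 * np.array([[0.1, 0.0, 0.1, 0.8],
                         [0.6, 0.1, 0.6, 0.2],
                         [0.3, 0.9, 0.3, 0.0]])

Pr_T = Pr_TS.sum(axis=1)  # sum over seasons (along each row)
Pr_S = Pr_TS.sum(axis=0)  # sum over temperatures (down each column)

H_T = entropy_bits(Pr_T)
H_S = entropy_bits(Pr_S)
print("H(T) = %g bit, H(S) = %g bit" % (H_T, H_S))

# Hypothetical extension: a fourth temperature that never occurs
Pr_T_ext = np.append(Pr_T, 0.0)
H_T_ext = entropy_bits(Pr_T_ext)
print("H(T) with an impossible extra temperature = %g bit" % H_T_ext)
```

An outcome with probability zero leaves the entropy unchanged, as expected.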



## Conditional Probabilities and Entropies 2

Based on marginal and joint probabilities, we can determine the conditional probabilities

\begin{equation}
    \Pr\{ {\cal S} \mid {\cal T} \} = \frac{\Pr\{ {\cal T},  {\cal S} \}} {\Pr\{ {\cal T} \}} .
\end{equation}

They describe the probability with which a season occurs for a given temperature.

The conditional entropy $H({\cal S} \mid {\cal T})$ becomes

\begin{align*}
    H({\cal S} \mid {\cal T}) 
    &= - \mathrm{E} \big\{ \log_2 \Pr\{ {\cal S} \mid {\cal T} \} \big\} \\
    &= - \sum_{s \in \mathbb{S}} \sum_{t \in \mathbb{T}} \Pr \{ {\cal T}=t, {\cal S}=s \} \cdot \log_2 \Pr \{ {\cal S}=s \mid {\cal T}=t \} \\
    &= - \sum_{t \in \mathbb{T}} \Pr\{ {\cal T} = t \} \cdot \sum_{s \in \mathbb{S}} \Pr \{ {\cal S}=s \mid {\cal T}=t \} \cdot \log_2 \Pr \{ {\cal S}=s \mid {\cal T}=t \} .
\end{align*}

In [5]:
# Compute conditional probabilities: divide row t of the joint matrix by Pr{T=t}
Pr_S_T = Pr_TS / Pr_T[:, np.newaxis]

print("The conditional probabilities Pr{S | T} amount to \n", Pr_S_T, "\n")

# Computation of conditional entropy H( S | T)  (zeros are not considered due to 0 * log(0) = 0)
H_S_T = Pr_TS * np.log2(Pr_S_T, out=np.zeros_like(Pr_S_T), where=(Pr_S_T!=0))
H_S_T = - np.sum(H_S_T)

print("The conditional entropy H(S | T) amounts to %g bit.\n" %(H_S_T))

The conditional probabilities Pr{S | T} amount to 
 [[0.1        0.         0.1        0.8       ]
 [0.4        0.06666667 0.4        0.13333333]
 [0.2        0.6        0.2        0.        ]] 

The conditional entropy H(S | T) amounts to 1.38418 bit.
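
The cell above evaluates the double sum over the joint probabilities. The last line of the derivation offers an alternative reading: a season entropy for each temperature, weighted by the temperature probabilities, which are not uniform here. The following sketch (with the illustrative names `H_S_given_t` and `H_S_T_check`) evaluates that form:

```python
import numpy as np

temperatures = ["-10", "+10", "+20"]  # degrees C

Pr_TS = 0.25 * np.array([[0.1, 0.0, 0.1, 0.8],
                         [0.6, 0.1, 0.6, 0.2],
                         [0.3, 0.9, 0.3, 0.0]])
Pr_T = Pr_TS.sum(axis=1)
Pr_S_T = Pr_TS / Pr_T[:, np.newaxis]

# Each row of Pr{S | T} is a distribution over seasons and must sum to one
assert np.allclose(Pr_S_T.sum(axis=1), 1.0)

# Entropy of the season for each individual temperature (0 * log2(0) is taken as 0)
logs = np.log2(Pr_S_T, out=np.zeros_like(Pr_S_T), where=(Pr_S_T != 0))
H_S_given_t = -np.sum(Pr_S_T * logs, axis=1)

for name, h in zip(temperatures, H_S_given_t):
    print("H(S | T=%s degrees C) = %g bit" % (name, h))

# Weighting with the temperature probabilities gives the conditional entropy
H_S_T_check = Pr_T @ H_S_given_t
print("Weighted average: H(S | T) = %g bit" % H_S_T_check)
```

A temperature of -10°C points strongly to winter and therefore leaves the least uncertainty about the season.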



A comparison of the conditional entropies $H({\cal T} \mid {\cal S})$ and $H({\cal S} \mid {\cal T})$ with the unconditional entropies $H({\cal T})$ and $H({\cal S})$ shows that side information (conditioning) can reduce uncertainty. For instance, the temperature entropy drops from 1.56 bit to 0.945 bit when the season is known. For the seasons, the entropy drops from 2 bit to 1.38 bit when the temperature is given. In absolute terms, both reductions are equal (about 0.616 bit). Relative to the initial uncertainty, however, the reduction is larger for the temperature (about 39 %) than for the season (about 31 %). In this relative sense, this artificial example suggests that the temperature tells us less about the season than vice versa.
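
To put numbers on this comparison, the sketch below computes the absolute and the relative entropy reduction in both directions. It is only an illustration: the helper `entropy_bits` is hypothetical, and the conditional entropies are obtained through the chain rule $H({\cal T},{\cal S}) = H({\cal S}) + H({\cal T} \mid {\cal S}) = H({\cal T}) + H({\cal S} \mid {\cal T})$ instead of the double sums used in the cells above.

```python
import numpy as np

def entropy_bits(p):
    """Entropy in bit of an array of probabilities; zero entries are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(1.0 / p[nz])))

Pr_TS = 0.25 * np.array([[0.1, 0.0, 0.1, 0.8],
                         [0.6, 0.1, 0.6, 0.2],
                         [0.3, 0.9, 0.3, 0.0]])

H_T = entropy_bits(Pr_TS.sum(axis=1))
H_S = entropy_bits(Pr_TS.sum(axis=0))
H_TS = entropy_bits(Pr_TS)

# Conditional entropies via the chain rule
H_T_S = H_TS - H_S
H_S_T = H_TS - H_T

for name, h, h_cond in [("temperature", H_T, H_T_S), ("season", H_S, H_S_T)]:
    reduction = h - h_cond
    print("%-11s: %.3f bit -> %.3f bit, reduction %.3f bit (%.1f %%)"
          % (name, h, h_cond, reduction, 100 * reduction / h))
```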

## Mutual Information

The mutual information represents the common information of two processes, in our example seasons and temperatures. It is defined as

\begin{align}
    I({\cal T};{\cal S})
    &= H({\cal T}) + H({\cal S}) - H({\cal T},{\cal S})
     = H({\cal T}) - H({\cal T} \mid {\cal S})
     = H({\cal S}) - H({\cal S} \mid {\cal T}) .
\end{align}

The larger the mutual information, the more strongly the two processes depend on each other. The mutual information is bounded from above by the minimum of the entropies $H({\cal S})$ and $H({\cal T})$.

In [6]:
# Mutual information via I(S;T) = H(S) - H(S | T)
I_ST = H_S - H_S_T

print("The mutual information amounts to I(S;T)=%g bit.\n" %(I_ST))

The mutual information amounts to I(S;T)=0.615816 bit.
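
As a cross-check, the mutual information can also be computed without any entropy, by comparing the joint probabilities with the product of the marginals: $I({\cal T};{\cal S}) = \sum_{s} \sum_{t} \Pr\{ {\cal T}=t, {\cal S}=s\} \cdot \log_2 \frac{\Pr\{ {\cal T}=t, {\cal S}=s\}}{\Pr\{ {\cal T}=t\} \cdot \Pr\{ {\cal S}=s\}}$. This form does not appear in the text above; it is the standard equivalent expression. The sketch below evaluates it and checks the upper bound; the names `Pr_indep` and `I_direct` are illustrative.

```python
import numpy as np

Pr_TS = 0.25 * np.array([[0.1, 0.0, 0.1, 0.8],
                         [0.6, 0.1, 0.6, 0.2],
                         [0.3, 0.9, 0.3, 0.0]])
Pr_T = Pr_TS.sum(axis=1)
Pr_S = Pr_TS.sum(axis=0)

# Product of the marginals: the joint distribution if T and S were independent
Pr_indep = np.outer(Pr_T, Pr_S)

# Sum only over pairs that actually occur (0 * log2(0) is taken as 0)
nz = Pr_TS > 0
I_direct = np.sum(Pr_TS[nz] * np.log2(Pr_TS[nz] / Pr_indep[nz]))

# Marginal entropies for the upper bound (all marginals are nonzero here)
H_T = -np.sum(Pr_T * np.log2(Pr_T))
H_S = -np.sum(Pr_S * np.log2(Pr_S))

print("I(T;S) from joint and marginals = %g bit" % I_direct)
print("Upper bound min(H(T), H(S))     = %g bit" % min(H_T, H_S))
```

The result agrees with the entropy-based value and stays well below the bound given by the smaller of the two entropies.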

