# Exercise 1 - Inference Queries, Independence

In this exercise, we will answer inference queries from a probability table and check for marginal independence.

If a problem persists, do not hesitate to contact the course instructors at
- paul.kahlmeyer@uni-jena.de

### Submission

- Submission deadline: 30.10.2022
- Submit via the [Moodle page](https://moodle.uni-jena.de/course/view.php?id=34630)

### Help
If you cannot solve a task, you can load the saved values from the `help` directory and continue working on the other tasks:
- Load arrays with [Numpy](https://numpy.org/doc/stable/reference/generated/numpy.load.html)
```
import numpy as np
np.load('help/array_name.npy')
```
- Load functions with [Dill](https://dill.readthedocs.io/en/latest/dill.html)
```
import dill
with open('help/some_func.pkl', 'rb') as f:
    func = dill.load(f)
```

# Probability Table

We will use a probability table derived from the [migraine dataset](https://www.kaggle.com/datasets/weinoose/migraine-classification). 
Because the dataset comes without a description, we can only guess what some of the attributes mean exactly.
Nevertheless, we have 11 discrete features from patients suffering from migraine.

1. `Age`: the age of the patient in intervals of 20 years
    - 0: $\leq 20$
    - 1: $\in (20, 40]$
    - 2: $\in (40, 60]$
    - 3: $>60$
2. `Duration`: how long did the migraine attack last?
    - 0: short
    - 1: normal
    - 2: long
3. `Intensity`: how intense was the migraine attack?
    - 0: very light
    - 1: light
    - 2: intense
    - 3: very intense
4. `Nausea`: did the patient feel sick?
    - 0: no
    - 1: yes
5. `Vomit`: did the migraine attack cause the patient to vomit?
    - 0: no
    - 1: yes
6. `Phonophobia`: was the patient particularly sensitive to sound?
    - 0: no
    - 1: yes
7. `Photophobia`: was the patient particularly sensitive to light?
    - 0: no
    - 1: yes
8. `Tinnitus`: did the patient suffer from tinnitus?
    - 0: no
    - 1: yes
9. `Conscience`: did the patient lose consciousness?
    - 0: no
    - 1: yes
10. `Paresthesia`: did the patient feel numbness?
    - 0: no
    - 1: yes
11. `Type`: What kind of migraine did the patient have?
    - 0: Basilar-type aura
    - 1: Familial hemiplegic migraine
    - 2: Migraine without aura
    - 3: Other
    - 4: Sporadic hemiplegic migraine
    - 5: Typical aura with migraine
    - 6: Typical aura without migraine

### Task 1

Load the probability table from `prob_table.npy`.
The first 11 columns correspond to the features; the last column holds the probability.

In [None]:
import numpy as np

columns = ['Age', 
           'Duration', 
           'Intensity', 
           'Nausea', 
           'Vomit', 
           'Phonophobia',
           'Photophobia', 
           'Vertigo', 
           'Tinnitus', 
           'Conscience', 
           'Paresthesia',
           'Type']

# TODO: load probability table

# Inference Queries

The probability table encodes the joint probability distribution $p(x_1, \dots, x_{11})$, where $x_i$ corresponds to the $i$-th feature. The whole point of having such a distribution is to answer queries with it.
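To make the table layout concrete, here is a minimal sketch on a hypothetical toy table with three binary features (not the migraine data). Each row is one joint assignment, the last column holds its probability, and all probabilities sum to one. The name `toy_table` and all numbers are made up for illustration.

```python
import itertools
import numpy as np

# Hypothetical toy table: three binary features, values are made up.
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])

# One row per joint assignment; the last column is the probability.
toy_table = np.column_stack([states, probs])

print(toy_table.shape)          # rows x (features + 1)
print(toy_table[:, -1].sum())   # total probability mass
```

The same checks (shape and total mass) are a cheap sanity test after loading the real table.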

## Prior Marginal
For the prior marginal, a subset of indices $I\subseteq\{1,\dots, 11\}$ is given and the marginal distribution 

\begin{equation}
p(x_I)
\end{equation}

has to be computed.
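Marginalizing amounts to summing the joint probability over all features that are not in $I$. The following is a minimal sketch of one possible approach on a hypothetical toy table with three binary features. It assumes 0-based column indices (the formulas count from 1), and the helper name `marginalize` is an illustration, not part of the exercise.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features (not the migraine data).
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])
toy_table = np.column_stack([states, probs])

def marginalize(table, idx):
    """Sum out every feature column that is not listed in idx (0-based)."""
    # Group rows by their values in the kept columns.
    values, inverse = np.unique(table[:, idx], axis=0, return_inverse=True)
    # Add up the probability mass of each group.
    mass = np.bincount(inverse.ravel(), weights=table[:, -1])
    return np.column_stack([values, mass])

marg = marginalize(toy_table, [0, 2])
print(marg)
```

The result is again a probability table, now with one row per assignment of the kept features.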

### Task 2

Calculate the marginal distribution of `Vertigo` and `Vomit`.

In [None]:
def prior_marginal(prob_table:np.ndarray, I:np.ndarray) -> np.ndarray:
    '''
    Computes the probability table for a subset of the indices.
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        I... numpy array with indices
    
    @Returns:
        numpy array with columns holding values, last column holding the probabilities for indices in I
    '''
    # TODO: implement
    pass

# TODO: calculate p(Vertigo, Vomit)

## Posterior Marginal
For the posterior marginal, two subsets of indices $I, J\subseteq\{1,\dots, 11\}$ together with values $e_J\in \mathcal{X}_J$ are given and the conditional distribution 

\begin{equation}
p(x_I|x_J=e_J) 
\end{equation}

has to be computed.
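By the definition of conditional probability, $p(x_I|x_J=e_J) = p(x_I, x_J=e_J)/p(x_J=e_J)$. One way to compute this from a table is to keep only the rows that agree with the evidence, sum the mass per assignment of $x_I$, and renormalize. The sketch below does this on a hypothetical toy table; the function name `toy_posterior` and the 0-based indices are assumptions made for the example.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features (not the migraine data).
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])
toy_table = np.column_stack([states, probs])

def toy_posterior(table, I, J, e_J):
    """Return p(x_I | x_J = e_J) as a table; I and J are 0-based indices."""
    # Keep only the rows that agree with the evidence.
    mask = np.all(table[:, J] == np.asarray(e_J), axis=1)
    consistent = table[mask]
    # Sum the remaining probability mass per assignment of x_I.
    values, inverse = np.unique(consistent[:, I], axis=0, return_inverse=True)
    mass = np.bincount(inverse.ravel(), weights=consistent[:, -1])
    # The total consistent mass equals p(x_J = e_J); divide to normalize.
    return np.column_stack([values, mass / mass.sum()])

post = toy_posterior(toy_table, I=[2], J=[0], e_J=[1])
print(post)
```

The sketch assumes the evidence has nonzero probability; otherwise the normalization divides by zero.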

### Task 3
Calculate the posterior marginal distribution of `Type` given that the patient feels sick (`Nausea` = 1) but has no tinnitus (`Tinnitus` = 0).

In [None]:
def posterior_marginal(prob_table:np.ndarray, I:np.ndarray, J:np.ndarray, e_J:np.ndarray) -> np.ndarray:
    '''
    Computes the probability table for a subset of the indices given other subset is set to values.
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        I... numpy array with indices
        J... numpy array with indices
        e_J... numpy array with values for J
    
    @Returns:
        numpy array with columns holding values, last column holding the probabilities for indices in I
    '''
    # TODO: implement
    pass

# TODO: calculate p(Type | Nausea = 1, Tinnitus = 0)

## Probability of Evidence

For subsets of indices $I, J \subseteq \{1, \dots, 11\}$ and evidence $e_I\in \mathcal{X}_I$ and $e_J\in\mathcal{X}_J$, compute the posterior marginal probability 

\begin{equation}
p(x_I = e_I| x_J = e_J)\,.
\end{equation}

In the special case $J = \emptyset$, compute the prior marginal probability $p(x_I = e_I)$.
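Since the result is a single number, it can be computed as a ratio of two sums over table rows: the mass of rows matching both $e_I$ and $e_J$, divided by the mass of rows matching $e_J$. The sketch below shows this on a hypothetical toy table, including the case $J = \emptyset$. The function name `toy_prob_of_evidence` and the 0-based indices are assumptions.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features (not the migraine data).
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])
toy_table = np.column_stack([states, probs])

def toy_prob_of_evidence(table, I, e_I, J, e_J):
    """Return p(x_I = e_I | x_J = e_J); I and J are 0-based indices."""
    I, J = np.asarray(I, dtype=int), np.asarray(J, dtype=int)
    # Rows that agree with e_J; an empty J matches every row.
    match_J = np.all(table[:, J] == np.asarray(e_J), axis=1)
    match_I = np.all(table[:, I] == np.asarray(e_I), axis=1)
    p_joint = table[match_I & match_J, -1].sum()
    p_evidence = table[match_J, -1].sum()
    return p_joint / p_evidence

p_cond = toy_prob_of_evidence(toy_table, [2], [1], [0], [1])
p_prior = toy_prob_of_evidence(toy_table, [2], [1], [], [])
print(p_cond, p_prior)
```

With an empty `J`, the denominator is the total mass of the table, so the function falls back to the prior marginal probability.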

### Task 4

Calculate the probability of a short attack (`Duration` = 0) given that the patient is $\leq 20$ years old (`Age` = 0) and experiences vertigo (`Vertigo` = 1).

In [None]:
def prob_of_evidence(prob_table:np.ndarray, I:np.ndarray, e_I: np.ndarray, J:np.ndarray, e_J:np.ndarray) -> float:
    '''
    Computes the probability of I being e_I given J is e_J.
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        I... numpy array with indices
        e_I... numpy array with values for I
        J... numpy array with indices
        e_J... numpy array with values for J
    
    @Returns:
        probability of I being e_I given J is e_J.
    '''
    
    # TODO: implement
    pass


# TODO: calculate p(Duration = 0 | Age = 0, Vertigo = 1)

## Most probable explanation (MPE)

Given evidence $e_J\in\mathcal{X}_J$ for a subset of indices $J\subseteq\{1,\dots, 11\}$, compute

\begin{equation}
\text{argmax}_{x\in\mathcal{X}} p(x|x_J = e_J)\,.
\end{equation}
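Dividing by $p(x_J = e_J)$ does not change which assignment is the most probable, so it suffices to pick the row with the largest joint probability among the rows consistent with the evidence. A minimal sketch on a hypothetical toy table follows; the function name `toy_mpe` and the 0-based indices are assumptions.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features (not the migraine data).
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])
toy_table = np.column_stack([states, probs])

def toy_mpe(table, J, e_J):
    """Return the full assignment x with the highest p(x | x_J = e_J)."""
    # Keep only the rows that agree with the evidence.
    mask = np.all(table[:, J] == np.asarray(e_J), axis=1)
    consistent = table[mask]
    # The normalizing constant is the same for all rows, so compare joints.
    best = np.argmax(consistent[:, -1])
    return consistent[best, :-1]

mpe = toy_mpe(toy_table, J=[2], e_J=[0])
print(mpe)
```

The returned vector contains all features, so a single component such as the intensity can be read off by its column index.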

### Task 5

What is the intensity in the most probable explanation for a long (`Duration` = 2) migraine attack of a 30-year-old (`Age` = 1) patient with tinnitus (`Tinnitus` = 1) and both phono- and photophobia (`Phonophobia` = 1, `Photophobia` = 1), where we know that the attack is of the type "Basilar-type aura" (`Type` = 0)?

In [None]:
def most_prob_explanation(prob_table:np.ndarray, J:np.ndarray, e_J:np.ndarray) -> np.ndarray:
    '''
    Computes the most probable x given some evidence
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        J... numpy array with indices
        e_J... numpy array with values for J
    
    @Returns:
        x that maximizes probability of x given J is set to e_J
    '''
    # TODO: implement
    pass


# TODO: calculate intensity of argmax p(x | Age = 1, Tinnitus = 1, Duration = 2, Phonophobia = 1, Photophobia = 1, Type = 0)

## Maximum a Posteriori Hypothesis (MAP)

For subsets of indices $I, J \subseteq \{1, \dots, 11\}$ and evidence $e_J\in\mathcal{X}_J$, compute 

\begin{equation}
\text{argmax}_{x_I} p(x_I|x_J = e_J)\,.
\end{equation}
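Unlike the MPE, the MAP hypothesis first sums out all features outside $I$ and $J$ and only then takes the maximum, so it cannot in general be read off the most probable full assignment. The sketch below uses a hypothetical toy table chosen so that the two disagree; the function name `toy_map`, the numbers, and the 0-based indices are assumptions.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features; the numbers are chosen
# so that the MAP value differs from the matching component of the MPE.
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.30, 0.10, 0.00, 0.05, 0.20, 0.05, 0.20])
toy_table = np.column_stack([states, probs])

def toy_map(table, I, J, e_J):
    """Return the assignment of x_I with the highest p(x_I | x_J = e_J)."""
    mask = np.all(table[:, J] == np.asarray(e_J), axis=1)
    consistent = table[mask]
    # Sum out everything that is neither in I nor fixed by the evidence.
    values, inverse = np.unique(consistent[:, I], axis=0, return_inverse=True)
    mass = np.bincount(inverse.ravel(), weights=consistent[:, -1])
    return values[np.argmax(mass)]

map_first = toy_map(toy_table, I=[0], J=[2], e_J=[1])

# For comparison: most probable full assignment under the same evidence.
consistent = toy_table[toy_table[:, 2] == 1]
mpe = consistent[np.argmax(consistent[:, -1]), :-1]
print(map_first, mpe)
```

Here the single most probable row has a 0 in the first column, while the first column's posterior mass is larger for the value 1.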

### Task 6
What is the maximum a posteriori hypothesis for the type of migraine attack (`Type`) of a 15-year-old person (`Age` = 0) who has tinnitus (`Tinnitus` = 1)?

In [None]:
def max_a_posteriori(prob_table:np.ndarray, I:np.ndarray, J:np.ndarray, e_J:np.ndarray) -> np.ndarray:
    '''
    Computes the most probable x_I given some evidence
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        I... numpy array with indices
        J... numpy array with indices
        e_J... numpy array with values for J
    
    @Returns:
        x_I that maximizes the probability of x_I given J is set to e_J
    '''
    # TODO: implement
    pass

# TODO: calculate argmax p(Type | Age = 0, Tinnitus = 1)

# Independence

As pointed out in the lecture, the number of parameters reduces if we know two features are independent.
Independence of features also has great value for the interpretation of data: One feature does not contain any information about the other.

Here we look at **marginal independence**. Two features $x_i, x_j$ are marginally independent if
\begin{equation}
p(x_i, x_j) = p(x_i)p(x_j)\,.
\end{equation}

Of course, real data will almost never exhibit perfect marginal independence.
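One way to quantify how far two features are from marginal independence is to compare their joint distribution with the product of their marginals. The sketch below does this on a hypothetical toy table and uses the sum of absolute differences as the error; that choice of distance, the function name `toy_independence_error`, and the 0-based indices are assumptions, and other distances would work as well.

```python
import itertools
import numpy as np

# Hypothetical toy table with three binary features (not the migraine data).
states = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
probs = np.array([0.10, 0.05, 0.15, 0.20, 0.05, 0.10, 0.05, 0.30])
toy_table = np.column_stack([states, probs])

def toy_independence_error(table, i, j):
    """Sum of |p(x_i, x_j) - p(x_i) p(x_j)| over all value pairs."""
    # Map the values of each feature to positions 0, 1, 2, ...
    vals_i, inv_i = np.unique(table[:, i], return_inverse=True)
    vals_j, inv_j = np.unique(table[:, j], return_inverse=True)
    # Accumulate the joint distribution p(x_i, x_j) as a matrix.
    joint = np.zeros((len(vals_i), len(vals_j)))
    np.add.at(joint, (inv_i, inv_j), table[:, -1])
    # The marginals are the row and column sums of the joint.
    p_i, p_j = joint.sum(axis=1), joint.sum(axis=0)
    return np.abs(joint - np.outer(p_i, p_j)).sum()

err = toy_independence_error(toy_table, 0, 2)
print(err)
```

An error of zero means the two features are exactly marginally independent; larger values indicate a stronger dependence.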

### Task 7
Implement the function `independence_error`, which measures how similar the vectors
\begin{align}
v_1 &= \left[p(x_i = e_i, x_j = e_j)\right]_{e_i\in\mathcal{X}_i,e_j\in\mathcal{X}_j}\\
v_2 &= \left[p(x_i = e_i)p(x_j = e_j)\right]_{e_i\in\mathcal{X}_i,e_j\in\mathcal{X}_j}
\end{align}
are. If the vectors are very similar, the features $x_i$ and $x_j$ are close to being marginally independent.

Which features are closest to being marginally independent from `Type`?

In [None]:
def independence_error(prob_table : np.ndarray, i : int, j : int) -> float:
    '''
    Compares the vectors p(x_i, x_j) and p(x_i)*p(x_j).
    
    @Params:
        prob_table... numpy array with columns holding values, last column holding the probabilities
        i... index of first feature
        j... index of second feature
        
    @Returns:
        difference of vectors p(x_i, x_j) and p(x_i)*p(x_j)
    '''
    # TODO: implement
    pass

# TODO: which features are close to marginal independence with 'Type'?