# Exercise 1.1: Conditional probability
Since $p(y \mid \theta=1)= N(y \mid 1, \sigma^2)$ and $p(y \mid \theta=2)=N(y \mid 2, \sigma^2)$, we have:

\begin{equation*}
  p(y \mid \theta=\theta_0) = \begin{cases}
    N(y \mid \theta_0, \sigma^2) & \text{if } \theta_0 \in \{1, 2\}\\
    0 & \text{otherwise}
  \end{cases}
\end{equation*}

(a) Let $\sigma=2$, then:

\begin{equation*}
\begin{split}
p(y) & = \sum\limits_{\theta} p(y,\theta) = \sum\limits_{\theta} p(y \mid \theta)p(\theta)  = \sum\limits_{\theta \in \{1,2\}} N(y \mid \theta, 2^2) \cdot 0.5\\
& = 0.5 \cdot N(y \mid 1, 4) + 0.5 \cdot N(y \mid 2, 4)
\end{split}
\end{equation*}

This is an equal-weight mixture of two normal densities; it does not reduce to a single normal density.

<img src="figures/fig1.1.png">

(b) By Bayes' theorem and the law of total probability:

\begin{equation*} 
\begin{split}
p(\theta=1\mid y=1) & = \frac{p(y=1\mid \theta=1)\cdot p(\theta=1)}{p(y=1)} = \frac{p(y=1\mid \theta=1)\cdot p(\theta=1)}{p(y=1\mid \theta=1)\cdot p(\theta=1) + p(y=1\mid \theta=2)\cdot p(\theta=2)} \\
& = \frac{0.5 \cdot N(1\mid 1,4)}{0.5 \cdot N(1\mid 1,4) + 0.5 \cdot N(1\mid 2,4)} =  \frac{1}{1 + \exp(-\frac{1}{8})} \approx 0.531
\end{split}
\end{equation*}

(c) Notice that we have the following equalities:

\begin{equation*}
\begin{split}
N(1\mid 1, \sigma^2) & = \frac{1}{\sigma \sqrt{2\pi}} \\
N(1\mid 2, \sigma^2) & = \frac{1}{\sigma \sqrt{2\pi}} \cdot \exp\left(-\frac{1}{2}\cdot \left( \frac{1-2}{\sigma}\right)^2\right) = \frac{1}{\sigma \sqrt{2\pi}} \cdot \exp\left(-\frac{1}{2\sigma^2}\right)
\end{split}
\end{equation*}

Hence 

\begin{equation*} 
p(\theta=1\mid y=1)  = \frac{1}{1 + \exp\left(-\frac{1}{2\sigma^2}\right)} \longrightarrow
    \begin{cases}
        1 & \text{as } \sigma\rightarrow 0 \\
        \frac{1}{2} & \text{as } \sigma\rightarrow \infty
  \end{cases}
\end{equation*}

We deduce that as $\sigma\rightarrow 0$ the posterior probability of $\theta=1$ approaches 1, while as $\sigma\rightarrow \infty$ the posterior of $\theta$ approaches its prior.
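
The limiting behaviour in (c) can also be checked numerically. The snippet below is a minimal sketch that evaluates the closed-form posterior for several values of $\sigma$; the function name `posterior_theta1` is illustrative and the equal prior weights are assumed as in the exercise.

```python
import numpy as np


def posterior_theta1(sigma, y=1.0):
    """Posterior Pr(theta = 1 | y) with equal prior weights on theta in {1, 2}."""
    sigma = np.asarray(sigma, dtype=float)
    # log N(y | 2, sigma^2) - log N(y | 1, sigma^2); the normalising constants cancel
    log_ratio = ((y - 1.0) ** 2 - (y - 2.0) ** 2) / (2.0 * sigma ** 2)
    return 1.0 / (1.0 + np.exp(log_ratio))


for s in [0.1, 0.5, 1.0, 2.0, 10.0, 100.0]:
    print(f"sigma = {s:6.1f}  ->  Pr(theta=1 | y=1) = {float(posterior_theta1(s)):.4f}")
```

For $\sigma = 2$ this reproduces the value from part (b), and the printed values move from 1 towards 0.5 as $\sigma$ grows.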

---

Here's the Python code to generate Fig 1.1, the marginal density $p(y)$:
```python
import numpy as np
import scipy.stats as stats
import plotly.graph_objects as go

y = np.arange(-8, 10, 0.05)
py = 0.5 * stats.norm(loc=1, scale=2).pdf(y) + 0.5 * stats.norm(loc=2, scale=2).pdf(y)
fig = go.Figure(
        go.Scatter(x=y, 
                   y=py,
                   mode='markers',
                   marker={'size':3},
                  )
)

fig.update_layout(
    title={'text': 'Fig 1.1 - Marginal distribution density of y',
           'y':0.9, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
    xaxis={'title': 'y'},
    yaxis={'title': 'p(y)'})

fig
```

# Exercise 1.2: Conditional means and variances 

Let $u = (u_1, \dots, u_n)^T \in \mathbb{R}^n$. By definition we have:

\begin{align*}
    E[u] &= \begin{bmatrix}
           E[u_1] \\
           E[u_2] \\
           \vdots \\
           E[u_n]
         \end{bmatrix}
\end{align*}

From the univariate case it follows that:

\begin{equation*}
E[u_i] = E\left[E[u_i \mid v]\right]  \:\: \forall i\in \{1, \dots, n\}
\end{equation*}

Hence $E[u] = E\left[E[u \mid v]\right]$, which is the first result.

Recall that for a multivariate random variable $u \in \mathbb{R}^n$:

\begin{equation*}
cov(u) = E \big[ (u - E(u))(u - E(u))^T \big] = E[uu^T] - E[u]E[u]^T \in \mathbb{R}^{n\times n}
 \:\: \Longrightarrow
 \: \: cov(u)_{i,j} = \begin{cases}
    var(u_i) = cov(u_i, u_i) & \text{if } i=j\\
    cov(u_i, u_j) & \text{if } i\neq j
  \end{cases}
\end{equation*}

The following component-wise equality yields the result:

\begin{equation*}
\begin{split}
& E\big[ cov(u_i, u_j \mid v)\big] + cov\big(E[u_i \mid v], E[u_j \mid v] \big)  \\
& = E\left[E[u_i u_j \mid v] - E[u_i \mid v]\cdot E[u_j \mid v] \right] + E\left[ E[u_i \mid v] \cdot E[u_j \mid v] \right] - E\big[E[u_i \mid v]\big] \cdot E\left[E[u_j \mid v]\right] \\
& = E\left[ u_i u_j \right] - E\left[ E[u_i \mid v]\cdot E[u_j \mid v] \right] + E\left[E[u_i \mid v]\cdot E[u_j \mid v] \right] - E\left[E[u_i \mid v]\right] \cdot E\left[E[u_j \mid v]\right] \\
& = E\left[u_i u_j\right] - E[u_i]E[u_j] = cov(u_i, u_j) \:\: \forall i,j \in \{1,\dots , n\} 
\end{split}
\end{equation*}
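
The decomposition can be illustrated with a quick Monte Carlo check. This is only a sketch under assumed inputs: $v$ takes two values, and $u \mid v$ is bivariate normal with means and covariances chosen arbitrarily for illustration (none of these numbers come from the exercise).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical setup: v in {0, 1}, and u | v ~ N(means[v], covs[v])
p_v = np.array([0.3, 0.7])
means = np.array([[0.0, 1.0], [2.0, -1.0]])
covs = np.array([[[1.0, 0.3], [0.3, 2.0]],
                 [[0.5, -0.2], [-0.2, 1.0]]])

# Sample v first, then u given v
v = rng.choice(2, size=n, p=p_v)
u = np.empty((n, 2))
for k in range(2):
    idx = v == k
    u[idx] = rng.multivariate_normal(means[k], covs[k], size=int(idx.sum()))

# Left-hand side: sample estimate of cov(u)
lhs = np.cov(u, rowvar=False)

# Right-hand side: E[cov(u | v)] + cov(E[u | v]), computed exactly
expected_cond_cov = np.einsum('k,kij->ij', p_v, covs)
centered = means - p_v @ means
cov_cond_mean = np.einsum('k,ki,kj->ij', p_v, centered, centered)
rhs = expected_cond_cov + cov_cond_mean

print(np.round(lhs, 3))
print(np.round(rhs, 3))
```

The two printed matrices should agree up to Monte Carlo error.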

# Exercise 1.3: Probability calculation for genetics (from Lindley, 1965)

Let $C$ denote the child, $P_1, P_2$ denote the parents, $B$ be the event of having brown eyes and $H$ be the event of being a heterozygote ($Xx$).

We are given the following priors:

\begin{equation*}
\begin{split}
& Pr(xx) = p^2 \\
& Pr(H) = 2p(1-p) \\
& Pr(XX) = (1-p)^2
\end{split}
\end{equation*}

Then, using Bayes' theorem and assuming that the genotypes of the two parents are independent, we have:

\begin{equation*}
\begin{split}
Pr(C \in H \mid C\in B, P_1 \in B, P_2 \in B) & = \frac{Pr(C \in H, P_1 \in B, P_2 \in B)}{Pr(C\in B, P_1 \in B, P_2 \in B)} \\[5pt]
& = \frac{\sum\limits_{b_1 \in B}\sum\limits_{b_2 \in B} Pr(C \in H \mid P_1=b_1, P_2=b_2) Pr(P_1=b_1) Pr(P_2=b_2)}{\sum\limits_{b_1 \in B}\sum\limits_{b_2 \in B} Pr(C\in B \mid P_1=b_1, P_2=b_2) Pr(P_1=b_1) Pr(P_2=b_2)} \\[5pt]
& = \frac{\frac{1}{2}\cdot 4p^2(1-p)^2 + \frac{1}{2} \cdot 2p(1-p)^3 + \frac{1}{2}\cdot 2p(1-p)^3 + 0 \cdot (1-p)^4}{\frac{3}{4} \cdot 4p^2(1-p)^2 + 1\cdot 2p(1-p)^3 + 1\cdot 2p(1-p)^3 + 1\cdot (1-p)^4}\\[5pt]
& = \frac{2p^2(1-p)^2 + 2p(1-p)^3}{3p^2(1-p)^2 + 4p(1-p)^3 + (1-p)^4} = \frac{2p}{1+ 2p}\\
\end{split}
\end{equation*}

Since Judy is brown-eyed with brown-eyed parents, the prior for Judy being a heterozygote, denoted by $J \in H$, is 

\begin{equation*}
\begin{split}
& Pr(J\in H) = \frac{2p}{1+ 2p} \\
& Pr(J \in B \setminus H) = Pr(J = XX) = 1 - Pr(J\in H) = \frac{1}{1+ 2p}
\end{split}
\end{equation*}
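
As a sanity check on the closed form $\frac{2p}{1+2p}$, the double sum over brown-eyed parent genotypes can be enumerated with exact rational arithmetic. This is a minimal sketch; the helper names `genotype_prior`, `offspring_dist` and `heterozygote_given_brown_family` are illustrative and not part of the exercise.

```python
from fractions import Fraction
from itertools import product


def genotype_prior(p):
    """Population genotype probabilities, with p the frequency of the x allele."""
    return {"xx": p ** 2, "Xx": 2 * p * (1 - p), "XX": (1 - p) ** 2}


def offspring_dist(g1, g2):
    """Mendelian offspring distribution: each parent passes one allele at random."""
    dist = {"xx": Fraction(0), "Xx": Fraction(0), "XX": Fraction(0)}
    for a, b in product(g1, g2):
        child = "".join(sorted(a + b))  # 'X' sorts before 'x', so "xX" becomes "Xx"
        dist[child] += Fraction(1, 4)
    return dist


def heterozygote_given_brown_family(p):
    """Pr(child is Xx | child and both parents are brown-eyed)."""
    prior = genotype_prior(p)
    num = den = Fraction(0)
    for g1, g2 in product(("Xx", "XX"), repeat=2):
        weight = prior[g1] * prior[g2]
        dist = offspring_dist(g1, g2)
        num += dist["Xx"] * weight
        den += (dist["Xx"] + dist["XX"]) * weight
    return num / den


p = Fraction(1, 5)
print(heterozygote_given_brown_family(p), 2 * p / (1 + 2 * p))
```

Both printed values coincide for any allele frequency $p \in (0, 1)$ passed as a `Fraction`.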

Now we are given the information that she marries a heterozygote and they have $n$ brown-eyed children.
Let $X_i, \: i=1, \dots, n$ denote the $i$-th child, so that $X_i \in B$ is the event that the $i$-th child has brown eyes, and let $XX$ denote the event $B\setminus H$; then:

\begin{equation}
\begin{split}
Pr(J \in H \mid X_1 \in B, \dots, X_n\in B) & = \frac{Pr(X_1 \in B, \dots, X_n\in B \mid  J \in H ) \cdot Pr(J\in H)}{Pr(X_1 \in B, \dots, X_n\in B)}\\[5pt]
& = \frac{Pr(X_1 \in B, \dots, X_n\in B \mid  J \in H ) \cdot Pr(J\in H)}{Pr(X_1 \in B, \dots, X_n\in B \mid  J \in H )Pr(J\in H) + Pr(X_1 \in B, \dots, X_n\in B \mid  J = XX)Pr(J = XX)}
\end{split}
\end{equation}

Breaking it down we have:

\begin{equation*}
\begin{split}
& Pr(X_1 \in B, \dots, X_n\in B \mid  J \in H ) = \prod\limits_{i=1}^n Pr(X_i \in B \mid J\in H) = \left( \frac{3}{4}\right)^n \\
& Pr(X_1 \in B, \dots, X_n\in B \mid  J = XX) = 1
\end{split}
\end{equation*}

Therefore (1) reduces to the following:

\begin{equation}
\begin{split}
Pr(J \in H \mid X_1 \in B, \dots, X_n\in B) & = \frac{ \left( \frac{3}{4}\right)^n \cdot \frac{2p}{1+ 2p} }{ \left( \frac{3}{4}\right)^n \cdot \frac{2p}{1+ 2p} + \frac{1}{1+ 2p}}
\end{split}
\end{equation}

This will update our prior value for $J \in H$ and $J=XX$; for brevity let's denote by $p_{J_H, n}$ the value of equation (2).
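
To see how quickly the evidence from brown-eyed children lowers the probability that Judy is a heterozygote, equation (2) can be evaluated exactly. The sketch below assumes the prior $\frac{2p}{1+2p}$ derived above; the function name `judy_heterozygote_posterior` is illustrative.

```python
from fractions import Fraction


def judy_heterozygote_posterior(p, n):
    """Pr(Judy is Xx | n brown-eyed children with a heterozygous spouse)."""
    p = Fraction(p)
    prior_h = 2 * p / (1 + 2 * p)   # prior Pr(J in H)
    prior_xx = 1 - prior_h          # prior Pr(J = XX)
    lik_h = Fraction(3, 4) ** n     # each child is brown-eyed w.p. 3/4 if Judy is Xx
    lik_xx = Fraction(1)            # every child is brown-eyed if Judy is XX
    return lik_h * prior_h / (lik_h * prior_h + lik_xx * prior_xx)


for n in range(5):
    print(n, judy_heterozygote_posterior(Fraction(1, 2), n))
```

With $n = 0$ the function returns the prior, and the value decreases as $n$ grows.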

Given the information that Judy’s children are all brown-eyed, her grandchild has blue eyes only if Judy’s child is a heterozygote and her/his spouse is not XX.
First let's compute the probability that Judy's child is a heterozygote ($C \in H$):

\begin{equation*}
\begin{split}
Pr(C \in H \mid \text{all information above}) & = Pr(C \in H, J \in H \mid \text{all information above}) + Pr(C \in H, J = XX \mid \text{all information above}) \\[5pt]
& =  Pr(C \in H \mid C \in B, J \in H) \cdot p_{J_H, n} + Pr(C \in H \mid C \in B, J = XX) \cdot (1 - p_{J_H, n})\\[5pt]
& = \frac{2}{3} \cdot p_{J_H, n} + \frac{1}{2} \cdot (1-p_{J_H, n})  = \frac{2}{3} \cdot \frac{ \left( \frac{3}{4}\right)^n \cdot \frac{2p}{1+ 2p} } { \left( \frac{3}{4}\right)^n \cdot \frac{2p}{1+ 2p} + \frac{1}{1+ 2p}} + \frac{1}{2} \cdot \frac{\frac{1}{1+ 2p} }{ \left( \frac{3}{4}\right)^n \cdot \frac{2p}{1+ 2p} + \frac{1}{1+ 2p}} \\[5pt]
& = \frac{ 8p\cdot \left( \frac{3}{4}\right)^n + 3 }{ 6 \cdot \left( \left( \frac{3}{4}\right)^n \cdot 2p + 1 \right)}
\end{split}
\end{equation*}

Here $\frac{2}{3} = \frac{1/2}{3/4}$ is the probability that a brown-eyed child of two heterozygotes is a heterozygote, and $\frac{1}{2}$ is the probability that a child of an $XX$ parent and a heterozygous parent is a heterozygote (such a child is always brown-eyed).

Now assuming that Judy’s child is in $H$, the probability of her grandchild $Gc$ being blue-eyed depends upon the child's spouse:

\begin{equation*}
\begin{split}
& Pr(Gc = xx \mid C\in H, S=XX) = 0\\[5pt]
& Pr(Gc = xx \mid C\in H, S\in H) = \frac{1}{4}\\[5pt]
& Pr(Gc = xx \mid C\in H, S=xx) = \frac{1}{2}
\end{split}
\end{equation*}

Finally:

\begin{equation*}
\begin{split}
Pr(Gc = xx\mid \text{all information}) & = Pr(Gc = xx \mid C\in H, S=XX) Pr(C\in H) Pr(S=XX)\\
& \:\:\:\:+ Pr(Gc = xx \mid C\in H, S\in H)  Pr(C\in H) Pr(S\in H) \\
& \:\:\:\:+ Pr(Gc = xx \mid C\in H, S=xx)  Pr(C\in H) Pr(S=xx)\\[5pt]
& = Pr(C\in H) \left(0 \cdot (1-p)^2 + \frac{1}{4} \cdot 2p(1-p) + \frac{1}{2}\cdot p^2 \right)\\[5pt]
& = \frac{ 8p\cdot \left( \frac{3}{4}\right)^n + 3 }{ 6 \cdot \left( \left( \frac{3}{4}\right)^n \cdot 2p + 1 \right)} \cdot \left(\frac{1}{2}p \right)\\[5pt]
& = \frac{ p \left( 8p\cdot \left( \frac{3}{4}\right)^n + 3 \right) }{ 12 \cdot \left( \left( \frac{3}{4}\right)^n \cdot 2p + 1 \right)}
\end{split}
\end{equation*}

# Exercise 1.4: Probability assignment

As in the textbook let $y= \text{outcome}$, $x=\text{point spread}$ and $d = y-x$, so that $d  \mid x \sim N(0, 14^2)$ and $d$ is independent of $x$.

(a) Using empirical frequencies we have
\begin{equation*}
\begin{split}
& Pr(y>0 \mid x=8) = \frac{8}{12} = \frac{2}{3} \approx 0.67 \\
& Pr(y>8 \mid x=8) = \frac{5}{12}  \approx 0.42 \\
& Pr(y>8 \mid x=8, y>0) = \frac{5}{8} = 0.625
\end{split}
\end{equation*}

(b) Using the normal approximation as in the textbook:
\begin{equation*}
\begin{split}
& Pr(y>0 \mid x=8) = Pr(d>-8 \mid x=8) = Pr(d>-8) = 1 - \Phi\left( -\frac{8}{14} \right) \approx 0.72\\
& Pr(y>8 \mid x=8) = Pr(d>0 \mid x=8) = Pr(d>0) = \frac{1}{2} = 0.5 \\
& Pr(y>8 \mid x=8, y>0) = \frac{Pr(y>8 \mid x=8)}{Pr(y>0 \mid x=8)} = \frac{0.5}{0.716} \approx 0.70
\end{split}
\end{equation*}

---

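Python code:
```python
import scipy.stats as stats

# Pr(y > 0 | x = 8) = Pr(d > -8) = 1 - Phi(-8/14)
p_win = 1 - stats.norm(0, 1).cdf(-8 / 14)
print(p_win)  # 0.7161454169013237

# Pr(y > 8 | x = 8, y > 0) = Pr(d > 0) / Pr(d > -8)
print(0.5 / p_win)
```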

# Exercise 1.6: Conditional probability

Let $F$ be the event of being fraternal twins, $I$ the event of being identical twins.

We want to compute $P(I \mid \text{twin brother})$:

\begin{equation*}
\begin{split}
P(I \mid \text{twin brother}) & = \frac{P( I, \text{twin brother})}{P(\text{twin brother})} \\[5pt]
& = \frac{P( I, \text{twin brother})}{P( I, \text{twin brother}) + P(F, \text{twin brother})} \\[5pt]
& = \frac{P(\text{twin brother} \mid I) P(I)}{P(\text{twin brother} \mid I) P(I) + P(\text{twin brother} \mid F) P(F)}
\\[5pt]
& = \frac{\frac{1}{2} \cdot \frac{1}{300}}{\frac{1}{2} \cdot \frac{1}{300} + \frac{1}{4} \cdot \frac{1}{125}} = \frac{5}{11}
\end{split}
\end{equation*}
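
The arithmetic can be reproduced with exact fractions. This is a small sketch using the rates from the derivation above; the variable names are illustrative.

```python
from fractions import Fraction

# Prior rates of identical and fraternal twin births
p_identical = Fraction(1, 300)
p_fraternal = Fraction(1, 125)

# Probability that both twins are boys under each hypothesis
p_brother_given_identical = Fraction(1, 2)
p_brother_given_fraternal = Fraction(1, 4)

joint_identical = p_brother_given_identical * p_identical
joint_fraternal = p_brother_given_fraternal * p_fraternal

posterior_identical = joint_identical / (joint_identical + joint_fraternal)
print(posterior_identical)  # 5/11
```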


# Exercise 1.7: Conditional probability

Denote by $D_i$ the event that the fabulous prize is behind the $i$-th door, the door the contestant initially picks; then our prior is $P(D_i) = \frac{1}{3}$ and consequently the probability of the prize not being behind that door is $P(D^c_i) = 1 - P(D_i) = \frac{2}{3}$.

Let $S_j$ be the event that Monty Hall shows the inside of the $j$-th "lesser prize" door. We then ask ourselves what is $p(D_i \mid S_j)$.
Given the rules of the game, we have that $p(S_j \mid D_i) = 1 = p(S_j \mid D^c_i)$ since Monty Hall _always_ opens a losing door. Therefore:

\begin{equation*}
\begin{split}
p(D_i \mid S_j) & = \frac{p(S_j \mid D_i)p(D_i)}{p(S_j \mid D_i)p(D_i) + p(S_j \mid D^c_i)p(D^c_i)} \\[5pt]
& = \frac{1 \cdot \frac{1}{3}}{1 \cdot \frac{1}{3} + 1 \cdot \frac{2}{3}} = \frac{1}{3}\\[5pt]
\end{split}
\end{equation*}

which means that the probability that the $i$-th door hides the fabulous prize after Monty Hall opens door $j$ is still $\frac{1}{3}$. Hence the remaining door $k$ has probability $\frac{2}{3}$ of being the winning one. Switching doors doubles the contestant's probability of winning.
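
The result can also be checked by simulating the game. The sketch below assumes the standard rules (Monty always opens a losing door other than the contestant's pick, choosing at random when two such doors are available); the function name `simulate_monty_hall` is illustrative.

```python
import random


def simulate_monty_hall(n_trials=20_000, seed=0):
    """Estimate the winning frequencies of the 'stay' and 'switch' strategies."""
    rng = random.Random(seed)
    stay_wins = switch_wins = 0
    for _ in range(n_trials):
        prize = rng.randrange(3)
        pick = rng.randrange(3)
        # Monty opens a door that is neither the contestant's pick nor the prize
        openable = [d for d in range(3) if d != pick and d != prize]
        opened = rng.choice(openable)
        # The door the contestant gets by switching
        switched = next(d for d in range(3) if d not in (pick, opened))
        stay_wins += pick == prize
        switch_wins += switched == prize
    return stay_wins / n_trials, switch_wins / n_trials


stay, switch = simulate_monty_hall()
print(f"stay: {stay:.3f}, switch: {switch:.3f}")
```

The estimated frequencies should be close to $\frac{1}{3}$ for staying and $\frac{2}{3}$ for switching.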

# Exercise 1.8: Subjective probability

(a) Person $A$ observes the outcome of the die roll, so for $A$ the probability is degenerate:
\begin{equation*}
P_A(6) = \begin{cases} 
    1 & \text{if outcome = 6}\\
    0 & \text{otherwise}
\end{cases}
\end{equation*}
On the other hand, for person $B$, who doesn't observe the outcome, each result is equally likely (assuming the die is fair), i.e. $P_B(6) = \frac{1}{6}$.

(b) For simplicity of reasoning let's focus on the tournament phase of the soccer World Cup (32 teams).
For person $A$, we want to express some kind of domain ignorance. One could say that each of the 32 teams is equally likely to win, hence $P_A(\text{Brazil wins}) = \frac{1}{32}$. 

On the other hand, person $B$, who is a knowledgeable sports fan, could make further considerations and analyze the recent performance of players and teams. A starting point could be to consider the overall performance of Brazil in past World Cups and conclude that, since Brazil won 5 out of the 23 soccer World Cups, $P_B(\text{Brazil wins}) = \frac{5}{23}$.

# Exercise 1.9: Simulation of a queuing problem

```python
import pandas as pd
import numpy as np
import scipy.stats as stats


class Clinic:

    def __init__(self, n_doctors=3, n_minutes=420, exp_scale=10, unif_a=5, unif_b=20):
        """
        Parameters
        ----------
        n_doctors: int
            number of doctors in the clinic
        n_minutes: int
            number of minutes they receive people for
        exp_scale: (positive) float,
            scale parameter of the exponential random variable which
            models patients arrival time
        unif_a: float
            loc parameter of the uniform distribution
        unif_b: float
            scale + unif_a parameter of the uniform distribution
        """

        self.n_doctors = n_doctors
        self.n_minutes = n_minutes

        self.exp_scale = exp_scale
        self.exp_rv = stats.expon(scale=self.exp_scale)
        self.unif_rv = stats.uniform(loc=unif_a, scale=unif_b-unif_a)

        self.init_arrivals_visits()
        self.processes = None

    def init_arrivals_visits(self):
        """
        Initializes arrival times of the patients and visiting time
        """

        # Draw about twice the expected number of inter-arrival times so that
        # the cumulative arrival times run past the end of the admission window
        size = int(2*self.n_minutes//self.exp_scale)
        arrivals = self.exp_rv.rvs(size).cumsum()
        # Keep only the patients who arrive before the clinic stops admitting
        arrivals = arrivals[arrivals < self.n_minutes]

        self.arrivals = arrivals
        self.n_patients = self.arrivals.shape[0]

        self.visits_time = self.unif_rv.rvs(self.n_patients)

        return self

    def process(self):
        """
        Simulate one day process in the clinic

        Returns
        -------
        day_stats: pd.Series
            series containing statistics to monitor
        """
        # Time at which each doctor becomes free
        doctors = np.zeros(self.n_doctors, dtype=float)
        waiting_time, n_waited, visits_end = 0, 0, np.zeros(self.n_patients, dtype=float)

        for i, (arrival, visit_time) in enumerate(zip(self.arrivals, self.visits_time)):

            # The patient is seen by the first doctor to become free
            min_time = np.min(doctors)
            current_wait = np.maximum(min_time - arrival, 0)
            waiting_time += current_wait
            n_waited += (current_wait > 0)

            visit_start = np.max([min_time, arrival])
            visit_end = visit_start + visit_time

            doctors[np.argmin(doctors)] = visit_end
            visits_end[i] = visit_end

        # Named day_stats to avoid shadowing the scipy.stats module
        day_stats = pd.Series(data={
            'n_patients': self.n_patients,
            'n_waited': n_waited,
            'avg_waiting': waiting_time/self.n_patients,
            # The clinic closes when the last visit ends (max over all visits)
            'closing_time': np.max(visits_end)
        })

        return day_stats

    def simulate(self, n_processes=100):
        """
        Simulates n_processes (i.e. n days) and concatenates the results
        """
        df = pd.concat([self.init_arrivals_visits().process() for _ in range(n_processes)], axis=1).T
        self.processes = df
        self.n_processes = n_processes
        return self

    def summary(self):
        """
        prints the quantiles as required in the question
        """
        if not isinstance(self.processes, pd.DataFrame):
            print('Simulating 100 processes')
            self.simulate(n_processes=100)

        p = self.processes
        print(f'Simulated {self.n_processes} processes', '-'*25, p.quantile([0.25, 0.5, .75]), sep='\n')

        return self


c = Clinic()
print(c.process(), '\n')
c.simulate(n_processes=100).summary()
```

Output of one run (values vary between runs, since no random seed is set):

```
n_patients       46.000000
n_waited          6.000000
avg_waiting       0.464350
closing_time    428.243645
dtype: float64 

Simulated 100 processes
-------------------------
      n_patients  n_waited  avg_waiting  closing_time
0.25        37.0       3.0     0.200543    420.013916
0.50        41.0       5.0     0.503573    424.768200
0.75        46.0       9.0     0.966340    428.939956
```