# Attributions

**Chris:** Bulk of question

**Naveen:** Looked over final version

**Emily:** Edited most parts for clarity/added explanation


In [1]:
#Imports
import numpy as np
from scipy import integrate

import bokeh.io
import bokeh.plotting
bokeh.io.output_notebook()

<div class="alert alert-info">
You used <tt>from scipy import integrate</tt> here but called <tt>scipy.integrate</tt> rather than <tt>integrate</tt> in your code.  I've added a cell here so that the code runs on my machine, but just to let you know!
</div>

In [2]:
import scipy.integrate

**a)** We start with our best friend Bayes's Theorem, then use a binomial distribution for our likelihood and a uniform prior. We selected the binomial distribution for our likelihood because the experiment matches its setup: $n$ independent trials, each with probability $p$ of reversal, yielding $r$ reversals. We chose a uniform distribution for our prior because we do not have any prior (ha!) information about $p$, so it is simple and reasonable to assume all values are equally probable:

\begin{align}
\text{Posterior:} \, g(p \mid r, n) & = \frac{f(r \mid p, n) \, g(p)}{f(r \mid n)} \\
\text{Likelihood:} \, f(r \mid p, n) & = \binom{n}{r} p^r (1-p)^{n-r} \\
\text{Prior:} \, g(p) & = 1 \, \textrm{for all 0 $\leq$ p $\leq$ 1}
\end{align}

The probability of obtaining the data/evidence ($f(r \mid n)$) can then be calculated by requiring that the posterior integrate to 1 from $p = 0$ to $p = 1$; that is, $f(r \mid n) = \int_0^1 f(r \mid p, n) \, g(p) \, dp$.
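As a quick numerical check of this normalization, the sketch below integrates the likelihood times the uniform prior over $p$ for the wild-type counts used in part b ($n = 126$, $r = 13$). The function name `evidence` is our own and not part of the assignment. For a uniform prior the integral has the closed form $1/(n+1)$, which the code prints alongside the numerical value for comparison.

```python
from scipy import integrate, special

def evidence(n, r):
    """Integrate binomial likelihood times a uniform prior (g(p) = 1) over [0, 1]."""
    def integrand(p):
        return special.comb(n, r) * p**r * (1 - p)**(n - r)
    value, _ = integrate.quad(integrand, 0, 1)
    return value

n, r = 126, 13
# The two printed numbers should agree closely
print(evidence(n, r), 1 / (n + 1))
```

Since the evidence does not depend on $p$, it acts only as a normalization constant for the posterior.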

<div class="alert alert-info">
a: 10/10  
Nice work!
</div>

**b)** In order to plot the posterior probability density function for each of the three strains, we first need the value of the evidence, which is a constant for each strain. We can plug in the $n$ and $r$ from our trials to get this value. We'll start with the wild-type:

\begin{align}
g(p \mid r, n) & = C_{WT} \, p^{13} (1-p)^{126-13} \\
& = C_{WT} \, p^{13} (1-p)^{113} \\
C_{WT} & = \frac{1}{\int_0^1 p^{13} (1-p)^{126-13} \, dp} \\
& \approx 2.168 \times 10^{19}
\end{align}

Similarly, we find the constants for the other two strains and plug them into the posterior expression to get the following probability distributions (see calculations for C of each type below using scipy.integrate):

In [3]:
p = bokeh.plotting.figure(plot_height=250,
                          plot_width=700,
                          x_axis_label='Reversal Probability (p)',
                          y_axis_label='Posterior probability density')

# Wild-type
# Finding the constant
wt_func = lambda p: p**(13) * (1-p)**(126-13)
wt_const = scipy.integrate.quad(wt_func, 0, 1)
c1 = 1/wt_const[0]
# Plotting
x = np.linspace(0, 1, 1000)
y = np.vectorize(lambda x: c1 * x**13 * (1-x)**113)
p.line(x, y(x), color='dodgerblue', legend="Wild-type")

# ASH
ASH_func = lambda p: p**(39) * (1-p)**(124-39)
ASH_const = scipy.integrate.quad(ASH_func, 0, 1)
c2 = 1/ASH_const[0]
# Plotting
y = np.vectorize(lambda x: c2 * x**39 * (1-x)**(124-39))
p.line(x, y(x), color='tomato', legend="ASH")

# AVA
AVA_func = lambda p: p**(91) * (1-p)**(124-91)
AVA_const = scipy.integrate.quad(AVA_func, 0, 1)
c3 = 1/AVA_const[0]
# Plotting
y = np.vectorize(lambda x: c3 * x**91 * (1-x)**(124-91))
p.line(x, y(x), color='green', legend="AVA")

bokeh.io.show(p)

We can conclude that the wild-type has the lowest probability of reversal (posterior peak near $p \approx 0.10$), the ASH strain has a clearly higher probability (near $0.31$), and the AVA strain has the highest (near $0.73$), a bit more than 2x that of ASH. This indicates that stimulating the AVA command interneuron directly is more effective than stimulating the ASH sensory neuron, which should in turn lead to activation of AVA. This makes sense: neurons further down the cascade probably have more direct control over the final activity.

**Note:** We could have found these posteriors by using a conjugate prior like we did in lecture. We would've avoided doing a messy integral and gotten the same final result. We will now apply this concept in part c.
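A minimal sketch of the conjugate-prior route, assuming the uniform prior is written as Beta$(1, 1)$ so that the posterior is Beta$(r + 1, n - r + 1)$. The helper name `posterior_pdf` and its `a`, `b` arguments are our own illustration. The code computes the wild-type constant from the Beta function instead of a numerical integral and checks that the resulting curve matches `scipy.stats.beta`.

```python
import numpy as np
from scipy import special, stats

def posterior_pdf(p, n, r, a=1.0, b=1.0):
    """Beta(a + r, b + n - r) posterior density for a Beta(a, b) prior."""
    return stats.beta.pdf(p, a + r, b + n - r)

# Wild-type: normalization constant without numerical integration
n, r = 126, 13
c_wt = 1 / special.beta(r + 1, n - r + 1)
print(f"{c_wt:.3e}")

# Compare the explicit formula with the library density on a grid
p = np.linspace(0, 1, 1001)
direct = c_wt * p**r * (1 - p)**(n - r)
print(np.allclose(direct, posterior_pdf(p, n, r)))
```

The printed constant should agree with the value of $C_{WT}$ found by integration above, and the same function covers the other two strains by changing `n` and `r`.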

<div class="alert alert-info">
b: 10/10  
Good job!  In particular, nice analysis of the biological implications of your results.  Your code is fine, but it would be cleaner (not to mention easier on you) to have a single function that takes $n$ and $r$ as arguments rather than writing a new function for each strain.
</div>

**c)** The two data sets that are being compared are independent from each other and therefore can be expressed as individual probabilities/distributions, then multiplied together. We will use the functional form of the Beta distribution to express both priors,

\begin{align}
g(p_1 \mid a, b) = \frac{p_1^{a-1}(1-p_1)^{b-1}}{B(a,b)} \\
g(p_2 \mid a, b) = \frac{p_2^{a-1}(1-p_2)^{b-1}}{B(a,b)}
\end{align}

where $a = b = 1$ because we assumed a uniform distribution for the prior.

Now, we want the product of the two priors multiplied by the product of the two likelihoods. Notice that the Beta functions depend only on the known parameters ($a$, $b$, $n$, and $r$), so they are constants, as are the $\binom{n}{r}$ factors from the likelihoods. When we multiply all the probabilities together, we simply roll the constants into a single normalization constant, $C$. Using the manipulation described in lecture to combine the likelihood and prior, this leaves us with:

\begin{align}
g(p_1,p_2 \mid n_1,n_2,r_1,r_2) & = C \, p_1^{r_1}(1-p_1)^{n_1-r_1} \, p_2^{r_2}(1-p_2)^{n_2-r_2} \\
C & = \frac{1}{B(r_1 + 1, n_1 - r_1 + 1) \, B(r_2 + 1, n_2 - r_2 + 1)}
\end{align}

On its own, this joint distribution is not very informative, since it does not directly compare the two strains. However, we can transform $p_1$ and $p_2$ into variables that are more useful, which we do in the next part.
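To make the joint posterior concrete, here is a hedged sketch that evaluates it on a grid over the unit square. It assumes uniform priors, so the normalization is $1/[B(r_1+1, n_1-r_1+1)\,B(r_2+1, n_2-r_2+1)]$, which is the choice consistent with the constant found in part b. The function name `log_joint_posterior`, the midpoint grid, and the pairing of wild-type with ASH are our own choices for illustration.

```python
import numpy as np
from scipy import special

def log_joint_posterior(p1, p2, n1, r1, n2, r2):
    """Log of g(p1, p2 | n1, n2, r1, r2) for independent uniform priors."""
    log_c = -(special.betaln(r1 + 1, n1 - r1 + 1)
              + special.betaln(r2 + 1, n2 - r2 + 1))
    return (log_c
            + r1 * np.log(p1) + (n1 - r1) * np.log1p(-p1)
            + r2 * np.log(p2) + (n2 - r2) * np.log1p(-p2))

# Midpoint grid on the unit square (avoids log(0) at the edges)
N = 400
mid = (np.arange(N) + 0.5) / N
P1, P2 = np.meshgrid(mid, mid, indexing="ij")

# Illustrative pairing (our choice): strain 1 = wild-type, strain 2 = ASH
joint = np.exp(log_joint_posterior(P1, P2, 126, 13, 124, 39))

# Midpoint-rule estimate of the integral over the unit square; should be close to 1
print(joint.mean())
```

Working on the log scale avoids overflow from the very large normalization constants.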

<div class="alert alert-info">
c: 6/6  
Great job!
</div>

**d)** Now we can use the Jacobian of the change of variables, as shown in lecture, to transform to our new variables $\delta$ and $\gamma$, where $\gamma = p_2+p_1$ and $\delta = p_2-p_1$. We accomplish this by first expressing $p_1$ and $p_2$ in terms of $\gamma$ and $\delta$ and then taking the determinant of the matrix of their partial derivatives:

\begin{align}
p_1 & = (\gamma - \delta)/2 \\
p_2 & = (\gamma + \delta)/2
\end{align}

So we can write out the Jacobian determinant, whose absolute value is the factor we need:

\begin{align}
\begin{vmatrix} \frac{\partial p_2}{\partial \gamma} & \frac{\partial p_2}{\partial \delta}\\ \frac{\partial p_1}{\partial \gamma} & \frac{\partial p_1}{\partial \delta} \end{vmatrix} & = \begin{vmatrix} \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & -\frac{1}{2} \end{vmatrix} = -\frac{1}{2}, \quad \text{with absolute value } \frac{1}{2} \\
g(\gamma,\delta \mid n_1,n_2,r_1,r_2) & = \frac{C}{2} (\gamma^{r_1}(1-\gamma)^{n_1-r_1})(\delta^{r_2}(1-\delta)^{n_2-r_2}) \\
\end{align}

Now to get what we're finally after, we need to marginalize with respect to $\gamma$ to remove it from the posterior expression. First we need to note the range of both $\delta$ and $\gamma$. Because $\delta$ is the difference, it can range from -1 to 1 depending on which probability is bigger since each probability ranges from 0 to 1. Similarly, $\gamma$ ranges from 0 to 2 since it's the sum of the two (and they're independent). This means that for values of $\gamma$ outside of [0,2] and values of $\delta$ outside of [-1,1], the probability would be zero.

<div class="alert alert-info">
d: 5/8  
You have the right idea here but are missing a few steps in the execution.
<ul>
<li>-2: You need to substitute in the expressions you found for $p_1$ and $p_2$ in terms of $\delta$ and $\gamma$.  Specifically, the final result (for the normalized form) should be
\begin{align}
P \left( \delta, \gamma \mid r_1, r_2, n_1, n_2 \right) &= \frac{C}{2} \left( \frac{\gamma - \delta}{2} \right)^{r_1} \left( 1 - \frac{\gamma - \delta}{2} \right)^{n_1 - r_1} \left( \frac{\gamma + \delta}{2} \right)^{r_2} \left( 1 - \frac{\gamma + \delta}{2} \right)^{n_2 - r_2}
\end{align}
</li>
<li>-1: The bounds on $\delta$ are correct, but you need to bound $\gamma$ in terms of $\delta$ to ensure that $p_1$ and $p_2$ are valid probabilities.  Specifically, you will find that $\gamma$ must satisfy $\left| \delta \right| \leq \gamma \leq 2 - \left| \delta \right|$.</li>
</ul>
</div>
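The change of variables and the bounds on $\gamma$ can be checked numerically. The sketch below follows the definitions $\gamma = p_2 + p_1$ and $\delta = p_2 - p_1$ and the substitution used in the feedback above. The random sampling and variable names are our own illustration and not part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
p1, p2 = rng.uniform(0, 1, size=(2, 100_000))

gamma = p2 + p1
delta = p2 - p1

# Inverting the transformation recovers the original probabilities
assert np.allclose((gamma - delta) / 2, p1)
assert np.allclose((gamma + delta) / 2, p2)

# Jacobian matrix of (p1, p2) with respect to (gamma, delta); rows are p1, p2
J = np.array([[0.5, -0.5],
              [0.5,  0.5]])
print(abs(np.linalg.det(J)))

# Every valid (p1, p2) pair satisfies |delta| <= gamma <= 2 - |delta|
inside = (np.abs(delta) <= gamma) & (gamma <= 2 - np.abs(delta))
print(inside.all())
```

Conversely, a point such as $\gamma = 1.9$, $\delta = 0.9$ lies inside the rectangle $[0, 2] \times [-1, 1]$ but gives $p_2 = 1.4$, which is not a valid probability.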

**e)** Based on the explanation in **(d)**, we see that if we marginalize over $\gamma$ by integrating with respect to $\gamma$ from 0 to 2, we get the marginal distribution:

$$ g(\delta \mid n_1,n_2,r_1,r_2) = \int_0^2 \frac{C}{2} (\gamma^{r_1}(1-\gamma)^{n_1-r_1})(\delta^{r_2}(1-\delta)^{n_2-r_2}) d\gamma $$

<div class="alert alert-info">
e: 4/4  
-0: See above for correct bounds on integral over $\gamma$.
</div>
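With the substituted posterior and the bounds $\left| \delta \right| \leq \gamma \leq 2 - \left| \delta \right|$ from the feedback on part d, the marginal over $\gamma$ can be evaluated numerically. What follows is only a sketch under stated assumptions: uniform priors (so each $p_i$ has a Beta$(r_i + 1, n_i - r_i + 1)$ posterior), a simple midpoint rule in place of an adaptive integrator, and an illustrative pairing of wild-type as strain 1 with ASH as strain 2. The name `marginal_delta` and the grid sizes are our own.

```python
import numpy as np
from scipy import stats

def marginal_delta(delta, n1, r1, n2, r2, num=2000):
    """Approximate g(delta | data) by a midpoint rule over the valid gamma range."""
    lo, hi = abs(delta), 2 - abs(delta)
    h = (hi - lo) / num
    gamma = lo + (np.arange(num) + 0.5) * h   # midpoints, strictly inside (lo, hi)
    p1 = (gamma - delta) / 2
    p2 = (gamma + delta) / 2
    # 1/2 is the absolute Jacobian determinant of the change of variables
    dens = 0.5 * (stats.beta.pdf(p1, r1 + 1, n1 - r1 + 1)
                  * stats.beta.pdf(p2, r2 + 1, n2 - r2 + 1))
    return dens.sum() * h

# Illustrative pairing (our choice): strain 1 = wild-type, strain 2 = ASH
num_delta = 400
step = 2 / num_delta
deltas = -1 + (np.arange(num_delta) + 0.5) * step
density = np.array([marginal_delta(d, 126, 13, 124, 39) for d in deltas])

total = density.sum() * step                  # should be close to 1
mean_delta = (deltas * density).sum() * step  # posterior mean of delta
print(total, mean_delta)
```

For this pairing the posterior mean of $\delta$ should come out near $0.21$, the difference of the two posterior means $(r_i + 1)/(n_i + 2)$.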

<div class="alert alert-info">
Other: 2/2  
Explained code cells clearly and included attributions.  Great discussion of your mathematical approach!
</div>

<div class="alert alert-info">
Total: 37/40
</div>