# Part 2: Bayesian Concepts

## Bayes' theorem

If $Y$ is a random variable, then $f(y|\theta)$ is a probability distribution representing the sampling model for the observed data $y = (y_1, y_2, \ldots, y_n)$ given an unknown parameter $\theta$.
Viewed as a function of $\theta$ for fixed $y$, $f(y|\theta)$ is often called the *likelihood* and is sometimes written as $L(\theta; y)$.
Note that $L(\theta; y)$ is not a probability distribution for $\theta$ given $y$. Therefore, $\int L(\theta;y) \, d\theta$ is not necessarily equal to $1$, or even finite.
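
As a rough numerical illustration of this point, the sketch below assumes a binomial sampling model with hypothetical data ($y = 3$ successes in $n = 10$ trials); neither the model nor the numbers come from the text above. It sums $f(y|\theta)$ over all possible $y$ for a fixed $\theta$, and then integrates the same expression over $\theta$ for the fixed $y$.

```python
from math import comb

def binomial_likelihood(theta, y, n):
    """f(y | theta) for a binomial model (an assumed model, for illustration only)."""
    return comb(n, y) * theta**y * (1 - theta)**(n - y)

n, y = 10, 3  # hypothetical data: 3 successes out of 10 trials

# As a function of the data (theta fixed), f(y | theta) sums to 1.
total_over_data = sum(binomial_likelihood(0.4, k, n) for k in range(n + 1))

# As a function of theta (data fixed), the area under L(theta; y) need not be 1.
# Midpoint-rule approximation of the integral over [0, 1].
steps = 100_000
width = 1.0 / steps
area_over_theta = sum(
    binomial_likelihood((i + 0.5) * width, y, n) for i in range(steps)
) * width

print(total_over_data)   # sum over y for fixed theta
print(area_over_theta)   # integral over theta for fixed y
```

For this particular model the integral over $\theta$ equals $1/(n+1) = 1/11 \approx 0.09$, not $1$, while the sum over the data is $1$.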

We can find the value of $\theta$ that maximises the likelihood function, called a *maximum likelihood estimate* (MLE) of $\theta$:
\begin{align}
\hat{\theta} = \arg\max_{\theta} \, L(\theta;y)
\end{align}
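
A minimal sketch of this maximisation, assuming a binomial model with hypothetical data ($y = 3$ successes in $n = 10$ trials) and a simple grid search over $\theta$. The model, the data, and the grid resolution are illustrative assumptions; in practice an optimiser or a closed-form solution would usually be preferred.

```python
from math import log

def log_likelihood(theta, y, n):
    """Binomial log-likelihood up to an additive constant (assumed model)."""
    return y * log(theta) + (n - y) * log(1 - theta)

y, n = 3, 10  # hypothetical data

# Candidate values of theta strictly inside (0, 1) to avoid log(0).
grid = [i / 1000 for i in range(1, 1000)]

# The MLE is the grid point with the largest (log-)likelihood.
theta_hat = max(grid, key=lambda t: log_likelihood(t, y, n))

print(theta_hat)
```

Maximising the log-likelihood gives the same $\hat{\theta}$ as maximising the likelihood, because the logarithm is increasing. For the binomial model the grid search agrees with the closed-form result $\hat{\theta} = y/n = 0.3$.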

In Bayesian statistics, $\theta$ is treated not as a fixed (although unknown) parameter
but as a random quantity.

We do this by assigning ${\theta}$ a probability distribution, called the _prior distribution_, that contains any information we have about ${\theta}$ that does not come from the data ${y}$.

Inferences on ${\theta}$ are based on its _posterior distribution_ given by
\begin{equation}
    P({\theta}|{y}) = \frac{f({y}|{\theta})\pi({\theta})}{m({y})} 
                        = \frac{f({y}|{\theta})\pi({\theta})}{\int f({y}|{\theta}) \pi({\theta}) d{\theta}}
\end{equation}

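This formula is known as _Bayes' Theorem_. The denominator $m({y})$ is the marginal distribution of the data, obtained by integrating the likelihood against the prior.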

The posterior distribution is simply the product of the likelihood and the prior,
normalised so that it integrates to $1$.
The posterior distribution is therefore a proper (or legitimate) probability distribution.
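
The normalisation step can be sketched numerically on a grid of $\theta$ values. The example below is only an illustration under stated assumptions: a binomial likelihood with hypothetical data ($y = 3$, $n = 10$) and a Beta(2, 2) prior, none of which appear in the text above.

```python
from math import comb

def likelihood(theta, y, n):
    """Binomial sampling model f(y | theta) (assumed for illustration)."""
    return comb(n, y) * theta**y * (1 - theta)**(n - y)

def prior(theta):
    """Beta(2, 2) density on [0, 1] (an arbitrary illustrative prior)."""
    return 6 * theta * (1 - theta)

y, n = 3, 10  # hypothetical data
grid_size = 1000
width = 1.0 / grid_size
thetas = [(i + 0.5) * width for i in range(grid_size)]  # midpoints of the grid cells

# Numerator of Bayes' theorem: likelihood times prior at each grid point.
unnormalised = [likelihood(t, y, n) * prior(t) for t in thetas]

# Denominator m(y): midpoint-rule approximation of the integral over theta.
evidence = sum(unnormalised) * width

# Posterior density on the grid.
posterior = [u / evidence for u in unnormalised]

posterior_area = sum(posterior) * width
posterior_mean = sum(t * p for t, p in zip(thetas, posterior)) * width

print(posterior_area)
print(posterior_mean)
```

With these choices the exact posterior is a Beta(5, 9) distribution, so the grid approximation of the posterior mean should be close to $5/14 \approx 0.357$, and the posterior area should be $1$ up to rounding error.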

We will now derive this theorem step by step using an example. Venn diagrams can also be used to illustrate it; they are useful but not rigorous.

The greatest loss of vertebrate biodiversity observed in the past 30 years is due to a chytrid fungus, which is responsible for the extinction of over a hundred species of amphibians.
Let's assume we have a sample space $S$ containing all the possible outcomes of an experiment, and that we are interested in a subset of $S$ representing only some of those outcomes (an event).
In our example, we are interested in determining which samples taken from frogs are infected by the fungus.
During our fieldwork, we take some samples and test whether they are
infected or not.

Let's consider $S$ to consist of all the samples collected in a particular area.

We can split $S$ into two events: the event "samples with infection"
(designated as set $A$), and the event "samples with no infection" (the complement of set $A$, or $A^c$).

What is the probability that a randomly chosen sample is infected?

Assuming every sample is equally likely to be chosen, it is the number of elements in $A$ divided by the number of elements in $S$.

We can denote the number of elements of $A$ as $|A|$, called the cardinality of $A$.

The probability of $A$, $P(A)$, is
\begin{equation}
    P(A) = \frac{|A|}{|S|}
\end{equation}
with $0 \leq P(A) \leq 1$.

**Cardinality:** the number of elements in a set or other grouping, as a property of that grouping.
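
The counting argument can be sketched with Python sets. The sample identifiers and the choice of which samples are infected are entirely hypothetical and serve only to make the arithmetic concrete.

```python
# Hypothetical sample space: 20 samples, labelled 1 to 20.
S = set(range(1, 21))

# Hypothetical event A: the samples that are infected.
A = {2, 5, 7, 11, 16}

# Complement of A within S: the samples with no infection.
A_complement = S - A

# Probability as a ratio of cardinalities, assuming equally likely samples.
p_a = len(A) / len(S)
p_not_a = len(A_complement) / len(S)

print(p_a)       # |A| / |S|
print(p_not_a)   # |A^c| / |S|
```

Because $A$ and $A^c$ together make up all of $S$ and do not overlap, the two probabilities add up to $1$.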

Assume that we use a molecular screening test which takes a biological sample (e.g. a piece of skin) from a frog and tests for the presence of the fungus.
The test will be "positive" for some samples and "negative" for
others.

Let's denote by $B$ the event "samples for which the test is positive".

What is the probability that the test will be "positive" for a randomly selected sample?