# Naive Bayes

## Bayes' Theorem

Bayes' theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. The formula is given by

\begin{equation}
    P(A|B) = \frac{P(B|A) P(A)}{P(B)}.
\end{equation}

In words, the probability of $A$ given $B$ is the probability of $B$ given $A$, multiplied by the probability of $A$, divided by the probability of $B$.
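The formula translates directly into code. The following is a minimal sketch, not a library routine: the function name `bayes_posterior` and its argument names are illustrative, and the numbers passed to it are hypothetical values chosen only to exercise the formula.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Return P(A|B) given P(B|A) (likelihood), P(A) (prior) and P(B) (evidence)."""
    if evidence <= 0:
        # P(A|B) is undefined when P(B) = 0.
        raise ValueError("P(B) must be positive for P(A|B) to be defined")
    return likelihood * prior / evidence


# Hypothetical inputs: P(B|A) = 0.9, P(A) = 0.2, P(B) = 0.3.
posterior = bayes_posterior(likelihood=0.9, prior=0.2, evidence=0.3)
print(f"P(A|B) = {posterior:.2f}")
```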

As an example, imagine you have two spanner-making machines, $m_1$ and $m_2$, producing:

\begin{equation}
    m_1: 30\,\mathrm{spanners/hr}\\
    m_2: 20\,\mathrm{spanners/hr}
\end{equation}

Of all the spanners produced, 1\% are defective. Of these defective spanners, 50\% are labelled $m_1$ and 50\% are labelled $m_2$. What is the probability that a spanner produced by $m_2$ is defective?

First, to put this problem in useful notation, the probabilities of an individual spanner coming from each machine are:

\begin{equation}
    P(m_1) = 30/50 = 0.6\\
    P(m_2) = 20/50 = 0.4
\end{equation}

And the overall probability of a defect, which we denote $\mathrm{X}$, is

\begin{equation}
    P(\mathrm{X}) = 0.01
\end{equation}

We also know the probability that a spanner came from a particular machine, given that it is defective:

\begin{equation}
    P(m_1 | \mathrm{X}) = 0.5,\\
    P(m_2 | \mathrm{X}) = 0.5.
\end{equation}

Bayes' theorem gives us the answer to our question:

\begin{equation}
    P(\mathrm{X} | m_2) = \frac{P(m_2 | \mathrm{X}) P(\mathrm{X})}{P(m_2)}.
\end{equation}

That is, the probability that a spanner from machine 2 is defective is the probability that a defective spanner came from machine 2, multiplied by the overall probability of a defect, divided by the probability that any spanner came from machine 2.

Substituting the values we know:

\begin{equation}
    P(\mathrm{X} | m_2) = \frac{0.5 \times 0.01}{0.4} = 0.0125 = 1.25\%.
\end{equation}

Likewise, for machine 1,

\begin{equation}
    P(\mathrm{X} | m_1) = \frac{P(m_1 | \mathrm{X}) P(\mathrm{X})}{P(m_1)} = \frac{0.5 \times 0.01}{0.6} \approx 0.00833 = 0.83\%.
\end{equation}
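As a numerical check on the substitution, the sketch below recomputes the conditional defect probability for each machine from the figures given above. The variable names are illustrative only. The last step verifies that the per-machine results recombine to the overall 1\% defect rate through the law of total probability, $P(\mathrm{X}) = \sum_i P(\mathrm{X} | m_i) P(m_i)$.

```python
# Figures from the example above.
rates = {"m1": 30, "m2": 20}                  # spanners per hour
p_defect = 0.01                               # P(X)
p_machine_given_defect = {"m1": 0.5, "m2": 0.5}  # P(m_i | X)

# Prior probability that a spanner came from each machine, P(m_i).
total_rate = sum(rates.values())
p_machine = {m: r / total_rate for m, r in rates.items()}

# Bayes' theorem: P(X | m_i) = P(m_i | X) P(X) / P(m_i).
p_defect_given_machine = {
    m: p_machine_given_defect[m] * p_defect / p_machine[m] for m in rates
}

for m, p in p_defect_given_machine.items():
    print(f"P(X | {m}) = {p:.5f}")

# Consistency check: summing P(X | m_i) P(m_i) should recover P(X).
recovered = sum(p_defect_given_machine[m] * p_machine[m] for m in rates)
print(f"Recovered P(X) = {recovered:.4f}")
```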

In a frequentist interpretation, imagine that of 1000 spanners, 400 are produced by machine 2. Of the 1000, 10 are defective, and 5 of those came from machine 2. So the fraction of machine 2's spanners that are defective is 5/400 = 1.25\%, in agreement with the value of $P(\mathrm{X} | m_2)$ obtained from Bayes' theorem. In this sense Bayes' theorem is quite intuitive.
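The same counting argument can be written out with exact integer arithmetic. This is only a sketch of the bookkeeping for a batch of 1000 spanners; the variable names are illustrative.

```python
from fractions import Fraction

total = 1000                          # spanners in the imagined batch
from_m2 = total * 20 // 50            # machine 2 makes 20 of every 50 spanners
defective = total // 100              # 1% of all spanners are defective
defective_from_m2 = defective // 2    # half of the defective ones came from m2

# Fraction of machine 2's output that is defective.
defect_rate_m2 = Fraction(defective_from_m2, from_m2)
print(f"{defective_from_m2}/{from_m2} = {float(defect_rate_m2):.2%}")
```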

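One may ask why we couldn't just count the number of defective spanners from each machine directly. However, perhaps the initial variables were standard factory metrics that were already available, whereas per-machine defect counts were not.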



## Appendix A - Derivation of Bayes' theorem

Start with the definition of conditional probability [1]. The probability of event $A$ given event $B$ is

\begin{equation}
    P(A|B) = \frac{P(A \cap B)}{P(B)}
\end{equation}

Likewise, the probability of event $B$ given event $A$ is

\begin{equation}
    P(B|A) = \frac{P(B \cap A)}{P(A)}
\end{equation}

We assume $P(A), P(B) \ne 0$. Since $A \cap B = B \cap A$, rearranging and combining these equations gives

\begin{equation}
    P(A|B)P(B) = P(A \cap B) = P(B|A)P(A)
\end{equation}

This result is sometimes known as the product rule for probabilities. Dividing through by $P(B)$, which we assumed to be non-zero, gives Bayes' theorem,

\begin{equation}
    P(A|B) = \frac{P(B|A) P(A)}{P(B)}.
\end{equation}
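The derivation can be checked on a small joint distribution over two binary events. The joint probabilities below are hypothetical, chosen only so that they sum to one; exact fractions are used so that the equalities hold without rounding error.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(A = a, B = b), keyed by (a, b).
joint = {
    (True, True): F(1, 10),
    (True, False): F(3, 10),
    (False, True): F(2, 10),
    (False, False): F(4, 10),
}
assert sum(joint.values()) == 1

# Marginals P(A) and P(B), and the intersection P(A and B).
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)
p_a_and_b = joint[(True, True)]

# Conditional probabilities from their definition.
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Product rule, then Bayes' theorem.
assert p_a_given_b * p_b == p_a_and_b == p_b_given_a * p_a
assert p_a_given_b == p_b_given_a * p_a / p_b

print(p_a_given_b)  # 1/3
```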


## References 

    [1] https://my.eng.utah.edu/~cs5961/Resources/bayes.pdf
    