---
title: "An Introduction To Bayesian Statistics"
author: "Brandon Scott"
date: "2022-12-24"
categories: [analysis, bayesian]
draft: true
format:
    html:
        code-fold: true
jupyter: python3
---

The foundation for the Bayesian section of the blog

# TL;DR

Statistics has long been a subject of great debate because of the way it is implemented. Because of the "arbitrary" rules and guidelines that the academic community and business organizations follow in this field, it is easy to be confused about why certain numbers matter, whether results from experiments have genuine impact, and how we can improve on the results we obtained. And what do we do when a test result contradicts what a subject matter expert expected to happen? Bayesian methods attempt to clear up the confusion surrounding statistics. They allow us to incorporate our own knowledge into statistical modeling so that we can properly draw conclusions from data. From those conclusions, we can gain direction and insight into which decisions will better guide an organization.

# Introduction: A New Place

<center><img src="https://www.metrotransit.org/Data/Sites/1/media/buses/MetroTransit-HybridbusL.jpg"></center>

Suppose you have just moved to a new city and want to take advantage of the bus system to get to work. Before your first day of work, you research the bus route that would best get you there. The bus passes the stop closest to your house every 15 minutes, starting at the top of the hour. The route is estimated to take about 45 minutes to reach your office, where it drops you off right at the front door. You set a goal to be at work by 9:00am every day, so the latest bus you can take is the one that comes at 8:15am. The problem is that it rains a lot where you live, so you don't want to wait too long at the bus stop. You want to wait the minimum amount of time while still being sure you will make the 8:15am bus. How can you reasonably determine what time to arrive at the bus stop so that you do not miss your bus?

Quantifying uncertainty? That sounds exactly like a statistics problem! We use statistics to quantify scenarios: we take data from those scenarios and make inferences, or judgement calls, based on what we have observed. Everyone has been exposed to data of some kind. Numbers and data points are thrown around all over the place these days. "60% of people brush their teeth twice a day", "A poll shows that 85% of people prefer (insert candidate name here) with a margin of error of 5%", etc. To quantify the problem above, we would need to collect data and perform some calculations in order to answer, with high certainty, what time we should arrive at the bus stop to minimize our waiting time.

What are some potential barriers to this? Well, collecting data is a huge barrier in this case. We already said we don't want to wait out in the rain very long for a bus to come. We most certainly don't want to wait outside for hours watching the bus come and go. If you're in luck, the city may have a website where they keep their bus route logs that you can parse through and gather data that way. More than likely though, there will not be a system for this. How can we solve this problem statistically if we can't collect enough data to make sound statistical judgements?

# Bayesian Inference: A New Framework For Statistical Thinking

Let's think for a moment about how you made decisions before learning about statistics and its use for data validation and inference. To stick with the current example, let's throw it back to when you used to wait for the bus to pick you up for school. How did you know when to arrive at the bus stop? What influenced your judgement call? If you are like me, you let everyday occurrences shape your thinking. If my bus driver was a couple of minutes late on the first day of school, I would think that it was only the first day and it probably wouldn't happen again. Let's say the next day they are late again by the same amount of time. This prompts me to think I should show up a little later than the scheduled arrival time to minimize my wait at the bus stop.

This process is repeated day after day: each new observation adjusts my belief about when the bus will arrive.

# The Math Behind Updating Beliefs

The example above is a major simplification of the actual mathematical implementation of Bayesian inference. However, it does capture the core idea: we can start from a belief built on past data (the prior belief) and update that belief with the new data we observe, thus obtaining a new distribution of uncertainty that we use in judgements and calculations. The formula for this kind of thinking is shown below.

<center>
 \begin{equation}
P(H|\theta) = \frac{P(\theta|H) P(H)}{P(\theta)}
\end{equation}
</center>

<center>\begin{equation} H = \text{Our hypothesis} \end{equation}<br>
\begin{equation}\theta = \text{Our data} \end{equation}<br>
\begin{equation}P(H|\theta) = \text{Probability of our hypothesis given the data we've seen (posterior)} \end{equation}<br>
\begin{equation}P(H) = \text{Probability of our hypothesis before seeing the data (prior)} \end{equation} <br>
\begin{equation}P(\theta|H) = \text{Probability of our data given our hypothesis is true (likelihood)} \end{equation} <br>
\begin{equation}P(\theta) = \text{Probability of our data across all hypotheses (evidence)} \end{equation}</center>


The above formula is the general form of Bayes' formula used for Bayesian statistical inference. To make the formula feel a bit more intuitive, let's return to our bus example. Our hypothesis $(H)$ was that the bus would arrive within $\pm 3$ minutes of its scheduled arrival time. We update this belief by conditioning on the data $(\theta)$ we've seen. When we went to the bus stop for the first time, arriving 5 minutes early, the bus came within 1 minute of its scheduled arrival time (1 minute late, to be exact). Therefore, I updated my belief: if this was a normal bus day, this new bus system actually arrives in a tighter window than the previous bus system. However, since this was only my first day on the bus, I didn't want to jump to any firm conclusions quite yet. I decided to experiment and try arriving 3 minutes early. Again, the bus came within 1 minute of its scheduled arrival (1 minute early, to be exact).
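To see a single application of the formula with numbers, here is a minimal sketch. It assumes just two competing hypotheses about the new bus system, and every probability in it is made up purely for illustration; none of these values come from real bus data. The function name `posterior` and the labels `"punctual"` and `"unreliable"` are hypothetical as well.

```python
def posterior(prior, likelihood):
    """Apply Bayes' formula over a discrete set of hypotheses.

    prior:      dict mapping each hypothesis H to P(H)
    likelihood: dict mapping each hypothesis H to P(theta | H)
    Returns a dict mapping each hypothesis H to P(H | theta).
    """
    # P(theta): total probability of the data, summed over all hypotheses
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    # P(H | theta) = P(theta | H) * P(H) / P(theta)
    return {h: likelihood[h] * prior[h] / evidence for h in prior}


# Hypothetical prior: no idea yet whether the new bus system is punctual.
prior = {"punctual": 0.5, "unreliable": 0.5}

# Hypothetical likelihoods of the data we saw (the bus arriving within
# 1 minute of schedule) under each hypothesis.
likelihood = {"punctual": 0.8, "unreliable": 0.3}

post = posterior(prior, likelihood)
for hypothesis, probability in post.items():
    print(f"P({hypothesis} | data) = {probability:.3f}")
```

Under these assumed numbers, one on-time arrival shifts belief toward the punctual hypothesis without settling the question, which matches the reluctance to jump to firm conclusions after a single day.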

This process of constant updating is what the formula above generalizes. We all have <strong>prior</strong> beliefs and opinions about something, be it a bus system, train system, hospital, etc., that get updated by the new events (data) we observe. This is the beauty of Bayesian statistics. Bayesian methods allow for <strong>constant updates</strong> rather than the strong conclusions that other statistical methods endorse.
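The day-by-day updating can be sketched in a few lines of Python. This is only an illustration under strong assumptions that are not part of the story above: the bus's typical deviation from schedule (in minutes, positive meaning late) is modeled with a normal prior centered on 0 with a standard deviation of 1.5, so that roughly $\pm 3$ minutes covers most of the prior belief, and the day-to-day noise is assumed to be normal with a known standard deviation of 1 minute. The helper `update_normal` is a hypothetical name for the standard normal-normal conjugate update.

```python
import math


def update_normal(prior_mean, prior_sd, observation, noise_sd):
    """One conjugate normal-normal update with known observation noise.

    Returns the posterior mean and standard deviation of the belief.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = 1.0 / noise_sd**2
    post_precision = prior_precision + data_precision
    # Posterior mean is a precision-weighted average of prior and data.
    post_mean = (
        prior_mean * prior_precision + observation * data_precision
    ) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


# Assumed prior belief about the typical deviation from schedule (minutes).
mean, sd = 0.0, 1.5
noise_sd = 1.0  # assumed day-to-day variability, treated as known

# The two days from the story: 1 minute late, then 1 minute early.
observations = [1.0, -1.0]

history = [(mean, sd)]
for day, obs in enumerate(observations, start=1):
    # Yesterday's posterior becomes today's prior.
    mean, sd = update_normal(mean, sd, obs, noise_sd)
    history.append((mean, sd))
    print(f"Day {day}: observed {obs:+.0f} min, belief mean {mean:+.2f}, sd {sd:.2f}")
```

With these assumed values, the belief stays centered near the scheduled time while its spread shrinks with each day. No single day produces a firm conclusion, but the uncertainty narrows as observations accumulate.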

# Car Accidents: A Bayesian Example

Let's look at another example of the usefulness of Bayes' theorem. If the formula and explanation above still feel confusing because of the math notation, this example should clarify what all those symbols mean. Suppose you work for a car insurance company. Your company groups the clients it covers into three age tiers:

\begin{array}{|c|c|} 
\hline Age & \% clients \\\hline
Young & 25\% \\
Middle Age & 50\% \\
Old & 25\%
\\\hline
\end{array}