# Statistical Thinking

In UX, when you're working with data you might need to answer questions like:
- "Is this design better?"
- "How confident can I be this is a solid result?"
- "What's the chance that my findings are wrong?"

How you approach these questions gives away the type of statistical thinking you engage in: Frequentist vs. Bayesian. One of the key differences between these approaches is your starting assumption(s).

So, let's go back to the first question: "Is this design better?" Frequentist thinking may start with the statement: "Let's say there's no effect of this design. If I test this design, how rare will my outcome be?" This starts from an assumption of no effect, which as a UXer you will commonly know as the _null hypothesis_, $H_{0}$. By comparison, Bayesian thinking may start with: "Based on what I already know, plus the outcome I just observed, what's the chance that this design is better?" Bayesian thinking recognizes that we have some prior information to go off of, and the test we're running now is a way of updating that knowledge.

Let's dissect these a bit more...

## Frequentist Thinking

The frequentist perspective assumes that the value we are interested in, for example the average time on task, is fixed but unknown.

If you test someone on a website task and it takes them 52 seconds to complete, then your assumption behind that 52 seconds would be that this person was randomly drawn from the population of prospective website users. For this population, the actual average time on task is unknown, so we rely on a sample distribution to estimate it.

You would then want to compare that 52 seconds result to the mean of your sample distribution and see where in that distribution the 52 seconds falls. If it falls outside of some pre-defined range in that distribution, then you would conclude that the person with the 52 seconds result is statistically unusual and maybe not reflective of the broader population.

So, we are:
- Assuming the population's average time on task in seconds is a fixed value that we don't know.
- Acknowledging that we don't know the true distribution of the population, but we will use a sample to make assumptions about it.
- Treating our observed value of 52 seconds as a random value.
- Comparing 52 seconds to a sample of the population, and rejecting it as unrepresentative of the population if it falls outside of an acceptable range of time on task values.
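The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sample_times` values are made up, the times are treated as roughly normally distributed, and the "acceptable range" is assumed to be within 1.96 sample standard deviations of the sample mean (a common convention, not something the approach requires).

```python
from statistics import mean, stdev

# Hypothetical time-on-task values (seconds) from a sample of users.
# These numbers are invented for illustration only.
sample_times = [38, 41, 35, 44, 40, 37, 43, 39, 42, 36]

observed = 52  # the new participant's time on task

sample_mean = mean(sample_times)
sample_sd = stdev(sample_times)  # sample standard deviation (n - 1)

# How many standard deviations is 52 seconds from the sample mean?
z_score = (observed - sample_mean) / sample_sd

# Assumed cutoff: flag values more than 1.96 SDs away from the mean.
cutoff = 1.96
is_unusual = abs(z_score) > cutoff

print(f"sample mean = {sample_mean:.1f}s, sample SD = {sample_sd:.2f}s")
print(f"z-score for {observed}s = {z_score:.2f}, unusual: {is_unusual}")
```

With a real study you would swap in your own sample and choose the cutoff before looking at the data.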

## Bayesian Thinking

The Bayesian perspective assumes that the value we're interested in, the average time on task, is not fixed, but exists as a probability distribution that reflects uncertainty about what the value could be. Before we collect any new data, this distribution is called the prior, or our prior belief. It could be based upon past studies, domain expertise, or an educated guess. Regardless of where we get that prior belief from, the distribution reflects what we think is reasonable before seeing new data.

Again, imagine that we have a user complete a task on a website in 52 seconds. Now, we have to update our prior belief with this new data point. This creates a posterior distribution that represents a refined estimate of the average time on task.
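This update step is Bayes' rule. Written out, with $\theta$ standing for the average time on task and $x$ for the new observation (both symbols are notation introduced here for illustration):

$$
p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)} \;\propto\; p(x \mid \theta)\, p(\theta)
$$

Here $p(\theta)$ is the prior, $p(x \mid \theta)$ is the likelihood of seeing the 52 seconds for a given average, and $p(\theta \mid x)$ is the posterior.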

So now we can ask questions referring to the probability of scenarios:
- What's the most likely range for the true average time on task?
- What's the probability that the average time on task is under 60 seconds?

Ultimately, we are:
- Treating the average time on task as a random variable.
- Starting with a prior belief based on knowledge or assumptions.
- Updating our belief with new information.
- Making probability-based conclusions from the updated information.

Bayesian thinking lets you say how confident you are in a result, based on everything you know so far.
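To make the steps above concrete, here is a minimal sketch of one Bayesian update. It leans on several simplifying assumptions that are not from the text: the prior is a normal distribution with a hypothetical mean of 45 seconds and standard deviation of 10, individual task times are assumed to vary around the true average with a known standard deviation of 15 seconds, and the update uses the standard normal-normal conjugate formulas. Real analyses often need more flexible models.

```python
from statistics import NormalDist

# Prior belief about the average time on task (hypothetical values).
prior_mean, prior_sd = 45.0, 10.0

# Assumed known spread of individual users' times around the true average.
obs_sd = 15.0
observed = 52.0  # the new data point

# Normal-normal conjugate update: precisions (1 / variance) add up.
prior_precision = 1 / prior_sd**2
obs_precision = 1 / obs_sd**2
post_precision = prior_precision + obs_precision

# Posterior mean is a precision-weighted average of prior mean and data.
post_mean = (prior_mean * prior_precision + observed * obs_precision) / post_precision
post_sd = post_precision ** -0.5

posterior = NormalDist(post_mean, post_sd)

# Probability that the average time on task is under 60 seconds.
prob_under_60 = posterior.cdf(60)

# A 95% credible interval for the average time on task.
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)

print(f"posterior mean = {post_mean:.1f}s, posterior SD = {post_sd:.1f}s")
print(f"P(average < 60s) = {prob_under_60:.2f}")
print(f"95% credible interval = ({lower:.1f}s, {upper:.1f}s)")
```

Notice that one observation only nudges the estimate from 45 toward 52 and narrows the uncertainty a little. More data would pull the posterior further away from the prior.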

## Which One Should You Choose?

I've often heard people declare themselves "Bayesian" or "Frequentist." If I'm being honest, I lean toward the Bayesian approach, even though Frequentist methods are often far easier to execute with off-the-shelf tools and defaults. That being said, don't pick a side. Just use the right tool for the job, which means learning a bit about both.

The Frequentist approach is great when you have a large sample size, or when you lack any meaningful prior knowledge (or don't want to consider that knowledge). Also, keep in mind that most environments UX research operates in will expect a Frequentist approach.

The Bayesian approach is better for smaller sample sizes, or when you have previous information you can use. In some regards it can be slightly more intuitive for non-statistically inclined audiences, but it is less common in UX research practice.