# Probability (Continuous)

```python
# Import some helper functions (please ignore this!)
from utils import *
```

**Context:** Many real-world data sets include continuous (non-discrete) values, such as a patient's body-mass index (BMI) or the dosage of a medicine. Here, we will introduce what you need to know in order to model continuous-valued data.

**Challenge:** In many ways, continuous probability is similar to discrete probability. However, there are a few "gotchas" that are important to highlight. 

**Outline:**
* Introduce and practice the concepts, terminology, and notation behind continuous probability distributions.
* Gain familiarity with several common continuous distributions.

## Differences Between Continuous and Discrete Probability

Continuous probability works much like discrete probability, except for a few key differences.

**Sample Space or Support:** The support of a continuous probability distribution is an "uncountably infinite set." If you're not familiar with this term, that's OK! In this course, we'll think of it as a distribution supported over the real numbers, $\mathbb{R}$ (or some subset thereof).
> Example: Let $H$ be a continuous RV, describing the distribution of heights in the IHH ER. The support of $H$ is the interval $(0, \infty)$ on the real line. 

**Probability Mass Function (PMF):** Continuous probability distributions *DO NOT have PMFs*; this is because, unlike discrete distributions, we cannot think of continuous distributions in terms of frequency. Let's illustrate this with an example.
> Example: Suppose we are modeling the probability of intoxication as a Bernoulli RV, $I$. If we say that the probability of intoxication (or $I=1$) is $0.5$ (i.e. $p_I(1) = 0.5$), we mean that about half of our patients will be intoxicated. Such a statement about the PMF of a discrete distribution translates immediately into intuition about frequency.
>
> On the other hand, suppose we have a continuous RV, $H$, modeling the height of patients. How can we describe the probability that a patient is 50 inches tall (i.e. what is $p_H(50)$?). Let's try to get some intuition. Of the patients in the data, maybe we have one that's $50.1$ inches tall, another that is $49.9$, or maybe one that's $49.999991$ inches tall. But what are the chances we will observe a patient that is *exactly* (not approximately) $50$ inches tall? The answer is: zero. This is because a continuous value can be specified to arbitrary precision, so any single exact value has zero probability. So if we can't describe continuous distributions using PMFs, how else can we describe them? As you will see next, we will have to use a more circuitous route.
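
The intuition in the example above can be checked with a small simulation. The sketch below is only an illustration: it assumes a hypothetical height model (normally distributed with mean $60$ inches and standard deviation $10$ inches, numbers chosen purely for convenience) and a hypothetical helper, `fraction_within`. It counts how many simulated heights fall within a shrinking window around $50$ inches.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical height model (an assumption for illustration only):
# heights in inches drawn from a normal distribution.
heights = [random.gauss(60.0, 10.0) for _ in range(100_000)]


def fraction_within(samples, target, half_width):
    """Fraction of samples within half_width of target (inclusive)."""
    hits = sum(1 for s in samples if abs(s - target) <= half_width)
    return hits / len(samples)


# Shrink the window around 50 inches; a half-width of 0.0 asks for
# heights that are *exactly* 50.
for half_width in [1.0, 0.1, 0.01, 0.001, 0.0]:
    frac = fraction_within(heights, 50.0, half_width)
    print(f"within {half_width} in. of 50: {frac:.5f}")
```

As the window narrows, the fraction of matching samples shrinks toward zero, which is the sense in which a single exact value carries no probability.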

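**Cumulative Distribution Function (CDF):** The CDF of a continuous RV $H$, written here as $F_H(h)$, is the probability that $H$ takes a value of at most $h$: $F_H(h) = P(H \leq h)$. Unlike a PMF, it is well defined for continuous RVs, and it gives the probability of an interval as $P(a < H \leq b) = F_H(b) - F_H(a)$.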
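
As a minimal sketch of how a CDF is used, the snippet below relies on Python's standard-library `statistics.NormalDist`. The height model (mean $60$, standard deviation $10$ inches) and the helper `interval_probability` are assumptions made for illustration, not part of the course material. It computes the probability of an interval as a difference of two CDF values.

```python
from statistics import NormalDist

# Hypothetical height model in inches (an assumption for illustration).
H = NormalDist(mu=60.0, sigma=10.0)


def interval_probability(dist, a, b):
    """P(a < X <= b), computed from the CDF as F(b) - F(a)."""
    return dist.cdf(b) - dist.cdf(a)


# Probability of a height within half an inch of 50.
print(interval_probability(H, 49.5, 50.5))

# An interval of zero width has zero probability.
print(interval_probability(H, 50.0, 50.0))

# Half of the probability lies at or below the mean of a normal distribution.
print(H.cdf(60.0))
```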

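**Probability Density Function (PDF):** The PDF of a continuous RV $H$, written here as $f_H(h)$, is a non-negative function whose area over an interval is the probability of that interval: $P(a \leq H \leq b) = \int_a^b f_H(h) \, dh$. The total area under the PDF is $1$. A density value is not itself a probability, and it can be larger than $1$.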
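
A short sketch can make the link between density and probability concrete. It again assumes a hypothetical normal height model and a hypothetical helper, `integrate_pdf`, that approximates the area under the PDF with the midpoint rule. The area over an interval should agree with the difference of CDF values, while a single density value should not be read as a probability.

```python
from statistics import NormalDist

# Hypothetical height model in inches (an assumption for illustration).
H = NormalDist(mu=60.0, sigma=10.0)


def integrate_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# Area under the density over [49.5, 50.5] versus the CDF difference.
area = integrate_pdf(H, 49.5, 50.5)
exact = H.cdf(50.5) - H.cdf(49.5)
print(area, exact)

# A density is not a probability: for a narrow distribution it can exceed 1.
narrow = NormalDist(mu=0.0, sigma=0.1)
print(narrow.pdf(0.0))
```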

```{admonition} Exercise: Gaining comfort with commonly-used continuous distributions
Browse the Wikipedia pages for the following distributions:
* [Uniform](https://en.wikipedia.org/wiki/Continuous_uniform_distribution)
* [Beta](https://en.wikipedia.org/wiki/Beta_distribution)
* [Normal (or Gaussian)](https://en.wikipedia.org/wiki/Normal_distribution)
* [Truncated Normal](https://en.wikipedia.org/wiki/Truncated_normal_distribution)
* [Laplace](https://en.wikipedia.org/wiki/Laplace_distribution)

Then, answer the following questions:
1. You're modeling the distribution of heights in the US. Which of the above distributions would you choose and why?
2. You have a large collection of antique coins. Unlike modern-day coins, your coins don't have a 50% probability of landing heads, and each coin has a different probability of landing heads. You want to model the distribution of these probabilities. Which of the above distributions would you choose and why?
3. You've been given a prototype of a new sensor that determines the location of the nearest intergalactic being. The sensor is, on average, correct, but is typically a little off (sometimes it overshoots and sometimes it undershoots the location). Which of the above distributions would you use to describe the error and why?

*Hint: On each Wikipedia page, there's a panel on the right side that summarizes the properties of the distribution (e.g. its support, PDF, example plots, etc.). All of the information you need is there.*
```
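
To complement the Wikipedia summaries, one option is to draw samples from each distribution and look at the range they cover. The sketch below uses only Python's standard-library `random` module. The parameter values, the rejection-sampling helper `sample_truncated_normal`, and the helper `sample_laplace` (which uses the fact that the difference of two independent exponential draws follows a Laplace distribution) are all choices made for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
N = 10_000


def sample_truncated_normal(mu, sigma, low, high):
    """Rejection sampling: redraw from Normal(mu, sigma) until in [low, high]."""
    while True:
        x = random.gauss(mu, sigma)
        if low <= x <= high:
            return x


def sample_laplace(mu, b):
    """Laplace(mu, b): mu plus the difference of two Exponential draws with mean b."""
    return mu + random.expovariate(1.0 / b) - random.expovariate(1.0 / b)


# Parameter values below are arbitrary illustrative choices.
samples = {
    "Uniform(0, 1)": [random.uniform(0.0, 1.0) for _ in range(N)],
    "Beta(2, 5)": [random.betavariate(2.0, 5.0) for _ in range(N)],
    "Normal(0, 1)": [random.gauss(0.0, 1.0) for _ in range(N)],
    "TruncNormal(0, 1) on [-1, 2]": [
        sample_truncated_normal(0.0, 1.0, -1.0, 2.0) for _ in range(N)
    ],
    "Laplace(0, 1)": [sample_laplace(0.0, 1.0) for _ in range(N)],
}

# The observed minimum and maximum hint at each distribution's support.
for name, xs in samples.items():
    print(f"{name:30s} min={min(xs):8.3f} max={max(xs):8.3f}")
```

Comparing the printed ranges with the support listed on each Wikipedia page is a quick sanity check when deciding which distribution suits a given quantity.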