# Continuous Probability Distributions

## Objectives
- Identify continuous probability density functions.
- Calculate probabilities by finding the areas under probability density functions.

## Continuous Probability Distributions
If the outcome of an experiment can take on the value of any fraction, decimal, or irrational number within the range of allowed values, we say the experiment has a **continuous probability distribution**. For example, if we conduct an experiment where we measure the height of a randomly chosen person, the outcome could be that the individual is $64.41$ inches tall, or maybe $72.683$ inches tall, or it could be any decimal value in between.

A continuous probability distribution can be described using a **probability density function**. The graph of a probability density function tells us how likely any range of outcomes is. Specifically, the area under the curve within a range of values is the probability that the outcome will be within that range of values. In short, when dealing with probability density functions,

$$\text{Probability} = \text{Area}. $$

Because you can't have less than $0\%$ probability, the graph of a probability density function can never dip below the $x$-axis; it only takes on non-negative values. And since the total combined probability of all outcomes in an experiment is $100\%$, the total area under a probability density function is always $1$.

In [90]:
png("pdf_example.png", width = 1000, height = 500)

library("shape")

par(mar = c(2, 3, 3, 2.5))

x = seq(0, 60, 0.01)
y = dchisq(x, df=8)


plot(x, y, 
     type="l", col="black", 
     main="Example of a Continuous Probability Density Function", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(0, 30), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 0.15), ylab = "")

polygon(x, y, col = "gray90")

x = c(4, seq(4, 8, 0.01), 8)
y = c(0, dchisq(seq(4, 8, 0.01), df=8), 0)
polygon(x, y, col = "lightsteelblue")

Arrows(6, 0.125, 6, 0.105, lwd = 3, arr.type = "triangle", arr.width = 0.3, col = "blue3")
text(6, 0.125, labels = "P(4 < X < 8)", cex = 2, col = "blue3", pos = 3)

x = c(14, seq(14, 18, 0.01), 18)
y = c(0, dchisq(seq(14, 18, 0.01), df=8), 0)
polygon(x, y, col = "lightsteelblue")

Arrows(16, 0.03, 16, 0.01, lwd = 3, arr.type = "triangle", arr.width = 0.3, col = "blue3")
text(16, 0.03, labels = "P(14 < X < 18)", cex = 2, col = "blue3", pos = 3)

axis(1, at = seq(0, 30, 2), cex.axis = 2)
axis(2, at = seq(0.0, 0.15, 0.05), cex.axis = 2)

text(30.3, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 0.155, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_example.png
---
width: 100%
alt: A curve is plotted on the xy-plane. It starts at height zero at x=0, is highest at about x=6, then slowly drops off to nearly zero at x=30.
name: pdf_example
---
An example of a continuous probability density function. The area under the curve between $x = 4$ and $x = 8$ (the shaded region on the left) is $P(4 < X < 8)$. The area under the curve between $x = 14$ and $x = 18$ (the shaded region on the right) is $P(14 < X < 18)$. Since the area of the left region is clearly larger than the area of the right region, an outcome between $x = 4$ and $x = 8$ is more likely than an outcome between $x = 14$ and $x = 18$.
```
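The comparison in the figure can also be checked numerically. The plotting code above draws the curve with `dchisq(x, df = 8)`, so, assuming that same density, the sketch below uses R's cumulative distribution function `pchisq` to compute the two shaded areas. The helper `area_between` and the variable names are chosen here for illustration only.

```r
# Area under the chi-square (df = 8) density between two x-values,
# computed as a difference of cumulative probabilities.
area_between <- function(lower, upper, df = 8) {
  pchisq(upper, df = df) - pchisq(lower, df = df)
}

p_left  <- area_between(4, 8)    # P(4 < X < 8)
p_right <- area_between(14, 18)  # P(14 < X < 18)

print(c(left = p_left, right = p_right))
print(p_left > p_right)  # TRUE: the left region has the larger area
```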

In this section, we will deal only with calculating the area for regions with shapes we are familiar with from geometry, like rectangles, circles, and triangles. We will learn how to calculate the areas of some regions with more complex shapes later.

```{note}
Since, for continuous probability distributions, we find probability by calculating the area under a probability density function, we can only find meaningful probabilities for *ranges* of values, not for *individual* values. It makes sense to find $P(14 < X < 18)$ because the base of the region from $x = 14$ to $x = 18$ has length $18 - 14 = 4$. The region over the single value $x = 16$ has a base of length $0$, so it has no area, and $P(X = 16) = 0$.

For the same reason, a probability like $P(X < 8)$ has the same value as $P(X \leq 8)$ when dealing with continuous probability distributions. Including the single value $x = 8$ adds no area to the region.
```
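One way to see why a single value carries no probability is to shrink an interval around it. The sketch below again assumes the chi-square density with `df = 8` from the earlier plotting code, and computes the probability of landing within a half-width `h` of $x = 16$ for smaller and smaller `h`. The names `h` and `p` are illustrative.

```r
# Probability of landing within h of x = 16 under the chi-square (df = 8)
# density, for shrinking half-widths h.
h <- c(1, 0.1, 0.01, 0.001)
p <- pchisq(16 + h, df = 8) - pchisq(16 - h, df = 8)

print(data.frame(half_width = h, probability = p))
# The probabilities shrink toward 0 as the interval narrows to the
# single point x = 16.
```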

***

### Example 3.7.1

Does the function pictured in {numref}`Figure {number} <pdf_rect1>` represent a probability density function? Why or why not?

In [50]:
png("pdf_rect1.png", width = 1000, height = 500)

par(mar = c(2, 1, 3, 2))

x = c(-1, 4)
y = c(0.2, 0.2)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(-2, 5), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(-1, x, 4)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

lines(x,  y)

axis(1, pos=0, at = seq(-2, 5, 1), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(5.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_rect1.png
---
width: 100%
alt: A horizontal line segment of height y = 0.2 from x = -1 to x = 4.
name: pdf_rect1
---
A horizontal line segment of height $y = 0.2$ from $x = -1$ to $x = 4$.
```

#### Solution
For a graph or function to be a probability density function, it needs to satisfy two criteria.

First, is the graph entirely above the $x$-axis so no probabilities are less than zero? Yes, we can see that the graph is entirely above the $x$-axis.

Second, is the total area under the graph equal to one? The region under the graph is a rectangle. We can calculate the area of a rectangle by multiplying the length of the base with the height. We can see from the graph that the height of the rectangle is $y = 0.2$. Since the rectangle ranges from $x = -1$ to $x = 4$, the length of the base is $4 - (-1) = 5$. So the total area under the graph is

$$ A = \text{Base} \cdot \text{Height} = 5 \cdot 0.2 = 1. $$

Since the total area under the graph is $1$, the function is a probability density function.
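The two checks in this solution can be written out as a short calculation. This is a minimal sketch using the values read from the figure; the variable names are chosen for illustration.

```r
# Rectangle under y = 0.2 from x = -1 to x = 4 (values from Example 3.7.1).
height <- 0.2
base   <- 4 - (-1)

non_negative <- height >= 0     # criterion 1: never below the x-axis
total_area   <- base * height   # criterion 2: should equal 1

# all.equal() compares with a small tolerance, which is safer than ==
# for decimal arithmetic.
is_pdf <- non_negative && isTRUE(all.equal(total_area, 1))
print(total_area)
print(is_pdf)
```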

***

### Example 3.7.2

Does the function pictured in {numref}`Figure {number} <pdf_tri1>` represent a probability density function? Why or why not?

In [88]:
png("pdf_tri1.png", width = 1000, height = 500)

par(mar = c(2, 1, 3, 2))

x = c(-4, 2)
y = c(0.4, -0.2)
intercept = -y[1] * (x[2] - x[1])/(y[2] - y[1]) + x[1]

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(-5, 3), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(-0.4, 0.6), ylab = "")

polyx = c(-4, -4, 0)
polyy = c(0, 0.4, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

polyx = c(0, 2, 2)
polyy = c(0, -0.2, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

lines(x,  y)

axis(1, pos=0, at = seq(-5, 3, 1), cex.axis = 2)
axis(2, pos=0, at = seq(-0.4, 0.6, 0.1), cex.axis = 2)

text(3.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 0.63, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_tri1.png
---
width: 100%
alt: A line segment from (-4, 0.4) to (2, -0.2).
name: pdf_tri1
---
A line segment from $(x, y) = (-4, 0.4)$ to $(x, y) = (2, -0.2)$.
```

#### Solution
For a graph or function to be a probability density function, it needs to satisfy two criteria.

First, is the graph entirely above the $x$-axis so no probabilities are less than zero? No, we can see that the graph is below the $x$-axis between $x = 0$ and $x = 2$. Since the function fails this first criterion, we know it is not a probability density function. We do not need to check the second criterion.
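The first criterion can also be checked without reading the graph by eye. The sketch below assumes only the two endpoints of the segment given in the figure, evaluates the line on a fine grid with R's `approx` (linear interpolation), and looks for negative heights. The grid size and variable names are illustrative choices.

```r
# Endpoints of the line segment from Example 3.7.2.
x_ends <- c(-4, 2)
y_ends <- c(0.4, -0.2)

# Evaluate the segment on a fine grid by linear interpolation.
grid   <- seq(-4, 2, length.out = 601)
height <- approx(x_ends, y_ends, xout = grid)$y

dips_below <- any(height < 0)    # TRUE means criterion 1 fails
print(dips_below)
print(range(grid[height < 0]))   # x-values where the graph is below the axis
```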

***

### Example 3.7.3

Does the function pictured in {numref}`Figure {number} <pdf_circle>` represent a probability density function? Why or why not?

In [56]:
png("pdf_circle.png", width = 1000, height = 500)

par(mar = c(2, 1, 3, 1))

x = seq(-2, 2, 0.01)
y = sqrt(4 - x^2) 

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n", asp = 1,
     xaxs = "i", xaxt = "n", xlim = c(-3, 3), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 3), ylab = "")

polyx = c(-2, x, 2)
polyy = c(0, sqrt(4 - x^2), 0)
polygon(polyx, polyy, col = "gray90")

axis(1, pos=0, at = seq(-3, 3, 1), cex.axis = 2)
axis(2, pos=0, at = seq(0, 3, 1), cex.axis = 2)

text(3.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 3.07, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_circle.png
---
width: 100%
alt: The upper half-circle of radius 2 centered at the origin.
name: pdf_circle
---
The upper half of a circle of radius $2$.
```

#### Solution
For a graph or function to be a probability density function, it needs to satisfy two criteria.

First, is the graph entirely above the $x$-axis so no probabilities are less than zero? Yes, we can see that the graph is entirely above the $x$-axis.

Second, is the total area under the graph equal to one? We can calculate the area of the entire region under the graph using the formula for the area of a full circle:

$$ A = \pi \cdot r^2, $$

where $\pi \approx 3.14$ and $r$ is the radius of the circle. But in our case, we only have half a circle, so we will multiply the formula by $\frac{1}{2}$. Since this half-circle has a radius of $r = 2$, the area of the half-circle is

$$ A = \frac{1}{2} \cdot \pi \cdot r^2 \approx \frac{1}{2} \cdot 3.14 \cdot 2^2 = 6.28. $$

So the total area under the half-circle is about $6.28$. Because $6.28 \neq 1$, this graph isn't a probability density function.
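As a rough cross-check, the same area can be computed in R, both from the geometric formula and by numerically integrating the curve $y = \sqrt{4 - x^2}$ used to draw the figure. This is a sketch, not part of the required solution; it uses R's built-in constant `pi` rather than $3.14$, so the result is slightly larger than $6.28$.

```r
radius <- 2

# Geometry: half the area of a full circle.
area_formula <- 0.5 * pi * radius^2

# Numerical check: integrate the upper half-circle y = sqrt(r^2 - x^2).
# pmax() guards against tiny negative values caused by rounding.
half_circle  <- function(x) sqrt(pmax(0, radius^2 - x^2))
area_numeric <- integrate(half_circle, lower = -radius, upper = radius)$value

print(c(formula = area_formula, numeric = area_numeric))
print(isTRUE(all.equal(area_formula, 1)))  # FALSE: the total area is not 1
```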

***

### Example 3.7.4

Consider the probability density function $y = 0.5$ from $x = 1$ to $x = 3$. 

1. Find $P(X > 1.5)$.
2. Find $P(2 < X < 2.5)$.

#### Solution
First, let's graph the probability density function. In this case, it is a horizontal line of height $y = 0.5$ that stretches from $x = 1$ to $x = 3$, as shown in {numref}`Figure {number} <pdf_rect2>`. Note that it really is a probability density function since the area under the graph is

$$ A = \text{Base} \cdot \text{Height} = 2 \cdot 0.5 = 1. $$


In [113]:
png("pdf_rect2.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(1, 3)
y = c(0.5, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(0, 4), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(1, x, 3)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

lines(x,  y)

axis(1, pos=0, at = seq(0, 4, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(4.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_rect2.png
---
width: 100%
alt: A horizontal line segment of height y = 0.5 from x = 1 to x = 3.
name: pdf_rect2
---
A horizontal line segment of height $y = 0.5$ from $x = 1$ to $x = 3$.
```

##### Part 1
To find $P(X > 1.5)$, we need to calculate the area under the graph over the range where $x > 1.5$. (See {numref}`Figure {number} <pdf_rect2_area1>`.)

In [114]:
png("pdf_rect2_area1.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(1, 3)
y = c(0.5, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(0, 4), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(1, x, 3)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

polyx = c(1.5, 1.5, 3, 3)
polyy = c(0, 0.5, 0.5, 0)
polygon(polyx, polyy, col = "lightsteelblue")

lines(x,  y)

axis(1, pos=0, at = seq(0, 4, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(4.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_rect2_area1.png
---
width: 100%
alt: A horizontal line segment of height y = 0.5 from x = 1 to x = 3. The area under the line segment from x = 1.5 to x = 3 is shaded blue.
name: pdf_rect2_area1
---
A horizontal line segment of height $y = 0.5$ from $x = 1$ to $x = 3$. The area of the region shaded blue is $P(X > 1.5)$.
```

This region has a rectangular shape, so we can find the area by multiplying the base of the region times its height. The height is $y = 0.5$. The length of the base from $x = 1.5$ to $x = 3.0$ is $3.0 - 1.5 = 1.5$. So the area of the region is

$$ A = \text{Base} \cdot \text{Height} = 1.5 \cdot 0.5 = 0.75. $$

So $P(X > 1.5) = 0.75$.
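The calculation for Part 1 can be reproduced in R. A constant density over an interval is commonly called a uniform distribution (a name this section does not use), so the sketch below also cross-checks the answer with R's `punif`. The variable names are illustrative.

```r
# Density from Example 3.7.4: y = 0.5 from x = 1 to x = 3.
height <- 0.5
x_max  <- 3

# Part 1: rectangle from x = 1.5 to x = 3.
base_part1 <- x_max - 1.5
p_part1    <- base_part1 * height
print(p_part1)  # 0.75

# Cross-check with punif(), R's CDF for a uniform distribution on [min, max].
# lower.tail = FALSE returns the upper-tail probability P(X > 1.5).
p_check <- punif(1.5, min = 1, max = 3, lower.tail = FALSE)
print(p_check)
```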

##### Part 2
To find $P(2 < X < 2.5)$, we need to calculate the area under the graph over the range between $x = 2$ and $x = 2.5$. (See {numref}`Figure {number} <pdf_rect2_area2>`.)

In [115]:
png("pdf_rect2_area2.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(1, 3)
y = c(0.5, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(0, 4), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(1, x, 3)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

polyx = c(2, 2, 2.5, 2.5)
polyy = c(0, 0.5, 0.5, 0)
polygon(polyx, polyy, col = "lightsteelblue")

lines(x,  y)

axis(1, pos=0, at = seq(0, 4, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(4.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_rect2_area2.png
---
width: 100%
alt: A horizontal line segment of height y = 0.5 from x = 1 to x = 3. The area under the line segment from x = 2 to x = 2.5 is shaded blue.
name: pdf_rect2_area2
---
A horizontal line segment of height $y = 0.5$ from $x = 1$ to $x = 3$. The area of the region shaded blue is $P(2 < X < 2.5)$.
```

This region has a rectangular shape, so we can find the area by multiplying the base of the region times its height. The height is $y = 0.5$. The length of the base from $x = 2$ to $x = 2.5$ is $2.5 - 2 = 0.5$. So the area of the region is

$$ A = \text{Base} \cdot \text{Height} = 0.5 \cdot 0.5 = 0.25. $$

So $P(2 < X < 2.5) = 0.25$.
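Another way to confirm Part 2 is to let R compute the area under the graph directly. The sketch below writes the density as a function `f` (an illustrative name) and uses `integrate` for numerical integration, a technique the section itself does not rely on.

```r
# Density from Example 3.7.4, written as a function that is 0.5 on [1, 3]
# and 0 everywhere else.
f <- function(x) ifelse(x >= 1 & x <= 3, 0.5, 0)

# Part 2: area under f between x = 2 and x = 2.5.
p_part2 <- integrate(f, lower = 2, upper = 2.5)$value
print(p_part2)

# Geometric answer for comparison: base times height.
print((2.5 - 2) * 0.5)
```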

***

### Example 3.7.5
Consider the probability density function $y = \frac{1}{8}x + \frac{3}{8}$ from $x = -3$ to $x = 1$.

1. Find $P(X < -1.5)$.
2. Find $P(X > -1.5)$.

#### Solution
Begin by plotting the probability density function. If you remember algebra, you may recall that $y = \frac{1}{8}x + \frac{3}{8}$ is the equation of a line. We know this line only goes from $x = -3$ to $x = 1$. If we plug $x = -3$ into the equation, we get

$$ y = \frac{1}{8}(-3) + \frac{3}{8} = 0, $$

so the line starts at the point $(x, y) = (-3, 0)$. Similarly, if we plug $x = 1$ into the equation, we get

$$ y = \frac{1}{8}(1) + \frac{3}{8} = 0.5, $$

so the line ends at the point $(x, y) = (1, 0.5)$. (See {numref}`Figure {number} <pdf_tri2>`.)

In [6]:
png("pdf_tri2.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(-3, 1)
y = c(0, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(-4, 2), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(-3, x, 1)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

lines(x,  y)

axis(1, pos=0, at = seq(-4, 2, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(2.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_tri2.png
---
width: 100%
alt: A line segment from (x, y) = (-3, 0) to (x, y) = (1, 0.5).
name: pdf_tri2
---
A line segment from $(x, y) = (-3, 0)$ to $(x, y) = (1, 0.5)$.
```

Note the total area under the function has a triangular shape. Since the length of the base of the triangle is $1 - (-3) = 4$, and the height is $0.5$, the area of the triangle is

$$ A = \frac{1}{2} \cdot \text{Base} \cdot \text{Height} = \frac{1}{2} \cdot 4 \cdot 0.5 = 1, $$ 

which means the function really is a probability density function, as the problem statement claims.
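The endpoint heights and the total area can be verified with a few lines of R. This is a minimal sketch; the function name `f` and the other variable names are chosen for illustration.

```r
# Density from Example 3.7.5: y = (1/8) x + 3/8 on [-3, 1].
f <- function(x) (1 / 8) * x + 3 / 8

y_start <- f(-3)   # height at the left endpoint
y_end   <- f(1)    # height at the right endpoint
print(c(start = y_start, end = y_end))

# Total area of the triangle with base 1 - (-3) and height f(1).
total_area <- 0.5 * (1 - (-3)) * y_end
print(total_area)
```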

##### Part 1
To find $P(X < -1.5)$, we need to calculate the area under the graph over the range where $x < -1.5$. (See {numref}`Figure {number} <pdf_tri2_area1>`.)

In [8]:
png("pdf_tri2_area1.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(-3, 1)
y = c(0, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(-4, 2), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(-3, x, 1)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

polyx = c(-3, -3, -1.5, -1.5)
polyy = c(0, 0, 0.1875, 0)
polygon(polyx, polyy, col = "lightsteelblue")

lines(x,  y)

axis(1, pos=0, at = seq(-4, 2, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(2.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_tri2_area1.png
---
width: 100%
alt: A line segment from (x, y) = (-3, 0) to (x, y) = (1, 0.5). The area under the line segment from x = -3 to x = -1.5 is shaded blue.
name: pdf_tri2_area1
---
A line segment from $(x, y) = (-3, 0)$ to $(x, y) = (1, 0.5)$. The area of the shaded blue region is $P(X < -1.5)$.
```

This region has a triangular shape, so we can find the area using the formula for the area of a triangle. The length of the base of the triangle is $(-1.5) - (-3.0) = 1.5$. To find the height of the triangle, we can plug the $x$-value at which the triangle is highest, $x = -1.5$, into the probability density function:

$$ y = \frac{1}{8}(-1.5) + \frac{3}{8} = 0.1875. $$

So the height of the triangle is $0.1875$. Then the area of the triangle is

$$ A = \frac{1}{2} \cdot \text{Base} \cdot \text{Height} = \frac{1}{2} \cdot 1.5 \cdot 0.1875 \approx 0.1406. $$

That is, $P(X < -1.5) \approx 0.1406$.
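The triangle calculation can be reproduced in R, which also shows the unrounded value. The sketch below computes the area from the triangle formula and then, as an independent check, by numerical integration with `integrate`. The names `f`, `p_less`, and `p_numeric` are illustrative.

```r
# Density from Example 3.7.5: y = (1/8) x + 3/8 on [-3, 1].
f <- function(x) (1 / 8) * x + 3 / 8

# Triangle from x = -3 to x = -1.5.
base   <- (-1.5) - (-3)
height <- f(-1.5)
p_less <- 0.5 * base * height
print(height)   # 0.1875
print(p_less)   # 0.140625, which rounds to 0.1406

# Cross-check: numerically integrate the density over the same range.
p_numeric <- integrate(f, lower = -3, upper = -1.5)$value
print(p_numeric)
```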

##### Part 2
To find $P(X > -1.5)$, we need to calculate the area under the graph over the range where $x > -1.5$. (See {numref}`Figure {number} <pdf_tri2_area2>`.)

In [9]:
png("pdf_tri2_area2.png", width = 1000, height = 500)

par(mar = c(2, 3, 3, 2))

x = c(-3, 1)
y = c(0, 0.5)

plot(x, y, 
     type="l", col="black", 
     main="", 
     cex.main = 2.5, cex.lab = 2.5, bty = "n",
     xaxs = "i", xaxt = "n", xlim = c(-4, 2), xlab = "",
     yaxs = "i",  yaxt = "n", ylim = c(0, 1), ylab = "")

polyx = c(-3, x, 1)
polyy = c(0, y, 0)
polygon(polyx, polyy, col = "gray90", border = NA)

polyx = c(-1.5, -1.5, 1, 1)
polyy = c(0, 0.1875, 0.5, 0)
polygon(polyx, polyy, col = "lightsteelblue")

lines(x,  y)

axis(1, pos=0, at = seq(-4, 2, 0.5), cex.axis = 2)
axis(2, pos=0, at = seq(0, 1, 0.1), cex.axis = 2)

text(2.03, 0, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)
text(0, 1.03, labels = "y", cex = 2.5, pos = 3, xpd = TRUE)

dev.off()

```{figure} pdf_tri2_area2.png
---
width: 100%
alt: A line segment from (x, y) = (-3, 0) to (x, y) = (1, 0.5). The area under the line segment from x = -1.5 to x = 1 is shaded blue.
name: pdf_tri2_area2
---
A line segment from $(x, y) = (-3, 0)$ to $(x, y) = (1, 0.5)$. The area of the shaded blue region is $P(X > -1.5)$.
```

This region has the shape of a trapezoid. The formula for calculating the area of a trapezoid isn't too hard, but you probably don't remember it. In this particular case, there is an easier way to calculate the area of the region. We just need to observe that the area over $x < -1.5$ and the area over $x > -1.5$ are complementary, since together they cover the total area under the probability density function. That means 

$$ P(X < -1.5) + P(X > -1.5) = 1. $$

But we just found that $P(X < -1.5) \approx 0.1406$, meaning

$$ P(X > -1.5) = 1 - P(X < -1.5) \approx 1 - 0.1406 = 0.8594. $$

So $P(X > -1.5) \approx 0.8594$.
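For readers who want to confirm the complement shortcut, the sketch below computes the answer both ways: once with the complement rule and once with the trapezoid formula (the average of the two parallel sides times the width). It works with unrounded values, so the result is $0.859375$ before rounding. The variable names are illustrative.

```r
# Density from Example 3.7.5: y = (1/8) x + 3/8 on [-3, 1].
f <- function(x) (1 / 8) * x + 3 / 8

# Complement rule, using the unrounded triangle area for P(X < -1.5).
p_less    <- 0.5 * ((-1.5) - (-3)) * f(-1.5)
p_greater <- 1 - p_less
print(p_greater)   # 0.859375, which rounds to 0.8594

# Trapezoid formula: average of the two parallel sides times the width.
p_trapezoid <- (f(-1.5) + f(1)) / 2 * (1 - (-1.5))
print(p_trapezoid)
```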