# Probability Density Functions and Cumulative Distribution Functions

For continuous distributions, we can calculate the probability that a randomly selected value falls within a range by taking the difference between two overlapping cumulative probabilities. This is essentially the same process as calculating the probability of a range of values for discrete distributions.

![PDF](https://raw.githubusercontent.com/ingridarreola/LearningPython/64027687c5b52f1c2d215f1ccdd395261c4bd449/Probability/Normal-PDF-Range.gif)

Suppose we want to calculate the probability of randomly observing a woman between 165 cm and 175 cm tall, assuming heights still follow the Normal(167.74, 8) distribution. We can calculate the probability of observing a height of 175 cm or less and the probability of observing a height of 165 cm or less. The difference between these two probabilities is the probability of randomly observing a woman in the given range. This can be done in Python using the `norm.cdf()` method from the `scipy.stats` library. As mentioned before, this method takes three arguments:

- `x`: the value of interest
- `loc`: the mean of the probability distribution
- `scale`: the standard deviation of the probability distribution
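
As a small sketch of how these three arguments map onto a call, the snippet below passes `loc` and `scale` by keyword and compares the result with the positional form. The variable names `mean_height` and `sd_height` are illustrative only and are not part of the original lesson. It also shows that standardizing the value by hand and using the default `loc=0`, `scale=1` gives the same probability.

```python
import scipy.stats as stats

# Illustrative names (assumed for this sketch) for the Normal(167.74, 8) parameters
mean_height = 167.74
sd_height = 8

# Positional form: cdf(x, loc, scale)
p_positional = stats.norm.cdf(175, mean_height, sd_height)

# Keyword form: the same call with loc and scale named explicitly
p_keyword = stats.norm.cdf(175, loc=mean_height, scale=sd_height)

# Standardized form: convert x to a z-score and use the default loc=0, scale=1
z = (175 - mean_height) / sd_height
p_standard = stats.norm.cdf(z)

print(p_positional, p_keyword, p_standard)
```

Naming `loc` and `scale` makes it harder to swap the mean and the standard deviation by accident.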

#### P(165 < X < 175) = P(X < 175) - P(X < 165)

```
import scipy.stats as stats
print(stats.norm.cdf(175, 167.74, 8) - stats.norm.cdf(165, 167.74, 8))
```

`Output = 0.45194`
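
The CDF difference above is the area under the PDF between 165 and 175. As a rough check, and not part of the original lesson, the sketch below integrates `norm.pdf()` numerically with `scipy.integrate.quad` and compares the result with the CDF difference. The variable names are illustrative.

```python
import scipy.stats as stats
from scipy.integrate import quad

mean_height, sd_height = 167.74, 8

# Area under the PDF from 165 to 175; quad returns (value, estimated error)
area, abs_err = quad(stats.norm.pdf, 165, 175, args=(mean_height, sd_height))

# The same probability from the difference of two CDF values
cdf_diff = (stats.norm.cdf(175, mean_height, sd_height)
            - stats.norm.cdf(165, mean_height, sd_height))

print(round(area, 5), round(cdf_diff, 5))
```

The two numbers should agree to within the integration error, which illustrates that the CDF is the accumulated area under the PDF.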

We can also calculate the probability of randomly observing a given value or greater by subtracting the probability of observing less than the given value from 1. This is possible because the total area under the curve is 1, so the probability of observing something greater than a value is 1 minus the probability of observing something less than that value.

Let’s say we wanted to calculate the probability of observing a woman taller than 172 centimeters, assuming heights still follow the Normal(167.74, 8) distribution. We can think of this as the opposite of observing a woman shorter than 172 centimeters. We can visualize it this way:

![Example2](https://raw.githubusercontent.com/ingridarreola/LearningPython/d7ed060047f8566688480d91b925817f006a501e/Probability/Norm_PDF_Example_2.svg)

We can use the following code to calculate the blue area by taking 1 minus the red area:

```
import scipy.stats as stats
 
# P(X > 172) = 1 - P(X < 172)
# 1 - stats.norm.cdf(x, loc, scale)
print(1 - stats.norm.cdf(172, 167.74, 8))

# Output --->  0.29718
```
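
As a side note that goes beyond the original lesson, `scipy.stats` distributions also provide a survival function, `norm.sf()`, which returns the upper-tail probability directly. The sketch below compares it with the `1 - cdf` approach used above.

```python
import scipy.stats as stats

# Upper-tail probability via the complement of the CDF
p_complement = 1 - stats.norm.cdf(172, 167.74, 8)

# The same probability via the survival function, sf(x, loc, scale)
p_sf = stats.norm.sf(172, 167.74, 8)

print(p_complement, p_sf)
```

Both approaches give the same answer here. `sf()` can be more accurate far out in the tail, where `1 - cdf` loses precision to rounding.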

#### The temperature in the Galapagos Islands follows a Normal distribution with a mean of 20 degrees Celsius and a standard deviation of 3 degrees. Uncomment `temp_prob_1` and set the variable equal to the probability that the temperature on a randomly selected day will be between 18 and 25 degrees Celsius using the `norm.cdf()` method. Be sure to print `temp_prob_1`.

```
import scipy.stats as stats

# P(18 < X < 25) = P(X < 25) - P(X < 18)
# stats.norm.cdf(x, mean, standard_deviation)
temp_prob_1 = stats.norm.cdf(25, 20, 3) - stats.norm.cdf(18, 20, 3)
print(temp_prob_1)

# Output --->  0.6997171101802624
```


#### Using the same information about the Galapagos Islands, set `temp_prob_2` equal to the probability that the temperature on a randomly selected day will be greater than 24 degrees Celsius.

```
import scipy.stats as stats

# P(X > 24) = 1 - P(X < 24)
temp_prob_2 = 1 - stats.norm.cdf(24, 20, 3)
print(temp_prob_2)

# Output --->  0.09121121972586788
```
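
One way to sanity-check both answers is a quick simulation. The sketch below is not part of the original exercise: it draws a large number of temperatures from Normal(20, 3) with NumPy and compares the observed fractions with the exact CDF-based probabilities. The seed and sample size are arbitrary choices.

```python
import numpy as np
import scipy.stats as stats

# Arbitrary seed and sample size, chosen only for this illustration
rng = np.random.default_rng(seed=0)
temps = rng.normal(loc=20, scale=3, size=1_000_000)

# Fraction of simulated days in each range
sim_between = np.mean((temps > 18) & (temps < 25))
sim_above = np.mean(temps > 24)

# Exact probabilities from the CDF
exact_between = stats.norm.cdf(25, 20, 3) - stats.norm.cdf(18, 20, 3)
exact_above = 1 - stats.norm.cdf(24, 20, 3)

print(sim_between, exact_between)
print(sim_above, exact_above)
```

The simulated fractions should land close to the exact values, with the gap shrinking as the sample size grows.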
