## Probability Density Functions

Let’s consider a fictional class of students and their grades on a math quiz. Each student can earn a grade between $0$ and $20$, including fractional grades. If we treat the grade as a random event, the grade itself is a continuous random variable because it can take any value between $0$ and $20$. Suppose we want to calculate the probability of a student getting a grade between $11$ and $12$. We can’t do this by counting outcomes, as we would for a discrete random variable. To see why, let’s consider the formula, assuming uniform probability:

$$ P(11<x<12) = \frac{n(E)}{n(S)} $$

where $E$ is the set of all possible grades between $11$ and $12$ and $S$ is the set of all possible grades, that is, all real numbers between $0$ and $20$. As we have set up the problem, $n(E)$ is infinite because there are infinitely many real numbers between $11$ and $12$; the same is true for $n(S)$. Thus, we need a different approach to calculate the probability.

A probability density function, $p(x)$, expresses the probability of the value of a random variable being close to $x$, an arbitrary value. It can also tell us the probability of $x$ falling within an interval. That is, if we knew the probability density function for the grades in our fictional class, we could use it to calculate $P(11 < x < 12)$, the probability that we’re looking for. But how do we calculate this? It turns out that this probability is the area enclosed by the graph of the probability density function and the x-axis between the points $x = 11$ and $x = 12$. Assuming an arbitrary probability density function, the figure below demonstrates this.

![alt text](https://4137876152-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M2g31CUvdCruJm660Ot%2Fuploads%2FDAT141YRNY9o8fZNs8SS%2F967.png?alt=media&token=edc6f7ae-43e5-44a8-809c-5d535215cfc5 "Probability Density Functions")

We already know that this area is equal to the value of the integral,

$$ \int_{a=11}^{b=12}p(x)dx $$

thus, we have an easy way to find the probability of the grade lying between $11$ and $12$. With the math out of the way, we can now find out what the probability is. For our fictional class, we’ll assume that the probability density function is:

$$ p(x) = \frac{1}{\sqrt{2\pi }}e^{-\frac{(x-10)^2}{2}} $$

where $x$ is the grade obtained. This function has been chosen so that the probability of the grade being close to $10$ (either above or below) is high and decreases sharply as the grade moves away from $10$.
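To get a feel for how quickly this density falls off, here is a minimal sketch that evaluates it at a few sample grades. It uses only the standard library `math` module; the function name `p` and the dictionary `density_at` are illustrative choices, not part of the original text.

```python
import math

def p(x):
    """Density from the text: a bell curve centered at a grade of 10."""
    return math.exp(-(x - 10) ** 2 / 2) / math.sqrt(2 * math.pi)

# Evaluate the density at a few sample grades on either side of 10
density_at = {grade: p(grade) for grade in (6, 8, 10, 12, 14)}

for grade, value in density_at.items():
    print(grade, round(value, 6))
```

The density is about $0.399$ at a grade of $10$, drops to about $0.054$ at $8$ and $12$, and is close to $0.0001$ at $6$ and $14$. Note that these are density values, not probabilities; a probability comes from the area under the curve over an interval.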

Now, let’s calculate the integral:

$$ \int_{11}^{12}p(x)dx $$

with $p(x)$ being the preceding function:

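```python
from sympy import Symbol, exp, sqrt, pi, Integral, S

x = Symbol('x')
# Probability density function for the grades
p = exp(-(x - 10)**2/2)/sqrt(2*pi)
# Definite integral of p between x = 11 and x = 12
Integral(p, (x, 11, 12)).doit().evalf()
```

```
0.135905121983278
```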

We create the `Integral` object from `p`, the probability density function, and the tuple `(x, 11, 12)`, which specifies that we want the definite integral between $11$ and $12$ on the x-axis. We evaluate the integral using `doit()` and find its numerical value using `evalf()`. Thus, the probability that a grade lies between $11$ and $12$ is close to $0.14$.
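As a cross-check on the area interpretation, the same integral can be approximated by adding up thin rectangles under the curve. The sketch below uses a midpoint Riemann sum, which is a different technique from SymPy's symbolic integration; the helper name `midpoint_area` and the choice of 10,000 rectangles are assumptions made for illustration.

```python
import math

def p(x):
    """Density from the text: a bell curve centered at a grade of 10."""
    return math.exp(-(x - 10) ** 2 / 2) / math.sqrt(2 * math.pi)

def midpoint_area(f, a, b, n=10_000):
    """Approximate the area under f between a and b with n midpoint rectangles."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

# Area under the density between grades 11 and 12
approx = midpoint_area(p, 11, 12)
print(round(approx, 6))
```

With this many rectangles, the approximation should agree with the SymPy result to several decimal places.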

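A probability density function has two defining properties: $p(x)$ is never negative, and the total area under its graph is $1$, because the random variable is certain to take some value. Hence, even if the limits $a$ and $b$ tend to $-\infty$ and $\infty$, respectively, the value of the integral will be $1$, as we can verify ourselves: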

```python
# Reuses p, x, and S from the previous cell
Integral(p, (x, S.NegativeInfinity, S.Infinity)).doit().evalf()
```

```
1.00000000000000
```
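For this particular density, which is the normal density with mean $10$ and standard deviation $1$, interval probabilities can also be written in closed form using the error function. The following is a sketch under that assumption, using only the standard library; `normal_cdf` and `prob_between` are hypothetical helper names, not functions from SymPy or from the original text.

```python
import math

def normal_cdf(x, mean=10.0, std=1.0):
    """P(X <= x) for a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def prob_between(a, b):
    """P(a < X < b) for the grade density used in the text."""
    return normal_cdf(b) - normal_cdf(a)

total = prob_between(-math.inf, math.inf)   # whole real line
within_range = prob_between(0, 20)          # the actual grade range
one_to_two = prob_between(11, 12)           # the interval from the text

print(total, within_range, one_to_two)
```

Here `total` comes out as $1$, matching the SymPy check, and `one_to_two` reproduces the probability of about $0.14$ found earlier. The value of `within_range` is also indistinguishable from $1$ in floating point, which suggests that almost none of the probability falls outside the valid grades of $0$ to $20$.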