# Probability Distributions
A **probability distribution** describes how the values of a random variable are distributed. For example, the number of heads in a sequence of coin tosses is known to follow the _binomial distribution_, whereas the _means_ of sufficiently large samples of a data population are known to resemble the _normal distribution_. Since the characteristics of these theoretical distributions are well understood, they can be used to make statistical inferences about the entire data population.

## Binomial Distribution
The binomial distribution is a discrete probability distribution. It describes the outcome of n independent trials in an experiment. Each trial is assumed to have only two outcomes, either success or failure. If the probability of a successful trial is p, then the probability of having x successful outcomes in an experiment of n independent trials is as follows.
\begin{equation*}
f(x)=\binom{n}{x}p^{x}(1-p)^{n-x} \quad \text{where } x=0,1,2,\ldots,n
\end{equation*}
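
As a quick check of the formula, the following minimal sketch evaluates it by hand with R's `choose` function and compares the result with the built-in `dbinom`. The helper name `binom_pmf` and the example values (3 successes in 10 trials with p = 0.5) are assumptions made for illustration only.

```r
# Hypothetical helper: evaluates the binomial formula term by term.
binom_pmf <- function(x, n, p) {
  choose(n, x) * p^x * (1 - p)^(n - x)
}

# Illustrative values: 3 successes in 10 trials, success probability 0.5.
manual  <- binom_pmf(3, n = 10, p = 0.5)
builtin <- dbinom(3, size = 10, prob = 0.5)

# The two results should agree up to floating-point error.
all.equal(manual, builtin)
```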

**Problem**

Suppose there are twelve multiple-choice questions in an English class quiz. Each question has five possible answers, and only one of them is correct. Find the probability of having four or fewer correct answers if a student attempts to answer every question at random.

__Solution__

Since only one out of five possible answers is correct, the probability of answering a question correctly at random is 1/5 = 0.2. We can find the probability of having exactly 4 correct answers by random attempts as follows.

In [1]:
dbinom(4, size=12, prob=0.2)

To find the probability of having four or fewer correct answers by random attempts, we apply the function dbinom with x = 0, …, 4 and add up the results.

In [2]:
dbinom(0, size=12, prob=0.2) +
dbinom(1, size=12, prob=0.2) +
dbinom(2, size=12, prob=0.2) +
dbinom(3, size=12, prob=0.2) +
dbinom(4, size=12, prob=0.2)

Alternatively, we can use pbinom, the cumulative probability function of the binomial distribution.

In [3]:
pbinom(4, size=12, prob=0.2) 

__Answer__

The probability of answering four or fewer questions correctly at random in a twelve-question multiple-choice quiz is 92.7%.
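
The two approaches in the solution can be compared directly. The sketch below relies on `dbinom` accepting a vector of counts, so the five terms are summed in one call. The variable names `p_sum` and `p_cdf` are illustrative.

```r
# dbinom is vectorized over its first argument, so 0:4 yields five probabilities.
p_sum <- sum(dbinom(0:4, size = 12, prob = 0.2))

# The cumulative function returns P(X <= 4) directly.
p_cdf <- pbinom(4, size = 12, prob = 0.2)

# Both routes should give the same probability.
all.equal(p_sum, p_cdf)
round(p_cdf, 3)
```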

## Poisson Distribution
The __Poisson distribution__ is the probability distribution of independent event occurrences in an interval. If λ is the mean number of occurrences per interval, then the probability of having x occurrences within a given interval is:
\begin{equation*}
f(x)=\frac{\lambda^{x}e^{-\lambda}}{x!} \quad \text{where } x=0,1,2,\ldots
\end{equation*}
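
The formula can be evaluated by hand and checked against R's built-in `dpois`. This is a minimal sketch. The helper name `pois_pmf` and the example values (x = 2, λ = 3) are assumptions chosen for illustration.

```r
# Hypothetical helper: evaluates the Poisson formula directly.
pois_pmf <- function(x, lambda) {
  lambda^x * exp(-lambda) / factorial(x)
}

# Illustrative values: 2 occurrences when the mean is 3 per interval.
manual  <- pois_pmf(2, lambda = 3)
builtin <- dpois(2, lambda = 3)

# The two results should agree up to floating-point error.
all.equal(manual, builtin)
```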

**Problem**

If there are twelve cars crossing a bridge per minute on average, find the probability of having seventeen or more cars crossing the bridge in a particular minute.

**Solution**

The probability of having *sixteen or fewer cars* crossing the bridge in a particular minute is given by the function *ppois*.

In [4]:
ppois(16, lambda=12)   # lower tail 

Hence the probability of having seventeen or more cars crossing the bridge in a minute is in the _upper tail_ of the distribution.

In [5]:
ppois(16, lambda=12, lower.tail=FALSE)   # upper tail 

__Answer__

If there are twelve cars crossing a bridge per minute on average, the probability of having seventeen or more cars crossing the bridge in a particular minute is 10.1%.
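
The lower and upper tails are complements of each other, which the following sketch confirms. The variable names `lower` and `upper` are illustrative.

```r
# P(X <= 16) and P(X >= 17) for a Poisson variable with mean 12.
lower <- ppois(16, lambda = 12)
upper <- ppois(16, lambda = 12, lower.tail = FALSE)

# The two tails split at 16 should sum to one.
all.equal(lower + upper, 1)
round(upper, 3)
```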

## Continuous Uniform Distribution
The **continuous uniform distribution** is the probability distribution of random number selection from the continuous interval between *a* and *b*.

**Problem**

Select ten random numbers between one and three.

**Solution**

We apply the generation function runif of the uniform distribution to generate ten random numbers between one and three.

In [6]:
runif(10, min=1, max=3)
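
Because `runif` draws new numbers on every call, the output changes from run to run. A minimal sketch of making the draw reproducible with `set.seed` follows. The seed value 42 is arbitrary and chosen only for illustration.

```r
# Fix the random number generator state (the seed value is an arbitrary choice).
set.seed(42)
x <- runif(10, min = 1, max = 3)

# Resetting the same seed reproduces the same ten numbers.
set.seed(42)
y <- runif(10, min = 1, max = 3)

identical(x, y)
range(x)   # all values lie between 1 and 3
```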

## Exponential Distribution
The **exponential distribution** describes the waiting time between events in a sequence of randomly recurring, independent events.

**Problem**

Suppose the mean checkout time of a supermarket cashier is three minutes. Find the probability of a customer checkout being completed by the cashier in less than two minutes.

**Solution**

The checkout processing rate is equal to one divided by the mean checkout completion time. Hence the processing rate is 1/3 checkouts per minute. We then apply the function pexp of the exponential distribution with rate=1/3.

In [7]:
pexp(2, rate=1/3)

**Answer**

The probability of the cashier finishing a checkout in under two minutes is 48.7%.
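
The result can be cross-checked against the closed-form cumulative distribution function of the exponential distribution, 1 − e^(−rate·x). The sketch below is only a verification aid, and its variable names are illustrative.

```r
# Processing rate: one checkout per three minutes on average.
rate <- 1 / 3

# Closed-form CDF of the exponential distribution evaluated at 2 minutes.
manual  <- 1 - exp(-rate * 2)
builtin <- pexp(2, rate = rate)

# The two results should agree up to floating-point error.
all.equal(manual, builtin)
round(builtin, 3)
```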

## Normal Distribution

**Problem**

Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam?

**Solution**

We apply the function pnorm of the normal distribution with mean 72 and standard deviation 15.2. Since we are looking for the percentage of students scoring 84 or more, we are interested in the upper tail of the normal distribution.

In [8]:
pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)

**Answer**

The percentage of students scoring 84 or more in the college entrance exam is 21.5%.
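
The same probability can be obtained by first standardizing the score to a z-score and then using the standard normal distribution. The sketch below shows that equivalent route. The variable names are illustrative.

```r
# Standardize the score: distance from the mean in units of standard deviation.
z <- (84 - 72) / 15.2

# Upper tail of the standard normal distribution at z.
p_std <- pnorm(z, lower.tail = FALSE)

# Upper tail computed directly with the mean and sd arguments.
p_direct <- pnorm(84, mean = 72, sd = 15.2, lower.tail = FALSE)

all.equal(p_std, p_direct)
round(p_direct, 3)
```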

## Chi-squared Distribution

**Problem**

Find the 95th percentile of the Chi-Squared distribution with 7 degrees of freedom.

**Solution**

We apply the quantile function qchisq of the Chi-Squared distribution against the decimal value 0.95.

In [9]:
qchisq(.95, df=7)        # 7 degrees of freedom 

**Answer**

The 95th percentile of the Chi-Squared distribution with 7 degrees of freedom is 14.067.
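
A quantile function is the inverse of the corresponding cumulative distribution function, so passing the result back through `pchisq` should recover 0.95. The sketch below performs that round trip. The variable name `q` is illustrative.

```r
# 95th percentile of the Chi-Squared distribution with 7 degrees of freedom.
q <- qchisq(0.95, df = 7)

# Feeding the quantile back into the CDF should return the original probability.
pchisq(q, df = 7)
```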

## Student t Distribution

**Problem**

Find the 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of freedom.

**Solution**

We apply the quantile function qt of the Student t distribution against the decimal values 0.025 and 0.975.

In [10]:
qt(c(.025, .975), df=5)   # 5 degrees of freedom 

**Answer**

The 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of freedom are -2.5706 and 2.5706 respectively.
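
The two percentiles are mirror images because the Student t distribution is symmetric about zero, and together they bound the central 95% of the distribution. The following sketch checks both properties. The variable names are illustrative.

```r
# Lower and upper percentiles for 5 degrees of freedom.
q <- qt(c(0.025, 0.975), df = 5)

# Symmetry about zero: the lower percentile is the negative of the upper one.
all.equal(q[1], -q[2])

# Probability mass between the two percentiles (should be 0.95).
central <- pt(q[2], df = 5) - pt(q[1], df = 5)
central
```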

## F Distribution
**Problem**

Find the 95th percentile of the F distribution with (5, 2) degrees of freedom.

**Solution**

We apply the quantile function qf of the F distribution against the decimal value 0.95.

In [11]:
qf(.95, df1=5, df2=2)

**Answer**

The 95th percentile of the F distribution with (5, 2) degrees of freedom is 19.296.
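
The order of the two degrees-of-freedom arguments matters for the F distribution. The sketch below checks the percentile with `pf` and shows that swapping `df1` and `df2` gives a different value. The variable names are illustrative.

```r
# 95th percentile with (5, 2) degrees of freedom.
q <- qf(0.95, df1 = 5, df2 = 2)

# Round trip through the CDF should return 0.95.
pf(q, df1 = 5, df2 = 2)

# Swapping the degrees of freedom describes a different distribution.
q_swapped <- qf(0.95, df1 = 2, df2 = 5)
c(q, q_swapped)
```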