# Bernoulli Distribution

## Overview 

In this section we discuss the <a href="https://en.wikipedia.org/wiki/Bernoulli_distribution">Bernoulli distribution</a>.

## Bernoulli distribution

The simplest random variable takes just two possible values. Let's call them 0 and 1. We have the following definition.

----

**Definition 1: Bernoulli variable and Bernoulli trial**


A random variable with two possible values, 0 and 1, is called a Bernoulli variable. Its distribution is the Bernoulli distribution, and any experiment with a binary outcome is called a Bernoulli trial.


----

Examples of Bernoulli trials include good or defective components, parts that pass or fail tests, transmitted or lost signals,
working or malfunctioning hardware, benign or malicious attachments, sites that contain
or do not contain a keyword, girls or boys, and heads or tails. All of these fit the same Bernoulli model, so we can use
the generic names _success_ and _failure_ for the two outcomes. These are only labels: a success does not have to be good, and a failure does not have to be bad.

Let $p$ be the probability of success. Then the probability of failure is $q=1-p$,
and the probability mass function (PMF) of a Bernoulli variable is [1]

\begin{equation}
f(x)=P(X=x) = \begin{cases}
p, ~~ \text{if} ~~ x = 1 \\
q, ~~ \text{if} ~~ x = 0
\end{cases}
\end{equation}

or

\begin{equation}
f(x)=P(X=x) = p^x(1-p)^{1-x}
\end{equation}

for some $p\in[0,1]$ and $x \in \{0,1\}$.
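
As a minimal sketch, the closed-form PMF can be written as a small Python function. The name `bernoulli_pmf` and the choice to return 0 outside $\{0, 1\}$ are illustrative assumptions, not part of the reference text.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for a Bernoulli variable with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if x not in (0, 1):
        return 0.0  # assumption: no probability mass outside {0, 1}
    # Closed form p^x (1 - p)^(1 - x). Python evaluates 0.0 ** 0 as 1.0,
    # so the edge cases p = 0 and p = 1 are handled by the same formula.
    return p ** x * (1.0 - p) ** (1 - x)


p = 0.3
print(bernoulli_pmf(1, p))  # probability of success, p
print(bernoulli_pmf(0, p))  # probability of failure, q = 1 - p
```

The two printed values add up to 1, as they must for a PMF supported on two points.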

The expectation and variance of a Bernoulli variable are given by

$$E\left[X\right] = p$$
$$Var\left[X\right] = pq$$
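
Both results follow directly from the PMF. Summing each value weighted by its probability gives

\begin{equation}
E\left[X\right] = 0 \cdot q + 1 \cdot p = p
\end{equation}

Since $X$ takes only the values 0 and 1, we have $X^2 = X$ and hence $E\left[X^2\right] = p$. Therefore

\begin{equation}
Var\left[X\right] = E\left[X^2\right] - \left(E\left[X\right]\right)^2 = p - p^2 = p(1-p) = pq
\end{equation}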


----
**Remark**

Because $p$ can take any value in $[0, 1]$, there is a whole family of Bernoulli distributions indexed by $p$: every such $p$ defines a different Bernoulli distribution. The distribution with
$p = 0.5$ carries the highest level of uncertainty because $Var\left[X\right] = pq$ is maximized at
$p = q = 0.5$. Distributions with lower or higher $p$ have lower variances.

----
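
The claim about the variance can be checked numerically. The sketch below evaluates $pq$ on an evenly spaced grid of $p$ values and picks the maximizer. The helper name `bernoulli_variance` and the grid resolution are arbitrary choices made for illustration.

```python
def bernoulli_variance(p: float) -> float:
    """Variance pq = p(1 - p) of a Bernoulli variable."""
    return p * (1.0 - p)


# Evaluate the variance on a grid of 101 evenly spaced p values in [0, 1].
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=bernoulli_variance)

print(best_p, bernoulli_variance(best_p))  # prints: 0.5 0.25
```

The same result follows analytically: $p(1-p)$ is a downward-opening parabola whose derivative $1 - 2p$ vanishes at $p = 0.5$.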


----
**Remark**

The extreme parameters
$p = 0$ and $p = 1$ define the non-random (constant) variables 0 and 1, respectively; their variance is 0.

----
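
A short simulation illustrates the extreme cases and contrasts them with an intermediate value of $p$. This is a sketch built on Python's standard `random` module. The function name `bernoulli_sample`, the seed, and the sample sizes are illustrative assumptions.

```python
import random


def bernoulli_sample(p: float, rng: random.Random) -> int:
    """Draw one Bernoulli(p) value: 1 with probability p, otherwise 0."""
    # rng.random() is uniform on [0, 1), so the comparison is never true
    # for p = 0 and always true for p = 1.
    return 1 if rng.random() < p else 0


rng = random.Random(0)  # fixed seed so the run is repeatable

# Extreme parameters: every draw is the same constant.
zeros = [bernoulli_sample(0.0, rng) for _ in range(1000)]
ones = [bernoulli_sample(1.0, rng) for _ in range(1000)]
print(set(zeros), set(ones))  # prints: {0} {1}

# Intermediate parameter: sample mean and variance approximate p and pq.
draws = [bernoulli_sample(0.3, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)  # should be close to p = 0.3 and pq = 0.21
```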

## Summary

This section introduced the Bernoulli distribution, which models random variables whose
outcome is one of two values: 0 or 1, true or false, success or failure, and so on. The distribution is defined by
a single parameter $p$, the probability of _success_.

## References

1. Larry Wasserman, _All of Statistics. A Concise Course in Statistical Inference_, Springer 2003.