# Statistical Independence

The word “independence” generally means freedom from external control or influence, and it carries many connotations in US culture, as it probably does throughout the world. We will apply the concept of independence to many random phenomena, and the implication of independence is generally the same as the definition above: phenomena that are independent cannot influence each other.

In fact, we have already been applying the concept of independence throughout this book when we assume that the outcome of a coin flip, die roll, or simulation does not depend on the values seen in other trials of the same type of experiment. However, now we have the mathematical tools to define the concept of independence precisely.



## Conditional probabilities and independence

Based on the discussion above, try to answer the following question about what independence should mean for conditional probabilities. (Don't worry if you don't intuitively know the answer -- you can keep trying if you don't get it right at first!)

In [1]:
from jupyterquiz import display_quiz
git_path="https://raw.githubusercontent.com/jmshea/Foundations-of-Data-Science-with-Python/main/questions/"

#display_quiz("../questions/si-conditional.json")
display_quiz(git_path + "si-conditional.json")


Click the “+” sign to reveal the answer and discussion -->

```{toggle}

If $B$ is independent of $A$, then knowledge of $A$ occurring should not change the probability of $B$ occurring. I.e., if we are *given* that $A$ occurred, then the conditional probability of $B$ occurring should equal the unconditional probability:

$$
P(B|A) = P(B)
$$

Let's see the implications of this by substituting the formula for $P(B|A)$ from the definition of conditional probability, which requires $P(A)>0$:

$$
\begin{aligned}
\frac{P(A \cap B)}{P(A)} &= P(B) \\
\Rightarrow P(A \cap B) &= P(A)P(B)
\end{aligned}
$$ (p-b-given-a)

Now we might ask: if $B$ is independent of $A$, does that imply that $A$ is independent of $B$? Let's assume that {eq}`p-b-given-a` holds and apply the result to the definition of $P(A|B)$, assuming that $P(B)>0$:

\begin{align*}
P(A|B) & =\frac{ P(A \cap B) } {P(B) } \\
& = \frac{ P(A) P( B) } {P(B) } \\
& = P(A) 
\end{align*}

So if $P(B|A) = P(B)$, then $P(A|B)=P(A)$.  


```
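Once you have worked through the quiz and its discussion, the symmetry between $P(B|A)=P(B)$ and $P(A|B)=P(A)$ can be checked numerically. The following is a minimal sketch, assuming a single roll of a fair six-sided die. The events `A` and `B` and the helper names `prob` and `cond_prob` are illustrative choices that do not come from the text.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die (equally likely outcomes)
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # illustrative event: the roll is even
B = {1, 2}      # illustrative event: the roll is less than 3


def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


def cond_prob(event, given):
    """Conditional probability P(event | given); requires P(given) > 0."""
    return prob(event & given) / prob(given)


print("P(B|A) =", cond_prob(B, A), "and P(B) =", prob(B))
print("P(A|B) =", cond_prob(A, B), "and P(A) =", prob(A))
```

For these two events, both conditional probabilities equal the corresponding unconditional probabilities, which is consistent with the derivation in the discussion.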

## Formal definition of statistically independent events

A simple definition for statistical independence of events that is consistent with all the forms of independence discussed above, and that can handle events with probability zero, is as follows:

```{card}
DEFINITION
^^^
statistically independent (two events)
: Given a probability space $(S, \mathcal{F}, P)$ and two events $A\in \mathcal{F}$ and $B \in \mathcal{F}$, $A$ and $B$ are *statistically independent* if and only if (iff) 

$$
P(A \cap B) = P(A)P(B).
$$
```

If the context is clear, we will often just write “independent” instead of “statistically independent” or write *s.i.*, which is a commonly used abbreviation. 

````{note}

Please take time to study the definition of *statistically independent* carefully. In particular, note the following:
* **Events** can be statistically independent or not
* Probabilities **are not** statistically independent or dependent; independence is a property of events
* The “if and only if” statement means that the definition applies in both directions:
    * If events $A$ and $B$ are statistically independent, then the probability of the intersection of the events factors as the product of the probabilities of the individual events, $P(A \cap B) = P(A)P(B)$.
    * If we have events $A$ and $B$ for which $P(A \cap B) = P(A)P(B)$, then $A$ and $B$ are statistically independent.
````
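The definition can be applied mechanically whenever the sample space is small enough to enumerate. The following is a minimal sketch, assuming two rolls of a fair six-sided die with 36 equally likely outcomes. The function `are_independent` and the three example events are hypothetical names introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two rolls of a fair die, 36 equally likely outcomes
S = set(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))


def are_independent(A, B):
    """Apply the definition directly: is P(A ∩ B) equal to P(A) P(B)?"""
    return prob(A & B) == prob(A) * prob(B)


first_even = {s for s in S if s[0] % 2 == 0}    # depends only on roll 1
second_high = {s for s in S if s[1] > 4}        # depends only on roll 2
sum_high = {s for s in S if sum(s) >= 10}       # depends on both rolls

print(are_independent(first_even, second_high))
print(are_independent(first_even, sum_high))
```

The first pair factors ($1/6 = 1/2 \cdot 1/3$), so those events are s.i. The second pair does not ($1/9 \ne 1/2 \cdot 1/6$). Exact fractions are used so that the equality test is not affected by floating-point rounding.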

## When can we assume independence?

Statistical independence is often assumed for many types of events. However, it is important to be careful when applying such a strong assumption because events can be coupled in ways that are subtle. For example, consider the Magician's Coin example. Many people assume that the event of getting Heads on the second flip of the chosen coin will be independent of the outcome of the first flip of the coin. However, we have seen that this assumption is wrong! So, when can we assume that events will be independent?

**Events can be assumed to be statistically independent if they arise from completely separate random phenomena.**

In the case of the Magician's Coin, this assumption is violated in a subtle way. If we knew that the two-headed coin was in use, then we would know the results completely. What is subtle is the fact that observing the outcome of the first flip may give some information about which coin is in use (although we won't be able to show this for observing heads on the first flip until Chapter 6).
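The coupling can be seen by testing the product rule directly. The following is a sketch under stated assumptions: this section does not restate the Magician's Coin setup, so the code assumes that a fair coin and a two-headed coin are equally likely to be chosen and that, once the coin is fixed, its flips do not influence each other. The variable names are illustrative.

```python
from fractions import Fraction

# ASSUMED setup (not restated in this section): the magician picks either a
# fair coin or a two-headed coin with equal probability, then flips it twice.
p_coin = {"fair": Fraction(1, 2), "two-headed": Fraction(1, 2)}
p_heads = {"fair": Fraction(1, 2), "two-headed": Fraction(1)}

# Total probability over the choice of coin.
# ASSUMPTION: given which coin is in use, the two flips are independent.
p_h1 = sum(p_coin[c] * p_heads[c] for c in p_coin)
p_h2 = p_h1  # the second flip has the same marginal probability
p_h1_and_h2 = sum(p_coin[c] * p_heads[c] ** 2 for c in p_coin)

print("P(H1) P(H2) =", p_h1 * p_h2)
print("P(H1 ∩ H2)  =", p_h1_and_h2)
```

Under these assumptions, $P(H_1)P(H_2) = 9/16$ while $P(H_1 \cap H_2) = 5/8$, so the two flips are not statistically independent even though each coin behaves independently from flip to flip.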

Examples that are assumed to result from separate random phenomena are extensive:
* **Devices to generate randomness in games:** Independence can usually be assumed for different flips of a fair coin, rolls of a fair die, or card hands drawn from shuffled decks.
* **Failures of different devices in systems:** mechanical and electrical devices fail at random, and the failures of different devices are often assumed to be independent; examples include light bulbs in a building or computers in a lab.
* **Characteristics of people unrelated to any grouping of those people:** for example, for a group of people at a meeting, having a March birthday would generally be independent events across any two people. 

Let's apply this concept to find a simpler way to solve a problem that was introduced in {doc}`../04-probability1/axiomatic-prob`:

**Example**

**(Take 3)** A fair six-sided die is rolled twice.  What is the probability that either of the rolls is a value less than 3? 

As before, let $E_i$ be the event that the top face on roll $i$ is less than 3, for $i=1,2$.

We assume that different rolls of the die are independent, so $E_1$ and $E_2$ are independent. 

As in {doc}`../04-probability1/corollaries`, we can use Corollary 5 of the Axioms of Probability to write

$$
P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)
$$

Before, we had to enumerate $E_1 \cap E_2$ over the sample space for the combined rolls of the die to determine $P(E_1 \cap E_2)$. Now, we can just apply statistical independence to write $P(E_1 \cap E_2) = P(E_1)P(E_2)$, yielding

\begin{align*}
P(E_1 \cup E_2) &= P(E_1) + P(E_2) - P(E_1)P(E_2) \\
&= \frac{1}{3}  + \frac{1}{3} - \left(\frac{1}{3}\right)\left(\frac{1}{3} \right) \\
&= \frac 5 9 .
\end{align*}
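The result can be cross-checked against direct enumeration of the combined sample space. This is a minimal sketch; the variable names are illustrative, and the enumeration assumes all 36 ordered pairs of faces are equally likely.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two rolls of a fair six-sided die
S = list(product(range(1, 7), repeat=2))

# Outcomes in E_1 ∪ E_2: at least one of the rolls is less than 3
favorable = [s for s in S if s[0] < 3 or s[1] < 3]
p_enumerated = Fraction(len(favorable), len(S))

# The same probability using independence: P(E1) + P(E2) - P(E1) P(E2)
p = Fraction(2, 6)
p_formula = p + p - p * p

print(p_enumerated, p_formula)
```

Both approaches give $5/9$, but the formula based on independence avoids listing the outcomes.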

**Exercises**

Answer these questions to practice this form of statistical independence:


In [2]:
#display_quiz("../questions/si1.json")
display_quiz(git_path + "si1.json")


 If $A$ and $B$ are s.i. events, then the following pairs of events are also s.i.:
* $A$ and $\overline{B}$
* $\overline{A}$ and $B$
* $\overline{A}$ and $\overline{B}$

I.e., if the probability of an event $A$ occurring does not depend on whether some event $B$ occurs, then it cannot depend on whether the event $B$ does not occur.  This probably matches your intuition. However, we should verify it. Let's check the first example. We need to evaluate $P(A \cap \overline{B})$ to see if it factors as $P(A)P(\overline{B})$. Referring to the Venn diagram below, we can see that $A$ consists of the union of the mutually exclusive parts,  $A \cap B$ and $A \cap \overline{B}$. So we can write $P\left(A \cap \overline{B} \right)= P(A) - P(A \cap B)$. 


<img src="figs/si-intersection.png" alt="Venn Diagram Showing Relation of $A$, $A \cap \overline{B}$, and $A \cap B$" width="400px" style="margin-left:auto;margin-right:auto;">



Then by utilizing the fact that $A$ and $B$ are s.i., we have
\begin{align}
P\left(A \cap \overline{B} \right) &= P(A) - P(A \cap B) \\
 &= P(A) - P(A) P(B) \\
 &= P(A) \left[ 1- P\left(B\right) \right] \\
 &= P(A) P\left( \overline{B} \right)
\end{align}

So, if $A$ and $B$ are s.i., so are $A$ and $\overline{B}$. The other expressions follow through similar manipulation. This is important because we often use this fact to simplify solving problems. We start with a simple example to demonstrate the basic technique:

**Example**

**(Take 4)** A fair six-sided die is rolled twice.  What is the probability that either of the rolls is a value less than 3? 

As before, let $E_i$ be the event that the top face on roll $i$ is less than 3, for $i=1,2$. Because $E_1$ and $E_2$ are s.i., so are $\overline{E_1}$ and $\overline{E_2}$. Then
\begin{align}
P(E_1 \cup E_2) &= 1 - P\left(\overline{E_1 \cup E_2}\right) \\
&= 1 - P\left( \overline{E_1} \cap \overline{E_2} \right) \\
&= 1 - P\left( \overline{E_1} \right) P\left( \overline{E_2} \right) \\
&= 1 - \left[ 1 - P\left( {E_1} \right)\right]
\left[ 1- P\left( {E_2} \right) \right]\\
&= 1- \left[ 1 - \left( \frac 2 6 \right) \right] \left[ 1 - \left( \frac 2 6 \right) \right] \\
&= \frac 5 9
\end{align}
Of course for this simple example, it is easiest to directly compute $P\left(\overline{E_1} \right)$, but the full approach shown here is a template that is encountered often when dealing with unions of s.i. events. 
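The complement property that justifies the third step of this example can be checked by enumeration. The following is a minimal sketch, assuming the 36 ordered outcomes of the two rolls are equally likely. The helper names `prob` and `factors` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Two rolls of a fair die: 36 equally likely outcomes
S = set(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))


def factors(X, Y):
    """True if P(X ∩ Y) = P(X) P(Y)."""
    return prob(X & Y) == prob(X) * prob(Y)


E1 = {s for s in S if s[0] < 3}   # roll 1 is less than 3
E2 = {s for s in S if s[1] < 3}   # roll 2 is less than 3
E1_bar, E2_bar = S - E1, S - E2   # complements with respect to S

checks = {
    "E1, not E2": factors(E1, E2_bar),
    "not E1, E2": factors(E1_bar, E2),
    "not E1, not E2": factors(E1_bar, E2_bar),
}
print(checks)

# Probability of the union via the complement of "neither roll is less than 3"
p_union = 1 - prob(E1_bar & E2_bar)
print(p_union)
```

All three pairs factor, and the complement route reproduces $5/9$.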

To see the power of this method, we first need to define s.i. for more than two events:

````{card}
DEFINITION
^^^
statistically independent (for any number of events)
: Given a probability space $(S, \mathcal{F}, P)$, a collection of events $E_0, E_1, \ldots, E_{n-1}$ in $\mathcal{F}$ are *statistically independent* if and only if (iff) 

\begin{align}
P(E_i \cap E_j) &= P(E_i) P(E_j), ~~ \forall \text{ distinct } i, j \\
P(E_i \cap E_j \cap E_k) &= P(E_i) P(E_j) P(E_k), ~~ \forall \text{ distinct } i, j, k \\
&~~\vdots \\
P(E_0 \cap E_1 \cap \ldots \cap E_{n-1}) &= P(E_0) P(E_1) \cdots P(E_{n-1}).
\end{align}
````

It is not sufficient to just check that the probability of every pair of events factors as the product of the probabilities of the individual events. That defines a weaker form of independence:

````{card}
DEFINITION
^^^
pairwise statistically independent 
: Given a probability space $(S, \mathcal{F}, P)$, a collection of events $E_0, E_1, \ldots, E_{n-1}$ in $\mathcal{F}$ are *pairwise statistically independent* if and only if (iff) 

\begin{align}
P(E_i \cap E_j) &= P(E_i) P(E_j), ~~ \forall i \ne j 
\end{align}
````
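A standard illustration of the gap between the two definitions uses two flips of a fair coin. The following is a minimal sketch; the events `A`, `B`, and `C` are an illustrative choice that does not come from the text, and the four outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import combinations, product

# Two flips of a fair coin: four equally likely outcomes
S = set(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))


A = {s for s in S if s[0] == "H"}     # first flip is heads
B = {s for s in S if s[1] == "H"}     # second flip is heads
C = {s for s in S if s[0] == s[1]}    # the two flips match

# Pairwise condition: every pair of events factors
pairwise = all(
    prob(X & Y) == prob(X) * prob(Y) for X, Y in combinations([A, B, C], 2)
)

# Additional condition required for statistical independence of all three
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)

print("pairwise s.i.:", pairwise)
print("triple intersection factors:", triple)
```

Every pair factors, but $P(A \cap B \cap C) = 1/4$ while $P(A)P(B)P(C) = 1/8$, so these events are pairwise s.i. without being s.i.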



For s.i. events, we can use complements to convert unions into intersections. The resulting general form is
\begin{align}
P\left( \bigcup_i E_i \right) &=
1- \prod_i \left[ 1- P\left( E_i \right) \right].
\end{align}
It may be helpful to interpret this as follows: The complement of any of a collection of events occurring is that none of those events occurs; thus the probability that any of a collection of events occurs is one minus the probability that none of those events occurs. 
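The general form translates directly into code. This is a minimal sketch, assuming the events are statistically independent and that their individual probabilities are known. The function name `prob_union_independent` is hypothetical.

```python
from fractions import Fraction
from math import prod


def prob_union_independent(probs):
    """P(at least one event occurs) for s.i. events with the given probabilities.

    Computes 1 - prod(1 - p_i); valid only if the events are
    statistically independent.
    """
    return 1 - prod(1 - p for p in probs)


# Two die rolls, each with probability 1/3 of showing a value less than 3
two_rolls = prob_union_independent([Fraction(1, 3)] * 2)

# The same question for ten rolls needs no extra work
ten_rolls = prob_union_independent([Fraction(1, 3)] * 10)

print(two_rolls, float(ten_rolls))
```

For two rolls this reproduces $5/9$, and for ten rolls it evaluates $1 - (2/3)^{10}$ with the same single line of computation.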

Compare the simplicity of this approach to the form for directly solving for the probability of unions of events (Corollary 7 from {doc}`../04-probability1/corollaries`):


\begin{eqnarray*}
P\left( \bigcup_{k=1}^{n} A_k \right) &=& 
\sum_{k=1}^{n} P\left(A_k\right)
    -\sum_{j<k} P \left( A_j \cap A_k \right) + \cdots \\
    && + 
    (-1)^{(n+1) } P\left(A_1 \cap A_2 \cap \cdots \cap A_n \right)
\end{eqnarray*} 
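To make the comparison concrete, the two forms can be evaluated side by side. The following is a sketch that assumes the events are s.i., so that every intersection probability in the expansion factors into a product. The function names and the example probabilities are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def union_by_complements(probs):
    """1 - prod(1 - p_i): one product over the n events."""
    return 1 - prod(1 - p for p in probs)


def union_by_inclusion_exclusion(probs):
    """Alternating sum over all non-empty subsets of the events.

    ASSUMPTION: the events are s.i., so the probability of each
    intersection is the product of the individual probabilities.
    """
    total = 0
    for r in range(1, len(probs) + 1):
        sign = (-1) ** (r + 1)
        total += sign * sum(prod(c) for c in combinations(probs, r))
    return total


# Hypothetical probabilities for four s.i. events
probs = [Fraction(1, 3), Fraction(1, 4), Fraction(1, 5), Fraction(1, 6)]

print(union_by_complements(probs))
print(union_by_inclusion_exclusion(probs))
```

Both functions return $2/3$ for this input. The inclusion-exclusion version visits every non-empty subset of the events, so its cost grows exponentially with $n$, while the complement form needs only $n$ factors.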

Now apply this approach to solve the following practice problems:


In [3]:
from jupyterquiz import display_quiz
git_path="https://raw.githubusercontent.com/jmshea/Foundations-of-Data-Science-with-Python/main/"

#display_quiz("quiz/si-unions.json")
display_quiz(git_path + "06-conditional-prob/quiz/si-unions.json")

## Relating Statistically Independent and Mutually Exclusive Events

In [4]:
git_path1="https://raw.githubusercontent.com/jmshea/Foundations-of-Data-Science-with-Python/main/06-conditional-prob/quiz/"

#display_quiz("quiz/si-me.json")
display_quiz(git_path1 + "si-me.json")



Click the “+” sign to reveal the discussion -->

```{toggle}


Suppose $A$ and $B$ are events that are both mutually exclusive and statistically independent. 

Since $A$ and $B$ are m.e., $A \cap B = \emptyset$, which further implies $P(A \cap B) = P(\emptyset) =0$.

Since $A$ and $B$ are s.i., $P(A \cap B) = P(A) P(B)$.

Combining these, we have that $P(A \cap B) = P(A)P(B) = 0$, which can only occur if $P(A)=0$, $P(B)=0$, or both.

Thus, events **cannot be both statistically independent and mutually exclusive unless at least one of the events has probability zero**.  

To gain further insight, consider the m.e. condition, $A \cap B = \emptyset$. This condition implies that if $A$ occurs, then $B$ cannot have occurred, and vice versa. Knowing that either $A$ or $B$ occurred therefore provides a lot of information about the other event, so $A$ and $B$ cannot be independent if they are m.e., except in the special case already identified.



```
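Once you have attempted the quiz, the conclusion of the discussion can be checked on a small example. The following is a minimal sketch, assuming a single roll of a fair six-sided die. The events `A`, `B`, and `Z` are illustrative choices.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die
S = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))


A = {1, 2}   # roll is 1 or 2
B = {3, 4}   # roll is 3 or 4; A and B are mutually exclusive
Z = set()    # the empty event, which has probability zero

# Mutually exclusive events with nonzero probabilities are not s.i.
print(prob(A & B), "vs", prob(A) * prob(B))

# Special case: a probability-zero event is both m.e. and s.i. with A
print(prob(A & Z), "vs", prob(A) * prob(Z))
```

In the first comparison, $P(A \cap B) = 0$ but $P(A)P(B) = 1/9$, so the m.e. events are not s.i. In the second, both sides are zero, which is the special case identified in the discussion.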

## Terminology Review

Use the flashcards below to help you review the terminology introduced in this section.

In [5]:
from jupytercards import display_flashcards

#display_flashcards('flashcards/'+'independence.json')

github='https://raw.githubusercontent.com/jmshea/Foundations-of-Data-Science-with-Python/main/'
github+='06-conditional-prob/flashcards/'
display_flashcards(github+'independence.json')
