![QMUL](Images/QMUL-logo.jpg)

# Statistics for Biologists


## Probability theory - set operations

### Set operations

Consider the experiment of drawing a nucleotide at random. The sample space is $S=\{A,C,G,T\}$ and some possible events are $A=\{G,C,A\}$ and $B=\{T,G\}$. Note that the event $A$ is a set of nucleotides and is distinct from the single nucleotide A.

What is the __union__ of $A$ and $B$? 

In [None]:
# define the sample space
S <- c("A","C","G","T")
# define A and B
A <- c("G","C","A")
B <- c("T","G") 

# union
union(A,B)

Given any two events $A$ and $B$, set operations include:
* Union: the union of $A$ and $B$ is the set of elements that belong to either $A$ or $B$ or both:
\begin{equation}
A \cup B = \{x: x \in A \text{ or } x \in B\}
\end{equation}
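
The definition above can be translated almost literally into R. The following is a minimal sketch, assuming the events from the running example; the name `union_by_definition` is introduced here for illustration only.

```r
# Events from the running example
S <- c("A", "C", "G", "T")
A <- c("G", "C", "A")
B <- c("T", "G")

# Keep every element of S that is in A or in B (or both)
union_by_definition <- S[S %in% A | S %in% B]

# setequal() compares two vectors as sets, ignoring element order
setequal(union_by_definition, union(A, B))
```

The logical operator `|` plays the role of "or" in the definition, so the comparison should confirm that both routes give the same set.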

The sample space is $S=\{A,C,G,T\}$ and some possible events are $A=\{G,C,A\}$ and $B=\{T,G\}$. 

What is the __intersection__ of $A$ and $B$?

In [None]:
# define the sample space
S <- c("A","C","G","T")
# define A and B
A <- c("G","C","A")
B <- c("T","G")

# intersection
intersect(A,B)

* Intersection: the intersection of $A$ and $B$ is the set of elements that belong to both $A$ and $B$:
\begin{equation}
A \cap B = \{x: x \in A \text{ and } x \in B\}
\end{equation}
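
As with the union, the definition can be written directly as a logical filter. This is a sketch under the same assumptions as the running example; `intersection_by_definition` is an illustrative name.

```r
# Events from the running example
S <- c("A", "C", "G", "T")
A <- c("G", "C", "A")
B <- c("T", "G")

# Keep every element of S that is in A and also in B
intersection_by_definition <- S[S %in% A & S %in% B]

# Compare with the built-in function, ignoring element order
setequal(intersection_by_definition, intersect(A, B))
```

Here `&` plays the role of "and" in the definition.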

The sample space is $S=\{A,C,G,T\}$ and some possible events are $A=\{G,C,A\}$ and $B=\{T,G\}$. 

What is the __complement__ of $A$? 

In [None]:
# define the sample space
S <- c("A","C","G","T")
# define A
A <- c("G","C","A")

# complement of A
setdiff(S,A)

* Complementation: the complement of $A$ is the set of all elements of the sample space $S$ that are not in $A$:
\begin{equation}
A^c = \{x \in S: x \notin A\}
\end{equation}
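
Because the complement is always taken relative to the sample space, it can be convenient to wrap `setdiff()` in a small helper. The function `complement()` below is a hypothetical helper written for these notes, not a base R function.

```r
# Sample space and event from the running example
S <- c("A", "C", "G", "T")
A <- c("G", "C", "A")

# Hypothetical helper: complement of event E relative to sample space S
complement <- function(E, S) setdiff(S, E)

# Complement written directly from the definition: elements of S not in A
complement_by_definition <- S[!(S %in% A)]

# Both approaches should describe the same set
setequal(complement_by_definition, complement(A, S))
```

Passing `S` explicitly makes the dependence on the sample space visible: the same event has a different complement in a different sample space.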

### Exercise

The sample space is $S=\{A,C,G,T\}$ and some possible events are $A=\{G,C,A\}$ and $B=\{T,G\}$. 

What is the complement of the union of $A$ and $B$?

Note that base R provides several functions for set operations, such as `union()`, `intersect()`, `setdiff()` and `setequal()`, as outlined in the [R documentation on set operations](https://stat.ethz.ch/R-manual/R-devel/library/base/html/sets.html).

This exercise leads to a special set, the empty set $\emptyset$, which contains no elements.
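
One possible way to check the exercise in R is sketched below, combining the functions used in the earlier cells. The variable names `A_union_B` and `complement_of_union` are illustrative.

```r
# Sample space and events from the exercise
S <- c("A", "C", "G", "T")
A <- c("G", "C", "A")
B <- c("T", "G")

# First take the union, then its complement relative to S
A_union_B <- union(A, B)
complement_of_union <- setdiff(S, A_union_B)

# An empty character vector represents the empty set
complement_of_union
length(complement_of_union) == 0
```

In R the empty set shows up as a vector of length zero, which is why the last line tests the length rather than comparing against a specific value.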

Venn diagrams are another useful way to visualise (but not prove) set operations.

The elementary set operations can also be combined, and these combinations satisfy several useful properties.

### Properties of set operations

* Commutativity
\begin{equation}
\begin{aligned}
A \cup B &= B \cup A \\
A \cap B &= B \cap A
\end{aligned}
\end{equation}

* Associativity
\begin{equation}
\begin{aligned}
A \cup (B \cup C) &= (A \cup B) \cup C \\
A \cap (B \cap C) &= (A \cap B) \cap C
\end{aligned}
\end{equation}

* Distributive laws
\begin{equation}
\begin{aligned}
A \cap (B \cup C) &= (A \cap B) \cup (A \cap C) \\
A \cup (B \cap C) &= (A \cup B) \cap (A \cup C)
\end{aligned}
\end{equation}

* De Morgan's laws
\begin{equation}
\begin{aligned}
(A \cup B)^c &= A^c \cap B^c \\
(A \cap B)^c &= A^c \cup B^c
\end{aligned}
\end{equation}
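
The properties listed above can be checked on the nucleotide example. The sketch below assumes a third event `C`, chosen arbitrarily for illustration, and a hypothetical helper `complement()`; neither appears earlier in these notes. A check on one example illustrates the properties but does not prove them.

```r
# Sample space and events; C is a hypothetical third event
S <- c("A", "C", "G", "T")
A <- c("G", "C", "A")
B <- c("T", "G")
C <- c("A", "T")

# Hypothetical helper: complement relative to the sample space S
complement <- function(E) setdiff(S, E)

# Each entry compares the two sides of one identity as sets
checks <- c(
  commutative_union     = setequal(union(A, B), union(B, A)),
  commutative_intersect = setequal(intersect(A, B), intersect(B, A)),
  associative_union     = setequal(union(A, union(B, C)),
                                   union(union(A, B), C)),
  associative_intersect = setequal(intersect(A, intersect(B, C)),
                                   intersect(intersect(A, B), C)),
  distributive_1        = setequal(intersect(A, union(B, C)),
                                   union(intersect(A, B), intersect(A, C))),
  distributive_2        = setequal(union(A, intersect(B, C)),
                                   intersect(union(A, B), union(A, C))),
  de_morgan_1           = setequal(complement(union(A, B)),
                                   intersect(complement(A), complement(B))),
  de_morgan_2           = setequal(complement(intersect(A, B)),
                                   union(complement(A), complement(B)))
)

# Named logical vector: one result per identity
checks
```

Changing `A`, `B` and `C` to other subsets of `S` is a quick way to see that the identities do not depend on the particular events chosen.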

The operations of union and intersection can be extended to infinite collections of sets.
As an example, let $S=(0,1]$ and define $A_i = [1/i, 1]$ for $i = 1, 2, \dots$.
What are $\bigcup_{i=1}^\infty A_i$ and $\bigcap_{i=1}^\infty A_i$?
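
An infinite union or intersection cannot be computed directly, but finite truncations up to some $n$ can be explored numerically. The sketch below is one way to do this; the function names `in_A`, `in_finite_union` and `in_finite_intersection`, and the test points in `x_values`, are assumptions made for illustration. The output can suggest an answer, which should then be justified with a mathematical argument.

```r
# Is x an element of A_i = [1/i, 1]? Vectorised over i
in_A <- function(x, i) x >= 1 / i & x <= 1

# x is in the union of A_1, ..., A_n if it is in at least one of them
in_finite_union <- function(x, n) any(in_A(x, seq_len(n)))

# x is in the intersection of A_1, ..., A_n if it is in all of them
in_finite_intersection <- function(x, n) all(in_A(x, seq_len(n)))

# A few test points in S = (0, 1]
x_values <- c(0.002, 0.02, 0.5, 1)

# See how membership changes as more sets are included
for (n in c(10, 100, 1000)) {
  print(data.frame(
    n = n,
    x = x_values,
    in_union = sapply(x_values, in_finite_union, n = n),
    in_intersection = sapply(x_values, in_finite_intersection, n = n)
  ))
}
```

Watching which points enter the union as `n` grows, and which points survive in the intersection, gives a feel for the limiting sets.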