# Measure Theory
### $\sigma$-algebra
Given a set $\Omega$, a $\sigma$-algebra on $\Omega$ is a collection $A$ of subsets of $\Omega$, i.e. a subset of the power set, $A \subseteq 2^{\Omega}$, such that $A$ is nonempty and $A$ is:
1. Closed under **complements**. In other words, if some set $E$ is in $A$, then the complement of $E$ is also in $A$: $E \in A \implies E^c \in A$.
2. Closed under **countable unions**. In other words, if we have a countable collection of sets $E_1, E_2, \dots$, all of which are members of $A$, then their union is also in $A$: $\bigcup_{i=1}^{\infty} E_i \in A$.

#### Remarks
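1. The set $\Omega$ itself is always a member of any $\sigma$-algebra on $\Omega$. In other words, $\Omega \in A$. This follows because $A$ is nonempty: it contains some $E$, hence $E^c$, and hence the union $E \cup E^c = \Omega$.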

#### Notes
* Note, the **power set** is a collection containing all subsets of $\Omega$. For example, if $\Omega = \{0, 1 \}$ then the power set would be $2^{\Omega} = { \big\{ \emptyset, \{0\}, \{1\}, \{0,1\} \big\} }$
* Note, a [countable set](https://en.wikipedia.org/wiki/Countable_set) is a set with the same cardinality as some subset of the natural numbers. 

TODO: Open vs. closed sets?

---

# $\sigma$ Algebra
We can start by looking at subsets on the real line, $\mathbb{R}$ and ask:

> How can we measure this subset?

This is what **measure theory** is all about! We want to give the subsets a meaningful measure, or in other words, a generalized volume. In the case of the real line we are looking to find a generalized length. 

If we had a simple case where we had the interval $[a, b]$ on the real line, we can easily see that the length is simply:

$$\text{Length} = b - a$$

However, what do we do if the subset that we are interested in is more complicated than such an easy interval? How can we then calculate the length? 

Enter: Measure Theory. We may also want to deal with different notions of length altogether, so we want to generalize the notion of length. We also want to measure areas in $\mathbb{R}^2$, or even higher dimensional volumes (for instance, in $\mathbb{R}^3$). 

So, it clearly makes sense to immediately start an abstract measure theory. We start by looking at an abstract set $X$. For this set we want to measure the generalized volume of the subsets. 

We can start with the **power set** of $X$, $P(X)$, which is the set of all subsets of $X$. As a quick example, consider the set:

$$X = \{ a, b \}$$

Then, the power set is:

$$P(X) = \Big\{ \{a, b\}, \{a\}, \{b\},  \emptyset \Big\}$$
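
As a small illustration, the power set of a finite set can be enumerated directly. This is only a sketch: the helper name `power_set` is made up for this example, and subsets are represented as `frozenset`s so that they can themselves be members of a set.

```python
from itertools import chain, combinations

def power_set(xs):
    """Return the power set of a finite iterable as a set of frozensets."""
    items = list(xs)
    # Take all combinations of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

P = power_set({"a", "b"})
print(len(P))  # 4 subsets: {}, {a}, {b}, {a, b}
```

A set with $n$ elements has $2^n$ subsets, which is one reason the power set is also written $2^X$.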

We can now build up to the definition of a *measurable set*. We start with a collection of subsets:

> $A \subseteq P(X)$, a subset of the power set. 

Note that the elements of $A$ are subsets of $X$. A collection such as this is called a **$\sigma$-algebra** if it fulfills the following rules:

1. $\emptyset, X \in A$. This is the case because we want these sets to be measurable. 
2. If we know we can measure a set $a$, $a \in A$, then we must be able to measure the complement: $a^c = X \setminus  a$. So on the whole we have:

$$a \in A \longrightarrow a^c = X \setminus a \in A$$

3. We start with countably many subsets, $a_i \in A, i \in \mathbb{N}$. We could have a finite number, but the important thing is that if we do have infinitely many, they are _countable_. Then the union of all the sets is also in the $\sigma$-algebra:

$$a_i \in A, i \in \mathbb{N} \longrightarrow \bigcup_{i=1}^{\infty}a_i \in A$$

This means that we cannot leave the $\sigma$-algebra by taking the ordinary union of two sets, nor even by taking a countable union of infinitely many sets. Keep in mind that the union $\bigcup_{i=1}^{\infty}a_i$ is itself a subset of $X$. 

Note that the 3rd rule comes from the measuring point of view, and it gives the $\sigma$ its meaning: the $\sigma$ signals that countable unions are allowed. 

If the above rules are fulfilled, then $A$ is a $\sigma$-algebra. 
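
For a finite set $X$ the three rules can be checked mechanically. The following is a minimal sketch, not a general-purpose tool: the function name `is_sigma_algebra` is hypothetical, and it relies on the fact that on a finite set every countable union reduces to a finite one, so closure under pairwise unions is enough.

```python
def is_sigma_algebra(X, A):
    """Check the three sigma-algebra rules for a collection A on a finite set X."""
    X = frozenset(X)
    A = {frozenset(a) for a in A}
    if any(not a <= X for a in A):
        return False  # every member of A must be a subset of X
    if frozenset() not in A or X not in A:
        return False  # rule 1: the empty set and X belong to A
    if any(X - a not in A for a in A):
        return False  # rule 2: closed under complements
    # Rule 3: on a finite set, pairwise unions cover all countable unions.
    return all(a | b in A for a in A for b in A)

X = {1, 2, 3}
print(is_sigma_algebra(X, [set(), {1}, {2, 3}, X]))  # True
print(is_sigma_algebra(X, [set(), {1}, X]))          # False: complement of {1} is missing
```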

So, we can arrive at a final definition:

> $a \in A$ is called a **measurable set**. 

Note that the elements of the $\sigma$-algebra are the _measurable sets_; in other words, these are the sets we can measure in the end. 

### Example 1
We know that the $\sigma$-algebra needs at least two elements, the empty set and the set $X$ itself:

$$A = \Big\{\emptyset, X \Big\}$$

This is always the smallest possible $\sigma$-algebra. Rules 1, 2, 3 are trivially fulfilled. 

### Example 2 
We can also ask what the largest possible $\sigma$-algebra is. This is also very easy to see; it is the powerset:

$$A = P(X)$$

The powerset fulfills all of our rules (by definition, all possible subsets are in the powerset).

Note that we generally may not have the powerset available, but it is ideal to be as close as possible (and hence have as many measurable sets as possible). 

# Borel $\sigma$-Algebra
Imagine that we have a lot of different $\sigma$-algebras on the given set $X$:

$$A_i \; \sigma\text{-algebras on } X, i \in I \text{ (index set)} $$

It does not matter if the index set is countable or not. We can now look at the intersection of these $\sigma$-algebras:

$$\bigcap_{i \in I} A_i \longrightarrow \text{This intersection is also a }\sigma \text{-algebra on }X$$

So, we see that the intersection of $\sigma$-algebras is again a $\sigma$-algebra on $X$, and it is contained in each of the $A_i$. 
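
A small finite check can make this concrete. The sketch below uses two hand-picked $\sigma$-algebras `A1` and `A2` on $X = \{1, 2, 3, 4\}$ (both chosen for this example only) and a hypothetical helper `is_sigma_algebra` that tests the three rules on a finite set. It also shows that the *union* of two $\sigma$-algebras need not be one.

```python
def is_sigma_algebra(X, A):
    """Finite-set check: contains the empty set and X, closed under complement and union."""
    return (
        frozenset() in A
        and X in A
        and all(X - a in A for a in A)
        and all(a | b in A for a in A for b in A)
    )

fs = frozenset
X = fs({1, 2, 3, 4})
# A1 is built from the blocks {1}, {2}, {3, 4}; A2 from the blocks {1, 2}, {3}, {4}.
A1 = {fs(), fs({1}), fs({2}), fs({1, 2}), fs({3, 4}), fs({1, 3, 4}), fs({2, 3, 4}), X}
A2 = {fs(), fs({1, 2}), fs({3}), fs({4}), fs({1, 2, 3}), fs({1, 2, 4}), fs({3, 4}), X}

print(is_sigma_algebra(X, A1), is_sigma_algebra(X, A2))  # True True
print(is_sigma_algebra(X, A1 & A2))  # True: the intersection is a sigma-algebra
print(is_sigma_algebra(X, A1 | A2))  # False: {1} | {3} = {1, 3} is missing
```

Here the intersection is $\{\emptyset, \{1,2\}, \{3,4\}, X\}$, which is smaller than both `A1` and `A2`.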

> **Definition** $\longrightarrow$ Let us have a fixed family of subsets, $M \subseteq P(X)$, that at first does not need to form a $\sigma$-algebra. There is a smallest $\sigma$-algebra (with respect to set inclusion) that contains $M$: 
$$\bigcap_{A \supseteq M} A$$
Where we must remember that the $A$'s are $\sigma$-algebras themselves, and by taking the intersection of all of them, we end up with the smallest $\sigma$-algebra that includes $M$ (by definition, the intersection above runs over the $\sigma$-algebras $A$ that are supersets of $M$; there is always at least one, namely $P(X)$). This is a long definition to write, so we generally just write it as:
$$\overbrace{\sigma(M) = \bigcap_{A \supseteq M} A}^\text{$\sigma$-algebra generated by $M$}$$

### Example
To understand the above definition, let's look at an example. Let us have a set $X$ with four elements:

$$X = \big\{ a, b, c, d \big\}$$

We can then define a set of subsets; in this case we can just choose singletons:

$$M = \Big\{ \{a\}, \{b\} \Big\}$$

Note that $M$ is _not_ a $\sigma$-algebra yet. We can form our $\sigma(M)$, the smallest $\sigma$-algebra that contains this family of subsets:

$$\sigma(M) = 
\Big\{ \emptyset, X, \{a\}, \{b\}, \{a, b\}, \{ b, c, d\}, \{ a, c, d\}, \{ c, d\} \Big\}
$$

We can annotate our equation below:

$$
\sigma(M) = 
\Big\{ 
\overbrace{\emptyset, X}^\text{1st property}, 
\overbrace{\{a\}, \{b\}}^\text{includes $M$}, 
\overbrace{\{a, b\}}^\text{3rd property, countable unions},
\overbrace{ \{ b, c, d\}, \{ a, c, d\}, \{ c, d\}}^\text{2nd property, complements}
\Big\}$$

We see that $\sigma(M)$ contains both the empty set and the entire set $X$, which satisfies the first property of our $\sigma$-algebra definition. We also see that our entire set $M = \Big\{ \{a\}, \{b\} \Big\}$ is also contained in $\sigma(M)$. We also see that all of the countable unions are also present. Finally, we see that all complements are present. 
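
For a finite set, $\sigma(M)$ can also be computed by brute force: start from $M$ together with $\emptyset$ and $X$, then keep adding complements and pairwise unions until nothing new appears. The sketch below assumes this closure procedure (it is not the intersection definition itself, but on a finite set it gives the same result), and the name `generate_sigma_algebra` is made up for this example.

```python
def generate_sigma_algebra(X, M):
    """Smallest sigma-algebra on the finite set X containing every set in M."""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(m) for m in M}
    while True:
        # Everything forced by rule 2 (complements) and rule 3 (unions).
        new = {X - a for a in sigma} | {a | b for a in sigma for b in sigma}
        if new <= sigma:
            return sigma  # closed under both operations, so we are done
        sigma |= new

sigma_M = generate_sigma_algebra("abcd", ["a", "b"])
print(len(sigma_M))  # 8
print(sorted("".join(sorted(s)) for s in sigma_M))
# ['', 'a', 'ab', 'abcd', 'acd', 'b', 'bcd', 'cd']
```

The eight sets printed match the ones listed in the example, with `''` standing for $\emptyset$ and `'abcd'` for $X$.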

So, this is the smallest possible $\sigma$-algebra that can be made from $M$. So, it is not that difficult to get to a $\sigma$-algebra if we start with a finite set. However, if we start with an infinite set, things are much harder (since we need to do infinitely many steps in order to get to the $\sigma$-algebra). This leads us to a final definition.

### Borel $\sigma$-algebra
Let $X$ be a [**topological space**](https://en.wikipedia.org/wiki/Topological_space) (or let $X$ be a [**metric space**](https://en.wikipedia.org/wiki/Metric_space), or even more concretely let $X$ be a subset of $\mathbb{R}^n$). The idea here is that we need **open sets**. 

We want to have all of the open sets in our $\sigma$-algebra. Hence, we will look at the $\sigma$-algebra that the open sets generate. This is known as the **Borel $\sigma$-algebra**, $B(X)$; again, it is the $\sigma$-algebra generated by the open sets. If we are working in a topological space, we are generally working with a set $X$ together with its topology $\tau$, where $\tau$ is a [collection of _open sets_](https://en.wikipedia.org/wiki/Topological_space#Definition_via_open_sets) that are subsets of $X$:

$$B(X) = \sigma(\tau)$$

Where, again: 

> $B(X)$ is the Borel $\sigma$-algebra on $X$. 

Notice that the notation of the topology vanishes; this is because most of the time the topology is clear! For example in $\mathbb{R}^n$ we use our standard topology, so we immediately know what the open sets in $\mathbb{R}^n$ are. 

We must keep in mind that by definition of the Borel $\sigma$-algebra, it includes the topological structure (what open means) into a $\sigma$-algebra. In the case of $\mathbb{R}^n$ this is indeed a really big $\sigma$-algebra, but it is _not_ a power set; in other words, it is not the biggest possible $\sigma$-algebra. However, it is the most suitable $\sigma$-algebra in our context, because it contains all of the sets that we want to measure. 
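
The Borel $\sigma$-algebra on $\mathbb{R}^n$ cannot be enumerated, but the definition $B(X) = \sigma(\tau)$ can be tried out on a toy finite topological space. The sketch below is only an illustration under that assumption: the two topologies `tau1` and `tau2` on $X = \{1, 2, 3\}$ and the helper `generate_sigma_algebra` are invented for this example.

```python
def generate_sigma_algebra(X, M):
    """Close M (plus the empty set and X) under complements and pairwise unions."""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(m) for m in M}
    while True:
        new = {X - a for a in sigma} | {a | b for a in sigma for b in sigma}
        if new <= sigma:
            return sigma
        sigma |= new

X = {1, 2, 3}
tau1 = [set(), {1}, {1, 2}, X]   # open sets of a first toy topology
tau2 = [set(), {1}, {2, 3}, X]   # open sets of a second toy topology

borel1 = generate_sigma_algebra(X, tau1)
borel2 = generate_sigma_algebra(X, tau2)
print(len(borel1))                  # 8: here every subset turns out to be Borel
print(frozenset({2, 3}) in borel1)  # True: the closed set X \ {1} is included
print(len(borel2))                  # 4: here {2} and {3} are not Borel sets
```

The first topology shows that closed sets automatically land in the Borel $\sigma$-algebra (as complements of open sets); the second shows that the Borel $\sigma$-algebra can be strictly smaller than the power set.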

For a good resource on topological spaces, see [here](https://folk.ntnu.no/gereonq/MA3403H2018/MA3403_Lecture02.pdf).

# What is a measure?
To define a measure we must start with a set $X$ and a $\sigma$-algebra, $A$, on the set $X$:

$$(X, A)$$

This pair is then called a **measurable space**. Remember, the $\sigma$-algebra is simply a special collection of subsets of the set $X$. 

Now we will look at special maps that are defined on the $\sigma$-algebra $A$. These maps will be referred to as $\mu$, and they will map into the non-negative extended real numbers, $[0, \infty]$.  

> A map $\mu$: $A \longrightarrow [0, \infty]$ is called a **measure** if it satisfies:
1. $\mu(\emptyset) = 0$
2. **$\sigma$-additive**: Consider the area of the green set below:
<img src="https://intuitive-ml-images.s3-us-west-1.amazonaws.com/mathematics/measure_theory/measure_additivity.png">
We can split that up into disjoint subsets, $A_1,\dots, A_5$, where $A_i \cap A_j = \emptyset$, if $i \neq j$. Added together they should be equivalent to the area of the green box. We can write this as:
$$\mu \Big( \bigcup_{i=1}^5 A_i \Big) = \sum_{i=1}^5 \mu(A_i) 
\;\;\;\; \text{for all $A_i \in A$}
$$

Notice the small detail that was included here. Instead of using the normal interval $[0, \infty)$, we actually used $[0, \infty] = [0, \infty) \cup \{ \infty \}$.

Note that our rules 1 and 2 were derived as follows. We want to measure subsets of this set $X$, which means we want to give a volume to such a subset (i.e. a generalized length or volume). We see that we do _not_ include negative numbers. 
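
One of the simplest concrete measures is the counting measure, which assigns to each set its number of elements. It is not discussed in these notes, so treat the sketch below as an assumed example: it takes the power set of a four-element set as the $\sigma$-algebra and checks rule 1 and additivity over all disjoint pairs.

```python
from itertools import combinations

def mu(E):
    """Counting measure on a finite set: the number of elements of E."""
    return len(E)

X = frozenset({1, 2, 3, 4})
# Use the full power set of X as the sigma-algebra.
A = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

print(mu(frozenset()))  # 0, so rule 1 holds

# Rule 2, restricted to two sets: additivity on every disjoint pair.
additive = all(mu(E | F) == mu(E) + mu(F) for E in A for F in A if not (E & F))
print(additive)  # True

# Without disjointness, additivity fails: the shared element is counted twice.
E, F = frozenset({1, 2}), frozenset({2, 3})
print(mu(E | F), mu(E) + mu(F))  # 3 4
```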

Now, we also want to include the intuition that we can approximate volumes. Consider the green rectangle below:

<img src="https://intuitive-ml-images.s3-us-west-1.amazonaws.com/mathematics/measure_theory/measure_additivity_2.png" width=300>

If we want to calculate the volume of the rectangle, again we can split it up into subsets: 

<img src="https://intuitive-ml-images.s3-us-west-1.amazonaws.com/mathematics/measure_theory/measure_additivity_3.png" width=300>

Again we get the decomposition of our original set into disjoint subsets. However, we now have infinitely many subsets! _But_, they are **countable**. Hence, we have a sequence of subsets:

$$A_1, A_2, A_3, \dots$$

We can still form the union of the subsets to recover the original set! 

$$\mu \Big( \bigcup_{i=1}^{\infty} A_i \Big) = \sum_{i=1}^{\infty} \mu(A_i) 
\;\;\;\; \text{for all pairwise disjoint $A_i \in A$}
$$

So, instead of a finite sum on the right, we now have a **series**; however, it is a series of non-negative numbers. Again, we call this $\sigma$-additive. 
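
To see such a series in action, here is a one-dimensional analogue of the picture above. It is a sketch under stated assumptions: the interval $[0, 1)$ is split into the disjoint pieces $A_i = [1 - 2^{-(i-1)}, 1 - 2^{-i})$, and the ordinary length $b - a$ of an interval plays the role of $\mu$. The helper names `piece`, `length`, and `partial_sum` are made up for this example.

```python
from fractions import Fraction

def piece(i):
    """Endpoints (left, right) of A_i = [1 - 2^-(i-1), 1 - 2^-i)."""
    return 1 - Fraction(1, 2 ** (i - 1)), 1 - Fraction(1, 2 ** i)

def length(interval):
    """Length of a half-open interval given by its endpoints."""
    left, right = interval
    return right - left

def partial_sum(n):
    """Sum of the lengths of the first n pieces."""
    return sum(length(piece(i)) for i in range(1, n + 1))

print(piece(1), piece(2))  # (Fraction(0, 1), Fraction(1, 2)) (Fraction(1, 2), Fraction(3, 4))
print(partial_sum(10))     # 1023/1024
```

Each piece has length $2^{-i}$, so the partial sums are $1 - 2^{-n}$ and the series converges to $1$, the length of $[0, 1)$.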

# Appendix 

### A.1 Open vs. Closed Sets
#### Open Sets
An open set is a set in which all of the points are interior points. Mathematically, $U \subseteq X$ is open if every point $u \in U$ is an interior point of $U$. Visually:

<img src="https://intuitive-ml-images.s3-us-west-1.amazonaws.com/mathematics/measure_theory/open_set.png">

The definition of an **interior point** is that there exists an $\epsilon > 0$ neighborhood around the little $u$ which is completely contained in the big set $U$. So:

$$\forall u \in U \;\;\; \exists \; \epsilon(u) > 0 \;\; \text{such that} \;\; B(u,\epsilon(u) ) \subseteq U $$

In English:

> For all $u$ in $U$ there exists an $\epsilon > 0$ such that the ball centered at $u$ of radius $\epsilon$ is completely contained in the set $U$. 
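
On the real line the ball $B(u, \epsilon)$ is just the interval $(u - \epsilon, u + \epsilon)$, and for an open interval $(a, b)$ a suitable $\epsilon$ can be written down explicitly. The sketch below assumes this one-dimensional setting; the function name `interior_radius` is hypothetical.

```python
def interior_radius(u, a, b):
    """Return an epsilon > 0 with (u - eps, u + eps) inside (a, b), or None if u is not in (a, b)."""
    if not (a < u < b):
        return None
    # The distance to the nearer endpoint works as epsilon.
    return min(u - a, b - u)

print(interior_radius(0.25, 0.0, 1.0))  # 0.25
print(interior_radius(0.0, 0.0, 1.0))   # None: the endpoint 0 is not in (0, 1)
```

The same endpoint $0$ is the reason the closed interval $[0, 1]$ is not open: every ball around $0$ contains negative numbers, so no $\epsilon$ works there.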

#### Closed Sets
The definition of a closed set is that it is the _complement_ of an open set. So, a set $C \subseteq X$ is a closed set _if_ its complement is open. Recall, its complement is:

$$C^c = X \setminus C$$

A closed set contains all of its **boundary** points. 

For more see [here](https://www.youtube.com/watch?v=PcnSgIoASSk). 


### A.2 Topological and Metric Spaces
#### Metric Space
A **Metric Space** is rather easy to define:

> **Metric Space**: A set together with a metric on the set.

What is a metric? A metric is a function that defines a concept of _distance_ between any two members of the set, which are usually called points. Mathematically, a metric space is an ordered pair $(M, d)$ where $M$ is a set and $d$ is a metric on $M$, i.e. a function:

$$d: M \times M \rightarrow \mathbb{R}$$

See more [here](https://en.wikipedia.org/wiki/Metric_space#Definition).
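
The notes above do not list the conditions a metric has to satisfy; the standard ones are non-negativity, $d(p, q) = 0$ exactly when $p = q$, symmetry, and the triangle inequality. As a sketch, the code below spot-checks them for the taxicab distance on a small grid of integer points. The choice of metric and of sample points is an assumption made for this example, and a finite check like this is an illustration, not a proof.

```python
from itertools import product

def d(p, q):
    """Taxicab distance between two points of the integer grid."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

# Sample points: the 5 x 5 grid with coordinates in {-2, ..., 2}.
points = list(product(range(-2, 3), repeat=2))

non_negative = all(d(p, q) >= 0 for p in points for q in points)
identity = all((d(p, q) == 0) == (p == q) for p in points for q in points)
symmetric = all(d(p, q) == d(q, p) for p in points for q in points)
triangle = all(
    d(p, r) <= d(p, q) + d(q, r)
    for p in points for q in points for r in points
)
print(non_negative, identity, symmetric, triangle)  # True True True True
```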

#### Topological Space
A **Topological Space** is defined as an ordered pair $(X, \tau)$, where $X$ is a set and $\tau$ is a collection of subsets of $X$ satisfying: 
* The empty set and $X$ are in $\tau$.
* The union of any collection of sets in $\tau$ is also in $\tau$.
* The intersection of any pair of sets in $\tau$ is also in $\tau$.

For more see [here](https://en.wikipedia.org/wiki/Topological_space#Definitions).
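
For a finite set the three conditions can be checked directly. The sketch below uses a hypothetical helper `is_topology`; because the collection is finite, closure under pairwise unions already gives closure under arbitrary unions.

```python
def is_topology(X, tau):
    """Check the three topology conditions for a finite collection tau on X."""
    X = frozenset(X)
    tau = {frozenset(u) for u in tau}
    if frozenset() not in tau or X not in tau:
        return False  # the empty set and X must be open
    # Pairwise unions and pairwise intersections must stay inside tau.
    return all(u | v in tau and u & v in tau for u in tau for v in tau)

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))  # True
print(is_topology(X, [set(), {1}, {2}, X]))     # False: {1} | {2} = {1, 2} is missing
```

Note the contrast with a $\sigma$-algebra: a topology is not required to be closed under complements, so $\{1\}$ can be open here without $\{2, 3\}$ being open.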