<a href="https://colab.research.google.com/github/fbeilstein/topological_data_analysis/blob/master/lecture_8_persistence_homology_boundary_operator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

###Doing Algebra on Simplicial Complexes

Now that we know how to build simplicial complexes from data, we can study the topology of the corresponding polyhedra.
Ironically, to study the topology of simplicial complexes we have to move to algebra: the ultimate goal of this appendix, the definition of Betti numbers, is achieved in the realm of algebra.

**Definition 2.**
An **oriented simplex $\sigma^p$** is an abstract simplex (**Definition 1**) with a chosen orientation.
That is, the vertices of $\sigma^p$ are put in an arbitrary order, say $[v_0,\dots,v_p]$, and this ordering is assigned the sign $+$.
Any other ordering of the vertices has sign $+$ if it can be obtained from the chosen one by an even number of swaps of two vertices, and sign $-$ otherwise.
An **oriented abstract simplicial complex $\mathcal{K}$** is an abstract simplicial complex whose simplices are all oriented.
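The sign rule of **Definition 2** is the parity of a permutation. Below is a minimal sketch of it in Python; the function name `orientation_sign` and the representation of a simplex as a list of distinct vertex labels are assumptions made only for this illustration.

```python
def orientation_sign(reference, ordering):
    """Return +1 or -1: the sign of `ordering` relative to the chosen `reference` ordering.

    Both arguments list the same distinct vertices (an assumption of this sketch).
    """
    if len(reference) != len(ordering) or set(reference) != set(ordering):
        raise ValueError("both orderings must list the same vertices")
    position = {v: i for i, v in enumerate(reference)}
    perm = [position[v] for v in ordering]
    sign = 1
    # Every inversion can be undone by one swap of neighbours,
    # so the parity of the inversion count is the parity of the permutation.
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


print(orientation_sign(["a", "b", "c"], ["b", "a", "c"]))  # one swap: -1
print(orientation_sign(["a", "b", "c"], ["b", "c", "a"]))  # two swaps: 1
```
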


**Definition 3.**
The **group of $p$-chains** of $\mathcal{K}$, $C_p(\mathcal{K})$, is a finitely generated free abelian group (formally, a free $\mathbb{Z}$-module) generated by the oriented $p$-simplices of $\mathcal{K}$; its elements are called **$p$-chains**:
$$
C_p(\mathcal{K}) = \left\{
\left.
\sum_{i=1}^{l_p} f_i \sigma_i^p
\right|
\forall i: f_i \in \mathbb{Z},~ \sigma_i^p \in \mathcal{K},~\sigma_i^p+\underbrace{(-\sigma_i^p)}_{\scriptstyle\text{orientation}}=0,~ 0\,\sigma_i^p = 0
\right\},
$$
where $l_p$ is the number of $p$-simplices in $\mathcal{K}$ and the group operation "$+$" is defined as
$$
c = \sum_{i=1}^{l_p} f_i \sigma_i^p,\quad k = \sum_{i=1}^{l_p} g_i \sigma_i^p, \quad c + k = \sum_{i=1}^{l_p} (f_i + g_i) \sigma_i^p.
$$
For $p$ larger than the dimension of $\mathcal{K}$ we define $C_p(\mathcal{K})=\{0\}$.
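As an illustration of **Definition 3**, a $p$-chain can be stored as a dictionary from oriented simplices to integer coefficients. This is only a sketch: the dictionary representation, the tuple encoding of simplices, and the name `add_chains` are assumptions of the example.

```python
def add_chains(c, k):
    """Add two chains given as {simplex: integer coefficient} dictionaries."""
    result = dict(c)
    for simplex, coeff in k.items():
        total = result.get(simplex, 0) + coeff
        if total == 0:
            # 0 * sigma = 0: simplices with zero coefficient are dropped
            result.pop(simplex, None)
        else:
            result[simplex] = total
    return result


c = {(0, 1): 2, (1, 2): -1}   # 2*[v0,v1] - [v1,v2]
k = {(1, 2): 1, (0, 2): 3}    # [v1,v2] + 3*[v0,v2]
print(add_chains(c, k))       # {(0, 1): 2, (0, 2): 3}
```

The empty dictionary plays the role of the zero chain, and negating every coefficient gives the inverse element.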


**Definitions 2 and 3** effectively turn a simplicial complex into a family of abelian groups.
This alone turns out to be not enough to characterize the topology.
Betti numbers characterize "holes" in a space, so we need a tool that detects boundaries: the boundary operator.



**Boundary maps**

**Definition.**
The **boundary operator $\partial_p$** is the map $\partial_p: C_p(\mathcal{K}) \to C_{p-1}(\mathcal{K})$ such that
* basis elements, i.e. oriented simplices, are mapped as follows
$$
\partial_p \sigma^p = \partial_p \underbrace{[v_0,\dots,v_p]}_{\text{ordered vertices}} = \sum_{i=0}^p (-1)^i [v_0,\dots,\underbrace{v_{i-1},v_{i+1}}_{\text{no }v_i\text{!}},\dots,v_p];
$$
* $\partial_p$ is extended by linearity
$$
\partial_p: \sum_{n=1}^N c_n \sigma_n^p \mapsto \sum_{n=1}^N c_n (\partial_p \sigma_n^p);
$$
* the boundary of every $0$-chain is zero, i.e. $\partial_0 = 0$.

The boundary operators link the chain groups into a sequence:
$$
\require{AMScd}
\begin{CD}
\cdots @>\partial_{n+1}>> C_n @>\partial_{n}>> C_{n-1} @>\partial_{n-1}>>\cdots @>>> C_0 @>\partial_0>> 0\\
\end{CD}
$$
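The definition of $\partial_p$ translates almost literally into code. The sketch below assumes that an oriented simplex is a tuple of distinct vertices listed in one fixed global order (so that every face shows up as the same tuple) and that a chain is a dictionary from simplices to integer coefficients; the names `boundary_of_simplex` and `boundary` are hypothetical.

```python
def boundary_of_simplex(simplex):
    """Boundary of one oriented simplex: drop the i-th vertex with sign (-1)^i."""
    if len(simplex) == 1:
        return {}  # the boundary of a vertex is the zero chain
    return {simplex[:i] + simplex[i + 1:]: (-1) ** i for i in range(len(simplex))}


def boundary(chain):
    """Extend the boundary to chains {simplex: coefficient} by linearity."""
    result = {}
    for simplex, coeff in chain.items():
        for face, sign in boundary_of_simplex(simplex).items():
            total = result.get(face, 0) + coeff * sign
            if total == 0:
                result.pop(face, None)
            else:
                result[face] = total
    return result


triangle = {(0, 1, 2): 1}
print(boundary(triangle))  # {(1, 2): 1, (0, 2): -1, (0, 1): 1}
```

The output reads $\partial_2 [v_0,v_1,v_2] = [v_1,v_2] - [v_0,v_2] + [v_0,v_1]$, in agreement with the formula above.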


The operator $\partial$ connects chains of adjacent dimensions.
Now we need something to probe whether a simplex is "internal" to the polyhedron $|K|$ or faces "ambient space".
That will help us to "define holes" and the following substructures are exactly what is needed.



A very **crucial relation** (sorry, no time = no proof)
$$
\partial_{k-1} \circ \partial_{k} = 0
$$
![img](https://www.mathphysicsbook.com/wp-content/uploads/2013/01/26.chain-complex.png)

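A sequence of groups and homomorphisms in which the composition of any two consecutive maps is zero, as above, is called a **chain complex**. It is an **exact sequence** if, in addition, $\text{Im}(\partial_{k+1}) = \text{Ker}(\partial_k)$ at every position; homology, defined below, measures how far a chain complex is from being exact.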


Notice that if we have the following exact sequence:
$$
\require{AMScd}
\begin{CD}
0 @>>> A @>f>> B @>g>> C @>>> 0\\
\end{CD}
$$
Then:
* $f$ is injective
* $g$ is surjective

Such a sequence is called **a short exact sequence**.

**Definition 4.**
The group of **$p$-cycles $Z_p(\mathcal{K})$** is the set of $p$-chains $z_p$ with zero boundary: $Z_p(\mathcal{K}) = \{z_p \in C_p(\mathcal{K}) \mid \partial_p z_p = 0\}$, i.e. the **kernel of $\partial_p$**.



**Definition 5.**
The group of **$p$-boundaries $B_p(\mathcal{K})$** is the set of $p$-chains $b_p$ that are boundaries of some $(p+1)$-chain:
$$
B_p(\mathcal{K}) = \{b_p \in C_p(\mathcal{K}) | \exists c_{p+1} \in C_{p+1}(\mathcal{K}): \partial_{p+1} c_{p+1} = b_p\},
$$
i.e. the **image of $C_{p+1}(\mathcal{K})$ under $\partial_{p+1}$**.



**Theorem.**
$\partial_{p-1} \circ \partial_p = 0$.
Thus $B_p(\mathcal{K})\triangleleft Z_p(\mathcal{K})$.

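$\blacktriangleleft$ For the relation itself see [maunder, nash]. Applied in the form $\partial_p \circ \partial_{p+1} = 0$, it says that every boundary is a cycle, so $B_p(\mathcal{K}) \subseteq Z_p(\mathcal{K})$ (see **Definitions 4, 5**). $B_p(\mathcal{K})$ is a subgroup because it is the image of a homomorphism, and all subgroups of abelian groups are normal. $\blacksquare$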
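The relation $\partial_{p-1} \circ \partial_p = 0$ can at least be checked numerically on a small example. The sketch below builds the matrices of $\partial_1$ and $\partial_2$ for the surface of a tetrahedron and multiplies them; the helper names `boundary_matrix` and `matmul` and the choice of complex are assumptions of this illustration, and a check on one complex is of course not a proof.

```python
from itertools import combinations


def boundary_matrix(faces, simplices):
    """Matrix of the boundary map: rows are (p-1)-simplices, columns are p-simplices."""
    row = {face: i for i, face in enumerate(faces)}
    matrix = [[0] * len(simplices) for _ in faces]
    for j, simplex in enumerate(simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # drop the i-th vertex
            matrix[row[face]][j] = (-1) ** i
    return matrix


def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(r, col)) for col in zip(*b)] for r in a]


# Surface of a tetrahedron on vertices 0..3; tuples are sorted, which fixes orientations.
vertices = [(v,) for v in range(4)]
edges = list(combinations(range(4), 2))
triangles = list(combinations(range(4), 3))

d1 = boundary_matrix(vertices, edges)    # 4 x 6 matrix of the map C_1 -> C_0
d2 = boundary_matrix(edges, triangles)   # 6 x 4 matrix of the map C_2 -> C_1
product = matmul(d1, d2)
print(all(entry == 0 for r in product for entry in r))  # True
```
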


**Definition 6.**
The **$p$-dimensional homology group** of $\mathcal{K}$ is the quotient group $H_p(\mathcal{K}) = Z_p(\mathcal{K}) / B_p(\mathcal{K})$.



**Definition 4** rigorously defines cycles for us, while **Definition 5** tells us which of them are "filled in," i.e. contain no holes. The last **Definition 6** says "consider closed cycles but disregard anything that is filled in," i.e. we are only interested in "something with holes." Now the problem is that elements of $H_p(\mathcal{K})$ are not only those with one hole, but they are rather "generated by holes." So we need to "extract basis" somehow and the following **Theorems 2 and 3** come in handy.


**Theorem 2.**
The homology group $H_p(\mathcal{K})$ of a complex $\mathcal{K}$ is a finitely generated abelian group.

$\blacktriangleleft$ see [maunder, nash] $\blacksquare$


**Theorem 3.**
Let $A$ be a finitely generated (not necessarily free) abelian group. Then there exist a list of primes $p_1,\dots,p_m$ (not necessarily distinct) and positive integers $s_1,\dots,s_m$, unique up to the order of the list, such that
$$
A \cong G \oplus \underbrace{\mathbb{Z}_{p_1^{s_1}} \oplus \cdots \oplus \mathbb{Z}_{p_m^{s_m}}}_{T},
$$
where $T$ is called the **torsion subgroup**, $\mathbb{Z}_{{p_i^{s_i}}}$ are cyclic groups of order $p_i^{s_i}$, and $G$ is a free abelian group of finite rank.
The rank of $G$ is uniquely determined by $A$; if $A$ has $n$ generators, this rank is at most $n$.

$\blacktriangleleft$ see [hungerford], Theorem 2.6 $\blacksquare$


The procedure is somewhat similar to the decomposition of a number into prime factors.
In practice, it is performed by representing the operators $\partial_p$ as matrices and employing the Smith normal form [smith], but here we only outline the theoretical basis.


**Definition 7.**
The rank of $G$ from **Theorem 3** for $A = H_p(\mathcal{K})$ is called the **$p$-th Betti number $\beta_p$** of the simplicial complex $\mathcal{K}$.


Please note: although we have defined Betti numbers for an abstract simplicial complex $\mathcal{K}$, they are inherently connected to its geometric realization $K$.
Thus Betti numbers can be treated as topological characteristics of the polyhedron $|K|$.
Moreover, topology makes no distinction between homeomorphic spaces, so the same characteristics can be assigned to any space $\mathbb{X}$ that is homeomorphic to $|K|$.
This property is summarized by the following.


**Definition 8.**
A **triangulation** of a topological space $\mathbb{X}$ is a geometric simplicial complex $K$ together with a homeomorphism $f: |K|\to \mathbb{X}$.
If such a $K$ exists, the space $\mathbb{X}$ is called **triangulable**.
The homology groups of a triangulable space $\mathbb{X}$ are defined as $H_p(\mathbb{X}) = H_p(\mathcal{K})$, where $\mathcal{K}$ is the abstract simplicial complex underlying $K$.


**Theorem.**
Homology groups $H_p(\mathbb{X})$ and $H_p(\mathbb{Y})$ of homeomorphic topological spaces are isomorphic for each $p$.

$\blacktriangleleft$ see [vick], Theorem 1.7 $\blacksquare$


The latter means that the homology groups of a triangulable space (**Definition 8**) are well defined, i.e. independent of the chosen triangulation, and that the notion of Betti numbers extends to all topological spaces that are homeomorphic to some polyhedron $|K|$.

[For visuals please check](https://fbeilstein.github.io/topological_data_analysis/homology_explorer/homology_explorer.html).




###Betti Numbers

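Let $A$ be an abelian group with $n$ generators. Then $A \cong F / R$, where $F$ is a free abelian group of rank $n$ and $R \subset F$ is a subgroup (itself free, of some rank $r \le n$), and
$$
A \cong F / R \cong G \oplus \bigoplus_{i=1}^m \mathbb{Z}_{h_i},
$$
where $G$ is a free abelian group of rank $n-r$ and $\mathbb{Z}_{h_i}$ are finite cyclic groups of order $h_i$.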

In the context of a simplicial complex $S$ we can say
$$
H_k(S) = \text{Ker}(\partial_k) / \text{Im}(\partial_{k+1}) \cong G \oplus T.
$$
The portion of $H_k$ that is given by $T = \bigoplus_{i=1}^m \mathbb{Z}_{h_i}$ is called the torsion part of $H_k(S)$. The rank of $H_k$ is defined to be the rank of $G$; it is denoted $\beta_k$ and called the $k$-th Betti number of the simplicial complex.
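If only the Betti numbers are needed and torsion can be ignored, it is enough to count ranks: $\beta_k = \dim C_k - \text{rank}(\partial_k) - \text{rank}(\partial_{k+1})$. The following sketch does this for the surface of a tetrahedron (a triangulated sphere) with exact rational Gaussian elimination; the helper names and the choice of complex are assumptions of the example, not part of the text above.

```python
from fractions import Fraction
from itertools import combinations


def boundary_matrix(faces, simplices):
    """Matrix of the boundary map: rows are (p-1)-simplices, columns are p-simplices."""
    row = {face: i for i, face in enumerate(faces)}
    matrix = [[0] * len(simplices) for _ in faces]
    for j, simplex in enumerate(simplices):
        for i in range(len(simplex)):
            matrix[row[simplex[:i] + simplex[i + 1:]]][j] = (-1) ** i
    return matrix


def rank(matrix):
    """Rank over the rationals, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    rk = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# Surface of a tetrahedron: 4 vertices, 6 edges, 4 triangles, no 3-simplex.
simplices = {
    0: [(v,) for v in range(4)],
    1: list(combinations(range(4), 2)),
    2: list(combinations(range(4), 3)),
}
ranks = {0: 0, 3: 0}  # the maps d_0 and d_3 are zero for this complex
for k in (1, 2):
    ranks[k] = rank(boundary_matrix(simplices[k - 1], simplices[k]))

betti = [len(simplices[k]) - ranks[k] - ranks[k + 1] for k in range(3)]
print(betti)  # [1, 0, 1]
```

The result says: one connected component, no one-dimensional holes, one enclosed cavity. Working over the rationals loses the torsion part $T$; recovering it requires integer methods such as the Smith normal form.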


**How it's calculated (Smith normal form)**

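In practice we represent the operators $\partial$ as integer matrices. For any $m \times n$ integer matrix $A$ there exist integer matrices $U$ of size $m \times m$ and $V$ of size $n \times n$, both invertible over $\mathbb{Z}$ (i.e. with determinant $\pm 1$), such that $UAV$ is equal to a diagonal matrix $D$ whose diagonal entries satisfy $d_i \mid d_{i+1}$. This $D$ is the Smith normal form of $A$.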

Let $A$ be an $m \times n$ integer matrix, and $B$ be an $l \times m$ integer matrix such that $BA=0$. Then
$$
\text{Ker}(B)/\text{Im}(A) \cong \bigoplus_{i=1}^r \mathbb{Z}/\alpha_i \oplus \mathbb{Z}^{m-r-s},
$$
where $r=\text{rank}(A)$, $s=\text{rank}(B)$, and $\alpha_1,\dots,\alpha_r$ are the non-zero elements on the diagonal of $D$ (Smith normal form of $A$). Summands with $\alpha_i = \pm 1$ are trivial, so only entries with $|\alpha_i| > 1$ contribute torsion, and the free part has rank $m-r-s$.

The **main result** about the Smith normal form, of course, is that every integer matrix has one. It is unique up to signs. There is an algorithm to compute it that is a cross between row reduction and the Euclidean algorithm for computing greatest common divisors.
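To make the "row reduction plus Euclidean algorithm" description concrete, here is a small pure-Python sketch of one possible way to compute the Smith normal form, followed by the formula for $\text{Ker}(B)/\text{Im}(A)$. It is a teaching sketch under stated assumptions, not an optimized or canonical implementation: the function names are hypothetical, matrices are lists of rows, and the toy matrices at the end are invented inputs rather than boundary maps of a specific complex.

```python
def smith_normal_form(matrix):
    """Diagonalize an integer matrix with unimodular row and column operations."""
    a = [list(row) for row in matrix]
    m, n = len(a), (len(a[0]) if a else 0)
    t = 0
    while t < min(m, n):
        # Choose the non-zero entry of smallest absolute value as the pivot.
        entries = [(abs(a[i][j]), i, j)
                   for i in range(t, m) for j in range(t, n) if a[i][j] != 0]
        if not entries:
            break
        _, pi, pj = min(entries)
        a[t], a[pi] = a[pi], a[t]
        for row in a:
            row[t], row[pj] = row[pj], row[t]
        pivot = a[t][t]
        # Euclidean step: replace entries in the pivot column and row by remainders.
        for i in range(t + 1, m):
            q = a[i][t] // pivot
            a[i] = [x - q * y for x, y in zip(a[i], a[t])]
        for j in range(t + 1, n):
            q = a[t][j] // pivot
            for row in a:
                row[j] -= q * row[t]
        if any(a[i][t] for i in range(t + 1, m)) or any(a[t][j] for j in range(t + 1, n)):
            continue  # a smaller non-zero remainder exists, repeat with it as pivot
        # Enforce divisibility: the pivot must divide every remaining entry.
        bad = next((i for i in range(t + 1, m)
                    for j in range(t + 1, n) if a[i][j] % pivot), None)
        if bad is not None:
            a[t] = [x + y for x, y in zip(a[t], a[bad])]
            continue
        if pivot < 0:
            a[t] = [-x for x in a[t]]
        t += 1
    return a


def diagonal(matrix):
    return [matrix[i][i] for i in range(min(len(matrix), len(matrix[0])))]


def homology(A, B):
    """Torsion coefficients and free rank of Ker(B)/Im(A), assuming BA = 0."""
    m = len(A)  # number of rows of A, equal to the number of columns of B
    alphas = [abs(d) for d in diagonal(smith_normal_form(A)) if d != 0]
    s = sum(1 for d in diagonal(smith_normal_form(B)) if d != 0)
    return [x for x in alphas if x > 1], m - len(alphas) - s


print(smith_normal_form([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))
# [[2, 0, 0], [0, 6, 0], [0, 0, 12]]
print(homology([[2]], [[0]]))  # ([2], 0), i.e. Ker(B)/Im(A) is Z/2
```

Each pass either shrinks the smallest non-zero entry, exactly as in the Euclidean algorithm, or finishes one diagonal position, which is why the loop terminates.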