# 15 min Intro to Persistent Homology
### <a href="https://ben300694.github.io/" target="_blank">Benjamin Matthias Ruppik</a>, Max-Planck Institute for Mathematics, Bonn

## Filtrations of topological spaces and homology groups

Many topological spaces $X$ come with a natural filtration,
that is, an exhaustion by subsets $U_{i}$ such that
$U_{i} \subseteq U_{j}$ whenever $i < j$.

<img src="img/surface_morse_position.jpg" width=400 title="Surface of genus 2 in Morse position with 3 minima, 7 saddles, 2 maxima"/>
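As a minimal sketch of such a filtration, one can take the sublevel sets of a height function, as in the picture of the surface in Morse position. The point names, height values and thresholds below are made up for illustration, and `sublevel_set` is a hypothetical helper, not part of any library.

```python
# Hypothetical sample: a few points, each with a made-up "height" value.
heights = {"a": 0.0, "b": 0.5, "c": 1.0, "d": 1.0, "e": 2.5}


def sublevel_set(heights, t):
    """Return U_t = {p : height(p) <= t}."""
    return {p for p, h in heights.items() if h <= t}


# Increasing thresholds give an increasing family of subsets.
thresholds = [0.0, 1.0, 2.0, 3.0]
filtration = [sublevel_set(heights, t) for t in thresholds]

# Check the defining property: U_i is contained in U_j whenever i < j.
is_nested = all(
    filtration[i] <= filtration[j]
    for i in range(len(filtration))
    for j in range(i + 1, len(filtration))
)

for t, U in zip(thresholds, filtration):
    print(t, sorted(U))
print("nested:", is_nested)
```

The last subset contains every point, which is the "exhaustion" part of the definition.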

## Importing and preprocessing the data

Our data, a graph representing a social network, comes from the Network Data Repository: https://networkrepository.com/index.php



In [2]:
import pandas
import gudhi 

In [5]:
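# Symmetric 5x5 matrix of pairwise distances between 5 points (zero diagonal)
toy_matrix = [
    [0, 1, 1, 2, 3],
    [1, 0, 1, 1, 2],
    [1, 1, 0, 1, 2],
    [2, 1, 1, 0, 1],
    [3, 2, 2, 1, 0],
]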
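The notebook does not say where `toy_matrix` comes from. One way to read it, sketched here as an assumption, is as the shortest-path (hop count) distance matrix of a small graph. The edge list `edges` and the helper `shortest_path_matrix` below are hypothetical and only serve to illustrate how a graph can be turned into a distance matrix with breadth-first search.

```python
from collections import deque

# Hypothetical edge list (an assumption, not taken from the dataset).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
n = 5


def shortest_path_matrix(n, edges):
    """All-pairs hop distances of an unweighted, connected graph via BFS."""
    neighbours = {v: [] for v in range(n)}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    matrix = []
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Assumes the graph is connected, so every vertex was reached.
        matrix.append([dist[v] for v in range(n)])
    return matrix


print(shortest_path_matrix(n, edges))
```

For this particular edge list the result agrees with `toy_matrix`. For a real network, a graph library such as NetworkX would be the more practical choice.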

In [6]:
graph_distance_matrix = toy_matrix

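# Build the Vietoris-Rips complex from the distance matrix.
# max_edge_length = 5.0 exceeds the largest distance in toy_matrix (3), so no edge is cut off.
skeleton = gudhi.RipsComplex(
    distance_matrix=graph_distance_matrix,
    max_edge_length=5.0
)

# Expand the 1-skeleton to a simplex tree with simplices up to dimension 3
Rips_simplex_tree = skeleton.create_simplex_tree(max_dimension=3)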

In [7]:
# Compute persistence of the simplex tree
BarCodes_Rips = Rips_simplex_tree.persistence()

In [8]:
BarCodes_Rips

[(0, (0.0, inf)),
 (0, (0.0, 1.0)),
 (0, (0.0, 1.0)),
 (0, (0.0, 1.0)),
 (0, (0.0, 1.0))]
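The degree-0 bars in this output can be reproduced by hand, which may help to read them: every point is born at filtration value 0, and a bar dies when the edge that merges its connected component into another one appears. The following is a sketch of that idea with a union-find structure in plain Python. It is not how GUDHI computes persistence, and `h0_barcode` is a hypothetical helper.

```python
def h0_barcode(distance_matrix):
    """Degree-0 bars of the Rips filtration, in the format (0, (birth, death))."""
    n = len(distance_matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (distance_matrix[i][j], i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them dies at this edge length.
            parent[root_i] = root_j
            bars.append((0, (0.0, float(length))))

    # Components that are never merged away live forever.
    roots = {find(x) for x in range(n)}
    bars.extend((0, (0.0, float("inf"))) for _ in roots)
    return bars


toy_matrix = [[0, 1, 1, 2, 3], [1, 0, 1, 1, 2], [1, 1, 0, 1, 2],
              [2, 1, 1, 0, 1], [3, 2, 2, 1, 0]]
print(h0_barcode(toy_matrix))
```

Four components are merged at edge length 1 and a single component survives, matching the four bars `(0.0, 1.0)` and the one infinite bar above.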

## Applying persistent homology to the data

persist = to stick around for a long time

homology = cycles modulo boundaries

Persistent homology measures non-trivial cycles that can be detected across significant parts of the filtration.
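In symbols, for a filtration $U_{i}$ as in the first section, this can be written as follows. This is the standard formulation; the notation $Z_k$ (cycles), $B_k$ (boundaries) and $H_k^{i,j}$ is introduced here for illustration and is not used elsewhere in this notebook.

$$
H_k(U_i) = Z_k(U_i) / B_k(U_i),
\qquad
H_k^{i,j} = \operatorname{im}\bigl( H_k(U_i) \to H_k(U_j) \bigr)
\quad \text{for } i \le j .
$$

A class in $H_k(U_i)$ persists until $U_j$ if its image under the map induced by the inclusion $U_i \subseteq U_j$ is still non-zero.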

## Conclusion

### Properties of the data we could detect with persistent homology

 * qualitative data about large-scale features
 * **TODO**

### Drawbacks of using persistent homology
 * we have to deal with noise in the data: points close to the diagonal in the persistence diagram correspond to features that appear only for a short range of the filtration
 * **TODO**
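One common way to handle the noise mentioned above is to discard bars whose lifetime (death minus birth) is below a chosen threshold. The sketch below assumes the `(dimension, (birth, death))` format of the barcode shown earlier. The example bars and the threshold `0.5` are made up, and `lifetime` and `significant_bars` are hypothetical helpers.

```python
import math

# Hypothetical barcode in the format (dimension, (birth, death)).
barcode = [
    (1, (0.5, 3.0)),       # long-lived 1-cycle
    (1, (1.0, 1.1)),       # short-lived 1-cycle, close to the diagonal
    (0, (0.0, math.inf)),  # component that never dies
    (0, (0.0, 0.05)),      # short-lived component
]


def lifetime(bar):
    """Length of a bar: death minus birth (infinite for essential classes)."""
    _dimension, (birth, death) = bar
    return death - birth


def significant_bars(barcode, min_lifetime):
    """Keep only the bars that live at least min_lifetime."""
    return [bar for bar in barcode if lifetime(bar) >= min_lifetime]


print(significant_bars(barcode, min_lifetime=0.5))
```

The right threshold depends on the data and on the scale of the filtration, so the value used here should not be read as a recommendation.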

### References

 * Bot Detection on Social Networks Using Persistent Homology: https://www.semanticscholar.org/paper/Bot-Detection-on-Social-Networks-Using-Persistent-Nguyen-Aktas/e3944fac408415965b3d24d52d3ac7b7a0e9aa17
 * Persistent Homology of Collaboration Networks: https://www.hindawi.com/journals/mpe/2013/815035/
 * An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists: https://arxiv.org/abs/1710.04019
 
 
### Python libraries used

#### TDA tools

 * [GUDHI Python module](https://gudhi.inria.fr/)
 * [GUDHI TDA tutorial](https://github.com/GUDHI/TDA-tutorial)

#### Graphs

 * [NetworkX](https://networkx.org/)



### Datasets
 
 @inproceedings{nr,
      title = {The Network Data Repository with Interactive Graph Analytics and Visualization},
      author={Ryan A. Rossi and Nesreen K. Ahmed},
      booktitle = {AAAI},
      url={https://networkrepository.com},
      year={2015}
 }
