# Ripser C++ code Notebook Tutorial

This notebook is a survey tutorial for the C++ Ripser code by Ulrich Bauer. It is a work in progress, created by Alvaro Torras Casas (Cardiff University, 2022) for didactic and educational purposes. It is aimed at readers who are experts in neither `C++` nor TDA. If you are an expert, you might find this boring.

In this notebook, we consider the source code of the well-known C++ Ripser module, which computes persistent homology. The original code can be found in this repository:

[1] Ulrich Bauer, https://github.com/Ripser/ripser, 2015–2021 

The ideas behind this code are explained very well in the following article:

[2] Bauer, U. Ripser: efficient computation of Vietoris–Rips persistence barcodes. J Appl. and Comput. Topology 5, 391–423 (2021). https://doi.org/10.1007/s41468-021-00071-5

This tutorial makes no claim to originality; it exists only for didactic and educational purposes.
It walks through parts of the original code, breaking it down into small, easy-to-understand pieces. We also include references to [1] and [2] throughout the text.

## Notebook setup:

To run this notebook you need a `C++` kernel installed in JupyterLab. This can be done with `xeus-cling` (https://github.com/jupyter-xeus/xeus-cling), for which you will probably need to install `miniconda` first. Although the `xeus-cling` maintainers say that they do not support packages for the Windows platform, you can work around this by using Windows Subsystem for Linux. To check whether the `C++` kernel was installed successfully, type:

`jupyter kernelspec list`

If `xcpp11`, `xcpp14` and `xcpp17` do not appear in the list, you might have to go to the folder where the `xeus-cling` kernels were installed and run

`jupyter kernelspec install xcpp11 xcpp14 xcpp17`

It took me a while to get this working though. 


## Ripser Tutorial:

First of all, to avoid any licence problems, we copy the associated licence below. We omit some parts of the original code and may have modified it in places. The order in which we present the code also does not necessarily follow the original. However, the idea is that once one understands this tutorial, it should be easy to read the original code from [1].

In [1]:
/*
 Ripser: a lean C++ code for computation of Vietoris-Rips persistence barcodes
 MIT License
 Copyright (c) 2015–2021 Ulrich Bauer
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in all
 copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
 You are under no obligation whatsoever to provide any bug fixes, patches, or
 upgrades to the features, functionality or performance of the source code
 ("Enhancements") to anyone; however, if you choose to make your Enhancements
 available either publicly, or directly to the author of this software, without
 imposing a separate written license agreement for such Enhancements, then you
 hereby grant the following license: a non-exclusive, royalty-free perpetual
 license to install, use, modify, prepare derivative works, incorporate into
 other computer software, distribute, and sublicense such enhancements or
 derivative works thereof, in binary and source code form.
*/

Next come the includes of some standard `C++` headers.

In [2]:
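#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdint>   // int64_t, uint16_t, uint64_t (added for this notebook)
#include <fstream>
#include <iostream>
#include <numeric>
#include <queue>
#include <sstream>
#include <stdexcept> // std::overflow_error (added for this notebook)
#include <string>    // std::to_string (added for this notebook)
#include <unordered_map>
#include <vector>    // std::vector (added for this notebook)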

Then, types for values, indices and coefficients are defined.

In [31]:
typedef float value_t;
typedef int64_t index_t;
typedef uint16_t coefficient_t;

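Next, we define the number of bits reserved for coefficients and the maximum simplex index. With a 64-bit `index_t` and 8 coefficient bits, the maximum index is $2^{64-1-8} - 1 = 2^{55} - 1 = 36028797018963967$.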

In [29]:
static const size_t num_coefficient_bits = 8;
static const index_t max_simplex_index = (index_t(1) << (8 * sizeof(index_t) - 1 - num_coefficient_bits)) - 1;
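The arithmetic behind `max_simplex_index` can be checked with a small sketch. It assumes a 64-bit `index_t` and reads the `- 1` in the formula as accounting for the sign bit, which is our interpretation rather than something stated in the code. The namespace `index_sketch` and the helper `fits` are hypothetical names used only here, so that nothing clashes with the notebook's own definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace index_sketch { // hypothetical namespace, not part of Ripser

using index_t = std::int64_t;

// 8 bits are set aside for a coefficient and (by assumption) 1 bit for the sign,
// which leaves 64 - 1 - 8 = 55 bits for the simplex index itself.
constexpr std::size_t num_coefficient_bits = 8;
constexpr std::size_t index_bits = 8 * sizeof(index_t) - 1 - num_coefficient_bits;
constexpr index_t max_simplex_index = (index_t(1) << index_bits) - 1;

static_assert(index_bits == 55, "55 bits remain for the index");
static_assert(max_simplex_index == 36028797018963967LL, "2^55 - 1");

// Hypothetical helper: true when i is a valid index under this layout.
constexpr bool fits(index_t i) { return 0 <= i && i <= max_simplex_index; }

// Runtime version of the same checks, convenient to call from a notebook cell.
void demo() {
    assert(fits(0));
    assert(fits(max_simplex_index));
    assert(!fits(max_simplex_index + 1));
    assert(!fits(-1));
}

} // namespace index_sketch
```

Calling `index_sketch::demo()` in a cell should finish silently if the reasoning above is right.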

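To guard against overflow of `index_t` variables, an overflow check is also defined. If `USE_COEFFICIENTS` is defined, it rejects indices above `max_simplex_index`; otherwise it only rejects negative indices.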

In [30]:
void check_overflow(index_t i) {
    if
    #ifdef USE_COEFFICIENTS
        (i > max_simplex_index)
    #else
        (i < 0)
    #endif
        throw std::overflow_error(
            "simplex index " + std::to_string((uint64_t)i) +
            " in filtration is larger than maximum index " +
            std::to_string(max_simplex_index)
        );
}

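The class below precomputes a table of binomial coefficients using Pascal's rule $\binom{i}{j} = \binom{i-1}{j-1} + \binom{i-1}{j}$. The entry `B[k][n]` stores $\binom{n}{k}$, and the call operator `operator()(n, k)` returns it.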

In [7]:
class binomial_coeff_table {
    std::vector<std::vector<index_t>> B;
    
    public:
        binomial_coeff_table(index_t n, index_t k) : B(k + 1, std::vector<index_t>(n + 1, 0)) {
            for (index_t i = 0; i <= n; ++i) {
                B[0][i] = 1;
                for (index_t j = 1; j < std::min(i, k + 1); ++j)
                    B[j][i] = B[j - 1][i - 1] + B[j][i - 1];
                if (i <= k) B[i][i] = 1;
                check_overflow(B[std::min(i >> 1, k)][i]);
            }
        }

        index_t operator()(index_t n, index_t k) const {
            // B is indexed as B[k][n]: check k against the outer size and n against the inner one.
            assert(k < B.size() && n < B[k].size() && n >= k - 1);
            return B[k][n];
        }
};
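To see Pascal's rule at work outside the class, here is a minimal standalone sketch. It is not the Ripser implementation: the function `pascal_table` and the namespace `binomial_sketch` are hypothetical names, and the overflow check is left out for brevity. It only keeps the same layout, with the row index playing the role of $k$ and the column index the role of $n$.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace binomial_sketch { // hypothetical namespace, not part of Ripser

using index_t = std::int64_t;

// Returns a table T with T[k][n] = C(n, k) for 0 <= k <= max_k and 0 <= n <= max_n.
// Entries with k > n stay at 0.
std::vector<std::vector<index_t>> pascal_table(index_t max_n, index_t max_k) {
    std::vector<std::vector<index_t>> T(max_k + 1, std::vector<index_t>(max_n + 1, 0));
    for (index_t n = 0; n <= max_n; ++n) {
        T[0][n] = 1; // C(n, 0) = 1
        for (index_t k = 1; k <= max_k && k <= n; ++k)
            T[k][n] = T[k - 1][n - 1] + T[k][n - 1]; // Pascal's rule
    }
    return T;
}

void demo() {
    const auto T = pascal_table(5, 2);
    assert(T[2][5] == 10); // C(5, 2)
    assert(T[1][4] == 4);  // C(4, 1)
    assert(T[2][2] == 1);  // C(2, 2)
    assert(T[2][1] == 0);  // k > n
}

} // namespace binomial_sketch
```

The table is filled one column (one value of $n$) at a time, so every entry on the right-hand side of Pascal's rule is already available when it is needed.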

Then, aliases for the hash map and hash function types are defined (there is an option for Robin Hood hashing, which we omit here).

In [2]:
template <class Key, class T, class H, class E> using hash_map = std::unordered_map<Key, T, H, E>;
template <class Key> using hash = std::hash<Key>;
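The aliases can be used like any `std::unordered_map`. The sketch below repeats them inside a hypothetical namespace `hash_sketch` so that it is self-contained. The key and value types and the function `lookup_demo` are chosen purely for illustration and are not taken from Ripser.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>

namespace hash_sketch { // hypothetical namespace, not part of Ripser

using index_t = std::int64_t;

template <class Key, class T, class H, class E>
using hash_map = std::unordered_map<Key, T, H, E>;
template <class Key> using hash = std::hash<Key>;

// Illustrative use: map a few made-up simplex indices to positions in a list.
// Returns the position stored for `key`, or -1 if the key is absent.
index_t lookup_demo(index_t key) {
    hash_map<index_t, index_t, hash<index_t>, std::equal_to<index_t>> position;
    position.insert({42, 0});
    position.insert({7, 1});
    const auto it = position.find(key);
    return it == position.end() ? -1 : it->second;
}

void demo() {
    assert(lookup_demo(7) == 1);
    assert(lookup_demo(42) == 0);
    assert(lookup_demo(5) == -1);
}

} // namespace hash_sketch
```

Because all four template parameters are spelled out, swapping in a different hash table later only requires changing the alias, which is presumably why the original code introduces it.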

In [6]:
check_overflow(-1)

Standard Exception: simplex index 18446744073709551615 in filtration is larger than maximum index 36028797018963967
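The large number in the message above comes from the cast `(uint64_t)i`: the value $-1$ reinterpreted as an unsigned 64-bit integer is $2^{64} - 1 = 18446744073709551615$. The sketch below reproduces this with a simplified stand-in for `check_overflow`, covering only the branch used when `USE_COEFFICIENTS` is not defined, and shows how the exception could be caught. The names `check_nonnegative` and `message_for` are hypothetical, and the message text is shortened.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

namespace overflow_sketch { // hypothetical namespace, not part of Ripser

using index_t = std::int64_t;

// Simplified stand-in for check_overflow: only negative indices are rejected.
void check_nonnegative(index_t i) {
    if (i < 0)
        throw std::overflow_error("simplex index " +
                                  std::to_string(static_cast<std::uint64_t>(i)) +
                                  " is out of range");
}

// Returns the exception message for index i, or an empty string if nothing is thrown.
std::string message_for(index_t i) {
    try {
        check_nonnegative(i);
    } catch (const std::overflow_error& e) {
        return e.what();
    }
    return "";
}

void demo() {
    // -1 converted to uint64_t wraps around to 2^64 - 1.
    assert(static_cast<std::uint64_t>(index_t(-1)) == 18446744073709551615ULL);
    assert(message_for(-1) == "simplex index 18446744073709551615 is out of range");
    assert(message_for(3).empty());
}

} // namespace overflow_sketch
```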

The following function checks whether a coefficient is prime:
 - First, we check whether the number is even (using the bitwise operation `&`) or smaller than $2$. If so, it is prime only if it equals $2$.
 - Then, for every odd $p$ starting from $p=3$ and while $p^2 \leq n$ (written as `p <= n / p` in the code), we check whether $p$ divides $n$. If this happens for some $p$, we return `false`; otherwise we return `true`.

In [12]:
bool is_prime(const coefficient_t n) {
    if (!(n & 1) || n < 2) return n == 2;
    for (coefficient_t p = 3; p <= n / p; p += 2)
        if (!(n % p)) return false;
    return true;
}

In [19]:
std::cout << std::boolalpha; // boolalpha prints booleans as "true"/"false" instead of 1/0
std::cout << "5 is prime: " << is_prime(5) << std::endl;
// -1 is implicitly converted to the unsigned type coefficient_t before the test
std::cout << "-1 is prime: " << is_prime(-1) << std::endl;
std::cout << "2 is prime: " << is_prime(2) << std::endl;

5 is prime: true
-1 is prime: false
2 is prime: true
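The result for $-1$ deserves a remark. Since `coefficient_t` is `uint16_t`, the argument $-1$ wraps around to $65535 = 3 \cdot 5 \cdot 17 \cdot 257$, so the function reports `false` because $65535$ is composite, not because it detects a negative number. The sketch below repeats `is_prime` inside a hypothetical namespace `prime_sketch` and checks this, together with a few other values chosen by us.

```cpp
#include <cassert>
#include <cstdint>

namespace prime_sketch { // hypothetical namespace, not part of Ripser

using coefficient_t = std::uint16_t;

// Same logic as the is_prime cell above, repeated to keep the sketch self-contained.
bool is_prime(const coefficient_t n) {
    if (!(n & 1) || n < 2) return n == 2;
    for (coefficient_t p = 3; p <= n / p; p += 2)
        if (!(n % p)) return false;
    return true;
}

// What -1 becomes once it is converted to the unsigned 16-bit coefficient type.
constexpr coefficient_t wrapped = static_cast<coefficient_t>(-1);
static_assert(wrapped == 65535, "-1 wraps to 2^16 - 1");
static_assert(3 * 5 * 17 * 257 == 65535, "65535 is composite");

void demo() {
    assert(!is_prime(wrapped)); // composite, hence false
    assert(!is_prime(0));
    assert(!is_prime(1));
    assert(is_prime(2));
    assert(!is_prime(9));       // caught by p = 3, since 3 <= 9 / 3
    assert(is_prime(65521));    // largest prime below 2^16
}

} // namespace prime_sketch
```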
