# TD 4 - Topological Persistence


*by Joseph DE ROFFIGNAC and Ten NGUYEN HANAOKA* 

The purpose of this notebook is to address all the exercises from Lab Session 4 (INF556 – TD4), which focuses on implementing an algorithm to compute persistent homology with coefficients in the field ℤ/2ℤ (also denoted ℤ₂), and on testing it across various filtrations.

### Let's start with some imports!

In [23]:
%pip install tqdm

Note: you may need to restart the kernel to use updated packages.


In [24]:
import sys
import time
from tqdm import tqdm
from utils import read_filtration

We are provided with a Simplex class (see simplex.py for more details) that contains three attributes:
* val (float): the time of appearance in the filtration,
* dim (int): the dimension,
* vert (list[int]): the list of vertex IDs (integers).

In addition, a read_filtration function in utils.py is available, which takes a filename (str) as input and returns a filtration represented as a list of simplices.
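Since simplex.py is not reproduced in this notebook, the block below is only a minimal stand-in, assuming a plain dataclass with the three attributes listed above. The real class may differ; in particular, the `__repr__` here merely imitates the `{val=...; dim=...; [...]}` display format and is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Simplex:
    """Hypothetical stand-in for the Simplex class of simplex.py."""
    val: float                                      # time of appearance in the filtration
    dim: int                                        # dimension of the simplex
    vert: list[int] = field(default_factory=list)   # vertex IDs

    def __repr__(self) -> str:
        # Imitates the printed format; the actual implementation is not shown in this notebook
        return f"{{val={self.val}; dim={self.dim}; {self.vert}}}"


edge = Simplex(val=2.0, dim=1, vert=[2, 4])
print(edge)
```

A simplex of dimension `dim` always has `dim + 1` vertices, which gives a cheap sanity check when loading a filtration.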

An example of how to use read_filtration is given below:

In [28]:
filtration = read_filtration("filtration.txt")
for simplex in filtration:
    print(simplex)
type(filtration[0])

{val=1.0; dim=0; [2]}
{val=1.0; dim=0; [4]}
{val=1.0; dim=0; [1]}
{val=2.0; dim=1; [2, 4]}
{val=2.0; dim=1; [1, 2]}
{val=3.0; dim=0; [7]}
{val=4.0; dim=1; [4, 7]}
{val=4.0; dim=1; [1, 7]}
{val=5.0; dim=1; [1, 4]}
{val=6.0; dim=2; [1, 4, 7]}


simplex.Simplex

To simplify processing, we added a line to read_filtration so that it returns the filtration sorted by time of appearance.
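The added line itself is not shown here. One possible way to obtain a valid ordering, sketched under the assumption that simplices expose `val` and `dim`, is to sort by the pair `(val, dim)`: simplices that appear at the same time are then ordered by dimension, so every face precedes its cofaces. The `SimpleNamespace` objects below are stand-ins for `Simplex` instances.

```python
from types import SimpleNamespace

# Stand-ins for Simplex objects (only the attributes val, dim, vert are assumed)
unsorted_filtration = [
    SimpleNamespace(val=2.0, dim=1, vert=[2, 4]),
    SimpleNamespace(val=1.0, dim=0, vert=[4]),
    SimpleNamespace(val=2.0, dim=0, vert=[9]),
    SimpleNamespace(val=1.0, dim=0, vert=[2]),
]

# Python's sort is stable: ties on (val, dim) keep their original relative order
sorted_filtration = sorted(unsorted_filtration, key=lambda s: (s.val, s.dim))
order = [(s.val, s.dim, s.vert) for s in sorted_filtration]
print(order)
```

Sorting on `val` alone would rely on the input file already listing faces before cofaces among simultaneous simplices; adding `dim` to the key removes that dependency.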

## Question 1 - Boundary matrix

**Question 1**: Compute the boundary matrix B of the filtration from the vector of simplices F.

In [26]:
def boundary_matrix(filtration: list["Simplex"]) -> list[set[int]]:
    """Return the boundary matrix as sparse columns: column j is the set of row indices of its nonzero entries."""

    # Dictionary: key = frozenset of the vertices, value = index in the filtration
    index_map = {frozenset(s.vert): i for i, s in enumerate(filtration)}

    n = len(filtration)
    boundary = [set() for _ in range(n)]

    for j, simplex in tqdm(enumerate(filtration), desc="Computing boundary matrix", total=n):
        # Simplex objects expose attributes (val, dim, vert); they are not subscriptable
        verts = frozenset(simplex.vert)

        # Generate the codimension-1 faces by removing one vertex at a time
        if simplex.dim > 0:
            for v in verts:
                face = verts - {v}
                i = index_map.get(face)
                if i is not None:
                    boundary[j].add(i)

    return boundary

print(boundary_matrix(filtration))
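A quick way to gain confidence in a boundary matrix is to check that the boundary of a boundary vanishes modulo 2. The sketch below rebuilds the sparse columns of the example filtration directly from its vertex lists (no `Simplex` class or `tqdm` needed) and verifies that property. The helper name `boundary_of_boundary` is introduced here for illustration only.

```python
# Vertex lists of the example filtration, in filtration order
simplices = [[2], [4], [1], [2, 4], [1, 2], [7], [4, 7], [1, 7], [1, 4], [1, 4, 7]]
index_map = {frozenset(v): i for i, v in enumerate(simplices)}

# Sparse columns: column j holds the indices of the codimension-1 faces of simplex j
columns = []
for verts in simplices:
    faces = set()
    if len(verts) > 1:
        for v in verts:
            faces.add(index_map[frozenset(verts) - {v}])
    columns.append(faces)


def boundary_of_boundary(j: int) -> set[int]:
    """Sum mod 2 (symmetric difference) of the columns of the faces of simplex j."""
    total = set()
    for i in columns[j]:
        total ^= columns[i]
    return total


print(columns)
print(all(not boundary_of_boundary(j) for j in range(len(columns))))
```

If any column failed this check, either a face would be missing from the filtration or the index map would be inconsistent with the filtration order.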

## Questions 2 & 3 - Reduction algorithm

**Question 2**: Implement the reduction algorithm for your representation of the boundary matrix. Evaluate its complexity.

In [None]:
def reduce_boundary_matrix(boundary: list[set[int]]) -> list[set[int]]:

    # A shallow copy suffices: columns are replaced by new sets, never mutated in place
    reduced_boundary = boundary.copy()
    m = len(reduced_boundary)

    # pivots[i] = index of the reduced column whose lowest nonzero row is i
    pivots = {}

    for j in tqdm(range(m), desc="Reducing boundary matrix"):

        low_j = max(reduced_boundary[j]) if reduced_boundary[j] else -1
        while low_j != -1 and low_j in pivots:

            i = pivots[low_j]

            # Column addition mod 2 is the symmetric difference (XOR) of the sparse columns
            reduced_boundary[j] = set(reduced_boundary[j]) ^ set(reduced_boundary[i])
            low_j = max(reduced_boundary[j]) if reduced_boundary[j] else -1

            # For the proof, we will need to show that low_j is strictly decreasing

        if low_j != -1:
            pivots[low_j] = j

    return reduced_boundary  # to be optimized

print(reduce_boundary_matrix(boundary_matrix(filtration)))

Computing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 118483.16it/s]
Reducing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 121927.44it/s]

[set(), set(), set(), {0, 1}, {0, 2}, set(), {1, 5}, set(), set(), {8, 6, 7}]





**Question 3**: Reduce the complexity of the reduction to O(m^3) in the worst case, and to O(m) in cases where the matrix remains sparse throughout, where m is the number of simplices in the filtration. Argue that your code does have the desired worst-case and best-case complexities.

#TODO: write the proof of our complexity bound here (in particular, justify that low_j is a monovariant of our while loop)
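The invariant behind the worst-case bound can be observed empirically. When a column with the same lowest index is added to column j, the shared lowest entry cancels and every remaining entry is smaller, so low_j strictly decreases. Each column therefore undergoes at most m additions, and each addition costs O(m) on sets of at most m entries, which gives O(m^3) overall. The sketch below is not the notebook's function but an instrumented variant (the name `reduce_with_trace` is hypothetical) that records the successive values of low_j for each column.

```python
def reduce_with_trace(columns: list[set[int]]) -> tuple[list[set[int]], list[list[int]]]:
    """Standard column reduction over Z/2Z, recording the successive lows of each column."""
    reduced = [set(c) for c in columns]   # copy so the input is left untouched
    pivots = {}   # lowest row index -> index of the reduced column owning it
    traces = []   # traces[j] = successive values of low_j while column j is reduced

    for j in range(len(reduced)):
        trace = []
        while reduced[j]:
            low_j = max(reduced[j])
            trace.append(low_j)
            if low_j not in pivots:
                pivots[low_j] = j
                break
            reduced[j] ^= reduced[pivots[low_j]]   # column addition mod 2
        traces.append(trace)
    return reduced, traces


# Boundary columns of the example filtration used in this notebook
example = [set(), set(), set(), {0, 1}, {0, 2}, set(), {1, 5}, {2, 5}, {1, 2}, {6, 7, 8}]
reduced, traces = reduce_with_trace(example)
print(traces)
```

On this example every trace is strictly decreasing, and the length of a trace bounds the number of column additions performed on that column.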

## Question 4 - Barcode extraction

In [None]:
def extract_barcodes(reduced_boundary: list[set[int]], filtration: list["Simplex"]) -> list[tuple[int, int, int]]:
    """Return (dimension, birth index, death index) triples; a death index of -1 means infinity."""

    seen_indexes = set()
    barcodes = []

    # Each nonzero reduced column j pairs simplex low(j) (birth) with simplex j (death)
    for j in range(len(reduced_boundary)):
        if reduced_boundary[j]:
            seen_indexes.add(j)
            low_j = max(reduced_boundary[j])
            seen_indexes.add(low_j)
            barcode = (filtration[low_j].dim, low_j, j)  # (dimension, birth index, death index)
            barcodes.append(barcode)

    print("Seen indexes:", seen_indexes)
    # Unpaired simplices give birth to classes that never die
    unseen_indexes = set(range(len(filtration))) - seen_indexes
    for i in unseen_indexes:
        barcode = (filtration[i].dim, i, -1)  # (dimension, birth index, death index = ∞)
        barcodes.append(barcode)

    # Sort by (dimension, birth index, death index), treating -1 as infinity
    barcodes.sort(key=lambda x: (x[0], x[1], x[2] if x[2] != -1 else float('inf')))
    return barcodes


def print_barcodes(barcodes: list[tuple[int, int, int]], filtration: list["Simplex"]) -> None:
    for (dim, birth_idx, death_idx) in barcodes:
        # Simplex stores its time of appearance in the attribute val
        birth_time = filtration[birth_idx].val
        death_time = filtration[death_idx].val if death_idx != -1 else float('inf')
        print(f"{dim} {birth_time} {death_time}")

time_start = time.time()
print_barcodes(extract_barcodes(reduce_boundary_matrix(boundary_matrix(filtration)), filtration), filtration)
print(extract_barcodes(reduce_boundary_matrix(boundary_matrix(filtration)), filtration))
time_end = time.time()
print("Execution time:", time_end - time_start)

Computing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 102801.57it/s]
Reducing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 118149.41it/s]


Seen indexes: {1, 2, 3, 4, 5, 6, 8, 9}
0 1.0 inf
0 1.0 2.0
0 1.0 2.0
0 3.0 4.0
1 4.0 inf
1 5.0 6.0


Computing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 145635.56it/s]
Reducing boundary matrix: 100%|██████████| 10/10 [00:00<00:00, 26214.40it/s]

Seen indexes: {1, 2, 3, 4, 5, 6, 8, 9}
[(0, 0, -1), (0, 1, 3), (0, 2, 4), (0, 5, 6), (1, 7, -1), (1, 8, 9)]
Execution time: 0.012145757675170898
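One way to sanity-check a barcode is to read Betti numbers off it: the Betti number in dimension d at time t is the number of d-dimensional intervals [birth, death) that contain t. The sketch below applies this to the intervals printed above for the example filtration. The function name `betti_numbers` is introduced here for illustration.

```python
from collections import Counter
from math import inf

# (dimension, birth time, death time) intervals printed above for the example filtration
intervals = [(0, 1.0, inf), (0, 1.0, 2.0), (0, 1.0, 2.0), (0, 3.0, 4.0), (1, 4.0, inf), (1, 5.0, 6.0)]


def betti_numbers(intervals, t: float) -> dict[int, int]:
    """Count, per dimension, the half-open intervals [birth, death) that contain t."""
    counts = Counter(dim for dim, birth, death in intervals if birth <= t < death)
    return dict(counts)


print(betti_numbers(intervals, 1.5))
print(betti_numbers(intervals, 5.5))
```

At t = 1.5 only the three isolated vertices exist, hence three connected components. At t = 5.5 all five edges are present but the triangle is not yet filled, hence one component and two independent cycles.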





## Question 5 - Complexity analysis

In [None]:
filtration_a = read_filtration("filtrations/filtration_A.txt")
print("Filtration initialization")
B = boundary_matrix(filtration_a)
print("Boundary matrix computed.")
print_barcodes(extract_barcodes(reduce_boundary_matrix(B), filtration_a), filtration_a)

KeyboardInterrupt: 
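Because the run on filtration A was interrupted, it may help to measure how the reduction scales on small synthetic inputs before returning to the large files. The sketch below is an assumption-laden harness, not part of the lab material: it builds the boundary columns of all faces up to dimension `max_dim` on `n_vertices` vertices (ordered by dimension, which is a valid filtration order), then counts column additions and wall-clock time. Both helper names are hypothetical.

```python
import time
from itertools import combinations


def full_simplex_columns(n_vertices: int, max_dim: int) -> list[set[int]]:
    """Boundary columns of all faces of dimension <= max_dim, ordered by dimension."""
    simplices = [frozenset(c)
                 for k in range(1, max_dim + 2)
                 for c in combinations(range(n_vertices), k)]
    index_map = {s: i for i, s in enumerate(simplices)}
    return [{index_map[s - {v}] for v in s} if len(s) > 1 else set() for s in simplices]


def reduce_columns(columns: list[set[int]]) -> tuple[list[set[int]], int]:
    """Column reduction over Z/2Z that also counts the column additions performed."""
    reduced = [set(c) for c in columns]
    pivots = {}
    additions = 0
    for j in range(len(reduced)):
        while reduced[j]:
            low = max(reduced[j])
            if low not in pivots:
                pivots[low] = j
                break
            reduced[j] ^= reduced[pivots[low]]   # column addition mod 2
            additions += 1
    return reduced, additions


for n in (6, 8, 10):
    cols = full_simplex_columns(n, max_dim=2)
    start = time.perf_counter()
    _, additions = reduce_columns(cols)
    elapsed = time.perf_counter() - start
    print(f"m={len(cols)} additions={additions} time={elapsed:.4f}s")
```

Plotting the number of additions against m on a log-log scale gives an empirical exponent that can be compared with the theoretical bounds, independently of progress-bar or I/O overhead.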

## TODO list


In [None]:
# TODO: report (2 to 3 pages) with answers to the questions, complexity analysis, plots, and analysis of the graphs.
# >>> as a Jupyter notebook