# Compactification of spaces

#### Author: Matteo Caorsi

Analysing decision boundaries is not an easy task, especially because the feature space $\mathbb R^n$ is not compact.

Compact spaces are easier to work with: in $\mathbb R^n$, the compact subsets are exactly the closed and bounded ones (Heine-Borel).


## Scope

We propose here a method to compactify the feature space $\mathbb R^n$ to the projective space $\mathbb RP^n$.
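
As an illustration of what such a compactification looks like in coordinates, here is a minimal sketch using the standard embedding $x \mapsto [x : 1]$ of $\mathbb R^n$ into $\mathbb RP^n$. The helper names `to_projective` and `from_chart` are hypothetical and are not part of `gdeep`; the library may proceed differently.

```python
import numpy as np


def to_projective(x):
    """Send points of R^n to unit-norm representatives of [x : 1] in RP^n."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    # append the homogeneous coordinate 1 to every point
    homog = np.hstack([x, np.ones((x.shape[0], 1))])
    # a projective point is a line through the origin: pick a unit vector on it
    return homog / np.linalg.norm(homog, axis=1, keepdims=True)


def from_chart(p, i):
    """Affine coordinates of projective points in the chart U_i = {p_i != 0}."""
    p = np.atleast_2d(np.asarray(p, dtype=float))
    # divide by the i-th coordinate and drop it
    return np.delete(p, i, axis=1) / p[:, [i]]


points = np.array([[1.0, 2.0, 3.0], [1e6, 0.0, 0.0]])
proj = to_projective(points)       # shape (2, 4), rows of norm 1
recovered = from_chart(proj, 3)    # the last chart recovers the original points
```

The last chart is the original copy of $\mathbb R^n$; the other charts cover the points "at infinity" that are added by the compactification.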

The decision boundary is therefore sampled uniformly in each chart of $\mathbb RP^n$. When the charts are put together, the resulting point cloud (defined abstractly via a dissimilarity matrix `d_final`) can be used to compute the topology of the *compactified* decision boundary.
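
To make the idea of a dissimilarity matrix on $\mathbb RP^n$ concrete, the sketch below uses a standard metric on projective space, the angle between lines, $d([u],[v]) = \arccos |\langle u, v\rangle|$ for unit vectors $u, v$. This is an assumption made for illustration: it is not necessarily how `d_final` is computed by `gdeep`, and the function name `projective_distance_matrix` is hypothetical.

```python
import numpy as np


def projective_distance_matrix(x):
    """Pairwise angles between the lines [x : 1] in RP^n, for points x in R^n."""
    x = np.asarray(x, dtype=float)
    homog = np.hstack([x, np.ones((x.shape[0], 1))])
    unit = homog / np.linalg.norm(homog, axis=1, keepdims=True)
    # |<u, v>| does not depend on the sign of the representatives u and v
    cos = np.clip(np.abs(unit @ unit.T), 0.0, 1.0)
    d = np.arccos(cos)
    np.fill_diagonal(d, 0.0)  # remove round-off on the diagonal
    return d


# two points far away in opposite directions, plus the origin
far_apart = np.array([[1e6, 0.0, 0.0], [-1e6, 0.0, 0.0], [0.0, 0.0, 0.0]])
d_sketch = projective_distance_matrix(far_apart)
```

In this metric, the two points that are far apart in opposite directions of $\mathbb R^3$ end up very close to each other, since they approach the same point at infinity. A matrix of this kind can be passed to a persistent homology computation with a precomputed metric.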

We believe that the topology obtained in this way can be further exploited for regularisation purposes.

In [None]:
%reload_ext autoreload
%autoreload 2
# deep learning
import torch
from torch.optim import Adam, SGD
import numpy as np
from torch import nn
from gdeep.models import FFNet
from gdeep.data.datasets import DatasetBuilder, DataLoaderBuilder
from gdeep.trainer import Trainer
from torch import autograd

# plot
import plotly.express as px
import pandas as pd
from gdeep.search import GiottoSummaryWriter

# ML
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

# TDA
from gtda.homology import VietorisRipsPersistence
from gtda.plotting import plot_diagram


## Initialize the tensorboard writer

To analyse the results of your models, you need to start TensorBoard.
In a terminal, move into the `/examples` folder and run the following command:

```
tensorboard --logdir=runs
```

Then, after training your model, go [here](http://localhost:6006/) to see all the visualization results.

In [None]:
writer = GiottoSummaryWriter()

## Build dataset

We want to test our method on a 3D dataset made of 3 separate blobs. We expect the neural network decision boundary to look like a hyperplane in $\mathbb R^3$.
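
For readers who want to see what such a dataset looks like, here is a minimal sketch that generates three Gaussian blobs in $\mathbb R^3$ with scikit-learn's `make_blobs`. The sample size and the random seed are arbitrary choices for illustration, and the dataset produced by `DatasetBuilder(name="Blobs")` may be generated with different parameters.

```python
from sklearn.datasets import make_blobs
import numpy as np

# 300 points in R^3, split into 3 clusters (labels 0, 1, 2)
X, y = make_blobs(n_samples=300, n_features=3, centers=3, random_state=0)

print(X.shape, np.unique(y))
```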

Hence, after compactification, we would expect to find $\mathbb RP^2$ as the final result.
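
This expectation can be stated in terms of homology. The closure of an affine plane of $\mathbb R^3$ inside $\mathbb RP^3$ is a projective plane, and the homology of $\mathbb RP^2$ is standard:

$$
H_k(\mathbb RP^2; \mathbb Z_2) \cong \mathbb Z_2 \quad \text{for } k = 0, 1, 2,
\qquad
H_k(\mathbb RP^2; \mathbb Q) \cong
\begin{cases}
\mathbb Q & k = 0, \\
0 & k = 1, 2.
\end{cases}
$$

The coefficient field therefore matters. Assuming persistent homology is computed with $\mathbb Z_2$ coefficients, one would look for a single persistent feature in each of the dimensions 0, 1 and 2; with rational coefficients, the features in dimensions 1 and 2 would not be visible.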

In [None]:
# build the dataset
bd = DatasetBuilder(name="Blobs")
ds_tr, ds_val, _ = bd.build()

# build the dataloaders
dl = DataLoaderBuilder((ds_tr, ds_val))
dl_tr, dl_val, dl_ts = dl.build()


In [None]:
print("One batch from the dataloader:", next(iter(dl_tr)))


## Train the model

We propose here to train a simple feed-forward neural network on the 3D tabular dataset.

In [None]:
# train NN
model = FFNet(arch=[3, 3])

pipe = Trainer(model, (dl_tr, dl_ts), nn.CrossEntropyLoss(), writer)

pipe.train(SGD, 5, False, {"lr": 0.01}, {"batch_size": 1})


## Plot the decision boundary

We make a 3D interactive plot of the decision boundary on TensorBoard: after running the following cells, you can check it out in the projector section.

In [None]:
from gdeep.visualization import Visualiser

vs = Visualiser(pipe)
vs.plot_interactive_model()  # send to tensorboard the interactive model of FFNet
db, d_final, _ = vs.plot_decision_boundary(True)


## Topology of the compactified decision boundary

We check with giotto-tda that the topology of the decision boundary is indeed that of $\mathbb RP^2$, as expected.

In [None]:
# check topology from d_final

vr = VietorisRipsPersistence(
    collapse_edges=True,
    max_edge_length=1,
    metric="precomputed",
    n_jobs=-1,
    homology_dimensions=(0, 1, 2),
)
diag = vr.fit_transform([d_final])

plot_diagram(diag[0])
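
Besides looking at the plot, one can count the long-lived features per homology dimension. The sketch below assumes the giotto-tda diagram layout, in which each row of `diag[0]` is a triple (birth, death, homology dimension). The helper `count_persistent_features`, the lifetime threshold and the toy diagram are hypothetical and only serve as an illustration; they are not results of this notebook.

```python
import numpy as np


def count_persistent_features(diagram, min_lifetime):
    """Count, per homology dimension, the features living at least min_lifetime."""
    diagram = np.asarray(diagram, dtype=float)
    lifetimes = diagram[:, 1] - diagram[:, 0]
    counts = {}
    for dim in np.unique(diagram[:, 2]).astype(int):
        in_dim = diagram[:, 2] == dim
        counts[int(dim)] = int(np.sum(lifetimes[in_dim] >= min_lifetime))
    return counts


# made-up diagram with rows (birth, death, dimension)
toy_diagram = np.array([
    [0.0, 0.90, 0],
    [0.0, 0.05, 0],
    [0.2, 0.80, 1],
    [0.3, 0.35, 1],
    [0.4, 0.90, 2],
])
toy_counts = count_persistent_features(toy_diagram, min_lifetime=0.3)
```

On the real output, the same helper could be called as `count_persistent_features(diag[0], min_lifetime=...)`, with a threshold chosen by inspecting the diagram.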
