# Topology of Deep Neural Networks

This notebook shows how to use gdeep to reproduce the experiments of the paper [Topology of Deep Neural Networks](https://arxiv.org/pdf/2004.06093.pdf) by Naitzat et al. In this work, the authors study how the topology of a dataset evolves as it is embedded in the successive layers of a neural network trained to classify that dataset.

Their main findings can be summarized as follows:

- Neural networks tend to simplify the topology of the dataset across layers.

- This decrease in topological complexity is more efficient when the activation function is non-homeomorphic, as is the case for ReLU (leaky ReLU with a nonzero negative slope is, strictly speaking, still invertible).
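
A homeomorphism must in particular be injective, which gives a quick way to see why ReLU can change topology while tanh cannot. The following is a minimal sketch in plain Python; the sample points and the helper `relu` are illustrative and not taken from the paper.

```python
import math


def relu(x):
    """ReLU sends every non-positive input to 0, so it is not injective."""
    return max(0.0, x)


# Five distinct sample points on the real line (illustrative values).
points = [-2.0, -1.0, -0.5, 0.5, 1.0]

# Collect the distinct images under each activation.
relu_images = {relu(p) for p in points}
tanh_images = {math.tanh(p) for p in points}

# ReLU collapses the three negative points onto 0; tanh keeps all five apart.
print(len(points), len(relu_images), len(tanh_images))
```

Because ReLU glues whole regions of the input together, a layer using it can merge components or fill holes, whereas a strictly increasing activation such as tanh acts coordinate-wise as a bijection onto its image.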

Here is an illustration from the paper:

![illustration](/notebook_images/tda_dl/intro.png)

The main steps of this tutorial will be as follows:

1. Import the Entangled Tori dataset.
2. Build several fully connected networks, with different activation functions.
3. Train these networks to classify the Entangled Tori dataset.
4. Visualise in TensorBoard the persistence diagrams of the dataset embedded in each layer of each network.
5. Study the decrease in topological complexity of the dataset across layers.



## Import the packages that will be needed for the notebook

In [9]:
%reload_ext autoreload
%autoreload 2

# deep learning
import torch
from torch.optim import Adam, SGD
import numpy as np
from torch import nn
from torch import autograd  

#gdeep
from gdeep.data.datasets import DatasetBuilder, DataLoaderBuilder
from gdeep.models import FFNet
from gdeep.visualisation import persistence_diagrams_of_activations
from gdeep.data.preprocessors import ToTensorImage
from gdeep.trainer import Trainer
from gdeep.search import Benchmark



# plot
import plotly.express as px
import pandas as pd
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()

# ML
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

# TDA
from gtda.homology import VietorisRipsPersistence
from gtda.plotting import plot_diagram

#Tensorboard

import tensorflow as tf
import tensorboard as tb
tf.io.gfile = tb.compat.tensorflow_stub.io.gfile


## 1. Initialize the tensorboard writer and import the Entangled Tori dataset

In order to analyse the results of your models, you need to start TensorBoard.
In a terminal, move into the `/example` folder and run the following command:

```
tensorboard --logdir=runs
```

After the training, open [http://localhost:6006/](http://localhost:6006/) to see all the visualisation results.


In [10]:
db = DatasetBuilder(name="EntangledTori")
ds_tr, ds_val, ds_ts = db.build(n_pts = 50)
dl_tr, dl_val, dl_ts = DataLoaderBuilder((ds_tr, ds_val, ds_ts)).build()
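
To give an idea of what such a dataset looks like, here is a minimal NumPy sketch that samples two linked tori in 3D and labels them 0 and 1. It is only an illustration: the functions `sample_torus` and `entangled_tori`, the radii and the sampling scheme are assumptions, and the dataset produced by `DatasetBuilder(name="EntangledTori")` may be generated differently.

```python
import numpy as np


def sample_torus(n_pts, big_radius=2.0, small_radius=0.5, seed=0):
    """Sample points on a torus around the z-axis (uniform in the two angles)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 2.0 * np.pi, n_pts)  # angle around the z-axis
    v = rng.uniform(0.0, 2.0 * np.pi, n_pts)  # angle around the tube
    x = (big_radius + small_radius * np.cos(v)) * np.cos(u)
    y = (big_radius + small_radius * np.cos(v)) * np.sin(u)
    z = small_radius * np.sin(v)
    return np.stack([x, y, z], axis=1)


def entangled_tori(n_pts, big_radius=2.0, small_radius=0.5, seed=0):
    """Return points of two linked tori and their class labels (0 or 1)."""
    first = sample_torus(n_pts, big_radius, small_radius, seed)
    second = sample_torus(n_pts, big_radius, small_radius, seed + 1)
    # Swap y and z so the second torus lies around the xz-plane, then shift it
    # along x so that its core circle passes through the hole of the first one.
    second = second[:, [0, 2, 1]]
    second[:, 0] += big_radius
    points = np.concatenate([first, second])
    labels = np.concatenate([np.zeros(n_pts, dtype=int), np.ones(n_pts, dtype=int)])
    return points, labels


points, labels = entangled_tori(100)
print(points.shape, labels.shape)
```

With this construction the two core circles stay at distance `big_radius` from each other, so the tori are disjoint as long as `small_radius` is less than half of `big_radius`. Sampling uniformly in the angles is simple but not uniform with respect to surface area.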

## 2. Choose the architecture and activation functions of the models

In [11]:
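import torch.nn.functional as F
# layer widths: 3 input coordinates, four hidden layers of width 5, 2 output classes
architecture = [3,5,5,5,5,2]
loss_function = nn.CrossEntropyLoss()
# display names and functions, listed in the same order
activation_string = ["relu", "leakyrelu", "tanh", "sigmoid"]
# F.tanh and F.sigmoid still work but emit the deprecation warnings visible in the
# training output; torch.tanh and torch.sigmoid are the replacements
activation_functions = [F.relu, F.leaky_relu, F.tanh, F.sigmoid]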
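
The persistence diagrams studied later are computed on the point cloud as it appears after each layer. The following NumPy sketch shows what "the dataset embedded in each layer" means for an architecture list such as `[3, 5, 5, 5, 5, 2]`. It uses random weights and a hypothetical helper `layer_activations`; it is not how `FFNet` is implemented, and in particular it applies the activation after every layer, including the last, which `FFNet` may not do.

```python
import numpy as np


def layer_activations(points, arch, activation, seed=0):
    """Push a point cloud through a random fully connected net, keeping every layer's output."""
    rng = np.random.default_rng(seed)
    outputs = [points]
    x = points
    # Consecutive entries of arch give the input and output width of each linear layer.
    for fan_in, fan_out in zip(arch[:-1], arch[1:]):
        weights = rng.normal(size=(fan_in, fan_out))
        bias = rng.normal(size=fan_out)
        x = activation(x @ weights + bias)
        outputs.append(x)
    return outputs


def relu(z):
    return np.maximum(z, 0.0)


cloud = np.random.default_rng(1).normal(size=(100, 3))
embeddings = layer_activations(cloud, [3, 5, 5, 5, 5, 2], relu)
print([e.shape for e in embeddings])
```

Each entry of `embeddings` is a point cloud with as many points as the input but living in a space whose dimension is the width of the corresponding layer.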

In [12]:
models = []
writers = []
trainers = []
# build one network, one TensorBoard writer and one trainer per activation function
for name, activation in zip(activation_string, activation_functions):
    model_temp = FFNet(arch=architecture, activation=activation)
    writer_temp = SummaryWriter(log_dir='runs/' + model_temp.__class__.__name__ + name)
    trainer_temp = Trainer(model_temp, [dl_tr, dl_ts], loss_function, writer_temp)
    models.append(model_temp)
    writers.append(writer_temp)
    trainers.append(trainer_temp)
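
Each model gets its own TensorBoard run, named by concatenating the model class name and the activation name. The short sketch below reproduces only the string construction, so that the run names to look for in TensorBoard are explicit; `model_class_name` stands in for `model_temp.__class__.__name__`.

```python
activation_string = ["relu", "leakyrelu", "tanh", "sigmoid"]
# Stand-in for model_temp.__class__.__name__ when the model is an FFNet instance.
model_class_name = "FFNet"

# Same concatenation as in the loop above, without creating any writer.
log_dirs = ["runs/" + model_class_name + name for name in activation_string]
print(log_dirs)
```

Since TensorBoard is started with `--logdir=runs`, the four runs should appear side by side under these names.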

In [13]:
epochs = 10

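for pipe in trainers:
    # arguments: optimizer class, number of epochs, cross-validation flag (off),
    # optimizer parameters, dataloader parameters
    pipe.train(Adam, epochs, False, {"lr": 0.01}, {"batch_size": 100})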

Epoch 1
-------------------------------
Epoch training loss: 0.587334 	Epoch training accuracy: 64.36%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000



Cannot store data in the PR curve



Validation results: 
 accuracy: 72.67%,                 Avg loss: 0.508755 

Epoch 2
-------------------------------
Epoch training loss: 0.453703 	Epoch training accuracy: 74.62%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 73.71%,                 Avg loss: 0.421559 

Epoch 3
-------------------------------
Epoch training loss: 0.415970 	Epoch training accuracy: 76.32%                                                 
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 75.65%,                 Avg loss: 0.399604 

Epoch 4
-------------------------------
Epoch training loss: 0.382622 	Epoch training accuracy: 77.47%                                                 
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 77.62%,                 Avg loss: 0.369404 

Epoch 5
-------


nn.functional.tanh is deprecated. Use torch.tanh instead.



Epoch training loss: 0.531943 	Epoch training accuracy: 69.94%                                                            
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 76.39%,                 Avg loss: 0.445163 

Epoch 2
-------------------------------
Epoch training loss: 0.405069 	Epoch training accuracy: 78.56%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 78.94%,                 Avg loss: 0.381649 

Epoch 3
-------------------------------
Epoch training loss: 0.381078 	Epoch training accuracy: 79.67%                                                 
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 77.97%,                 Avg loss: 0.454133 

Epoch 4
-------------------------------
Epoch training loss: 0.366991 	Epoch training accuracy: 80.45%    


nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.



Epoch training loss: 0.654219 	Epoch training accuracy: 56.51%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 60.75%,                 Avg loss: 0.612072 

Epoch 2
-------------------------------
Epoch training loss: 0.597070 	Epoch training accuracy: 60.03%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 60.41%,                 Avg loss: 0.570718 

Epoch 3
-------------------------------
Epoch training loss: 0.564355 	Epoch training accuracy: 62.84%
Time taken for this epoch: 2.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 64.60%,                 Avg loss: 0.548756 

Epoch 4
--------------

In [14]:
from gdeep.analysis.interpretability import Interpreter
from gdeep.visualisation import Visualiser

vs = Visualiser(trainers[0]) 
vs.plot_3d_dataset()



In [15]:
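# use one very large batch so that the diagrams are computed on a whole split at once
one_batch_dataset, _, _ = DataLoaderBuilder((ds_tr, ds_val, ds_ts)).build(
    [{"batch_size": 32000}, {"batch_size": 32000}, {"batch_size": 32000}]
)

# plot the persistence diagrams of the layer activations for every trained model
for pipe in trainers:
    vs = Visualiser(pipe)
    vs.plot_persistence_diagrams(next(iter(one_batch_dataset)))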

TypeError: '_SingleProcessDataLoaderIter' object is not subscriptable

In [None]:
# train a wider network (hidden layers of width 10) for 100 epochs
model = FFNet(arch=[3, 10, 10, 10, 10, 2])
print(model)
pipe = Trainer(model, (dl_tr, dl_ts), nn.CrossEntropyLoss(), writer)
pipe.train(Adam, 100, False, {"lr": 0.01}, {"batch_size": 50})

FFNet(
  (linears): ModuleList(
    (0): Linear(in_features=3, out_features=10, bias=True)
    (1): Linear(in_features=10, out_features=10, bias=True)
    (2): Linear(in_features=10, out_features=10, bias=True)
    (3): Linear(in_features=10, out_features=10, bias=True)
    (4): Linear(in_features=10, out_features=2, bias=True)
  )
)
Epoch 1
-------------------------------
Epoch training loss: 0.674135 	Epoch training accuracy: 54.95%                                                           
Time taken for this epoch: 0.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 59.86%,                 Avg loss: 0.646908 

Epoch 2
-------------------------------
Epoch training loss: 0.643848 	Epoch training accuracy: 57.15%                                                           
Time taken for this epoch: 0.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 62.86%,                 Avg loss: 0.607025 

Epoch 3
-------------------------------
Epoch training l

(0.2513222247362137, 90.85714285714286)
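
To read the loss values in these logs, it helps to know that `nn.CrossEntropyLoss` combines a log-softmax with a negative log-likelihood. The plain-Python sketch below computes that quantity for a single sample; the helper `cross_entropy` and the logit values are illustrative. It shows that a two-class model with no preference scores ln 2 ≈ 0.693, which is close to the first-epoch losses above.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: log-sum-exp of the logits minus the target logit."""
    largest = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logits))
    return log_sum_exp - logits[target]


chance_level = cross_entropy([0.0, 0.0], 0)  # equal logits: no preference between classes
confident = cross_entropy([3.0, -3.0], 0)    # strong preference for the correct class
print(round(chance_level, 4), round(confident, 4))
```

A loss well below 0.693 therefore indicates that the network has started to separate the two tori.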

In [None]:
from gdeep.analysis.interpretability import Interpreter
from gdeep.visualisation import Visualiser

vs = Visualiser(pipe)
one_batch_dataset, _, _ = DataLoaderBuilder((ds_tr, ds_val, ds_ts)).build(
    [{"batch_size": 1600}, {"batch_size": 1600}, {"batch_size": 1600}]
)

# the diagrams can be seen on tensorboard!
vs.plot_persistence_diagrams(next(iter(one_batch_dataset)))
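
To make the content of a persistence diagram more concrete, the following plain-Python sketch computes its simplest part by hand: the death times of connected components (dimension 0) in a Vietoris-Rips filtration, using the convention that an edge enters the filtration at its length. The helper `h0_death_times` is hypothetical and is not what gdeep or giotto-tda run internally; those libraries also compute higher-dimensional features such as loops and voids.

```python
import itertools
import math


def h0_death_times(points):
    """Return the scales at which connected components merge, in increasing order."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (Kruskal's algorithm).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale: one of them dies.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Two clusters on a line: three close points and two close points, far apart.
cloud = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_death_times(cloud)
print(deaths)
```

Every component is born at scale 0, so each death time corresponds to one point of the dimension-0 diagram, and one component never dies. A single death time much larger than the others signals two well-separated clusters. A dataset whose classes have been pulled apart and simplified by the network shows few such long-lived features in each class.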


In [52]:
vs.plot_3d_dataset()

In [71]:
model_temp = FFNet(arch=[2, 3, 3])

TypeError: super(type, obj): obj must be an instance or subtype of type

In [69]:
architecture

[3, 10, 10, 10, 10, 2]