# Topology of Deep Neural Networks

This notebook shows how to use gdeep to reproduce the experiments of the paper [Topology of Deep Neural Networks](https://arxiv.org/pdf/2004.06093.pdf) by Naitzat et al. In this work, the authors study how the topology of a dataset evolves as it is embedded in the successive layers of a neural network trained to classify that dataset.

Their main findings can be summarized as follows:

- Neural networks tend to simplify the topology of the dataset across layers.

- This decrease in topological complexity is faster when the activation function is not a homeomorphism, as is the case for ReLU (leaky ReLU with a nonzero negative slope, by contrast, is invertible and therefore a homeomorphism).
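
To make the second finding concrete, here is a minimal sketch in plain Python (not taken from the paper or from gdeep). It checks injectivity on a handful of sample points: ReLU sends every negative input to 0, so distinct points can be merged and the map cannot be inverted, whereas tanh is strictly increasing and keeps them apart. The helper `is_injective_on` and the sample values are hypothetical and only serve this illustration.

```python
import math


def relu(x):
    """ReLU on a single number: negative inputs are all sent to 0."""
    return max(0.0, x)


def is_injective_on(f, points):
    """Return True if f maps the given points to pairwise distinct values."""
    images = [f(p) for p in points]
    return len(set(images)) == len(images)


# hypothetical sample points, two of them negative
samples = [-2.0, -1.0, 0.5, 1.0]

relu_injective = is_injective_on(relu, samples)       # -2.0 and -1.0 collide at 0.0
tanh_injective = is_injective_on(math.tanh, samples)  # all images are distinct

print("ReLU injective on samples:", relu_injective)
print("tanh injective on samples:", tanh_injective)
```

A map that merges points in this way can glue parts of the dataset together, which is one way the topology can change from one layer to the next.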

Here is an illustration from the paper:

![illustration](/notebook_images/tda_dl/intro.png)

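The main steps of this tutorial are as follows:

1. Start TensorBoard to log the results.
2. Import the Entangled Tori dataset and prepare the dataloaders.
3. Define one feed-forward network per activation function.
4. Train the networks.
5. Plot the dataset and the persistence diagrams of the activations.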


In [1]:
%reload_ext autoreload
%autoreload 2

# deep learning
import torch
from torch.optim import Adam, SGD
import numpy as np
from torch import nn
from torch import autograd  

#gdeep
from gdeep.data.datasets import DatasetBuilder, DataLoaderBuilder
from gdeep.models import FFNet
from gdeep.visualisation import persistence_diagrams_of_activations
from gdeep.data.preprocessors import ToTensorImage
from gdeep.trainer import Trainer
from gdeep.search import Benchmark



# plot
import plotly.express as px
import pandas as pd
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()

# ML
from sklearn.preprocessing import MinMaxScaler
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

# TDA
from gtda.homology import VietorisRipsPersistence
from gtda.plotting import plot_diagram

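# TensorBoard
# workaround commonly used when both TensorFlow and TensorBoard are installed:
# point tf.io.gfile at TensorBoard's file-system stub
import tensorflow as tf
import tensorboard as tb
tf.io.gfile = tb.compat.tensorflow_stub.io.gfile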


No TPUs...


# Initialize the TensorBoard writer

To analyse the results of your models, you need to start TensorBoard.
In a terminal, move into the `/example` folder and run the following command:

```
tensorboard --logdir=runs
```

Once training has finished, open [http://localhost:6006/](http://localhost:6006/) to browse the visualisation results.


# Import the Entangled Tori dataset and prepare the dataloaders

In [2]:
from torch.utils.data import RandomSampler

db = DatasetBuilder(name="EntangledTori")
ds_tr, ds_val, ds_ts = db.build(n_pts=50)
# one dictionary of dataloader arguments per dataset
dl_tr, dl_val, dl_ts = DataLoaderBuilder((ds_tr, ds_val, ds_ts)).build(
    [{"batch_size": 100, "sampler": RandomSampler(ds_tr)},
     {"batch_size": 100, "sampler": RandomSampler(ds_tr)},
     {"batch_size": 100, "sampler": RandomSampler(ds_tr)}]
)
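
For intuition about what an "entangled tori" point cloud can look like, here is a rough sketch in plain Python that samples points on two linked tori and labels them by torus. This is not gdeep's generator: the function names `sample_torus` and `linked_tori`, the radii and the placement of the second torus are all assumptions made for illustration.

```python
import math
import random


def sample_torus(n, major=2.0, minor=0.5, rng=random):
    """Sample n points on a torus around the z-axis (angles drawn uniformly)."""
    points = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)  # angle around the z-axis
        phi = rng.uniform(0.0, 2.0 * math.pi)    # angle around the tube
        ring = major + minor * math.cos(phi)
        points.append((ring * math.cos(theta),
                       ring * math.sin(theta),
                       minor * math.sin(phi)))
    return points


def linked_tori(n, major=2.0, minor=0.5, seed=0):
    """Return (points, labels) for two linked tori with n points each."""
    rng = random.Random(seed)
    first = sample_torus(n, major, minor, rng)
    # second torus: rotate by 90 degrees about the x-axis, then shift along x,
    # so that its core circle passes through the hole of the first torus
    second = [(x + major, -z, y)
              for (x, y, z) in sample_torus(n, major, minor, rng)]
    return first + second, [0] * n + [1] * n


points, labels = linked_tori(50)
print(len(points), "points, labels:", sorted(set(labels)))
```

The two classes cannot be separated without "unlinking" the tori, which is what makes this kind of dataset a useful test case for the topological analysis.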

## Define models with different activation functions

In [3]:
import torch.nn.functional as F

architecture = [3, 5, 5, 5, 5, 5, 2]
loss_function = nn.CrossEntropyLoss()
activation_string = ["relu", "leakyrelu", "tanh", "sigmoid"]
activation_functions = [F.relu, F.leaky_relu, torch.tanh, torch.sigmoid]
models = []
writers = []
trainers = []
# build one model, one TensorBoard writer and one trainer per activation
for name, activation in zip(activation_string, activation_functions):
    model_temp = FFNet(arch=architecture, activation=activation)
    writer_temp = SummaryWriter(log_dir="runs/" + model_temp.__class__.__name__ + name)
    trainer_temp = Trainer(model_temp, [dl_tr, dl_ts], loss_function, writer_temp)
    models.append(model_temp)
    writers.append(writer_temp)
    trainers.append(trainer_temp)
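
As a quick sanity check on the size of these networks, the sketch below counts their parameters in plain Python. It assumes that `architecture` lists layer widths (3 inputs, five hidden layers of width 5, 2 outputs) and that every layer is fully connected with a bias vector; both are assumptions about `FFNet`, not something this notebook states.

```python
# layer widths as used above
architecture = [3, 5, 5, 5, 5, 5, 2]

# consecutive pairs give the (inputs, outputs) shape of each layer
layer_shapes = list(zip(architecture, architecture[1:]))

# assumption: each layer has a weight matrix plus one bias per output unit
n_params = sum(n_in * n_out + n_out for n_in, n_out in layer_shapes)

print("layers:", layer_shapes)
print("trainable parameters:", n_params)  # 20 + 4 * 30 + 12 = 152
```

Under these assumptions each network is tiny, so differences between the runs come from the activation function rather than from model capacity.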








In [4]:
for pipe in trainers:
    pipe.train(
        Adam,                 # optimizer class
        10,                   # number of epochs
        False,
        {"lr": 0.01},         # optimizer arguments
        {"batch_size": 200},  # dataloader arguments
    )

Epoch 1
-------------------------------
Epoch training loss: 0.631131 	Epoch training accuracy: 59.53%                                                             
Time taken for this epoch: 1.00s
Learning rate value: 0.01000000



Cannot store data in the PR curve



Validation results: 
 accuracy: 62.70%,                 Avg loss: 0.603961 

Epoch 2
-------------------------------
Epoch training loss: 0.604639 	Epoch training accuracy: 62.15%                                                             
Time taken for this epoch: 1.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 61.73%,                 Avg loss: 0.613148 

Epoch 3
-------------------------------
Epoch training loss: 0.599063 	Epoch training accuracy: 62.39%                                                             
Time taken for this epoch: 1.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 62.83%,                 Avg loss: 0.599826 

Epoch 4
-------------------------------
Epoch training loss: 0.590372 	Epoch training accuracy: 63.09%                                                             
Time taken for this epoch: 1.00s
Learning rate value: 0.01000000
Validation results: 
 accuracy: 64.39%,                 Avg loss: 0.585976 

Epoch

In [5]:
from gdeep.analysis.interpretability import Interpreter
from gdeep.visualisation import Visualiser

vs = Visualiser(trainers[0]) 
vs.plot_3d_dataset()

In [6]:
# dataloaders with large batches (1600 points); only the first one is used below
one_batch_dataset, _, _ = DataLoaderBuilder((ds_tr, ds_val, ds_ts)).build(
    [{"batch_size": 1600, "sampler": RandomSampler(ds_tr)},
     {"batch_size": 1600, "sampler": RandomSampler(ds_tr)},
     {"batch_size": 1600, "sampler": RandomSampler(ds_tr)}]
)


for pipe in trainers:
    vs = Visualiser(pipe)
    vs.plot_persistence_diagrams(next(iter(one_batch_dataset)))
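
As rough intuition for what a persistence diagram records in homological dimension 0, here is a sketch in plain Python. It is not how gdeep or giotto-tda compute diagrams. It relies on a standard fact: in a Vietoris-Rips filtration every connected component is born at scale 0, and the scales at which components merge are the edge lengths of a minimum spanning tree. The function name `h0_death_times` and the toy point cloud are hypothetical.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Merge scales of connected components, via Kruskal's algorithm."""
    parent = list(range(len(points)))

    def find(i):
        # union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # all pairwise distances, shortest first
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    deaths = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:       # this edge merges two components
            parent[root_i] = root_j
            deaths.append(dist)    # one component dies at this scale
    return deaths


# toy cloud: a cluster of three points and a cluster of two points, far apart
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_death_times(cloud)
print(deaths)  # [1.0, 1.0, 1.0, 9.0]
```

The single large value reflects the two well-separated clusters: points far from the diagonal in a diagram correspond to features that persist over a wide range of scales, while points near the diagonal are short-lived.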

In [52]:
vs.plot_3d_dataset()