<a href="https://colab.research.google.com/github/andrewjh9/CenBench/blob/MLP/CenBench_MLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CenBench MLP
This Jupyter notebook is the same as the CenBench one, except that it trains a fully connected MLP. All the sparsity constraints have been removed. For an explanation of the individual parts of this notebook, please refer to CenBench.ipynb.


In [None]:
# Author: Decebal Constantin Mocanu et al.;
# Proof of concept implementation of Sparse Evolutionary Training (SET) of Multi Layer Perceptron (MLP) on CIFAR10 using Keras and a mask over weights.
# This implementation can be used to test SET in varying conditions, using the Keras framework versatility, e.g. various optimizers, activation layers, tensorflow
# Also it can be easily adapted for Convolutional Neural Networks or other models which have dense layers
# However, due to the fact that the weights are stored in the standard Keras format (dense matrices), this implementation cannot scale properly.
# If you would like to build a SET-MLP with over 100000 neurons, please use the pure Python implementation from the folder "SET-MLP-Sparse-Python-Data-Structures"

# This is a pre-alpha free software and was tested with Python 3.5.2, Keras 2.1.3, Keras_Contrib 0.0.2, Tensorflow 1.5.0, Numpy 1.14;
# The code is distributed in the hope that it may be useful, but WITHOUT ANY WARRANTIES; The use of this software is entirely at the user's own risk;
# For an easy understanding of the code functionality please read the following articles.

# If you use parts of this code please cite the following articles:
#@article{Mocanu2018SET,
#  author =        {Mocanu, Decebal Constantin and Mocanu, Elena and Stone, Peter and Nguyen, Phuong H. and Gibescu, Madeleine and Liotta, Antonio},
#  journal =       {Nature Communications},
#  title =         {Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science},
#  year =          {2018},
#  doi =           {10.1038/s41467-018-04316-3}
#}

#@Article{Mocanu2016XBM,
#author="Mocanu, Decebal Constantin and Mocanu, Elena and Nguyen, Phuong H. and Gibescu, Madeleine and Liotta, Antonio",
#title="A topological insight into restricted Boltzmann machines",
#journal="Machine Learning",
#year="2016",
#volume="104",
#number="2",
#pages="243--270",
#doi="10.1007/s10994-016-5570-z",
#url="https://doi.org/10.1007/s10994-016-5570-z"
#}

#@phdthesis{Mocanu2017PhDthesis,
#title = "Network computations in artificial intelligence",
#author = "D.C. Mocanu",
#year = "2017",
#isbn = "978-90-386-4305-2",
#publisher = "Eindhoven University of Technology",
#}

# Alterations made by Andrew Heath



!pip3 install networkit
!pip3 install networkx

## Set up

In [None]:
from __future__ import division
from __future__ import print_function
import tensorflow as tf
from datetime import datetime
import time
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras import optimizers
from tensorflow.python.client import device_lib

import numpy as np
from numpy import asarray
from numpy import savetxt
import pydot
from tensorflow.keras import models, layers
from tensorflow.keras import backend as K
from tensorflow.keras import activations
from tensorflow.keras import utils as k_utils
from copy import copy, deepcopy
import networkx.algorithms.isomorphism as iso
from  more_itertools import take
from scipy.sparse import dok_matrix
import networkx as nx
import networkit as nk
from random import sample


#Please note that in newer versions of keras_contrib you may encounter some import errors. You can find a fix for it on the Internet, or as an alternative you can try other activations functions.
# import tf.keras.activations.relu as SReLU
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.datasets import cifar100
from tensorflow.keras.datasets import fashion_mnist 
from tensorflow.keras.utils import to_categorical
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline  

class Constraint(object):

    def __call__(self, w):
        return w

    def get_config(self):
        return {}

class MaskWeights(Constraint):

    def __init__(self, mask):
        self.mask = mask
        self.mask = K.cast(self.mask, K.floatx())

    def __call__(self, w):
        w = w.assign(w * self.mask)
        return w

    def get_config(self):
        return {'mask': self.mask}


def find_first_pos(array, value):
    idx = (np.abs(array - value)).argmin()
    return idx


def find_last_pos(array, value):
    idx = (np.abs(array - value))[::-1].argmin()
    return array.shape[0] - idx
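
The two position helpers above return a half-open index range around the entries closest to a value. A minimal pure-Python sketch of the same logic (plain lists instead of NumPy arrays, which is an assumption of this example) suggests that `find_last_pos` returns one past the last matching index, so the two results can be used directly as slice bounds:

```python
def find_first_pos(values, target):
    """Index of the first entry closest to target (mirrors argmin on |array - value|)."""
    diffs = [abs(v - target) for v in values]
    return diffs.index(min(diffs))


def find_last_pos(values, target):
    """One past the index of the last entry closest to target."""
    reversed_diffs = [abs(v - target) for v in values][::-1]
    return len(values) - reversed_diffs.index(min(reversed_diffs))


# Illustrative, made-up sorted weights
sorted_weights = [-0.3, 0.0, 0.0, 0.0, 0.2]
first = find_first_pos(sorted_weights, 0.0)
last = find_last_pos(sorted_weights, 0.0)
print(first, last)                 # 1 4
print(sorted_weights[first:last])  # [0.0, 0.0, 0.0]
```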




## Init & Parameters

In [None]:
class CenBench_MLP():
    def __init__(self, maxepoches, dataset, pruning_approach, num_sds=0, batch_size = 100, centrality_metric=None, zeta=0.05):

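        # Helper: product of all entries of an iterable, used to flatten the input shape.
        # TODO: move this out of __init__
        def prod(val):
            res = 1
            for ele in val:
                res *= ele
            return res

        # Fetch the parameters for a given dataset
        dataset_name = dataset.__name__.split(".")[3]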

        self.hidden_layer_sizes, self.num_classes, self.dataset_input_shape = get_dataset_params(dataset_name)

        self.sd_l_scores = []
        self.epoch_centrality_lap_dis = []

        # set model parameters
        self.num_sds = num_sds #Used for CenSET removal based on SD
        self.number_of_connections_per_epoch = 0
        self.layer_sizes = [prod(self.dataset_input_shape), self.hidden_layer_sizes[0], self.hidden_layer_sizes[1], self.hidden_layer_sizes[2]]
        self.batch_size = batch_size # batch size
        self.maxepoches = maxepoches     # number of epochs
        self.learning_rate = 0.01 # SGD learning rate
        self.momentum = 0.9 # SGD momentum
        self.dataset = dataset
        self.pruning_approach = pruning_approach
        self.centrality_metric = centrality_metric

        self.current_epoch = 0
        self.mean_kc_scores = []
        self.mean_l_scores =[]

        self.w1 = None
        self.w2 = None
        self.w3 = None
        self.w4 = None

        # initialize weights for SReLu activation function
        self.wSRelu1 = None
        self.wSRelu2 = None
        self.wSRelu3 = None

        # create a SET-MLP model
        self.create_model()


In [None]:
def get_dataset_params(dataset_name):

    if dataset_name == "cifar10":
        hidden_layer_sizes = [4000,1000,4000]
        num_classes = 10
        dataset_input_shape = (32, 32, 3)
        return hidden_layer_sizes, num_classes, dataset_input_shape

    elif dataset_name == "cifar100":
        hidden_layer_sizes = [4000,1000,4000]
        num_classes = 100
        dataset_input_shape = (32, 32, 3)
        return hidden_layer_sizes, num_classes, dataset_input_shape

    elif dataset_name == "fashion_mnist":
        hidden_layer_sizes = [256, 128, 100]
        num_classes = 10
        dataset_input_shape = (28,28,1) 
        return hidden_layer_sizes, num_classes, dataset_input_shape

    elif dataset_name == "higgs":
        hidden_layer_sizes, num_classes, dataset_input_shape = None, None, None
        print("Dataset HIGGS not implemented !")
        return hidden_layer_sizes, num_classes, dataset_input_shape
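
The lookup above is keyed on a name taken from `dataset.__name__.split(".")[3]`, which only works when the dataset is the fourth component of the module's dotted path. How TensorFlow lays out its packages may differ between installations, so a minimal sketch of a less position-dependent alternative is to take the last component instead. The helper name and the dotted paths below are hypothetical and used only for illustration:

```python
import types


def dataset_name_from_module(module):
    """Return the last dotted component of a module's __name__."""
    return module.__name__.rsplit(".", 1)[-1]


# Stand-in modules with made-up dotted paths of different depths
short_path = types.ModuleType("tensorflow.keras.datasets.fashion_mnist")
long_path = types.ModuleType("some_pkg.api.v2.keras.datasets.cifar10")

print(dataset_name_from_module(short_path))  # fashion_mnist
print(dataset_name_from_module(long_path))   # cifar10
```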



## Create model

In [None]:
class CenBench_MLP(CenBench_MLP):
    def create_model(self):

        # create a fully connected MLP with 3 hidden layers for the configured dataset
        self.model = Sequential()
        #Input layer ---  
        self.model.add(Flatten(input_shape=self.dataset_input_shape))
        
        # Hidden layer 1
        self.model.add(Dense(self.hidden_layer_sizes[0], name="dense_1",weights=self.w1))
        self.model.add(layers.Activation(activations.relu,name="srelu1",weights=self.wSRelu1))
        self.model.add(Dropout(0.3))#Helps with overfitting, only present in training
        # Hidden layer 2
        self.model.add(Dense(self.hidden_layer_sizes[1], name="dense_2",weights=self.w2))
        self.model.add(layers.Activation(activations.relu,name="srelu2",weights=self.wSRelu2))
        self.model.add(Dropout(0.3))#Helps with overfitting, only present in training
        # Hidden layer 3
        self.model.add(Dense(self.hidden_layer_sizes[2], name="dense_3",weights=self.w3))
        self.model.add(layers.Activation(activations.relu,name="srelu3",weights=self.wSRelu3))
        self.model.add(Dropout(0.3)) #Helps with overfitting, only present in training
        # Output layer
        self.model.add(Dense(self.num_classes, name="dense_4",weights=self.w4)) #please note that there is no need for a sparse output layer as the number of classes is much smaller than the number of input hidden neurons
        self.model.add(Activation('softmax'))
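
As a sanity check on the architecture, the parameter count of each `Dense` layer is `inputs * units + units` (weights plus biases). The sketch below applies that to the Fashion-MNIST configuration from `get_dataset_params`; the helper name is hypothetical. The first value should agree with the `dense_1` row that `model.summary()` prints further down (200960).

```python
def dense_param_counts(layer_sizes):
    """Weights plus biases for each Dense layer in a fully connected stack."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


# 28*28*1 flattened inputs, hidden layers [256, 128, 100], 10 classes
sizes = [28 * 28 * 1, 256, 128, 100, 10]
counts = dense_param_counts(sizes)
print(counts)       # [200960, 32896, 12900, 1010]
print(sum(counts))  # 247766
```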

## Read dataset

In [None]:
class CenBench_MLP(CenBench_MLP):
    def read_data(self):
        # May need rewriting
        (x_train, y_train), (x_test, y_test) = self.dataset.load_data()
        y_train = to_categorical(y_train, self.num_classes)
        y_test = to_categorical(y_test, self.num_classes)
        x_train = x_train.astype('float32')
        x_test = x_test.astype('float32')
        # reshape Fashion-MNIST images to add a single channel dimension
        print("Dataset name: ", self.dataset.__name__.split(".")[3])
        if self.dataset.__name__.split(".")[3] == "fashion_mnist":
            x_train = x_train.reshape((x_train.shape[0], 28, 28, 1))
            x_test = x_test.reshape((x_test.shape[0], 28, 28, 1))  
        # normalize data using the training-set mean and std
        xTrainMean = np.mean(x_train, axis=0)
        xTrainStd = np.std(x_train, axis=0)
        x_train = (x_train - xTrainMean) / xTrainStd
        x_test = (x_test - xTrainMean) / xTrainStd

        return [x_train, x_test, y_train, y_test]
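
The normalisation in `read_data` standardises each feature with statistics computed on the training set only and reuses them for the test set. A minimal pure-Python sketch of that idea follows, using plain lists instead of NumPy arrays. The zero-variance guard (replacing a standard deviation of 0 with 1) is an assumption of this sketch and is not in the notebook, which divides by the raw standard deviation.

```python
from statistics import pstdev


def fit_standardizer(train_rows):
    """Per-feature mean and population std (same convention as np.std)."""
    columns = list(zip(*train_rows))
    means = [sum(col) / len(col) for col in columns]
    # Assumption of this sketch: constant features get std 1 to avoid dividing by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply previously fitted statistics to any split."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Made-up toy data: two training samples, one test sample, three features
train = [[0.0, 10.0, 5.0], [2.0, 30.0, 5.0]]
test = [[1.0, 20.0, 6.0]]

means, stds = fit_standardizer(train)
print(standardize(train, means, stds))
print(standardize(test, means, stds))
```

The test sample is scaled with the training statistics, so it is not guaranteed to have zero mean or unit variance itself.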

## Training


In [None]:
class CenBench_MLP(CenBench_MLP):
    def train(self):
        # read the configured dataset
        [x_train,x_test,y_train,y_test]=self.read_data()
        #data augmentation
        datagen = ImageDataGenerator(
            featurewise_center=False,  # set input mean to 0 over the dataset
            samplewise_center=False,  # set each sample mean to 0
            featurewise_std_normalization=False,  # divide inputs by std of the dataset
            samplewise_std_normalization=False,  # divide each input by its std
            zca_whitening=False,  # apply ZCA whitening
            rotation_range=10,  # randomly rotate images in the range (degrees, 0 to 180)
            width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
            height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
            horizontal_flip=True,  # randomly flip images
            vertical_flip=False)  # randomly flip images
        datagen.fit(x_train)

        self.model.summary()

        # training process in a for loop
        self.accuracies_per_epoch=[]
        self.loss_per_epoch=[]
        self.connections_per_epoch=[]
        for epoch in range(0, self.maxepoches):
            self.current_epoch = epoch
            self.number_of_connections_per_epoch = 0.0
            print("Enter epoch: ", epoch)
            sgd = optimizers.SGD(learning_rate=self.learning_rate, momentum=self.momentum)
            self.model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

            history = self.model.fit(datagen.flow(x_train, y_train,
                                                batch_size=self.batch_size),
                            steps_per_epoch=x_train.shape[0]//self.batch_size,
                                epochs=epoch,
                                validation_data=(x_test, y_test),
                                    initial_epoch=epoch-1)

            # Every 25 epochs, and on the final epoch, rebuild the network graph
            # and record its Laplacian centrality distribution.
            if self.current_epoch % 25 == 0 or self.current_epoch == self.maxepoches - 1:
                self.current_accuracy = history.history['val_accuracy'][0]
                w1 = self.model.get_layer("dense_1").get_weights()
                w2 = self.model.get_layer("dense_2").get_weights()
                w3 = self.model.get_layer("dense_3").get_weights()
                # One weight matrix per pair of consecutive layers: input -> h1 -> h2 -> h3
                G = generate_NN_network(self.layer_sizes, [w1[0], w2[0], w3[0]])
                btwn = nk.centrality.LaplacianCentrality(G, normalized=False)
                btwn.run()
                scores_cen = [i[1] for i in btwn.ranking()]
                self.epoch_centrality_lap_dis.append((self.current_epoch, asarray(scores_cen)))
            # Appended every epoch; between snapshots this repeats the latest scores
            self.mean_l_scores.append(np.mean(scores_cen))
            self.sd_l_scores.append(np.std(scores_cen))
    
            # Tracking current accuracy for AccSET and possible extensions
            self.accuracies_per_epoch.append(history.history['val_accuracy'][0])
            self.loss_per_epoch.append(history.history["val_loss"][0])
            print("adding to connections per epoch: ", self.number_of_connections_per_epoch)
            self.connections_per_epoch.append(self.number_of_connections_per_epoch)
 



        return [self.accuracies_per_epoch,  self.connections_per_epoch, self.loss_per_epoch, self.mean_l_scores, self.sd_l_scores, self.epoch_centrality_lap_dis]
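
The training loop rebuilds the graph and records the centrality distribution only on selected epochs. The sketch below shows one possible schedule, assuming the intent is a snapshot every 25th epoch plus one on the final epoch. The helper name and the `every` parameter are hypothetical.

```python
def is_snapshot_epoch(epoch, max_epochs, every=25):
    """True on every `every`-th epoch (starting at 0) and on the last epoch."""
    return epoch % every == 0 or epoch == max_epochs - 1


# Which epochs of a 60-epoch run would be recorded
print([e for e in range(60) if is_snapshot_epoch(e, 60)])  # [0, 25, 50, 59]
```

Epoch 0 always qualifies, so the centrality scores exist before any later epoch reuses them.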
           


## Generate Network from weight array

In [None]:
# TODO change this to only use networkit
# TODO change to use a lil sparse representation as this will likely be faster
def generate_NN_network(layers, layer_weights):
    iterations = 0
    n_nodes = sum(layers)
    adj_matrix = dok_matrix((n_nodes, n_nodes), dtype=np.float32)
    start = time.time()
    for layer_i, layer in enumerate(layers):    
        if not layer_i == len(layers) - 1 :
            # The MLP variant applies no weight mask, so the dense weights are used as-is
            sparse_layer_weights = layer_weights[layer_i] 
          

            current_layer_start_offset = 0 if layer_i == 0 else sum(layers[0 : layer_i])
            current_layer_end_offset = current_layer_start_offset + layer - 1
            next_layer_start_offset = current_layer_end_offset + 1 
            next_layer_end_offset = next_layer_start_offset +  layers[layer_i + 1] -1

            layer_index_value_dic = {(x + current_layer_start_offset, y + next_layer_start_offset):value for (x ,y), value in np.ndenumerate(sparse_layer_weights) if not value == 0 }

            adj_matrix._update(layer_index_value_dic)

    print("W -> N  time: s",(time.time() - start))
    
    G = nx.convert_matrix.from_scipy_sparse_matrix(adj_matrix, create_using=nx.DiGraph, edge_attribute='weight')
    Gnk = nk.nxadapter.nx2nk(G, weightAttr="weight")
    return  Gnk
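
The core of the conversion above is the index arithmetic: every neuron gets a global node id equal to its position within its layer plus the total size of all earlier layers, and each non-zero weight becomes a directed edge between two such ids. A minimal sketch of just that mapping, using nested lists and a plain dictionary instead of `dok_matrix` (both substitutions are assumptions made to keep the example dependency-free), could look like this:

```python
def layer_offsets(layer_sizes):
    """Global id of the first neuron in each layer."""
    offsets, total = [], 0
    for size in layer_sizes:
        offsets.append(total)
        total += size
    return offsets


def weights_to_edges(layer_sizes, layer_weights):
    """Map each non-zero weight to a (source id, target id) -> weight entry."""
    offsets = layer_offsets(layer_sizes)
    edges = {}
    for i, weights in enumerate(layer_weights):
        for row, row_values in enumerate(weights):
            for col, value in enumerate(row_values):
                if value != 0:
                    edges[(offsets[i] + row, offsets[i + 1] + col)] = value
    return edges


# Made-up toy network with layers of 2, 3 and 1 neurons
sizes = [2, 3, 1]
w01 = [[0.5, 0.0, -1.0], [0.0, 2.0, 0.0]]  # shape (2, 3)
w12 = [[1.5], [0.0], [-0.5]]               # shape (3, 1)
edges = weights_to_edges(sizes, [w01, w12])
print(edges)
```

Note that `layer_sizes` has one more entry than `layer_weights`, and each weight matrix has shape (size of layer i, size of layer i + 1).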


# Plot accuracy

In [None]:
def plot_save_accuracy(title, results_accu, results_connections, results_loss, results_cen, results_cen_sd, results_cen_dis , dataset_name, pruning_approach, epochs, centrality_metric=None, num_sd = None, tag=None):
    if centrality_metric is not None:
        save_name = pruning_approach +"_"+centrality_metric+"_"+dataset_name+"_for_"+str(epochs)+"_epochs_"+time.strftime("%Y%m%d-%H%M%S")
    else:
        save_name = pruning_approach +"__"+dataset_name+"_for_"+str(epochs)+"_epochs_"+time.strftime("%Y%m%d-%H%M%S")
    if num_sd is not None:
        save_name = save_name + "_num_sd_" + str(num_sd)
    tag = str(tag) if tag else ""
    for (epoch, data) in results_cen_dis:
        savetxt("PATH"+save_name+"_cen_dis_lap_epoch_"+str(epoch)+"_"+tag+".csv", asarray(data), delimiter=',')
    savetxt("PATH"+save_name+"_accuracy_"+tag+".csv", asarray(results_accu), delimiter=',')
    savetxt("PATH"+save_name+"_connections_"+tag+".csv", asarray(results_connections), delimiter=',')
    savetxt("PATH"+save_name+"_loss_"+tag+".csv", asarray(results_loss), delimiter=',')
    savetxt("PATH"+save_name+"_mean_lap_"+tag+".csv", asarray(results_cen), delimiter=',')
    savetxt("PATH"+save_name+"_sd_lap_"+tag+".csv", asarray(results_cen_sd), delimiter=',')
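
The file names above are assembled by string concatenation onto a `"PATH"` prefix. A small sketch of the same naming scheme built with `pathlib` is shown below. The helper names, the `timestamp` parameter (added so the name can be reproduced) and the `results` directory are assumptions of this example, not part of the notebook.

```python
import time
from pathlib import Path


def build_save_name(pruning_approach, dataset_name, epochs,
                    centrality_metric=None, num_sd=None, timestamp=None):
    """Reproduce the save_name layout used by plot_save_accuracy."""
    timestamp = timestamp or time.strftime("%Y%m%d-%H%M%S")
    metric = centrality_metric if centrality_metric is not None else ""
    name = f"{pruning_approach}_{metric}_{dataset_name}_for_{epochs}_epochs_{timestamp}"
    if num_sd is not None:
        name += f"_num_sd_{num_sd}"
    return name


def result_path(out_dir, save_name, kind, tag=""):
    """Path of one CSV file, e.g. kind='accuracy' or kind='loss'."""
    return Path(out_dir) / f"{save_name}_{kind}_{tag}.csv"


name = build_save_name("MLP", "fashion_mnist", 10, timestamp="20200101-000000")
print(result_path("results", name, "accuracy", "_testing_MLP"))
```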



# Run experiments
A method for running multiple experiments

In [None]:
def run_experiments(datasets, maxepoches, pruning_approachs, experiment_titles, sds = None,  centrality_metrics=None, tags=None):
    if  len(datasets) == len(maxepoches) == len(pruning_approachs) == len(experiment_titles)  :
        for experiment_i, experiment_title in enumerate(experiment_titles):
            dataset_name = datasets[experiment_i]. __name__.split(".")[3]
            print("------------START of experiment '"+experiment_title+"' for dataset: "+dataset_name+"------------")
            smlp = CenBench_MLP(maxepoches=maxepoches[experiment_i], dataset=datasets[experiment_i], num_sds= sds[experiment_i],  pruning_approach=pruning_approachs[experiment_i],centrality_metric=centrality_metrics[experiment_i] )
            # Saving results
            [res_acc, res_conn, res_loss, res_cen, results_cen_sd, res_cen_dis] = smlp.train()
            plot_save_accuracy(experiment_title, res_acc, res_conn, res_loss,res_cen, results_cen_sd, res_cen_dis, dataset_name,pruning_approachs[experiment_i], maxepoches[experiment_i], centrality_metrics[experiment_i], str(sds[experiment_i]), tags[experiment_i] )
          
            print("------------END of experiment '"+experiment_title+"' for dataset: "+dataset_name+"------------")
    else:
        raise ValueError("Incorrect experiment setup")
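
`run_experiments` checks the length of four of its list arguments but also indexes `sds`, `centrality_metrics` and `tags`, which default to `None`. One possible way to make the setup check cover all seven is sketched below: optional lists are filled with `None` per experiment and every list is validated before anything runs. The helper name and the dictionary layout are hypothetical, and strings stand in for the dataset modules.

```python
def build_experiment_configs(datasets, maxepoches, pruning_approachs, experiment_titles,
                             sds=None, centrality_metrics=None, tags=None):
    """Validate the parallel lists and return one settings dict per experiment."""
    n = len(experiment_titles)
    columns = {
        "dataset": datasets,
        "maxepoches": maxepoches,
        "pruning_approach": pruning_approachs,
        "title": experiment_titles,
        # Optional lists default to one None per experiment
        "num_sds": sds if sds is not None else [None] * n,
        "centrality_metric": centrality_metrics if centrality_metrics is not None else [None] * n,
        "tag": tags if tags is not None else [None] * n,
    }
    for name, values in columns.items():
        if len(values) != n:
            raise ValueError(f"Incorrect experiment setup: '{name}' has {len(values)} entries, expected {n}")
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


configs = build_experiment_configs(["fashion_mnist"], [10], ["MLP"], ["Testing_MLP"],
                                   tags=["_testing_MLP"])
print(configs)
```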

## Fit Zeta

In [None]:
def fit_sds(maxepoches, dataset, pruning_approach, experiment_title, sd_range, sd_step, centrality_metric=None, tag= None):
    for num_sd in np.arange(sd_range[0], sd_range[1], sd_step):
        dataset_name = dataset. __name__.split(".")[3]
        smlp = CenBench_MLP(maxepoches=maxepoches, dataset=dataset,  num_sds= num_sd, pruning_approach=pruning_approach, centrality_metric=centrality_metric)
        # Saving results
        [res_acc, res_conn, res_loss, res_cen, results_cen_sd, res_cen_dis] = smlp.train()
        plot_save_accuracy(experiment_title,res_acc, res_conn, res_loss,res_cen, results_cen_sd, res_cen_dis, dataset_name ,pruning_approach, maxepoches, centrality_metric, str(num_sd), tag )
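
`fit_sds` sweeps the number of standard deviations with `np.arange` and a fractional step. With non-integer steps, floating-point rounding can change whether the end point is included, so a sketch of an alternative is to compute the number of steps first and derive each value from an integer index. The helper name is hypothetical, and the rounding assumes the range is meant to span a whole number of steps.

```python
def sd_values(start, stop, step):
    """Values start, start + step, ... up to but excluding stop."""
    # Assumption: (stop - start) is intended to be a whole multiple of step
    count = max(0, round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(count)]


print(sd_values(3, 3.1, 0.1))   # [3.0]
print(sd_values(1, 2, 0.25))    # [1.0, 1.25, 1.5, 1.75]
```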

# Configure Experiments - Start Experiments
Configure the Experiments and run them

In [None]:
K.clear_session()

print(device_lib.list_local_devices())
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

datasets=[fashion_mnist] 

maxepoches=[10]
pruning_approachs=["MLP"]
centrality_metrics = [None]
sds= [None]
experiment_titles = ["Testing_MLP"]
tags = ["_testing_MLP"]
run_experiments(datasets, maxepoches, pruning_approachs, experiment_titles,sds, centrality_metrics, tags)

# fit_sds(300, fashion_mnist, "CenSET", "Model accuracy using CenSET", (3, 3.1), 0.1, "laplacian", "finding_opti_sd_removal_rate" )
# fit_sds(2, fashion_mnist, "SET", "Model accuracy using SET", (1, 2), 1, None, "_test_run_" )


[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 15036385842002480500
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 16183459840
locality {
  bus_id: 1
  links {
  }
}
incarnation: 3331097077076529170
physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"
]
Num GPUs Available:  1
------------START of experiment 'Testing_MLP' for dataset: fashion_mnist------------
Dataset name:  fashion_mnist
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               200960    
_________________________________________________________________
srelu1 (Activation)          (None, 256)               0         
____________

  "The `lr` argument is deprecated, use `learning_rate` instead.")


W -> N  time: s 0.9409182071685791
adding to connections per epoch:  0.0
Enter epoch:  1
W -> N  time: s 0.953606367111206
adding to connections per epoch:  0.0
Enter epoch:  2
Epoch 2/2
W -> N  time: s 0.9630002975463867
adding to connections per epoch:  0.0
Enter epoch:  3
Epoch 3/3
W -> N  time: s 0.9482855796813965
adding to connections per epoch:  0.0
Enter epoch:  4
Epoch 4/4
W -> N  time: s 0.9700522422790527
adding to connections per epoch:  0.0
Enter epoch:  5
Epoch 5/5
W -> N  time: s 0.9230055809020996
adding to connections per epoch:  0.0
Enter epoch:  6
Epoch 6/6
W -> N  time: s 0.9595603942871094
adding to connections per epoch:  0.0
Enter epoch:  7
Epoch 7/7
W -> N  time: s 0.9413561820983887
adding to connections per epoch:  0.0
Enter epoch:  8
Epoch 8/8
W -> N  time: s 1.0742771625518799
adding to connections per epoch:  0.0
Enter epoch:  9
Epoch 9/9
W -> N  time: s 0.944267988204956
adding to connections per epoch:  0.0
------------END of experiment 'Testing_MLP' for 



### Tickets
- How to find the inverse function: find where the centrality stops increasing and treat that value as 100% centrality, so that the centrality measures can be normalised to a percentage of it. This gives centrality percentage as a function of epoch, which can be used to scale the pruning rate. The function should be the reverse of the one seen in the data.
- The function for the rate of node removal should be the inverse of the observed function for the increase in centrality
    - SET on FashionMNIST should be rerun, recording Laplacian centrality
    - Perhaps 2 more datasets should be run, recording Laplacian centrality
    - Using all of these datasets I can try to come up with a matching function
    - Possible candidates: https://en.wikipedia.org/wiki/Exponential_growth#/media/File:Exponential.svg (x^3 looks good)

- Improve access speed on the sparse adjacency matrix in W -> N; test using list-of-lists (LIL) sparse matrices
- Read into Laplacian centrality
    - Does it work for directed graphs?

- Allow for changing of metric
- At each epoch in SET, record the ranking of centrality
- Use the above to determine a centrality threshold to prune beneath
- Choose better metrics
- Create framework to find pruning threshold for a metric
- Fix tex saving
- Show MLP in comparison charts?
- Track number of connections per epoch
- Track number of connections and centrality across network at end of training
- Convert between iterations on SET to check conversion methods
- Get VPN
- Get Colab Pro
- Set up Colab with GitHub: https://towardsdatascience.com/google-drive-google-colab-github-dont-just-read-do-it-5554d5824228
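
The first ticket describes normalising centrality against its plateau and using the result to scale the pruning rate. One possible reading is sketched below. It assumes the plateau is the maximum of the recorded mean centrality, interprets "inverse" as `1 - percentage`, and borrows 0.05 (the default `zeta` in `CenBench_MLP.__init__`) as the base rate. The function names and the input numbers are made up for illustration and are not results.

```python
def centrality_percentages(mean_centrality_per_epoch):
    """Mean centrality per epoch as a fraction of its plateau (taken here as the maximum)."""
    plateau = max(mean_centrality_per_epoch)
    if plateau <= 0:
        raise ValueError("plateau must be positive to normalise against it")
    return [c / plateau for c in mean_centrality_per_epoch]


def scaled_pruning_rates(percentages, base_rate=0.05):
    """Prune more while centrality is still low, less as it approaches the plateau."""
    return [base_rate * (1.0 - p) for p in percentages]


# Made-up mean centrality values, not measurements
mean_centrality = [10.0, 40.0, 80.0, 100.0, 100.0]
percentages = centrality_percentages(mean_centrality)
print(percentages)
print(scaled_pruning_rates(percentages))
```

Other monotone mappings, such as the x^3 candidate mentioned in the ticket, could replace the linear `1 - p` term.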

### Broken
- FashionMNIST is not supported




