# MODEL CHARACTERIZATION

In this notebook we will try to characterize the main features of a TTN classifier as a function of some of its parameters.
In particular, we will compare a pure TTN classifier, with no activation, regularization or batch normalization, with a fully optimized one. After that, the performance (accuracy) and time scaling are analyzed for different bond dimensions, feature maps and batch sizes.

This notebook is mainly composed of a series of for loops in which models are trained with different hyperparameters. For a complete explanation of each step of the training procedure, please refer to the TTN_train_example notebook in this folder.

The results of this run are saved in the JsonFiles/Characterizations/ folder and are plotted in the plot.ipynb notebook in the plots/ folder.

In [1]:
import os
import json
import time
import math
import datetime
import tensorflow        as tf
import tensornetwork     as tn
import numpy             as np
import matplotlib.pyplot as plt
import pandas            as pd

from tensorflow.keras.models      import Sequential
from tensorflow.keras             import regularizers

from sklearn.preprocessing        import MinMaxScaler
from sklearn.metrics              import roc_auc_score

In [3]:
import utils.preprocess as preprocess

from layers.TTN_SingleNode        import TTN_SingleNode
from ModelMaker                   import Make_SingleNode_Model

Using TensorFlow backend.


## DATA LOAD

Loading of the dataset and preprocessing

In [5]:
#dataset path
DATA_PATH = "../data/"
N = 1100000 #number of samples to load

#dataset loading
if os.path.isfile(DATA_PATH + "HIGGS.csv.gz"):
    data = pd.read_csv(
                DATA_PATH         + 'HIGGS.csv.gz'       ,  
                compression='gzip', error_bad_lines=False, 
                nrows=N           , header=None          
            )
elif os.path.isfile(DATA_PATH + "HIGGS.csv"):
    data = pd.read_csv(
                DATA_PATH + 'HIGGS.csv',    nrows=N  ,
                error_bad_lines=False  , header=None 
            )
else:
    #stop here: every following cell needs `data`
    raise FileNotFoundError("Data file not found in " + DATA_PATH)

In [6]:
#dataset preprocessing
x_train, x_val, x_test, y_train, y_val, y_test = \
    preprocess.Preprocess(
        data                       , 
        feature_map  = 'spherical' , #map typology
        map_order    = 2           , #map order
        con_order    = 2           , #number of contraction per site
        verbose      = True        , #verbose (print shapes for debugging)
        N_train      = 1000000     , #train set size
        N_val        = 50000       , #validation set size
        N_test       = 50000         #test set size
    )

Data shape
x_data shape:  (1100000, 28) y_data shape:  (1100000,)
Padded data shape
x_data shape:  (1100000, 32) y_data shape:  (1100000,)
Mapped data shape
x_data shape:  (1100000, 32, 3) y_data shape:  (1100000,)
Train, validation, test data shape
x_train shape:  (1000000, 32, 3) y_train shape:  (1000000,)
x_val   shape:  (50000, 32, 3) y_val   shape:  (50000,)
x_test  shape:  (50000, 32, 3) y_test  shape:  (50000,)
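
The printed shapes show the 28 input features growing to 32 before the mapping. 32 is the next power of two, which lets a binary tree pair the sites evenly at every layer. A minimal sketch of that padding step, assuming the extra columns are filled with zeros (the fill value used by `preprocess.Preprocess` is not shown here, and `pad_to_power_of_two` is a hypothetical helper):

```python
import numpy as np

def pad_to_power_of_two(x, fill_value=0.0):
    """Pad the feature axis of a (n_samples, n_features) array to the next power of two."""
    n_features = x.shape[1]
    target = 1 << (n_features - 1).bit_length()  # smallest power of two >= n_features
    pad_width = target - n_features
    # pad only on the right of the feature axis (the fill value is an assumption)
    return np.pad(x, ((0, 0), (0, pad_width)), constant_values=fill_value)

x = np.random.rand(4, 28)
x_padded = pad_to_power_of_two(x)
print(x_padded.shape)
```

An input that already has a power-of-two number of features is returned with its shape unchanged.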


## "PURE" TTN vs OPTIMIZED ONE

Training of two models: the first is a pure TTN without any regularization or normalization, the second is a TTN with several standard ML optimizations included.

In [14]:
#pure ttn model 
pure_ttn = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=15, activation=None, use_batch_norm=False,use_reg =False, n_contr=2)

In [15]:
#optimized ttn model
opti_ttn = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=10, activation='elu', use_batch_norm=True , n_contr=2)

In [16]:
pure_ttn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
opti_ttn.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

In [17]:
#train pure model
pure_history = pure_ttn.fit(x_train, y_train, validation_data=(x_val,y_val), epochs=150,  batch_size=5000 )

Train on 1000000 samples, validate on 50000 samples
Epoch 1/150
Epoch 2/150
Epoch 3/150
Epoch 4/150
Epoch 5/150
Epoch 6/150
Epoch 7/150
Epoch 8/150
Epoch 9/150
Epoch 10/150
Epoch 11/150
Epoch 12/150
Epoch 13/150
Epoch 14/150
Epoch 15/150
Epoch 16/150
Epoch 17/150
Epoch 18/150
Epoch 19/150
Epoch 20/150
Epoch 21/150
Epoch 22/150
Epoch 23/150
Epoch 24/150
Epoch 25/150
Epoch 26/150
Epoch 27/150
Epoch 28/150
Epoch 29/150
Epoch 30/150
Epoch 31/150
Epoch 32/150
Epoch 33/150
Epoch 34/150
Epoch 35/150
Epoch 36/150
Epoch 37/150
Epoch 38/150
Epoch 39/150
Epoch 40/150
Epoch 41/150
Epoch 42/150
Epoch 43/150
Epoch 44/150
Epoch 45/150
Epoch 46/150
Epoch 47/150
Epoch 48/150
Epoch 49/150
Epoch 50/150
Epoch 51/150
Epoch 52/150
Epoch 53/150
Epoch 54/150
Epoch 55/150
Epoch 56/150
Epoch 57/150
Epoch 58/150
Epoch 59/150
Epoch 60/150
Epoch 61/150
Epoch 62/150
Epoch 63/150
Epoch 64/150
Epoch 65/150
Epoch 66/150
Epoch 67/150
Epoch 68/150
Epoch 69/150
Epoch 70/150
Epoch 71/150
Epoch 72/150
Epoch 73/150
Epoch 74

In [18]:
#save pure model results
with open("../JsonFiles/Characterizations/pure_model_history.json", 'w') as jf:
    json.dump({k:list(np.array(v).astype(float)) for k,v in pure_history.history.items()}, jf)
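
The cast to `float` in the cell above matters because Keras histories can hold NumPy 32-bit scalars, which the standard `json` module refuses to serialize. A small sketch of the issue and of the conversion, using a made-up history dictionary (the values are placeholders, not results from this run):

```python
import json
import numpy as np

# placeholder history, mimicking the layout of `History.history`
history = {"loss":     [np.float32(0.7), np.float32(0.6)],
           "accuracy": [np.float32(0.5), np.float32(0.6)]}

try:
    json.dumps(history)
    serializable = True
except TypeError:
    # np.float32 is not a subclass of Python's float, so json rejects it
    serializable = False

# casting to float (float64) gives values that json can encode
clean = {k: list(np.array(v).astype(float)) for k, v in history.items()}
decoded = json.loads(json.dumps(clean))
print(serializable, sorted(decoded))
```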

In [19]:
#train optimized model
opti_history = opti_ttn.fit(x_train, y_train, validation_data=(x_val,y_val), epochs=150,  batch_size=5000 )

Train on 1000000 samples, validate on 50000 samples
Epoch 1/150
Epoch 2/150
Epoch 3/150
Epoch 4/150
Epoch 5/150
Epoch 6/150
Epoch 7/150
Epoch 8/150
Epoch 9/150
Epoch 10/150
Epoch 11/150
Epoch 12/150
Epoch 13/150
Epoch 14/150
Epoch 15/150
Epoch 16/150
Epoch 17/150
Epoch 18/150
Epoch 19/150
Epoch 20/150
Epoch 21/150
Epoch 22/150
Epoch 23/150
Epoch 24/150
Epoch 25/150
Epoch 26/150
Epoch 27/150
Epoch 28/150
Epoch 29/150
Epoch 30/150
Epoch 31/150
Epoch 32/150
Epoch 33/150
Epoch 34/150
Epoch 35/150
Epoch 36/150
Epoch 37/150
Epoch 38/150
Epoch 39/150
Epoch 40/150
Epoch 41/150
Epoch 42/150
Epoch 43/150
Epoch 44/150
Epoch 45/150
Epoch 46/150
Epoch 47/150
Epoch 48/150
Epoch 49/150
Epoch 50/150
Epoch 51/150
Epoch 52/150
Epoch 53/150
Epoch 54/150
Epoch 55/150
Epoch 56/150
Epoch 57/150
Epoch 58/150
Epoch 59/150
Epoch 60/150
Epoch 61/150
Epoch 62/150
Epoch 63/150
Epoch 64/150
Epoch 65/150
Epoch 66/150
Epoch 67/150
Epoch 68/150
Epoch 69/150
Epoch 70/150
Epoch 71/150
Epoch 72/150
Epoch 73/150
Epoch 74

In [20]:
#save optimized results
with open("../JsonFiles/Characterizations/opti_model_history.json", 'w') as jf:
    json.dump({k:list(np.array(v).astype(float)) for k,v in opti_history.history.items()}, jf)

## BOND DIMENSION 

Here we estimate how the training time and the model complexity depend on the bond dimension. Because the task is computationally intensive, the performance evaluation is not included in this notebook (Jupyter is not able to handle the full dataset). The performance evaluation can be done by running the scripts/run_bond_dim.sh script to train several models and then using the evaluate.py script to evaluate the trained models.
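
To get a feeling for how the model complexity grows with the bond dimension, the parameter count of a tree can be estimated by hand. The sketch below assumes 32 input sites of local dimension 3 (the shapes printed in the data-loading section), tensors that each contract `n_contr` inputs into one output bond, and a single output of dimension 1. It ignores biases and batch-normalization parameters, so it is not expected to match `model.count_params()` exactly. `ttn_param_estimate` is a hypothetical helper, not part of this repository.

```python
def ttn_param_estimate(n_sites=32, phys_dim=3, bond_dim=10, n_contr=2, out_dim=1):
    """Rough weight count of a tree tensor network (no biases, no batch norm)."""
    total = 0
    in_dim = phys_dim
    n = n_sites
    while n > 1:
        n //= n_contr                      # number of tensors in this layer
        out = out_dim if n == 1 else bond_dim
        total += n * (in_dim ** n_contr) * out
        in_dim = out                       # the next layer contracts these bonds
    return total

for chi in (5, 10, 20, 40):
    print(chi, ttn_param_estimate(bond_dim=chi))
```

Under these assumptions the intermediate layers dominate, so with `n_contr=2` the estimate grows roughly as the cube of the bond dimension.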

In [8]:
#time callback

class timecallback(tf.keras.callbacks.History):
    def __init__(self):
        super().__init__()
        self.times  = []
        self.epochs = []
        # reference timestamp used to compute the time taken by each epoch
        self.timetaken = tf.timestamp()
    def on_epoch_end(self, epoch, logs=None):
        #time elapsed since the previous epoch end (or since creation for the first epoch)
        time_epoch = tf.timestamp()
        self.times.append(time_epoch - self.timetaken)
        self.timetaken = time_epoch
        self.epochs.append(epoch)
    def on_train_end(self, logs=None):
        #convert the stored tensors to plain numbers
        self.times = [t.numpy() for t in self.times]

timeCall = timecallback()

In [20]:
bond_params = {}


#loop over bond dimension 5 to 100 with step of 5
#save results of each run to a specific dict
for i in range(1,21):
    model = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=5*i, activation='elu', use_batch_norm=True , n_contr=2)
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

    temp_dict={}
    temp_dict['params'] = model.count_params()
    
    timeCall = timecallback()
    model.fit(x_train, y_train,validation_data=(x_val,y_val),
        epochs=10,  batch_size=5000, callbacks=[timeCall])
    temp_dict['mean_train_time'] = np.mean(np.array(timeCall.times))
    temp_dict['std_train_time'] = np.std(np.array(timeCall.times))
    start_time = time.time()
    model.predict(x_test)
    temp_dict['eval_time'] = time.time()-start_time
    bond_params['bd'+str(5*i)]= temp_dict
 

In [22]:
#save dictionary results
with open('../JsonFiles/Characterizations/bond_dim_behav.json', 'w') as jf:
    json.dump(bond_params, jf)

In [16]:
#study of the evaluation time
bond_params = {}

for i in range(1,21):
    model = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=5*i, activation='elu', use_batch_norm=True , n_contr=2)
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

    temp_dict={}
    ev_list = []
    for j in range(100):
        start_time = time.time()
        model.predict(x_test[0:1])
        ev_list.append(time.time()-start_time)
    #the first call is discarded as warm-up before computing the statistics
    temp_dict['mean_eval_time'] = np.mean(np.array(ev_list[1:]))
    temp_dict['std_eval_time'] = np.std (np.array(ev_list[1:]))
    bond_params['bd'+str(5*i)]= temp_dict
 

In [17]:
with open('../JsonFiles/Characterizations/bond_dim_behav_eval.json', 'w') as jf:
    json.dump(bond_params, jf)
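
The repeated-call pattern used above can be wrapped in a small reusable helper. A minimal sketch, assuming any callable in place of `model.predict`; `time_calls` and its `warmup` argument are hypothetical names, and `time.perf_counter` is swapped in for `time.time` because it is a monotonic, high-resolution clock:

```python
import time
import numpy as np

def time_calls(fn, n_repeat=100, warmup=1):
    """Call fn() n_repeat times and return (mean, std) of the wall time,
    discarding the first `warmup` calls."""
    durations = []
    for _ in range(n_repeat):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    kept = np.array(durations[warmup:])
    return float(kept.mean()), float(kept.std())

# stand-in workload; in this notebook it would be
# lambda: model.predict(x_test[0:1])
mean_t, std_t = time_calls(lambda: sum(range(10000)), n_repeat=20)
print(mean_t, std_t)
```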

## FEATURE MAP

Here we estimate how the training and evaluation times depend on the feature map. Because the task is computationally intensive, the performance evaluation is not included in this notebook (Jupyter is not able to handle the full dataset). The performance evaluation can be done by running the scripts/run_maps.sh script.
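
For reference, two feature maps commonly used with tensor-network classifiers are sketched below. This is an illustration under assumptions, not the code in `utils.preprocess`: here a map of order `n` produces `n + 1` components per feature, chosen so that order 2 reproduces the last dimension of 3 seen in the shapes printed in the data-loading section. The exact definitions and normalizations used by `preprocess.Preprocess` may differ.

```python
import math
import numpy as np

def spherical_map(x, order=2):
    """Map x in [0, 1] to order + 1 components with unit L2 norm."""
    x = np.asarray(x, dtype=float)
    theta = 0.5 * np.pi * x
    comps = [math.sqrt(math.comb(order, s))
             * np.cos(theta) ** (order - s) * np.sin(theta) ** s
             for s in range(order + 1)]
    return np.stack(comps, axis=-1)

def polynomial_map(x, order=2):
    """Map x to the monomials [1, x, ..., x**order]."""
    x = np.asarray(x, dtype=float)
    return np.stack([x ** p for p in range(order + 1)], axis=-1)

x = np.random.rand(5, 32)
sph = spherical_map(x, order=2)
pol = polynomial_map(x, order=2)
print(sph.shape, pol.shape)
```

In this sketch the local dimension fed to the first layer grows with the map order, so the first-layer tensors, and with them the training time, would be expected to grow as well.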

In [36]:
sphe_dict = {}

for i in range(2,6):
    temp_dict={}
    x_train, x_val, x_test, y_train, y_val, y_test = preprocess.Preprocess(
        data, feature_map='spherical', map_order=i,  con_order=2, verbose=False, N_train= 1000000, N_val  =50000,N_test =50000)
    
    model = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=35, activation='elu', use_batch_norm=True , n_contr=2)
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
    timeCall = timecallback()
    model.fit(x_train, y_train,validation_data=(x_val,y_val),
        epochs=10,  batch_size=5000, callbacks=[timeCall])
    temp_dict['mean_train_time'] = np.mean(np.array(timeCall.times))
    temp_dict['std_train_time'] = np.std(np.array(timeCall.times))
    start_time = time.time()
    model.predict(x_test)
    temp_dict['eval_time'] = time.time()-start_time
    sphe_dict['order_'+str(i)]= temp_dict

Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [37]:
with open('../JsonFiles/Characterizations/map_spher_times.json', 'w') as jf:
    json.dump(sphe_dict, jf)

In [38]:
pol_dict = {}

for i in range(2,6):
    temp_dict={}
    x_train, x_val, x_test, y_train, y_val, y_test = preprocess.Preprocess(
        data, feature_map='polynomial', map_order=i,  con_order=2, verbose=False, N_train= 1000000, N_val  =50000,N_test =50000)
    
    model = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=35, activation='elu', use_batch_norm=True , n_contr=2)
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
    timeCall = timecallback()
    model.fit(x_train, y_train,validation_data=(x_val,y_val),
        epochs=10,  batch_size=5000, callbacks=[timeCall])
    temp_dict['mean_train_time'] = np.mean(np.array(timeCall.times))
    temp_dict['std_train_time'] = np.std(np.array(timeCall.times))
    start_time = time.time()
    model.predict(x_test)
    temp_dict['eval_time'] = time.time()-start_time
    pol_dict['order_'+str(i)]= temp_dict

Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Train on 1000000 samples, validate on 50000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [39]:
with open('../JsonFiles/Characterizations/map_poli_times.json', 'w') as jf:
    json.dump(pol_dict, jf)

## BATCH SIZE


Here we estimate how the training time depends on the batch size.

In [7]:
batch_dict = {}

bs = np.logspace(1, 4, num=30, endpoint=True, dtype=np.int32, base = 10)
bs

array([   10,    12,    16,    20,    25,    32,    41,    52,    67,
          85,   108,   137,   174,   221,   280,   356,   452,   573,
         727,   923,  1172,  1487,  1887,  2395,  3039,  3856,  4893,
        6210,  7880, 10000], dtype=int32)
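
The batch size affects the epoch time mainly through the number of gradient updates: Keras runs ceil(N / batch_size) steps per epoch. A quick sketch of that arithmetic for the 100000 training samples used in the loop below, on a few of the batch sizes listed above (`steps_per_epoch` is a hypothetical helper):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

n_samples = 100000
for b in (10, 108, 1172, 10000):
    print(b, steps_per_epoch(n_samples, b))
```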

In [9]:
for b in bs:
    x_train, x_val, x_test, y_train, y_val, y_test = preprocess.Preprocess(data, feature_map='spherical', map_order=3,
                                                                       con_order=2, verbose=True, N_train= 1000000, N_val  =50000,N_test =50000)
    temp_dict={}
    model = Make_SingleNode_Model( input_shape=(x_train.shape[1:]),bond_dim=35, activation='elu', use_batch_norm=True , n_contr=2)
    model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
    timeCall = timecallback()
    model.fit(x_train[:100000], y_train[:100000],validation_data=(x_val,y_val),
        epochs=5,  batch_size=b, callbacks=[timeCall])
    temp_dict['mean_train_time'] = np.mean(np.array(timeCall.times))
    temp_dict['std_train_time'] = np.std(np.array(timeCall.times))
    batch_dict['bs_'+str(b)]= temp_dict

Data shape
x_data shape:  (1100000, 28) y_data shape:  (1100000,)
Padded data shape
x_data shape:  (1100000, 32) y_data shape:  (1100000,)
Mapped data shape
x_data shape:  (1100000, 32, 3) y_data shape:  (1100000,)
Train, validation, test data shape
x_train shape:  (1000000, 32, 3) y_train shape:  (1000000,)
x_val   shape:  (50000, 32, 3) y_val   shape:  (50000,)
x_test  shape:  (50000, 32, 3) y_test  shape:  (50000,)
Train on 100000 samples, validate on 50000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Data shape
x_data shape:  (1100000, 28) y_data shape:  (1100000,)
Padded data shape
x_data shape:  (1100000, 32) y_data shape:  (1100000,)
Mapped data shape
x_data shape:  (1100000, 32, 3) y_data shape:  (1100000,)
Train, validation, test data shape
x_train shape:  (1000000, 32, 3) y_train shape:  (1000000,)
x_val   shape:  (50000, 32, 3) y_val   shape:  (50000,)
x_test  shape:  (50000, 32, 3) y_test  shape:  (50000,)
Train on 100000 samples, validate on 50000 samples
Epoc

In [10]:
with open('../JsonFiles/Characterizations/batch_size_times.json', 'w') as jf:
    json.dump(batch_dict, jf)