<a href="https://colab.research.google.com/github/RachelRamirez/misclassification_matrix/blob/main/Potential_Misclassification_Cost_Matrix_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Now that I've shown the TB and Isaranto methods are equivalent when seeded properly, I believe the next step is to train a "good" neural network with the same seed 30 times to get an idea of what the typical confusion matrix looks like. I will run the code within Colab 30 times without restarting, because previous reproducibility studies showed that if I restart and run all cells I get the same results. This time I'm interested in how much a confusion matrix typically varies when the neural network is seeded the same way each time. That way, when I give it another cost matrix to train on and run that 30 times, I can make a more informative comparison of the results. Since the Isaranto and TB methods were identical, I went with the Isaranto method because it was simpler, more efficient, and seemed less time consuming.
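One way to put a number on that "typical variety" is to keep every replicate's confusion matrix and look at the element-wise spread as well as the mean. The following is only a minimal sketch: the helper name `summarize_confusion_matrices` and the tiny 2x2 matrices are made up for illustration and are not part of this notebook's runs.

```python
import numpy as np

def summarize_confusion_matrices(matrices):
    """Return the element-wise mean and sample standard deviation
    of a list of equally shaped confusion matrices."""
    # Stack into shape (runs, classes, classes) so axis 0 indexes the replicate.
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0, ddof=1)  # ddof=1 gives the sample standard deviation
    return mean, std

# Three made-up 2x2 confusion matrices standing in for replicate runs.
runs = [
    [[50, 0], [2, 48]],
    [[49, 1], [1, 49]],
    [[48, 2], [3, 47]],
]
mean_cm, std_cm = summarize_confusion_matrices(runs)
print(mean_cm)
print(std_cm)
```

With the standard deviation in hand, a later comparison against a different cost matrix can ask whether a cell moved by more than its run-to-run noise.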

Changes from previous Reproducibility notebook:
1. Dropout is back in.
2. Batch size is smaller, to help with variety.
3. Number of epochs is more than 4, now that I care about achieving good overall accuracy.
4. An `EarlyStopping` callback was added.
5. `shuffle` in `model.fit` is still `False` (I'm calling it out to see if I need to change that).
6. But `model.fit` is now called with `use_multiprocessing=True`.
7. The cost matrix is still all 1's.
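For item 4, the patience rule behind early stopping can be sketched in plain Python. This is a simplified illustration, not Keras's implementation: the function name `stopping_epoch` and the loss values are hypothetical, and it ignores options such as `min_delta` and `restore_best_weights`.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving on the best value seen so far. If that never
    happens, the last epoch is returned.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                return epoch
    return len(val_losses)

# Hypothetical validation losses: the best value is at epoch 5,
# followed by three epochs without improvement.
losses = [0.24, 0.11, 0.10, 0.101, 0.08, 0.09, 0.10, 0.095, 0.07]
print(stopping_epoch(losses, patience=3))  # 8
```

Because the stopping point depends on the validation curve, replicates may stop at different epochs, which is one more source of variation between runs.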

### Reproducible Seeds

In [4]:
#For Reproducibility
import numpy as np
# np.random.seed(1337)  # for reproducibility

import tensorflow as tf
# tf.random.set_seed(33)

import random as python_random
# python_random.seed(4)

# https://www.tensorflow.org/api_docs/python/tf/keras/utils/set_random_seed
tf.keras.utils.set_random_seed(342) #Possibly use next iteration if the above doesn't work


# Running more than once causes variation.  try adding this:
# Set seed value
seed_value = 56
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

print("TF version: " , tf.__version__ )
print("Keras version: " , tf.keras.__version__ )

TF version:  2.9.2
Keras version:  2.9.0
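As a rough analogy for why replicates inside one session can differ while a restart plus run-all reproduces them, here is a sketch using only Python's built-in `random` module. It does not touch TensorFlow, and the helper `draw` is made up for illustration; the seed value 342 is simply borrowed from the cell above.

```python
import random

def draw(n):
    """Draw n pseudo-random floats from the global generator."""
    return [random.random() for _ in range(n)]

random.seed(342)
first = draw(3)
second = draw(3)      # the generator state has advanced, so these differ from `first`

random.seed(342)      # reseeding plays the role of restarting the runtime
first_again = draw(3)

print(first == first_again)  # True
print(first == second)       # False
```

The whole sequence of draws is reproducible from the seed, but each draw within the sequence is different, which matches what I expect from 30 back-to-back trainings.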


### Import rest of Library

In [26]:

# from https://github.com/keras-team/keras/issues/2115#issuecomment-204060456
# with a correction to the weighted function in the middle

'''Train a simple deep NN on the MNIST dataset.
Get to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''

from __future__ import print_function  #do i still need this?
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
import keras.backend as K
from itertools import product
import functools
from functools import partial
from time import ctime
from time import sleep
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd 
from sklearn.metrics import confusion_matrix

## MORE REPEATABILITY STUFF NEEDED - If there's a way to update this to TensorFlow 2, great; otherwise I have to use TF 1.x-style code
# 5. Configure a new global `tensorflow` session (https://stackoverflow.com/questions/50659482/why-cant-i-get-reproducible-results-in-keras-even-though-i-set-the-random-seeds)
# from keras import backend as K


# I believe the code below helps results stay repeatable each time different sections of my Google Colab notebook execute
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

### Define batch, epochs, and format data

In [6]:
# I originally used a very large batch size to reduce batch-to-batch variation and make
# the training runs more nearly identical, which worked. I then brought it back down to
# something reasonable to get better results when training the NN.
batch_size = 256
nb_classes = 10
nb_epoch = 15

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()


X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)


60000 train samples
10000 test samples


### Define weighted_categorical_crossentropy()

In [7]:

# # https://github.com/keras-team/keras/issues/2115#issuecomment-207765342

# def w_categorical_crossentropy(y_true, y_pred, weights):
#     nb_cl = len(weights)
#     final_mask = K.zeros_like(y_pred[:, 0])
#     y_pred_max = K.max(y_pred, axis=1)
#     y_pred_max = K.expand_dims(y_pred_max, 1)
#     y_pred_max_mat = K.equal(y_pred, y_pred_max)
 
#     for c_t, c_p in product(range(nb_cl), range(nb_cl)):
#         final_mask += (K.cast(weights[c_t, c_p],K.floatx()) * K.cast(y_pred_max_mat[:, c_p] ,K.floatx())* K.cast(y_true[:, c_t],K.floatx()))
    
#     # result = K.categorical_crossentropy(y_true, y_pred)*final_mask
#     # tf.print(result, "Show Result of CE * Final_Mask")  #this was basically useless to display, and it showed like, 500 lines of print statements each epoch

#     return K.categorical_crossentropy(y_true, y_pred)*final_mask   #I changed the order of y_true and y_pred




### Define Same Model but use normal Categorical CrossEntropy with no extra cost-matrix of Weights

In [8]:
def normal_method():

  model = Sequential()
  model.add(Dense(512, input_shape=(784,) ,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(10, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('softmax'))

  rms = RMSprop()
  # model.compile(loss=ncce, optimizer=rms)
  model.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer=rms, metrics='categorical_accuracy', )

  # Add early stopping to prevent overfitting (must be defined: it is passed to model.fit below)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model_history = model.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test),shuffle=False, use_multiprocessing=True
            , callbacks = [callback])

  
  # model.evaluate(X_test, Y_test, verbose=1)  # I know this isn't the typical use of train/val/test sets, please don't comment on that
  
  #Predict
  y_prediction = model.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes) #If I want to do SparseCategoricalAccuracy

  # Create the confusion matrix as raw counts; the commented-out line below would normalize over predictions (columns) instead
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')  #if I want percentages instead of raw counts

  
  cm = confusion_matrix(y_test, y_prediction)
  cm = pd.DataFrame(cm, range(10),range(10))

  # This shows a pretty confusion matrix, which I don't need to show right now
  # plt.figure(figsize = (10,10))
  # sns.heatmap(cm, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()
  # cm_normal = cm

  return cm


### Weighted Categorical Cross Entropy Function

In [9]:
# from https://github.com/keras-team/keras/issues/2115#issuecomment-815825633 from Isaranto

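def weighted_categorical_crossentropy_new(y_true, y_pred, weights):
  # Predicted and true class index for each sample in the batch
  idx1 = K.argmax(y_pred, axis=1)
  idx2 = K.argmax(y_true, axis=1)
  # Look up weights[predicted, true] for each sample
  mask = tf.gather_nd(weights, tf.stack((idx1, idx2), -1))
  # Scale the per-sample cross-entropy by the looked-up weight
  return K.categorical_crossentropy(y_true, y_pred) * mask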

In [10]:
 #what does ncce stand for?

 
def isaranto_method():
  w_array = np.ones((10,10))
  # w_array[9, 7] = 1.5
  # w_array = w_array - np.eye(10)
  # print("W_array:  ", w_array)

  weighted_list = w_array.tolist()

  wcce = partial(weighted_categorical_crossentropy_new, weights=weighted_list)
  wcce.__name__ ='w_categorical_crossentropy'

  model3 = Sequential()
  model3.add(Dense(512, input_shape=(784,), kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(10,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('softmax'))

  rms = RMSprop()

  model3.compile(loss=wcce, optimizer=rms,  metrics='categorical_accuracy',)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model3_history = model3.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test), shuffle=False, use_multiprocessing=True
            ,callbacks = [callback]
            )

 

  #Predict
  y_prediction = model3.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes)

  # Create the confusion matrix as raw counts; the commented-out line below would normalize over predictions (columns) instead
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')

  

  cm3 = confusion_matrix(y_test, y_prediction)
  cm3 = pd.DataFrame(cm3, range(10),range(10))
  # plt.figure(figsize = (10,10))
  # cm3
  # sns.heatmap(cm3, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()

  # cm_using_weighted_new = cm3

  return cm3

In [11]:
cm = np.zeros([10,10])


for i in range(0,30):
  print(i)
  cm2 =  isaranto_method()
  print("CM: \n", cm2)
  cm += cm2

cm_new = cm/30

0
Epoch 1/15
235/235 - 16s - loss: 0.3032 - categorical_accuracy: 0.9058 - val_loss: 0.2383 - val_categorical_accuracy: 0.9259 - 16s/epoch - 70ms/step
Epoch 2/15
235/235 - 10s - loss: 0.1173 - categorical_accuracy: 0.9645 - val_loss: 0.1106 - val_categorical_accuracy: 0.9666 - 10s/epoch - 41ms/step
Epoch 3/15
235/235 - 7s - loss: 0.0788 - categorical_accuracy: 0.9756 - val_loss: 0.0989 - val_categorical_accuracy: 0.9701 - 7s/epoch - 32ms/step
Epoch 4/15
235/235 - 8s - loss: 0.0593 - categorical_accuracy: 0.9825 - val_loss: 0.1007 - val_categorical_accuracy: 0.9699 - 8s/epoch - 35ms/step
Epoch 5/15
235/235 - 7s - loss: 0.0480 - categorical_accuracy: 0.9842 - val_loss: 0.0817 - val_categorical_accuracy: 0.9772 - 7s/epoch - 28ms/step
Epoch 6/15
235/235 - 9s - loss: 0.0393 - categorical_accuracy: 0.9876 - val_loss: 0.0869 - val_categorical_accuracy: 0.9751 - 9s/epoch - 37ms/step
Epoch 7/15
235/235 - 7s - loss: 0.0325 - categorical_accuracy: 0.9897 - val_loss: 0.0983 - val_categorical_accur

In [12]:
print("This is the average confusion matrix using 30 runs")
round(cm_new,1)

This is the average confusion matrix using 30 runs


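| True \ Predicted | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 972.3 | 0.9 | 0.5 | 0.5 | 0.4 | 1.1 | 1.6 | 0.5 | 1.3 | 0.9 |
| 1 | 0.0 | 1129.2 | 1.2 | 0.9 | 0.1 | 0.3 | 1.3 | 0.0 | 2.0 | 0.0 |
| 2 | 2.5 | 2.3 | 1012.4 | 2.4 | 1.8 | 0.1 | 1.4 | 3.7 | 5.1 | 0.3 |
| 3 | 0.1 | 0.0 | 3.6 | 990.1 | 0.1 | 5.5 | 0.0 | 3.7 | 2.2 | 4.7 |
| 4 | 0.7 | 0.0 | 3.4 | 0.0 | 966.4 | 0.0 | 3.0 | 0.6 | 0.5 | 7.4 |
| 5 | 2.0 | 0.0 | 0.0 | 5.1 | 1.2 | 878.1 | 1.2 | 0.3 | 2.3 | 1.8 |
| 6 | 4.0 | 2.7 | 0.2 | 0.7 | 5.6 | 16.6 | 926.8 | 0.0 | 1.5 | 0.0 |
| 7 | 1.4 | 3.2 | 9.4 | 0.7 | 3.7 | 0.1 | 0.0 | 991.4 | 4.0 | 14.0 |
| 8 | 3.0 | 1.1 | 3.0 | 5.4 | 4.1 | 7.4 | 0.5 | 2.4 | 940.1 | 7.1 |
| 9 | 1.0 | 2.2 | 0.0 | 2.3 | 10.9 | 2.3 | 0.3 | 1.3 | 0.5 | 988.1 |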


The above confusion matrix matches the last saved version exactly!  Therefore the 30 replicates were exactly reproducible.  

In [13]:
cm_new

| True \ Predicted | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 972.3 | 0.933333 | 0.466667 | 0.5 | 0.4 | 1.066667 | 1.633333 | 0.466667 | 1.333333 | 0.9 |
| 1 | 0.033333 | 1129.2 | 1.2 | 0.866667 | 0.066667 | 0.333333 | 1.3 | 0.0 | 1.966667 | 0.033333 |
| 2 | 2.533333 | 2.3 | 1012.4 | 2.433333 | 1.8 | 0.1 | 1.366667 | 3.7 | 5.066667 | 0.3 |
| 3 | 0.1 | 0.0 | 3.633333 | 990.133333 | 0.066667 | 5.466667 | 0.033333 | 3.7 | 2.166667 | 4.7 |
| 4 | 0.7 | 0.0 | 3.433333 | 0.0 | 966.4 | 0.033333 | 3.0 | 0.6 | 0.466667 | 7.366667 |
| 5 | 2.0 | 0.0 | 0.0 | 5.1 | 1.2 | 878.1 | 1.233333 | 0.266667 | 2.266667 | 1.833333 |
| 6 | 4.0 | 2.666667 | 0.166667 | 0.733333 | 5.6 | 16.566667 | 926.766667 | 0.0 | 1.466667 | 0.033333 |
| 7 | 1.366667 | 3.2 | 9.433333 | 0.733333 | 3.733333 | 0.1 | 0.0 | 991.4 | 4.033333 | 14.0 |
| 8 | 3.0 | 1.133333 | 2.966667 | 5.366667 | 4.066667 | 7.366667 | 0.5 | 2.4 | 940.1 | 7.1 |
| 9 | 1.033333 | 2.2 | 0.033333 | 2.266667 | 10.933333 | 2.3 | 0.3 | 1.333333 | 0.5 | 988.1 |


In [14]:
import pickle
# Open a file and use dump()
with open('file.pkl', 'wb') as file:
      
    # A new file will be created
    pickle.dump(cm_new, file)




# Open the file in binary mode
with open('file.pkl', 'rb') as file:
      
    # Call the load method to deserialize
    myvar_cm = pickle.load(file)
  
    print(myvar_cm)


from google.colab import files
files.download( "file.pkl" )  

            0            1            2           3           4           5  \
0  972.300000     0.933333     0.466667    0.500000    0.400000    1.066667   
1    0.033333  1129.200000     1.200000    0.866667    0.066667    0.333333   
2    2.533333     2.300000  1012.400000    2.433333    1.800000    0.100000   
3    0.100000     0.000000     3.633333  990.133333    0.066667    5.466667   
4    0.700000     0.000000     3.433333    0.000000  966.400000    0.033333   
5    2.000000     0.000000     0.000000    5.100000    1.200000  878.100000   
6    4.000000     2.666667     0.166667    0.733333    5.600000   16.566667   
7    1.366667     3.200000     9.433333    0.733333    3.733333    0.100000   
8    3.000000     1.133333     2.966667    5.366667    4.066667    7.366667   
9    1.033333     2.200000     0.033333    2.266667   10.933333    2.300000   

            6           7           8           9  
0    1.633333    0.466667    1.333333    0.900000  
1    1.300000    0.000000 

Started at 10:43 AM  ended at 11:27 

In [28]:
ctime()

'Tue Feb  7 16:49:05 2023'

In [19]:
# How many test samples are there in each category?

truth_num_per_category = Y_test.sum(axis=0)
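The column sum works because each row of a one-hot matrix has a single 1 in the column of its class. A small sketch with made-up labels (not the MNIST data) shows that it agrees with counting the integer labels directly:

```python
import numpy as np

labels = np.array([0, 2, 1, 2, 2, 0])   # made-up integer labels for 3 classes
one_hot = np.eye(3)[labels]             # same layout as the to_categorical output

counts_from_one_hot = one_hot.sum(axis=0)               # column sums = samples per class
counts_from_labels = np.bincount(labels, minlength=3)   # direct count of each label

print(counts_from_one_hot)  # [2. 1. 3.]
print(counts_from_labels)   # [2 1 3]
```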

In [25]:
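# I'm actually surprised this ended up working (I double checked in Excel).
# Dividing the 10x10 DataFrame by a length-10 array broadcasts across columns,
# so column j is divided by the number of true samples of class j.
cm_percents = cm_new/truth_num_per_category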

with open('cm_percents.pkl', 'wb') as file:
      
    # A new file will be created
    pickle.dump(cm_percents, file)




# Open the file in binary mode
with open('cm_percents.pkl', 'rb') as file:
      
    # Call the load method to deserialize
    myvar_cm_percents = pickle.load(file)
  
    print(myvar_cm_percents)


from google.colab import files
files.download( "cm_percents.pkl" )  

          0         1         2         3         4         5         6  \
0  0.992143  0.000822  0.000452  0.000495  0.000407  0.001196  0.001705   
1  0.000034  0.994890  0.001163  0.000858  0.000068  0.000374  0.001357   
2  0.002585  0.002026  0.981008  0.002409  0.001833  0.000112  0.001427   
3  0.000102  0.000000  0.003521  0.980330  0.000068  0.006129  0.000035   
4  0.000714  0.000000  0.003327  0.000000  0.984114  0.000037  0.003132   
5  0.002041  0.000000  0.000000  0.005050  0.001222  0.984417  0.001287   
6  0.004082  0.002349  0.000161  0.000726  0.005703  0.018572  0.967397   
7  0.001395  0.002819  0.009141  0.000726  0.003802  0.000112  0.000000   
8  0.003061  0.000999  0.002875  0.005314  0.004141  0.008259  0.000522   
9  0.001054  0.001938  0.000032  0.002244  0.011134  0.002578  0.000313   

          7         8         9  
0  0.000454  0.001369  0.000892  
1  0.000000  0.002019  0.000033  
2  0.003599  0.005202  0.000297  
3  0.003599  0.002225  0.004658  
4  0
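One thing worth double checking in the division: a 10x10 table divided by a length-10 array is broadcast across columns, so entry (i, j) is divided by the count for class j, not class i. The printed values agree with that reading, since 0.933333 / 1135 ≈ 0.000822 for row 0, column 1. The diagonal is unaffected, but the off-diagonal entries are not fractions of the true class in that row. The sketch below uses a made-up 2x2 matrix in NumPy, which broadcasts the same way here. Assuming the goal is "fraction of each true class", the pandas counterpart of the row-wise version should be `cm_new.div(truth_num_per_category, axis=0)`.

```python
import numpy as np

# Made-up 2-class confusion matrix: rows are true classes, columns are predictions.
cm = np.array([[90.0, 10.0],
               [20.0, 180.0]])
true_counts = cm.sum(axis=1)  # samples per true class: [100, 200]

by_column = cm / true_counts        # entry (i, j) divided by true_counts[j]
by_row = cm / true_counts[:, None]  # entry (i, j) divided by true_counts[i]

print(by_column)
print(by_row)
print(by_row.sum(axis=1))  # each row sums to 1 only in the row-wise version
```

Both versions give the same diagonal (0.9 and 0.9 here), so the per-class accuracies are the same either way. Only the off-diagonal rates change.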

The highest per-class classification accuracy is for digit 1, at about 99.5% (1129.2 of 1135 test samples correct on average).

To reference later: 

https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/imbalanced_data.ipynb#scrollTo=UJ589fn8ST3x

To train a model with class weights:

```
class_weight = {0: weight_for_0, 1: weight_for_1}

weighted_model = make_model()
weighted_model.load_weights(initial_weights)

weighted_history = weighted_model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[early_stopping],
    validation_data=(val_features, val_labels),

    # The class weights go here
    class_weight=class_weight)
```
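The snippet leaves `weight_for_0` and `weight_for_1` undefined. One common heuristic is inverse-frequency weighting, sketched below for any number of classes. The helper name `balanced_class_weights` and the labels are made up for illustration, and the sketch assumes every class appears at least once (otherwise it divides by zero).

```python
import numpy as np

def balanced_class_weights(labels, n_classes):
    """Inverse-frequency weights: total / (n_classes * count of that class).

    Assumes every class occurs at least once in `labels`.
    """
    counts = np.bincount(labels, minlength=n_classes)
    total = counts.sum()
    return {c: float(total / (n_classes * counts[c])) for c in range(n_classes)}

# Made-up imbalanced labels: three samples of class 0, one of class 1.
labels = np.array([0, 0, 0, 1])
class_weight = balanced_class_weights(labels, 2)
print(class_weight)
```

The resulting dictionary has the shape that the `class_weight` argument of `fit` expects, with the rarer class getting the larger weight. Note that class weights depend only on the true class, whereas the cost matrix used earlier in this notebook depends on the (predicted, true) pair.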

