<a href="https://colab.research.google.com/github/RachelRamirez/misclassification_matrix/blob/main/Potential_Misclassification_Cost_Matrix_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## What happens when I replace Weighted_Matrix with w_array[7, 9] = 1.5

Now that I've shown the TB and Isaranto methods are equivalent when seeded properly, the next step is to train a "good" neural network with the same seed 30 times to get an idea of what the typical confusion matrix looks like. I will run the code within Colab 30 times without restarting, because previous reproducibility studies showed that restarting and running all cells gives the same results every time. This time I'm interested in how much a confusion matrix varies when the neural network is seeded the same way on each run, so that when I give it another cost matrix to train on and run that 30 times, I can make a more informative comparison of the results. Since the Isaranto and TB methods were identical, I went with the Isaranto method because it was simpler, more efficient, and seemed less time-consuming.

Changes from the previous Reproducibility notebook:
1. Dropout is back in.
2. Batch size is smaller, to allow more variety between runs.
3. Number of epochs is more than 4, now that I care about achieving good overall accuracy.
4. An `EarlyStopping` callback is added.
5. `shuffle` during fit is still `False` (I'm calling it out to see if I need to change that).
6. But `model.fit(..., use_multiprocessing=True)`.
7. The cost matrix is still all 1's, except for `w_array[7, 9] = 1.5`.
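
Item 4 refers to the Keras `EarlyStopping` callback, which is configured later in this notebook with `monitor='val_loss', patience=3`. As a rough illustration of what `patience` means, here is a simplified pure-Python sketch. It is not the Keras implementation (it ignores options such as `min_delta`), `stopping_epoch` is a hypothetical helper name, and the loss values are made up.

```python
def stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified rule: stop once the monitored loss has failed to improve
    on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss   # new best value: reset the counter
            wait = 0
        else:
            wait += 1     # one more epoch without improvement
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses, not taken from an actual run.
losses = [0.50, 0.40, 0.41, 0.42, 0.43, 0.30]
stopped_at = stopping_epoch(losses, patience=3)
print(stopped_at)  # 5: epochs 3, 4 and 5 all fail to beat the best loss of 0.40
```

Because stopping depends on the validation loss, runs that differ slightly can stop at different epochs, which is one more source of run-to-run variation in the confusion matrix.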

### Reproducible Seeds

In [None]:
#For Reproducibility
import numpy as np
# np.random.seed(1337)  # for reproducibility

import tensorflow as tf
# tf.random.set_seed(33)

import random as python_random
# python_random.seed(4)

# https://www.tensorflow.org/api_docs/python/tf/keras/utils/set_random_seed
tf.keras.utils.set_random_seed(342) #Possibly use next iteration if the above doesn't work


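# Running more than once causes variation, so also try setting PYTHONHASHSEED.
# Note: PYTHONHASHSEED only affects hash randomization when it is set before the
# Python interpreter starts, so setting it here does not change the current process.
seed_value = 56
import os
os.environ['PYTHONHASHSEED']=str(seed_value)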

print("TF version: " , tf.__version__ )
print("Keras version: " , tf.keras.__version__ )

TF version:  2.9.2
Keras version:  2.9.0


### Import rest of Library

In [None]:

# from https://github.com/keras-team/keras/issues/2115#issuecomment-204060456
# with a correction on the weighted function in the middle

'''Train a simple deep NN on the MNIST dataset.
Get to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''

from __future__ import print_function  #do i still need this?
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
import keras.backend as K
from itertools import product
import functools
from functools import partial
from time import ctime
from time import sleep
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd 
from sklearn.metrics import confusion_matrix

## More repeatability settings are needed. If there's a way to update this to TensorFlow 2 APIs, great; otherwise I had to use TF 1.x compatibility code
# 5. Configure a new global `tensorflow` session (https://stackoverflow.com/questions/50659482/why-cant-i-get-reproducible-results-in-keras-even-though-i-set-the-random-seeds)
# from keras import backend as K


# I believe the code below helps results stay repeatable each time different sections of my Google Colab notebook execute
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

### Define batch, epochs, and format data

In [None]:
batch_size = 256 # I originally used a very large batch size to reduce batch-to-batch variation and make training more nearly identical (which it did); I then brought it back down to something reasonable to get better results training the NN
nb_classes = 10
nb_epoch = 15

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()


X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)


60000 train samples
10000 test samples


### Define weighted_categorical_crossentropy()

In [None]:

# # https://github.com/keras-team/keras/issues/2115#issuecomment-207765342

# def w_categorical_crossentropy(y_true, y_pred, weights):
#     nb_cl = len(weights)
#     final_mask = K.zeros_like(y_pred[:, 0])
#     y_pred_max = K.max(y_pred, axis=1)
#     y_pred_max = K.expand_dims(y_pred_max, 1)
#     y_pred_max_mat = K.equal(y_pred, y_pred_max)
 
#     for c_t, c_p in product(range(nb_cl), range(nb_cl)):
#         final_mask += (K.cast(weights[c_t, c_p],K.floatx()) * K.cast(y_pred_max_mat[:, c_p] ,K.floatx())* K.cast(y_true[:, c_t],K.floatx()))
    
#     # result = K.categorical_crossentropy(y_true, y_pred)*final_mask
#     # tf.print(result, "Show Result of CE * Final_Mask")  #this was basically useless to display, and it showed like, 500 lines of print statements each epoch

#     return K.categorical_crossentropy(y_true, y_pred)*final_mask   #I changed the order of y_true and y_pred




### Define Same Model but use normal Categorical CrossEntropy with no extra cost-matrix of Weights

In [None]:
def normal_method():

  model = Sequential()
  model.add(Dense(512, input_shape=(784,) ,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(10, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('softmax'))

  rms = RMSprop()
  # model.compile(loss=ncce, optimizer=rms)
  model.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer=rms, metrics='categorical_accuracy', )

  # add early stopping to prevent overfitting (the callback must be defined because fit() below uses it)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model_history = model.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test),shuffle=False, use_multiprocessing=True
            , callbacks = [callback])

  
  # model.evaluate(X_test, Y_test, verbose=1)  # I know this isn't the typical use of train/val/test sets, please don't comment on that
  
  #Predict
  y_prediction = model.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes) #If I want to do SparseCategoricalAccuracy

  #Create confusion matrix and normalizes it over predicted (columns)
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')  #if I want percentages instead of raw counts

  
  cm = confusion_matrix(y_test, y_prediction)
  cm = pd.DataFrame(cm, range(10),range(10))

  # This shows a pretty confusion matrix, which I don't need to show right now
  # plt.figure(figsize = (10,10))
  # sns.heatmap(cm, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()
  # cm_normal = cm

  return cm


### Weighted Categorical Cross Entropy Function

In [None]:
# from https://github.com/keras-team/keras/issues/2115#issuecomment-815825633 from Isaranto

def weighted_categorical_crossentropy_new(y_true, y_pred, weights):
          idx1 = K.argmax(y_pred, axis=1)
          idx2 = K.argmax(y_true, axis=1)
          mask = tf.gather_nd(weights, tf.stack((idx1, idx2), -1))
          return K.categorical_crossentropy(y_true, y_pred) * mask
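
The order of the indices passed to `tf.gather_nd` decides which cell of the weight matrix applies to a sample. In the function above, `idx1` comes from `y_pred` and `idx2` from `y_true`, so the lookup is `weights[predicted, true]`. The NumPy sketch below mimics only that lookup to make the order visible. It is an illustration rather than the Keras loss itself, and `lookup_weights` is a hypothetical helper name.

```python
import numpy as np


def lookup_weights(y_true, y_pred, weights):
    """NumPy stand-in for the `mask` computed inside the loss above."""
    idx_pred = np.argmax(y_pred, axis=1)  # plays the role of idx1
    idx_true = np.argmax(y_true, axis=1)  # plays the role of idx2
    # Same index order as tf.stack((idx1, idx2), -1): [predicted, true]
    return weights[idx_pred, idx_true]


w_array = np.ones((10, 10))
w_array[7, 9] = 1.5

eye = np.eye(10)
# Sample A: true class 9, predicted as 7.
# Sample B: true class 7, predicted as 9.
y_true = np.stack([eye[9], eye[7]])
y_pred = np.stack([eye[7], eye[9]])

mask = lookup_weights(y_true, y_pred, w_array)
print(mask.tolist())  # [1.5, 1.0]: only sample A picks up the extra weight
```

With this index order, `w_array[7, 9] = 1.5` up-weights true 9s that are predicted as 7. To penalize true 7s predicted as 9 instead, one option would be to set `w_array[9, 7]`, or to swap the two indices inside the loss.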

In [None]:
 #what does ncce stand for?

 
def isaranto_method():
  w_array = np.ones((10,10))
  w_array[7, 9] = 1.5     # the loss indexes weights as [predicted, true] (idx1 is the argmax of y_pred), so this weights true 9s predicted as 7
  # w_array = w_array - np.eye(10)
  # print("W_array:  ", w_array)

  weighted_list = w_array.tolist()

  wcce = partial(weighted_categorical_crossentropy_new, weights=weighted_list)
  wcce.__name__ ='w_categorical_crossentropy'

  model3 = Sequential()
  model3.add(Dense(512, input_shape=(784,), kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(10,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('softmax'))

  rms = RMSprop()

  model3.compile(loss=wcce, optimizer=rms,  metrics='categorical_accuracy',)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model3_history = model3.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test), shuffle=False, use_multiprocessing=True
            ,callbacks = [callback]
            )

 

  #Predict
  y_prediction = model3.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes)

  #Create confusion matrix and normalizes it over predicted (columns)
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')

  

  cm3 = confusion_matrix(y_test, y_prediction)
  cm3 = pd.DataFrame(cm3, range(10),range(10))
  # plt.figure(figsize = (10,10))
  # cm3
  # sns.heatmap(cm2, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()

  # cm_using_weighted_new = cm3

  return cm3

In [None]:
cm = np.zeros([10,10])


for i in range(0,30):
  print(i)
  cm2 =  isaranto_method()
  print("CM: \n", cm2)
  cm += cm2

cm_new = cm/30

0
Epoch 1/15
235/235 - 16s - loss: 0.3058 - categorical_accuracy: 0.9062 - val_loss: 0.2383 - val_categorical_accuracy: 0.9267 - 16s/epoch - 68ms/step
Epoch 2/15
235/235 - 6s - loss: 0.1190 - categorical_accuracy: 0.9639 - val_loss: 0.1467 - val_categorical_accuracy: 0.9552 - 6s/epoch - 27ms/step
Epoch 3/15
235/235 - 7s - loss: 0.0802 - categorical_accuracy: 0.9758 - val_loss: 0.0989 - val_categorical_accuracy: 0.9696 - 7s/epoch - 31ms/step
Epoch 4/15
235/235 - 6s - loss: 0.0616 - categorical_accuracy: 0.9814 - val_loss: 0.1025 - val_categorical_accuracy: 0.9703 - 6s/epoch - 28ms/step
Epoch 5/15
235/235 - 7s - loss: 0.0495 - categorical_accuracy: 0.9844 - val_loss: 0.0904 - val_categorical_accuracy: 0.9749 - 7s/epoch - 30ms/step
Epoch 6/15
235/235 - 7s - loss: 0.0407 - categorical_accuracy: 0.9875 - val_loss: 0.0819 - val_categorical_accuracy: 0.9776 - 7s/epoch - 29ms/step
Epoch 7/15
235/235 - 7s - loss: 0.0329 - categorical_accuracy: 0.9892 - val_loss: 0.0818 - val_categorical_accurac

In [None]:
print("This is the average confusion matrix using 30 runs")
round(cm_new,1)

This is the average confusion matrix using 30 runs


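| True \ Predicted | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 972.7 | 0.9 | 0.7 | 0.4 | 0.7 | 0.7 | 1.4 | 0.6 | 1.2 | 0.7 |
| 1 | 0.1 | 1128.5 | 1.3 | 1.2 | 0.1 | 0.5 | 1.3 | 0.1 | 1.8 | 0.0 |
| 2 | 3.2 | 2.4 | 1012.2 | 2.3 | 1.8 | 0.0 | 1.3 | 3.7 | 4.8 | 0.3 |
| 3 | 0.1 | 0.0 | 3.9 | 990.1 | 0.1 | 4.9 | 0.0 | 3.9 | 1.7 | 5.4 |
| 4 | 0.5 | 0.0 | 3.2 | 0.0 | 967.0 | 0.1 | 3.4 | 0.5 | 0.4 | 6.9 |
| 5 | 2.0 | 0.0 | 0.0 | 6.3 | 1.2 | 876.2 | 1.3 | 0.5 | 2.1 | 2.2 |
| 6 | 4.0 | 2.8 | 0.3 | 0.7 | 5.6 | 15.3 | 927.5 | 0.0 | 1.8 | 0.0 |
| 7 | 1.4 | 3.3 | 9.0 | 0.7 | 4.6 | 0.0 | 0.0 | 991.0 | 4.4 | 13.5 |
| 8 | 2.7 | 1.3 | 3.4 | 5.1 | 4.1 | 6.4 | 0.6 | 2.4 | 940.4 | 7.6 |
| 9 | 1.1 | 2.3 | 0.1 | 2.6 | 10.6 | 2.3 | 0.2 | 1.6 | 0.4 | 987.9 |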


The above confusion matrix matches the last saved version exactly!  Therefore the 30 replicates were exactly reproducible.  

In [None]:
cm_new

| True \ Predicted | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 972.666667 | 0.933333 | 0.7 | 0.433333 | 0.7 | 0.7 | 1.366667 | 0.6 | 1.166667 | 0.733333 |
| 1 | 0.133333 | 1128.466667 | 1.333333 | 1.2 | 0.1 | 0.5 | 1.266667 | 0.133333 | 1.833333 | 0.033333 |
| 2 | 3.2 | 2.4 | 1012.166667 | 2.333333 | 1.8 | 0.0 | 1.266667 | 3.7 | 4.8 | 0.333333 |
| 3 | 0.066667 | 0.033333 | 3.866667 | 990.1 | 0.1 | 4.866667 | 0.0 | 3.933333 | 1.666667 | 5.366667 |
| 4 | 0.533333 | 0.033333 | 3.2 | 0.033333 | 967.033333 | 0.1 | 3.366667 | 0.466667 | 0.366667 | 6.866667 |
| 5 | 2.033333 | 0.033333 | 0.033333 | 6.333333 | 1.2 | 876.233333 | 1.333333 | 0.5 | 2.133333 | 2.166667 |
| 6 | 4.0 | 2.833333 | 0.266667 | 0.7 | 5.633333 | 15.266667 | 927.466667 | 0.0 | 1.8 | 0.033333 |
| 7 | 1.366667 | 3.333333 | 9.0 | 0.7 | 4.633333 | 0.033333 | 0.0 | 991.0 | 4.433333 | 13.5 |
| 8 | 2.7 | 1.266667 | 3.366667 | 5.1 | 4.133333 | 6.433333 | 0.633333 | 2.366667 | 940.366667 | 7.633333 |
| 9 | 1.133333 | 2.266667 | 0.066667 | 2.566667 | 10.566667 | 2.266667 | 0.2 | 1.633333 | 0.366667 | 987.933333 |


In [None]:
import pickle
# Open a file and use dump()
with open('file.pkl', 'wb') as file:
      
    # A new file will be created
    pickle.dump(cm_new, file)




# Open the file in binary mode
with open('file.pkl', 'rb') as file:
      
    # Call load method to deserialize
    myvar_cm_7_9 = pickle.load(file)
  
    print(myvar_cm_7_9)


from google.colab import files
files.download( "file.pkl" )  

            0            1            2           3           4           5  \
0  972.666667     0.933333     0.700000    0.433333    0.700000    0.700000   
1    0.133333  1128.466667     1.333333    1.200000    0.100000    0.500000   
2    3.200000     2.400000  1012.166667    2.333333    1.800000    0.000000   
3    0.066667     0.033333     3.866667  990.100000    0.100000    4.866667   
4    0.533333     0.033333     3.200000    0.033333  967.033333    0.100000   
5    2.033333     0.033333     0.033333    6.333333    1.200000  876.233333   
6    4.000000     2.833333     0.266667    0.700000    5.633333   15.266667   
7    1.366667     3.333333     9.000000    0.700000    4.633333    0.033333   
8    2.700000     1.266667     3.366667    5.100000    4.133333    6.433333   
9    1.133333     2.266667     0.066667    2.566667   10.566667    2.266667   

            6           7           8           9  
0    1.366667    0.600000    1.166667    0.733333  
1    1.266667    0.133333 


Started at 10:43 AM  ended at 11:27 

In [None]:
ctime()

'Tue Feb  7 19:29:47 2023'

In [None]:
# How many categories are there in the test set?

truth_num_per_category = Y_test.sum(axis=0)

In [None]:
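# I'm actually surprised this ended up working ( I double checked in excel)
# Note: dividing a DataFrame by a 1-D array broadcasts along columns, so column j
# is divided by the number of true samples of class j. The diagonal is the per-class
# accuracy either way; for off-diagonal cells, cm_new.div(truth_num_per_category, axis=0)
# would divide each true-class row by its own count instead.
cm_percents = cm_new/truth_num_per_category

with open('cm_percents_7_9.pkl', 'wb') as file:
      
    # A new file will be created
    pickle.dump(cm_percents, file)




# Open the same file that was just written, in binary mode
with open('cm_percents_7_9.pkl', 'rb') as file:
      
    # Call load method to deserialize
    myvar_cm_percents = pickle.load(file)
  
    print(myvar_cm_percents)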


from google.colab import files
files.download( "cm_percents_7_9.pkl" )  

          0         1         2         3         4         5         6  \
0  0.992517  0.000822  0.000678  0.000429  0.000713  0.000785  0.001427   
1  0.000136  0.994244  0.001292  0.001188  0.000102  0.000561  0.001322   
2  0.003265  0.002115  0.980782  0.002310  0.001833  0.000000  0.001322   
3  0.000068  0.000029  0.003747  0.980297  0.000102  0.005456  0.000000   
4  0.000544  0.000029  0.003101  0.000033  0.984759  0.000112  0.003514   
5  0.002075  0.000029  0.000032  0.006271  0.001222  0.982324  0.001392   
6  0.004082  0.002496  0.000258  0.000693  0.005737  0.017115  0.968128   
7  0.001395  0.002937  0.008721  0.000693  0.004718  0.000037  0.000000   
8  0.002755  0.001116  0.003262  0.005050  0.004209  0.007212  0.000661   
9  0.001156  0.001997  0.000065  0.002541  0.010760  0.002541  0.000209   

          7         8         9  
0  0.000584  0.001198  0.000727  
1  0.000130  0.001882  0.000033  
2  0.003599  0.004928  0.000330  
3  0.003826  0.001711  0.005319  
4  0
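
Dividing a confusion-matrix DataFrame by a vector of true-class counts can go two ways in pandas, and the results differ off the diagonal. In the printed output above, for example, the row 0, column 1 entry is 0.000822, which is 0.933333 divided by 1135 (the total of row 1 of the average matrix), not by the total of row 0. The sketch below uses a made-up 2x2 matrix to show both behaviours. The names `by_column` and `by_row` are only for illustration.

```python
import numpy as np
import pandas as pd

# Toy confusion matrix: rows are true classes, columns are predicted classes.
cm = pd.DataFrame([[8, 2],
                   [5, 15]])
true_counts = cm.sum(axis=1).to_numpy()  # samples per true class: [10, 20]

# Plain `/` with a 1-D array lines the array up with the columns,
# so column j is divided by true_counts[j].
by_column = cm / true_counts

# .div(..., axis=0) lines the array up with the rows instead,
# so every true-class row is divided by its own count and sums to 1.
by_row = cm.div(true_counts, axis=0)

print(by_column)
print(by_row)
```

The diagonal is the same in both results, so per-class accuracy is unaffected. Only `by_row` reads as "fraction of true class i that was predicted as class j".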


The highest Classification Accuracy is 

# Analyze 

I am now going to load the average CM and get it into a format where I can make it a 1x100 array, then load all 30 CMs so that we can visualize their distributions in a big histogram matrix. At this point the Google Colab variables are gone and I have to reload them.
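
As a sketch of the target format, the snippet below stacks 30 confusion matrices into a 30x100 table with one column per (true, predicted) cell. It assumes each run's matrix has been kept in a list, which the earlier averaging loop does not do (it only accumulates the sum). The matrices here are random placeholders, and the names `all_cms`, `runs` and `summary` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Placeholder data: 30 random 10x10 matrices standing in for the per-run
# confusion matrices that isaranto_method() would return.
all_cms = [rng.integers(0, 50, size=(10, 10)) for _ in range(30)]

# Flatten each matrix row by row: cell (t, p) lands at position 10 * t + p.
flat = np.stack([np.asarray(cm).reshape(100) for cm in all_cms])

# One flat label per cell, e.g. "7_t|9_p" for true 7 predicted as 9.
labels = [f"{t}_t|{p}_p" for t in range(10) for p in range(10)]
runs = pd.DataFrame(flat, columns=labels)

print(runs.shape)  # (30, 100)

# Per-cell mean and standard deviation across the 30 runs.
summary = runs.agg(["mean", "std"]).T
print(summary.loc["7_t|9_p"])
```

With one row per run, each column can be histogrammed on its own. A single averaged row, by contrast, gives only one point per cell and no distribution to plot.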

In [54]:
import pickle 
with open('file.pkl', 'rb') as file:
      
    # Call load method to deserialize
    myvar_cm_average = pickle.load(file)
  
    print(myvar_cm_average)  #this is a dataframe

            0            1            2           3           4           5  \
0  972.666667     0.933333     0.700000    0.433333    0.700000    0.700000   
1    0.133333  1128.466667     1.333333    1.200000    0.100000    0.500000   
2    3.200000     2.400000  1012.166667    2.333333    1.800000    0.000000   
3    0.066667     0.033333     3.866667  990.100000    0.100000    4.866667   
4    0.533333     0.033333     3.200000    0.033333  967.033333    0.100000   
5    2.033333     0.033333     0.033333    6.333333    1.200000  876.233333   
6    4.000000     2.833333     0.266667    0.700000    5.633333   15.266667   
7    1.366667     3.333333     9.000000    0.700000    4.633333    0.033333   
8    2.700000     1.266667     3.366667    5.100000    4.133333    6.433333   
9    1.133333     2.266667     0.066667    2.566667   10.566667    2.266667   

            6           7           8           9  
0    1.366667    0.600000    1.166667    0.733333  
1    1.266667    0.133333 

In [56]:
# I want to change the names of the Columns and Rows so that it may improve the default ScatterPlot Matrix options:
# import pandas
list(myvar_cm_average)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [93]:
myvar_cm_average.columns = ['0_p', '1_p', '2_p', '3_p', '4_p', '5_p', '6_p', '7_p', '8_p', '9_p']
myvar_cm_average.index = ['0_t', '1_t', '2_t', '3_t', '4_t', '5_t', '6_t', '7_t', '8_t', '9_t']

print(myvar_cm_average)

cm_average_array = np.asarray(myvar_cm_average)
cm_average_array = np.reshape(cm_average_array,(1,100))
print(cm_average_array)

df = myvar_cm_average
df_new = pd.DataFrame(cm_average_array,  columns=pd.MultiIndex.from_product([ df.index,df.columns]), index=[1])
print(df_new)


             0_p          1_p          2_p         3_p         4_p  \
0_t   972.666667     0.933333     0.700000    0.433333    0.700000   
1_t     0.133333  1128.466667     1.333333    1.200000    0.100000   
2_t     3.200000     2.400000  1012.166667    2.333333    1.800000   
3_t     0.066667     0.033333     3.866667  990.100000    0.100000   
4_t     0.533333     0.033333     3.200000    0.033333  967.033333   
5_t     2.033333     0.033333     0.033333    6.333333    1.200000   
6_t     4.000000     2.833333     0.266667    0.700000    5.633333   
7_t     1.366667     3.333333     9.000000    0.700000    4.633333   
8_t     2.700000     1.266667     3.366667    5.100000    4.133333   
9_t     1.133333     2.266667     0.066667    2.566667   10.566667   

             5_p         6_p         7_p         8_p         9_p  
0_t     0.700000    1.366667    0.600000    1.166667    0.733333  
1_t     0.500000    1.266667    0.133333    1.833333    0.033333  
2_t     0.000000    1.266667

In [94]:
import plotly.express as px
# df = px.data.iris()

fig = px.scatter_matrix(df_new)
fig.show()

ValueError: ignored

In [None]:
import numpy as np

cm_average_array = np.asarray(myvar_cm_average).reshape(100) #I dont need to specify (1,100), just 100 should work

np.set_printoptions(suppress=True)
np.set_printoptions(precision=2)

print("After suppressing scientific notation and displaying up to precision 2, the results are below, as a 1x100 array")
cm_average_array



After suppressing scientific notation and displaying up to precision 2, the results are below, as a 1x100 array


array([ 972.67,    0.93,    0.7 ,    0.43,    0.7 ,    0.7 ,    1.37,
          0.6 ,    1.17,    0.73,    0.13, 1128.47,    1.33,    1.2 ,
          0.1 ,    0.5 ,    1.27,    0.13,    1.83,    0.03,    3.2 ,
          2.4 , 1012.17,    2.33,    1.8 ,    0.  ,    1.27,    3.7 ,
          4.8 ,    0.33,    0.07,    0.03,    3.87,  990.1 ,    0.1 ,
          4.87,    0.  ,    3.93,    1.67,    5.37,    0.53,    0.03,
          3.2 ,    0.03,  967.03,    0.1 ,    3.37,    0.47,    0.37,
          6.87,    2.03,    0.03,    0.03,    6.33,    1.2 ,  876.23,
          1.33,    0.5 ,    2.13,    2.17,    4.  ,    2.83,    0.27,
          0.7 ,    5.63,   15.27,  927.47,    0.  ,    1.8 ,    0.03,
          1.37,    3.33,    9.  ,    0.7 ,    4.63,    0.03,    0.  ,
        991.  ,    4.43,   13.5 ,    2.7 ,    1.27,    3.37,    5.1 ,
          4.13,    6.43,    0.63,    2.37,  940.37,    7.63,    1.13,
          2.27,    0.07,    2.57,   10.57,    2.27,    0.2 ,    1.63,
          0.37,  987

# Extraneous

To reference later: 

https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/imbalanced_data.ipynb#scrollTo=UJ589fn8ST3x

To train a model with class weights:

```
class_weight = {0: weight_for_0, 1: weight_for_1}

weighted_model = make_model()
weighted_model.load_weights(initial_weights)

weighted_history = weighted_model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[early_stopping],
    validation_data=(val_features, val_labels),

    # The class weights go here
    class_weight=class_weight)
```
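
The snippet above leaves `weight_for_0` and `weight_for_1` undefined. One common choice is inverse-frequency weighting, sketched below with NumPy. This is an assumption about how the weights might be chosen, not something this notebook uses. `inverse_frequency_weights` is a hypothetical helper name and the labels are made up.

```python
import numpy as np


def inverse_frequency_weights(labels):
    """Map each class to n_samples / (n_classes * class_count)."""
    classes, counts = np.unique(labels, return_counts=True)
    total = counts.sum()
    return {int(c): float(total / (len(classes) * n))
            for c, n in zip(classes, counts)}


# Made-up imbalanced labels: 90 samples of class 0 and 10 of class 1.
labels = np.array([0] * 90 + [1] * 10)
class_weight = inverse_frequency_weights(labels)
print(class_weight)  # class 1 gets weight 5.0, class 0 roughly 0.56
```

Note that `class_weight` scales the loss by true class only. Unlike the cost matrix used in this notebook, it cannot single out one specific (true, predicted) pair.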

