<a href="https://colab.research.google.com/github/RachelRamirez/misclassification_matrix/blob/main/PA_Misclassification_Cost_Matrix_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## What happens when I use PA method with Weighted_Matrix with w_array[7, 9] = 1

The results of running the ISantaro implementation appeared counterintuitive: when I increased the cost of a misclassification, more misclassifications were made. It was hard to see at first because it did not happen consistently (7 out of 30 runs), but when it did happen, the number of misclassifications was very large. So now I'm comparing against another code implementation, by Phil Alton, here: https://stackoverflow.com/a/61963004

Remember to change the [Admin File stuff] below and the Weight Matrix before running.

### Reproducible Seeds

In [None]:
#For Reproducibility
import numpy as np
# np.random.seed(1337)  # for reproducibility

import tensorflow as tf
# tf.random.set_seed(33)

import random as python_random
# python_random.seed(4)

# https://www.tensorflow.org/api_docs/python/tf/keras/utils/set_random_seed
tf.keras.utils.set_random_seed(342) #Possibly use next iteration if the above doesn't work


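# Running more than once still causes variation, so also try setting PYTHONHASHSEED.
# Note: Python reads PYTHONHASHSEED at interpreter startup, so setting it here
# probably only affects processes started after this point.
# Set seed value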
seed_value = 56
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

print("TF version: " , tf.__version__ )
print("Keras version: " , tf.keras.__version__ )

TF version:  2.11.0
Keras version:  2.11.0


### Import rest of Library

In [None]:

# from https://github.com/keras-team/keras/issues/2115#issuecomment-204060456
# with a correction to the weighted function in the middle

'''Train a simple deep NN on the MNIST dataset.
Get to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''

from __future__ import print_function  # not needed on Python 3; harmless to keep
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
import keras.backend as K
from itertools import product
import functools
from functools import partial
from time import ctime
from time import sleep
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd 
from sklearn.metrics import confusion_matrix

## More repeatability setup. This uses TF 1.x-style session code through tf.compat.v1.
## A TF 2.x alternative may be tf.config.threading.set_intra_op_parallelism_threads(1) and
## tf.config.threading.set_inter_op_parallelism_threads(1), called before any ops run (untested here).
# 5. Configure a new global `tensorflow` session (https://stackoverflow.com/questions/50659482/why-cant-i-get-reproducible-results-in-keras-even-though-i-set-the-random-seeds)
# from keras import backend as K


# I believe the code below helps results stay repeatable each time different sections of my Google Colab notebook execute
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

### Define batch, epochs, and format data

In [None]:
# I originally used a very large batch size to reduce batch-to-batch variation,
# hoping it would make training runs more nearly identical (it did). I then
# brought it back down to something reasonable to get better training results.
batch_size = 256
nb_classes = 10
nb_epoch = 15

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()


X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
60000 train samples
10000 test samples


### Define Same Model but use normal Categorical CrossEntropy with no extra cost-matrix of Weights

In [None]:
def normal_method():

  model = Sequential()
  model.add(Dense(512, input_shape=(784,) ,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('relu'))
  model.add(Dropout(0.2))
  model.add(Dense(10, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model.add(Activation('softmax'))

  rms = RMSprop()
  # model.compile(loss=ncce, optimizer=rms)
  model.compile(loss=tf.keras.losses.CategoricalCrossentropy(), optimizer=rms, metrics='categorical_accuracy', )

  # Early stopping to prevent overfitting (same callback as in PA_method).
  # This must be defined, because it is passed to model.fit below.
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model_history = model.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test),shuffle=False, use_multiprocessing=True
            , callbacks = [callback])

  
  # model.evaluate(X_test, Y_test, verbose=1)  # I know this isn't the typical use of train/val/test sets, please don't comment on that
  
  #Predict
  y_prediction = model.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes) #If I want to do SparseCategoricalAccuracy

  #Create confusion matrix and normalizes it over predicted (columns)
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')  #if I want percentages instead of raw counts

  
  cm = confusion_matrix(y_test, y_prediction)
  cm = pd.DataFrame(cm, range(10),range(10))

  # This shows a pretty confusion matrix, which I don't need to show right now
  # plt.figure(figsize = (10,10))
  # sns.heatmap(cm, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()
  # cm_normal = cm

  return cm


### Define Cost Matrix

In [None]:
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cost_matrix = np.ones((10,10))
cost_matrix[7, 9] = 1    
## ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
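
With every entry equal to 1, this cost matrix is the control case: each sample's weight is 1, so a weighted cross-entropy should reduce to the ordinary one. Below is a minimal NumPy sketch of that check. It assumes rows index the true class and columns the predicted class (as in `get_sample_weights`), and it assumes the weighted per-sample losses are simply averaged over the batch. The toy labels and probabilities are made up.

```python
import numpy as np

num_classes = 10
cost_matrix = np.ones((num_classes, num_classes))
cost_matrix[7, 9] = 1  # control: same cost as every other cell

# Toy batch (made-up values): true labels and softmax-like predictions.
rng = np.random.default_rng(0)
y_true = np.array([7, 7, 3, 9])
logits = rng.normal(size=(4, num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_pred = probs.argmax(axis=1)

# Per-sample cross-entropy: -log(probability assigned to the true class).
ce = -np.log(probs[np.arange(4), y_true])

# Per-sample weight: cost_matrix[true class, predicted class].
weights = cost_matrix[y_true, y_pred]

plain_loss = ce.mean()
weighted_loss = (weights * ce).mean()
print(plain_loss, weighted_loss)
```

If the two printed values differ, the cost matrix is no longer the all-ones control.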


### Weighted Categorical Cross Entropy Class

In [None]:
class WeightedCategoricalCrossentropy(tf.keras.losses.CategoricalCrossentropy):

  def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):

    cost_mat = np.array(cost_mat)
    ## When loading from config, cost_mat arrives as a list rather than a NumPy array
    ## (presumably because the config is serialized when the model is saved).
    ## Converting it here lets .ndim be called successfully.
    ## However, this is probably not the best implementation.
    assert(cost_mat.ndim == 2)
    assert(cost_mat.shape[0] == cost_mat.shape[1])
    super().__init__(name=name, **kwargs)
    self.cost_mat = K.cast_to_floatx(cost_mat)

  def __call__(self, y_true, y_pred, sample_weight=None):
    assert sample_weight is None, "should only be derived from the cost matrix"  
    return super().__call__(
        y_true=y_true, 
        y_pred=y_pred, 
        sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
    )


  def get_config(self):
    config = super().get_config().copy()
    # Calling .update on the line above, during assignment, causes an error with config becoming None-type.
    config.update({'cost_mat': (self.cost_mat)})
    return config

  @classmethod
  def from_config(cls, config):
    # After a save/load round trip, cost_mat comes back as a list here.
    # See the np.array() conversion in __init__ for the temporary fix.
    return cls(**config)

def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    assert(y_pred.shape[1] == num_classes)
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n


# Register the loss in the Keras namespace to enable loading of the custom object.
tf.keras.losses.WeightedCategoricalCrossentropy = WeightedCategoricalCrossentropy
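
The broadcasting in `get_sample_weights` can be easier to follow outside of Keras. The sketch below mirrors the same steps in plain NumPy and checks that the result equals a direct lookup `cost_m[true_class, predicted_class]`. The function name `sample_weights_numpy`, the penalty of 5, and the toy batch are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def sample_weights_numpy(y_true, y_pred, cost_m):
    """NumPy mirror of get_sample_weights (illustrative only)."""
    num_classes = len(cost_m)
    # Harden predictions to one-hot via argmax, as the Keras version does.
    y_pred_hard = np.eye(num_classes)[y_pred.argmax(axis=1)]
    y_true_nk1 = y_true[:, :, None]        # shape (n, k, 1)
    y_pred_n1k = y_pred_hard[:, None, :]   # shape (n, 1, k)
    cost_m_1kk = cost_m[None, :, :]        # shape (1, k, k)
    # Only the (true, predicted) cell survives the product for each sample.
    return (cost_m_1kk * y_true_nk1 * y_pred_n1k).sum(axis=(1, 2))

cost_m = np.ones((10, 10))
cost_m[7, 9] = 5.0  # hypothetical penalty, for illustration only

true_cls = np.array([7, 7, 2])
pred_cls = np.array([9, 7, 2])
y_true = np.eye(10)[true_cls]
y_pred = np.eye(10)[pred_cls] * 0.9 + 0.01  # soft scores peaking at pred_cls

weights = sample_weights_numpy(y_true, y_pred, cost_m)
print(weights)
```

Only the first sample (a true 7 predicted as 9) picks up the larger weight. Because the weight is derived from `argmax`, it appears to act only as a per-sample rescaling of the cross-entropy rather than as a differentiable term of its own.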
 

### *WeightedCategoricalCross Entropy Function 

In [None]:
def PA_method(cost_matrix):

  model3 = Sequential()
  model3.add(Dense(512, input_shape=(784,), kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(512, kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('relu'))
  model3.add(Dropout(0.2))
  model3.add(Dense(10,kernel_initializer=tf.keras.initializers.glorot_uniform(seed=42)))
  model3.add(Activation('softmax'))

  rms = RMSprop()

  model3.compile(loss=WeightedCategoricalCrossentropy(cost_matrix), optimizer=rms,  metrics='categorical_accuracy',)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

  model3_history = model3.fit(X_train, Y_train,
            batch_size=batch_size, epochs=nb_epoch, verbose=2,
            validation_data=(X_test, Y_test), shuffle=False, use_multiprocessing=True
            ,callbacks = [callback]
            )

 

  #Predict
  y_prediction = model3.predict(X_test)
  y_prediction  = np.argmax(y_prediction, axis=1)
  # Y_prediction = np_utils.to_categorical(y_prediction, nb_classes)

  #Create confusion matrix and normalizes it over predicted (columns)
  # result = confusion_matrix(y_test, y_prediction , normalize='pred')

  

  cm3 = confusion_matrix(y_test, y_prediction)
  cm3 = pd.DataFrame(cm3, range(10),range(10))
  # plt.figure(figsize = (10,10))
  # cm3
  # sns.heatmap(cm2, annot=True, annot_kws={"size": 12}) # font size
  # plt.show()

  # cm_using_weighted_new = cm3

  return cm3

### *Keep Track of Experimental Admin Stuff

> Change the file extension name to match the weights



In [None]:
# Experimental Admin Stuff
from datetime import date

today = date.today()
 

file_date = today.strftime("%Y_%m_%d")
# print("d5 =", d5)

file_extension = "w[7,9]_normal_1"



np.set_printoptions(suppress=True)
np.set_printoptions(precision=2)

#### Run Experiments

In [None]:
cm = np.zeros([10,10])
combined_cms = np.empty((1,100))  # placeholder first row (uninitialized values); removed before analysis

### ~~~~~~~~~
runs = 5 
### ~~~~~~~~~~

for i in range(runs):  # exactly `runs` iterations (range(0, runs+1) ran one extra)
  print(i)
  cm2 = PA_method(cost_matrix)    # Individual CM; PA_method requires the cost matrix argument
  print("CM: \n", cm2)
  # cm += cm2                   #Aggregating for an Average
  cm2_array = np.asarray(cm2)  #Indiv CM as array for storing
  combined_cms = np.vstack((combined_cms,cm2_array.reshape((1,100))))

# cm_new = cm/30

0
Epoch 1/15
235/235 - 10s - loss: 0.3171 - categorical_accuracy: 0.9057 - val_loss: 0.2812 - val_categorical_accuracy: 0.9134 - 10s/epoch - 45ms/step
Epoch 2/15
235/235 - 2s - loss: 0.1232 - categorical_accuracy: 0.9641 - val_loss: 0.2009 - val_categorical_accuracy: 0.9369 - 2s/epoch - 7ms/step
Epoch 3/15
235/235 - 1s - loss: 0.0843 - categorical_accuracy: 0.9754 - val_loss: 0.0933 - val_categorical_accuracy: 0.9719 - 1s/epoch - 6ms/step
Epoch 4/15
235/235 - 1s - loss: 0.0638 - categorical_accuracy: 0.9814 - val_loss: 0.0889 - val_categorical_accuracy: 0.9733 - 1s/epoch - 5ms/step
Epoch 5/15
235/235 - 1s - loss: 0.0486 - categorical_accuracy: 0.9857 - val_loss: 0.0831 - val_categorical_accuracy: 0.9762 - 906ms/epoch - 4ms/step
Epoch 6/15
235/235 - 1s - loss: 0.0411 - categorical_accuracy: 0.9875 - val_loss: 0.0743 - val_categorical_accuracy: 0.9783 - 945ms/epoch - 4ms/step
Epoch 7/15
235/235 - 1s - loss: 0.0363 - categorical_accuracy: 0.9887 - val_loss: 0.0860 - val_categorical_accura

#### Save the 30 confusion matrices

In [None]:
#Save 30 confusion matrices

import pickle

str_runs = str(runs)

file_name = str_runs + "CM_" + file_extension + "_" + file_date + "_.pkl"
print(file_name)

with open(file_name, 'wb') as file:
      
    # A new file will be created
    pickle.dump(combined_cms, file)




# Open the file in binary mode
with open(file_name, 'rb') as file:
      
    # Call load method to deserialize
    var = pickle.load(file)
  
    print(var)


from google.colab import files
files.download( file_name )  

30CM_w[7,9]_3_2023_02_14_.pkl
[[6.91e-310 6.91e-310 2.77e-315 ... 4.78e+180 1.82e-152 1.17e+214]
 [9.72e+002 1.00e+000 1.00e+000 ... 2.00e+000 1.00e+000 9.80e+002]
 [9.70e+002 1.00e+000 1.00e+000 ... 2.00e+000 2.00e+000 9.86e+002]
 ...
 [9.71e+002 1.00e+000 1.00e+000 ... 1.00e+000 0.00e+000 9.92e+002]
 [9.68e+002 1.00e+000 2.00e+000 ... 2.00e+000 0.00e+000 9.88e+002]
 [9.74e+002 1.00e+000 1.00e+000 ... 1.00e+000 0.00e+000 9.92e+002]]



In [None]:
# How many categories are there in the test set?

truth_num_per_category = Y_test.sum(axis=0)
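
Since `Y_test` is one-hot, its column sums give the number of test images per digit, and those counts should equal the row sums of any confusion matrix built from `y_test`. Here is a small sketch with made-up labels (3 classes instead of 10) showing how the counts can turn raw confusion counts into per-class rates:

```python
import numpy as np

num_classes = 3
y_true = np.array([0, 0, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 2, 2, 0])

Y_true = np.eye(num_classes)[y_true]   # one-hot, like Y_test
truth_counts = Y_true.sum(axis=0)      # samples per true class

# Confusion matrix: rows = true class, columns = predicted class.
cm = np.zeros((num_classes, num_classes))
np.add.at(cm, (y_true, y_pred), 1)

# Divide each row by its true-class count to get rates.
rates = cm / truth_counts[:, None]
print(truth_counts, rates[2])
```

Each row of `rates` sums to 1, which makes cells comparable across digits with different test-set counts.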

# Analyze 

I am now going to load the average CM and reshape it into a 1x100 row, then load all 30 CMs in the same format so that we can visualize their distributions in a big histogram matrix. At this point the Google Colab variables are gone and I have to reload them.
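
A sketch of this flattening, assuming row-major order so that the cell for true class `i` and predicted class `j` lands in column `10*i + j`. The `it_jp` column naming is one possible labelling scheme, and the two matrices are made-up stand-ins for real runs.

```python
import numpy as np

n = 10
# Column names such as '7t_9p' mean: true 7, predicted 9.
col_names = [f"{i}t_{j}p" for i in range(n) for j in range(n)]

# Two made-up confusion matrices standing in for two training runs.
cm_a = np.diag(np.full(n, 100))
cm_b = cm_a.copy()
cm_b[7, 9] = 4    # four 7s predicted as 9
cm_b[7, 7] = 96

# Each matrix becomes one row of 100 values; runs stack vertically.
flat = np.vstack([cm.reshape(1, n * n) for cm in (cm_a, cm_b)])

idx = col_names.index("7t_9p")
print(flat.shape, idx, flat[:, idx])
```

Reading one column of `flat` then gives the distribution of a single confusion-matrix cell across runs, which is what a histogram per cell would plot.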

In [None]:
# I need to remove the first placeholder row (it was created with np.empty, so it holds uninitialized values, not zeros)
combined_cms = combined_cms[1:31]
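
The placeholder row is only needed because the array was started with `np.empty((1,100))`. A possible alternative, sketched here with a hypothetical `fake_run()` standing in for `PA_method(cost_matrix)`, is to collect the flattened matrices in a list and stack them once at the end:

```python
import numpy as np

def fake_run(seed):
    """Hypothetical stand-in for PA_method(cost_matrix): returns a 10x10 count matrix."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 5, size=(10, 10))

runs = 5
rows = []
for i in range(runs):
    cm_i = np.asarray(fake_run(i))
    rows.append(cm_i.reshape(100))   # one flattened matrix per run

combined = np.stack(rows)            # shape (runs, 100), no placeholder row to drop
print(combined.shape)
```

With this pattern the number of rows always equals the number of runs, so no hard-coded slice is needed afterwards.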

In [None]:
import numpy as np
import pandas as pd
empty_cm = np.zeros((10,10))  # np.empty would leave the values uninitialized (arbitrary), so np.zeros is used for a blank template
empty_cm=pd.DataFrame(empty_cm)

empty_cm.columns = ['0p', '1p', '2p', '3p', '4p', '5p', '6p', '7p', '8p', '9p']
empty_cm.index = ['0t', '1t', '2t', '3t', '4t', '5t', '6t', '7t', '8t', '9t']

# print(myvar_cm_average)

empty_cm_array = np.asarray(empty_cm)
empty_cm_array_1_100 = np.reshape(empty_cm_array,(1,100))
# print(cm_average_array)

df = empty_cm
df_new = pd.DataFrame(empty_cm_array_1_100,  columns=pd.MultiIndex.from_product([ df.index,df.columns]))
df_new.columns.to_flat_index()
df_new.columns   = ['_'.join(col) for col in df_new.columns.values]
print(df_new)


# Now convert combined_cms of size 30x100 to a pandas DataFrame

# Pass the column Index directly; wrapping it in a list would create a one-level MultiIndex
combined_cms_df = pd.DataFrame(combined_cms, columns=df_new.columns)

   0t_0p  0t_1p  0t_2p  0t_3p  0t_4p  0t_5p  0t_6p  0t_7p  0t_8p  0t_9p  ...  \
0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0  ...   

   9t_0p  9t_1p  9t_2p  9t_3p  9t_4p  9t_5p  9t_6p  9t_7p  9t_8p  9t_9p  
0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0  

[1 rows x 100 columns]


In [None]:
csv_filename = file_name[:-4] + ".csv"

combined_cms_df.to_csv(csv_filename)


from google.colab import files
files.download(csv_filename )


# Extraneous

To reference later: 

https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/imbalanced_data.ipynb#scrollTo=UJ589fn8ST3x

To train a model with class weights:

```
class_weight = {0: weight_for_0, 1: weight_for_1}

weighted_model = make_model()
weighted_model.load_weights(initial_weights)

weighted_history = weighted_model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[early_stopping],
    validation_data=(val_features, val_labels),

    # The class weights go here
    class_weight=class_weight)
```



In [None]:
# Extraneous information I am not using at the moment

# model.compile(
#      optimizer='adam',
#      loss=WeightedCategoricalCrossentropy(cost_matrix)
#      )

## Model Saving

# model.save(save_version_dir,save_format='tf')

## Model Loading

# model = tf.keras.models.load_model(
#     save_version_dir,
#     compile=True,
#     custom_objects={
#         'WeightedCategoricalCrossentropy': WeightedCategoricalCrossentropy(cost_matrix)
#         }
#     )
 