<a href="https://colab.research.google.com/github/jaeyoung-jae-park/Joint-model-assisted-Decision-Rule/blob/main/Numerical_experiments_Toxic_comments.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data preparation

## Data Loading

There are two ways to obtain the Toxic Comments dataset: 1) download it with the Kaggle API, or 2) upload it from your local drive. We recommend Option 1 because uploading from a local drive takes considerably longer.

To use the Kaggle API, you first need to create a new API token. The following article explains how to create one: https://galhever.medium.com/how-to-import-data-from-kaggle-to-google-colab-8160caa11e2.

If you want to load it from your local drive, please download the dataset here: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data

We only need train.csv for this experiment.


### Using Kaggle API
Upload your Kaggle API token and then run the following code.

Procedure
1.   Run the next block.
2.   Click the "Choose Files" button.
3.   Upload kaggle.json.



In [1]:
from google.colab import files
files.upload()

Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"jaeyoungpark","key":"<redacted>"}'}

In [2]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/

In [3]:
!chmod 600 ~/.kaggle/kaggle.json

In [4]:
!kaggle competitions download -c jigsaw-toxic-comment-classification-challenge

Downloading train.csv.zip to /content
 65% 17.0M/26.3M [00:00<00:01, 9.71MB/s]
100% 26.3M/26.3M [00:00<00:00, 30.2MB/s]
Downloading test.csv.zip to /content
 38% 9.00M/23.4M [00:00<00:01, 9.03MB/s]
100% 23.4M/23.4M [00:00<00:00, 25.4MB/s]
Downloading test_labels.csv.zip to /content
  0% 0.00/1.46M [00:00<?, ?B/s]
100% 1.46M/1.46M [00:00<00:00, 95.8MB/s]
Downloading sample_submission.csv.zip to /content
  0% 0.00/1.39M [00:00<?, ?B/s]
100% 1.39M/1.39M [00:00<00:00, 92.8MB/s]


In [5]:
!ls

kaggle.json  sample_submission.csv.zip	test_labels.csv.zip
sample_data  test.csv.zip		train.csv.zip


In [6]:
!mkdir toxic
!unzip sample_submission.csv.zip -d toxic
!unzip test_labels.csv.zip -d toxic
!unzip test.csv.zip -d toxic
!unzip train.csv.zip -d toxic

Archive:  sample_submission.csv.zip
  inflating: toxic/sample_submission.csv  
Archive:  test_labels.csv.zip
  inflating: toxic/test_labels.csv   
Archive:  test.csv.zip
  inflating: toxic/test.csv          
Archive:  train.csv.zip
  inflating: toxic/train.csv         


In [7]:
import pandas as pd
import numpy as np

train = pd.read_csv("/content/toxic/train.csv") # feel free to change the location

### Loading the dataset from your local drive

In [8]:
from google.colab import files
uploaded = files.upload()  # keep the returned dict; the next cell reads uploaded['train.csv']

Saving train.csv to train.csv


KeyboardInterrupt: ignored

In [None]:
import io
import pandas as pd
import numpy as np

train = pd.read_csv(io.BytesIO(uploaded['train.csv']))

### Checking that the dataset loaded correctly

The code below shows the distribution of each label.

In [8]:
np.hstack((train['toxic'].value_counts()[1], 
           train['severe_toxic'].value_counts()[1], 
           train['obscene'].value_counts()[1],
           train['threat'].value_counts()[1],
           train['insult'].value_counts()[1],
           train['identity_hate'].value_counts()[1]))

array([15294,  1595,  8449,   478,  7877,  1405])

In [None]:
train['toxic'].value_counts()/np.sum(train['toxic'].value_counts())

0    0.904156
1    0.095844
Name: toxic, dtype: float64

In [None]:
train['severe_toxic'].value_counts()/np.sum(train['severe_toxic'].value_counts())

0    0.990004
1    0.009996
Name: severe_toxic, dtype: float64

In [None]:
train['obscene'].value_counts()/np.sum(train['obscene'].value_counts())

0    0.947052
1    0.052948
Name: obscene, dtype: float64

In [None]:
train['threat'].value_counts()/np.sum(train['threat'].value_counts())

0    0.997004
1    0.002996
Name: threat, dtype: float64

In [None]:
train['insult'].value_counts()/np.sum(train['insult'].value_counts())

0    0.950636
1    0.049364
Name: insult, dtype: float64

In [None]:
train['identity_hate'].value_counts()/np.sum(train['identity_hate'].value_counts())

0    0.991195
1    0.008805
Name: identity_hate, dtype: float64
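
The per-label counts and proportions above can also be collected in a single table. The following is a minimal sketch on a toy DataFrame, assuming the six label columns are 0/1 indicators as in train.csv; the helper name `label_summary` and the toy data are illustrative and not part of the original notebook.

```python
import pandas as pd

label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def label_summary(df, cols=label_cols):
    """Return the number and the share of positive rows for each 0/1 label column."""
    return pd.DataFrame({"positives": df[cols].sum(), "proportion": df[cols].mean()})

# Toy stand-in for train.csv (hypothetical values, 4 comments)
toy = pd.DataFrame({
    "toxic":         [1, 1, 0, 0],
    "severe_toxic":  [0, 0, 0, 0],
    "obscene":       [1, 0, 0, 0],
    "threat":        [0, 0, 0, 0],
    "insult":        [0, 1, 0, 0],
    "identity_hate": [0, 0, 0, 0],
})
summary = label_summary(toy)
print(summary)
```

On the real data, `label_summary(train)` would be expected to reproduce the counts and proportions printed by the cells above.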

In [None]:
pd.crosstab(train['toxic'], train['obscene'])

| toxic \ obscene | 0 | 1 |
|---|---|---|
| 0 | 143754 | 523 |
| 1 | 7368 | 7926 |


In [None]:
pd.crosstab(train['toxic'], train['insult'])

| toxic \ insult | 0 | 1 |
|---|---|---|
| 0 | 143744 | 533 |
| 1 | 7950 | 7344 |


## Label setting
Target - toxic comments

Auxiliary outcomes - obscene and insult

In [9]:
X = train['comment_text']
y = train[['toxic','severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']]
y_aux = train[['toxic', 'obscene', 'insult']]
y_aux_agg = pd.concat([y_aux, y_aux.iloc[:,0] * y_aux.iloc[:,1],
                        y_aux.iloc[:,0] * y_aux.iloc[:,2]], axis=1)
y_tgt = train['toxic']
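
The last two columns of `y_aux_agg` are the element-wise products toxic × obscene and toxic × insult, that is, indicators that both labels are 1 for the same comment. The sketch below rebuilds them on toy data with explicit column names; the names `toxic_obscene` and `toxic_insult` are assumptions for readability (the notebook leaves these columns named 0 and 1).

```python
import pandas as pd

# Toy labels (hypothetical values)
toy = pd.DataFrame({
    "toxic":   [1, 1, 0, 0],
    "obscene": [1, 0, 1, 0],
    "insult":  [0, 1, 0, 0],
})

# Product of two 0/1 columns is 1 only when both labels are 1
agg = toy.assign(
    toxic_obscene=toy["toxic"] * toy["obscene"],
    toxic_insult=toy["toxic"] * toy["insult"],
)
print(agg)
```

The column order (toxic, obscene, insult, toxic × obscene, toxic × insult) matters later, because the CIScore loss addresses these outcomes by position.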

In [12]:
y_aux_agg

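| | toxic | obscene | insult | 0 | 1 |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... |
| 159566 | 0 | 0 | 0 | 0 | 0 |
| 159567 | 0 | 0 | 0 | 0 | 0 |
| 159568 | 0 | 0 | 0 | 0 | 0 |
| 159569 | 0 | 0 | 0 | 0 | 0 |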


# Hyperparameter settings

Neural network structures are defined for the following seven models: Baseline, NN, NN-joint, LDR-NN-joint, LDR-CIDNN, NLDR-NN-joint, and NLDR-CIDNN.



In [10]:
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
from keras.optimizers import SGD
import tensorflow as tf

## Baseline

A linear decision rule does not have hidden layers. The network consists of an input layer and an output layer.

The optimizer is stochastic gradient descent with a learning rate of $10^{-3}$ and a momentum of 0.9. 
For each training set (9 folds), we further hold out 20\% of the data as a validation set to guard against overfitting. 
The learning rate is reduced, and training may stop early, based on the validation loss. 
If the validation loss does not improve for 5 consecutive epochs, the learning rate is reduced to one tenth of its current value. Further, although the maximum number of epochs is 300, training stops early if the validation loss does not improve on its best value for 10 consecutive epochs.
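
To make the interaction of the two patience settings concrete, here is a simplified pure-Python simulation of the rule described above. It is only a sketch: the function `simulate_schedule` is hypothetical, and it ignores options of the real Keras callbacks such as `min_delta` and `cooldown`.

```python
def simulate_schedule(val_losses, lr=1e-3, factor=0.1, lr_patience=5, stop_patience=10):
    """Replay a list of validation losses and return (last_epoch, final_lr)."""
    best = float("inf")
    lr_wait = 0    # epochs without improvement since the last learning-rate cut
    stop_wait = 0  # epochs without improvement since the best loss
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            lr_wait = 0
            stop_wait = 0
        else:
            lr_wait += 1
            stop_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor   # reduce to one tenth
            lr_wait = 0
        if stop_wait >= stop_patience:
            return epoch, lr   # early stop
    return len(val_losses), lr

# Loss improves for two epochs and then plateaus
losses = [1.0, 0.9] + [0.95] * 20
last_epoch, final_lr = simulate_schedule(losses)
print(last_epoch, final_lr)
```

With this plateau, the learning rate is cut twice (after epochs 7 and 12) and training stops at epoch 12, well before the 300-epoch limit.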

In [11]:
def define_Baseline(output_dim):
  # Linear decision rule: a single sigmoid output layer, no hidden layers.
  # vocab_size is a global variable that is set in the training loop below.
  model = Sequential()
  model.add(Dense(output_dim, input_shape=(vocab_size,), activation = "sigmoid"))
  model.summary()
  return model

def train_Baseline(model, X, y, epochs, batch_size):
  if model is None:
    # A 1-D label vector needs one output unit; otherwise one unit per label column
    output_dim = 1 if len(y.shape) == 1 else y.shape[1]
    model = define_Baseline(output_dim)
  # Stop after 10 epochs without improvement in validation loss;
  # reduce the learning rate to one tenth after 5 such epochs
  es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
  rl = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)

  model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate = 0.001, momentum=0.9), metrics=['accuracy'])
  history = model.fit(X , y, batch_size = batch_size, epochs = epochs, validation_split=0.2, callbacks=[es, rl], verbose = 2)
  return model, history


## NN and NN-joint

The network includes two fully-connected hidden layers, with 256 and 128 units, respectively, and an additional dropout layer with a drop rate of 0.1.

In [12]:
def define_NN(output_dim):
  # Two hidden layers (256 and 128 units) followed by dropout and a sigmoid output.
  # vocab_size is a global variable that is set in the training loop below.
  model = Sequential()
  model.add(Dense(256, input_shape=(vocab_size,), activation='relu'))
  model.add(Dense(128, activation='relu'))
  model.add(Dropout(0.1))
  model.add(Dense(output_dim, activation = "sigmoid"))
  model.summary()
  return model

def train_NN(model, X, y, epochs, batch_size):
  if model is None:
    # One output unit for NN (target only); one unit per label column for NN-joint
    output_dim = 1 if len(y.shape) == 1 else y.shape[1]
    model = define_NN(output_dim)
  es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
  rl = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)
  model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate = 0.001, momentum=0.9), metrics=['accuracy'])
  history = model.fit(X , y, batch_size = batch_size, epochs = epochs, validation_split=0.2, callbacks=[es, rl], verbose=2)
  return model, history

## CIDNN

To calculate CIScore correctly, the indices must match the positions of the auxiliary outcomes in the label array. The network structure is identical to NN.

In [13]:
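def CIScore(y_true, y_pred):
  # Column order (same as y_aux_agg): 0 = toxic, 1 = obscene, 2 = insult,
  # 3 = toxic*obscene, 4 = toxic*insult.
  # CI_denom is a global list that is computed in the training loop below.
  CI_numer = []
  CI_numer.append(tf.square((y_pred[:, 3] - y_pred[:, 1] * y_pred[:, 0])))
  CI_numer.append(tf.square((y_pred[:, 4] - y_pred[:, 2] * y_pred[:, 0])))
  
  # Two normalized squared-difference terms plus the cross-entropy on the target (toxic)
  return (tf.reduce_mean(CI_numer[0], axis=-1)/CI_denom[0]
          + tf.reduce_mean(CI_numer[1], axis=-1)/CI_denom[1]
          + tf.keras.losses.binary_crossentropy(y_true[:,0], y_pred[:,0]))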

def train_CIDNN(model, X, y, epochs, batch_size):
  if model is None:
    model = define_NN(y.shape[1])

  es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
  rl = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)

  model.compile(loss=CIScore, optimizer=SGD(learning_rate = 0.001, momentum=0.9), metrics=['accuracy'])
  history = model.fit(X , y, batch_size = batch_size, epochs = epochs, validation_split=0.2, callbacks=[es, rl], verbose=2)
  return model, history
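
To inspect what the CIScore loss computes without building a TensorFlow graph, the same arithmetic can be restated in NumPy. This is a sketch for illustration only: the function `ci_score_numpy`, the toy inputs, and the clipping constant `eps` (chosen here to mimic the clipping Keras applies inside its cross-entropy) are assumptions, not part of the notebook.

```python
import numpy as np

def ci_score_numpy(y_true, y_pred, ci_denom, eps=1e-7):
    """NumPy restatement of CIScore for a batch; columns follow y_aux_agg."""
    p_tox, p_obs, p_ins = y_pred[:, 0], y_pred[:, 1], y_pred[:, 2]
    p_tox_obs, p_tox_ins = y_pred[:, 3], y_pred[:, 4]
    # Squared gap between the predicted joint probability and the product of marginals
    term_obs = np.mean((p_tox_obs - p_obs * p_tox) ** 2) / ci_denom[0]
    term_ins = np.mean((p_tox_ins - p_ins * p_tox) ** 2) / ci_denom[1]
    # Binary cross-entropy on the target label (toxic) only
    p = np.clip(p_tox, eps, 1 - eps)
    t = y_true[:, 0]
    bce = -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))
    return term_obs + term_ins + bce

# Toy batch where each joint prediction equals the product of its marginals,
# so both squared-difference terms vanish and only the cross-entropy remains
y_true = np.array([[1, 0, 0, 0, 0], [0, 0, 0, 0, 0]], dtype=float)
y_pred = np.array([[0.5, 0.4, 0.2, 0.2, 0.1],
                   [0.5, 0.6, 0.8, 0.3, 0.4]])
score = ci_score_numpy(y_true, y_pred, ci_denom=[0.01, 0.01])
print(score)
```

In this toy batch the predicted probability of toxic is 0.5 for both rows, so the score reduces to the cross-entropy ln 2.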

  

## Joint-model-assisted linear/nonlinear decision rules (LDR/NLDR)

The original X and the transformed X are concatenated to form the input. Apart from the input layer, the hidden layers and the output layer are identical to those of the previous models.

In [14]:
def define_LDR(input_shape):

  output_dim = 1

  model = Sequential()
  model.add(Dense(output_dim, input_dim = input_shape, activation = "sigmoid"))
  model.summary()
  return model

def train_LDR(model, X, X_extracted, y, epochs, batch_size):
  X_transfered = np.hstack((X, X_extracted))

  if model is None:
    model = define_LDR(X_transfered.shape[1])

  es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
  rl = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)

  model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate = 0.001, momentum=0.9), metrics=['accuracy'])
  history = model.fit(X_transfered , y, batch_size = batch_size, epochs = epochs, validation_split=0.2, callbacks=[es,rl], verbose = 2)
  return model, history


In [15]:
def define_NLDR(input_shape):

  output_dim = 1

  model = Sequential()
  model.add(Dense(256, input_shape=(input_shape,), activation='relu'))
  model.add(Dense(128, activation='relu'))
  model.add(Dropout(0.1))
  model.add(Dense(output_dim, activation = "sigmoid"))
  model.summary()
  return model

def train_NLDR(model, X, X_extracted, y, epochs, batch_size):
  X_transfered = np.hstack((X, X_extracted))

  if model is None:
    model = define_NLDR(X_transfered.shape[1])

  es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10)
  rl = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)

  model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate = 0.001, momentum=0.9), metrics=['accuracy'])
  history = model.fit(X_transfered , y, batch_size = batch_size, epochs = epochs, validation_split=0.2, callbacks=[es, rl], verbose = 2)
  return model, history
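
The effect of the concatenation in `train_LDR` and `train_NLDR` on the input dimension can be checked on toy arrays. The sizes below are made up for illustration; in the notebook the tf-idf matrix has 5000 columns, and the extracted block comes from `layers[-2]` of a `define_NN` network, which is the dropout layer after the 128-unit dense layer, so the combined input would be expected to have 5000 + 128 columns.

```python
import numpy as np

# Toy sizes (hypothetical): 4 samples, 6 tf-idf columns, 3 extracted features
n_samples, n_vocab, n_hidden = 4, 6, 3
X_tokens = np.random.RandomState(0).rand(n_samples, n_vocab)
X_extracted = np.random.RandomState(1).rand(n_samples, n_hidden)

# Column-wise concatenation: original features first, extracted features last
X_combined = np.hstack((X_tokens, X_extracted))
print(X_combined.shape)  # (4, 9)
```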


# Training

We use 10-fold cross-validation for the overall procedure. 

In [16]:
X = np.array(X)
y_aux_agg = np.array(y_aux_agg)
y_tgt = np.array(y_tgt)
y_aux = np.array(y_aux)
np.random.seed(42)
fold_idx = np.random.choice(np.hstack((np.repeat(np.arange(10), int(X.shape[0]/10)), np.arange(X.shape[0] % 10))), size=X.shape[0], replace=False)

def split_data(X, y, fold_no):
  X_train, X_test, y_train, y_test = X[fold_idx != fold_no], X[fold_idx == fold_no], y[fold_idx != fold_no], y[fold_idx == fold_no]
  np.random.seed(fold_no)
  shuffle_train = np.random.choice(X_train.shape[0], X_train.shape[0], replace=False)
  shuffle_test = np.random.choice(X_test.shape[0], X_test.shape[0], replace=False)
  return X_train[shuffle_train], X_test[shuffle_test], y_train[shuffle_train], y_test[shuffle_test]
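
The fold assignment above builds a vector in which each fold label 0 to 9 appears `n // 10` times, adds one extra label for each leftover sample, and shuffles the result. A minimal sketch of the same idea follows; the helper `make_fold_idx` is hypothetical, and it shuffles with `RandomState.permutation` rather than `np.random.choice(..., replace=False)`, so the concrete assignment differs from the notebook's even with the same seed.

```python
import numpy as np

def make_fold_idx(n_samples, n_folds=10, seed=42):
    """Assign each sample to one of n_folds folds with near-equal fold sizes."""
    labels = np.hstack((np.repeat(np.arange(n_folds), n_samples // n_folds),
                        np.arange(n_samples % n_folds)))
    return np.random.RandomState(seed).permutation(labels)

fold_idx_demo = make_fold_idx(23)
fold_sizes = np.bincount(fold_idx_demo, minlength=10)
print(fold_sizes)

# Boolean masks select the held-out fold and the remaining nine folds
test_mask = fold_idx_demo == 0
train_mask = ~test_mask
```

With 23 samples, folds 0 to 2 receive three samples each and the remaining folds receive two, so fold sizes differ by at most one.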

We strongly encourage you to mount Google Drive. As with loading the dataset, saving and loading models through your local drive takes a long time.

In [17]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Feel free to change the location where models are saved.

In [30]:
from tensorflow.keras.preprocessing.text import Tokenizer  # public import path
import keras

# np.arange(1) runs fold 0 only; use np.arange(10) to run all 10 folds
for fold_no in np.arange(1):
  X_train, X_test, y_aux1_train, y_aux1_test = split_data(X, y_aux_agg, fold_no)
  X_train, X_test, y_tgt_train, y_tgt_test = split_data(X, y_tgt, fold_no)
  y_aux1_train = np.array(y_aux1_train, dtype='float32')

  # Create a tf-idf matrix
  vocab_size = 5000

  tokenizer_obj = Tokenizer(num_words = vocab_size)
  tokenizer_obj.fit_on_texts(X_train)

  max_length = max([len(s.split()) for s in X])

  X_train_tokens = tokenizer_obj.texts_to_matrix(X_train, mode='tfidf')
  X_test_tokens = tokenizer_obj.texts_to_matrix(X_test, mode='tfidf')

  # Baseline
  model_Baseline, history_Baseline = train_Baseline(model = None, X= X_train_tokens, y = y_tgt_train, epochs = 300, batch_size = 256)
  model_Baseline.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_Baseline_"+str(fold_no))
  del model_Baseline
  keras.backend.clear_session()
  
  # NN
  model_NN, history_NN = train_NN(model = None, X= X_train_tokens, y = y_tgt_train, epochs = 300, batch_size = 256)
  model_NN.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NN_"+str(fold_no))
  del model_NN
  keras.backend.clear_session()

  # NN-joint
  model_NN_joint, history_NN_joint = train_NN(model = None, X= X_train_tokens, y = y_aux1_train[:,0:3], epochs = 300, batch_size = 256)
  model_NN_joint.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NN_joint_"+str(fold_no))
  del model_NN_joint
  keras.backend.clear_session()


  # CIDNN

  ## the denominator of the first term of CIScore
  sample_size = y_aux1_train.shape[0]
  CI_denom = []
  CI_denom.append((tf.square(tf.reduce_sum(y_aux1_train[:,3], axis=-1)/sample_size - 
                          tf.reduce_sum(y_aux1_train[:,1], axis=-1)/sample_size * 
                          tf.reduce_sum(y_aux1_train[:,0], axis=-1)/sample_size)))
  CI_denom.append((tf.square(tf.reduce_sum(y_aux1_train[:,4], axis=-1)/sample_size - 
                          tf.reduce_sum(y_aux1_train[:,2], axis=-1)/sample_size * 
                          tf.reduce_sum(y_aux1_train[:,0], axis=-1)/sample_size)))

  ## training the model
  model_CIDNN, history_CIDNN = train_CIDNN(model = None, X = X_train_tokens, y = y_aux1_train, epochs=300, batch_size = 256)
  model_CIDNN.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_CIDNN_"+str(fold_no))
  del model_CIDNN


  # Extracting transformed features from the CIDNN
  model_CIDNN = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_CIDNN_'+str(fold_no), compile=False)
  extraction1 = Model(inputs=model_CIDNN.inputs, outputs=model_CIDNN.layers[-2].output)
  extracted_features1 = extraction1.predict(X_train_tokens)
  del model_CIDNN, extraction1
  keras.backend.clear_session()
  
  # LDR-CIDNN
  model_LDR_CIDNN, history_LDR_CIDNN = train_LDR(model = None, X= X_train_tokens, X_extracted = extracted_features1, y = y_tgt_train, epochs= 300, batch_size = 256)
  model_LDR_CIDNN.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_LDR_CIDNN_"+str(fold_no))
  del model_LDR_CIDNN
  keras.backend.clear_session()

  # NLDR-CIDNN
  model_NLDR_CIDNN, history_NLDR_CIDNN = train_NLDR(model = None, X= X_train_tokens, X_extracted = extracted_features1, y = y_tgt_train, epochs= 300, batch_size = 256)
  model_NLDR_CIDNN.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NLDR_CIDNN_"+str(fold_no))
  del model_NLDR_CIDNN, extracted_features1
  keras.backend.clear_session()

  
  # Extracting transformed features from the joint model
  model_NN_joint = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NN_joint_'+str(fold_no), compile=False)
  extraction_ce1 = Model(inputs=model_NN_joint.inputs, outputs=model_NN_joint.layers[-2].output)
  extracted_features_ce1 = extraction_ce1.predict(X_train_tokens)
  del model_NN_joint, extraction_ce1
  keras.backend.clear_session()

  # LDR-NN-joint
  model_LDR_NN_joint, history_LDR_NN_joint = train_LDR(model = None, X= X_train_tokens, X_extracted = extracted_features_ce1, y = y_tgt_train, epochs= 300, batch_size = 256)
  model_LDR_NN_joint.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_LDR_NN_joint_"+str(fold_no))
  del model_LDR_NN_joint
  keras.backend.clear_session()

  # NLDR-NN-joint
  model_NLDR_NN_joint, history_NLDR_NN_joint = train_NLDR(model = None, X = X_train_tokens, X_extracted = extracted_features_ce1, y = y_tgt_train, epochs = 300, batch_size = 256)
  model_NLDR_NN_joint.save("/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NLDR_NN_joint_"+str(fold_no))
  del model_NLDR_NN_joint
  keras.backend.clear_session()

  del X_train, X_test, y_aux1_train, y_aux1_test, y_tgt_train, y_tgt_test, tokenizer_obj, X_train_tokens, X_test_tokens

NameError: ignored

# Evaluation

The models are evaluated with the following metrics: AUC, accuracy, and F1 score. If you did not train the models above, please use the trained models we provide.

The trained models are available here: https://drive.google.com/drive/folders/1sQaV1LYvmgDbtzhhtBFMavwcJyAI_C-H?usp=sharing. You can download them or copy them to your drive. The following link explains how to copy a shared folder to your drive: https://stackoverflow.com/questions/54351852/accessing-shared-with-me-with-colab


In [None]:
# If you want to load the models from your local drive, use this code
from google.colab import files
files.upload()

In [47]:
import keras 
from keras.models import Model
from sklearn import metrics
from tensorflow.keras.preprocessing.text import Tokenizer  # public import path

scores = {}

for fold_no in np.arange(1):
  X_train, X_test, y_aux1_train, y_aux1_test = split_data(X, y_aux_agg, fold_no)
  X_train, X_test, y_tgt_train, y_tgt_test = split_data(X, y_tgt, fold_no)
  y_aux1_train = np.array(y_aux1_train, dtype='float32')

  vocab_size = 5000

  tokenizer_obj = Tokenizer(num_words = vocab_size)
  tokenizer_obj.fit_on_texts(X_train)

  max_length = max([len(s.split()) for s in X])

  X_train_tokens = tokenizer_obj.texts_to_matrix(X_train, mode='tfidf')
  X_test_tokens = tokenizer_obj.texts_to_matrix(X_test, mode='tfidf')

  model_Baseline = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_Baseline_'+str(fold_no))
  model_NN = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NN_'+str(fold_no))
  model_NN_joint = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NN_joint_'+str(fold_no))
  model_CIDNN = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_CIDNN_'+str(fold_no), compile=False)
  model_NLDR_CIDNN = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NLDR_CIDNN_'+str(fold_no))
  model_LDR_CIDNN = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_LDR_CIDNN_'+str(fold_no))
  model_NLDR_NN_joint = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_NLDR_NN_joint_'+str(fold_no))
  model_LDR_NN_joint = keras.models.load_model('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/model_LDR_NN_joint_'+str(fold_no))


  predicted_Baseline = model_Baseline.predict(X_test_tokens)
  predicted_NN = model_NN.predict(X_test_tokens)
  predicted_NN_joint = model_NN_joint.predict(X_test_tokens)[:,0]
  
  #CIDNN
  extraction = Model(inputs=model_CIDNN.inputs, outputs=model_CIDNN.layers[-2].output)
  predicted_extract = extraction.predict(X_test_tokens)
  predicted_LDR_CIDNN = model_LDR_CIDNN.predict(np.hstack((X_test_tokens, predicted_extract)))
  predicted_NLDR_CIDNN = model_NLDR_CIDNN.predict(np.hstack((X_test_tokens, predicted_extract)))
  
  #NN-joint
  extraction_NN_joint = Model(inputs=model_NN_joint.inputs, outputs=model_NN_joint.layers[-2].output)
  predicted_extract_NN_joint = extraction_NN_joint.predict(X_test_tokens)
  predicted_LDR_NN_joint = model_LDR_NN_joint.predict(np.hstack((X_test_tokens, predicted_extract_NN_joint)))
  predicted_NLDR_NN_joint = model_NLDR_NN_joint.predict(np.hstack((X_test_tokens, predicted_extract_NN_joint)))
  
  auc_Baseline = metrics.roc_auc_score(y_tgt_test, (predicted_Baseline).reshape(-1))
  auc_NN = metrics.roc_auc_score(y_tgt_test, (predicted_NN).reshape(-1))
  auc_NN_joint = metrics.roc_auc_score(y_tgt_test, (predicted_NN_joint).reshape(-1))
  auc_LDR_CIDNN = metrics.roc_auc_score(y_tgt_test, (predicted_LDR_CIDNN).reshape(-1))
  auc_NLDR_CIDNN = metrics.roc_auc_score(y_tgt_test, (predicted_NLDR_CIDNN).reshape(-1))
  auc_NLDR_NN_joint = metrics.roc_auc_score(y_tgt_test, (predicted_NLDR_NN_joint).reshape(-1))
  auc_LDR_NN_joint = metrics.roc_auc_score(y_tgt_test, (predicted_LDR_NN_joint).reshape(-1))

  acc_Baseline = np.mean(((predicted_Baseline > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_NN = np.mean(((predicted_NN > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_NN_joint = np.mean(((predicted_NN_joint > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_LDR_CIDNN = np.mean(((predicted_LDR_CIDNN > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_NLDR_CIDNN = np.mean(((predicted_NLDR_CIDNN > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_LDR_NN_joint = np.mean(((predicted_LDR_NN_joint > 0.5) *1).reshape(-1) == y_tgt_test)
  acc_NLDR_NN_joint = np.mean(((predicted_NLDR_NN_joint > 0.5) *1).reshape(-1) == y_tgt_test)

  f1_Baseline = metrics.f1_score(y_tgt_test, ((predicted_Baseline>0.5) *1).reshape(-1))
  f1_NN = metrics.f1_score(y_tgt_test, ((predicted_NN>0.5) *1).reshape(-1))
  f1_NN_joint = metrics.f1_score(y_tgt_test, ((predicted_NN_joint>0.5) *1).reshape(-1))
  f1_LDR_CIDNN = metrics.f1_score(y_tgt_test, ((predicted_LDR_CIDNN>0.5)*1).reshape(-1))
  f1_NLDR_CIDNN = metrics.f1_score(y_tgt_test, ((predicted_NLDR_CIDNN>0.5)*1).reshape(-1))
  f1_LDR_NN_joint = metrics.f1_score(y_tgt_test, ((predicted_LDR_NN_joint>0.5)*1).reshape(-1))
  f1_NLDR_NN_joint = metrics.f1_score(y_tgt_test, ((predicted_NLDR_NN_joint>0.5)*1).reshape(-1))

  scores[fold_no] = np.array([auc_Baseline, auc_NN, auc_NN_joint, auc_LDR_NN_joint, auc_LDR_CIDNN, auc_NLDR_NN_joint, auc_NLDR_CIDNN,  
                              acc_Baseline, acc_NN, acc_NN_joint, acc_LDR_NN_joint, acc_LDR_CIDNN, acc_NLDR_NN_joint, acc_NLDR_CIDNN,  
                              f1_Baseline, f1_NN, f1_NN_joint, f1_LDR_NN_joint, f1_LDR_CIDNN, f1_NLDR_NN_joint, f1_NLDR_CIDNN])
  del X_train, X_test, y_aux1_train, y_aux1_test, y_tgt_train, y_tgt_test, tokenizer_obj, X_train_tokens, X_test_tokens

scores_final = np.vstack(list(scores.values()))  # np.vstack expects a sequence, not a dict view

scores_final = pd.DataFrame(scores_final, columns = ['auc_Baseline','auc_NN', 'auc_NN_joint', 'auc_LDR_NN_joint', 'auc_LDR_CIDNN', 'auc_NLDR_NN_joint', 'auc_NLDR_CIDNN',
                                                     'acc_Baseline','acc_NN', 'acc_NN_joint', 'acc_LDR_NN_joint', 'acc_LDR_CIDNN','acc_NLDR_NN_joint', 'acc_NLDR_CIDNN',
                                                     'f1_Baseline','f1_NN', 'f1_NN_joint', 'f1_LDR_NN_joint','f1_LDR_CIDNN','f1_NLDR_NN_joint', 'f1_NLDR_CIDNN'])
# scores_final.to_csv('/content/drive/MyDrive/Can_a_joint_model_assist_target_label_prediction/Toxic_comments/scores_final.csv')
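
The evaluation loop repeats the same 0.5 thresholding for every model. As a sketch of what those lines compute, the accuracy and F1 score can be written as one small NumPy helper; the name `threshold_metrics` is hypothetical, and AUC is left out because the notebook obtains it from `sklearn.metrics.roc_auc_score`.

```python
import numpy as np

def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Return (accuracy, F1) after thresholding predicted probabilities."""
    y_true = np.asarray(y_true).reshape(-1)
    y_hat = (np.asarray(y_prob).reshape(-1) > threshold).astype(int)
    tp = np.sum((y_hat == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_hat == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_hat == 0) & (y_true == 1))  # false negatives
    accuracy = np.mean(y_hat == y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 0.0
    return float(accuracy), float(f1)

# Toy labels and (n, 1)-shaped predictions, as returned by model.predict
y_true_demo = np.array([1, 0, 1, 0])
y_prob_demo = np.array([[0.9], [0.6], [0.2], [0.1]])
acc_demo, f1_demo = threshold_metrics(y_true_demo, y_prob_demo)
print(acc_demo, f1_demo)
```

Here the thresholded predictions are 1, 1, 0, 0, giving one true positive, one false positive, and one false negative, so both accuracy and F1 equal 0.5.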