<a href="https://colab.research.google.com/github/marco-siino/text_preprocessing_impact/blob/main/IMDB_DS/RoBERTa_IMDB_TextPreProImpact_NB_PART_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Text preprocessing worth the time: A comparative survey on the impact of common techniques on NLP model performances. 
- - - 
RoBERTa ON IMDB DS EXPERIMENTS NOTEBOOK 
- - -
RoBERTa on the Internet Movie Database (IMDB) dataset.
Code by M. Siino. 

From the paper: "Text preprocessing worth the time: A comparative survey on the impact of common techniques on NLP model performances." by M.Siino et al.



## Importing modules.

In [None]:
!pip install simpletransformers
!pip install tensorboardx

#import matplotlib.pyplot as plt
import os
import re
import shutil
import string
import tensorflow as tf
import numpy as np
import torch
import nltk
import pandas as pd

from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras import preprocessing
from tensorflow.keras.models import Model
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize 
from nltk.stem import PorterStemmer
from textblob import TextBlob
nltk.download('stopwords')
nltk.download('punkt')
from io import open
from pathlib import Path
from simpletransformers.classification import ClassificationModel, ClassificationArgs




[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Domenico\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Domenico\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


## Downloading the dataset and extracting it into the current working directory.

In [None]:
url = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"

dataset = tf.keras.utils.get_file("aclImdb_v1", url,
                                    untar=True, cache_dir='.',
                                    cache_subdir='')

dataset_dir = os.path.join(os.path.dirname(dataset), 'aclImdb')

train_set_dir = os.path.join(dataset_dir, 'train')
test_set_dir = os.path.join(dataset_dir, 'test')

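# Remove the unlabeled reviews; the check keeps this cell safe to re-run.
remove_dir = os.path.join(train_set_dir, 'unsup')
if os.path.isdir(remove_dir):
  shutil.rmtree(remove_dir)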

## Building the dataset.

In [None]:
# Load the full training and test sets (one review per batch), then shuffle
# them with a fixed seed and keep 5000 samples from each.
batch_size = 1
seed = 1

train_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/train', 
    batch_size=batch_size,
    shuffle=False,
    seed=seed
    )

test_ds = tf.keras.preprocessing.text_dataset_from_directory(
    'aclImdb/test', 
    batch_size=batch_size,
    shuffle=False,
    seed=seed
    )

train_ds = train_ds.shuffle(25000,seed=1,reshuffle_each_iteration = False)
test_ds = test_ds.shuffle(25000,seed=1,reshuffle_each_iteration = False)

train_ds = train_ds.take(5000)
test_ds = test_ds.take(5000)

train_ds_size=len(train_ds)
test_ds_size=len(test_ds)

Found 25000 files belonging to 2 classes.
Found 25000 files belonging to 2 classes.


## Functions to preprocess the source text (discussed in detail in our paper).
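
As a rough illustration of what `DON` does, the sketch below applies the same six patterns with Python's `re` module, used here only as a stand-in for `tf.strings.regex_replace`. The helper name `strip_wrapper_tags` and both sample strings are made up for this example. It suggests that a review without the XML wrapper tags passes through unchanged, `<br />` markup included.

```python
import re

# The six (pattern, replacement) pairs used by DON, in application order.
TAG_PATTERNS = [
    (r'<\!\[CDATA\[', ' '),
    (r'\]{1,}>', ' '),
    (r'<author lang="en">', ' '),
    (r'</author>', ' '),
    (r'<documents>\n(\t){0,2}', ''),
    (r'</documents>\n(\t){0,2}', ' '),
]

def strip_wrapper_tags(text):
    """Hypothetical pure-Python counterpart of DON for plain strings."""
    for pattern, replacement in TAG_PATTERNS:
        text = re.sub(pattern, replacement, text)
    return text

# A made-up review in the style of IMDB text: no wrapper tags, so nothing changes.
review = 'A great film.<br /><br />Loved it!'
print(strip_wrapper_tags(review) == review)

# A made-up wrapped document: only the payload survives, padded with spaces.
wrapped = '<author lang="en"><documents>\n\t<![CDATA[Hello]]></documents>\n</author>'
print(strip_wrapper_tags(wrapped).split())
```
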

In [None]:
# Do-Nothing (baseline) preprocessing function.
# It only strips XML wrapper tags (CDATA, author, documents); text without them is returned unchanged.
# Raw strings avoid Python's invalid-escape warnings for '\!' and '\]'.
def DON(input_data):
  tag_open_CDATA_removed = tf.strings.regex_replace(input_data, r'<\!\[CDATA\[', ' ')
  tag_closed_CDATA_removed = tf.strings.regex_replace(tag_open_CDATA_removed, r'\]{1,}>', ' ')
  tag_author_lang_en_removed = tf.strings.regex_replace(tag_closed_CDATA_removed, r'<author lang="en">', ' ')
  tag_closed_author_removed = tf.strings.regex_replace(tag_author_lang_en_removed, r'</author>', ' ')
  tag_open_documents_removed = tf.strings.regex_replace(tag_closed_author_removed, r'<documents>\n(\t){0,2}', '')
  output_data = tf.strings.regex_replace(tag_open_documents_removed, r'</documents>\n(\t){0,2}', ' ')
  return output_data

# Lowercasing preprocessing function.
def LOW(input_data):  
  return tf.strings.lower(DON(input_data))

# Stop-word removal preprocessing function.
def RSW(input_data):
  output_data = DON(input_data)
  # Load the stop-word list once instead of once per word.
  english_stopwords = stopwords.words('english')

  # The try branch handles the adaptation of the TS; the except branch handles the actual simulation.
  try:
    input_string=output_data[0]

  # Case of the function call made during the actual simulation.
  except:
    input_string=output_data

    try:
      input_string = input_string.numpy()

    except:
      # Not an eager tensor: return the text unchanged.
      return output_data

    else:
      # Take the string form of the tensor content and slice off the b'...' wrapper.
      input_string=(str(input_string))[2:-1]

    blob = TextBlob(str(input_string)).words

    # Drop the English stop words.
    outputlist = [word for word in blob if word not in english_stopwords]

    output_string = (' '.join(word for word in outputlist))

    output_tensor=tf.constant(output_string)

    return output_tensor

  # Case of the adaptation of the TS.
  else:

    try:
      input_string = input_string.numpy()

    except:
      # Not an eager tensor: return the text unchanged.
      return output_data

    else:
      input_string=(str(input_string))[2:-1]

    blob = TextBlob(str(input_string)).words

    # Drop the English stop words.
    outputlist = [word for word in blob if word not in english_stopwords]

    output_string = (' '.join(word for word in outputlist))

    output_tensor=tf.constant([[output_string]])

    return output_tensor

  return output_data

# Porter Stemmer preprocessing function.
def STM(input_data):
  output_data = DON(input_data)
  stemmer = PorterStemmer()

  # The try branch handles the adaptation of the TS; the except branch handles the actual simulation.
  try:
    input_string=output_data[0]

  # Case of the function call made during the actual simulation.
  except:
    input_string=output_data

    try:
      input_string = input_string.numpy()

    except:
      # Not an eager tensor: return the text unchanged.
      return output_data

    else:
      # Take the string form of the tensor content and slice off the b'...' wrapper.
      input_string=(str(input_string))[2:-1]

    blob = TextBlob(str(input_string)).words

    # Stem every token.
    outputlist = [stemmer.stem(word) for word in blob]

    output_string = (' '.join(word for word in outputlist))

    output_tensor=tf.constant(output_string)

    return output_tensor

  # Case of the adaptation of the TS.
  else:

    try:
      input_string = input_string.numpy()

    except:
      # Not an eager tensor: return the text unchanged.
      return output_data

    else:
      input_string=(str(input_string))[2:-1]

    blob = TextBlob(str(input_string)).words

    # Stem every token.
    outputlist = [stemmer.stem(word) for word in blob]

    output_string = (' '.join(word for word in outputlist))

    output_tensor=tf.constant([[output_string]])

    return output_tensor

  return output_data
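
`RSW` and `STM` turn the tensor content into text with `str(...)[2:-1]`, that is, by slicing the `b'...'` wrapper off the printed form of a bytes value. The sketch below, which uses made-up sample strings and a hypothetical helper `slice_repr`, compares that idiom with `bytes.decode('utf-8')`. The two agree on plain ASCII, but the sliced form keeps Python's escape sequences for non-ASCII bytes, newlines and some quotes. This is worth keeping in mind when comparing these results with pipelines that decode the bytes directly.

```python
def slice_repr(raw):
    """The idiom used in RSW and STM: printed form of bytes minus the b'...' wrapper."""
    return str(raw)[2:-1]

# Made-up samples covering ASCII, non-ASCII, a newline and mixed quotes.
samples = [
    b"plain ascii",
    "caf\u00e9".encode("utf-8"),
    b"line one\nline two",
    b"it's \"fine\"",
]

for raw in samples:
    # Left: what the notebook's idiom yields. Right: a direct UTF-8 decode.
    print(repr(slice_repr(raw)), '|', repr(raw.decode('utf-8')))
```
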

## Define the combined preprocessing functions. (The base functions are: DON, LOW, RSW and STM).

In [None]:
## PAIRS AND TRIPLES OF PREPROCESSING FUNCTIONS. THE APPLICATION ORDER MATTERS:
## E.G. LOW_RSW APPLIES LOW FIRST, THEN RSW.
# 5
def LOW_RSW(input_data):
  return RSW(LOW(input_data))

# 6
def LOW_STM(input_data):
  return STM(LOW(input_data))

# 7
def RSW_LOW(input_data):
  return LOW(RSW(input_data))

# 8
def RSW_STM(input_data):
  return STM(RSW(input_data))

# 9
def STM_LOW(input_data):
  return LOW(STM(input_data))

# 10
def STM_RSW(input_data):
  return RSW(STM(input_data))
  
# 11
def LOW_STM_RSW(input_data):
  return RSW(STM(LOW(input_data)))

# 12
def LOW_RSW_STM(input_data):
  return STM(RSW(LOW(input_data)))

# 13
def STM_LOW_RSW(input_data):
  return RSW(LOW(STM(input_data)))

# 14
def STM_RSW_LOW(input_data):
  return LOW(RSW(STM(input_data)))

# 15
def RSW_LOW_STM(input_data):
  return STM(LOW(RSW(input_data)))

# 16
def RSW_STM_LOW(input_data):
  return LOW(STM(RSW(input_data)))
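
To see why the application order matters, consider lowercasing and stop-word removal. NLTK's English stop-word list is lowercase and `RSW` compares tokens case-sensitively, so a capitalized stop word survives unless the text is lowercased first. The sketch below reproduces the effect with plain strings; the three-word stop list, the sample sentence and the helpers `low` and `rsw` are toy stand-ins, not the notebook's functions.

```python
# Toy stop-word list (lowercase, like NLTK's English list).
TOY_STOPWORDS = {'the', 'a', 'is'}

def low(text):
    """Toy counterpart of LOW: lowercase the text."""
    return text.lower()

def rsw(text):
    """Toy counterpart of RSW: drop tokens found in the stop list (case-sensitive)."""
    return ' '.join(word for word in text.split() if word not in TOY_STOPWORDS)

text = 'The plot is A mess'

low_then_rsw = rsw(low(text))   # same nesting as LOW_RSW
rsw_then_low = low(rsw(text))   # same nesting as RSW_LOW

print(low_then_rsw)   # plot mess
print(rsw_then_low)   # the plot a mess
```
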

## Define a dictionary mapping each function name to its preprocessing function, and a dictionary to store the model results.

In [None]:
model_results = {}
prepro_functions_dict_base = {
    'DON':DON,
    'LOW':LOW,
    'RSW':RSW,
    'STM':STM
    }

# 3 preprocessing functions give 15 ordered combinations (3 singles, 6 pairs, 6 triples), plus 1 for do-nothing.

prepro_functions_dict_comb = {
    # 1. Do nothing 
    #'DON': DON,
    # 2. Lowercasing 
    #'LOW':LOW,
    # 3. Removing Stopwords
    #'RSW':RSW, 
    # 4. Porter Stemming
    #'STM':STM,
    # 5. LOW->RSW
    #'LOW_RSW':LOW_RSW, 
    # 6. LOW->STM
    #'LOW_STM':LOW_STM,
    # 7. RSW->LOW
    #'RSW_LOW':RSW_LOW,
    # 8. RSW->STM
    #'RSW_STM':RSW_STM,
    # 9. STM->LOW
    #'STM_LOW':STM_LOW,
    # 10. STM->RSW
    #'STM_RSW':STM_RSW,
    # 11. LOW->STM->RSW
    'LOW_STM_RSW':LOW_STM_RSW,  
    # 12. LOW->RSW->STM
    'LOW_RSW_STM':LOW_RSW_STM,
    # 13. STM->LOW->RSW
    'STM_LOW_RSW':STM_LOW_RSW,
    # 14. STM->RSW->LOW
    'STM_RSW_LOW':STM_RSW_LOW,
    # 15. RSW->LOW->STM
    'RSW_LOW_STM':RSW_LOW_STM,
    # 16. RSW->STM->LOW
    'RSW_STM_LOW':RSW_STM_LOW
}

for key in prepro_functions_dict_comb:
  print(key)
  model_results[key]=[]

LOW_STM_RSW
LOW_RSW_STM
STM_LOW_RSW
STM_RSW_LOW
RSW_LOW_STM
RSW_STM_LOW
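
The 15 combinations are written out by hand above. As an alternative, they could be generated from the three base functions; the sketch below shows one way, assuming the naming convention used in this notebook (the leftmost name is applied first). `compose`, `build_combinations` and `make_tracer` are hypothetical helpers, and the tracer functions only record the call order instead of editing text.

```python
from functools import reduce
from itertools import permutations

def compose(functions):
    """Return a callable that applies `functions` from left to right."""
    return lambda value: reduce(lambda acc, fn: fn(acc), functions, value)

def build_combinations(base_functions):
    """Map names such as 'LOW_RSW' to composed callables, for every ordering
    of one or more base functions (no function is used twice)."""
    combinations = {}
    for size in range(1, len(base_functions) + 1):
        for names in permutations(base_functions, size):
            combinations['_'.join(names)] = compose([base_functions[n] for n in names])
    return combinations

def make_tracer(name):
    """Hypothetical stand-in: appends its own name to a list of applied steps."""
    return lambda applied: applied + [name]

tracers = {name: make_tracer(name) for name in ('LOW', 'RSW', 'STM')}
combos = build_combinations(tracers)

print(len(combos))                 # 15
print(combos['LOW_STM_RSW']([]))   # ['LOW', 'STM', 'RSW']
```

Passing `{'LOW': LOW, 'RSW': RSW, 'STM': STM}` instead of the tracers should yield callables with the same nesting as the hand-written ones; `DON` would still need to be added separately.
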


## Function to preprocess the datasets and convert them to pandas DataFrames

In [None]:
def preprocess_and_convert_ds(preprocessing_function):
  # Preprocess the training set and collect one row per review.
  train_df = [] # will contain text and label
  for element in train_ds:
    authorDocument=element[0]
    label=int(element[1].numpy())
    #print(authorDocument[0])
    text = preprocessing_function(authorDocument[0].numpy()).numpy().decode('UTF-8')
    train_df.append({
        'text':text,
        'label':label
    })
  train_df = pd.DataFrame(train_df)

  # Same conversion for the test set.
  test_df = [] # will contain text and label
  for element in test_ds:
    authorDocument=element[0]
    label=int(element[1].numpy())
    #print(authorDocument[0])
    text = preprocessing_function(authorDocument[0].numpy()).numpy().decode('UTF-8')
    test_df.append({
        'text':text,
        'label':label
    })
  test_df = pd.DataFrame(test_df)

  return train_df, test_df


## Print some RAW and preprocessed samples (No need to execute)

In [None]:
# Any preprocessing function defined above can be inspected here.
preprocessing_function = LOW_STM_RSW
for idx, element in enumerate(train_ds):
  if idx>1: break
  authorDocument=element[0]
  label=element[1]
  temp = preprocessing_function(authorDocument[0].numpy()).numpy().decode('UTF-8')
  print("Not-Preprocessed samples: \n",authorDocument)
  print("Preprocessed samples: \n",temp)

## Defining some parameters

In [None]:
# Check whether a CUDA-capable GPU is available.
cuda_available = torch.cuda.is_available()

print('Cuda available? ',cuda_available)

num_epochs_per_run = 10
num_runs = 5

Cuda available?  False


## Training and evaluation of the model
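
The cell below summarizes each preprocessing function as the median of the per-run best accuracies, plus or minus the larger of the two distances from that median to the lowest and highest run. A minimal standalone sketch of that summary follows; the helper name `median_with_max_range` and the five accuracy values are hypothetical and are not results from this notebook.

```python
def median_with_max_range(accuracies):
    """Return (median, largest distance from the median to either extreme).

    Assumes an odd number of runs, so the median is a single observed value.
    """
    if len(accuracies) % 2 == 0:
        raise ValueError("expected an odd number of runs")
    ordered = sorted(accuracies)
    median = ordered[len(ordered) // 2]
    max_range = max(median - ordered[0], ordered[-1] - median)
    return median, max_range

# Hypothetical best accuracies from five runs.
example_runs = [0.80, 0.78, 0.81, 0.79, 0.75]
median, max_range = median_with_max_range(example_runs)
print(f"{median:.4f} +/- {max_range:.4f}")
```
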

In [None]:
for key in prepro_functions_dict_comb:
  model_results[key]=[]

for key in prepro_functions_dict_comb:

  model_args = ClassificationArgs(num_train_epochs=1, 
                                      no_save=True, 
                                      no_cache=True, 
                                      silent=True,
                                      overwrite_output_dir=True)

  model = ClassificationModel("roberta", 
                                  'roberta-base', 
                                  args = model_args, 
                                  num_labels=2, 
                                  use_cuda=cuda_available)

  runs_accuracy = []

  print("\n\n* * * * EVALUATION USING", key, "AS PREPROCESSING FUNCTION * * * *")

  # Preprocess train and test set and convert to DFs.
  train_df,test_df = preprocess_and_convert_ds(prepro_functions_dict_comb[key])
  
  for run in range(1,(num_runs+1)):
    print("\nRUN NUMBER: ", run)
    epochs_accuracy=[]
    model = ClassificationModel("roberta", 
                                  'roberta-base', 
                                  args = model_args, 
                                  num_labels=2, 
                                  use_cuda=cuda_available)
    for epoch in range (0,num_epochs_per_run):
      print("\nEPOCH NUMBER: ", epoch)
      # train model
      print("\nNOW TRAIN THE MODEL.")
      model.train_model(train_df,show_running_loss=False)
      print("\nNOW EVALUATE THE TEST DF.")
      result, model_outputs, wrong_predictions = model.eval_model(test_df)
      # Results on test set.
      print(result)
      correct_predictions = result['tp']+result['tn']
      print("Correct predictions are: ",correct_predictions)
      total_predictions = result['tp']+result['tn']+result['fp']+result['fn']
      print("Total predictions are: ",total_predictions)
      accuracy = correct_predictions/total_predictions
      print("Accuracy on test set is:",accuracy,"\n\n")
      epochs_accuracy.append(accuracy)

    print(epochs_accuracy)
    runs_accuracy.append(max(epochs_accuracy))

  runs_accuracy.sort()
  print("\n\n Over all runs maximum accuracies are:", runs_accuracy)
  # Median of the sorted per-run maxima (num_runs is assumed to be odd).
  median_accuracy = runs_accuracy[len(runs_accuracy)//2]
  print("The median is:",median_accuracy)

  # Report the larger of the two distances from the median to the extremes.
  max_range_from_median = max(median_accuracy-runs_accuracy[0], runs_accuracy[-1]-median_accuracy)
  final_result = str(median_accuracy)+" +/- "+ str(max_range_from_median)
  model_results[key].append(final_result)
  print("RoBERTa Accuracy Score on Test set -> ",model_results[key])


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'roberta.pooler.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifie



* * * * EVALUATION USING LOW_STM_RSW AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5788431922756797, 'tp': 1967, 'tn': 1980, 'fp': 513, 'fn': 540, 'auroc': 0.8476044852191642, 'auprc': 0.8064489798513781, 'eval_loss': 0.4967472278952599}
Correct predictions are:  3947
Total predictions are:  5000
Accuracy on test set is: 0.7894 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.610814481763355, 'tp': 2013, 'tn': 2014, 'fp': 479, 'fn': 494, 'auroc': 0.8855005423242518, 'auprc': 0.881560420836417, 'eval_loss': 0.45714407997131346}
Correct predictions are:  4027
Total predictions are:  5000
Accuracy on test set is: 0.8054 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6029088254792454, 'tp': 2037, 'tn': 1970, 'fp': 523, 'fn': 470, 'auroc': 0.8743909352249322, 'auprc': 0.8624813799902161, 'eval_loss': 0.499165584897995}
Correct predictions are:  4007
Total predictions are:  5000
Accuracy on test set is: 0.8014 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.601574035836702, 'tp': 2088, 'tn': 1913, 'fp': 580, 'fn': 419, 'auroc': 0.8796854567339808, 'auprc': 0.8705082467427135, 'eval_loss': 0.5066797744154931}
Correct predictions are:  4001
Total predictions are:  5000
Accuracy on test set is: 0.8002 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.8385176139780937, 'auprc': 0.835649426335994, 'eval_loss': 0.6926385555267334}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5764832653318739, 'tp': 1785, 'tn': 2140, 'fp': 353, 'fn': 722, 'auroc': 0.8799456987742784, 'auprc': 0.8794104179158257, 'eval_loss': 0.4850290187597275}
Correct predictions are:  3925
Total predictions are:  5000
Accuracy on test set is: 0.785 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6183142945070419, 'tp': 1921, 'tn': 2119, 'fp': 374, 'fn': 586, 'auroc': 0.8895799343066848, 'auprc': 0.8836587855751754, 'eval_loss': 0.46722850370407104}
Correct predictions are:  4040
Total predictions are:  5000
Accuracy on test set is: 0.808 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6193456640325464, 'tp': 1980, 'tn': 2067, 'fp': 426, 'fn': 527, 'auroc': 0.8854769421392265, 'auprc': 0.8736422022504161, 'eval_loss': 0.49887627267837525}
Correct predictions are:  4047
Total predictions are:  5000
Accuracy on test set is: 0.8094 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5726414333861926, 'tp': 2108, 'tn': 1815, 'fp': 678, 'fn': 399, 'auroc': 0.7940355852389884, 'auprc': 0.6809964738638381, 'eval_loss': 0.5129246085524559}
Correct predictions are:  3923
Total predictions are:  5000
Accuracy on test set is: 0.7846 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.24832380586385527, 'tp': 316, 'tn': 2484, 'fp': 9, 'fn': 2191, 'auroc': 0.8120721266454729, 'auprc': 0.8305453091462206, 'eval_loss': 0.6548159215927124}
Correct predictions are:  2800
Total predictions are:  5000
Accuracy on test set is: 0.56 


[0.7894, 0.8054, 0.8014, 0.8002, 0.5014, 0.785, 0.808, 0.8094, 0.7846, 0.56]
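
The logged metrics for run 1 can be cross-checked from the confusion counts alone. The sketch below recomputes accuracy and the Matthews correlation coefficient (MCC) with the standard formulas, taking the counts from epochs 0 and 4 of the log above. The helper names are made up here, and returning 0.0 when the MCC denominator is zero is an assumed convention that happens to match the logged value.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes (assumed convention)."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Run 1, epoch 0 (counts copied from the log above).
epoch0 = dict(tp=1967, tn=1980, fp=513, fn=540)
# Run 1, epoch 4: every review is predicted positive.
epoch4 = dict(tp=2507, tn=0, fp=2493, fn=0)

for name, counts in (('epoch 0', epoch0), ('epoch 4', epoch4)):
    print(name, round(accuracy(**counts), 4), round(mcc(**counts), 4))
```

Epochs such as epoch 4, where the model predicts a single class, score an accuracy close to 0.5 and an MCC of 0.0. Because each run keeps its best epoch, these collapsed epochs do not affect the reported summary.
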

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.7387438717519546, 'auprc': 0.7240486862849329, 'eval_loss': 0.6931501873016357}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4654331269827102, 'tp': 1758, 'tn': 1903, 'fp': 590, 'fn': 749, 'auroc': 0.7812843652694237, 'auprc': 0.7894320527175993, 'eval_loss': 0.6173027813434601}
Correct predictions are:  3661
Total predictions are:  5000
Accuracy on test set is: 0.7322 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5314996719327071, 'tp': 2047, 'tn': 1775, 'fp': 718, 'fn': 460, 'auroc': 0.8335009346473277, 'auprc': 0.7887304635668551, 'eval_loss': 0.5719748695015907}
Correct predictions are:  3822
Total predictions are:  5000
Accuracy on test set is: 0.7644 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5651971233407443, 'tp': 1963, 'tn': 1950, 'fp': 543, 'fn': 544, 'auroc': 0.8417824395743263, 'auprc': 0.7908595491004702, 'eval_loss': 0.5035211671352386}
Correct predictions are:  3913
Total predictions are:  5000
Accuracy on test set is: 0.7826 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5731692548480551, 'tp': 2118, 'tn': 1805, 'fp': 688, 'fn': 389, 'auroc': 0.8656497466940141, 'auprc': 0.8504179416293528, 'eval_loss': 0.5004582851529121}
Correct predictions are:  3923
Total predictions are:  5000
Accuracy on test set is: 0.7846 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5905916600791085, 'tp': 1898, 'tn': 2074, 'fp': 419, 'fn': 609, 'auroc': 0.8307335529510552, 'auprc': 0.7894856918535317, 'eval_loss': 0.5211471720218659}
Correct predictions are:  3972
Total predictions are:  5000
Accuracy on test set is: 0.7944 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5745605144941304, 'tp': 2049, 'tn': 1885, 'fp': 608, 'fn': 458, 'auroc': 0.8756377449999206, 'auprc': 0.8700833916352928, 'eval_loss': 0.47890679894685745}
Correct predictions are:  3934
Total predictions are:  5000
Accuracy on test set is: 0.7868 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.8706261857092961, 'auprc': 0.8684341744847927, 'eval_loss': 0.6721434483528137}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4731220447048448, 'tp': 1225, 'tn': 2331, 'fp': 162, 'fn': 1282, 'auroc': 0.8507235496726295, 'auprc': 0.8486598627902795, 'eval_loss': 0.5971120788097382}
Correct predictions are:  3556
Total predictions are:  5000
Accuracy on test set is: 0.7112 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5572448170926178, 'tp': 2079, 'tn': 1807, 'fp': 686, 'fn': 428, 'auroc': 0.8228625312422448, 'auprc': 0.8451143995145094, 'eval_loss': 0.5328808748364449}
Correct predictions are:  3886
Total predictions are:  5000
Accuracy on test set is: 0.7772 


[0.4986, 0.7322, 0.7644, 0.7826, 0.7846, 0.7944, 0.7868, 0.5014, 0.7112, 0.7772]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5456213863995919, 'tp': 1892, 'tn': 1971, 'fp': 522, 'fn': 615, 'auroc': 0.8511962733787833, 'auprc': 0.8422825163985712, 'eval_loss': 0.5221590461134911}
Correct predictions are:  3863
Total predictions are:  5000
Accuracy on test set is: 0.7726 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5780180899804436, 'tp': 1971, 'tn': 1974, 'fp': 519, 'fn': 536, 'auroc': 0.8545858199528285, 'auprc': 0.8638834807728191, 'eval_loss': 0.49792022869586944}
Correct predictions are:  3945
Total predictions are:  5000
Accuracy on test set is: 0.789 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6128087384971348, 'tp': 2133, 'tn': 1893, 'fp': 600, 'fn': 374, 'auroc': 0.8922902755557605, 'auprc': 0.8860471784357344, 'eval_loss': 0.47173399707078933}
Correct predictions are:  4026
Total predictions are:  5000
Accuracy on test set is: 0.8052 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6126835275831733, 'tp': 2063, 'tn': 1968, 'fp': 525, 'fn': 444, 'auroc': 0.8875998387827362, 'auprc': 0.886039631451158, 'eval_loss': 0.4811041169643402}
Correct predictions are:  4031
Total predictions are:  5000
Accuracy on test set is: 0.8062 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5572744132830292, 'tp': 2306, 'tn': 1520, 'fp': 973, 'fn': 201, 'auroc': 0.8143471844819263, 'auprc': 0.7170837449245193, 'eval_loss': 0.5357108689308167}
Correct predictions are:  3826
Total predictions are:  5000
Accuracy on test set is: 0.7652 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5446884582010985, 'tp': 1566, 'tn': 2242, 'fp': 251, 'fn': 941, 'auroc': 0.8238550190233491, 'auprc': 0.7788293514395045, 'eval_loss': 0.5523071698665619}
Correct predictions are:  3808
Total predictions are:  5000
Accuracy on test set is: 0.7616 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5908026606581248, 'tp': 2004, 'tn': 1973, 'fp': 520, 'fn': 503, 'auroc': 0.859097135321541, 'auprc': 0.849492038352137, 'eval_loss': 0.5402330477833748}
Correct predictions are:  3977
Total predictions are:  5000
Accuracy on test set is: 0.7954 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.36164774494325863, 'tp': 688, 'tn': 2447, 'fp': 46, 'fn': 1819, 'auroc': 0.8600676229301638, 'auprc': 0.8542565918681964, 'eval_loss': 0.6207323719501495}
Correct predictions are:  3135
Total predictions are:  5000
Accuracy on test set is: 0.627 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5045344506008772, 'tp': 2302, 'tn': 1373, 'fp': 1120, 'fn': 205, 'auroc': 0.8648362203159672, 'auprc': 0.8681241395813539, 'eval_loss': 0.5943000181436539}
Correct predictions are:  3675
Total predictions are:  5000
Accuracy on test set is: 0.735 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.8470952012263775, 'auprc': 0.8441603475187832, 'eval_loss': 0.6918931363105774}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 


[0.7726, 0.789, 0.8052, 0.8062, 0.7652, 0.7616, 0.7954, 0.627, 0.735, 0.5014]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.5948323434855729, 'auprc': 0.599402759678786, 'eval_loss': 0.6931870680809021}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.6657967398464404, 'auprc': 0.6580587701014258, 'eval_loss': 0.6931651575088501}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.6868522649217569, 'auprc': 0.6782883294948878, 'eval_loss': 0.693401675415039}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.6968274631273109, 'auprc': 0.6807751040366425, 'eval_loss': 0.6934151350975036}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.3108471495257449, 'tp': 1629, 'tn': 1648, 'fp': 845, 'fn': 878, 'auroc': 0.7206128496047408, 'auprc': 0.7029579959081317, 'eval_loss': 0.6574826245307922}
Correct predictions are:  3277
Total predictions are:  5000
Accuracy on test set is: 0.6554 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.3076632044934192, 'tp': 2395, 'tn': 668, 'fp': 1825, 'fn': 112, 'auroc': 0.7430200652773118, 'auprc': 0.6854994040228513, 'eval_loss': 0.6676036633491517}
Correct predictions are:  3063
Total predictions are:  5000
Accuracy on test set is: 0.6126 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5057004299564735, 'tp': 1836, 'tn': 1927, 'fp': 566, 'fn': 671, 'auroc': 0.8292672214550162, 'auprc': 0.8255753555348408, 'eval_loss': 0.5592593445301056}
Correct predictions are:  3763
Total predictions are:  5000
Accuracy on test set is: 0.7526 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5570985321654517, 'tp': 2057, 'tn': 1831, 'fp': 662, 'fn': 450, 'auroc': 0.8620912387953121, 'auprc': 0.8606398714997509, 'eval_loss': 0.5079363450407982}
Correct predictions are:  3888
Total predictions are:  5000
Accuracy on test set is: 0.7776 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5018517356153576, 'tp': 1778, 'tn': 1972, 'fp': 521, 'fn': 729, 'auroc': 0.8213752395818782, 'auprc': 0.8288915032354714, 'eval_loss': 0.5563921894550323}
Correct predictions are:  3750
Total predictions are:  5000
Accuracy on test set is: 0.75 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5523503159329692, 'tp': 1832, 'tn': 2043, 'fp': 450, 'fn': 675, 'auroc': 0.853033407781917, 'auprc': 0.8561772897071941, 'eval_loss': 0.5296208690166473}
Correct predictions are:  3875
Total predictions are:  5000
Accuracy on test set is: 0.775 


[0.4986, 0.5014, 0.4986, 0.4986, 0.6554, 0.6126, 0.7526, 0.7776, 0.75, 0.775]

RUN NUMBER:  5


Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'roberta.pooler.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifie


EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6692005974148695, 'tp': 2091, 'tn': 2082, 'fp': 411, 'fn': 416, 'auroc': 0.917220951012256, 'auprc': 0.9132314675245082, 'eval_loss': 0.3686439800918102}
Correct predictions are:  4173
Total predictions are:  5000
Accuracy on test set is: 0.8346 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6910185707743357, 'tp': 2189, 'tn': 2036, 'fp': 457, 'fn': 318, 'auroc': 0.919273287102571, 'auprc': 0.9139301183758481, 'eval_loss': 0.433151958373189}
Correct predictions are:  4225
Total predictions are:  5000
Accuracy on test set is: 0.845 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6753412086110636, 'tp': 2052, 'tn': 2135, 'fp': 358, 'fn': 455, 'auroc': 0.9210755412322432, 'auprc': 0.9157985427165687, 'eval_loss': 0.5353094352990388}
Correct predictions are:  4187
Total predictions are:  5000
Accuracy on test set is: 0.8374 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6864132214796564, 'tp': 2108, 'tn': 2108, 'fp': 385, 'fn': 399, 'auroc': 0.9233286788968427, 'auprc': 0.9199330409033932, 'eval_loss': 0.5908059426635504}
Correct predictions are:  4216
Total predictions are:  5000
Accuracy on test set is: 0.8432 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6780908152780556, 'tp': 2218, 'tn': 1970, 'fp': 523, 'fn': 289, 'auroc': 0.9228712353104848, 'auprc': 0.9208907499528706, 'eval_loss': 0.6492062390334904}
Correct predictions are:  4188
Total predictions are:  5000
Accuracy on test set is: 0.8376 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6854092099885071, 'tp': 2084, 'tn': 2129, 'fp': 364, 'fn': 423, 'auroc': 0.9253464547162049, 'auprc': 0.9214880287635536, 'eval_loss': 0.6611487139694393}
Correct predictions are:  4213
Total predictions are:  5000
Accuracy on test set is: 0.8426 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6849350037970701, 'tp': 2179, 'tn': 2031, 'fp': 462, 'fn': 328, 'auroc': 0.9229901162425114, 'auprc': 0.919377552079525, 'eval_loss': 0.736071564488858}
Correct predictions are:  4210
Total predictions are:  5000
Accuracy on test set is: 0.842 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6781307873104901, 'tp': 2131, 'tn': 2064, 'fp': 429, 'fn': 376, 'auroc': 0.9203492155378499, 'auprc': 0.9164066348722227, 'eval_loss': 0.73970800871104}
Correct predictions are:  4195
Total predictions are:  5000
Accuracy on test set is: 0.839 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6761201247507449, 'tp': 2024, 'tn': 2163, 'fp': 330, 'fn': 483, 'auroc': 0.9214997845583109, 'auprc': 0.917161593724703, 'eval_loss': 0.8749300906583667}
Correct predictions are:  4187
Total predictions are:  5000
Accuracy on test set is: 0.8374 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6775585581192461, 'tp': 2146, 'tn': 2047, 'fp': 446, 'fn': 361, 'auroc': 0.9088940057290048, 'auprc': 0.9103466381028469, 'eval_loss': 0.7522753954455257}
Correct predictions are:  4193
Total predictions are:  5000
Accuracy on test set is: 0.8386 


[0.8346, 0.845, 0.8374, 0.8432, 0.8376, 0.8426, 0.842, 0.839, 0.8374, 0.8386]


Over all runs maximum accuracies are: [0.7776, 0.7944, 0.8062, 0.8094, 0.845]
The median is: 0.8062
RoBERTa Accuracy Score on Test set ->  0.8062 +/- 0.0388
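
The value after `+/-` appears to be the larger of the two distances from the median to the extremes of the per-run maxima, not a standard deviation. This is inferred from the printed numbers only, so treat the sketch below as an assumption about the summary step; the variable names are hypothetical.

```python
from statistics import median

# Per-run maximum accuracies as printed in the summary above.
best_per_run = [0.7776, 0.7944, 0.8062, 0.8094, 0.845]

med = median(best_per_run)
lower = med - min(best_per_run)   # distance from the median down to the worst run
upper = max(best_per_run) - med   # distance from the median up to the best run
spread = max(lower, upper)        # assumed definition of the reported "+/-" value

print(f"lower={lower:.4f} upper={upper:.4f}")  # lower=0.0286 upper=0.0388
print(f"{med:.4f} +/- {spread:.4f}")           # 0.8062 +/- 0.0388
```

Because the interval is asymmetric, the single reported number overstates the spread on the lower side in this case.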





* * * * EVALUATION USING LOW_RSW_STM AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.7185222732146219, 'auprc': 0.6973399118055621, 'eval_loss': 0.6939784601211548}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.5998827030803922, 'auprc': 0.6113053512459055, 'eval_loss': 0.6933089496612549}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.7242470380967786, 'auprc': 0.7059718339723986, 'eval_loss': 0.6932416665077209}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.6646261706691782, 'auprc': 0.6532196881033109, 'eval_loss': 0.6931681035041809}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.44128177964915244, 'auprc': 0.4641411084264193, 'eval_loss': 0.6932411531448365}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.4841313155895142, 'auprc': 0.49850918547985634, 'eval_loss': 0.6932428451538086}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.5849249858118888, 'auprc': 0.5793836904185489, 'eval_loss': 0.6951537980079651}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.47815886876553115, 'auprc': 0.4865851544427301, 'eval_loss': 0.6931788876533508}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.4798386419349528, 'auprc': 0.48794417493361064, 'eval_loss': 0.6934553374290466}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.5305941598582133, 'auprc': 0.5287344317267568, 'eval_loss': 0.6931493614196778}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 


[0.4986, 0.4986, 0.4986, 0.4986, 0.5014, 0.4986, 0.4986, 0.4986, 0.5014, 0.5014]
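
Every epoch of this run assigns all 5000 test reviews to one class, and the `eval_loss` stays close to ln 2 (about 0.6931), the cross-entropy of a uniform two-class prediction. A minimal sketch for flagging such records follows. The helper names and the tolerance are hypothetical choices, and the dictionaries are abbreviated copies of records from this log.

```python
import math

def is_single_class(result):
    """True when every prediction falls into one class (hypothetical helper)."""
    predicted_pos = result["tp"] + result["fp"]
    predicted_neg = result["tn"] + result["fn"]
    return predicted_pos == 0 or predicted_neg == 0

def near_chance_loss(result, tol=1e-2):
    """True when eval_loss is within tol of ln 2; tol is an assumed threshold."""
    return abs(result["eval_loss"] - math.log(2)) < tol

# Epoch 0 of the run above (collapsed) and epoch 0 of an earlier run (trained).
collapsed = {"tp": 0, "tn": 2493, "fp": 0, "fn": 2507, "eval_loss": 0.6939784601211548}
healthy = {"tp": 2091, "tn": 2082, "fp": 411, "fn": 416, "eval_loss": 0.3686439800918102}

for name, record in [("collapsed", collapsed), ("healthy", healthy)]:
    print(name, is_single_class(record), near_chance_loss(record))
```
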

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6767857136107553, 'tp': 2164, 'tn': 2026, 'fp': 467, 'fn': 343, 'auroc': 0.9176259141871672, 'auprc': 0.9148145986069293, 'eval_loss': 0.38454319678843024}
Correct predictions are:  4190
Total predictions are:  5000
Accuracy on test set is: 0.838 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6884337588984923, 'tp': 2132, 'tn': 2089, 'fp': 404, 'fn': 375, 'auroc': 0.9248037304612469, 'auprc': 0.9234015875555115, 'eval_loss': 0.3943739936470985}
Correct predictions are:  4221
Total predictions are:  5000
Accuracy on test set is: 0.8442 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6864843856422886, 'tp': 2069, 'tn': 2146, 'fp': 347, 'fn': 438, 'auroc': 0.9240840448189113, 'auprc': 0.9230926142915344, 'eval_loss': 0.5212295402266085}
Correct predictions are:  4215
Total predictions are:  5000
Accuracy on test set is: 0.843 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6764466982320946, 'tp': 2089, 'tn': 2102, 'fp': 391, 'fn': 418, 'auroc': 0.9213465033565863, 'auprc': 0.9202049462551513, 'eval_loss': 0.5913312573999167}
Correct predictions are:  4191
Total predictions are:  5000
Accuracy on test set is: 0.8382 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.

NOW EVALUATE THE TEST DF.




{'mcc': 0.6719725242168703, 'tp': 2140, 'tn': 2039, 'fp': 454, 'fn': 367, 'auroc': 0.9199872927003748, 'auprc': 0.9193783686652264, 'eval_loss': 0.6750512220859528}
Correct predictions are:  4179
Total predictions are:  5000
Accuracy on test set is: 0.8358 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6699553247127821, 'tp': 2198, 'tn': 1971, 'fp': 522, 'fn': 309, 'auroc': 0.9171852707325227, 'auprc': 0.9164294369634045, 'eval_loss': 0.7038241250485182}
Correct predictions are:  4169
Total predictions are:  5000
Accuracy on test set is: 0.8338 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6713035059883145, 'tp': 2120, 'tn': 2058, 'fp': 435, 'fn': 387, 'auroc': 0.9194933688280115, 'auprc': 0.9194484257904292, 'eval_loss': 0.7182616240404546}
Correct predictions are:  4178
Total predictions are:  5000
Accuracy on test set is: 0.8356 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6792374067046777, 'tp': 2094, 'tn': 2104, 'fp': 389, 'fn': 413, 'auroc': 0.9224565120590545, 'auprc': 0.9210023553650031, 'eval_loss': 0.69751340399459}
Correct predictions are:  4198
Total predictions are:  5000
Accuracy on test set is: 0.8396 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6490351026946903, 'tp': 2036, 'tn': 2086, 'fp': 407, 'fn': 471, 'auroc': 0.8853593412172351, 'auprc': 0.8136312646991207, 'eval_loss': 0.756492348754406}
Correct predictions are:  4122
Total predictions are:  5000
Accuracy on test set is: 0.8244 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6801381562616967, 'tp': 2158, 'tn': 2041, 'fp': 452, 'fn': 349, 'auroc': 0.9206025775242079, 'auprc': 0.9178182442113946, 'eval_loss': 0.6463912157192826}
Correct predictions are:  4199
Total predictions are:  5000
Accuracy on test set is: 0.8398 


[0.838, 0.8442, 0.843, 0.8382, 0.8358, 0.8338, 0.8356, 0.8396, 0.8244, 0.8398]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.47169177806354, 'auprc': 0.4782308912988057, 'eval_loss': 0.6934088620185852}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.6249413795404155, 'auprc': 0.6404788351003298, 'eval_loss': 0.6936044204711914}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.6452252985663408, 'auprc': 0.6489933979445196, 'eval_loss': 0.6932403461456299}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.6596653317762011, 'auprc': 0.6711659755075707, 'eval_loss': 0.6933888234138489}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4131927426340519, 'tp': 1100, 'tn': 2301, 'fp': 192, 'fn': 1407, 'auroc': 0.8112494801959247, 'auprc': 0.7682041576408805, 'eval_loss': 0.6090177876472473}
Correct predictions are:  3401
Total predictions are:  5000
Accuracy on test set is: 0.6802 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.014183207587881927, 'tp': 2507, 'tn': 1, 'fp': 2492, 'fn': 0, 'auroc': 0.8279443310835557, 'auprc': 0.8278200218610499, 'eval_loss': 0.6942785036087036}
Correct predictions are:  2508
Total predictions are:  5000
Accuracy on test set is: 0.5016 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5183108581587713, 'tp': 2089, 'tn': 1692, 'fp': 801, 'fn': 418, 'auroc': 0.8389625774666072, 'auprc': 0.8311705102663369, 'eval_loss': 0.5407608910083771}
Correct predictions are:  3781
Total predictions are:  5000
Accuracy on test set is: 0.7562 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5002292573447624, 'tp': 1668, 'tn': 2065, 'fp': 428, 'fn': 839, 'auroc': 0.7992739463077392, 'auprc': 0.7648410647281249, 'eval_loss': 0.5308736925601959}
Correct predictions are:  3733
Total predictions are:  5000
Accuracy on test set is: 0.7466 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.536493238159621, 'tp': 1807, 'tn': 2028, 'fp': 465, 'fn': 700, 'auroc': 0.8431188500517844, 'auprc': 0.808087678454931, 'eval_loss': 0.537080529665947}
Correct predictions are:  3835
Total predictions are:  5000
Accuracy on test set is: 0.767 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5436325725582947, 'tp': 1693, 'tn': 2142, 'fp': 351, 'fn': 814, 'auroc': 0.8567321567801093, 'auprc': 0.8387409064943243, 'eval_loss': 0.5130384755969047}
Correct predictions are:  3835
Total predictions are:  5000
Accuracy on test set is: 0.767 


[0.5014, 0.5014, 0.4986, 0.4986, 0.6802, 0.5016, 0.7562, 0.7466, 0.767, 0.767]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.657524931430951, 'tp': 2021, 'tn': 2121, 'fp': 372, 'fn': 486, 'auroc': 0.915007813661259, 'auprc': 0.9111851964514655, 'eval_loss': 0.38529835928976536}
Correct predictions are:  4142
Total predictions are:  5000
Accuracy on test set is: 0.8284 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6716192057065221, 'tp': 2088, 'tn': 2091, 'fp': 402, 'fn': 419, 'auroc': 0.9154388570406393, 'auprc': 0.9128549959771651, 'eval_loss': 0.4192575818657875}
Correct predictions are:  4179
Total predictions are:  5000
Accuracy on test set is: 0.8358 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6704081520049112, 'tp': 2140, 'tn': 2035, 'fp': 458, 'fn': 367, 'auroc': 0.9144377291917968, 'auprc': 0.9041390270824239, 'eval_loss': 0.474289126291126}
Correct predictions are:  4175
Total predictions are:  5000
Accuracy on test set is: 0.835 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6723968149756311, 'tp': 2099, 'tn': 2082, 'fp': 411, 'fn': 408, 'auroc': 0.9171763106622757, 'auprc': 0.9180816122474251, 'eval_loss': 0.5599486022986472}
Correct predictions are:  4181
Total predictions are:  5000
Accuracy on test set is: 0.8362 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6758834171192082, 'tp': 2159, 'tn': 2029, 'fp': 464, 'fn': 348, 'auroc': 0.9209370601465516, 'auprc': 0.9178751029758275, 'eval_loss': 0.6321008301094174}
Correct predictions are:  4188
Total predictions are:  5000
Accuracy on test set is: 0.8376 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6642164701804718, 'tp': 2157, 'tn': 2001, 'fp': 492, 'fn': 350, 'auroc': 0.915783499742638, 'auprc': 0.9073484375845111, 'eval_loss': 0.7456467145428062}
Correct predictions are:  4158
Total predictions are:  5000
Accuracy on test set is: 0.8316 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6652212646171315, 'tp': 2101, 'tn': 2062, 'fp': 431, 'fn': 406, 'auroc': 0.9186846424875972, 'auprc': 0.9162559334244814, 'eval_loss': 0.7041259019546211}
Correct predictions are:  4163
Total predictions are:  5000
Accuracy on test set is: 0.8326 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6677735251975231, 'tp': 2122, 'tn': 2047, 'fp': 446, 'fn': 385, 'auroc': 0.9196461700259729, 'auprc': 0.9160246396857654, 'eval_loss': 0.7863253000751138}
Correct predictions are:  4169
Total predictions are:  5000
Accuracy on test set is: 0.8338 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6652237110733109, 'tp': 2079, 'tn': 2084, 'fp': 409, 'fn': 428, 'auroc': 0.918652962239224, 'auprc': 0.910768989309595, 'eval_loss': 0.8018274288482964}
Correct predictions are:  4163
Total predictions are:  5000
Accuracy on test set is: 0.8326 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6573536752550109, 'tp': 2053, 'tn': 2090, 'fp': 403, 'fn': 454, 'auroc': 0.9145629301733725, 'auprc': 0.9132643057460345, 'eval_loss': 0.7759170119985938}
Correct predictions are:  4143
Total predictions are:  5000
Accuracy on test set is: 0.8286 


[0.8284, 0.8358, 0.835, 0.8362, 0.8376, 0.8316, 0.8326, 0.8338, 0.8326, 0.8286]

RUN NUMBER:  5




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6719858905983104, 'tp': 2201, 'tn': 1973, 'fp': 520, 'fn': 306, 'auroc': 0.9166055861877958, 'auprc': 0.9140213398274534, 'eval_loss': 0.37490652037262917}
Correct predictions are:  4174
Total predictions are:  5000
Accuracy on test set is: 0.8348 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6973113475870232, 'tp': 2193, 'tn': 2048, 'fp': 445, 'fn': 314, 'auroc': 0.9276730329565784, 'auprc': 0.9252352508362005, 'eval_loss': 0.3954717468559742}
Correct predictions are:  4241
Total predictions are:  5000
Accuracy on test set is: 0.8482 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6809002535474401, 'tp': 2173, 'tn': 2027, 'fp': 466, 'fn': 334, 'auroc': 0.9222929907770477, 'auprc': 0.9199737945932324, 'eval_loss': 0.45968432557284833}
Correct predictions are:  4200
Total predictions are:  5000
Accuracy on test set is: 0.84 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6914411870557039, 'tp': 2176, 'tn': 2051, 'fp': 442, 'fn': 331, 'auroc': 0.9196023296822647, 'auprc': 0.9004130996274343, 'eval_loss': 0.5546729481294751}
Correct predictions are:  4227
Total predictions are:  5000
Accuracy on test set is: 0.8454 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6991116386786982, 'tp': 2188, 'tn': 2058, 'fp': 435, 'fn': 319, 'auroc': 0.9277577536207884, 'auprc': 0.9232120274978123, 'eval_loss': 0.5691103402525186}
Correct predictions are:  4246
Total predictions are:  5000
Accuracy on test set is: 0.8492 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6878380021036412, 'tp': 2085, 'tn': 2134, 'fp': 359, 'fn': 422, 'auroc': 0.925550616316832, 'auprc': 0.9228655927155467, 'eval_loss': 0.6531652946494519}
Correct predictions are:  4219
Total predictions are:  5000
Accuracy on test set is: 0.8438 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6952037323281696, 'tp': 2133, 'tn': 2105, 'fp': 388, 'fn': 374, 'auroc': 0.9265137438677519, 'auprc': 0.922623791429488, 'eval_loss': 0.6536908151656389}
Correct predictions are:  4238
Total predictions are:  5000
Accuracy on test set is: 0.8476 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6872125331249795, 'tp': 2126, 'tn': 2092, 'fp': 401, 'fn': 381, 'auroc': 0.9271517488697111, 'auprc': 0.9221006263840514, 'eval_loss': 0.6768644362777472}
Correct predictions are:  4218
Total predictions are:  5000
Accuracy on test set is: 0.8436 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6984016277909514, 'tp': 2136, 'tn': 2110, 'fp': 383, 'fn': 371, 'auroc': 0.9262049414467409, 'auprc': 0.9228371054137269, 'eval_loss': 0.7115546334207058}
Correct predictions are:  4246
Total predictions are:  5000
Accuracy on test set is: 0.8492 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6816787411027623, 'tp': 2130, 'tn': 2074, 'fp': 419, 'fn': 377, 'auroc': 0.9229337158003318, 'auprc': 0.9216857069625335, 'eval_loss': 0.7721647175081074}
Correct predictions are:  4204
Total predictions are:  5000
Accuracy on test set is: 0.8408 


[0.8348, 0.8482, 0.84, 0.8454, 0.8492, 0.8438, 0.8476, 0.8436, 0.8492, 0.8408]


Over all runs maximum accuracies are: [0.5014, 0.767, 0.8376, 0.8442, 0.8492]
The median is: 0.8376
RoBERTa Accuracy Score on Test set ->  0.8376 +/- 0.3362
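
The summary list for LOW_RSW_STM can be reproduced from the five per-epoch accuracy lists printed at the end of each run: take the best epoch of every run, sort the results, and take the median. The sketch below assumes that this is how the notebook aggregates; the dictionary `runs` is a hypothetical container for the lists copied from the log.

```python
from statistics import median

# Per-epoch test accuracies for each LOW_RSW_STM run, copied from the log above.
runs = {
    1: [0.4986, 0.4986, 0.4986, 0.4986, 0.5014, 0.4986, 0.4986, 0.4986, 0.5014, 0.5014],
    2: [0.838, 0.8442, 0.843, 0.8382, 0.8358, 0.8338, 0.8356, 0.8396, 0.8244, 0.8398],
    3: [0.5014, 0.5014, 0.4986, 0.4986, 0.6802, 0.5016, 0.7562, 0.7466, 0.767, 0.767],
    4: [0.8284, 0.8358, 0.835, 0.8362, 0.8376, 0.8316, 0.8326, 0.8338, 0.8326, 0.8286],
    5: [0.8348, 0.8482, 0.84, 0.8454, 0.8492, 0.8438, 0.8476, 0.8436, 0.8492, 0.8408],
}

# Best epoch per run, sorted ascending as in the printed summary.
best_per_run = sorted(max(accs) for accs in runs.values())
print(best_per_run)          # [0.5014, 0.767, 0.8376, 0.8442, 0.8492]
print(median(best_per_run))  # 0.8376
```

The lowest entry, 0.5014, comes from run 1, which never left the single-class regime, and it accounts for the wide interval reported for this configuration.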





* * * * EVALUATION USING STM_LOW_RSW AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.7694698726438015, 'auprc': 0.7680226832814385, 'eval_loss': 0.693040403175354}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5611139843246149, 'tp': 2062, 'tn': 1836, 'fp': 657, 'fn': 445, 'auroc': 0.8534763712547507, 'auprc': 0.8345949734040722, 'eval_loss': 0.49154411554336547}
Correct predictions are:  3898
Total predictions are:  5000
Accuracy on test set is: 0.7796 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.024570951821620956, 'tp': 2507, 'tn': 3, 'fp': 2490, 'fn': 0, 'auroc': 0.8656589467661426, 'auprc': 0.8600016705079611, 'eval_loss': 0.6929204022407531}
Correct predictions are:  2510
Total predictions are:  5000
Accuracy on test set is: 0.502 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.576793537418391, 'tp': 2082, 'tn': 1855, 'fp': 638, 'fn': 425, 'auroc': 0.8701213017510057, 'auprc': 0.869053632466648, 'eval_loss': 0.49784986329078673}
Correct predictions are:  3937
Total predictions are:  5000
Accuracy on test set is: 0.7874 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5957859972206825, 'tp': 2133, 'tn': 1848, 'fp': 645, 'fn': 374, 'auroc': 0.8505760285160636, 'auprc': 0.7846985233614316, 'eval_loss': 0.5029361195325851}
Correct predictions are:  3981
Total predictions are:  5000
Accuracy on test set is: 0.7962 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5892859429148208, 'tp': 1860, 'tn': 2105, 'fp': 388, 'fn': 647, 'auroc': 0.8769575153469202, 'auprc': 0.8716656763358623, 'eval_loss': 0.5094379117012023}
Correct predictions are:  3965
Total predictions are:  5000
Accuracy on test set is: 0.793 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5841473290247554, 'tp': 1961, 'tn': 1999, 'fp': 494, 'fn': 546, 'auroc': 0.8798834582863131, 'auprc': 0.8732136767162277, 'eval_loss': 0.52293787753582}
Correct predictions are:  3960
Total predictions are:  5000
Accuracy on test set is: 0.792 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5980734931997863, 'tp': 2027, 'tn': 1968, 'fp': 525, 'fn': 480, 'auroc': 0.877978243349428, 'auprc': 0.875727276650503, 'eval_loss': 0.4886719374895096}
Correct predictions are:  3995
Total predictions are:  5000
Accuracy on test set is: 0.799 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5948287838337353, 'tp': 2129, 'tn': 1850, 'fp': 643, 'fn': 378, 'auroc': 0.8418738002905942, 'auprc': 0.8011635029619748, 'eval_loss': 0.5221679372668266}
Correct predictions are:  3979
Total predictions are:  5000
Accuracy on test set is: 0.7958 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4049315796919848, 'tp': 820, 'tn': 2443, 'fp': 50, 'fn': 1687, 'auroc': 0.8548971023932828, 'auprc': 0.8541517880857226, 'eval_loss': 0.5997709676742554}
Correct predictions are:  3263
Total predictions are:  5000
Accuracy on test set is: 0.6526 


[0.4986, 0.7796, 0.502, 0.7874, 0.7962, 0.793, 0.792, 0.799, 0.7958, 0.6526]

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5978402766395451, 'tp': 1918, 'tn': 2073, 'fp': 420, 'fn': 589, 'auroc': 0.8798843382932122, 'auprc': 0.8765659895502431, 'eval_loss': 0.44940892201662064}
Correct predictions are:  3991
Total predictions are:  5000
Accuracy on test set is: 0.7982 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5751371448472052, 'tp': 1935, 'tn': 2002, 'fp': 491, 'fn': 572, 'auroc': 0.8604313057814373, 'auprc': 0.8701573297991239, 'eval_loss': 0.48475395209789274}
Correct predictions are:  3937
Total predictions are:  5000
Accuracy on test set is: 0.7874 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5731762477106724, 'tp': 1824, 'tn': 2099, 'fp': 394, 'fn': 683, 'auroc': 0.8707005062919693, 'auprc': 0.8723925659917809, 'eval_loss': 0.48930418701171874}
Correct predictions are:  3923
Total predictions are:  5000
Accuracy on test set is: 0.7846 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6287252896369608, 'tp': 2085, 'tn': 1986, 'fp': 507, 'fn': 422, 'auroc': 0.8935395653501924, 'auprc': 0.8877933736334127, 'eval_loss': 0.4643405649423599}
Correct predictions are:  4071
Total predictions are:  5000
Accuracy on test set is: 0.8142 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.607156769100536, 'tp': 2180, 'tn': 1824, 'fp': 669, 'fn': 327, 'auroc': 0.8847918967684707, 'auprc': 0.8834191591349159, 'eval_loss': 0.500898078930378}
Correct predictions are:  4004
Total predictions are:  5000
Accuracy on test set is: 0.8008 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6129943113945898, 'tp': 2057, 'tn': 1975, 'fp': 518, 'fn': 450, 'auroc': 0.880898906247425, 'auprc': 0.8645322294789264, 'eval_loss': 0.5128700563192368}
Correct predictions are:  4032
Total predictions are:  5000
Accuracy on test set is: 0.8064 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6104160252162643, 'tp': 2200, 'tn': 1809, 'fp': 684, 'fn': 307, 'auroc': 0.8933296437044067, 'auprc': 0.892508413125443, 'eval_loss': 0.48922437419891357}
Correct predictions are:  4009
Total predictions are:  5000
Accuracy on test set is: 0.8018 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6056658213404877, 'tp': 1826, 'tn': 2172, 'fp': 321, 'fn': 681, 'auroc': 0.873680289653471, 'auprc': 0.8220516910518259, 'eval_loss': 0.5156546618163586}
Correct predictions are:  3998
Total predictions are:  5000
Accuracy on test set is: 0.7996 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6080936189487436, 'tp': 1913, 'tn': 2102, 'fp': 391, 'fn': 594, 'auroc': 0.8868095925872058, 'auprc': 0.8885755442691026, 'eval_loss': 0.503600704151392}
Correct predictions are:  4015
Total predictions are:  5000
Accuracy on test set is: 0.803 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.61238513326619, 'tp': 2132, 'tn': 1893, 'fp': 600, 'fn': 375, 'auroc': 0.8932450030408239, 'auprc': 0.892098644292268, 'eval_loss': 0.5293065720081329}
Correct predictions are:  4025
Total predictions are:  5000
Accuracy on test set is: 0.805 


[0.7982, 0.7874, 0.7846, 0.8142, 0.8008, 0.8064, 0.8018, 0.7996, 0.803, 0.805]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5035449091835128, 'tp': 2096, 'tn': 1644, 'fp': 849, 'fn': 411, 'auroc': 0.8359604739301156, 'auprc': 0.8363231744756056, 'eval_loss': 0.5241886733412743}
Correct predictions are:  3740
Total predictions are:  5000
Accuracy on test set is: 0.748 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5960091283751386, 'tp': 1996, 'tn': 1994, 'fp': 499, 'fn': 511, 'auroc': 0.8743703750637404, 'auprc': 0.8625237180609104, 'eval_loss': 0.44965520169734957}
Correct predictions are:  3990
Total predictions are:  5000
Accuracy on test set is: 0.798 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6250371356986798, 'tp': 1983, 'tn': 2078, 'fp': 415, 'fn': 524, 'auroc': 0.8901621788714824, 'auprc': 0.8900350283840313, 'eval_loss': 0.44934432402849195}
Correct predictions are:  4061
Total predictions are:  5000
Accuracy on test set is: 0.8122 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6098637967197555, 'tp': 2176, 'tn': 1836, 'fp': 657, 'fn': 331, 'auroc': 0.8927217189382765, 'auprc': 0.8878934793742446, 'eval_loss': 0.4782244719445705}
Correct predictions are:  4012
Total predictions are:  5000
Accuracy on test set is: 0.8024 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6254088434316969, 'tp': 2008, 'tn': 2055, 'fp': 438, 'fn': 499, 'auroc': 0.8981792017249415, 'auprc': 0.897301631080328, 'eval_loss': 0.5002843851625919}
Correct predictions are:  4063
Total predictions are:  5000
Accuracy on test set is: 0.8126 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6243280602951208, 'tp': 2193, 'tn': 1855, 'fp': 638, 'fn': 314, 'auroc': 0.8890508101583516, 'auprc': 0.8957019971423597, 'eval_loss': 0.47456613247394563}
Correct predictions are:  4048
Total predictions are:  5000
Accuracy on test set is: 0.8096 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6354452461422766, 'tp': 2164, 'tn': 1918, 'fp': 575, 'fn': 343, 'auroc': 0.9004140992465381, 'auprc': 0.9006336669160403, 'eval_loss': 0.44320955239534376}
Correct predictions are:  4082
Total predictions are:  5000
Accuracy on test set is: 0.8164 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6378688502135614, 'tp': 2175, 'tn': 1912, 'fp': 581, 'fn': 332, 'auroc': 0.9007101815678235, 'auprc': 0.8999529566729885, 'eval_loss': 0.5030661512970924}
Correct predictions are:  4087
Total predictions are:  5000
Accuracy on test set is: 0.8174 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6450751294641638, 'tp': 2148, 'tn': 1961, 'fp': 532, 'fn': 359, 'auroc': 0.8832964450441291, 'auprc': 0.8569932594973626, 'eval_loss': 0.537348651945591}
Correct predictions are:  4109
Total predictions are:  5000
Accuracy on test set is: 0.8218 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6272983445411985, 'tp': 2193, 'tn': 1863, 'fp': 630, 'fn': 314, 'auroc': 0.8962231863897813, 'auprc': 0.894783026392296, 'eval_loss': 0.5012600041806697}
Correct predictions are:  4056
Total predictions are:  5000
Accuracy on test set is: 0.8112 


[0.748, 0.798, 0.8122, 0.8024, 0.8126, 0.8096, 0.8164, 0.8174, 0.8218, 0.8112]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5736751024407426, 'tp': 1879, 'tn': 2051, 'fp': 442, 'fn': 628, 'auroc': 0.8681472862747244, 'auprc': 0.8664474766135236, 'eval_loss': 0.4597698366045952}
Correct predictions are:  3930
Total predictions are:  5000
Accuracy on test set is: 0.786 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6090980596598636, 'tp': 1931, 'tn': 2088, 'fp': 405, 'fn': 576, 'auroc': 0.8810761076366839, 'auprc': 0.87729778039249, 'eval_loss': 0.46762230302095414}
Correct predictions are:  4019
Total predictions are:  5000
Accuracy on test set is: 0.8038 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5940617885125213, 'tp': 2121, 'tn': 1857, 'fp': 636, 'fn': 386, 'auroc': 0.8806913846204554, 'auprc': 0.8698641954433001, 'eval_loss': 0.47400425913333893}
Correct predictions are:  3978
Total predictions are:  5000
Accuracy on test set is: 0.7956 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6197354645841326, 'tp': 1950, 'tn': 2096, 'fp': 397, 'fn': 557, 'auroc': 0.8890603302329891, 'auprc': 0.8682849769798096, 'eval_loss': 0.45246393384337424}
Correct predictions are:  4046
Total predictions are:  5000
Accuracy on test set is: 0.8092 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5899018637868295, 'tp': 2113, 'tn': 1855, 'fp': 638, 'fn': 394, 'auroc': 0.8807701852382521, 'auprc': 0.8741716264808583, 'eval_loss': 0.4896265535593033}
Correct predictions are:  3968
Total predictions are:  5000
Accuracy on test set is: 0.7936 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6102583648287396, 'tp': 1947, 'tn': 2076, 'fp': 417, 'fn': 560, 'auroc': 0.8544293387260157, 'auprc': 0.7686049054100234, 'eval_loss': 0.49131767444610597}
Correct predictions are:  4023
Total predictions are:  5000
Accuracy on test set is: 0.8046 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6227975150234389, 'tp': 2035, 'tn': 2022, 'fp': 471, 'fn': 472, 'auroc': 0.8911257864261655, 'auprc': 0.8910263021697378, 'eval_loss': 0.49963321187496185}
Correct predictions are:  4057
Total predictions are:  5000
Accuracy on test set is: 0.8114 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.554746563277546, 'tp': 2341, 'tn': 1461, 'fp': 1032, 'fn': 166, 'auroc': 0.738066266439529, 'auprc': 0.6153007616755659, 'eval_loss': 0.5321348212599755}
Correct predictions are:  3802
Total predictions are:  5000
Accuracy on test set is: 0.7604 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6000163201716959, 'tp': 1999, 'tn': 2001, 'fp': 492, 'fn': 508, 'auroc': 0.8257623139765415, 'auprc': 0.8572094213412352, 'eval_loss': 0.5423527262926102}
Correct predictions are:  4000
Total predictions are:  5000
Accuracy on test set is: 0.8 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6254226858828559, 'tp': 1968, 'tn': 2093, 'fp': 400, 'fn': 539, 'auroc': 0.8900243377908083, 'auprc': 0.8739002450394965, 'eval_loss': 0.5099257447719574}
Correct predictions are:  4061
Total predictions are:  5000
Accuracy on test set is: 0.8122 


[0.786, 0.8038, 0.7956, 0.8092, 0.7936, 0.8046, 0.8114, 0.7604, 0.8, 0.8122]

RUN NUMBER:  5




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.682272469016157, 'auprc': 0.6577267239573505, 'eval_loss': 0.6948343278884888}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5217046934658847, 'tp': 1985, 'tn': 1817, 'fp': 676, 'fn': 522, 'auroc': 0.8214107598603573, 'auprc': 0.8240895081231653, 'eval_loss': 0.5491300178527831}
Correct predictions are:  3802
Total predictions are:  5000
Accuracy on test set is: 0.7604 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.42422398307057796, 'tp': 987, 'tn': 2386, 'fp': 107, 'fn': 1520, 'auroc': 0.6272801178761243, 'auprc': 0.734161483563327, 'eval_loss': 0.6190155213832855}
Correct predictions are:  3373
Total predictions are:  5000
Accuracy on test set is: 0.6746 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5878563741310685, 'tp': 2068, 'tn': 1899, 'fp': 594, 'fn': 439, 'auroc': 0.8748523788426501, 'auprc': 0.8744411295232096, 'eval_loss': 0.4908921403646469}
Correct predictions are:  3967
Total predictions are:  5000
Accuracy on test set is: 0.7934 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5838191107179508, 'tp': 1914, 'tn': 2043, 'fp': 450, 'fn': 593, 'auroc': 0.8339486981577936, 'auprc': 0.859389619615372, 'eval_loss': 0.5167745352506637}
Correct predictions are:  3957
Total predictions are:  5000
Accuracy on test set is: 0.7914 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6030594922958207, 'tp': 2086, 'tn': 1919, 'fp': 574, 'fn': 421, 'auroc': 0.8673140797423852, 'auprc': 0.8773484988544706, 'eval_loss': 0.542154499912262}
Correct predictions are:  4005
Total predictions are:  5000
Accuracy on test set is: 0.801 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.09852813408221933, 'tp': 54, 'tn': 2491, 'fp': 2, 'fn': 2453, 'auroc': 0.8615875548464299, 'auprc': 0.8651244044968163, 'eval_loss': 0.6908934730529785}
Correct predictions are:  2545
Total predictions are:  5000
Accuracy on test set is: 0.509 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5971350333031218, 'tp': 2104, 'tn': 1884, 'fp': 609, 'fn': 403, 'auroc': 0.8714168319079622, 'auprc': 0.8412667385065704, 'eval_loss': 0.5043507263422012}
Correct predictions are:  3988
Total predictions are:  5000
Accuracy on test set is: 0.7976 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.09455687867685, 'tp': 2193, 'tn': 485, 'fp': 2008, 'fn': 314, 'auroc': 0.799868910972262, 'auprc': 0.8438478397904865, 'eval_loss': 0.692611737537384}
Correct predictions are:  2678
Total predictions are:  5000
Accuracy on test set is: 0.5356 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5299696952611903, 'tp': 2324, 'tn': 1414, 'fp': 1079, 'fn': 183, 'auroc': 0.7630707024743074, 'auprc': 0.6449048985133417, 'eval_loss': 0.5517807277441025}
Correct predictions are:  3738
Total predictions are:  5000
Accuracy on test set is: 0.7476 


[0.4986, 0.7604, 0.6746, 0.7934, 0.7914, 0.801, 0.509, 0.7976, 0.5356, 0.7476]


 Over all runs maximum accuracies are: [0.799, 0.801, 0.8122, 0.8142, 0.8218]
The median is: 0.8122
RoBERTa Accuracy Score on Test set ->  ['0.8122 +/- 0.01319999999999999']
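
The summary above can be reproduced from the five per-run accuracy lists printed earlier. The sketch below takes the best epoch of each run, then the median of those maxima. Reading the `+/-` term as the largest absolute deviation of any run's maximum from the median is an assumption inferred from the printed numbers (0.8122 - 0.799), not something the output states; the variable names are hypothetical.

```python
import statistics

# Per-run accuracy lists, copied from the output above (one value per epoch).
run_accuracies = [
    [0.4986, 0.7796, 0.502, 0.7874, 0.7962, 0.793, 0.792, 0.799, 0.7958, 0.6526],
    [0.7982, 0.7874, 0.7846, 0.8142, 0.8008, 0.8064, 0.8018, 0.7996, 0.803, 0.805],
    [0.748, 0.798, 0.8122, 0.8024, 0.8126, 0.8096, 0.8164, 0.8174, 0.8218, 0.8112],
    [0.786, 0.8038, 0.7956, 0.8092, 0.7936, 0.8046, 0.8114, 0.7604, 0.8, 0.8122],
    [0.4986, 0.7604, 0.6746, 0.7934, 0.7914, 0.801, 0.509, 0.7976, 0.5356, 0.7476],
]

# Best epoch of each run, sorted as in the printed summary.
max_accuracies = sorted(max(run) for run in run_accuracies)
median_accuracy = statistics.median(max_accuracies)

# Assumption: the reported spread is the largest distance between the median
# and either extreme of the per-run maxima.
spread = max(median_accuracy - max_accuracies[0],
             max_accuracies[-1] - median_accuracy)

print(max_accuracies)
print(f"{median_accuracy} +/- {spread:.4f}")
```

Rounding the spread when printing avoids the floating-point tail visible in the original output.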





* * * * EVALUATION USING STM_RSW_LOW AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5827314481498316, 'tp': 2128, 'tn': 1819, 'fp': 674, 'fn': 379, 'auroc': 0.8794987352700846, 'auprc': 0.877220037211751, 'eval_loss': 0.4603658403277397}
Correct predictions are:  3947
Total predictions are:  5000
Accuracy on test set is: 0.7894 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6156133332994859, 'tp': 1915, 'tn': 2118, 'fp': 375, 'fn': 592, 'auroc': 0.8863107886765833, 'auprc': 0.8887982252051422, 'eval_loss': 0.4618244962334633}
Correct predictions are:  4033
Total predictions are:  5000
Accuracy on test set is: 0.8066 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6232292626028708, 'tp': 2051, 'tn': 2007, 'fp': 486, 'fn': 456, 'auroc': 0.8967278303461899, 'auprc': 0.890685051125697, 'eval_loss': 0.4664386723279953}
Correct predictions are:  4058
Total predictions are:  5000
Accuracy on test set is: 0.8116 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6253576656105831, 'tp': 2069, 'tn': 1994, 'fp': 499, 'fn': 438, 'auroc': 0.8897560156871629, 'auprc': 0.8639934294143797, 'eval_loss': 0.4678882896900177}
Correct predictions are:  4063
Total predictions are:  5000
Accuracy on test set is: 0.8126 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6146442293607929, 'tp': 2148, 'tn': 1881, 'fp': 612, 'fn': 359, 'auroc': 0.851272273974628, 'auprc': 0.8086295338518418, 'eval_loss': 0.5150199774265289}
Correct predictions are:  4029
Total predictions are:  5000
Accuracy on test set is: 0.8058 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6220751523431901, 'tp': 1955, 'tn': 2097, 'fp': 396, 'fn': 552, 'auroc': 0.8955126208189472, 'auprc': 0.8942506867819993, 'eval_loss': 0.4674291839718819}
Correct predictions are:  4052
Total predictions are:  5000
Accuracy on test set is: 0.8104 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6109500584957137, 'tp': 1969, 'tn': 2057, 'fp': 436, 'fn': 538, 'auroc': 0.8903387402557236, 'auprc': 0.8831362959642854, 'eval_loss': 0.5069976220846176}
Correct predictions are:  4026
Total predictions are:  5000
Accuracy on test set is: 0.8052 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6080202662678338, 'tp': 2151, 'tn': 1860, 'fp': 633, 'fn': 356, 'auroc': 0.8616756355369827, 'auprc': 0.8327816717038171, 'eval_loss': 0.5229124562978744}
Correct predictions are:  4011
Total predictions are:  5000
Accuracy on test set is: 0.8022 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5989477907734319, 'tp': 2036, 'tn': 1961, 'fp': 532, 'fn': 471, 'auroc': 0.8883536046922607, 'auprc': 0.8841466325917886, 'eval_loss': 0.43901161158084867}
Correct predictions are:  3997
Total predictions are:  5000
Accuracy on test set is: 0.7994 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6362012460624679, 'tp': 1999, 'tn': 2090, 'fp': 403, 'fn': 508, 'auroc': 0.8713225111684875, 'auprc': 0.8027388338245829, 'eval_loss': 0.4931035452008247}
Correct predictions are:  4089
Total predictions are:  5000
Accuracy on test set is: 0.8178 


[0.7894, 0.8066, 0.8116, 0.8126, 0.8058, 0.8104, 0.8052, 0.8022, 0.7994, 0.8178]

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5295534681100038, 'tp': 2046, 'tn': 1771, 'fp': 722, 'fn': 461, 'auroc': 0.7951696741302452, 'auprc': 0.7012047464994486, 'eval_loss': 0.5192037623643875}
Correct predictions are:  3817
Total predictions are:  5000
Accuracy on test set is: 0.7634 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5793261036710029, 'tp': 1957, 'tn': 1991, 'fp': 502, 'fn': 550, 'auroc': 0.8661826308718261, 'auprc': 0.86866148351189, 'eval_loss': 0.5081300094604492}
Correct predictions are:  3948
Total predictions are:  5000
Accuracy on test set is: 0.7896 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6023658857537274, 'tp': 1968, 'tn': 2037, 'fp': 456, 'fn': 539, 'auroc': 0.884699896047185, 'auprc': 0.8789119108830213, 'eval_loss': 0.47185400043725967}
Correct predictions are:  4005
Total predictions are:  5000
Accuracy on test set is: 0.801 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4629245127389032, 'tp': 1027, 'tn': 2426, 'fp': 67, 'fn': 1480, 'auroc': 0.7084625943467397, 'auprc': 0.7908471559428726, 'eval_loss': 0.6442597319126129}
Correct predictions are:  3453
Total predictions are:  5000
Accuracy on test set is: 0.6906 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5801657291590578, 'tp': 2188, 'tn': 1741, 'fp': 752, 'fn': 319, 'auroc': 0.8750543804263425, 'auprc': 0.8530924311667605, 'eval_loss': 0.5075619214773178}
Correct predictions are:  3929
Total predictions are:  5000
Accuracy on test set is: 0.7858 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.
{'mcc': 0.5948287838337353, 'tp': 2129, 'tn': 1850, 'fp': 643, 'fn': 378, 'auroc': 0.8761701491739694, 'auprc': 0.8646745798459962, 'eval_loss': 0.4990461624622345}
Correct predictions are:  3979
Total predictions are:  5000
Accuracy on test set is: 0.7958 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5974716064049782, 'tp': 2129, 'tn': 1857, 'fp': 636, 'fn': 378, 'auroc': 0.8891214507121735, 'auprc': 0.8867362065160982, 'eval_loss': 0.49754379211068156}
Correct predictions are:  3986
Total predictions are:  5000
Accuracy on test set is: 0.7972 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6151741233117528, 'tp': 2104, 'tn': 1931, 'fp': 562, 'fn': 403, 'auroc': 0.8767247135217541, 'auprc': 0.8731131319028201, 'eval_loss': 0.47107796103954314}
Correct predictions are:  4035
Total predictions are:  5000
Accuracy on test set is: 0.807 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5899719094859809, 'tp': 2243, 'tn': 1699, 'fp': 794, 'fn': 264, 'auroc': 0.8902242593581935, 'auprc': 0.8873552437936, 'eval_loss': 0.509838357514143}
Correct predictions are:  3942
Total predictions are:  5000
Accuracy on test set is: 0.7884 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6042819752480495, 'tp': 2206, 'tn': 1785, 'fp': 708, 'fn': 301, 'auroc': 0.8103099528300302, 'auprc': 0.720093825396757, 'eval_loss': 0.5010407052159309}
Correct predictions are:  3991
Total predictions are:  5000
Accuracy on test set is: 0.7982 


[0.7634, 0.7896, 0.801, 0.6906, 0.7858, 0.7958, 0.7972, 0.807, 0.7884, 0.7982]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6227975150234389, 'tp': 2035, 'tn': 2022, 'fp': 471, 'fn': 472, 'auroc': 0.8898645765382801, 'auprc': 0.8858294307114812, 'eval_loss': 0.4236627125740051}
Correct predictions are:  4057
Total predictions are:  5000
Accuracy on test set is: 0.8114 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6317600595104986, 'tp': 2077, 'tn': 2002, 'fp': 491, 'fn': 430, 'auroc': 0.8991113690331332, 'auprc': 0.8977273574065016, 'eval_loss': 0.4515158511161804}
Correct predictions are:  4079
Total predictions are:  5000
Accuracy on test set is: 0.8158 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6508348416371301, 'tp': 2159, 'tn': 1964, 'fp': 529, 'fn': 348, 'auroc': 0.9030149996375971, 'auprc': 0.9011697712377004, 'eval_loss': 0.4464569105744362}
Correct predictions are:  4123
Total predictions are:  5000
Accuracy on test set is: 0.8246 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6296504834179582, 'tp': 2176, 'tn': 1889, 'fp': 604, 'fn': 331, 'auroc': 0.8957018223022868, 'auprc': 0.8839956488584577, 'eval_loss': 0.5144481601178647}
Correct predictions are:  4065
Total predictions are:  5000
Accuracy on test set is: 0.813 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6491186777807951, 'tp': 2164, 'tn': 1954, 'fp': 539, 'fn': 343, 'auroc': 0.8877347198402036, 'auprc': 0.8369210298687381, 'eval_loss': 0.489417466378212}
Correct predictions are:  4118
Total predictions are:  5000
Accuracy on test set is: 0.8236 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.662268750843115, 'tp': 2223, 'tn': 1922, 'fp': 571, 'fn': 284, 'auroc': 0.9103628172444872, 'auprc': 0.9063947734360377, 'eval_loss': 0.5058231715202332}
Correct predictions are:  4145
Total predictions are:  5000
Accuracy on test set is: 0.829 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6513470672928897, 'tp': 2046, 'tn': 2082, 'fp': 411, 'fn': 461, 'auroc': 0.9052446971184254, 'auprc': 0.9001659865288093, 'eval_loss': 0.5323032192409038}
Correct predictions are:  4128
Total predictions are:  5000
Accuracy on test set is: 0.8256 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6501003647608028, 'tp': 2149, 'tn': 1973, 'fp': 520, 'fn': 358, 'auroc': 0.9051841366436312, 'auprc': 0.8991179263567741, 'eval_loss': 0.5592438753306865}
Correct predictions are:  4122
Total predictions are:  5000
Accuracy on test set is: 0.8244 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6579565463184003, 'tp': 2154, 'tn': 1988, 'fp': 505, 'fn': 353, 'auroc': 0.9055266193286954, 'auprc': 0.9018535942410484, 'eval_loss': 0.5850512365192175}
Correct predictions are:  4142
Total predictions are:  5000
Accuracy on test set is: 0.8284 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6355079161100244, 'tp': 1910, 'tn': 2169, 'fp': 324, 'fn': 597, 'auroc': 0.9060244632317918, 'auprc': 0.9025321953892989, 'eval_loss': 0.6314675674438477}
Correct predictions are:  4079
Total predictions are:  5000
Accuracy on test set is: 0.8158 


[0.8114, 0.8158, 0.8246, 0.813, 0.8236, 0.829, 0.8256, 0.8244, 0.8284, 0.8158]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6570724419476381, 'tp': 2136, 'tn': 2005, 'fp': 488, 'fn': 371, 'auroc': 0.907968398472244, 'auprc': 0.9052270290789067, 'eval_loss': 0.39380376536250117}
Correct predictions are:  4141
Total predictions are:  5000
Accuracy on test set is: 0.8282 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6580463427927525, 'tp': 2194, 'tn': 1944, 'fp': 549, 'fn': 313, 'auroc': 0.913692123346247, 'auprc': 0.910842821759295, 'eval_loss': 0.4057748050630093}
Correct predictions are:  4138
Total predictions are:  5000
Accuracy on test set is: 0.8276 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6784158245685529, 'tp': 2116, 'tn': 2080, 'fp': 413, 'fn': 391, 'auroc': 0.9165785459758004, 'auprc': 0.914034704663049, 'eval_loss': 0.44216965155005455}
Correct predictions are:  4196
Total predictions are:  5000
Accuracy on test set is: 0.8392 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6788981093210268, 'tp': 2183, 'tn': 2011, 'fp': 482, 'fn': 324, 'auroc': 0.9148368523209222, 'auprc': 0.9101567215243735, 'eval_loss': 0.5137875683665275}
Correct predictions are:  4194
Total predictions are:  5000
Accuracy on test set is: 0.8388 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6767405211413224, 'tp': 2144, 'tn': 2047, 'fp': 446, 'fn': 363, 'auroc': 0.9171689506045727, 'auprc': 0.9123920728853931, 'eval_loss': 0.5471041945174336}
Correct predictions are:  4191
Total predictions are:  5000
Accuracy on test set is: 0.8382 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.670443270748131, 'tp': 2082, 'tn': 2094, 'fp': 399, 'fn': 425, 'auroc': 0.9159539010785844, 'auprc': 0.9115027571318894, 'eval_loss': 0.6361440266117454}
Correct predictions are:  4176
Total predictions are:  5000
Accuracy on test set is: 0.8352 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6753581561618216, 'tp': 2130, 'tn': 2058, 'fp': 435, 'fn': 377, 'auroc': 0.9155802981495376, 'auprc': 0.9113055339125011, 'eval_loss': 0.6579405097961426}
Correct predictions are:  4188
Total predictions are:  5000
Accuracy on test set is: 0.8376 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6813542928053021, 'tp': 2137, 'tn': 2066, 'fp': 427, 'fn': 370, 'auroc': 0.9172169509808958, 'auprc': 0.9121587870437948, 'eval_loss': 0.7669736339345574}
Correct predictions are:  4203
Total predictions are:  5000
Accuracy on test set is: 0.8406 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6660369956068496, 'tp': 2105, 'tn': 2060, 'fp': 433, 'fn': 402, 'auroc': 0.9145376499751758, 'auprc': 0.9076994521050301, 'eval_loss': 0.7508420505546033}
Correct predictions are:  4165
Total predictions are:  5000
Accuracy on test set is: 0.833 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6689385067914632, 'tp': 2069, 'tn': 2103, 'fp': 390, 'fn': 438, 'auroc': 0.9139096450516172, 'auprc': 0.9085447385401031, 'eval_loss': 0.7968143521904946}
Correct predictions are:  4172
Total predictions are:  5000
Accuracy on test set is: 0.8344 


[0.8282, 0.8276, 0.8392, 0.8388, 0.8382, 0.8352, 0.8376, 0.8406, 0.833, 0.8344]

RUN NUMBER:  5




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.7349656821309479, 'auprc': 0.7189900653935297, 'eval_loss': 0.6929263680458069}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.4945101035255435, 'tp': 2061, 'tn': 1661, 'fp': 832, 'fn': 446, 'auroc': 0.8189947409187688, 'auprc': 0.7956623646089418, 'eval_loss': 0.544583643078804}
Correct predictions are:  3722
Total predictions are:  5000
Accuracy on test set is: 0.7444 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5618377945280224, 'tp': 1925, 'tn': 1979, 'fp': 514, 'fn': 582, 'auroc': 0.8648680605655948, 'auprc': 0.8612655504323727, 'eval_loss': 0.48241135417222974}
Correct predictions are:  3904
Total predictions are:  5000
Accuracy on test set is: 0.7808 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5925401444033584, 'tp': 2155, 'tn': 1814, 'fp': 679, 'fn': 352, 'auroc': 0.7988668231158932, 'auprc': 0.6916812155111662, 'eval_loss': 0.5244604750990868}
Correct predictions are:  3969
Total predictions are:  5000
Accuracy on test set is: 0.7938 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5748217786554717, 'tp': 1790, 'tn': 2132, 'fp': 361, 'fn': 717, 'auroc': 0.8770535961001934, 'auprc': 0.8631334684809286, 'eval_loss': 0.47992018095254896}
Correct predictions are:  3922
Total predictions are:  5000
Accuracy on test set is: 0.7844 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5691758234810582, 'tp': 2314, 'tn': 1543, 'fp': 950, 'fn': 193, 'auroc': 0.80372262118535, 'auprc': 0.7161553610717583, 'eval_loss': 0.5150839552998543}
Correct predictions are:  3857
Total predictions are:  5000
Accuracy on test set is: 0.7714 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6140770181771964, 'tp': 2047, 'tn': 1988, 'fp': 505, 'fn': 460, 'auroc': 0.8848100569108464, 'auprc': 0.8774214718755011, 'eval_loss': 0.47489205641746524}
Correct predictions are:  4035
Total predictions are:  5000
Accuracy on test set is: 0.807 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6086926999527656, 'tp': 1874, 'tn': 2138, 'fp': 355, 'fn': 633, 'auroc': 0.8883304045103715, 'auprc': 0.8840837246008411, 'eval_loss': 0.49946857986450194}
Correct predictions are:  4012
Total predictions are:  5000
Accuracy on test set is: 0.8024 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6060780364530192, 'tp': 1934, 'tn': 2078, 'fp': 415, 'fn': 573, 'auroc': 0.878383526526848, 'auprc': 0.8566164103098344, 'eval_loss': 0.49339949079751966}
Correct predictions are:  4012
Total predictions are:  5000
Accuracy on test set is: 0.8024 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.44756420681075804, 'tp': 1017, 'tn': 2407, 'fp': 86, 'fn': 1490, 'auroc': 0.8758647067793011, 'auprc': 0.8646035617135928, 'eval_loss': 0.5824050602912902}
Correct predictions are:  3424
Total predictions are:  5000
Accuracy on test set is: 0.6848 


[0.5014, 0.7444, 0.7808, 0.7938, 0.7844, 0.7714, 0.807, 0.8024, 0.8024, 0.6848]


 Over all runs maximum accuracies are: [0.807, 0.807, 0.8178, 0.829, 0.8406]
The median is: 0.8178
RoBERTa Accuracy Score on Test set ->  ['0.8178 +/- 0.022800000000000042']
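
The summary above can be reproduced from the list of per-run maximum accuracies. The code that prints it is not part of this output, so the sketch below rests on an assumption: the value after `+/-` is taken to be the distance from the median to the best run (0.8406 - 0.8178), which is consistent with the printed numbers. The variable names are illustrative only.

```python
import statistics

# Best test accuracy of each of the five runs, as printed above (already sorted).
max_accuracies = [0.807, 0.807, 0.8178, 0.829, 0.8406]

median_acc = statistics.median(max_accuracies)

# Assumption: the reported spread is (best run - median), not a standard deviation.
spread = max(max_accuracies) - median_acc

# Rounding avoids floating-point noise such as 0.022800000000000042 in the report.
print(f"{median_acc} +/- {round(spread, 4)}")
```

Rounding the spread before formatting keeps the reported score readable without changing the underlying values.
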





* * * * EVALUATION USING RSW_LOW_STM AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.45789209090570343, 'tp': 2093, 'tn': 1524, 'fp': 969, 'fn': 414, 'auroc': 0.807209688523958, 'auprc': 0.8037128491811183, 'eval_loss': 0.5714673122644425}
Correct predictions are:  3617
Total predictions are:  5000
Accuracy on test set is: 0.7234 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5554506397369465, 'tp': 2122, 'tn': 1753, 'fp': 740, 'fn': 385, 'auroc': 0.8653434242924465, 'auprc': 0.8648718931496402, 'eval_loss': 0.5029265428185463}
Correct predictions are:  3875
Total predictions are:  5000
Accuracy on test set is: 0.775 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.591248569835969, 'tp': 1949, 'tn': 2028, 'fp': 465, 'fn': 558, 'auroc': 0.8766114326336318, 'auprc': 0.8708686493797683, 'eval_loss': 0.478712079668045}
Correct predictions are:  3977
Total predictions are:  5000
Accuracy on test set is: 0.7954 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5981204978241678, 'tp': 1954, 'tn': 2040, 'fp': 453, 'fn': 553, 'auroc': 0.8815043509941118, 'auprc': 0.8553998374078406, 'eval_loss': 0.5169130062818528}
Correct predictions are:  3994
Total predictions are:  5000
Accuracy on test set is: 0.7988 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5950474922413889, 'tp': 2109, 'tn': 1873, 'fp': 620, 'fn': 398, 'auroc': 0.8819017941100658, 'auprc': 0.8829339485527575, 'eval_loss': 0.5152659676551818}
Correct predictions are:  3982
Total predictions are:  5000
Accuracy on test set is: 0.7964 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5999117781956329, 'tp': 1887, 'tn': 2106, 'fp': 387, 'fn': 620, 'auroc': 0.886803432538911, 'auprc': 0.8880566458614279, 'eval_loss': 0.4971036990761757}
Correct predictions are:  3993
Total predictions are:  5000
Accuracy on test set is: 0.7986 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6005162536351862, 'tp': 2078, 'tn': 1921, 'fp': 572, 'fn': 429, 'auroc': 0.8648079000939367, 'auprc': 0.8754590900004546, 'eval_loss': 0.508961179959774}
Correct predictions are:  3999
Total predictions are:  5000
Accuracy on test set is: 0.7998 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6065274775643947, 'tp': 2187, 'tn': 1814, 'fp': 679, 'fn': 320, 'auroc': 0.8931274821194597, 'auprc': 0.8913759361739597, 'eval_loss': 0.4820817490935326}
Correct predictions are:  4001
Total predictions are:  5000
Accuracy on test set is: 0.8002 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6158104303274236, 'tp': 2084, 'tn': 1954, 'fp': 539, 'fn': 423, 'auroc': 0.8620200382370998, 'auprc': 0.8832430309386037, 'eval_loss': 0.5009721420049668}
Correct predictions are:  4038
Total predictions are:  5000
Accuracy on test set is: 0.8076 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5858187765317022, 'tp': 2223, 'tn': 1713, 'fp': 780, 'fn': 284, 'auroc': 0.8652326234237677, 'auprc': 0.8091671575854157, 'eval_loss': 0.489897701638937}
Correct predictions are:  3936
Total predictions are:  5000
Accuracy on test set is: 0.7872 


[0.7234, 0.775, 0.7954, 0.7988, 0.7964, 0.7986, 0.7998, 0.8002, 0.8076, 0.7872]

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6522732647149581, 'tp': 1967, 'tn': 2158, 'fp': 335, 'fn': 540, 'auroc': 0.9115066662122631, 'auprc': 0.9085487491557442, 'eval_loss': 0.4172042602002621}
Correct predictions are:  4125
Total predictions are:  5000
Accuracy on test set is: 0.825 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6576730027824512, 'tp': 2168, 'tn': 1972, 'fp': 521, 'fn': 339, 'auroc': 0.904813173735282, 'auprc': 0.8959949880142655, 'eval_loss': 0.4479407170176506}
Correct predictions are:  4140
Total predictions are:  5000
Accuracy on test set is: 0.828 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6157655717111594, 'tp': 2131, 'tn': 1903, 'fp': 590, 'fn': 376, 'auroc': 0.891786511606251, 'auprc': 0.8843450109604926, 'eval_loss': 0.508691627830267}
Correct predictions are:  4034
Total predictions are:  5000
Accuracy on test set is: 0.8068 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.640183857950512, 'tp': 2112, 'tn': 1987, 'fp': 506, 'fn': 395, 'auroc': 0.8989748879631217, 'auprc': 0.8955518218494509, 'eval_loss': 0.4680416236758232}
Correct predictions are:  4099
Total predictions are:  5000
Accuracy on test set is: 0.8198 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6542156771705459, 'tp': 2044, 'tn': 2091, 'fp': 402, 'fn': 463, 'auroc': 0.9048946143737766, 'auprc': 0.9002004527037095, 'eval_loss': 0.5147956947863102}
Correct predictions are:  4135
Total predictions are:  5000
Accuracy on test set is: 0.827 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6365736802063908, 'tp': 2084, 'tn': 2007, 'fp': 486, 'fn': 423, 'auroc': 0.8574580824713666, 'auprc': 0.7757680698718952, 'eval_loss': 0.5000350476503372}
Correct predictions are:  4091
Total predictions are:  5000
Accuracy on test set is: 0.8182 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6455984171001983, 'tp': 2069, 'tn': 2045, 'fp': 448, 'fn': 438, 'auroc': 0.9040282875817747, 'auprc': 0.8966884850652637, 'eval_loss': 0.498767427790165}
Correct predictions are:  4114
Total predictions are:  5000
Accuracy on test set is: 0.8228 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6593723393987881, 'tp': 2162, 'tn': 1983, 'fp': 510, 'fn': 345, 'auroc': 0.9111819436664383, 'auprc': 0.9098591027256466, 'eval_loss': 0.4884637660264969}
Correct predictions are:  4145
Total predictions are:  5000
Accuracy on test set is: 0.829 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6580111944764783, 'tp': 2073, 'tn': 2072, 'fp': 421, 'fn': 434, 'auroc': 0.9086691239659317, 'auprc': 0.9026143542608959, 'eval_loss': 0.5377797941684723}
Correct predictions are:  4145
Total predictions are:  5000
Accuracy on test set is: 0.829 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6445298313375458, 'tp': 2039, 'tn': 2072, 'fp': 421, 'fn': 468, 'auroc': 0.902519155750181, 'auprc': 0.8979177448568143, 'eval_loss': 0.5938370259106159}
Correct predictions are:  4111
Total predictions are:  5000
Accuracy on test set is: 0.8222 


[0.825, 0.828, 0.8068, 0.8198, 0.827, 0.8182, 0.8228, 0.829, 0.829, 0.8222]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6385922253727493, 'tp': 2195, 'tn': 1891, 'fp': 602, 'fn': 312, 'auroc': 0.9068199094680903, 'auprc': 0.9056844362535006, 'eval_loss': 0.41518423107862473}
Correct predictions are:  4086
Total predictions are:  5000
Accuracy on test set is: 0.8172 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6689801911264991, 'tp': 2124, 'tn': 2048, 'fp': 445, 'fn': 383, 'auroc': 0.9134462014182191, 'auprc': 0.9105631375356436, 'eval_loss': 0.4336043484866619}
Correct predictions are:  4172
Total predictions are:  5000
Accuracy on test set is: 0.8344 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6648370173154943, 'tp': 2076, 'tn': 2086, 'fp': 407, 'fn': 431, 'auroc': 0.916231343253731, 'auprc': 0.9139494096279456, 'eval_loss': 0.4172482461571693}
Correct predictions are:  4162
Total predictions are:  5000
Accuracy on test set is: 0.8324 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.65430512995343, 'tp': 2134, 'tn': 2000, 'fp': 493, 'fn': 373, 'auroc': 0.9029120388303846, 'auprc': 0.8981018377693547, 'eval_loss': 0.5048988553404808}
Correct predictions are:  4134
Total predictions are:  5000
Accuracy on test set is: 0.8268 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6491439031839303, 'tp': 2029, 'tn': 2093, 'fp': 400, 'fn': 478, 'auroc': 0.8981932018347023, 'auprc': 0.8720405385647092, 'eval_loss': 0.5000810361683369}
Correct predictions are:  4122
Total predictions are:  5000
Accuracy on test set is: 0.8244 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6559809114905671, 'tp': 1990, 'tn': 2146, 'fp': 347, 'fn': 517, 'auroc': 0.9104654580491912, 'auprc': 0.9072989820375341, 'eval_loss': 0.4688999691694975}
Correct predictions are:  4136
Total predictions are:  5000
Accuracy on test set is: 0.8272 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6447169147425593, 'tp': 2123, 'tn': 1987, 'fp': 506, 'fn': 384, 'auroc': 0.9069209502602501, 'auprc': 0.9070919198895632, 'eval_loss': 0.5764836792856455}
Correct predictions are:  4110
Total predictions are:  5000
Accuracy on test set is: 0.822 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6288408806311981, 'tp': 2255, 'tn': 1792, 'fp': 701, 'fn': 252, 'auroc': 0.8974009556234921, 'auprc': 0.8796567652528252, 'eval_loss': 0.6011904692351818}
Correct predictions are:  4047
Total predictions are:  5000
Accuracy on test set is: 0.8094 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6316253549224045, 'tp': 1936, 'tn': 2137, 'fp': 356, 'fn': 571, 'auroc': 0.8939479685520735, 'auprc': 0.8965721428633397, 'eval_loss': 0.5730273792266846}
Correct predictions are:  4073
Total predictions are:  5000
Accuracy on test set is: 0.8146 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6445191693632344, 'tp': 1942, 'tn': 2162, 'fp': 331, 'fn': 565, 'auroc': 0.9060596635077619, 'auprc': 0.9049869773236683, 'eval_loss': 0.5461396047234536}
Correct predictions are:  4104
Total predictions are:  5000
Accuracy on test set is: 0.8208 


[0.8172, 0.8344, 0.8324, 0.8268, 0.8244, 0.8272, 0.822, 0.8094, 0.8146, 0.8208]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.598732534063067, 'auprc': 0.5641156222814498, 'eval_loss': 0.6932618310928345}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.7093102009919757, 'auprc': 0.7151448337435009, 'eval_loss': 0.6931814058303833}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.6985291564685867, 'auprc': 0.6865167601254649, 'eval_loss': 0.6935119316101074}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.45607831868153054, 'tp': 1968, 'tn': 1665, 'fp': 828, 'fn': 539, 'auroc': 0.7843208690756136, 'auprc': 0.7674903946414201, 'eval_loss': 0.6166477967262268}
Correct predictions are:  3633
Total predictions are:  5000
Accuracy on test set is: 0.7266 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.8284898553604662, 'auprc': 0.8174244069827998, 'eval_loss': 0.6933281583786011}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5346135931668134, 'tp': 2099, 'tn': 1724, 'fp': 769, 'fn': 408, 'auroc': 0.8540846960240168, 'auprc': 0.8520058487782176, 'eval_loss': 0.5326273390054703}
Correct predictions are:  3823
Total predictions are:  5000
Accuracy on test set is: 0.7646 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5732171803842407, 'tp': 1987, 'tn': 1946, 'fp': 547, 'fn': 520, 'auroc': 0.8697654589611983, 'auprc': 0.8674683192963593, 'eval_loss': 0.5231236866950989}
Correct predictions are:  3933
Total predictions are:  5000
Accuracy on test set is: 0.7866 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.515039277534465, 'tp': 1386, 'tn': 2307, 'fp': 186, 'fn': 1121, 'auroc': 0.8563067934452606, 'auprc': 0.8614632834668711, 'eval_loss': 0.5564842728376389}
Correct predictions are:  3693
Total predictions are:  5000
Accuracy on test set is: 0.7386 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5824985708324629, 'tp': 1964, 'tn': 1992, 'fp': 501, 'fn': 543, 'auroc': 0.8722175581856562, 'auprc': 0.8675794192773678, 'eval_loss': 0.5113457057356834}
Correct predictions are:  3956
Total predictions are:  5000
Accuracy on test set is: 0.7912 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.592404063700453, 'tp': 1953, 'tn': 2027, 'fp': 466, 'fn': 554, 'auroc': 0.8775508799988991, 'auprc': 0.8785462534117109, 'eval_loss': 0.4772482273101807}
Correct predictions are:  3980
Total predictions are:  5000
Accuracy on test set is: 0.796 


[0.4986, 0.5014, 0.5014, 0.7266, 0.5014, 0.7646, 0.7866, 0.7386, 0.7912, 0.796]
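
In the run above, epochs 0, 1, 2 and 4 report `mcc` 0.0 with every prediction in a single class, and an `eval_loss` close to 0.693. A small check like the one below can flag such collapsed epochs. It is a sketch under stated assumptions: the helper `is_collapsed` is hypothetical, and the comparison with ln 2 assumes that `eval_loss` is a mean two-class cross-entropy in nats, for which a uniform guess gives exactly ln 2.

```python
import math


def is_collapsed(tp, tn, fp, fn):
    """Return True when all predictions fall into one class (hypothetical helper)."""
    no_positive_predictions = (tp + fp) == 0
    no_negative_predictions = (tn + fn) == 0
    return no_positive_predictions or no_negative_predictions


# Counts copied from epoch 0 and epoch 3 of the run above.
epoch0 = {'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507}
epoch3 = {'tp': 1968, 'tn': 1665, 'fp': 828, 'fn': 539}

print(is_collapsed(**epoch0), is_collapsed(**epoch3))

# Cross-entropy of a uniform guess over two classes (assumed loss definition).
chance_loss = math.log(2)
print(round(chance_loss, 4))
```

Because only the best epoch of each run enters the final median, a collapsed epoch affects the reported score only when the whole run fails to recover.
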

RUN NUMBER:  5




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6705684454856312, 'tp': 2125, 'tn': 2051, 'fp': 442, 'fn': 382, 'auroc': 0.9107372201798062, 'auprc': 0.9060670082783973, 'eval_loss': 0.390125433909893}
Correct predictions are:  4176
Total predictions are:  5000
Accuracy on test set is: 0.8352 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.683242139950775, 'tp': 2127, 'tn': 2081, 'fp': 412, 'fn': 380, 'auroc': 0.922531232644864, 'auprc': 0.919718460612471, 'eval_loss': 0.3898367895424366}
Correct predictions are:  4208
Total predictions are:  5000
Accuracy on test set is: 0.8416 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6896640376756672, 'tp': 2138, 'tn': 2086, 'fp': 407, 'fn': 369, 'auroc': 0.9236114011133847, 'auprc': 0.9224857842056573, 'eval_loss': 0.45016738476529716}
Correct predictions are:  4224
Total predictions are:  5000
Accuracy on test set is: 0.8448 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.688946721580046, 'tp': 2169, 'tn': 2052, 'fp': 441, 'fn': 338, 'auroc': 0.9237219619801819, 'auprc': 0.9213168803731353, 'eval_loss': 0.6083554167695343}
Correct predictions are:  4221
Total predictions are:  5000
Accuracy on test set is: 0.8442 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6864216553458997, 'tp': 2106, 'tn': 2110, 'fp': 383, 'fn': 401, 'auroc': 0.9219518681026461, 'auprc': 0.9199809415861427, 'eval_loss': 0.6356915968537331}
Correct predictions are:  4216
Total predictions are:  5000
Accuracy on test set is: 0.8432 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6851202237249955, 'tp': 2153, 'tn': 2059, 'fp': 434, 'fn': 354, 'auroc': 0.9208481794497269, 'auprc': 0.9190835841174891, 'eval_loss': 0.6342100040309131}
Correct predictions are:  4212
Total predictions are:  5000
Accuracy on test set is: 0.8424 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6774451006577661, 'tp': 2139, 'tn': 2054, 'fp': 439, 'fn': 368, 'auroc': 0.9215476249333795, 'auprc': 0.9181699478494754, 'eval_loss': 0.6701958037793636}
Correct predictions are:  4193
Total predictions are:  5000
Accuracy on test set is: 0.8386 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6814320415775814, 'tp': 2178, 'tn': 2023, 'fp': 470, 'fn': 329, 'auroc': 0.9189972049380868, 'auprc': 0.9156801943256554, 'eval_loss': 0.7407045612581075}
Correct predictions are:  4201
Total predictions are:  5000
Accuracy on test set is: 0.8402 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6797104909658948, 'tp': 2131, 'tn': 2068, 'fp': 425, 'fn': 376, 'auroc': 0.919925692217427, 'auprc': 0.918370206486555, 'eval_loss': 0.725298882190138}
Correct predictions are:  4199
Total predictions are:  5000
Accuracy on test set is: 0.8398 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.673794903037646, 'tp': 2131, 'tn': 2053, 'fp': 440, 'fn': 376, 'auroc': 0.9189280043955544, 'auprc': 0.9169436585691602, 'eval_loss': 0.8425950717188417}
Correct predictions are:  4184
Total predictions are:  5000
Accuracy on test set is: 0.8368 


[0.8352, 0.8416, 0.8448, 0.8442, 0.8432, 0.8424, 0.8386, 0.8402, 0.8398, 0.8368]


 Over all runs maximum accuracies are: [0.796, 0.8076, 0.829, 0.8344, 0.8448]
The median is: 0.829
RoBERTa Accuracy Score on Test set ->  ['0.829 +/- 0.03299999999999992']
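
The summary lines above can be reproduced with a short sketch. It assumes, based only on the printed numbers, that the reported spread is the largest absolute deviation of any run's best accuracy from the median. The notebook's own computation does not appear in this output, and the helper name `summarize_runs` is hypothetical.

```python
from statistics import median

def summarize_runs(max_accuracies):
    """Return (median, spread) for the per-run best accuracies.

    Assumption: spread is the largest absolute deviation from the median,
    which is consistent with the values printed in the log.
    """
    med = median(max_accuracies)
    spread = max(abs(acc - med) for acc in max_accuracies)
    return med, spread

# Best accuracy of each of the five runs, copied from the log above.
best_per_run = [0.796, 0.8076, 0.829, 0.8344, 0.8448]
med, spread = summarize_runs(best_per_run)

# Fixed-width formatting hides float noise such as 0.03299999999999992.
print(f"{med:.4f} +/- {spread:.4f}")
```

Formatting the spread to four decimals keeps the report readable without changing the underlying values.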





* * * * EVALUATION USING RSW_STM_LOW AS PREPROCESSING FUNCTION * * * *

RUN NUMBER:  1




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6512043521693864, 'tp': 2079, 'tn': 2049, 'fp': 444, 'fn': 428, 'auroc': 0.9048778142420638, 'auprc': 0.8994201074860784, 'eval_loss': 0.40557987733483314}
Correct predictions are:  4128
Total predictions are:  5000
Accuracy on test set is: 0.8256 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6740135574287255, 'tp': 2110, 'tn': 2075, 'fp': 418, 'fn': 397, 'auroc': 0.9171555104992023, 'auprc': 0.91256878454957, 'eval_loss': 0.38623946703672407}
Correct predictions are:  4185
Total predictions are:  5000
Accuracy on test set is: 0.837 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6602388923790115, 'tp': 2153, 'tn': 1995, 'fp': 498, 'fn': 354, 'auroc': 0.9081778401142665, 'auprc': 0.9042335937621013, 'eval_loss': 0.48595071543455126}
Correct predictions are:  4148
Total predictions are:  5000
Accuracy on test set is: 0.8296 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6614637339751291, 'tp': 2202, 'tn': 1944, 'fp': 549, 'fn': 305, 'auroc': 0.9157951798342099, 'auprc': 0.9106744510751862, 'eval_loss': 0.5205044457912446}
Correct predictions are:  4146
Total predictions are:  5000
Accuracy on test set is: 0.8292 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.66794795331784, 'tp': 2152, 'tn': 2016, 'fp': 477, 'fn': 355, 'auroc': 0.9136398829366823, 'auprc': 0.9083738960316788, 'eval_loss': 0.5003267734348774}
Correct predictions are:  4168
Total predictions are:  5000
Accuracy on test set is: 0.8336 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6685279836724588, 'tp': 2119, 'tn': 2052, 'fp': 441, 'fn': 388, 'auroc': 0.9115220263326864, 'auprc': 0.901116109701853, 'eval_loss': 0.6172272881940007}
Correct predictions are:  4171
Total predictions are:  5000
Accuracy on test set is: 0.8342 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6615038947744839, 'tp': 2027, 'tn': 2125, 'fp': 368, 'fn': 480, 'auroc': 0.9108234608559331, 'auprc': 0.9060865077608216, 'eval_loss': 0.5880864151716232}
Correct predictions are:  4152
Total predictions are:  5000
Accuracy on test set is: 0.8304 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6538475508249395, 'tp': 2004, 'tn': 2128, 'fp': 365, 'fn': 503, 'auroc': 0.9085292028689503, 'auprc': 0.9032181111949033, 'eval_loss': 0.5642174033731222}
Correct predictions are:  4132
Total predictions are:  5000
Accuracy on test set is: 0.8264 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.666871782797828, 'tp': 2111, 'tn': 2056, 'fp': 437, 'fn': 396, 'auroc': 0.910628259325553, 'auprc': 0.9077934266201785, 'eval_loss': 0.6537563972175121}
Correct predictions are:  4167
Total predictions are:  5000
Accuracy on test set is: 0.8334 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6561422311874182, 'tp': 1957, 'tn': 2176, 'fp': 317, 'fn': 550, 'auroc': 0.9018789907312873, 'auprc': 0.8753867564735003, 'eval_loss': 0.6120661280572415}
Correct predictions are:  4133
Total predictions are:  5000
Accuracy on test set is: 0.8266 


[0.8256, 0.837, 0.8296, 0.8292, 0.8336, 0.8342, 0.8304, 0.8264, 0.8334, 0.8266]

RUN NUMBER:  2




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.580363857452257, 'tp': 1940, 'tn': 2010, 'fp': 483, 'fn': 567, 'auroc': 0.8423128437326949, 'auprc': 0.8556478976643926, 'eval_loss': 0.4944001515865326}
Correct predictions are:  3950
Total predictions are:  5000
Accuracy on test set is: 0.79 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5933891515514262, 'tp': 1891, 'tn': 2087, 'fp': 406, 'fn': 616, 'auroc': 0.8822132365517745, 'auprc': 0.8794683566048469, 'eval_loss': 0.45949745297431943}
Correct predictions are:  3978
Total predictions are:  5000
Accuracy on test set is: 0.7956 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6001148015962388, 'tp': 2163, 'tn': 1825, 'fp': 668, 'fn': 344, 'auroc': 0.8831354037815656, 'auprc': 0.8761316944962345, 'eval_loss': 0.48288584072589874}
Correct predictions are:  3988
Total predictions are:  5000
Accuracy on test set is: 0.7976 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6099279313116635, 'tp': 1980, 'tn': 2044, 'fp': 449, 'fn': 527, 'auroc': 0.8906100223825755, 'auprc': 0.8898229788401081, 'eval_loss': 0.47512486672401427}
Correct predictions are:  4024
Total predictions are:  5000
Accuracy on test set is: 0.8048 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.600424288641508, 'tp': 2022, 'tn': 1979, 'fp': 514, 'fn': 485, 'auroc': 0.8920269134910018, 'auprc': 0.8922655221412166, 'eval_loss': 0.49931270241737363}
Correct predictions are:  4001
Total predictions are:  5000
Accuracy on test set is: 0.8002 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6143858750574951, 'tp': 2231, 'tn': 1782, 'fp': 711, 'fn': 276, 'auroc': 0.8963471073613217, 'auprc': 0.8883083996878981, 'eval_loss': 0.49142840518951414}
Correct predictions are:  4013
Total predictions are:  5000
Accuracy on test set is: 0.8026 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6344305333374848, 'tp': 2065, 'tn': 2021, 'fp': 472, 'fn': 442, 'auroc': 0.8633598887415278, 'auprc': 0.8838902492382232, 'eval_loss': 0.4838585893750191}
Correct predictions are:  4086
Total predictions are:  5000
Accuracy on test set is: 0.8172 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6289904790929196, 'tp': 2014, 'tn': 2058, 'fp': 435, 'fn': 493, 'auroc': 0.8974900763221983, 'auprc': 0.8945735470220456, 'eval_loss': 0.519858059412241}
Correct predictions are:  4072
Total predictions are:  5000
Accuracy on test set is: 0.8144 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6487374906855184, 'tp': 2164, 'tn': 1953, 'fp': 540, 'fn': 343, 'auroc': 0.9087171243422549, 'auprc': 0.9086764073685778, 'eval_loss': 0.4680276470720768}
Correct predictions are:  4117
Total predictions are:  5000
Accuracy on test set is: 0.8234 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6367305423230569, 'tp': 2014, 'tn': 2077, 'fp': 416, 'fn': 493, 'auroc': 0.8967271903411724, 'auprc': 0.8961974671089943, 'eval_loss': 0.4857456985890865}
Correct predictions are:  4091
Total predictions are:  5000
Accuracy on test set is: 0.8182 


[0.79, 0.7956, 0.7976, 0.8048, 0.8002, 0.8026, 0.8172, 0.8144, 0.8234, 0.8182]

RUN NUMBER:  3




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.7715650090696711, 'auprc': 0.7815768044816627, 'eval_loss': 0.6930916538238525}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5007069351022942, 'tp': 2216, 'tn': 1484, 'fp': 1009, 'fn': 291, 'auroc': 0.8472808826821203, 'auprc': 0.8493502957684547, 'eval_loss': 0.5430451931238175}
Correct predictions are:  3700
Total predictions are:  5000
Accuracy on test set is: 0.74 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5284241944255423, 'tp': 2250, 'tn': 1516, 'fp': 977, 'fn': 257, 'auroc': 0.8625804426306702, 'auprc': 0.8584144261169397, 'eval_loss': 0.5384169618606567}
Correct predictions are:  3766
Total predictions are:  5000
Accuracy on test set is: 0.7532 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6000538157622874, 'tp': 1960, 'tn': 2039, 'fp': 454, 'fn': 547, 'auroc': 0.8799606588915656, 'auprc': 0.8803193531767584, 'eval_loss': 0.4787813989043236}
Correct predictions are:  3999
Total predictions are:  5000
Accuracy on test set is: 0.7998 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5769615731535954, 'tp': 1692, 'tn': 2216, 'fp': 277, 'fn': 815, 'auroc': 0.8408499522636258, 'auprc': 0.7762891940756418, 'eval_loss': 0.4923677665829658}
Correct predictions are:  3908
Total predictions are:  5000
Accuracy on test set is: 0.7816 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.11559771800942333, 'tp': 72, 'tn': 2491, 'fp': 2, 'fn': 2435, 'auroc': 0.8575491231851258, 'auprc': 0.8540839978942733, 'eval_loss': 0.6850279695510865}
Correct predictions are:  2563
Total predictions are:  5000
Accuracy on test set is: 0.5126 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5829365457315541, 'tp': 1916, 'tn': 2039, 'fp': 454, 'fn': 591, 'auroc': 0.7827138964769486, 'auprc': 0.7539040991988225, 'eval_loss': 0.5790452673792839}
Correct predictions are:  3955
Total predictions are:  5000
Accuracy on test set is: 0.791 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.40316756083367694, 'auprc': 0.5079811478782825, 'eval_loss': 0.6850839206695557}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5770175966567249, 'tp': 1744, 'tn': 2175, 'fp': 318, 'fn': 763, 'auroc': 0.8712296304403025, 'auprc': 0.8493657653410748, 'eval_loss': 0.5413344659864903}
Correct predictions are:  3919
Total predictions are:  5000
Accuracy on test set is: 0.7838 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.17020092300010053, 'tp': 2507, 'tn': 140, 'fp': 2353, 'fn': 0, 'auroc': 0.8723909195448092, 'auprc': 0.8649467858823205, 'eval_loss': 0.6755483539581298}
Correct predictions are:  2647
Total predictions are:  5000
Accuracy on test set is: 0.5294 


[0.4986, 0.74, 0.7532, 0.7998, 0.7816, 0.5126, 0.791, 0.5014, 0.7838, 0.5294]

RUN NUMBER:  4




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5744012193387706, 'tp': 1690, 'tn': 2212, 'fp': 281, 'fn': 817, 'auroc': 0.8759769476592696, 'auprc': 0.8681356311849255, 'eval_loss': 0.48683997576236726}
Correct predictions are:  3902
Total predictions are:  5000
Accuracy on test set is: 0.7804 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5940643514109394, 'tp': 1781, 'tn': 2183, 'fp': 310, 'fn': 726, 'auroc': 0.884128291565806, 'auprc': 0.8820868640225803, 'eval_loss': 0.5062294722616673}
Correct predictions are:  3964
Total predictions are:  5000
Accuracy on test set is: 0.7928 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6130194704267211, 'tp': 1952, 'tn': 2078, 'fp': 415, 'fn': 555, 'auroc': 0.8893793727342821, 'auprc': 0.8882560601307853, 'eval_loss': 0.4786493512392044}
Correct predictions are:  4030
Total predictions are:  5000
Accuracy on test set is: 0.806 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6114700240237765, 'tp': 1913, 'tn': 2110, 'fp': 383, 'fn': 594, 'auroc': 0.8916185102891206, 'auprc': 0.8883597676968753, 'eval_loss': 0.4781431970477104}
Correct predictions are:  4023
Total predictions are:  5000
Accuracy on test set is: 0.8046 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6084473483729522, 'tp': 2036, 'tn': 1985, 'fp': 508, 'fn': 471, 'auroc': 0.830840113786492, 'auprc': 0.7351550399633098, 'eval_loss': 0.5236061470031739}
Correct predictions are:  4021
Total predictions are:  5000
Accuracy on test set is: 0.8042 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6098636096699076, 'tp': 2135, 'tn': 1883, 'fp': 610, 'fn': 372, 'auroc': 0.876931595143706, 'auprc': 0.8264796577870466, 'eval_loss': 0.47803251518607137}
Correct predictions are:  4018
Total predictions are:  5000
Accuracy on test set is: 0.8036 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.6113991652122043, 'tp': 1991, 'tn': 2037, 'fp': 456, 'fn': 516, 'auroc': 0.8688917721114933, 'auprc': 0.8008433051122514, 'eval_loss': 0.5272000352144242}
Correct predictions are:  4028
Total predictions are:  5000
Accuracy on test set is: 0.8056 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5445543847343279, 'tp': 2341, 'tn': 1431, 'fp': 1062, 'fn': 166, 'auroc': 0.6453358594331379, 'auprc': 0.535807032260618, 'eval_loss': 0.548152256321907}
Correct predictions are:  3772
Total predictions are:  5000
Accuracy on test set is: 0.7544 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.547477658935102, 'tp': 1512, 'tn': 2286, 'fp': 207, 'fn': 995, 'auroc': 0.875479343758055, 'auprc': 0.8674727168069667, 'eval_loss': 0.5778911242961884}
Correct predictions are:  3798
Total predictions are:  5000
Accuracy on test set is: 0.7596 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.555351770833568, 'tp': 2273, 'tn': 1561, 'fp': 932, 'fn': 234, 'auroc': 0.8562730331805801, 'auprc': 0.7896650311983344, 'eval_loss': 0.539209355354309}
Correct predictions are:  3834
Total predictions are:  5000
Accuracy on test set is: 0.7668 


[0.7804, 0.7928, 0.806, 0.8046, 0.8042, 0.8036, 0.8056, 0.7544, 0.7596, 0.7668]

RUN NUMBER:  5




EPOCH NUMBER:  0

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 2507, 'tn': 0, 'fp': 2493, 'fn': 0, 'auroc': 0.27644680734296956, 'auprc': 0.3704816650816938, 'eval_loss': 0.6939984322547913}
Correct predictions are:  2507
Total predictions are:  5000
Accuracy on test set is: 0.5014 



EPOCH NUMBER:  1

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.0, 'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507, 'auroc': 0.798098017088454, 'auprc': 0.7992092687833255, 'eval_loss': 0.6931283585548401}
Correct predictions are:  2493
Total predictions are:  5000
Accuracy on test set is: 0.4986 



EPOCH NUMBER:  2

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.564176578329915, 'tp': 1890, 'tn': 2018, 'fp': 475, 'fn': 617, 'auroc': 0.8563351136672911, 'auprc': 0.8564736168623516, 'eval_loss': 0.5111249790191651}
Correct predictions are:  3908
Total predictions are:  5000
Accuracy on test set is: 0.7816 



EPOCH NUMBER:  3

NOW TRAIN THE MODEL.

NOW EVALUATE THE TEST DF.




{'mcc': 0.58586652433231, 'tp': 1879, 'tn': 2080, 'fp': 413, 'fn': 628, 'auroc': 0.8770653561923926, 'auprc': 0.8787057944437645, 'eval_loss': 0.4939262263417244}
Correct predictions are:  3959
Total predictions are:  5000
Accuracy on test set is: 0.7918 



EPOCH NUMBER:  4

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5923763401403235, 'tp': 1791, 'tn': 2171, 'fp': 322, 'fn': 716, 'auroc': 0.8838447693429917, 'auprc': 0.8810824967742336, 'eval_loss': 0.5132007195830345}
Correct predictions are:  3962
Total predictions are:  5000
Accuracy on test set is: 0.7924 



EPOCH NUMBER:  5

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.590421741748309, 'tp': 2168, 'tn': 1793, 'fp': 700, 'fn': 339, 'auroc': 0.8813509097911327, 'auprc': 0.868647392117236, 'eval_loss': 0.5071105330109597}
Correct predictions are:  3961
Total predictions are:  5000
Accuracy on test set is: 0.7922 



EPOCH NUMBER:  6

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5967697848278001, 'tp': 2068, 'tn': 1922, 'fp': 571, 'fn': 439, 'auroc': 0.8754818237774984, 'auprc': 0.8667533826427959, 'eval_loss': 0.5109570847392082}
Correct predictions are:  3990
Total predictions are:  5000
Accuracy on test set is: 0.798 



EPOCH NUMBER:  7

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5846683506751273, 'tp': 1888, 'tn': 2069, 'fp': 424, 'fn': 619, 'auroc': 0.8516425968779595, 'auprc': 0.7950767692459505, 'eval_loss': 0.5026116547942161}
Correct predictions are:  3957
Total predictions are:  5000
Accuracy on test set is: 0.7914 



EPOCH NUMBER:  8

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.5674634105378706, 'tp': 1735, 'tn': 2161, 'fp': 332, 'fn': 772, 'auroc': 0.8133990970489209, 'auprc': 0.8423729753470552, 'eval_loss': 0.582154800605774}
Correct predictions are:  3896
Total predictions are:  5000
Accuracy on test set is: 0.7792 



EPOCH NUMBER:  9

NOW TRAIN THE MODEL.





NOW EVALUATE THE TEST DF.




{'mcc': 0.521745499786021, 'tp': 1506, 'tn': 2238, 'fp': 255, 'fn': 1001, 'auroc': 0.8510947525828603, 'auprc': 0.8436577444028861, 'eval_loss': 0.5632236823916436}
Correct predictions are:  3744
Total predictions are:  5000
Accuracy on test set is: 0.7488 


[0.5014, 0.4986, 0.7816, 0.7918, 0.7924, 0.7922, 0.798, 0.7914, 0.7792, 0.7488]


 Over all runs maximum accuracies are: [0.798, 0.7998, 0.806, 0.8234, 0.837]
The median is: 0.806
RoBERTa Accuracy Score on Test set ->  ['0.806 +/- 0.030999999999999917']
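
Each evaluation dictionary in the log carries the confusion-matrix counts, so accuracy and MCC can be re-derived as a sanity check. The sketch below uses the standard definitions with hypothetical helper names (`accuracy_from_counts`, `mcc_from_counts`). Returning 0.0 when the MCC denominator is zero is an assumption, chosen because it matches the logged values for epochs where the model predicts a single class.

```python
import math

def accuracy_from_counts(tp, tn, fp, fn):
    """Fraction of correct predictions: (tp + tn) / total."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumption: report 0.0 when every prediction falls in one class.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Counts copied from two epochs of run 3 in the log above.
collapsed = {'tp': 0, 'tn': 2493, 'fp': 0, 'fn': 2507}    # epoch 0
healthy = {'tp': 1960, 'tn': 2039, 'fp': 454, 'fn': 547}  # epoch 3

for name, counts in [('collapsed', collapsed), ('healthy', healthy)]:
    acc = accuracy_from_counts(**counts)
    mcc = mcc_from_counts(**counts)
    print(f"{name}: accuracy={acc:.4f}, mcc={mcc:.4f}")
```

An MCC of 0.0 together with an accuracy close to 0.5 means the model assigned one label to all 5000 test reviews, as in epoch 0 of run 3.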


## Now show compact results in a table.

In [None]:
# Header row, padded to the same 27-character width used for the rows below.
print(f"{'PREPRO FUNCTION':27}| Test Accuracy", end='')

print("\n")
for prepro_func in prepro_functions_dict_comb:
  # First entry of the stored result for this preprocessing function.
  result = model_results[prepro_func][0]
  print(f'{prepro_func:27}{result:12}')
  print("\n")