
## Classification

To run this notebook, you need the following files, found in the `data/` directory:
- "final_labels_SG1.xlsx"
- "final_labels_SG2.xlsx"

## Imports and set-up

In [1]:
!pip install transformers
!pip install sentencepiece

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.25.1-py3-none-any.whl (5.8 MB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.11.1-py3-none-any.whl (182 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Installing collected packages: tokenizers, huggingface-hub, transformers
Successfully installed huggingface-hub-0.11.1 tokenizers-0.13.2 transformers-4.25.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)

In [2]:
!nvidia-smi

Mon Jan  2 13:31:25 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   58C    P0    29W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               

In [3]:
import sys
import os
import time
import re
import random
from typing import Dict, List, Optional, Union
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.keras as keras

from sklearn.model_selection import train_test_split, KFold, StratifiedKFold
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
import tensorflow as tf
from transformers import BertTokenizer, BertConfig, TFBertForSequenceClassification
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification
from transformers import RobertaTokenizer, TFRobertaForSequenceClassification
from transformers import ElectraTokenizer, TFElectraForSequenceClassification
from transformers import XLNetTokenizer, TFXLNetForSequenceClassification
from transformers import LongformerTokenizer, TFLongformerForSequenceClassification
from transformers import BartTokenizer, BartForSequenceClassification, FlaxBartForSequenceClassification
from transformers import AlbertTokenizer, TFAlbertForSequenceClassification
from transformers import ConvBertTokenizer, TFConvBertForSequenceClassification
from transformers import DebertaTokenizer, TFDebertaForSequenceClassification

In [4]:
# set seeds; TF also uses Python's random module and NumPy, so these must be fixed as well
tf.random.set_seed(0)
random.seed(0)
np.random.seed(0)
os.environ['PYTHONHASHSEED']=str(0)
os.environ['TF_DETERMINISTIC_OPS'] = '1'  # '1' requests deterministic op implementations

In [5]:
# see if hardware accelerator available
tf.config.experimental.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [6]:
tf.test.gpu_device_name()

'/device:GPU:0'

In [7]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [8]:
%cd drive/MyDrive/Github/Neural-Media-Bias-Detection-Using-Distant-Supervision-With-BABE
!ls

/content/drive/MyDrive/Github/Neural-Media-Bias-Detection-Using-Distant-Supervision-With-BABE
annotation_guidelines_BABE.pdf	     demographic_questionnaire.pdf
annotator_demographics.csv	     distant_supervision.ipynb
checkpoints			     features_engineering.ipynb
classification_baseline_model.ipynb  LICENSE
classification.ipynb		     README.md
data				     topics_keywords_platforms.txt
data_set_evaluation.ipynb


If a GPU is available, TensorFlow prioritizes it automatically and performs computations on the GPU by default. This behavior can be changed by explicitly assigning an operation to a device. Example:

```
with tf.device('/CPU:0'):
    x = tf.constant([1.0, 2.0])  # this tensor is created on the CPU
```



## Preprocessing

In [9]:
PATH_sg1 = "data/final_labels_SG1.xlsx"
PATH_sg2 = "data/final_labels_SG2.xlsx"
df_sg1 = pd.read_excel(PATH_sg1)
df_sg2 = pd.read_excel(PATH_sg2)
df_sg1.rename(columns={'text': 'sentence', 'label_bias': 'Label_bias'}, inplace=True)
df_sg2.rename(columns={'text': 'sentence', 'label_bias': 'Label_bias'}, inplace=True)
df_sg1.head()

Unnamed: 0,sentence,news_link,outlet,topic,type,Label_bias,label_opinion,biased_words
0,The Republican president assumed he was helpin...,http://www.msnbc.com/rachel-maddow-show/auto-i...,msnbc,environment,left,Biased,Expresses writer’s opinion,[]
1,Though the indictment of a woman for her own p...,https://eu.usatoday.com/story/news/nation/2019...,usa-today,abortion,center,Non-biased,Somewhat factual but also opinionated,[]
2,Ingraham began the exchange by noting American...,https://www.breitbart.com/economy/2020/01/12/d...,breitbart,immigration,right,No agreement,No agreement,['flood']
3,The tragedy of America’s 18 years in Afghanist...,http://feedproxy.google.com/~r/breitbart/~3/ER...,breitbart,international-politics-and-world-news,right,Biased,Somewhat factual but also opinionated,"['tragedy', 'stubborn']"
4,The justices threw out a challenge from gun ri...,https://www.huffpost.com/entry/supreme-court-g...,msnbc,gun-control,left,Non-biased,Entirely factual,[]


In [10]:
# binarize classification problem
# drop sentences without an agreed label, then map the labels in 'Label_bias' to 1/0
df_sg1 = df_sg1[df_sg1['Label_bias'] != 'No agreement']
df_sg1 = df_sg1[df_sg1['Label_bias'].notna()]
df_sg1 = df_sg1.replace({'Label_bias': {'Biased': 1, 'Non-biased': 0}})

df_sg2 = df_sg2[df_sg2['Label_bias'] != 'No agreement']
df_sg2 = df_sg2.replace({'Label_bias': {'Biased': 1, 'Non-biased': 0}})

In [11]:
# Stratified k-Fold instance
skfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
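
As a rough illustration of what stratification means here, the following pure-Python sketch assigns sample indices to folds so that each fold receives a near-equal share of every class. It is not scikit-learn's implementation (which also shuffles when `shuffle=True`); the helper `stratified_folds` and the toy labels are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits):
    """Assign each sample index to a fold so that every fold gets a near-equal share of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        # deal the indices of one class round-robin across the folds
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# toy data: 6 biased (1) and 4 non-biased (0) sentences
labels = [1] * 6 + [0] * 4
folds = stratified_folds(labels, n_splits=2)

for fold in folds:
    biased = sum(labels[i] for i in fold)
    print("fold size:", len(fold), "biased:", biased)
```

Each of the two folds ends up with the same 3:2 class ratio as the full toy dataset, which is the property the stratified splitter is used for in this notebook.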

The remaining preprocessing must be performed inside each fold because (a) encoder layers must not see the whole dataset when constructing their lookups, and (b) the indices produced by `skfold` cannot be applied once the data is in TensorFlow dataset format.
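
The first point above can be illustrated with a minimal sketch, assuming a simple whitespace tokenizer and a lookup table learned from the data. The helpers `build_vocab` and `encode` and the toy sentences are hypothetical and not part of this notebook. The lookup is fitted on the training fold only, so tokens that occur only in the validation fold map to the unknown id.

```python
from collections import Counter


def build_vocab(sentences, min_count=1):
    """Map each token seen in the given sentences to an integer id (0 = padding, 1 = unknown)."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, count in sorted(counts.items()):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Replace each token by its id, falling back to the unknown id."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence.lower().split()]


train_fold = ["the senator spoke", "the bill passed"]
val_fold = ["the governor spoke"]

vocab = build_vocab(train_fold)  # fitted on the training fold only
encoded_val = [encode(s, vocab) for s in val_fold]
print(encoded_val)  # "governor" was never seen in training and maps to id 1
```

The pretrained Hugging Face tokenizers used below ship with a fixed vocabulary, so this concern applies mainly to lookups that are learned from the data.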

In [12]:
# helper functions called in skfold loop

def pd_to_tf(df):
    """convert a pandas dataframe into a tensorflow dataset"""
    target = df.pop('Label_bias')
    sentence = df.pop('sentence')
    return tf.data.Dataset.from_tensor_slices((sentence.values, target.values))

def plot_graphs(history, metric):
  plt.plot(history.history[metric])
  plt.plot(history.history['val_'+metric], '')
  plt.xlabel("Epochs")
  plt.ylabel(metric)
  plt.legend([metric, 'val_'+metric])
  plt.show()

def tokenize(df):
    """convert a pandas dataframe into a tensorflow dataset and run Hugging Face's tokenizer on the data

    relies on the global `tokenizer`, which is set before each experiment"""
    # index the columns instead of popping them so the caller's dataframe is left untouched
    target = df['Label_bias']
    sentence = df['sentence']

    encodings = tokenizer(
        sentence.tolist(),
        add_special_tokens=True,  # add [CLS], [SEP]
        truncation=True,  # cut off at the maximum length the model accepts
        padding=True,  # add [PAD] tokens
        return_attention_mask=True,  # add attention mask so padding tokens are ignored
    )

    dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), target.tolist()))
    return dataset

## Attention-based models


In [13]:
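def run_model_5fold(df_train, model_name, freeze_encoder=True, pretrained=False, plot=False):
  """Run stratified k-fold cross-validation. The number of folds comes from the global `skfold` (10 splits), despite the function name.

  freeze_encoder flags whether the encoder layer should be frozen so that fine-tuning does not destroy the transferred weights. Only set it to False when enough data is provided.
  pretrained flags whether the encoder weights stored in the global `trained_model_layer` are loaded before training."""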

  # these variables will be needed for skfold to select indices
  Y = df_train['Label_bias']
  X = df_train['sentence']

  # hyperparams
  BUFFER_SIZE = 10000
  BATCH_SIZE = 32
  k = 1

  val_loss = []
  val_acc = []
  val_prec = []
  val_rec = []
  val_f1 = []
  val_f1_micro = []
  val_f1_wmacro = []

  for train_index, val_index in skfold.split(X,Y):
    print('### Start fold {}'.format(k))
    
    # split into train and validation set
    train_dataset = df_train.iloc[train_index]
    val_dataset = df_train.iloc[val_index]

    # prepare data for transformer
    train_dataset = tokenize(train_dataset)
    val_dataset = tokenize(val_dataset)

    # mini-batch it
    train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
    val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

    # create new model
    if model_name == 'bert':
      model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
    elif model_name == 'distilbert':
      model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
    elif model_name == 'roberta':
      model = TFRobertaForSequenceClassification.from_pretrained('roberta-base')
    elif model_name == 'electra':
      model = TFElectraForSequenceClassification.from_pretrained('google/electra-small-discriminator')
    elif model_name == 'xlnet':
      model = TFXLNetForSequenceClassification.from_pretrained('xlnet-base-cased')
    elif model_name == 'convbert':
      model = TFConvBertForSequenceClassification.from_pretrained("YituTech/conv-bert-base")
    elif model_name == 'deberta':
      model = TFDebertaForSequenceClassification.from_pretrained("kamalkraj/deberta-base")
    elif model_name == 'albert':
      # same checkpoint as in the standalone ALBERT cell further below
      model = TFAlbertForSequenceClassification.from_pretrained("vumichien/albert-base-v2-imdb")
    else:
      # fail early instead of silently reusing a model from a previous fold or cell
      raise ValueError('unknown model_name: {}'.format(model_name))

    if freeze_encoder:
      # freeze the encoder (first layer) through the public Keras attribute
      model.get_layer(index=0).trainable = False

    # compile it
    optimizer = tf.keras.optimizers.Adam(learning_rate=2e-5) 
    model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

    # transfer learning
    if pretrained == True:
      model.get_layer(index=0).set_weights(trained_model_layer) # load bias-specific weights
      #model.load_weights('./checkpoints/')
    
    # stop training after 1 epoch without improvement (patience=1) and restore the best weights
    callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

    # fit it
    history = model.fit(train_dataset, epochs=10, validation_data = val_dataset, callbacks=[callback])
    
    # plot history
    if plot:
      plot_graphs(history,'loss')

    # evaluate
    loss = model.evaluate(val_dataset)
    
    if model_name == 'xlnet':
      yhats = []
      for row in df_train.iloc[val_index]['sentence']:
        inputs = tokenizer(row, return_tensors="tf")
        output = model(inputs)
        # assign class label according to highest logit
        yhats.append(int(np.argmax(output.logits.numpy()[0])))
    else:
      logits = model.predict(val_dataset)
      # assign class label according to highest logit
      yhats = np.argmax(logits[0], axis=1).tolist()
    
    y = []
    for text, label in val_dataset.unbatch():   
      y.append(label.numpy())
    
    val_loss.append(loss)
    val_acc.append(accuracy_score(y, yhats))
    val_prec.append(precision_score(y, yhats))
    val_rec.append(recall_score(y, yhats))
    val_f1.append(f1_score(y, yhats))
    val_f1_micro.append(f1_score(y, yhats, average='micro'))
    val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

    tf.keras.backend.clear_session()

    k += 1

  return val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro
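
The evaluation step inside the loop above (class label from the highest logit, then precision, recall and F1 for the positive class) can be sketched in plain Python. This is a hand-rolled illustration with made-up logits and labels, not the scikit-learn code the notebook calls; the helpers `argmax_labels` and `binary_metrics` are hypothetical.

```python
def argmax_labels(logits):
    """Pick, for each row of logits, the index of the largest value."""
    return [row.index(max(row)) for row in logits]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 with class 1 (biased) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# made-up logits for four sentences, columns are [non-biased, biased]
logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]]
y_true = [0, 1, 0, 1]

y_pred = argmax_labels(logits)
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

For single-label classification the micro-averaged F1 score equals accuracy, which is why those two numbers coincide in the results reported further below.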

### BERT


In [None]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# without distant signal pretraining


df_train = df_sg1 
model_name='bert' 
freeze_encoder=False
pretrained=False


# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 10000
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  val_dataset = tokenize(val_dataset)

  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)



  # create new model

  model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")

  if freeze_encoder:
    # freeze the encoder (first layer) through the public Keras attribute
    model.get_layer(index=0).trainable = False

  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 









  # transfer learning
  if pretrained == True:
    model.get_layer(index=0).set_weights(trained_model_layer) # load bias-specific weights
    #model.load_weights('./checkpoints/')
  
  # stop training after 1 epoch without improvement (patience=1) and restore the best weights
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # # fit it
  history = model.fit(train_dataset, epochs=10, validation_data = val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# without distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='bert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for BERT on SG2')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# without distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for BERT on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### BERT + distant

In [23]:
# load model layer weights from pretraining on distant dataset 
# compile model
#transfer_model = TFRobertaForSequenceClassification.from_pretrained('roberta-base')
transfer_model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
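transfer_model.compile(optimizer=optimizer, loss=transfer_model.hf_compute_loss)  # same loss as in run_model_5fold

# NOTE: the checkpoint name refers to RoBERTa although a BERT model is instantiated here;
# check that the checkpoint matches the architecture before loading
transfer_model.load_weights('./checkpoints/roberta_final_checkpoint_news_headlines_USA')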
trained_model_layer = transfer_model.get_layer(index=0).get_weights()


All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [24]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# with distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            freeze_encoder=False, pretrained=True)


### Start fold 1


All model checkpoint layers were used when initializing TFBertForSequenceClassification.

Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10




Epoch 2/10
Epoch 3/10
### Start fold 2




Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 3




Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 4




Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 5




Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 6




Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 7




Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 8




Epoch 1/10
Epoch 2/10
### Start fold 9




Epoch 1/10
Epoch 2/10
### Start fold 10




Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
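
The short runs in the log above (two to four epochs per fold out of a maximum of ten) are consistent with early stopping at `patience=1`. The sketch below simulates such a patience rule in plain Python to show how the number of epochs follows from the validation losses. It mirrors the general idea of the Keras callback but is not its implementation; the helper `early_stopping_epochs` is hypothetical and the loss values are made up, since the log does not show them.

```python
def early_stopping_epochs(val_losses, patience=1):
    """Return (epochs_run, best_epoch) under a patience rule; epochs are 1-indexed."""
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # improvement: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # stop here; the weights of best_epoch would be restored
                return epoch, best_epoch
    return len(val_losses), best_epoch


# made-up validation losses for one fold
losses = [0.60, 0.52, 0.55, 0.50]

print(early_stopping_epochs(losses, patience=1))  # stops as soon as the loss rises once
print(early_stopping_epochs(losses, patience=2))  # tolerates one bad epoch
```

With `patience=1` the run stops after the third epoch and falls back to the weights of the second, whereas `patience=2` would have reached the better fourth epoch. This trade-off may be worth keeping in mind when comparing folds that stopped after only two epochs.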


In [25]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for BERT + distant on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

Results for BERT + distant on SG1
10-Fold CV Loss: 0.5196687072515488
10-Fold CV Accuracy: 0.7632551319648093
10-Fold CV Precision: 0.7828247128144844
10-Fold CV Recall: 0.7181801801801801
10-Fold CV F1 Score: 0.7429367998666617
10-Fold CV Micro F1 Score: 0.7632551319648093
10-Fold CV Weighted Macro F1 Score: 0.7608342066327985


In [None]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# with distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='bert', 
                                                                                            freeze_encoder=False, pretrained=True)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for BERT + distant on SG2')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### DistilBERT

In [None]:
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='distilbert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for DistilBERT on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='distilbert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for DistilBERT on SG2')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
#@title TFALBERT
from transformers import AlbertTokenizer, TFAlbertForSequenceClassification

tokenizer = AlbertTokenizer.from_pretrained("vumichien/albert-base-v2-imdb")



df_train = df_sg1 
model_name='albert' 
freeze_encoder=False
pretrained=False


# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 250
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  val_dataset = tokenize(val_dataset)

  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

  # create new model

  model = TFAlbertForSequenceClassification.from_pretrained("vumichien/albert-base-v2-imdb")

  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

  
  # stop training after 1 epoch without improvement (patience=1) and restore the best weights
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # # fit it
  history = model.fit(train_dataset, epochs=10, validation_data = val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = AlbertTokenizer.from_pretrained("vumichien/albert-base-v2-imdb")
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='albert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
#@title CTRL
from transformers import CTRLTokenizer, TFCTRLForSequenceClassification


tokenizer = CTRLTokenizer.from_pretrained("ctrl")


df_train = df_sg1
model_name = 'ctrl'  # only compared against 'xlnet' below to pick the prediction path
freeze_encoder = False
pretrained = False

# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 250
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  
  val_dataset = tokenize(val_dataset)


  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

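  # create new model
  model = TFCTRLForSequenceClassification.from_pretrained("ctrl")
  # assumption: '[PAD]' was added to the tokenizer above, so the embedding matrix
  # is grown to match and the pad id is registered for batched classification
  model.resize_token_embeddings(len(tokenizer))
  model.config.pad_token_id = tokenizer.pad_token_id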

  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

  
  # stop training after 1 epoch without improvement in val_loss (patience=1)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # fit it
  history = model.fit(train_dataset, epochs=10, validation_data=val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))
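
Each fold above trains with `EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)`. The sketch below is a simplified pure-Python model of that patience rule, not the Keras implementation. The function name `simulate_early_stopping` and the loss values are made up for illustration. It shows that with `patience=1` training halts after the first epoch whose validation loss fails to improve on the best value seen so far.

```python
def simulate_early_stopping(val_losses, patience=1):
    """Return (epochs_run, best_epoch), both 1-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # training stops here; the best epoch's weights would be restored
                return epoch, best_epoch
    return len(val_losses), best_epoch

# hypothetical validation losses (illustrative values only)
losses = [0.62, 0.51, 0.48, 0.50, 0.47, 0.46]
print(simulate_early_stopping(losses, patience=1))  # (4, 3)
print(simulate_early_stopping(losses, patience=2))  # (6, 6)
```

In this toy run, `patience=1` stops at epoch 4 and keeps epoch 3, while `patience=2` tolerates the single bad epoch and continues to the lower losses that follow.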

In [None]:
#@title Deberta
from transformers import DebertaTokenizer, TFDebertaForSequenceClassification


tokenizer = DebertaTokenizer.from_pretrained("kamalkraj/deberta-base")


df_train = df_sg1
model_name = 'deberta'  # only compared against 'xlnet' below to pick the prediction path
freeze_encoder = False
pretrained = False

# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 250
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  
  val_dataset = tokenize(val_dataset)


  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

  # create new model

  model = TFDebertaForSequenceClassification.from_pretrained("kamalkraj/deberta-base")




  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

  
  # stop training after 1 epoch without improvement in val_loss (patience=1)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # fit it
  history = model.fit(train_dataset, epochs=10, validation_data=val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))
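
The cell above reports only the mean of each metric across folds. The following is a small sketch, under the assumption that the spread between folds is also of interest, that reports the mean together with the sample standard deviation. The helper `summarize_folds` is hypothetical and the per-fold scores are made up.

```python
from statistics import mean, stdev

def summarize_folds(metrics):
    """Map each metric name to (mean, sample standard deviation) over the folds."""
    return {name: (mean(values), stdev(values)) for name, values in metrics.items()}

# made-up per-fold scores, for illustration only
fold_metrics = {
    "accuracy": [0.80, 0.82, 0.78, 0.81, 0.79],
    "f1": [0.77, 0.80, 0.75, 0.79, 0.76],
}
for name, (avg, spread) in summarize_folds(fold_metrics).items():
    print("{}: {:.3f} +/- {:.3f}".format(name, avg, spread))
```

In the notebook, the lists `val_acc`, `val_f1` and so on could be passed in the same dictionary form. Note that `stdev` needs at least two folds.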

In [None]:
#@title DebertaV2
from transformers import DebertaV2Tokenizer, TFDebertaV2ForSequenceClassification


tokenizer = DebertaV2Tokenizer.from_pretrained("kamalkraj/deberta-v2-xlarge")


df_train = df_sg1
model_name = 'deberta-v2'  # only compared against 'xlnet' below to pick the prediction path
freeze_encoder = False
pretrained = False

# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 250
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  
  val_dataset = tokenize(val_dataset)


  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

  # create new model

  model = TFDebertaV2ForSequenceClassification.from_pretrained("kamalkraj/deberta-v2-xlarge")




  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

  
  # stop training after 1 epoch without improvement in val_loss (patience=1)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # fit it
  history = model.fit(train_dataset, epochs=10, validation_data=val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))
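
`precision_score`, `recall_score` and `f1_score` are called above without an `average` argument, which in scikit-learn means the scores are computed for the positive class (label 1) only. The sketch below reproduces those three quantities from raw counts in plain Python. The helper name `binary_prf` and the toy labels are illustrative, and returning 0.0 for an undefined ratio is a choice made for this sketch.

```python
def binary_prf(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive class, computed from TP/FP/FN counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# toy labels: 2 true positives, 1 false positive, 1 false negative
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(binary_prf(y_true, y_pred))
```

With these toy labels precision and recall are both 2/3, so their harmonic mean (F1) is 2/3 as well.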

In [None]:
!pip install sacremoses

In [None]:
#@title FlauBERT
from transformers import FlaubertTokenizer, TFFlaubertForSequenceClassification


tokenizer = FlaubertTokenizer.from_pretrained("flaubert/flaubert_base_cased")


df_train = df_sg1
model_name = 'flaubert'  # only compared against 'xlnet' below to pick the prediction path
freeze_encoder = False
pretrained = False

# these variables will be needed for skfold to select indices
Y = df_train['Label_bias']
X = df_train['sentence']

# hyperparams
BUFFER_SIZE = 250
BATCH_SIZE = 32
k = 1

val_loss = []
val_acc = []
val_prec = []
val_rec = []
val_f1 = []
val_f1_micro = []
val_f1_wmacro = []

for train_index, val_index in skfold.split(X,Y):
  print('### Start fold {}'.format(k))
  
  # split into train and validation set
  train_dataset = df_train.iloc[train_index]
  val_dataset = df_train.iloc[val_index]
  
  if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

  # prepare data for transformer
  train_dataset = tokenize(train_dataset)
  
  val_dataset = tokenize(val_dataset)


  # mini-batch it
  train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)
  val_dataset = val_dataset.batch(BATCH_SIZE).prefetch(tf.data.experimental.AUTOTUNE)

  # create new model

  model = TFFlaubertForSequenceClassification.from_pretrained("flaubert/flaubert_base_cased", from_pt=True)




  # compile it
  optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5) 
  model.compile(optimizer=optimizer, loss=model.hf_compute_loss) 

  
  # stop training after 1 epoch without improvement in val_loss (patience=1)
  callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1, restore_best_weights=True)

  # fit it
  history = model.fit(train_dataset, epochs=10, validation_data=val_dataset, callbacks=[callback])
  
  # # plot history
  # if plot:
  #   plot_graphs(history,'loss')

  # evaluate
  # loss = model.evaluate(val_dataset)
  
  if model_name == 'xlnet':
    yhats = []
    for row in df_train.iloc[val_index]['sentence']:
      input = tokenizer(row, return_tensors="tf")
      output = model(input)
      logits = output.logits.numpy()[0]
      candidates = logits.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  else:
    logits = model.predict(val_dataset)  
    yhats = []
    for i in logits[0]:
      # assign class label according to highest logit
      candidates = i.tolist()
      decision = candidates.index(max(candidates))
      yhats.append(decision)
  
  y = []
  for text, label in val_dataset.unbatch():   
    y.append(label.numpy())
  
  # val_loss.append(loss)
  val_acc.append(accuracy_score(y, yhats))
  val_prec.append(precision_score(y, yhats))
  val_rec.append(recall_score(y, yhats))
  val_f1.append(f1_score(y, yhats))
  val_f1_micro.append(f1_score(y, yhats, average='micro'))
  val_f1_wmacro.append(f1_score(y, yhats, average='weighted'))

  tf.keras.backend.clear_session()

  k += 1


# val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='bert', 
                                                                                            # freeze_encoder=False, pretrained=False)
# inspect metrics
# loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

# print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
!git clone https://github.com/markusschanta/advent-of-code-2022/blame/main/2022/04/

### RoBERTa

In [None]:
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='roberta', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for RoBERTa on SG1')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='roberta', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for RoBERTa on SG2')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### RoBERTa + distant

In [15]:
# load model layer weights from pretraining on distant dataset 
# compile model
transfer_model = TFRobertaForSequenceClassification.from_pretrained('roberta-base')
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
transfer_model.compile(optimizer=optimizer, loss=transfer_model.hf_compute_loss)

transfer_model.load_weights('./checkpoints/roberta_final_checkpoint_news_headlines_USA')
trained_model_layer = transfer_model.get_layer(index=0).get_weights()

All model checkpoint layers were used when initializing TFRobertaForSequenceClassification.

Some layers of TFRobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [16]:
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='roberta', 
                                                                                            freeze_encoder=False, pretrained=True)

### Start fold 1


All model checkpoint layers were used when initializing TFRobertaForSequenceClassification.

Some layers of TFRobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 2


Epoch 1/10
Epoch 2/10
### Start fold 3


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 4


Epoch 1/10
Epoch 2/10
### Start fold 5


Epoch 1/10
Epoch 2/10
### Start fold 6


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 7


Epoch 1/10
Epoch 2/10
### Start fold 8


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 9


Epoch 1/10
Epoch 2/10
### Start fold 10


Epoch 1/10
Epoch 2/10
Epoch 3/10


In [18]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for RoBERTa + distant on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

Results for RoBERTa + distant on SG1
10-Fold CV Loss: 0.4377031669020653
10-Fold CV Accuracy: 0.8117469627147047
10-Fold CV Precision: 0.8320801665489419
10-Fold CV Recall: 0.7679819819819821
10-Fold CV F1 Score: 0.7960682866695172
10-Fold CV Micro F1 Score: 0.8117469627147047
10-Fold CV Weighted Macro F1 Score: 0.8106482285827301
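
In the results above, accuracy and micro F1 are identical (0.8117469627147047). This is expected for single-label classification: every misclassified example counts once as a false positive for the predicted class and once as a false negative for the true class, so the pooled precision, recall and F1 all reduce to the fraction of correct predictions. Here is a pure-Python sketch of that identity. The helper names and toy labels are illustrative and do not come from the notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def micro_f1(y_true, y_pred):
    """F1 computed from TP/FP/FN counts pooled over all classes."""
    pairs = list(zip(y_true, y_pred))
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        tp += sum(1 for t, p in pairs if t == c and p == c)
        fp += sum(1 for t, p in pairs if t != c and p == c)
        fn += sum(1 for t, p in pairs if t == c and p != c)
    return 2 * tp / (2 * tp + fp + fn)

# toy labels: 6 of 8 predictions are correct
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))  # 0.75 0.75
```

Because the two numbers always coincide in this setting, the micro F1 column adds no information beyond accuracy here.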


In [None]:
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='roberta', 
                                                                                            freeze_encoder=False, pretrained=True)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for RoBERTa + distant on SG2')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### ELECTRA

In [None]:
tokenizer = ElectraTokenizer.from_pretrained('google/electra-small-discriminator')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='electra', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for ELECTRA on SG1')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = ElectraTokenizer.from_pretrained('google/electra-small-discriminator')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='electra', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for ELECTRA on SG2')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### XLNet

In [None]:
tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='xlnet', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for XLNET on SG1')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='xlnet', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
# inspect metrics
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for XLNET on SG2')
print('5-Fold CV Loss: {}'.format(loss_cv))
print('5-Fold CV Accuracy: {}'.format(acc_cv))
print('5-Fold CV Precision: {}'.format(prec_cv))
print('5-Fold CV Recall: {}'.format(rec_cv))
print('5-Fold CV F1 Score: {}'.format(f1_cv))
print('5-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('5-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### ConvBERT

In [19]:
tokenizer = ConvBertTokenizer.from_pretrained("YituTech/conv-bert-base")

# without distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='convbert', 
                                                                                            freeze_encoder=False, pretrained=False)

### Start fold 1


All model checkpoint layers were used when initializing TFConvBertForSequenceClassification.

Some layers of TFConvBertForSequenceClassification were not initialized from the model checkpoint at YituTech/conv-bert-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10


KeyboardInterrupt: ignored

In [None]:
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for ConvBERT on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

In [None]:
tokenizer = ConvBertTokenizer.from_pretrained("YituTech/conv-bert-base")

# without distant signal pretraining
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='convbert', 
                                                                                            freeze_encoder=False, pretrained=False)

In [None]:
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for ConvBERT on SG2')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

### ConvBERT + distant


In [20]:
# load model layer weights from pretraining on distant dataset 
# compile model
transfer_model = TFConvBertForSequenceClassification.from_pretrained("YituTech/conv-bert-base")
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
transfer_model.compile(optimizer=optimizer, loss=transfer_model.hf_compute_loss) 

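# NOTE: this path names a RoBERTa checkpoint although the model here is ConvBERT;
# verify that it is the intended file before relying on the transferred weights
transfer_model.load_weights('./checkpoints/roberta_final_checkpoint_news_headlines_USA')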
trained_model_layer = transfer_model.get_layer(index=0).get_weights()

All model checkpoint layers were used when initializing TFConvBertForSequenceClassification.

Some layers of TFConvBertForSequenceClassification were not initialized from the model checkpoint at YituTech/conv-bert-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [21]:
tokenizer = ConvBertTokenizer.from_pretrained("YituTech/conv-bert-base")

# with distant signal pretraining (pretrained=True)
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg1, model_name='convbert', 
                                                                                            freeze_encoder=False, pretrained=True)

### Start fold 1


All model checkpoint layers were used when initializing TFConvBertForSequenceClassification.

Some layers of TFConvBertForSequenceClassification were not initialized from the model checkpoint at YituTech/conv-bert-base and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 2


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 3


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 4


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 5


Epoch 1/10
Epoch 2/10
### Start fold 6


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
### Start fold 7


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 8


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
### Start fold 9


Epoch 1/10
Epoch 2/10
Epoch 3/10
### Start fold 10


Epoch 1/10
Epoch 2/10
Epoch 3/10
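Each fold in the log above ends after 2 to 5 of the 10 scheduled epochs, which is consistent with an early-stopping rule on the validation loss. The body of `run_model_5fold` is not shown in this section, so this is an assumption. A minimal pure-Python sketch of a patience-based rule follows; the function name `stopping_epoch`, the `patience` value, and the loss values are all hypothetical.

```python
def stopping_epoch(val_losses, patience=1):
    """Return the 1-based epoch at which patience-based early stopping halts.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    the last scheduled epoch is returned.
    """
    best = float('inf')
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)

# Made-up validation losses: the loss worsens at epoch 3, so training stops there
print(stopping_epoch([0.60, 0.50, 0.55, 0.40], patience=1))
```

With a rule of this kind, folds whose validation loss keeps improving for longer run for more epochs, which would explain the differing epoch counts per fold.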


In [22]:
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for Distant + Convbert on SG1')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))

Results for Distant + Convbert on SG1
10-Fold CV Loss: 0.4511946588754654
10-Fold CV Accuracy: 0.8046837033933809
10-Fold CV Precision: 0.835336794376578
10-Fold CV Recall: 0.7559459459459459
10-Fold CV F1 Score: 0.7893015910839514
10-Fold CV Micro F1 Score: 0.8046837033933809
10-Fold CV Weighted Macro F1 Score: 0.8033101509608663
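In the output above, accuracy and micro F1 are identical (0.8046837033933809). This is expected when every example receives exactly one predicted label: each error counts once as a false positive for the predicted class and once as a false negative for the true class, so micro-averaged precision, recall and F1 all reduce to accuracy. The sketch below checks this with NumPy on made-up labels. It assumes single-label classification and does not reproduce how the notebook computes its metrics.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return float(np.mean(y_true == y_pred))

def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = np.union1d(y_true, y_pred)
    tp = sum(int(np.sum((y_pred == c) & (y_true == c))) for c in labels)
    fp = sum(int(np.sum((y_pred == c) & (y_true != c))) for c in labels)
    fn = sum(int(np.sum((y_pred != c) & (y_true == c))) for c in labels)
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels, for illustration only
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

print(accuracy(y_true, y_pred), micro_f1(y_true, y_pred))
```

Because the two numbers carry the same information in this setting, the macro or weighted F1 score is the more informative companion to accuracy.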


In [None]:
tokenizer = ConvBertTokenizer.from_pretrained("YituTech/conv-bert-base")

# with distant signal pretraining (pretrained=True)
val_loss, val_acc, val_prec, val_rec, val_f1, val_f1_micro, val_f1_wmacro = run_model_5fold(df_sg2, model_name='convbert', 
                                                                                            freeze_encoder=False, pretrained=True)

In [None]:
loss_cv = np.mean(val_loss)
acc_cv = np.mean(val_acc)
prec_cv = np.mean(val_prec)
rec_cv = np.mean(val_rec)
f1_cv = np.mean(val_f1)
f1_micro_cv = np.mean(val_f1_micro)
f1_wmacro_cv = np.mean(val_f1_wmacro)

print('Results for Distant + Convbert on SG2')
print('10-Fold CV Loss: {}'.format(loss_cv))
print('10-Fold CV Accuracy: {}'.format(acc_cv))
print('10-Fold CV Precision: {}'.format(prec_cv))
print('10-Fold CV Recall: {}'.format(rec_cv))
print('10-Fold CV F1 Score: {}'.format(f1_cv))
print('10-Fold CV Micro F1 Score: {}'.format(f1_micro_cv))
print('10-Fold CV Weighted Macro F1 Score: {}'.format(f1_wmacro_cv))