# BERT Meets Cranfield - Enrichment and Transfer Learning Approach
*the NSP transfer learning runs*

The BM25 step finds a significant portion of the relevant documents for each query, but not all. This notebook implements a function that enriches the training set with the missed documents, to find out whether including them is beneficial.
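To make the gap concrete, here is a minimal sketch of how the missed documents and the resulting recall could be measured for one query. The document ids and both helper names (`missed_relevant`, `recall_at_n`) are hypothetical and are not part of `utils.py`.

```python
def missed_relevant(top_n, relevant):
    """Return the relevant document ids that are absent from the top-n list."""
    retrieved = set(top_n)
    return sorted(doc for doc in relevant if doc not in retrieved)


def recall_at_n(top_n, relevant):
    """Fraction of the relevant documents that appear in the top-n list."""
    if not relevant:
        return 0.0
    found = len(relevant) - len(missed_relevant(top_n, relevant))
    return found / len(relevant)


# Toy data (hypothetical ids): BM25 retrieved five documents, four are relevant.
bm25_top = [12, 7, 45, 3, 88]
relevant_docs = [7, 3, 101, 250]

missed = missed_relevant(bm25_top, relevant_docs)
recall = recall_at_n(bm25_top, relevant_docs)
print(missed, recall)
```

In this toy case two of the four relevant documents never reach the re-ranker, which is the situation the enrichment modes below try to address during training.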

In [1]:
# %cd /content/drive/MyDrive/COMPUTING SCIENCE/THESIS_PROJECT/BERT-BM25-Thesis-Project/bert-meets-cranfield-enrich/Code
%cd /home/jupyter/BERT-BM25-Thesis-Project/bert-meets-cranfield-enrich/Code

/home/jupyter/BERT-BM25-Thesis-Project/bert-meets-cranfield-enrich/Code


In [2]:
# from google.colab import drive
# drive.mount('/content/drive')

In [3]:
!pip3 install -r ../requirements.txt



## Import

In [4]:
import utils
import data_utils
from operator import itemgetter
import os
import numpy as np

import torch
import importlib
# from transformers import BertForSequenceClassification, BertTokenizer, BertForMaskedLM, BertForNextSentencePrediction
from transformers import BertForSequenceClassification

import timeit

### Import Refresh
When a supporting .py file (such as utils.py) changes, the cell below reloads the module without restarting the whole notebook.

In [5]:
# call after making any changes in utils.py
importlib.reload(utils) 
importlib.reload(data_utils)

<module 'data_utils' from '/home/jupyter/BERT-BM25-Thesis-Project/bert-meets-cranfield-enrich/Code/data_utils.py'>
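One caveat of this approach, sketched below with a throwaway module: `importlib.reload` updates attributes looked up through the module object, while names that were bound earlier with `from module import name` keep their old value. The module name `scratch_utils` and the constant `TOP_N` are hypothetical and exist only for this demonstration.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so the quick edit below is always re-read from source.
sys.dont_write_bytecode = True

# Hypothetical scratch module, written to a temporary directory for the demo.
tmp_dir = tempfile.mkdtemp()
module_path = os.path.join(tmp_dir, "scratch_utils.py")
with open(module_path, "w") as fh:
    fh.write("TOP_N = 100\n")

sys.path.insert(0, tmp_dir)
import scratch_utils
from scratch_utils import TOP_N as imported_top_n

# Simulate editing the supporting file on disk.
with open(module_path, "w") as fh:
    fh.write("TOP_N = 1000\n")

importlib.reload(scratch_utils)

value_via_module = scratch_utils.TOP_N   # reflects the edited file
value_via_from_import = imported_top_n   # still the value bound before the reload
print(value_via_module, value_via_from_import)
```

This is one reason to call helpers as `utils.some_function(...)`, as this notebook does, rather than importing individual names.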

## Set hyper-parameters and test settings

In [5]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False
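As a sanity check on these settings, the number of training batches per epoch can be worked out by hand. The sketch assumes the 225 Cranfield queries are split into five equal folds, that every training query contributes exactly `TOP_BM25` query-document pairs, and that the last partial batch is kept; none of this is read from `utils.py`.

```python
import math

N_QUERIES = 225   # size of the Cranfield query set
N_FOLDS = 5       # assumption: five equally sized folds
TOP_BM25 = 100
BATCH_SIZE = 32

test_queries = N_QUERIES // N_FOLDS        # queries held out per fold
train_queries = N_QUERIES - test_queries   # queries left for training
train_pairs = train_queries * TOP_BM25     # one pair per retrieved document
batches_per_epoch = math.ceil(train_pairs / BATCH_SIZE)

print(test_queries, train_queries, train_pairs, batches_per_epoch)
```

Under these assumptions the result is 563 batches, which agrees with the `of    563` counter in the training logs, and the 45 held-out queries agree with the running counter (45, 90, ..., 225) printed per fold.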

In [6]:
models_dir = "/home/jupyter/BERT-BM25-Thesis-Project/Models/" #@param {type:"string"}
custom_model_name = "BERT_Cranfield_MLM_model-128-16-5e-05-2.bin" #@param {type:"string"}

custom_model_path = models_dir + custom_model_name 

### Enriching function for BM25 results

In [7]:
def get_bm25_plus_other_rel(bm25_top_n, labels, queries):
    # per query: the BM25 top-n list padded with the relevant documents BM25 missed
    bm25_top_n_rel_padded = [0] * len(queries)
    # per query: the BM25 top-n list with its tail swapped for the missed relevant documents
    bm25_top_n_swap = [0] * len(queries)

    for qi in range(len(queries)):
        # indices of the relevant documents
        # (note: this numbering is only compatible with the labels list)
        lbi = np.where(np.asarray(labels[qi]) == 1)[0]

        # the BM25 top-n documents for this query
        np_bm25_qi_docs = np.array(bm25_top_n[qi])

        # relevant documents that BM25 did not retrieve
        pad_rel = tuple(np.setdiff1d(lbi, np_bm25_qi_docs))
        bm25_top_n_rel_padded[qi] = tuple(bm25_top_n[qi]) + pad_rel

        # swap the lowest-ranked BM25 results for the missed relevant documents
        for i in range(len(pad_rel)):
            current_doc = np_bm25_qi_docs[-(i + 1)]
            # warn when the document about to be overwritten is itself relevant
            if np.count_nonzero(current_doc == lbi) > 0:
                print('Relevant doc overwritten!')
            np_bm25_qi_docs[-(i + 1)] = pad_rel[i]

        bm25_top_n_swap[qi] = np_bm25_qi_docs
    return bm25_top_n_rel_padded, bm25_top_n_swap
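The difference between the two enrichment modes is easiest to see on a single toy query. The following is a simplified pure-Python sketch of the same idea, not the function above: `enrich` is a hypothetical helper and the document ids are made up.

```python
def enrich(top_n, relevant):
    """Return the 'add' list, the 'swap' list, and any relevant docs lost by swapping."""
    missed = [doc for doc in relevant if doc not in top_n]

    # 'add': keep the BM25 list and append the missed relevant documents.
    added = list(top_n) + missed

    # 'swap': overwrite the lowest-ranked BM25 results, starting from the tail.
    swapped = list(top_n)
    overwritten = []
    for i, doc in enumerate(missed):
        victim = swapped[-(i + 1)]
        if victim in relevant:
            overwritten.append(victim)   # a relevant document is being replaced
        swapped[-(i + 1)] = doc
    return added, swapped, overwritten


top_n = [12, 7, 45, 3, 88]    # toy BM25 ranking, best first
relevant = [7, 3, 101, 250]   # toy relevance judgments

added, swapped, overwritten = enrich(top_n, relevant)
print(added)
print(swapped)
print(overwritten)
```

Note that 'add' lengthens the list while 'swap' keeps it at the original size. In this toy case the swap also replaces document 3, which was itself relevant; that is the situation the `Relevant doc overwritten!` message flags.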

### Function for loading custom model
Strictly speaking, this loads an encoder that was trained under a specific configuration.

In [8]:
def load_specific_encoder(model_path):
  '''
    Load saved encoder parameters into a fresh classification model.

    Call this function at the start of every fold to begin with a fresh model.
  '''
  model = BertForSequenceClassification.from_pretrained(
        MODEL_TYPE,
        num_labels=2,
        output_attentions=False,
        output_hidden_states=False,
    )
  model.cuda()  # move the model to the GPU
  print('LOAD : ', model_path )

  # =======================
  # NOTE WHAT MODEL IS USED
  model.load_state_dict(torch.load(model_path), strict=False)
  # now you get a warning that extra training is required

  if DO_FREEZING:
    print('FREEZING: set requires_grad to False')
    # freeze the encoder parameters (credits thomwolf of Huggingface)
    # for param in model.bert.encoder.parameters():
    #   param.requires_grad = False

    # other method
    model.bert.encoder.requires_grad_(False)
  return model
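With `strict=False`, PyTorch's `load_state_dict` does not raise on mismatched keys; it copies the tensors whose names match and reports the rest as missing or unexpected keys. The sketch below mimics only that bookkeeping on key names. The two key lists are abbreviated, and the assumption that the custom checkpoint carries a next-sentence-prediction head (`cls.seq_relationship.*`) is mine, based on the file name; the actual checkpoint contents are not shown in this notebook.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Split key names into loaded, missing and unexpected, as a non-strict load would."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    loaded = sorted(model_keys & checkpoint_keys)       # copied into the model
    missing = sorted(model_keys - checkpoint_keys)      # keep their initial values
    unexpected = sorted(checkpoint_keys - model_keys)   # ignored
    return loaded, missing, unexpected


# Abbreviated, illustrative key lists (a real BERT state dict has many more entries).
classifier_model_keys = [
    "bert.embeddings.word_embeddings.weight",
    "bert.pooler.dense.weight",
    "classifier.weight",
    "classifier.bias",
]
nsp_checkpoint_keys = [
    "bert.embeddings.word_embeddings.weight",
    "bert.pooler.dense.weight",
    "cls.seq_relationship.weight",
    "cls.seq_relationship.bias",
]

loaded, missing, unexpected = compare_keys(classifier_model_keys, nsp_checkpoint_keys)
print("loaded:    ", loaded)
print("missing:   ", missing)
print("unexpected:", unexpected)
```

Under that assumption only the `bert.*` encoder weights transfer, and the `classifier.*` head stays randomly initialized, which is why fine-tuning on the ranking task is still required after loading.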

## Train and Test

In [9]:
# if __name__ == "__main__":
def train_test():
    print("# ========================================")
    print("#               Hyper-Parameters")
    print(MODE)
    print(MODEL_TYPE)
    print(LEARNING_RATE)
    print(MAX_LENGTH)
    print(BATCH_SIZE)
    print(EPOCHS)
    print("# ========================================")
    print("#               Experiment-Settings")
    print('BM25_ENRICHMENT: ', BM25_ENRICH)
    print('BM25_ENRICHMENT: ', BM25_ENRICH)


    print("# ========================================")
    print("#               Other")
    print(torch.cuda.get_device_name())
    print("# ========================================")
    
    start = timeit.default_timer()
    
    device = utils.get_gpu_device()
    if not os.path.exists('../Output_Folder'):
        os.makedirs('../Output_Folder')

    queries = data_utils.get_queries('../Data/cran/cran.qry')
    corpus = data_utils.get_corpus('../Data/cran/cran.all.1400')
    rel_fed = data_utils.get_judgments('../Data/cran/cranqrel')

    labels = utils.get_binary_labels(rel_fed)
    tokenized_corpus = [doc.split(" ") for doc in corpus]
    tokenized_queries = [query.split(" ") for query in queries]

    bm25, bm25_top_n = utils.get_bm25_top_results(tokenized_corpus, tokenized_queries, TOP_BM25)

    # no matter what BM25_ENRICH is, this line is needed to get `temp_feedback` for the test set
    padded_all, attention_mask_all, token_type_ids_all, temp_feedback = utils.bert_tokenizer(MODE, bm25_top_n, corpus,
                                                                                             labels, queries,
                                                                                             MAX_LENGTH, MODEL_TYPE)
    if BM25_ENRICH == 'swap':
        bm25_top_n_ext, bm25_top_n_swap = get_bm25_plus_other_rel(bm25_top_n, labels, queries)
        padded_all_swap, attention_mask_all_swap, token_type_ids_all_swap, temp_feedback_swap = utils.bert_tokenizer(MODE, bm25_top_n_swap, corpus,
                                                                                                                     labels, queries,
                                                                                                                     MAX_LENGTH, MODEL_TYPE)
    elif BM25_ENRICH == 'add':
        bm25_top_n_add, bm25_top_n_swap = get_bm25_plus_other_rel(bm25_top_n, labels, queries)
        padded_all_add, attention_mask_all_add, token_type_ids_all_add, temp_feedback_add = utils.bert_tokenizer(MODE, bm25_top_n_add, corpus,
                                                                                                                 labels, queries,
                                                                                                                 MAX_LENGTH, MODEL_TYPE)

    # ========================================
    #               Folds
    # ========================================
    mrr_bm25_list, map_bm25_list, ndcg_bm25_list = [], [], []
    mrr_bert_list, map_bert_list, ndcg_bert_list = [], [], []
    mrr_bm25, map_bm25, ndcg_bm25 = 0, 0, 0
    mrr_bert, map_bert, ndcg_bert = 0, 0, 0

    for fold_number in range(1, 6):
        print('======== Fold {:} / {:} ========'.format(fold_number, 5))
        train_index, test_index = data_utils.load_fold(fold_number)

        padded, attention_mask, token_type_ids = [], [], []
        if MODE == 'Re-ranker':
            # no matter BM25_ENRICH-mode, next line required for test set construction
            padded, attention_mask, token_type_ids = padded_all, attention_mask_all, token_type_ids_all
            if BM25_ENRICH == 'swap':
                padded_swap, attention_mask_swap, token_type_ids_swap = padded_all_swap, attention_mask_all_swap, token_type_ids_all_swap
            elif BM25_ENRICH == 'add':
                padded_add, attention_mask_add, token_type_ids_add = padded_all_add, attention_mask_all_add, token_type_ids_all_add
            
        else:
            temp_feedback = []
            for query_num in range(0, len(bm25_top_n)):
                if query_num in test_index:
                    doc_nums = range(0, 1400)
                else:
                    doc_nums = bm25_top_n[query_num]
                padded.append(list(itemgetter(*doc_nums)(padded_all[query_num])))
                attention_mask.append(list(itemgetter(*doc_nums)(attention_mask_all[query_num])))
                token_type_ids.append(list(itemgetter(*doc_nums)(token_type_ids_all[query_num])))
                temp_feedback.append(list(itemgetter(*doc_nums)(labels[query_num])))

        # Enrich the training set (or keep default)
        if BM25_ENRICH == 'default':
            train_dataset = data_utils.get_tensor_dataset(train_index, padded, attention_mask, token_type_ids,
                                                          temp_feedback)
        elif BM25_ENRICH == 'swap':
            train_dataset = data_utils.get_tensor_dataset(train_index, padded_swap, attention_mask_swap, token_type_ids_swap,
                                                    temp_feedback_swap)
        elif BM25_ENRICH == 'add':
            train_dataset = data_utils.get_tensor_dataset(train_index, padded_add, attention_mask_add, token_type_ids_add,
                                                    temp_feedback_add)

        test_dataset = data_utils.get_tensor_dataset(test_index, padded, attention_mask, token_type_ids, temp_feedback)

        mrr_bm25, map_bm25, ndcg_bm25, mrr_bm25_list, map_bm25_list, ndcg_bm25_list = utils.get_bm25_results(
            mrr_bm25_list, map_bm25_list, ndcg_bm25_list, test_index, tokenized_queries, bm25, mrr_bm25, map_bm25,
            ndcg_bm25, rel_fed, fold_number, MAP_CUT, NDCG_CUT)

          
        # Option to load a custom trained model (used in transfer learning)
        if LOAD_CUSTOM_TRAINED_MODEL:
          model = load_specific_encoder(custom_model_path)
        else:
          model = None
          # with None the model_preparation loads the 'default' model
        train_dataloader, test_dataloader, model, optimizer, scheduler = utils.model_preparation(MODEL_TYPE, train_dataset,
                                                                                                 test_dataset,
                                                                                                 BATCH_SIZE, TEST_BATCH_SIZE,
                                                                                                 LEARNING_RATE, EPOCHS, model=model)


        # ========================================
        #               Training Loop
        # ========================================
        epochs_train_loss, epochs_val_loss = [], []
        for epoch_i in range(0, EPOCHS):
            # ========================================
            #               Training
            # ========================================
            print('======== Epoch {:} / {:} ========'.format(epoch_i + 1, EPOCHS))
            print('Training...')
            model, optimizer, scheduler = utils.training(model, train_dataloader, device, optimizer, scheduler)
        # ========================================
        #               Testing
        # ========================================
        print('Testing...')
        mrr_bert, map_bert, ndcg_bert, mrr_bert_list, map_bert_list, ndcg_bert_list = utils.testing(MODE, model,
                                                                                                    test_dataloader,
                                                                                                    device, test_index,
                                                                                                    bm25_top_n,
                                                                                                    mrr_bert_list,
                                                                                                    map_bert_list,
                                                                                                    ndcg_bert_list,
                                                                                                    mrr_bert, map_bert,
                                                                                                    ndcg_bert, rel_fed,
                                                                                                    fold_number,
                                                                                                    MAP_CUT, NDCG_CUT)
    print("  BM25 MRR:  " + "{:.4f}".format(mrr_bm25 / 5))
    print("  BM25 MAP:  " + "{:.4f}".format(map_bm25 / 5))
    print("  BM25 NDCG: " + "{:.4f}".format(ndcg_bm25 / 5))

    print("  BERT MRR:  " + "{:.4f}".format(mrr_bert / 5))
    print("  BERT MAP:  " + "{:.4f}".format(map_bert / 5))
    print("  BERT NDCG: " + "{:.4f}".format(ndcg_bert / 5))

    utils.t_test(mrr_bm25_list, mrr_bert_list, 'MRR')
    utils.t_test(map_bm25_list, map_bert_list, 'MAP')
    utils.t_test(ndcg_bm25_list, ndcg_bert_list, 'NDCG')
    
    stop = timeit.default_timer()
    wall_time = (stop - start) / 60 

    print('Time: ', wall_time, ' min') 

    # utils.results_to_csv('./mrr_bm25_list.csv', mrr_bm25_list)
    # utils.results_to_csv('./mrr_bert_list.csv', mrr_bert_list)
    # utils.results_to_csv('./map_bm25_list.csv', map_bm25_list)
    # utils.results_to_csv('./map_bert_list.csv', map_bert_list)
    # utils.results_to_csv('./ndcg_bm25_list.csv', ndcg_bm25_list)
    # utils.results_to_csv('./ndcg_bert_list.csv', ndcg_bert_list)
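For readers unfamiliar with the reported metrics, here is a minimal sketch of reciprocal rank and average precision for a single query. It uses one common definition of each; the helpers in `utils.py` are not shown in this notebook and may differ in details such as how the cut-off or the normalization is applied. The function names and the toy ranking are hypothetical.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant, cut):
    """Mean of precision@k over the ranks k (up to `cut`) that hold a relevant document."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked[:cut], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Normalizing by all relevant documents penalizes those never retrieved.
    return total / len(relevant)


ranked = [5, 9, 2, 7]     # toy ranking, best first
relevant = {9, 7, 40}     # document 40 is relevant but was not retrieved

rr = reciprocal_rank(ranked, relevant)
ap = average_precision(ranked, relevant, cut=100)
print(rr, round(ap, 4))
```

MRR and MAP are then the means of these per-query values over all test queries.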

# Results

## a + 50/50

In [11]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [12]:
models_dir = "/home/jupyter/BERT-BM25-Thesis-Project/Models/" #@param {type:"string"}
custom_model_name = "BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin" #@param {type:"string"}

custom_model_path = models_dir + custom_model_name 

In [13]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4




MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1634
Testing...
  Test MRR:  0.8308
  Test MAP:  0.4201
  Test NDCG: 0.5612
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1498
Testing...
  Test MRR:  0.7231
  Test MAP:  0.3548
  Test NDCG: 0.5006
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1508
Testing...
  Test MRR:  0.8270
  Test MAP:  0.4412
  Test NDCG: 0.5722
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1660
Testing...
  Test MRR:  0.7409
  Test MAP:  0.3711
  Test NDCG: 0.4806
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1522
Testing...
  Test MRR:  0.8330
  Test MAP:  0.4289
  Test NDCG: 0.5867
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7910
  BERT MAP:  0.4032
  BERT NDCG: 0.5403
p-value MRR: 0.1016
p-value MAP: 0.0020
p-value NDCG: 0.0065
Time:  39.14952600188332  min
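The summary figures can be recomputed from the per-fold values in the log above. This sketch copies those values by hand and assumes that the unlabelled `MRR/MAP/NDCG` lines printed before each fold's training are the BM25 scores for that fold, and that the summary is the plain mean over the five folds.

```python
# Per-fold scores copied from the run log above: (MRR, MAP, NDCG).
bm25_folds = [
    (0.7837, 0.3493, 0.5011),
    (0.6596, 0.3036, 0.4546),
    (0.7611, 0.3341, 0.4826),
    (0.6859, 0.3317, 0.4408),
    (0.7796, 0.3182, 0.4780),
]
bert_folds = [
    (0.8308, 0.4201, 0.5612),
    (0.7231, 0.3548, 0.5006),
    (0.8270, 0.4412, 0.5722),
    (0.7409, 0.3711, 0.4806),
    (0.8330, 0.4289, 0.5867),
]


def fold_means(folds):
    """Average each metric column over the folds, formatted like the log."""
    n = len(folds)
    return ["{:.4f}".format(sum(column) / n) for column in zip(*folds)]


bm25_means = fold_means(bm25_folds)
bert_means = fold_means(bert_folds)
print("BM25 (MRR, MAP, NDCG):", bm25_means)
print("BERT (MRR, MAP, NDCG):", bert_means)
```

The recomputed means match the reported summary (BM25 0.7340 / 0.3274 / 0.4714, BERT 0.7910 / 0.4032 / 0.5403), which supports reading the summary as a simple average of fold-level scores.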


In [14]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 2
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [15]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1651
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1134
Testing...
  Test MRR:  0.8550
  Test MAP:  0.4338
  Test NDCG: 0.5766
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1586
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1078
Testing...
  Test MRR:  0.7189
  Test MAP:  0.3603
  Test NDCG: 0.5021
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1615
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1154
Testing...
  Test MRR:  0.8279
  Test MAP:  0.4323
  Test NDCG: 0.5656
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1569
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1119
Testing...
  Test MRR:  0.6918
  Test MAP:  0.3777
  Test NDCG: 0.4566
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225



LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1497
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1022
Testing...
  Test MRR:  0.8508
  Test MAP:  0.4374
  Test NDCG: 0.5923
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7889
  BERT MAP:  0.4083
  BERT NDCG: 0.5386
p-value MRR: 0.1121
p-value MAP: 0.0010
p-value NDCG: 0.0074
Time:  70.37577252115001  min


In [16]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 3e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [17]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1675
Testing...
  Test MRR:  0.8521
  Test MAP:  0.4241
  Test NDCG: 0.5671
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1504
Testing...
  Test MRR:  0.7489
  Test MAP:  0.3577
  Test NDCG: 0.5062
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1512
Testing...
  Test MRR:  0.8686
  Test MAP:  0.4439
  Test NDCG: 0.5782
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1824
Testing...
  Test MRR:  0.7103
  Test MAP:  0.3611
  Test NDCG: 0.4598
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1476
Testing...
  Test MRR:  0.8658
  Test MAP:  0.4383
  Test NDCG: 0.6024
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.8092
  BERT MAP:  0.4050
  BERT NDCG: 0.5427
p-value MRR: 0.0279
p-value MAP: 0.0015
p-value NDCG: 0.0045
Time:  37.709212311933335  min
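
The summary rows at the end of a run can be cross-checked against the per-fold output. The following is a minimal sketch, assuming the `BM25` and `BERT` summary values are unweighted means over the five folds; the notebook output does not state this explicitly. The fold scores are copied from the run above (`LEARNING_RATE = 3e-5`, `EPOCHS = 1`), and `fold_mean` is a hypothetical helper, not a function from `utils`.

```python
from statistics import mean

# Per-fold scores copied from the output above (LEARNING_RATE = 3e-5, EPOCHS = 1).
# "Test ..." lines give the BERT re-ranker scores, the bare MRR/MAP/NDCG lines give BM25.
bert_folds = {
    "MRR":  [0.8521, 0.7489, 0.8686, 0.7103, 0.8658],
    "MAP":  [0.4241, 0.3577, 0.4439, 0.3611, 0.4383],
    "NDCG": [0.5671, 0.5062, 0.5782, 0.4598, 0.6024],
}
bm25_folds = {
    "MRR":  [0.7837, 0.6596, 0.7611, 0.6859, 0.7796],
    "MAP":  [0.3493, 0.3036, 0.3341, 0.3317, 0.3182],
    "NDCG": [0.5011, 0.4546, 0.4826, 0.4408, 0.4780],
}


def fold_mean(folds):
    """Hypothetical helper: unweighted mean of each metric across folds."""
    return {metric: mean(scores) for metric, scores in folds.items()}


bert_summary = fold_mean(bert_folds)
bm25_summary = fold_mean(bm25_folds)

for metric in ("MRR", "MAP", "NDCG"):
    print(f"{metric}: BM25 {bm25_summary[metric]:.4f}  BERT {bert_summary[metric]:.4f}")
```

Under this assumption the means agree with the printed summary to within the rounding of the four-decimal fold scores.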


In [18]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 3e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 2
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [19]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1711
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1153
Testing...
  Test MRR:  0.8689
  Test MAP:  0.4277
  Test NDCG: 0.5721
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1577
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1022
Testing...
  Test MRR:  0.7603
  Test MAP:  0.3570
  Test NDCG: 0.5058
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1688
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1256
Testing...
  Test MRR:  0.8615
  Test MAP:  0.4332
  Test NDCG: 0.5691
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1590
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1082
Testing...
  Test MRR:  0.7406
  Test MAP:  0.3848
  Test NDCG: 0.4846
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-a-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1600
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1138
Testing...
  Test MRR:  0.8676
  Test MAP:  0.4458
  Test NDCG: 0.5933
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.8198
  BERT MAP:  0.4097
  BERT NDCG: 0.5450
p-value MRR: 0.0111
p-value MAP: 0.0008
p-value NDCG: 0.0031
Time:  70.34992991628336  min
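
To put the summary in perspective, the relative improvement of the BERT re-ranker over the BM25 baseline can be worked out from the printed values. This is a small sketch using the numbers from the run above (`LEARNING_RATE = 3e-5`, `EPOCHS = 2`); `relative_gain` is a hypothetical helper introduced only for illustration.

```python
# Summary values copied from the output above (LEARNING_RATE = 3e-5, EPOCHS = 2).
bm25 = {"MRR": 0.7340, "MAP": 0.3274, "NDCG": 0.4714}
bert = {"MRR": 0.8198, "MAP": 0.4097, "NDCG": 0.5450}


def relative_gain(baseline, score):
    """Hypothetical helper: improvement of score over baseline, as a fraction."""
    return (score - baseline) / baseline


gains = {metric: relative_gain(bm25[metric], bert[metric]) for metric in bm25}

for metric, gain in gains.items():
    print(f"{metric}: {gain:+.1%}")
```

In this run MAP shows the largest relative gain of the three metrics, and it also has the smallest p-value.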


## All-sentence

In [20]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Set the seed value all over the place to make this reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [21]:
models_dir = "/home/jupyter/BERT-BM25-Thesis-Project/Models/" #@param {type:"string"}
custom_model_name = "BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin" #@param {type:"string"}

# Build the checkpoint path from the directory and file name set above.
custom_model_path = os.path.join(models_dir, custom_model_name)

In [22]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1608
Testing...
  Test MRR:  0.8371
  Test MAP:  0.4209
  Test NDCG: 0.5645
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1484
Testing...
  Test MRR:  0.7304
  Test MAP:  0.3657
  Test NDCG: 0.5037
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1519
Testing...
  Test MRR:  0.8419
  Test MAP:  0.4462
  Test NDCG: 0.5781
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1637
Testing...
  Test MRR:  0.7386
  Test MAP:  0.3940
  Test NDCG: 0.4990
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1499
Testing...
  Test MRR:  0.8437
  Test MAP:  0.4467
  Test NDCG: 0.5938
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7983
  BERT MAP:  0.4147
  BERT NDCG: 0.5478
p-value MRR: 0.0616
p-value MAP: 0.0004
p-value NDCG: 0.0026
Time:  37.64391728051669  min


In [23]:
LEARNING_RATE = 2e-5
EPOCHS = 2

In [24]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Average training loss: 0.1509
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1540
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1059
Testing...
  Test MRR:  0.7309
  Test MAP:  0.3620
  Test NDCG: 0.5050
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1083
Testing...
  Test MRR:  0.9028
  Test MAP:  0.4545
  Test NDCG: 0.5912
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1079
Testing...
  Test MRR:  0.7680
  Test MAP:  0.4038
  Test NDCG: 0.5058
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1506
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1014
Testing...
  Test MRR:  0.8069
  Test MAP:  0.4308
  Test NDCG: 0.5775
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.8097
  BERT MAP:  0.4151
  BERT NDCG: 0.5494
p-value MRR: 0.0261
p-value MAP: 0.0004
p-value NDCG: 0.0018
Time:  70.2372263993167  min


In [25]:
LEARNING_RATE = 3e-5
EPOCHS = 1

In [26]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1569
Testing...
  Test MRR:  0.8104
  Test MAP:  0.4031
  Test NDCG: 0.5554
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1598
Testing...
  Test MRR:  0.7265
  Test MAP:  0.3479
  Test NDCG: 0.4910
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1536
Testing...
  Test MRR:  0.8671
  Test MAP:  0.4479
  Test NDCG: 0.5825
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1689
Testing...
  Test MRR:  0.7173
  Test MAP:  0.3799
  Test NDCG: 0.4760
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1572
Testing...
  Test MRR:  0.8225
  Test MAP:  0.4046
  Test NDCG: 0.5532
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7887
  BERT MAP:  0.3967
  BERT NDCG: 0.5316
p-value MRR: 0.1159
p-value MAP: 0.0046
p-value NDCG: 0.0178
Time:  37.59691467214998  min
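
The summary lines at the end of each run appear to be the plain mean of the five per-fold test values printed above (the counters 45, 90, ..., 225 suggest five folds of 45 queries each). The sketch below checks this for the run with LEARNING_RATE = 3e-5 and EPOCHS = 1, using numbers copied from the log. The variable names are illustrative and are not part of the notebook's code.

```python
# Per-fold test scores copied from the log of the run above
# (LEARNING_RATE = 3e-5, EPOCHS = 1). Names are illustrative only.
fold_scores = {
    "MRR":  [0.8104, 0.7265, 0.8671, 0.7173, 0.8225],
    "MAP":  [0.4031, 0.3479, 0.4479, 0.3799, 0.4046],
    "NDCG": [0.5554, 0.4910, 0.5825, 0.4760, 0.5532],
}

# Summary values reported at the end of the same run ("BERT MRR" etc.).
reported = {"MRR": 0.7887, "MAP": 0.3967, "NDCG": 0.5316}

# Unweighted mean over the five folds.
fold_means = {name: sum(vals) / len(vals) for name, vals in fold_scores.items()}

for name, mean in fold_means.items():
    # Small gaps are expected because the per-fold values are already rounded.
    print(f"{name}: mean of folds = {mean:.4f}, reported = {reported[name]:.4f}")
```

The same check works for the BM25 lines, since each fold also prints its BM25 baseline before training starts.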


In [27]:
LEARNING_RATE = 3e-5
EPOCHS = 2

In [28]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1723
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1122
Testing...
  Test MRR:  0.8553
  Test MAP:  0.4223
  Test NDCG: 0.5710
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1624
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1080
Testing...
  Test MRR:  0.7431
  Test MAP:  0.3577
  Test NDCG: 0.4994
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1587
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1706
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1089
Testing...
  Test MRR:  0.6924
  Test MAP:  0.3801
  Test NDCG: 0.4725
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1564
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1069
Testing...
  Test MRR:  0.8370
  Test MAP:  0.4246
  Test NDCG: 0.5738
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7927
  BERT MAP:  0.4023
  BERT NDCG: 0.5341
p-value MRR: 0.0875
p-value MAP: 0.0022
p-value NDCG: 0.0123
Time:  70.22659064921669  min
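
The log reports p-values for BERT against BM25 but does not show how they are computed. One common choice for per-query metric pairs is a paired test. The sketch below uses a two-sided paired sign-flip permutation test, which is an assumption and not necessarily what `train_test()` does. The function name, the resample count, and the toy scores are all hypothetical.

```python
import random


def paired_permutation_pvalue(scores_a, scores_b, n_resamples=10000, seed=76):
    """Two-sided paired sign-flip permutation test on per-query scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems need one score per query")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each per-query difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical per-query average precision for eight queries (not from the notebook).
bm25_ap = [0.30, 0.25, 0.40, 0.10, 0.35, 0.20, 0.45, 0.15]
bert_ap = [0.42, 0.31, 0.47, 0.22, 0.41, 0.33, 0.52, 0.24]

p_value = paired_permutation_pvalue(bert_ap, bm25_ap)
print(f"p-value: {p_value:.4f}")
```

A permutation test is used here only because it needs nothing beyond the standard library. A paired t-test or a Wilcoxon signed-rank test on the same per-query pairs would be equally plausible candidates.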


## Title + 50/50 random/next

In [10]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Seed the random number generators so that runs are reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False
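
`utils.initialize_random_generators` lives in `utils.py`, which is not shown in this notebook. A minimal sketch of what such a helper typically does, assuming it seeds Python, NumPy, and PyTorch. The name `seed_everything` is hypothetical, and the real helper may do more or less than this.

```python
import random


def seed_everything(seed):
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored when CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


seed_everything(76)
first_draw = random.random()
seed_everything(76)
second_draw = random.random()
print(first_draw == second_draw)
```

Seeding alone does not guarantee bit-identical GPU results, since some CUDA kernels are nondeterministic, so small differences between otherwise identical runs are still possible.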

In [11]:
models_dir = "/home/jupyter/BERT-BM25-Thesis-Project/Models/" #@param {type:"string"}
custom_model_name = "BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin" #@param {type:"string"}

custom_model_path = os.path.join(models_dir, custom_model_name)

In [31]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1563
Testing...
  Test MRR:  0.8325
  Test MAP:  0.4252
  Test NDCG: 0.5603
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1486
Testing...
  Test MRR:  0.7161
  Test MAP:  0.3613
  Test NDCG: 0.5057
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1489
Testing...
  Test MRR:  0.8763
  Test MAP:  0.4536
  Test NDCG: 0.5895
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1650
Testing...
  Test MRR:  0.7425
  Test MAP:  0.3978
  Test NDCG: 0.4915
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1486
Testing...
  Test MRR:  0.8224
  Test MAP:  0.4313
  Test NDCG: 0.5921
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7980
  BERT MAP:  0.4138
  BERT NDCG: 0.5478
p-value MRR: 0.0636
p-value MAP: 0.0005
p-value NDCG: 0.0022
Time:  37.61406568605004  min
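
The Hugging Face warning in the output above lists checkpoint weights that `BertForSequenceClassification` does not use. All of them belong to the two pre-training heads of `bert-base-uncased`: `cls.predictions` is the masked language modelling head and `cls.seq_relationship` is the next sentence prediction head. The classifier replaces both with its own head, so the warning is expected here. A small sketch that groups the key names copied from the warning by their prefix; the helper name is illustrative.

```python
from collections import defaultdict

# Key names copied verbatim from the warning in the log.
unused_keys = [
    "cls.predictions.transform.dense.weight",
    "cls.predictions.transform.LayerNorm.bias",
    "cls.predictions.decoder.weight",
    "cls.seq_relationship.weight",
    "cls.predictions.transform.dense.bias",
    "cls.predictions.bias",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.seq_relationship.bias",
]


def group_by_head(keys):
    """Group parameter names by their first two dotted components."""
    groups = defaultdict(list)
    for key in keys:
        head = ".".join(key.split(".")[:2])
        groups[head].append(key)
    return dict(groups)


groups = group_by_head(unused_keys)
for head, keys in groups.items():
    print(f"{head}: {len(keys)} tensors")
```

None of the listed names starts with `bert.`, which is consistent with the encoder weights themselves being loaded from the checkpoint.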


In [32]:
LEARNING_RATE = 2e-5
EPOCHS = 2

In [33]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1489
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1046
Testing...
  Test MRR:  0.8578
  Test MAP:  0.4273
  Test NDCG: 0.5676
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1734
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1252
Testing...
  Test MRR:  0.6777
  Test MAP:  0.3464
  Test NDCG: 0.4822
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1540
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1101
Testing...
  Test MRR:  0.8811
  Test MAP:  0.4555
  Test NDCG: 0.5885
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1563
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1071
Testing...
  Test MRR:  0.7350
  Test MAP:  0.3883
  Test NDCG: 0.4985
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1458
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1020
Testing...
  Test MRR:  0.8340
  Test MAP:  0.4249
  Test NDCG: 0.5732
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7971
  BERT MAP:  0.4085
  BERT NDCG: 0.5420
p-value MRR: 0.0654
p-value MAP: 0.0011
p-value NDCG: 0.0047
Time:  70.19557152359994  min
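
Each run reports MRR, MAP, and NDCG, with MAP_CUT = 100 and NDCG_CUT = 20 set in the hyper-parameter cell. The notebook's own metric code is in `utils.py` and is not shown, so the following is only a sketch of the textbook per-query definitions. The function names, the linear gain in NDCG, and the toy ranking are assumptions, and the project's implementation may differ in details such as gain scaling or normalisation.

```python
import math


def reciprocal_rank(ranked_relevance):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_relevance, n_relevant, cut=100):
    """Mean of precision@k over the ranks k (up to `cut`) that hold a relevant document."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance[:cut], start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / n_relevant if n_relevant else 0.0


def ndcg(ranked_relevance, all_relevance, cut=20):
    """DCG of the ranking divided by the DCG of the ideal ordering (linear gain)."""
    def dcg(gains):
        return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

    ideal = dcg(sorted(all_relevance, reverse=True)[:cut])
    return dcg(ranked_relevance[:cut]) / ideal if ideal else 0.0


# Toy ranking for one query: relevant documents at ranks 2 and 4.
ranking = [0, 1, 0, 1]
print(reciprocal_rank(ranking))
print(average_precision(ranking, n_relevant=2))
print(ndcg(ranking, all_relevance=ranking))
```

MRR and MAP are then the means of `reciprocal_rank` and `average_precision` over all queries in a fold.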


In [34]:
LEARNING_RATE = 3e-5
EPOCHS = 1

In [35]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1551
Testing...
  Test MRR:  0.8303
  Test MAP:  0.4117
  Test NDCG: 0.5599
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1801
Testing...
  Test MRR:  0.6881
  Test MAP:  0.3292
  Test NDCG: 0.4628
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1530
Testing...
  Test MRR:  0.8608
  Test MAP:  0.4458
  Test NDCG: 0.5792
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1646
Testing...
  Test MRR:  0.7289
  Test MAP:  0.3882
  Test NDCG: 0.4787
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1514
Testing...
  Test MRR:  0.8218
  Test MAP:  0.4126
  Test NDCG: 0.5613
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7859
  BERT MAP:  0.3975
  BERT NDCG: 0.5284
p-value MRR: 0.1341
p-value MAP: 0.0042
p-value NDCG: 0.0236
Time:  37.57241623715005  min
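
The aggregate `BM25` and `BERT` figures printed at the end of the run appear to be averages over the five test folds. The sketch below copies the per-fold values from the output above (`LEARNING_RATE = 3e-5`, `EPOCHS = 1`) and recomputes those averages. The names `bm25_folds`, `bert_folds` and `fold_means` are illustrative and are not part of `utils.py`.

```python
from statistics import mean

# Per-fold values copied from the output above (LEARNING_RATE = 3e-5, EPOCHS = 1).
bm25_folds = {
    "MRR":  [0.7837, 0.6596, 0.7611, 0.6859, 0.7796],
    "MAP":  [0.3493, 0.3036, 0.3341, 0.3317, 0.3182],
    "NDCG": [0.5011, 0.4546, 0.4826, 0.4408, 0.4780],
}
bert_folds = {
    "MRR":  [0.8303, 0.6881, 0.8608, 0.7289, 0.8218],
    "MAP":  [0.4117, 0.3292, 0.4458, 0.3882, 0.4126],
    "NDCG": [0.5599, 0.4628, 0.5792, 0.4787, 0.5613],
}


def fold_means(folds):
    """Average each metric over the folds (hypothetical helper)."""
    return {metric: mean(values) for metric, values in folds.items()}


bm25_mean = fold_means(bm25_folds)
bert_mean = fold_means(bert_folds)
for metric in ("MRR", "MAP", "NDCG"):
    print(f"{metric}: BM25 {bm25_mean[metric]:.4f}  BERT {bert_mean[metric]:.4f}")
```

The recomputed means agree with the reported `BM25` and `BERT` lines up to rounding of the printed fold values, which supports reading the final block as a five-fold average.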


In [12]:
LEARNING_RATE = 3e-5
EPOCHS = 2

In [13]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4




MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1563
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1046
Testing...
  Test MRR:  0.8718
  Test MAP:  0.4393
  Test NDCG: 0.5819
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1552
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1027
Testing...
  Test MRR:  0.7205
  Test MAP:  0.3559
  Test NDCG: 0.5002
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1589
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1075
Testing...
  Test MRR:  0.8273
  Test MAP:  0.4386
  Test NDCG: 0.5573
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1839
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1300
Testing...
  Test MRR:  0.7536
  Test MAP:  0.3890
  Test NDCG: 0.4774
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1531
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1057
Testing...
  Test MRR:  0.8411
  Test MAP:  0.4358
  Test NDCG: 0.5843
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.8029
  BERT MAP:  0.4117
  BERT NDCG: 0.5402
p-value MRR: 0.0441
p-value MAP: 0.0007
p-value NDCG: 0.0060
Time:  79.56072618319999  min
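
To make the comparison with the BM25 baseline easier to read, the following sketch turns the summary lines of the run above (`LEARNING_RATE = 3e-5`, `EPOCHS = 2`) into absolute and relative gains and flags which differences fall below a significance threshold. The threshold `ALPHA = 0.05` is a conventional choice assumed here, not something the notebook states, and the helper `gains` is hypothetical.

```python
# Summary values copied from the output above (LEARNING_RATE = 3e-5, EPOCHS = 2).
bm25 = {"MRR": 0.7340, "MAP": 0.3274, "NDCG": 0.4714}
bert = {"MRR": 0.8029, "MAP": 0.4117, "NDCG": 0.5402}
p_values = {"MRR": 0.0441, "MAP": 0.0007, "NDCG": 0.0060}

ALPHA = 0.05  # assumed significance threshold, not taken from the notebook


def gains(baseline, model):
    """Return {metric: (absolute gain, relative gain in percent)}."""
    return {
        m: (model[m] - baseline[m], 100.0 * (model[m] - baseline[m]) / baseline[m])
        for m in baseline
    }


result = gains(bm25, bert)
for metric, (abs_gain, rel_gain) in result.items():
    significant = p_values[metric] < ALPHA
    print(f"{metric}: {abs_gain:+.4f} absolute, {rel_gain:+.1f}% relative, "
          f"p={p_values[metric]:.4f}, significant={significant}")
```

Under that assumed threshold all three metrics improve significantly in this run, with MAP showing the largest relative gain.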


## title+all doc

In [47]:
# ========================================
#               Hyper-Parameters
# ========================================
SEED = 76
MODE = 'Re-ranker'
MODEL_TYPE = 'bert-base-uncased'
LEARNING_RATE = 2e-5
MAX_LENGTH = 128
BATCH_SIZE = 32
EPOCHS = 1
TOP_BM25 = 100
MAP_CUT = 100
NDCG_CUT = 20
if MODE == 'Full-ranker':
    TEST_BATCH_SIZE = 1400
else:
    TEST_BATCH_SIZE = 100

# Seed all random number generators so that the run is reproducible.
utils.initialize_random_generators(SEED)

BM25_ENRICH = 'default' # or 'add' or 'swap' (default=no enrichment of BM25 results)

LOAD_CUSTOM_TRAINED_MODEL = True
DO_FREEZING = False

In [48]:
models_dir = "/home/jupyter/BERT-BM25-Thesis-Project/Models/" #@param {type:"string"}
custom_model_name = "BERT_Cranfield_NSP_model-title-all-128-16-2e-05-1.bin" #@param {type:"string"}

# Full path of the custom NSP checkpoint to load.
custom_model_path = os.path.join(models_dir, custom_model_name)
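
Because a full `train_test()` run takes well over half an hour, it can be worth checking that the checkpoint file exists before starting. The sketch below is one way to do that. `resolve_checkpoint` is a hypothetical helper, not a function from `utils.py` or `data_utils.py`.

```python
import os


def resolve_checkpoint(models_dir, model_name):
    """Join directory and file name and fail early if the file is missing."""
    path = os.path.join(models_dir, model_name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Checkpoint not found: {path}")
    return path
```

In this notebook it could be called as `custom_model_path = resolve_checkpoint(models_dir, custom_model_name)`, so a mistyped model name raises immediately instead of partway through the first fold.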

In [None]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
2e-05
128
32
1
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1554
Testing...
  Test MRR:  0.8103
  Test MAP:  0.4065
  Test NDCG: 0.5539
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1559
Testing...
  Test MRR:  0.7234
  Test MAP:  0.3593
  Test NDCG: 0.5065
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-all-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.


In [None]:
LEARNING_RATE = 2e-5
EPOCHS = 2

In [None]:
train_test()

In [None]:
LEARNING_RATE = 3e-5
EPOCHS = 1

In [None]:
train_test()

In [None]:
LEARNING_RATE = 3e-5
EPOCHS = 2

In [46]:
train_test()

#               Hyper-Parameters
Re-ranker
bert-base-uncased
3e-05
128
32
2
#               Experiment-Settings
BM25_ENRICHMENT:  default
BM25_ENRICHMENT:  default
#               Other
Tesla T4
GPU Type: Tesla T4
MRR:  0.7837
MAP:  0.3493
NDCG: 0.5011
45


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at

LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1690
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1175
Testing...
  Test MRR:  0.7840
  Test MAP:  0.3893
  Test NDCG: 0.5409
45
MRR:  0.6596
MAP:  0.3036
NDCG: 0.4546
90


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1652
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1152
Testing...
  Test MRR:  0.7226
  Test MAP:  0.3499
  Test NDCG: 0.4907
90
MRR:  0.7611
MAP:  0.3341
NDCG: 0.4826
135


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1572
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1103
Testing...
  Test MRR:  0.8213
  Test MAP:  0.4415
  Test NDCG: 0.5685
135
MRR:  0.6859
MAP:  0.3317
NDCG: 0.4408
180


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1601
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.0995
Testing...
  Test MRR:  0.7410
  Test MAP:  0.3902
  Test NDCG: 0.4831
180
MRR:  0.7796
MAP:  0.3182
NDCG: 0.4780
225


LOAD :  /home/jupyter/BERT-BM25-Thesis-Project/Models/BERT_Cranfield_NSP_model-title-50-50-128-16-2e-05-1.bin
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1559
Training...
  Batch   100  of    563.
  Batch   200  of    563.
  Batch   300  of    563.
  Batch   400  of    563.
  Batch   500  of    563.
  Average training loss: 0.1097
Testing...
  Test MRR:  0.7995
  Test MAP:  0.4087
  Test NDCG: 0.5627
225
  BM25 MRR:  0.7340
  BM25 MAP:  0.3274
  BM25 NDCG: 0.4714
  BERT MRR:  0.7737
  BERT MAP:  0.3959
  BERT NDCG: 0.5292
p-value MRR: 0.2560
p-value MAP: 0.0047
p-value NDCG: 0.0214
Time:  70.21518395101666  min
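
Comparing runs by reading the raw output is error-prone, so the per-fold test metrics can be pulled out of the printed log with a small parser. This is a minimal sketch that assumes the log format shown above (`Test MRR:`, `Test MAP:`, `Test NDCG:` lines). `parse_fold_metrics` is a hypothetical helper, and the sample text is an excerpt of the first two folds of the output above.

```python
import re
from collections import defaultdict

# Matches lines such as "  Test MRR:  0.7840" but not the BM25 lines "MRR:  0.6596".
TEST_LINE = re.compile(r"^\s*Test (MRR|MAP|NDCG):\s+([0-9.]+)\s*$")


def parse_fold_metrics(log_text):
    """Collect the per-fold test metrics from a train_test() log."""
    metrics = defaultdict(list)
    for line in log_text.splitlines():
        match = TEST_LINE.match(line)
        if match:
            metrics[match.group(1)].append(float(match.group(2)))
    return dict(metrics)


sample = """\
Testing...
  Test MRR:  0.7840
  Test MAP:  0.3893
  Test NDCG: 0.5409
45
MRR:  0.6596
Testing...
  Test MRR:  0.7226
  Test MAP:  0.3499
  Test NDCG: 0.4907
"""

folds = parse_fold_metrics(sample)
print(folds)
```

Each metric maps to a list with one value per fold, which can then be averaged or compared across hyper-parameter settings.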
