In [1]:
import sys
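# sys.path entries must be directories, not files; point at the project root
# (assumed here to be the folder that contains src/).
sys.path.insert(0, "/home/dzigen/Desktop/ITMO/ВКР/КМУ2024")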

from src.retrievers.bm25colbert import BM25ColBertRetriever
from src.retrievers.bm25e5 import BM25E5Retriever
from src.readers.fid import FiDReader
from src.retrievers.e5 import E5Retriever

import torch
from tqdm import tqdm

In [2]:
TUNED_READER_PATH = '/home/dzigen/Desktop/ITMO/ВКР/КМУ2024/logs/join_e5_fid_triviaqa/reader_bestmodel.pt'
BASE_PATH = '/home/dzigen/Desktop/ITMO/ВКР/КМУ2024/data/bases/e5_scipdf_base'
READER_INPUT_FORMAT = "context: {c}\n\nquestion: {q}"
READER_GEN_ML = 256

QUESTIONS = [
    "What is a RETRO approach",
    "What is a kNN-LM approach",
    "What is a DPR approach",
    "What is a RAG approach",
    "What is a FiD approach",
    "What is a EMDR2 approach",
    "What is a Atlas approach",
    "What is a REPLUG approach",
    "What is a ColBERT approach"
]
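
The `READER_INPUT_FORMAT` template is applied once per retrieved passage, so a single question yields as many reader inputs as there are passages. A minimal sketch of that step, assuming a hypothetical helper `build_reader_inputs` and placeholder passages that are not part of the project:

```python
READER_INPUT_FORMAT = "context: {c}\n\nquestion: {q}"


def build_reader_inputs(query, passages):
    """Return one formatted reader input per retrieved passage."""
    return [READER_INPUT_FORMAT.format(q=query, c=passage) for passage in passages]


# Placeholder passages, used only to show the resulting layout.
example_inputs = build_reader_inputs(
    "What is a DPR approach",
    ["placeholder passage A", "placeholder passage B"],
)
print(example_inputs[0])
```

Each element keeps the passage first and the question last, which is the layout the reader is given in `inference` below.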

In [3]:
def inference(reader, retriever, queries):
    answers = []
    for query in tqdm(queries):
        print("QUERY: ", query)
        texts, k_scores, metadata = retriever.search(query)

        print("CONTEXTS:\n", '\n\n'.join(texts))

        # Build one "context + question" input per retrieved passage
        formatted_txts = [
            READER_INPUT_FORMAT.format(q=query, c=t) for t in texts]

        tokenized_txts = reader.tokenize(formatted_txts)

        cands_k = len(texts)

        # Generate one answer conditioned jointly on all retrieved passages:
        # inputs are reshaped to (batch=1, cands_k passages, sequence length)
        output = reader.model.generate(
            input_ids=tokenized_txts['input_ids'].view(1, cands_k, -1),
            attention_mask=tokenized_txts['attention_mask'].view(1, cands_k, -1),
            max_length=READER_GEN_ML, eos_token_id=reader.tokenizer.eos_token_id)

        predicted = reader.tokenizer.batch_decode(output, skip_special_tokens=True)
        answers += predicted

        print("ANSWER: ", predicted[0])

    return answers
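
The loop above only requires that `retriever.search(query)` return a `(texts, scores, metadata)` triple. The sketch below is a hypothetical stand-in, not one of the project's retrievers: `StubRetriever`, its term-overlap scoring, and the shape of `metadata` are all assumptions made for illustration, and can be handy for checking the control flow without loading any model.

```python
class StubRetriever:
    """Hypothetical stand-in mimicking the search() contract used by inference()."""

    def __init__(self, passages, top_k=2):
        self.passages = list(passages)
        self.top_k = top_k

    def search(self, query):
        query_terms = set(query.lower().split())
        scored = []
        for idx, passage in enumerate(self.passages):
            # Score = number of distinct query terms that occur in the passage.
            overlap = len(query_terms & set(passage.lower().split()))
            scored.append((overlap, idx))
        # Highest overlap first; ties keep the original passage order.
        scored.sort(key=lambda item: (-item[0], item[1]))
        top = scored[: self.top_k]
        texts = [self.passages[idx] for _, idx in top]
        scores = [float(overlap) for overlap, _ in top]
        metadata = [{"index": idx} for _, idx in top]  # assumed metadata shape
        return texts, scores, metadata


stub = StubRetriever([
    "placeholder passage about retrieval",
    "another placeholder passage",
    "unrelated text",
])
texts, scores, metadata = stub.search("what is retrieval")
print(texts, scores, metadata)
```

Any object exposing the same `search` signature could be passed as the `retriever` argument; the real retrievers score with E5 or ColBERT rather than term overlap.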

In [5]:
reader = FiDReader()
reader.model.load_state_dict(torch.load(TUNED_READER_PATH))
#reader.load_model(TUNED_READER_PATH)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


#### E5 + FiD

In [9]:
TUNED_RETRIEVER_PATH = '/home/dzigen/Desktop/ITMO/ВКР/КМУ2024/logs/join_e5_fid_triviaqa/retriever_bestmodel.pt'

e5_retriever = E5Retriever()
e5_retriever.model.load_state_dict(torch.load(TUNED_RETRIEVER_PATH))
#e5_retriever.load_model(TUNED_RETRIEVER_PATH)
e5_retriever.load_base(BASE_PATH)

Loading query E5-model...
Loading document E5-model...
Loading precomputed e5-base...


`embedding_function` is expected to be an Embeddings object, support for passing in a function will soon be removed.


In [12]:
inference(reader, e5_retriever, QUESTIONS)

  0%|          | 0/9 [00:00<?, ?it/s]

QUERY:  What is a RETRO approach
CONTEXTS:
 Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
To answer the above question and bridge the missing gap, we perform an extensive study on RETRO, as to the best of our knowledge, RETRO is the only retrieval-augmented autoregressive LM that supports large-scale pretraining with retrieval on the massive pretraining corpus with hundreds of billion or trillion tokens.

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
These results further substantiate the potential of RETRO, which is pre-trained with retrieval capabilities, as a promising approach.

Improving language models by retrieving
from trillions of tokens
Retro models are flexible and can be used without retrieval at evaluation and still achieve comparable performance to baseline models.

Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
For open-domain QA tasks, RETRO achieves 

 11%|█         | 1/9 [00:02<00:21,  2.73s/it]

ANSWER:  Retrieval
QUERY:  What is a kNN-LM approach
CONTEXTS:
 GENERALIZATION THROUGH MEMORIZATION : NEAREST NEIGHBOR LANGUAGE MODELS
We introduce kNN-LM, an approach that extends a pre-trained LM by linearly interpolating its next word distribution with a k-nearest neighbors (kNN) model.

COPY IS ALL YOU NEED
• kNN-LM (Khandelwal et al., 2020) is a retrieval-augmented generation model, which extends a pre-trained neural language model by linearly interpolating its next token distribution with a k-nearest neighbors (kNN) model.

GENERALIZATION THROUGH MEMORIZATION : NEAREST NEIGHBOR LANGUAGE MODELS
We introduce kNN-LMs, which extend a pre-trained neural language model (LM) by linearly interpolating it with a k-nearest neighbors (kNN) model.

GENERALIZATION THROUGH MEMORIZATION : NEAREST NEIGHBOR LANGUAGE MODELS
The kNN-LM involves augmenting such a pre-trained LM with a nearest neighbors retrieval mechanism, without any additional training (the representations learned by the LM remain

 22%|██▏       | 2/9 [00:03<00:10,  1.56s/it]

ANSWER:  We introduce kNN-LMs
QUERY:  What is a DPR approach
CONTEXTS:
 Relevance-guided Supervision for OpenQA with ColBERT
(2020) propose a dense passage retriever (DPR) that directly trains the architecture in Figure 2(a) for retrieval, relying on a simple approach to collect positives and negatives.

Dense Passage Retrieval for Open-Domain Question Answering
Given a collection of M text passages, the goal of our dense passage retriever (DPR) is to index all the passages in a low-dimensional and continuous space, such that it can retrieve efficiently the top k passages relevant to the input question for the reader at run-time.

Relevance-guided Supervision for OpenQA with ColBERT
Using this simple strategy, DPR considerably outperforms both ORQA and REALM and established a new state-of-the-art for extractive OpenQA.

Dense Passage Retrieval for Open-Domain Question Answering
While both methods include additional pretraining tasks and employ an expensive end-to-end training regime, DP

 33%|███▎      | 3/9 [00:04<00:07,  1.21s/it]

ANSWER:  Dense Passage Retrieval
QUERY:  What is a RAG approach
CONTEXTS:
 SELF-RAG : LEARNING TO RETRIEVE , GENERATE , AND CRITIQUE THROUGH SELF-REFLECTION
Retrieval-Augmented Generation (RAG), an ad hoc approach that augments LMs with retrieval of relevant knowledge, decreases such issues.

Retrieval-Augmented Generation for Large Language Models: A Survey
The RAG research paradigm is continuously evolving, and we categorize it into three stages: Naive RAG, Advanced RAG, and Modular RAG, as showed in Figure 3.

Retrieval-Augmented Generation for Large Language Models: A Survey
Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases.

SELF-RAG : LEARNING TO RETRIEVE , GENERATE , AND CRITIQUE THROUGH SELF-REFLECTION
A few concurrent works2 on RAG propose new training or prompting strategies to improve widely-adopted RAG approaches.


 44%|████▍     | 4/9 [00:04<00:04,  1.00it/s]

ANSWER:  ad hoc approach
QUERY:  What is a FiD approach
CONTEXTS:
 End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
FiD-KD is a complex training procedure that requires multiple training stages and performs knowledge distillation with inter-attention scores.

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
The current best approach for training multi-document reader and retriever is FiD-KD (Izacard and Grave, 2021a).

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
First, the FiD reader is trained from the first term of the EMDR2 objective in which its likelihood is conditioned on all the retrieved documents, similar to how the reader is used at test time.

Improving language models by retrieving
from trillions of tokens
More recently, Emdr2 (Sachan et al., 2021) extends FiD by using an expectation-maximization algorithm to train the retriever end-to-e

 56%|█████▌    | 5/9 [00:05<00:04,  1.01s/it]

ANSWER:  Achieves State of the Art Results Compared to Similar sized Models
QUERY:  What is a EMDR2 approach
CONTEXTS:
 End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
EMDR2 is a framework that can be used to train retrieval-augmented text generation models for any task.

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
EMDR2 achieves new state-of-the-art results for models of comparable size on all datasets, outperforming recent approaches by 2-3 absolute exact match points.

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
We presented EMDR2, an end-to-end training method for retrieval-augmented question answering systems.

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
While EMDR2 has the potential to improve language models in the low-resource setting (as demonstrated by our results on WebQ in §3.4), it

 67%|██████▋   | 6/9 [00:07<00:03,  1.14s/it]

ANSWER:  A Framework for End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
QUERY:  What is a Atlas approach
CONTEXTS:
 Atlas: Few-shot Learning with
Retrieval Augmented Language Models
We also provided detailed ablations and analyses for what factors are important when training such retrieval-augmented models, and demonstrated Atlas’s updateability, interpretability and controlability capabilities.

Atlas: Few-shot Learning with
Retrieval Augmented Language Models
We finally evaluate this model, called Atlas, on different natural language understanding tasks in few-shot and full dataset settings.

Atlas: Few-shot Learning with
Retrieval Augmented Language Models
In this work we present Atlas, a carefully designed and pre-trained retrieval augmented language model able to learn knowledge intensive tasks with very few training examples.

Atlas: Few-shot Learning with
Retrieval Augmented Language Models
In the full dataset setting, Atlas is within

 78%|███████▊  | 7/9 [00:08<00:02,  1.08s/it]

ANSWER:  Atlas: Few-shot Learning with Retrieval Augmented Language Models
QUERY:  What is a REPLUG approach
CONTEXTS:
 REPLUG: Retrieval-Augmented Black-Box Language Models
In all settings, REPLUG ˜ improve the performance of various black-box language models, showing the effectiveness and generality of our approach.

REPLUG: Retrieval-Augmented Black-Box Language Models
We introduce REPLUG, a retrieval-augmented language modeling paradigm that treats the language model as a black box and augments it with a tuneable retrieval model.

REPLUG: Retrieval-Augmented Black-Box Language Models
We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.

Retrieval-Augmented Generation for Large Language Models: A Survey
REPLUG [72] utilizes a retriever and an LLM to calculate the probability distributions of the retrieved documents and then performs supervised training by computing t

 89%|████████▉ | 8/9 [00:09<00:00,  1.00it/s]

ANSWER:  Retrieval-Augmented Black-Box Language Models
QUERY:  What is a ColBERT approach
CONTEXTS:
 ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
ColBERT prescribes a simple framework for balancing the quality and cost of neural IR, particularly deep language models like BERT.

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Recall that ColBERT can be used for re-ranking the output of another retrieval model, typically a term-based model, or directly for end-to-end retrieval from a document collection.

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
To reconcile efficiency and contextualization in IR, we propose ColBERT, a ranking model based on contextualized late interaction over BERT.

ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
In this paper, we introduced ColBERT, a novel ranking model that emp

100%|██████████| 9/9 [00:09<00:00,  1.09s/it]

ANSWER:  Efficient and Effective Passage Search





['Retrieval',
 'We introduce kNN-LMs',
 'Dense Passage Retrieval',
 'ad hoc approach',
 'Achieves State of the Art Results Compared to Similar sized Models',
 'A Framework for End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering',
 'Atlas: Few-shot Learning with Retrieval Augmented Language Models',
 'Retrieval-Augmented Black-Box Language Models',
 'Efficient and Effective Passage Search']
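
The returned list is easier to review when each answer sits next to its question. A minimal sketch, assuming `inference` yields exactly one answer per query (as in the run above); `pair_answers` and `format_report` are hypothetical helpers, not part of the project:

```python
def pair_answers(questions, answers):
    """Pair each question with its generated answer, checking the lengths match."""
    if len(questions) != len(answers):
        raise ValueError(
            f"expected {len(questions)} answers, got {len(answers)}")
    return list(zip(questions, answers))


def format_report(pairs):
    """Render the pairs as aligned 'question | answer' lines."""
    width = max(len(question) for question, _ in pairs)
    return "\n".join(
        f"{question.ljust(width)} | {answer}" for question, answer in pairs)


# Two question/answer pairs copied from the run above.
pairs = pair_answers(
    ["What is a RETRO approach", "What is a DPR approach"],
    ["Retrieval", "Dense Passage Retrieval"],
)
report = format_report(pairs)
print(report)
```

The same helpers could be applied to the output of each retriever configuration to compare answers side by side.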

#### BM25ColBERT + FiD

In [None]:
TUNED_RETRIEVER_PATH = ''

colb_retriever = BM25ColBertRetriever()
colb_retriever.load_model(TUNED_RETRIEVER_PATH)
colb_retriever.load_base(BASE_PATH)

In [None]:
inference(reader, colb_retriever, QUESTIONS)

#### BM25E5 + FiD

In [None]:
TUNED_RETRIEVER_PATH = ''

e5_retriever = BM25E5Retriever()
e5_retriever.load_model(TUNED_RETRIEVER_PATH)
e5_retriever.load_base(BASE_PATH)

In [None]:
inference(reader, e5_retriever, QUESTIONS)