
ReSRer (Retriever, Summarizer, Reader)

Reducing context size and increasing QA scores simultaneously for ODQA (Open-Domain Question Answering)

Abstract

Large Language Models (LLMs) demonstrate strong performance on various tasks such as Question Answering and Reasoning. However, due to the nature of the Transformer architecture, they can only consider a restricted length of context. Despite recent attempts to extend context using techniques like Sparse Attention, there is little research on context shortening. Reducing context while maintaining performance is computationally efficient and particularly effective at removing noise unrelated to the query in documents retrieved by Retrieval-Augmented Generation (RAG). Summarization is the task of creating a shorter version of a text while preserving its principal information content. We propose the ReSRer architecture, which inserts a Summarizer model between the Retriever and the Reader of a traditional Open-Domain Question Answering system. This approach works with a variety of Readers and Retrievers while improving overall QA pipeline performance.
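The Retriever → Summarizer → Reader flow can be sketched as below. This is a minimal illustration with stubbed-out components (all function names here are hypothetical, not the actual API of this repo; the real pipeline lives in `qa_pipeline.py`):

```python
# Sketch of the ReSRer pipeline: retrieve passages, condense them with a
# summarizer, then extract a short answer span with a reader.
# All components below are placeholders, not the repo's real implementations.

def retrieve(question: str, top_k: int = 10) -> list[str]:
    # Placeholder: a real retriever queries a vector DB (e.g. Milvus).
    return [f"passage {i} about: {question}" for i in range(top_k)]

def summarize(question: str, passages: list[str]) -> str:
    # Placeholder: a real summarizer condenses passages with an LLM prompt.
    # This is where the context length shrinks before reaching the reader.
    return " ".join(passages)[:200]

def read(question: str, context: str) -> str:
    # Placeholder: a real reader extracts a concise answer span from context.
    return context.split()[0]

def resrer(question: str, top_k: int = 10) -> str:
    passages = retrieve(question, top_k)
    summary = summarize(question, passages)
    return read(question, summary)
```

The key design point is that the summarizer sits between the two existing stages, so any retriever and any reader can be combined without retraining either.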

Results

Demo on Huggingface Space

  • Demo on Huggingface Space

Score results

Total score results

Exact Match Increase as Top-k Increases


Exact Match Shrinkage Along the QA Pipeline


Token Count Changes as Top-k Changes


Prompt

We mainly focused on the NQ (Natural Questions) dataset this time.

Reader prompt for NQ

Extract a concise noun-based answer from the provided context for the question. Your answer should be under three words and extracted directly from a context of no more than five words. You can analyze the context step by step to derive the answer. Avoid using prefixes that indicate the type of answer; simply present the shortest relevant answer span from the context.

Summarizer prompt for NQ

We went through several prompt iterations; the final summarizer prompt is below.

Condense the provided passages to focus on key elements directly answering the question. Your summary should be a third of the original passages' length and at least 150 words. Highlight critical information and evidence supporting the answer. Avoid generalizations or unrelated details. Ensure the final answer is present in the summary, keeping the exact span of the answer to under five words. Present the summary in a clear, bullet-point format for each key element related to the question. Aim for a balance between conciseness and completeness.

Models

Trained model for the ReSRer reader on Huggingface, trained on a 55k training dataset generated from GPT-3 with the prompt below. Our main goal was not to train a small summarizing model, but rather to show that a summarizer module between the retriever and reader is an efficient method, so we did not delve into training on the most recent summarizer prompt dataset. As a result, this model's performance is not as good as with the original context (though still better than a naive summarizer). We release it because it might help people who want to reduce compute costs dramatically.

Contribution

As mentioned earlier, our research aimed to explore the potential benefits of an effective abstractive summarizer for QA tasks. Initially, we planned to test this approach further. However, given the significant advancements made by SuRe and LLMLingua in this domain, we decided to halt our research.

Although our improvement of 4% (nearly 20% relative to the original score) may not seem impressive, we demonstrate that a single summarizer module can effectively handle simple tasks such as single-hop question answering, in contrast to more complex multi-step approaches. However, we were unable to confirm whether this single-step context pruning is effective for more intricate tasks like reasoning and code generation. Therefore, there may be room for further contributions in this area in the future.

Get Started

1. Install dependencies

git clone https://github.com/seonglae/ReSRer
cd ReSRer
rye sync
# or
pip install .
# for training
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
pip install --force-reinstall typing-extensions==4.5.0
pip uninstall deepspeed
pip install deepspeed
pip uninstall -y apex

2. Create .env

MILVUS_PW=
MILVUS_HOST=resrer

3. QA pipeline

python qa_pipeline.py

Index to Vector DB

indexing.json

  • check the embedding dimension of TEI
  • subset target
  • streaming or not
  • collection name
python indexing.py
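A hypothetical `indexing.json` illustrating the options listed above (the field names here are assumptions; check the actual file in the repo for the real schema):

```json
{
  "collection_name": "resrer_wiki",
  "embedding_dim": 768,
  "subset": "wikipedia-nq",
  "streaming": true
}
```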

TEI

install guide

npm i -g pm2
model=
pm2 start data/tei.json
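For reference, a pm2 process file for TEI might look like the sketch below. This is a guess at the shape of `data/tei.json`, not its actual contents: `text-embeddings-router` is TEI's CLI binary, and the model id and port are placeholder assumptions (the `model=` variable above is meant to supply the model id):

```json
{
  "apps": [
    {
      "name": "tei",
      "script": "text-embeddings-router",
      "args": "--model-id BAAI/bge-base-en-v1.5 --port 8080"
    }
  ]
}
```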
