# **Advanced RAG Technique**

This notebook implements retrieval-augmented generation (RAG) with hybrid search, combining a sparse and a dense retriever, and applies contextual compression to the retrieved documents.
The LLM is GPT-3.5 Turbo. BM25 serves as the sparse retriever, and OpenAI embeddings stored in Chroma serve as the dense retriever.

How to use:
First run Embedding-OpenAI-Chroma.ipynb to generate the /Chroma/ folder that contains the vector database, then point the paths in the following sections to that folder.
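
For intuition about the sparse half of the hybrid search, here is a minimal sketch of BM25 scoring in plain Python. It is not the implementation used by `BM25Retriever`; the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the IDF variant, and the toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Return one BM25 score per document in `corpus` for the given query."""
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalised.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, not taken from papers.json.
corpus = [
    "protein stability prediction with support vector machines",
    "recent advances in digestive surgery",
    "deep learning for protein structure",
]
scores = bm25_scores("protein stability", corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, scores)
```

Because BM25 only rewards exact term matches, a document that paraphrases the query scores zero, which is the gap the dense retriever is meant to cover.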

In [49]:
!pip -q install langchain langchain-community langchain-openai langchainhub openai chromadb rank_bm25 sentence_transformers evaluate rouge_score bert_score

Collecting bert_score
  Downloading bert_score-0.3.13-py3-none-any.whl (61 kB)
     ---------------------------------------- 61.1/61.1 kB 3.2 MB/s eta 0:00:00
Installing collected packages: bert_score
Successfully installed bert_score-0.3.13


In [1]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

## **OpenAI Authentication**
We use OpenAI's GPT-3.5 Turbo. Make sure your OpenAI account has credit, and create a personal secret key at https://platform.openai.com/api-keys.
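
A minimal sketch of one way to handle the key, assuming you want to prompt only when `OPENAI_API_KEY` is not already set and never write the secret into the notebook. The helper `ensure_api_key` and its `env` and `prompt` parameters are hypothetical names introduced here so the logic can be tried without a real key.

```python
import getpass
import os


def ensure_api_key(env=os.environ, prompt=getpass.getpass, name="OPENAI_API_KEY"):
    """Ask for the key only if it is missing, then return it."""
    if not env.get(name):
        env[name] = prompt(f"{name}: ")
    return env[name]


# Demonstration with a stand-in environment and a fake prompt,
# so nothing is read from the keyboard here.
demo_env = {}
first = ensure_api_key(env=demo_env, prompt=lambda _: "sk-demo")


def fail_if_called(_):
    raise AssertionError("should not prompt when the key is already set")


second = ensure_api_key(env=demo_env, prompt=fail_if_called)
```

In the notebook itself, calling `ensure_api_key()` with the defaults would prompt through `getpass` once and reuse the stored value afterwards.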

In [2]:
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

········


## **Load Chroma and GPT3.5 Turbo LLM**
We first load the Chroma vector database.

In [3]:
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

In [4]:
import os

# Check that the Chroma vector store created by the embedding notebook exists.
persist_directory = "./Chroma/chroma_openai"
if not os.path.exists(persist_directory):
    print("No Chroma vector store found. Please run LangChainRAG/Embedding-OpenAI-Chroma.ipynb first.")
else:
    print(f"Directory '{persist_directory}' exists, perfect!")

Directory './Chroma/chroma_openai' exists, perfect!


In [32]:
import json
from langchain.schema import Document
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from tqdm.auto import tqdm


class HybridSearch:
    def __init__(self, data_path):
        self.data_path = data_path
        # OpenAIEmbeddings reads OPENAI_API_KEY from the environment, which is set
        # in the authentication cell. Never hard-code the secret key here.
        self.embedding = OpenAIEmbeddings()
        self.ensemble_retriever = None

    def load_data(self):
        with open(self.data_path, 'r') as file:
            data = json.load(file)
        return data

    def initialize_bm25_retriever(self, docs):
        if not all(isinstance(doc, Document) for doc in docs):
            raise ValueError("All items in docs must be Document instances.")
        abstracts = [doc.page_content for doc in docs]
        bm25_retriever = BM25Retriever.from_texts(abstracts, metadatas=[doc.metadata for doc in docs])
        bm25_retriever.k = 3
        return bm25_retriever

    def transform_data_to_documents(self, data):
        docs = []
        for doc in data:
            title = doc.get('title', {}).get('full_text', '')
            abstract = doc.get('abstract', {}).get('full_text', '')
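            # 'keywords' is expected to be a list of lists; use the first inner list if present.
            raw_keywords = doc.get('keywords') or []
            keywords = raw_keywords[0] if raw_keywords and isinstance(raw_keywords[0], list) else []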
            document = Document(page_content=abstract, metadata={'title': title, 'keywords': keywords})
            docs.append(document)
        return docs

    def process_documents_with_chroma(self, docs):
        # Validate the input before opening the database. The documents are not
        # re-embedded here; the retriever reads from the persisted Chroma store.
        if not all(isinstance(doc, Document) for doc in docs):
            raise ValueError("All items in docs must be Document instances.")
        persist_directory = './Chroma/chroma_openai'
        db3 = Chroma(persist_directory=persist_directory, embedding_function=self.embedding)
        chroma_retriever = db3.as_retriever(search_kwargs={'k': 3})
        return chroma_retriever

    def create_ensemble_retriever(self, bm25_retriever, chroma_retriever):
        # Combine both retrievers: BM25 is weighted 0.7, the dense Chroma retriever 0.3.
        ensemble_retriever = EnsembleRetriever(retrievers=[bm25_retriever, chroma_retriever], weights=[0.7, 0.3])
        self.ensemble_retriever = ensemble_retriever

    def get_relevant_documents(self, query):
        results = self.ensemble_retriever.get_relevant_documents(query)
        print(results)
        formatted_results = []
        for document in results:
            doc_info = {
                'title': document.metadata.get('title', document.metadata.get('source/title','No Title')),
                'keywords': document.metadata.get('keywords', []),
                'abstract': document.page_content
            }
            formatted_results.append(doc_info)
        return formatted_results

hs = HybridSearch('../papers.json')
data = hs.load_data()
docs = hs.transform_data_to_documents(data)
bm25_retriever = hs.initialize_bm25_retriever(docs)

# process_documents_with_chroma returns a retriever, not the vector store itself.
chroma_retriever = hs.process_documents_with_chroma(docs)
hs.create_ensemble_retriever(bm25_retriever, chroma_retriever)
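
To illustrate how the two ranked lists are merged, here is a minimal sketch of weighted Reciprocal Rank Fusion. As far as we know this is the scheme `EnsembleRetriever` applies, but the code below is a simplified stand-in rather than LangChain's implementation. The constant `c=60` and the document ids are assumptions; the weights `[0.7, 0.3]` are the ones used above.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse several ranked lists of document ids into a single ranking."""
    if len(rankings) != len(weights):
        raise ValueError("Need exactly one weight per ranking.")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more the higher it is ranked in a list,
            # scaled by the weight of the retriever that produced the list.
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical rankings from the sparse and the dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]
fused = weighted_rrf([bm25_ranking, dense_ranking], weights=[0.7, 0.3])
print(fused)
```

Only ranks enter the formula, not raw scores, so BM25 scores and embedding similarities never need to be put on a common scale.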

In [33]:
from langchain import hub
from langchain_openai import ChatOpenAI

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
# Here we prepare the hybrid-search retriever, the prompt, and the LLM.

retriever = hs.ensemble_retriever
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo",temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

## **Generate answers using contextual compression**
Here we use LLMChainExtractor to keep only the information from each retrieved document that is relevant to the query. We wrap the hybrid retriever in a ContextualCompressionRetriever and build the RAG chain on top of it.

In [34]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

# Wrap the hybrid retriever so that every retrieved document is compressed by the LLM.
compressor = LLMChainExtractor.from_llm(
    llm=llm
)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)
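
The idea behind contextual compression can be sketched without an LLM. The example below is a deliberately crude stand-in, assuming simple keyword overlap decides which sentences are relevant, whereas `LLMChainExtractor` asks the LLM to make that decision. The function names, the stop-word list, and the sample texts are hypothetical.

```python
import re

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"what", "is", "the", "a", "an", "how", "does", "of"}


def extract_relevant_sentences(text, query):
    """Keep only the sentences that share a non-stop-word with the query."""
    query_terms = set(re.findall(r"\w+", query.lower())) - STOP_WORDS
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = [
        sentence
        for sentence in sentences
        if query_terms & set(re.findall(r"\w+", sentence.lower()))
    ]
    return " ".join(kept)


def compress(texts, query):
    """Compress each text and drop the ones with nothing relevant left."""
    compressed = []
    for text in texts:
        extract = extract_relevant_sentences(text, query)
        if extract:
            compressed.append(extract)
    return compressed


texts = [
    "EASE-MM is a sequence-based prediction method. It combines five SVM models.",
    "Surgical practice is changing quickly. New technologies are promising.",
]
compressed = compress(texts, "What is EASE-MM?")
print(compressed)
```

Irrelevant documents disappear entirely and relevant ones shrink, so the prompt carries less noise into the generation step.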

In [35]:
rag_chain_compressor = (
    {"context": compression_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)


In [38]:
query = 'What is ease-mm ?'
# Alternative query: 'How does the new prediction method, EASE-MM, select the final prediction model?'

compressed_docs = compression_retriever.get_relevant_documents(query)
print(compressed_docs)  # compressed documents retrieved for the query
print("RAG compressor:", rag_chain_compressor.invoke(query))




_______
[Document(page_content='surgical practice in gastro-enterology is concerned by deep technological advances. in the past century, the technological advances were conducted by clinical challenges and strategies. the 21th century is clearly led by the inversion of the paradigms. medical practice does not only depend on the access to the technologies, but it seems submitted to her. should the physician follow the engineer ? does the clinical data collection, depend on the computer ? who decides ? the doctor, the patient or the artificial intelligence ? materiel et methods : the present essay that definitely does not answer all these questions, is achieved thanks the practical experience of our colleagues. we also collected the recent literature devoted to new and promising technologies. the pubmed review is completed by several think tanks reports coming from the industry.', metadata={'title': '[recent advances in digestive surgery].', 'keywords': ['Artificial intelligence', 'Augme



[Document(page_content='Ease-mm is a new prediction method called evolutionary, amino acid, and structural encodings with multiple models (ease-mm), which comprises five specialised support vector machine (svm) models and makes the final prediction from a consensus of two models selected based on the predicted secondary structure and accessible surface area of the mutated residue. Ease-mm yielded a pearson correlation coefficient of 0.53-0.59 in 10-fold cross-validation and independent testing and was able to outperform other sequence-based methods. Ease-mm achieved a comparable or better performance when compared to structure-based energy functions. The application to a large dataset of human germline non-synonymous snvs showed that the disease-causing variants tend to be associated with larger magnitudes of δδgu predicted with ease-mm.', metadata={'title': 'ease-mm: sequence-based prediction of mutation-induced stability changes with feature-based multiple models.', 'keywords': ['ami



RAG compressor: Ease-mm is a new prediction method developed for accurate prediction of protein stability changes induced by single amino acid substitutions. It utilizes automated in-line AI-measured mapse and GL-shortening to deliver immediate and highly reproducible results.
