In [1]:
from dotenv import load_dotenv
import os
load_dotenv()
api_key=os.getenv("GEMINI_API_KEY")

In [2]:
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

generation_config = {
    "temperature": 0.2,
    "top_p": 0.9,
    "max_output_tokens": 4096,
    "response_mime_type": "application/json",
}
# Pass the key loaded in the first cell explicitly; otherwise api_key is never used
llm = ChatGoogleGenerativeAI(model="gemini-pro", google_api_key=api_key, generation_config=generation_config)
embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=api_key)



In [3]:
from langchain_community.document_loaders import WebBaseLoader  #load data

load=WebBaseLoader("https://www.falkordb.com/blog/advanced-rag/")
data=load.load()

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the page into chunks of up to 500 characters, with 50 characters of overlap between neighbors
splitter=RecursiveCharacterTextSplitter(chunk_size=500,chunk_overlap=50)
docs=splitter.split_documents(data)
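
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch using a plain sliding window over characters. It is only an illustration: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and sentences, so its chunks will not match these exactly. `sliding_chunks` is a hypothetical helper, not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Hypothetical helper: fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

Each chunk repeats the last 3 characters of the previous one, which is how overlap keeps a sentence that straddles a boundary retrievable from either side.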

In [5]:
from langchain_community.vectorstores import FAISS  # vector store
vector_stores=FAISS.from_documents(documents=docs,embedding=embedding)
# Similarity search that returns at most 3 chunks per query
retriever=vector_stores.as_retriever(search_type="similarity",search_kwargs={"k":3})
vector_stores.save_local("rag_index")  # persist the index to disk
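
Conceptually, a similarity retriever with `k=3` ranks the stored chunk vectors by closeness to the query vector and keeps the three best. The sketch below shows that idea with toy 3-dimensional vectors and cosine similarity. The vectors, the `cosine` and `top_k` helpers, and the choice of cosine are assumptions for illustration; the metric the FAISS index actually uses may differ.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Hypothetical helper: return the k chunk texts most similar to query_vec."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "index": (chunk text, made-up embedding)
toy_index = [
    ("chunk about retrieval", [1.0, 0.0, 0.0]),
    ("chunk about generation", [0.0, 1.0, 0.0]),
    ("chunk about both", [0.7, 0.7, 0.0]),
    ("unrelated chunk", [0.0, 0.0, 1.0]),
]
top = top_k([0.9, 0.1, 0.0], toy_index, k=3)
print(top)
```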


In [None]:
from langchain_core.prompts import ChatPromptTemplate  # HyDE logic
template=""" 
Given the question: {question}, write a detailed and informative passage that provides an answer, explanation, or context about the topic. Be specific and concise, 
focusing on relevant facts, examples, and insights. The response should resemble an excerpt from an article, book, or scholarly discussion.
"""
prompt_hyde=ChatPromptTemplate.from_template(template)
user_question="What is Self-Query Retrieval in rag ?"
# format() fills the template's {question} placeholder with user_question
query=prompt_hyde.format(question=user_question)
query

'Human:  \nGiven the question: What is Self-Query Retrieval in rag ?, write a detailed and informative passage that provides an answer, explanation, or context about the topic. Be specific and concise, \nfocusing on relevant facts, examples, and insights. The response should resemble an excerpt from an article, book, or scholarly discussion.\n'

In [7]:
hypothetical_answer=llm.invoke(query).content 
hypothetical_answer

'**Self-Query Retrieval in RAG**\n\nSelf-Query Retrieval (SQR) in Retrieval-Augmented Generation (RAG) is a technique that enables a RAG model to retrieve relevant information from a knowledge base to augment its text generation process. Specifically, SQR involves generating self-queries based on the input text and then using these queries to retrieve relevant documents from the knowledge base.\n\nIn RAG, the self-query is generated by a self-query generator module, which takes the input text as input. The self-query generator is typically a neural network model, trained on a dataset of input texts and corresponding relevant documents. The self-query is designed to capture the main information need of the input text.\n\nOnce the self-query is generated, it is used to retrieve relevant documents from the knowledge base. The retrieval module in RAG is typically based on a vector-based search algorithm, such as BM25 or TF-IDF. The retrieval module ranks the documents in the knowledge base

In [None]:
# Retrieve the chunks most similar to the LLM's hypothetical answer, rather than to the original question
relevant_docs=retriever.invoke(hypothetical_answer)
relevant_docs


[Document(metadata={'source': 'https://www.falkordb.com/blog/advanced-rag/', 'title': 'Advanced RAG Techniques: What They Are & How to Use Them', 'description': 'Master advanced RAG techniques to enhance AI performance, accuracy, and efficiency. Learn methods for optimizing retrieval and generation in complex queries.', 'language': 'en-US'}, page_content='Every RAG application can be broken down into two phases: retrieval and generation. First, RAG retrieves relevant documents or knowledge snippets from external sources, such as knowledge graphs or vector stores, using search and indexing techniques. This retrieved data is then fed into a language model, which generates contextually rich and accurate responses by synthesizing the retrieved information with its pre-trained knowledge.RAG systems have evolved as the requirements have become more'),
 Document(metadata={'source': 'https://www.falkordb.com/blog/advanced-rag/', 'title': 'Advanced RAG Techniques: What They Are & How to Use The
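
The HyDE steps so far (prompt the LLM for a hypothetical passage, then retrieve with that passage instead of the question) can be wrapped in one function. This is a minimal sketch, not a fixed recipe: `hyde_retrieve`, `HYDE_TEMPLATE`, and the two stub functions are hypothetical names, and the stubs stand in for the real LLM and retriever so the flow can be checked offline.

```python
from typing import Callable, Sequence

# Shortened version of the HyDE prompt used in this notebook
HYDE_TEMPLATE = (
    "Given the question: {question}, write a detailed and informative passage "
    "that provides an answer, explanation, or context about the topic."
)


def hyde_retrieve(
    question: str,
    generate: Callable[[str], str],
    retrieve: Callable[[str], Sequence[str]],
) -> list[str]:
    """Generate a hypothetical answer, then retrieve with it instead of the question."""
    hypothetical = generate(HYDE_TEMPLATE.format(question=question))
    return list(retrieve(hypothetical))


# Stubs (assumptions for the demo) that record what they were called with
calls = []


def fake_generate(prompt: str) -> str:
    calls.append(("generate", prompt))
    return "hypothetical passage about self-query retrieval"


def fake_retrieve(text: str) -> list[str]:
    calls.append(("retrieve", text))
    return ["chunk A", "chunk B"]


hyde_chunks = hyde_retrieve("What is Self-Query Retrieval in rag ?", fake_generate, fake_retrieve)
print(hyde_chunks)
```

With the objects defined in this notebook, the stubs could be replaced by `lambda p: llm.invoke(p).content` and `lambda t: [d.page_content for d in retriever.invoke(t)]`.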

In [9]:
# Join the retrieved chunks into one context string, one chunk per line
relevant_texts="\n".join(doc.page_content for doc in relevant_docs)
relevant_texts

'Every RAG application can be broken down into two phases: retrieval and generation. First, RAG retrieves relevant documents or knowledge snippets from external sources, such as knowledge graphs or vector stores, using search and indexing techniques. This retrieved data is then fed into a language model, which generates contextually rich and accurate responses by synthesizing the retrieved information with its pre-trained knowledge.RAG systems have evolved as the requirements have become more\nSelf-RAG is an advanced technique that empowers your system to refine its own retrieval and generation process by iterating on its outputs. In Self-RAG, the model doesn’t just rely on the initial retrieval but actively re-evaluates and adjusts its approach by generating follow-up queries and responses. This iterative process allows the model to correct its own mistakes, fill in gaps, and enhance the quality of the final output.You can think of Self-RAG as your model’s ability to self-correct\nRe

In [10]:
# General prompt template for RAG
template="""  
Answer the following question in detail, based on the context:
context:{context}
question:{question}
"""
prompt=ChatPromptTemplate.from_template(template)


In [None]:
query=prompt.format(context=relevant_texts,question=user_question)  # answer from the chunks retrieved via HyDE
response=llm.invoke(query)
hyde_answer=response.content


In [None]:
print(hyde_answer)

Self-Query Retrieval (Self-RAG) is an advanced technique used in Retrieval-Augmented Generation (RAG) systems. In Self-RAG, the RAG system iteratively refines its retrieval and generation process by generating follow-up queries and responses. This allows the model to self-correct, fill in gaps, and enhance the quality of the final output.

Unlike basic RAG systems, which rely solely on the initial retrieval of documents or knowledge snippets, Self-RAG actively re-evaluates and adjusts its approach. This iterative process enables the model to:

1. **Correct its own mistakes:** If the initial retrieval or generation step produces inaccurate or incomplete results, the model can generate follow-up queries to gather additional information and refine its response.

2. **Fill in gaps:** Self-RAG allows the model to identify and address gaps in its knowledge or the retrieved documents. By generating follow-up queries, the model can gather missing information and incorporate it into its respons

In [None]:
# Baseline for comparison: retrieve with the raw user question (no HyDE)
relevant_docs2=retriever.invoke(user_question)
relevant_texts2="\n".join(doc.page_content for doc in relevant_docs2)
query2=prompt.format(context=relevant_texts2,question=user_question)

In [14]:
response=llm.invoke(query2)
answer=response.content


In [15]:
print(answer)

Self-Query Retrieval in RAG (Self-RAG) is an advanced technique that allows the model to refine its own retrieval and generation process by iterating on its outputs. In Self-RAG, the model doesn't just rely on the initial retrieval but actively re-evaluates and adjusts its approach by generating follow-up queries and responses. This iterative process allows the model to correct its own mistakes, fill in gaps, and enhance the quality of the final output.
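
The HyDE answer and the baseline answer read very similarly, so it can help to check whether the two retrieval paths returned different chunks at all. One rough way, sketched here with made-up chunk names, is the Jaccard overlap of the two chunk sets. `chunk_overlap_ratio` is a hypothetical helper, and the demo lists are placeholders rather than results from this notebook.

```python
def chunk_overlap_ratio(chunks_a: list[str], chunks_b: list[str]) -> float:
    """Jaccard overlap of two chunk lists: 1.0 means identical sets, 0.0 means disjoint."""
    set_a, set_b = set(chunks_a), set(chunks_b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty results are trivially identical
    return len(set_a & set_b) / len(union)


# Placeholder chunk names for the demo, not actual retrieval results
hyde_demo = ["chunk 1", "chunk 2", "chunk 3"]
baseline_demo = ["chunk 2", "chunk 3", "chunk 4"]
ratio = chunk_overlap_ratio(hyde_demo, baseline_demo)
print(ratio)
```

With the notebook's variables, the inputs would be `[d.page_content for d in relevant_docs]` and `[d.page_content for d in relevant_docs2]`. A ratio near 1.0 would suggest HyDE did not change what was retrieved for this question.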
