In [2]:
import os
import warnings

from dotenv import load_dotenv

warnings.filterwarnings("ignore")
load_dotenv()  # read variables from a local .env file
api_key = os.getenv("GEMINI_API_KEY")

In [3]:
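from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

# Sampling settings are passed as constructor arguments. The key is handed over
# explicitly because the library otherwise looks for GOOGLE_API_KEY, not GEMINI_API_KEY.
# Responses are left as free text, which is what the cells below print.
llm = ChatGoogleGenerativeAI(
    model="gemini-pro",
    temperature=0.2,
    top_p=0.9,
    max_output_tokens=4096,
    google_api_key=api_key,
)
embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=api_key)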

In [4]:
from langchain_community.document_loaders import WebBaseLoader

# Load the article that serves as the knowledge source
loader = WebBaseLoader("https://www.falkordb.com/blog/advanced-rag/")
data = loader.load()

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [5]:
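from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the page into chunks of at most 500 characters, with 50 characters shared between neighbours
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(data)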
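
For intuition about what `chunk_size` and `chunk_overlap` control, here is a minimal pure-Python sketch of a fixed-size sliding window. It is an illustration only: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line boundaries, so its chunks will differ. The name `sliding_window_chunks` and the sample string are hypothetical.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3)
print(toy_chunks)
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.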

In [6]:
from langchain_community.vectorstores import FAISS

# Embed every chunk and index the vectors in FAISS
vector_stores = FAISS.from_documents(documents=docs, embedding=embedding)
# Similarity search that returns at most 3 chunks per query
retriever = vector_stores.as_retriever(search_type="similarity", search_kwargs={"k": 3})
vector_stores.save_local("rag_index")  # persist the index to disk
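
What a `k=3` similarity search does can be sketched without any library: compare the query vector with every stored vector and keep the three closest. The sketch below ranks by Euclidean distance, which is assumed here to match the default of the LangChain FAISS wrapper. The function `top_k` and the 2-D vectors are made up for illustration; real embeddings have hundreds of dimensions.

```python
import math


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    return [idx for _, idx in sorted(distances)[:k]]


toy_vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2), (4.0, 0.0)]
nearest = top_k((1.0, 1.0), toy_vectors, k=3)
print(nearest)
```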


In [7]:
from langchain_core.prompts import ChatPromptTemplate  # prompt for the HyDE (hypothetical document) step
template=""" 
Given the question: {question}, write a detailed and informative passage that provides an answer, explanation, or context about the topic. Be specific and concise, 
focusing on relevant facts, examples, and insights. The response should resemble an excerpt from an article, book, or scholarly discussion.
"""
prompt_hyde=ChatPromptTemplate.from_template(template)
user_question="What is Self-Query Retrieval in rag ?"
# format() substitutes user_question for the {question} placeholder and returns the prompt as a plain string
query=prompt_hyde.format(question=user_question)
query

'Human:  \nGiven the question: What is Self-Query Retrieval in rag ?, write a detailed and informative passage that provides an answer, explanation, or context about the topic. Be specific and concise, \nfocusing on relevant facts, examples, and insights. The response should resemble an excerpt from an article, book, or scholarly discussion.\n'

In [8]:
hypothetical_answer=llm.invoke(query).content 
hypothetical_answer

'**Self-Query Retrieval in RAG**\n\nSelf-Query Retrieval (SQR) is a technique used in Retrieval-Augmented Generation (RAG) models, where a query is generated from the input context and used to retrieve relevant passages from a knowledge base. This enhances the model\'s ability to generate informative and coherent responses.\n\nIn RAG models, the input context is first encoded into a vector representation. A query generator module then creates a query based on the encoded context. This query is used to search for relevant passages in a pre-built knowledge base. The retrieved passages are then used to augment the input context and provide additional information for response generation.\n\nSQR offers several advantages:\n\n* **Relevance:** By generating a query from the input context, SQR ensures that retrieved passages are highly relevant to the user\'s question.\n* **Efficiency:** SQR reduces the search space by focusing on passages that are likely to contain useful information, making 

In [9]:
# Retrieve the chunks closest to the LLM's hypothetical answer rather than to the raw question
relevant_docs=retriever.invoke(hypothetical_answer)
relevant_docs


[Document(metadata={'source': 'https://www.falkordb.com/blog/advanced-rag/', 'title': 'Advanced RAG Techniques: What They Are & How to Use Them', 'description': 'Master advanced RAG techniques to enhance AI performance, accuracy, and efficiency. Learn methods for optimizing retrieval and generation in complex queries.', 'language': 'en-US'}, page_content='Every RAG application can be broken down into two phases: retrieval and generation. First, RAG retrieves relevant documents or knowledge snippets from external sources, such as knowledge graphs or vector stores, using search and indexing techniques. This retrieved data is then fed into a language model, which generates contextually rich and accurate responses by synthesizing the retrieved information with its pre-trained knowledge.RAG systems have evolved as the requirements have become more'),
 Document(metadata={'source': 'https://www.falkordb.com/blog/advanced-rag/', 'title': 'Advanced RAG Techniques: What They Are & How to Use The
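
Why search with a hypothetical answer instead of the question? A short question often shares little vocabulary with the passage that answers it, while a generated passage tends to resemble the target text. The toy sketch below illustrates that idea with a bag-of-words "embedding" and cosine similarity. Everything in it is an assumption made for illustration: the helper names, the two chunks, and the hypothetical passage are invented, and a real pipeline would use a learned embedding model as in the cells above.

```python
import math
import re
from collections import Counter


def bow(text):
    """Toy embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query_text, chunks, k=1):
    """Return the k chunks most similar to query_text."""
    q = bow(query_text)
    return sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)[:k]


toy_corpus = [
    "Reranking reorders retrieved passages with a second scoring model.",
    "The retriever turns the request into a structured filter over document "
    "metadata before the vector search runs.",
]
toy_question = "What is self-query retrieval?"
toy_hypothetical = (
    "Self-query retrieval lets the model turn a request into a structured "
    "filter over metadata, then run the vector search."
)

# The question shares no words with the relevant chunk, so it gives no signal.
print(cosine(bow(toy_question), bow(toy_corpus[1])))
# The hypothetical passage overlaps heavily with it and retrieves it first.
print(retrieve(toy_hypothetical, toy_corpus)[0])
```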

In [10]:
relevant_texts=["\n".join(doc.page_content for doc in relevant_docs)]
relevant_texts

['Every RAG application can be broken down into two phases: retrieval and generation. First, RAG retrieves relevant documents or knowledge snippets from external sources, such as knowledge graphs or vector stores, using search and indexing techniques. This retrieved data is then fed into a language model, which generates contextually rich and accurate responses by synthesizing the retrieved information with its pre-trained knowledge.RAG systems have evolved as the requirements have become more\nSelf-RAG is an advanced technique that empowers your system to refine its own retrieval and generation process by iterating on its outputs. In Self-RAG, the model doesn’t just rely on the initial retrieval but actively re-evaluates and adjusts its approach by generating follow-up queries and responses. This iterative process allows the model to correct its own mistakes, fill in gaps, and enhance the quality of the final output.You can think of Self-RAG as your model’s ability to self-correct\nRe

In [11]:
# General-purpose RAG prompt: answer the question from the retrieved context
template="""  
Answer the following question in detail, based on the context:
context:{context}
question:{question}
"""
prompt=ChatPromptTemplate.from_template(template)


In [12]:
# Answer generated from the chunks that were retrieved via the hypothetical answer (HyDE)
query=prompt.format(context=relevant_texts,question=user_question)
response=llm.invoke(query)
hyde_answer=response.content


In [13]:
print(hyde_answer)

Self-Query Retrieval is an advanced technique that empowers your RAG system to refine its own retrieval and generation process by iterating on its outputs. In Self-Query Retrieval, the model doesn’t just rely on the initial retrieval but actively re-evaluates and adjusts its approach by generating follow-up queries and responses. This iterative process allows the model to correct its own mistakes, fill in gaps, and enhance the quality of the final output. You can think of Self-Query Retrieval as your model’s ability to self-correct.


In [14]:
# Baseline for comparison: retrieve directly with the user question (no HyDE)
relevant_docs2=retriever.invoke(user_question)
relevant_texts2=["\n".join(doc.page_content for doc in relevant_docs2)]
query2=prompt.format(context=relevant_texts2,question=user_question)
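
Before comparing the two answers, it can help to check whether the two retrieval strategies returned different chunks at all. A minimal sketch, assuming chunk texts are compared as exact strings; the helper name `chunk_overlap_ratio` is hypothetical.

```python
def chunk_overlap_ratio(chunks_a, chunks_b):
    """Jaccard overlap of two chunk lists: 1.0 means identical sets, 0.0 means disjoint."""
    set_a, set_b = set(chunks_a), set(chunks_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Stand-in chunk texts; two of the three are shared between the lists
print(chunk_overlap_ratio(["a", "b", "c"], ["b", "c", "d"]))
```

In this notebook it could be called as `chunk_overlap_ratio([d.page_content for d in relevant_docs], [d.page_content for d in relevant_docs2])`. A ratio of 1.0 would mean HyDE and the direct query retrieved the same chunks.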

In [15]:
response=llm.invoke(query2)
answer=response.content


In [16]:
print(answer)

**Self-Query Retrieval (Self-RAG)** is an advanced technique used in Retrieval-Augmented Generation (RAG) models that enables the model to refine its own retrieval and generation process by iterating on its outputs.

In Self-RAG, the model does not solely rely on the initial retrieval of documents or knowledge snippets but actively re-evaluates and adjusts its approach by generating follow-up queries and responses. This iterative process allows the model to:

* **Correct its own mistakes:** By evaluating the accuracy and relevance of its initial response, the model can identify and correct any errors or inconsistencies.
* **Fill in gaps:** If the initial retrieval does not provide enough information to generate a comprehensive response, the model can generate follow-up queries to gather additional relevant data.
* **Enhance the quality of the final output:** By iteratively refining its retrieval and generation process, the model can produce more nuanced, accurate, and contextually rich