<a href="https://colab.research.google.com/github/singhbishtabhishek/RAG-Implementation/blob/main/RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
#installing the needed modules
!pip install langchain-google-genai google-ai-generativelanguage google-generativeai
!pip install langchain_chroma
!pip install langchain_community
!pip install langchain_text_splitters

Collecting langchain-google-genai
  Downloading langchain_google_genai-2.1.5-py3-none-any.whl.metadata (5.2 kB)
Collecting filetype<2.0.0,>=1.2.0 (from langchain-google-genai)
  Downloading filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB)
Collecting google-ai-generativelanguage
  Downloading google_ai_generativelanguage-0.6.18-py3-none-any.whl.metadata (9.8 kB)
INFO: pip is looking at multiple versions of google-generativeai to determine which version is compatible with other requirements. This could take a while.
Collecting google-generativeai
  Downloading google_generativeai-0.8.4-py3-none-any.whl.metadata (4.2 kB)
  Downloading google_generativeai-0.8.3-py3-none-any.whl.metadata (3.9 kB)
  Downloading google_generativeai-0.8.2-py3-none-any.whl.metadata (3.9 kB)
  Downloading google_generativeai-0.8.1-py3-none-any.whl.metadata (3.9 kB)
  Downloading google_generativeai-0.8.0-py3-none-any.whl.metadata (3.9 kB)
  Downloading google_generativeai-0.7.2-py3-none-any.whl.metadata (4.

In [2]:

import os
import bs4
from getpass import getpass

In [3]:

os.environ["GOOGLE_API_KEY"] = getpass("GoogleAPIkey") #my google API key
os.environ["LANGCHAIN_API_KEY"] = getpass("LangchainAPIkey")  #my langchain API key

os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "RAG"
os.environ["USER_AGENT"]="langchain-rag-app/0.1"


GoogleAPIkey··········
LangchainAPIkey··········


In [4]:
#using Google Generative AI for chat and embeddings

import warnings
warnings.filterwarnings("ignore", category=UserWarning)

from langchain_google_genai import GoogleGenerativeAIEmbeddings
gemini_embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001", # this model is used to convert texts into vectors for search.
)

from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model = "gemma-3-1b-it", convert_system_message_to_human=True)
print(model.invoke("hello").content)

Hello there! How's your day going so far? 😊 

Is there anything you'd like to chat about or need help with?


In [5]:
#importing the libraries to be used

import bs4
from langchain import hub
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate #used to build the chat prompt
from langchain_text_splitters import RecursiveCharacterTextSplitter


In [6]:
#Using the Web Base Loader to load the desired webpage

loader=WebBaseLoader(
    web_path=("https://lilianweng.github.io/posts/2019-06-23-meta-rl/",),
    bs_kwargs=dict(parse_only=bs4.SoupStrainer(class_=("post-content","post-title","post-header"))),
)

doc=loader.load()
doc

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2019-06-23-meta-rl/'}, page_content='\n\n      Meta Reinforcement Learning\n    \nDate: June 23, 2019  |  Estimated Reading Time: 22 min  |  Author: Lilian Weng\n\n\n\nIn my earlier post on meta-learning, the problem is mainly defined in the context of few-shot classification. Here I would like to explore more into cases when we try to “meta-learn” Reinforcement Learning (RL) tasks by developing an agent that can solve unseen tasks fast and efficiently.\nTo recap, a good meta-learning model is expected to generalize to new tasks or new environments that have never been encountered during training. The adaptation process, essentially a mini learning session, happens at test with limited exposure to the new configurations. Even without any explicit fine-tuning (no gradient backpropagation on trainable variables), the meta-learning model autonomously adjusts internal hidden states to learn.\nTraining RL algorithms can be no

In [7]:
#split the web-loaded document into chunks for embedding and retrieval

text_splitter=RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
)

splits=text_splitter.split_documents(doc)
splits

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2019-06-23-meta-rl/'}, page_content='Meta Reinforcement Learning\n    \nDate: June 23, 2019  |  Estimated Reading Time: 22 min  |  Author: Lilian Weng'),
 Document(metadata={'source': 'https://lilianweng.github.io/posts/2019-06-23-meta-rl/'}, page_content='In my earlier post on meta-learning, the problem is mainly defined in the context of few-shot classification. Here I would like to explore more into cases when we try to “meta-learn” Reinforcement Learning (RL) tasks by developing an agent that can solve unseen tasks fast and efficiently.\nTo recap, a good meta-learning model is expected to generalize to new tasks or new environments that have never been encountered during training. The adaptation process, essentially a mini learning session, happens at test with limited exposure to the new configurations. Even without any explicit fine-tuning (no gradient backpropagation on trainable variables), the meta-learning mode
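
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal pure-Python sketch of overlapping windows. It is a simplification and not how RecursiveCharacterTextSplitter works internally: the real splitter prefers to break on separators such as paragraphs and newlines, whereas this hypothetical helper (sliding_window_chunks) cuts at fixed character offsets.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Synthetic 2500-character string, used only for illustration.
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)

print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])  # the tail of one chunk is the head of the next
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.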

In [8]:
#creating the Chroma vector store for the split documents and embeddings using Gemini

vectorStore = Chroma.from_documents(
    documents=splits,
    embedding=gemini_embeddings,
)
vectorStore

<langchain_chroma.vectorstores.Chroma at 0x782b2052a6d0>

In [9]:
retriever=vectorStore.as_retriever() #retrieve from vector database
retriever

VectorStoreRetriever(tags=['Chroma', 'GoogleGenerativeAIEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x782b2052a6d0>, search_kwargs={})
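
Conceptually, the retriever embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with made-up 3-dimensional vectors and cosine similarity. Both are assumptions for illustration: real embeddings have far more dimensions, and the distance metric Chroma actually uses depends on how the collection is configured. The helper names (cosine_similarity, top_k) are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (chunk text, invented embedding vector).
toy_index = [
    ("chunk about meta-learning", [0.9, 0.1, 0.0]),
    ("chunk about inductive bias", [0.7, 0.7, 0.1]),
    ("chunk about an unrelated topic", [0.0, 0.1, 0.9]),
]
query_vector = [1.0, 0.2, 0.0]  # stands in for the embedded question

results = top_k(query_vector, toy_index, k=2)
print(results)
```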

In [10]:
#Instructions that guide how the language model should behave during the RAG interaction.
#Adjacent string literals are joined as-is, so each sentence ends with a space.

system_prompt=(
    "You will be answering the questions asked. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know, don't try to make up an answer."
    "\n\n"
    "{context}"
)

In [11]:
#Build the chat prompt: the system message holds the instructions and retrieved context, the human message holds the user's question

chat_prompt=ChatPromptTemplate.from_messages([
    ("system",system_prompt),
    ("human","{input}"),
])
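
The two placeholders in the prompt, {context} and {input}, are filled at query time. The sketch below shows the idea with plain str.format, assuming the retrieved chunks are joined with a blank line before substitution. The shortened template, the chunk strings, and the helper build_messages are hypothetical and stand in for what the LangChain prompt and chain do.

```python
# Shortened stand-in for the notebook's system prompt (assumption, for illustration only).
SYSTEM_TEMPLATE = (
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)
HUMAN_TEMPLATE = "{input}"


def build_messages(chunks: list[str], question: str) -> list[tuple[str, str]]:
    """Fill {context} with the joined chunks and {input} with the user's question."""
    context = "\n\n".join(chunks)  # "stuff" every retrieved chunk into one block of text
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", HUMAN_TEMPLATE.format(input=question)),
    ]


messages = build_messages(
    ["placeholder chunk A", "placeholder chunk B"],
    "What are key components in Meta RL",
)
for role, text in messages:
    print(f"[{role}]\n{text}\n")
```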

In [12]:
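#chain that inserts the retrieved documents into the prompt's {context} and passes the prompt to the model
question_answering_chain=create_stuff_documents_chain(model,chat_prompt)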

In [13]:
#full RAG chain: retrieve documents for the input, then run the question-answering chain on them
rag_chain=create_retrieval_chain(retriever,question_answering_chain)

In [14]:
rag_chain.invoke({"input":"What are key components in Meta RL"}) #asking a question related to the provided link

{'input': 'What are key components in Meta RL',
 'context': [Document(id='93cda865-969a-4a06-940b-9442bf0829ae', metadata={'source': 'https://lilianweng.github.io/posts/2019-06-23-meta-rl/'}, page_content='According to Botvinick et al. (2019), one source of slowness in RL training is weak inductive bias ( = “a set of assumptions that the learner uses to predict outputs given inputs that it has not encountered”). As a general ML rule, a learning algorithm with weak inductive bias will be able to master a wider range of variance, but usually, will be less sample-efficient. Therefore, to narrow down the hypotheses with stronger inductive biases help improve the learning speed.\nIn meta-RL, we impose certain types of inductive biases from the task distribution and store them in memory. Which inductive bias to adopt at test time depends on the algorithm. Together, these three key components depict a compelling view of meta-RL: Adjusting the weights of a recurrent network is slow but it allo
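
The returned dictionary is long because it carries every retrieved document. A small helper can pull out just the generated text and the sources. The sketch below assumes the result also contains an "answer" key, which is what create_retrieval_chain adds alongside "input" and "context". The FakeDocument class and all values in the sample dictionary are placeholders, not real output from the run above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize(result: dict) -> tuple[str, list[str]]:
    """Return the answer text and the distinct source URLs of the retrieved chunks."""
    sources = sorted({doc.metadata.get("source", "unknown") for doc in result["context"]})
    return result["answer"], sources


# Placeholder result shaped like the chain output; the contents are invented.
url = "https://lilianweng.github.io/posts/2019-06-23-meta-rl/"
sample_result = {
    "input": "What are key components in Meta RL",
    "context": [
        FakeDocument("placeholder chunk 1", {"source": url}),
        FakeDocument("placeholder chunk 2", {"source": url}),
    ],
    "answer": "placeholder answer text",
}

answer, sources = summarize(sample_result)
print(answer)
print(sources)
```

With the real chain, the same helper would be called as summarize(rag_chain.invoke({"input": "..."})).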