<a href="https://colab.research.google.com/github/geeta-gwalior/sql-query-generator-geminiai/blob/main/Gemini_with_Rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG application built on Gemini


In [10]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("edufund.pdf")
data = loader.load()


In [11]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

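# Split the pages into chunks of up to 1000 characters. The overlap should be
# well below the chunk size; 200 is an illustrative value, so the chunk count may vary.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)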
docs = text_splitter.split_documents(data)


print("Total number of documents: ",len(docs))
for chunk in docs:
    print(chunk)
print(docs[0].page_content)

Total number of documents:  72
page_content='WHAT ARE MUTUAL FUNDS?
A mutual fund is a pool of money managed by a professional Fund Manager.
It is a trust that collects money from a number of investors who share a common investment objective and invests the same in equities, bonds, money market instruments and/or other securities. And the income / gains generated from this collective investment is distributed proportionately amongst the investors after deducting applicable expenses and levies, by calculating a scheme’s “Net Asset Value” or NAV. Simply put, the money pooled in by a large number of investors is what makes up a Mutual Fund.' metadata={'source': 'edufund.pdf', 'page': 0}
page_content='Here’s a simple way to understand the concept of a Mutual Fund Unit. Let’s say that there is a box of 12 chocolates costing ₹40. Four friends decide to buy the same, but they have only ₹10 each and the shopkeeper only sells by the box. So the friends then decide to pool in ₹10 each and buy th
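To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping character windows. It is a simplification and an assumption for illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, which this sketch ignores. The helper name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size, chunk_overlap):
    """Split text into character windows; neighbours share chunk_overlap characters.

    Hypothetical helper for illustration, not LangChain's implementation.
    """
    if not 0 <= chunk_overlap < chunk_size:
        # With overlap == chunk_size the window would never advance.
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so a larger overlap produces more chunks and more repeated text in the vector store.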

In [19]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

from google.colab import userdata

# Read the Gemini API key from Colab's secret store (saved under the name 'gemini_api').
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=userdata.get('gemini_api'))
# Sanity check: embed a short string and inspect the first component.
vector = embeddings.embed_query("hello, world!")
len(vector)
vector[0]


0.05168594419956207

In [20]:
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [21]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})

retrieved_docs = retriever.invoke("What is Mutual funds")
len(retrieved_docs)
print(retrieved_docs[0].page_content)

WHAT ARE MUTUAL FUNDS?
A mutual fund is a pool of money managed by a professional Fund Manager.
It is a trust that collects money from a number of investors who share a common investment objective and invests the same in equities, bonds, money market instruments and/or other securities. And the income / gains generated from this collective investment is distributed proportionately amongst the investors after deducting applicable expenses and levies, by calculating a scheme’s “Net Asset Value” or NAV. Simply put, the money pooled in by a large number of investors is what makes up a Mutual Fund.
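Conceptually, a similarity retriever with `k` results ranks the stored chunk embeddings against the query embedding and keeps the best `k`. The sketch below shows that idea on toy two-dimensional vectors. It assumes cosine similarity for illustration; the metric Chroma actually applies depends on how the collection is configured, and the function names here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]


# Toy "embeddings"; real ones have hundreds of dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
best = top_k([1.0, 0.1], toy_docs, k=2)
print(best)
```

A real vector store replaces the exhaustive scan with an approximate nearest-neighbour index, but the ranking idea is the same.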


In [22]:
from langchain_google_genai import ChatGoogleGenerativeAI

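# Pass the key explicitly so the model does not depend on a GOOGLE_API_KEY
# environment variable; `userdata` is imported in the embeddings cell.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.3, google_api_key=userdata.get('gemini_api'))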

In [23]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are a financial expert. Provide clear, concise answers based on the provided context. "
    "If the information is not found in the context, state that the answer is unavailable. "
    "Use a maximum of three sentences."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [24]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

In [None]:
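response = rag_chain.invoke({"input": "What is Mutual funds"})
# The chain returns a dict with "input", "context", and "answer"; print only the answer.
print(response["answer"])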
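The "stuff" strategy used by the question-answer chain can be pictured as: join the retrieved chunks into one string, substitute it for `{context}` in the system prompt, and send that together with the user's question. The sketch below imitates this with plain strings. It is an assumption-laden illustration, not LangChain's code: the helper `build_stuffed_prompt`, the blank-line separator, and the shortened template are all hypothetical.

```python
def build_stuffed_prompt(system_template, passages, question):
    """Fill {context} with all passages and pair the result with the question.

    Hypothetical helper; passages are joined with a blank line (an assumption).
    """
    context = "\n\n".join(passages)
    return [
        ("system", system_template.format(context=context)),
        ("human", question),
    ]


# Shortened stand-in for the notebook's system prompt.
template = "Answer from the context only.\n\n{context}"
passages = [
    "A mutual fund is a pool of money managed by a professional Fund Manager.",
    "NAV stands for Net Asset Value.",
]
messages = build_stuffed_prompt(template, passages, "What is Mutual funds")
for role, content in messages:
    print(role, "->", content)
```

Because every retrieved chunk is placed in a single prompt, the retriever's `k` and the chunk size together determine how much context the model receives per question.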