# RAG application built on Gemini

In [3]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("yolov9_paper.pdf")
data = loader.load()  # returns a list with one Document per PDF page
#data


In [8]:
data[0]

Document(metadata={'source': 'yolov9_paper.pdf', 'page': 0}, page_content='YOLOv9: Learning What You Want to Learn\nUsing Programmable Gradient Information\nChien-Yao Wang1,2, I-Hau Yeh2, and Hong-Yuan Mark Liao1,2,3\n1Institute of Information Science, Academia Sinica, Taiwan\n2National Taipei University of Technology, Taiwan\n3Department of Information and Computer Engineering, Chung Yuan Christian University, Taiwan\nkinyiu@iis.sinica.edu.tw, ihyeh@emc.com.tw, and liao@iis.sinica.edu.tw\nAbstract\nToday’s deep learning methods focus on how to design\nthe most appropriate objective functions so that the pre-\ndiction results of the model can be closest to the ground\ntruth. Meanwhile, an appropriate architecture that can\nfacilitate acquisition of enough information for prediction\nhas to be designed. Existing methods ignore a fact that\nwhen input data undergoes layer-by-layer feature extrac-\ntion and spatial transformation, large amount of informa-\ntion will be lost. This paper wi

In [9]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split each page into chunks of at most 1000 characters (chunk_overlap is left at its default)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(data)


print("Total number of documents: ", len(docs))

Total number of documents:  96


In [12]:
docs[1]

Document(metadata={'source': 'yolov9_paper.pdf', 'page': 0}, page_content='when input data undergoes layer-by-layer feature extrac-\ntion and spatial transformation, large amount of informa-\ntion will be lost. This paper will delve into the important is-\nsues of data loss when data is transmitted through deep net-\nworks, namely information bottleneck and reversible func-\ntions. We proposed the concept of programmable gradi-\nent information (PGI) to cope with the various changes\nrequired by deep networks to achieve multiple objectives.\nPGI can provide complete input information for the tar-\nget task to calculate objective function, so that reliable\ngradient information can be obtained to update network\nweights. In addition, a new lightweight network architec-\nture – Generalized Efficient Layer Aggregation Network\n(GELAN), based on gradient path planning is designed.\nGELAN’s architecture confirms that PGI has gained su-\nperior results on lightweight models. We verified the 
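
Adjacent chunks share some text: `docs[1]` starts partway through the abstract, a few lines before the point where the first chunk ends. That comes from the splitter's chunk overlap, which was left at its default here. The following is a minimal sketch of fixed-size chunking with overlap, not the library's algorithm: `RecursiveCharacterTextSplitter` additionally prefers to break on separators such as paragraph and line breaks. The function name `chunk_text` and the tiny sizes used in the demo are illustrative assumptions.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows; neighbours share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Toy input so the overlap is easy to see (hypothetical sizes).
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, overlap=3)
for chunk in chunks:
    print(chunk)
```

Each printed chunk begins with the last three characters of the previous one, which is the same effect seen between the first two chunks of the paper.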

In [14]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

from dotenv import load_dotenv
load_dotenv()

# Get an API key:
# Head to https://ai.google.dev/gemini-api/docs/api-key to generate a Google AI API key,
# then store it in a .env file as GOOGLE_API_KEY so that load_dotenv() picks it up.

# Embedding models: https://python.langchain.com/v0.1/docs/integrations/text_embedding/

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]

[0.05168594419956207,
 -0.030764883384108543,
 -0.03062233328819275,
 -0.02802734449505806,
 0.01813092641532421]
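
`embed_query` returns a plain list of floats; only the first five values are shown above. Vectors like this are usually compared with a similarity measure such as cosine similarity. A minimal sketch with toy vectors follows; the function name and the vectors are illustrative and are not real Gemini embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 2.0, 3.0], [-1.0, -2.0, -3.0])
print(same_direction, orthogonal, opposite)
```

Values near 1 mean the two texts point in the same direction in embedding space, values near 0 mean they are unrelated, and negative values mean they point in opposite directions.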

In [15]:
# Reuse the embeddings object created in the previous cell instead of building a second one
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [16]:
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

retrieved_docs = retriever.invoke("What is new in yolov9?")
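
Conceptually, a similarity retriever with `k=10` embeds the query, compares it with every stored chunk vector, and returns the 10 closest chunks. The sketch below shows that idea as a brute-force scan over toy 2-D vectors. It is not how Chroma is implemented: a vector store normally relies on an index rather than a full scan, and the Euclidean distance used here is an assumption. `top_k` and the toy store are hypothetical names.

```python
import heapq
import math


def top_k(query_vec, store, k):
    """Return the k (distance, text) pairs closest to query_vec.

    `store` is a list of (text, vector) pairs; smaller distance means closer.
    """
    scored = ((math.dist(query_vec, vec), text) for text, vec in store)
    return heapq.nsmallest(k, scored)


# Toy "vector store" with made-up 2-D embeddings.
store = [
    ("Author list", [0.1, 0.9]),
    ("PGI overview", [0.9, 0.1]),
    ("GELAN design", [0.7, 0.3]),
]
results = top_k([1.0, 0.0], store, k=2)
for distance, text in results:
    print(f"{distance:.3f}  {text}")
```

Raising `k` trades precision for recall: more chunks reach the prompt, including weaker matches such as the title page.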

In [18]:
print(retrieved_docs[5].page_content)

YOLOv9: Learning What You Want to Learn
Using Programmable Gradient Information
Chien-Yao Wang1,2, I-Hau Yeh2, and Hong-Yuan Mark Liao1,2,3
1Institute of Information Science, Academia Sinica, Taiwan
2National Taipei University of Technology, Taiwan
3Department of Information and Computer Engineering, Chung Yuan Christian University, Taiwan
kinyiu@iis.sinica.edu.tw, ihyeh@emc.com.tw, and liao@iis.sinica.edu.tw
Abstract
Today’s deep learning methods focus on how to design
the most appropriate objective functions so that the pre-
diction results of the model can be closest to the ground
truth. Meanwhile, an appropriate architecture that can
facilitate acquisition of enough information for prediction
has to be designed. Existing methods ignore a fact that
when input data undergoes layer-by-layer feature extrac-
tion and spatial transformation, large amount of informa-
tion will be lost. This paper will delve into the important is-


In [19]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.3, max_tokens=500)

In [21]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [22]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
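
A "stuff" chain puts all retrieved chunks into a single prompt: their text is concatenated and substituted for `{context}`, while the user's question fills `{input}`. Below is a rough plain-Python equivalent of that formatting step, with no retriever and no LLM call. `build_prompt`, the shortened system text, and the sample chunks are illustrative assumptions, and the blank-line separator is assumed rather than taken from the library.

```python
SYSTEM_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question."
    "\n\n"
    "{context}"
)


def build_prompt(chunks, question, separator="\n\n"):
    """Join the chunk texts and fill the system template with them."""
    context = separator.join(chunks)
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


# Made-up chunk texts standing in for retrieved page_content values.
messages = build_prompt(
    [
        "PGI keeps gradient information reliable.",
        "GELAN is a lightweight architecture.",
    ],
    "what is new in YOLOv9?",
)
for role, content in messages:
    print(f"[{role}] {content}")
```

Because every chunk lands in one prompt, the prompt grows with `k` and `chunk_size`, so both need to stay within the model's context window.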

In [23]:
response = rag_chain.invoke({"input": "what is new in YOLOv9?"})
print(response["answer"])

YOLOv9 introduces two new components: **Programmable Gradient Information (PGI)** and **Generalized ELAN (GELAN)**. PGI helps the model learn more effectively by manipulating gradient flow, while GELAN improves architecture flexibility for better accuracy and speed trade-offs. Together, these enhancements allow YOLOv9 to achieve state-of-the-art performance in object detection. 
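
Besides `"answer"`, the dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved documents under `"context"`, which is useful for showing where an answer came from. A small sketch follows, using stand-in objects so that it runs without an API key. `summarize_sources` is a hypothetical helper; with the real chain you would pass `response["context"]` to it.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_sources(context_docs):
    """Count retrieved chunks per (source, page) pair, in first-seen order."""
    counts = Counter(
        (doc.metadata.get("source"), doc.metadata.get("page"))
        for doc in context_docs
    )
    return list(counts.items())


# Stand-ins that mimic the `.metadata` attribute of retrieved documents.
fake_context = [
    SimpleNamespace(metadata={"source": "yolov9_paper.pdf", "page": 0}),
    SimpleNamespace(metadata={"source": "yolov9_paper.pdf", "page": 0}),
    SimpleNamespace(metadata={"source": "yolov9_paper.pdf", "page": 3}),
]
summary = summarize_sources(fake_context)
for (source, page), n_chunks in summary:
    print(f"{source}, page index {page}: {n_chunks} chunk(s)")
```

The page values are zero-based, as in the loader output earlier in the notebook, so page index 0 is the first page of the PDF.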

