In [1]:
from langchain_community.document_loaders import PyPDFLoader

# Load the paper; PyPDFLoader returns one Document per PDF page.
loader = PyPDFLoader("yolov9_paper.pdf")
data = loader.load()

In [2]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

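# Split the loaded pages into chunks of at most 1000 characters
# (chunk_overlap is left at its library default).
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(data)

print("Total number of documents: ", len(docs))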

Total number of documents:  96
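
To make the splitting step above more concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is not LangChain's implementation: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and newlines before falling back to smaller pieces. The helper name `split_with_overlap` and the sizes used are illustrative assumptions.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    Hypothetical helper for illustration, not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of storing some text twice.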


In [3]:
from langchain_chroma import Chroma
from langchain_google_genai import GoogleGenerativeAIEmbeddings

from dotenv import load_dotenv
load_dotenv()  # reads variables such as GOOGLE_API_KEY from a local .env file

# Get an API key:
# Go to https://ai.google.dev/gemini-api/docs/api-key to generate a Google AI API key,
# then add it to your .env file.

# Embedding models: https://python.langchain.com/v0.1/docs/integrations/text_embedding/

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector = embeddings.embed_query("hello, world!")
vector[:5]

[0.05168594419956207,
 -0.030764883384108543,
 -0.03062233328819275,
 -0.02802734263241291,
 0.01813093200325966]

In [4]:
# Embed every chunk and index it in Chroma, reusing the embeddings object created above.
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings)

In [5]:
# Return the 10 chunks most similar to the query.
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

retrieved_docs = retriever.invoke("What is new in yolov9?")
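
Conceptually, a similarity retriever ranks the stored vectors by closeness to the query vector and returns the best `k`. The sketch below illustrates that idea with cosine similarity over tiny hand-made vectors. It is a toy under stated assumptions, not a description of Chroma's internals: a real vector store typically relies on an index and its own configurable distance metric. The names `cosine_similarity`, `top_k`, and `toy_store`, as well as the vectors themselves, are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: list[tuple[str, list[float]]], k: int):
    """Return the k (text, score) pairs most similar to query, best first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; real embedding vectors have hundreds of dimensions.
toy_store = [
    ("chunk A", [1.0, 0.0, 0.0]),
    ("chunk B", [0.8, 0.6, 0.0]),
    ("chunk C", [0.0, 0.0, 1.0]),
]
query_vector = [1.0, 0.1, 0.0]

for text, score in top_k(query_vector, toy_store, k=2):
    print(f"{score:.3f}  {text}")
```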

In [6]:
# Inspect one of the retrieved chunks (index 5 is the sixth result).
print(retrieved_docs[5].page_content)

YOLOv9: Learning What You Want to Learn
Using Programmable Gradient Information
Chien-Yao Wang1,2, I-Hau Yeh2, and Hong-Yuan Mark Liao1,2,3
1Institute of Information Science, Academia Sinica, Taiwan
2National Taipei University of Technology, Taiwan
3Department of Information and Computer Engineering, Chung Yuan Christian University, Taiwan
kinyiu@iis.sinica.edu.tw, ihyeh@emc.com.tw, and liao@iis.sinica.edu.tw
Abstract
Today’s deep learning methods focus on how to design
the most appropriate objective functions so that the pre-
diction results of the model can be closest to the ground
truth. Meanwhile, an appropriate architecture that can
facilitate acquisition of enough information for prediction
has to be designed. Existing methods ignore a fact that
when input data undergoes layer-by-layer feature extrac-
tion and spatial transformation, large amount of informa-
tion will be lost. This paper will delve into the important is-


In [7]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0.3, max_tokens=500)

In [8]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [9]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
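
The two helper calls above hide a simple flow: retrieve documents for the input, "stuff" their text into the `{context}` slot of the prompt, and send the filled prompt to the model. The plain-Python sketch below mimics that flow with a stand-in retriever and a stand-in model, so it runs without an API key. Every name in it (`stuff_documents`, `toy_rag`, `fake_retriever`, `fake_llm`) is hypothetical, and joining chunks with a blank line is an assumed separator rather than a statement about LangChain's internals.

```python
from typing import Callable

SYSTEM_TEMPLATE = (
    "Use the following pieces of retrieved context to answer "
    "the question.\n\n{context}"
)


def stuff_documents(chunks: list[str], separator: str = "\n\n") -> str:
    """Concatenate every retrieved chunk into one context string."""
    return separator.join(chunks)


def toy_rag(
    question: str,
    retriever: Callable[[str], list[str]],
    llm: Callable[[str], str],
) -> dict:
    """Retrieve, fill the prompt, call the model, and return all three parts."""
    chunks = retriever(question)
    prompt_text = (
        SYSTEM_TEMPLATE.format(context=stuff_documents(chunks))
        + "\n\nQuestion: " + question
    )
    return {"input": question, "context": chunks, "answer": llm(prompt_text)}


def fake_retriever(question: str) -> list[str]:
    # Stand-in for a vector store lookup: always returns the same two chunks.
    return ["first retrieved chunk", "second retrieved chunk"]


def fake_llm(prompt_text: str) -> str:
    # Stand-in for a chat model: only reports how much text it received.
    return f"(model saw {len(prompt_text)} characters of prompt)"


result = toy_rag("what is new in YOLOv9?", fake_retriever, fake_llm)
print(result["answer"])
```

Because every retrieved chunk is pasted into a single prompt, this approach only suits cases where `k` chunks fit comfortably in the model's context window.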

In [10]:
response = rag_chain.invoke({"input": "what is new in YOLOv9?"})
print(response["answer"])

YOLOv9 introduces Programmable Gradient Information (PGI) to improve model accuracy by reducing information loss during training.  It also features Generalized ELAN (GELAN) for better parameter and computation management. These improvements make YOLOv9 a top-performing real-time object detector.
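
When checking an answer like the one above, it can help to see which pages the supporting chunks came from. The sketch below assumes that the response dictionary also carries the retrieved documents under a `"context"` key, and that each document exposes a `metadata` mapping with `source` and `page` entries, as PDF loaders commonly provide. Both assumptions should be verified against the installed library versions. `summarize_sources` is a hypothetical helper, and the stand-in documents are made up so the example runs without LangChain.

```python
from types import SimpleNamespace


def summarize_sources(context_docs) -> list[str]:
    """Return unique 'source (page N)' labels in first-seen order."""
    labels = []
    for doc in context_docs:
        meta = getattr(doc, "metadata", {}) or {}
        source = meta.get("source", "unknown source")
        page = meta.get("page")
        label = source if page is None else f"{source} (page {page})"
        if label not in labels:
            labels.append(label)
    return labels


# Stand-in documents using the assumed metadata layout.
fake_context = [
    SimpleNamespace(page_content="...", metadata={"source": "yolov9_paper.pdf", "page": 0}),
    SimpleNamespace(page_content="...", metadata={"source": "yolov9_paper.pdf", "page": 3}),
    SimpleNamespace(page_content="...", metadata={"source": "yolov9_paper.pdf", "page": 0}),
]

for label in summarize_sources(fake_context):
    print(label)
```

With the real chain, the call would look like `summarize_sources(response["context"])`, provided that key is present in the response.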
