**langchain langchain-google-genai** This installs LangChain together with the langchain-google-genai package, the LangChain integration for Google's Generative AI models. It lets developers call Google's chat and embedding models from within their LangChain applications.

**pillow** This installs Pillow, the maintained fork of the Python Imaging Library (PIL). Pillow provides a simple interface for opening, manipulating, and saving images in various file formats.

In [1]:
!pip install langchain langchain-google-genai pillow



---
**import os** This cell imports the os module, part of the Python standard library, which provides a platform-independent way to interact with the underlying operating system. Here it is used to set the GOOGLE_API_KEY environment variable when it is not already defined; replace the placeholder string with your own key.

In [2]:
import os

if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = "Enter your API key here"
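
A key typed directly into a cell is easy to leak when the notebook is shared. As a minimal sketch of an alternative, assuming the notebook is run interactively, the standard-library getpass module can prompt for the key only when the variable is missing. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical, introduced here purely for illustration.

```python
import getpass
import os


def ensure_api_key(env_name="GOOGLE_API_KEY", prompt_fn=getpass.getpass):
    """Return the key stored in env_name, prompting for it only if it is missing.

    ensure_api_key and prompt_fn are illustrative names, not part of any library.
    """
    if not os.environ.get(env_name):
        # getpass.getpass hides the typed characters instead of echoing them
        os.environ[env_name] = prompt_fn(f"Enter {env_name}: ")
    return os.environ[env_name]
```

Calling `ensure_api_key()` near the top of the notebook could stand in for the placeholder assignment, so the key never appears in the saved file.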

In [3]:
!pip install langchain_community



---
**Necessary Libraries**

In [4]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain_core.prompts.prompt import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables import RunnableLambda


---
**!pip install pypdf** installs the pypdf library, a widely used library for working with PDF files in Python. PyPDFLoader relies on it to read the PDF.

In [5]:
!pip install pypdf



---
The **PyPDFLoader** handles the loading step of the RAG (Retrieval-Augmented Generation) pipeline: it reads a PDF file and returns a list of Document objects, one per page.

In [6]:
# Load PDF document using PyPDFLoader
loader = PyPDFLoader("path to pdf file")
pages = loader.load()

---
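The **CharacterTextSplitter** handles the chunking step of the RAG (Retrieval-Augmented Generation) pipeline: it splits the loaded pages on the separator and merges the pieces into chunks of up to about chunk_size characters, with up to chunk_overlap characters shared between neighbouring chunks so that context is not lost at chunk boundaries.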

In [7]:
# Split the loaded document into smaller chunks for processing
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=900,
    chunk_overlap=150,
    length_function=len,
)

splits = text_splitter.split_documents(pages)
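
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified pure-Python sketch of separator-based chunking with overlap. It is not LangChain's implementation; the function name `split_with_overlap` and the toy input are assumptions made for illustration, and the sizes are shrunk so the overlap is easy to see.

```python
def split_with_overlap(text, separator="\n", chunk_size=900, chunk_overlap=150):
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size
    characters, carrying trailing pieces (up to chunk_overlap characters) into
    the next chunk. Simplified illustration, not the library algorithm."""
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], []
    for piece in pieces:
        if current and len(separator.join(current + [piece])) > chunk_size:
            # The current chunk is full: emit it
            chunks.append(separator.join(current))
            # Keep only trailing pieces that fit within the overlap budget
            while current and len(separator.join(current)) > chunk_overlap:
                current.pop(0)
            # Drop more carried-over pieces if the new piece still would not fit
            while current and len(separator.join(current + [piece])) > chunk_size:
                current.pop(0)
        current.append(piece)
    if current:
        chunks.append(separator.join(current))
    return chunks


# Toy input: ten short lines, with tiny sizes so the overlap is visible
toy_text = "\n".join(f"line {i:02d}" for i in range(10))
toy_chunks = split_with_overlap(toy_text, chunk_size=20, chunk_overlap=8)
for chunk in toy_chunks[:3]:
    print(repr(chunk))
```

Each printed chunk begins with the last line of the previous one, which is the effect the overlap setting is meant to produce.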

---
Generate embeddings for the text extracted from documents. These **embeddings** are used to represent the semantic meaning of the text in a numerical format, which is essential for efficient retrieval and comparison of relevant information.

In [8]:
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
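
As a rough illustration of how numerical vectors allow texts to be compared, the sketch below computes cosine similarity, one common similarity measure, between made-up three-dimensional vectors. Real embeddings have many more dimensions, and the distance measure a particular vector store uses may differ; the vectors and names here are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, not real embeddings
query_vec = [1.0, 0.0, 1.0]
similar_vec = [2.0, 0.0, 2.0]    # same direction as the query
unrelated_vec = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query_vec, similar_vec))    # close to 1
print(cosine_similarity(query_vec, unrelated_vec))  # 0 for orthogonal vectors
```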

---

The **ChromaDB** library is a vector database for efficiently storing and querying high-dimensional embeddings.

In [9]:
!pip install chromadb



In [10]:
# Initializes a Chroma vector database with document embeddings and text chunks.
vectorDb = Chroma.from_documents(
    embedding=embeddings,
    documents=splits,
    persist_directory="/docs/chroma"
)

The **Retriever interface** in LangChain defines a standard way for retrieving relevant documents from a data source. By creating a Retriever from a VectorStore, you can use the Retriever interface to interact with the VectorStore in a consistent way, without having to worry about the underlying implementation details.

The **as_retriever()** method creates a VectorStoreRetriever object, which is a wrapper around the VectorStore object. This VectorStoreRetriever implements the Retriever interface and uses the search capabilities of the VectorStore to find relevant documents.

In [11]:
retriever = vectorDb.as_retriever()
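
Conceptually, a vector-store retriever scores every stored chunk against the query vector and returns the best few. The following toy sketch shows that idea in plain Python with a dot-product score; the store contents, vectors, and the `top_k` helper are all hypothetical and much simpler than what a real vector database does.

```python
import heapq


def top_k(query_vec, store, k=2):
    """Return the k texts whose vectors have the largest dot product with query_vec."""
    def score(item):
        _, vec = item
        return sum(q * v for q, v in zip(query_vec, vec))

    return [text for text, _ in heapq.nlargest(k, store, key=score)]


# Hypothetical (text, vector) pairs standing in for stored chunks
toy_store = [
    ("chunk about meshes", [0.9, 0.1]),
    ("chunk about video models", [0.2, 0.8]),
    ("chunk about math reasoning", [0.6, 0.4]),
]

print(top_k([1.0, 0.0], toy_store, k=2))
```

With the LangChain retriever, the number of returned chunks is usually controlled by passing `search_kwargs={"k": ...}` to `as_retriever()`.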

---
This **prompt** provides clear instructions for an AI assistant to generate responses based on a provided document. It specifies the expected format for the response, including a direct answer, supporting information from the document, and an acknowledgment if the relevant information is not found in the document. This structure helps ensure the assistant's responses are accurate, detailed, and transparent about the limitations of the available information.

In [12]:
prompt = """You are an AI assistant designed to answer questions based on a provided document.
   Please use the information in the document to generate accurate and detailed responses.
   If the information is not found in the document, indicate that the document does not contain the relevant information.


   Use the following format for structuring response:

   Question: {question}
   Document: {document}

   Answer Format:
        Direct Answer: Provide a concise and direct answer to the question based on the document.
        Supporting Information: Include relevant details or excerpts from the document to support the answer. Use quotes or references to specific sections of the document. """

In [13]:
# Create a PromptTemplate that fills the prompt's {question} and {document} placeholders.

prompt_template = PromptTemplate(template=prompt, input_variables=["question", "document"])
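
To see what filling the placeholders amounts to, the sketch below uses Python's built-in `str.format` on a shortened, hypothetical template as a stand-in for the template object. The template text and sample values are assumptions for illustration, not the prompt defined above.

```python
# Shortened stand-in template with the same two placeholders
toy_template = (
    "Question: {question}\n"
    "Document: {document}\n"
)

filled = toy_template.format(
    question="What is RAG?",
    document="RAG combines retrieval with generation.",
)
print(filled)
```

Because this style of template treats braces as placeholders, any literal `{` or `}` in the prompt text needs to be doubled (`{{` and `}}`).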

In [14]:
# Create a ChatGoogleGenerativeAI chat model instance (Gemini Pro) and assign it to llm.

llm = ChatGoogleGenerativeAI(model="gemini-pro")

In [15]:
# Concatenates the content of multiple pages (docs) into a single string, separating each page's content with two newline characters.
def format_pages(pages):
    return "\n\n".join([page.page_content for page in pages])
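
A quick way to check the joining behaviour without loading a PDF is to call the function on stand-in objects that expose a `page_content` attribute. The sketch below repeats the function so it runs on its own; the `SimpleNamespace` objects are hypothetical substitutes for loaded pages.

```python
from types import SimpleNamespace


def format_pages(pages):
    # Same logic as above: join each page's text with a blank line between pages
    return "\n\n".join([page.page_content for page in pages])


# Hypothetical stand-ins for loaded pages
fake_pages = [
    SimpleNamespace(page_content="First page text."),
    SimpleNamespace(page_content="Second page text."),
]

joined = format_pages(fake_pages)
print(joined)
```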

# Custom Chain

This RAG chain combines several components to generate a response to a user's question. The question string passed to invoke() is sent to both entries of the input dictionary:
1. For the "document" key, the question goes to the retriever; the retrieved pages are joined by the format_pages function, and a RunnableLambda ensures the result is a string.
2. For the "question" key, the RunnablePassthrough component forwards the question unchanged.
3. The resulting "document" and "question" values are then fed into the prompt_template.
4. The output from the prompt_template is passed to the language model (llm).
5. Finally, the StrOutputParser component extracts the language model's reply as a plain string to produce the final response.

In [16]:
rag_chain = (
    {"document": retriever | format_pages |  RunnableLambda(lambda x: str(x)), "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
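
The data flow of the chain can be sketched with plain Python functions. This is only an illustration of the order of operations, assuming stub components: `fake_retriever`, `fake_llm`, and the other names below are hypothetical and call no model or vector store.

```python
def fake_retriever(question):
    # Stub: a real retriever would return the chunks most similar to the question
    return ["first retrieved chunk", "second retrieved chunk"]


def join_chunks(chunks):
    return "\n\n".join(chunks)


def build_prompt(inputs):
    return f"Question: {inputs['question']}\nDocument: {inputs['document']}"


def fake_llm(prompt_text):
    # Stub: echoes the prompt instead of calling a language model
    return "ANSWER BASED ON:\n" + prompt_text


def toy_chain(question):
    # The same question feeds both entries, mirroring the dictionary step
    inputs = {
        "document": join_chunks(fake_retriever(question)),
        "question": question,
    }
    return fake_llm(build_prompt(inputs))


print(toy_chain("What is the focus of the project?"))
```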

# Question Answering

In [25]:
question = "Which paper received the highest number of stars per hour?"
result = rag_chain.invoke(question)
print(result)

Question: Which paper received the highest number of stars per hour?
Direct Answer: The paper "MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers" received the highest number of stars per hour, with 5.09 stars per hour.
Supporting Information: "Stats: 417, 5.09 stars / hour"


In [26]:
question = "What is the focus of the 'MeshAnything' project?"
result = rag_chain.invoke(question)
print(result)

Question: What is the focus of the 'MeshAnything' project?
Direct Answer: The 'MeshAnything' project focuses on utilizing autoregressive transformers to facilitate artist-created mesh generation.
Supporting Information: "Title: MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers"


In [27]:
question = "Which paper discusses the integration of Large Language Models with Monte Carlo Tree Search?"
result = rag_chain.invoke(question)
print(result)

Question: Which paper discusses the integration of Large Language Models with Monte Carlo Tree Search?
Direct Answer: The paper titled "Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-Refine with LLaMa-3 8B" discusses the integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS).
Supporting Information: "This paper introduces the MCT Self-Refine algorithm, an innovative integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), designed to enhance performance in complex mathematical reasoning tasks."


In [18]:
question = "What advancements does the 'VideoLLaMA 2' paper propose?"
result = rag_chain.invoke(question)
print(result)

Question: What advancements does the 'VideoLLaMA 2' paper propose?
Direct Answer: The "VideoLLaMA 2" paper presents a set of Video Large Language Models (Video-LLMs) designed to enhance spatial-temporal modeling and audio understanding in video and audio-oriented tasks.
Supporting Information: "In this paper, we present the VideoLLaMA 2, a set of Video Large Language Models (Video -LLMs) designed to enhance spatial -temporal modeling and audio understanding in video and audio -oriented tasks."


In [19]:
question = "Which paper discusses the integration of Large Language Models with Monte Carlo Tree Search?"
result = rag_chain.invoke(question)
print(result)

Question: Which paper discusses the integration of Large Language Models with Monte Carlo Tree Search?
Direct Answer: The paper titled "Accessing GPT -4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self -refine with LLaMa -3 8B" discusses the integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS).
Supporting Information: "This paper introduces the MCT Self -Refine algorithm, an innovative integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), designed to enhance performance in complex mathematical reasoning tasks."


In [20]:
question = "What is the focus of the 'MeshAnything' project?"
result = rag_chain.invoke(question)
print(result)

Question: What is the focus of the 'MeshAnything' project?
Direct Answer: The 'MeshAnything' project focuses on enabling artists to create 3D meshes using autoregressive transformers, which can generate high-quality assets that rival manually crafted ones.
Supporting Information: "MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers."


In [21]:
question = "Which paper was published most recently?"
result = rag_chain.invoke(question)
print(result)

Question: Which paper was published most recently?
Direct Answer: The two papers with the most recent publication dates are "MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers" and "Accessing GPT -4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self -refine with LLaMa -3 8B", both published on June 14, 2024.
Supporting Information: "Title:  MeshAnything: Artist -Created Mesh Generation with Autoregressive Transformers  
Authors:  buaacyw/meshanything  
Date:  14 Jun 2024  
Description:  Recently, 3D assets created via reconstruction and generation have matched the 
quality of manually crafted assets, highlighting their potential for replacement.  
Stats:  417, 5.09 stars / hour  
Categories:  Decoder  
Links:  Paper, Code  
 
Title:  Accessing GPT -4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self -
refine with LLaMa -3 8B  
Authors:  trotsky1997/mathblackbox  
Date:  11 Jun 2024  
Description:  This paper introduces the MCT Self

In [22]:
question = "Identify a paper that deals with language modeling and its scalability."
result = rag_chain.invoke(question)
print(result)

Question: Identify a paper that deals with language modeling and its scalability.
Direct Answer: The document provides two papers related to language modeling and scalability:
1. "Scalable MatMul-free Language Modeling" by ridgerchu/matmulfreellm
2. "VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs" by damo-nlp-sg/videollama2

Supporting Information:
1. "Scalable MatMul-free Language Modeling": This paper proposes a novel "MatMul-free" language modeling approach that achieves performance on par with state-of-the-art Transformers while requiring significantly less memory during inference, enabling scalability to larger model sizes.
2. "VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs": This paper focuses on enhancing spatial-temporal modeling and audio understanding in Video Large Language Models (Video-LLMs), but also addresses scalability challenges by introducing a modular and flexible design that allows fo

In [23]:
question = "Which paper aims at improving accuracy in Google-Proof Question Answering?"
result = rag_chain.invoke(question)
print(result)

Question: Which paper aims at improving accuracy in Google-Proof Question Answering?
Direct Answer: The paper titled "TextGrad: Automatic 'Differentiation' via Text" aims at improving accuracy in Google-Proof Question Answering.
Supporting Information: The document states that "Without modifying the framework, TextGrad improves the zero -shot accuracy of GPT -4o in Google -Proof Question Answering."


In [24]:
question = "List the categories covered by the paper titled TextGrad: Automatic 'Differentiation' via Text."
result = rag_chain.invoke(question)
print(result)

Question: List the categories covered by the paper titled TextGrad: Automatic 'Differentiation' via Text.
Direct Answer: The categories covered by the paper titled "TextGrad: Automatic 'Differentiation' via Text" are:
1. Question Answering
2. Specificity

Supporting Information: "Categories: Question Answering, Specificity"
