# Local PDF Questions
In this notebook we use `Python 3.11.6` and `Ollama` to interact with PDFs from your local computer.

## Table of Contents
1. Setup
2. Ingesting PDFs
3. Embeddings
4. Retrieval


## Setup

In [1]:
!pip uninstall -y chromadb protobuf

Found existing installation: chromadb 1.0.9
Uninstalling chromadb-1.0.9:
  Successfully uninstalled chromadb-1.0.9
Found existing installation: protobuf 5.29.4
Uninstalling protobuf-5.29.4:
  Successfully uninstalled protobuf-5.29.4


In [None]:
# pip install using requirements.txt
!pip install -r requirements.txt
# Individual pip installs 
# !pip install tqdm "unstructured[all-docs]>=0.16.12" langchain chromadb 
# !pip install langchain_community
# !pip install langchain-text-splitters langchain-ollama protobuf

## Ingesting PDFs
Create a folder named `pdfs` in the same directory as this notebook and store the `pdf` file you want to use there.
> Make sure to set the `pdf_name` variable below to the name of your PDF file.
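The folder setup described above can be sketched with the standard library. This is a minimal sketch, not part of the original notebook: the helper name `list_pdfs` is hypothetical, and it assumes the notebook's working directory is the one that should contain `pdfs`.

```python
from pathlib import Path


def list_pdfs(folder: str = "pdfs") -> list[str]:
    """Create `folder` if it is missing and return the PDF file names inside it.

    `list_pdfs` is a hypothetical helper introduced for illustration only.
    """
    pdf_dir = Path(folder)
    # Creating the folder up front avoids a FileNotFoundError on a fresh checkout.
    pdf_dir.mkdir(parents=True, exist_ok=True)
    # Match the extension case-insensitively so "Paper.PDF" is also listed.
    return sorted(p.name for p in pdf_dir.iterdir() if p.suffix.lower() == ".pdf")


if __name__ == "__main__":
    # Prints the candidate values for `pdf_name`; an empty list means no PDF was found.
    print(list_pdfs())
```

If the returned list is empty, copy a PDF into the folder before running the loading cell.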

In [3]:
import os
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_community.document_loaders import OnlinePDFLoader

In [4]:
pdf_name = "ReAct-AI-agents.pdf"
pdf_path = os.path.join("pdfs", pdf_name)
print(pdf_path)

# Check that the file actually exists (a non-empty path string is always truthy)
if os.path.isfile(pdf_path):
    loader = UnstructuredPDFLoader(file_path=pdf_path)
    data = loader.load()
    print(f"Uploaded '{pdf_path}'.")
else:
    print("Upload a pdf in 'pdfs' folder. Create the folder if needed.")

pdfs/ReAct-AI-agents.pdf
Uploaded 'pdfs/ReAct-AI-agents.pdf'.


In [5]:
# Preview (output is too long, so only uncomment for curiosity)
# data[0].page_content

## Embeddings

This section requires [Ollama](https://ollama.com/) to be installed on your local computer and an embedding model of your choice to be pulled. For this notebook, we chose [nomic-embed-text](https://ollama.com/library/nomic-embed-text) because it has a large context length and, according to its model page, outperforms OpenAI's `text-embedding-ada-002` and `text-embedding-3-small`.
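Before pulling or using a model, it can help to confirm that the local Ollama server is reachable. The sketch below is an optional check and not part of the original notebook: the helper name `local_ollama_models` is hypothetical, and the address `http://127.0.0.1:11434` is assumed to be the address your server listens on. It queries Ollama's `/api/tags` endpoint, which lists locally available models.

```python
import json
import urllib.error
import urllib.request


def local_ollama_models(base_url: str = "http://127.0.0.1:11434") -> list[str]:
    """Return the names of locally pulled models, or [] if the server is unreachable.

    Hypothetical helper; `base_url` is an assumed default address.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=2) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: treat as "not available".
        return []
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    models = local_ollama_models()
    print(models if models else "Ollama is not reachable or has no models pulled.")
```

An empty result means either the server is not running or no model has been pulled yet.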

In [6]:
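# Pull the model in case you do not have it yet.
# `ollama serve` is only needed if the Ollama server is not already running;
# otherwise it fails with "address already in use" (harmless here).
!ollama serve
!ollama pull nomic-embed-text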

Error: listen tcp 127.0.0.1:11434: bind: address already in use
pulling manifest
pulling 970aa74c0a90: 100% ▕██████████████████▏ 274 MB
pulling c71d239df917: 100% ▕██████████████████▏  11 KB
pulling ce4a164fc046: 100% ▕██████████████████▏   17 B
pulling 31df23ea7daa: 100% ▕██████████████████▏  420 B
verifying sha256 digest
writing manifest
success


In [7]:
!ollama list

NAME                       ID              SIZE      MODIFIED               
nomic-embed-text:latest    0a109f422b47    274 MB    Less than a second ago    
mxbai-embed-large:335m     468836162de7    669 MB    4 hours ago               
nomic-embed-text:v1.5      0a109f422b47    274 MB    4 hours ago               
llama3.2:3b                a80c4f17acd5    2.0 GB    4 hours ago               


In [8]:
EMBEDDING_MODEL = "nomic-embed-text"

In [9]:
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [10]:
# Split and chunk pdf
text_splitter = RecursiveCharacterTextSplitter(chunk_size=8000, chunk_overlap=150)
chunks = text_splitter.split_documents(data)
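To build intuition for `chunk_size` and `chunk_overlap`, here is a deliberately simplified sliding-window splitter. It is an illustrative sketch only: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraphs and newlines, which this toy version does not attempt, and the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split `text` into fixed-width windows that share `chunk_overlap` characters.

    Simplified illustration; not the algorithm used by RecursiveCharacterTextSplitter.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    # Each chunk repeats the last 2 characters of the previous one.
    print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of embedding some text twice.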

In [11]:
# visualize two chunks (long output cell, uncomment to visualize if curious)
# chunks[:2]

In [12]:
# Add chunks to the vector database (chunks get embedded)
vdb = Chroma.from_documents(
    documents=chunks,
    persist_directory=None, # None makes Chroma use an ephemeral (in-memory) client
    embedding=OllamaEmbeddings(model=EMBEDDING_MODEL),
    collection_name="local_rag", # collection created in chromadb
)
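Conceptually, a vector database stores one embedding per chunk and answers a query by returning the chunks whose embeddings are closest to the query embedding. The toy sketch below illustrates that idea with cosine similarity over hand-written vectors. The metric, the names `cosine_similarity` and `top_k`, and the example vectors are assumptions for illustration; Chroma's actual distance metric and index may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the `k` stored vectors most similar to `query_vec`."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up 2-dimensional "embeddings"; real embeddings have hundreds of dimensions.
    toy_store = {"chunk_a": [1.0, 0.0], "chunk_b": [0.0, 1.0], "chunk_c": [1.0, 1.0]}
    print(top_k([1.0, 0.1], toy_store, k=2))
```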

## Retrieval 

In [13]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

In [14]:
# LLM from Ollama
LOCAL_MODEL = "llama3.2:3b"

In [15]:
llm = ChatOllama(model=LOCAL_MODEL)

In [16]:
# Query prompt template
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate 2
    different versions of the given user question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help the user overcome some of the limitations of the distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)

In [17]:
# Retriever
retriever = MultiQueryRetriever.from_llm(
    vdb.as_retriever(),
    llm,
    prompt=QUERY_PROMPT,
)
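The idea behind multi-query retrieval is to rewrite the question into several variants, search with each one, and merge the results without duplicates. The sketch below shows that flow with plain functions standing in for the LLM and the vector store. The names `multi_query_retrieve`, `rewrite`, and `search` are hypothetical, and this is a simplified illustration rather than `MultiQueryRetriever`'s actual implementation.

```python
from typing import Callable


def multi_query_retrieve(
    question: str,
    rewrite: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
) -> list[str]:
    """Search with every rewritten query and return the unique union of results.

    `rewrite` stands in for the LLM call, `search` for the vector store lookup.
    """
    seen: set[str] = set()
    merged: list[str] = []
    for query in rewrite(question):
        for doc in search(query):
            if doc not in seen:  # keep the first occurrence, preserve order
                seen.add(doc)
                merged.append(doc)
    return merged


if __name__ == "__main__":
    # Hard-coded stand-ins so the example runs without a model or database.
    fake_rewrite = lambda q: [q + " (variant 1)", q + " (variant 2)"]
    fake_index = {
        "topic? (variant 1)": ["doc1", "doc2"],
        "topic? (variant 2)": ["doc2", "doc3"],
    }
    print(multi_query_retrieve("topic?", fake_rewrite, lambda q: fake_index.get(q, [])))
```

Because different phrasings land in slightly different regions of the embedding space, the merged set can contain relevant chunks that a single query would miss.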

In [18]:
# RAG prompt
template = """
Answer the question based ONLY on the following context:
{context}
Question: 
{question}
"""

rag_prompt = ChatPromptTemplate.from_template(template)

In [19]:
# Create a chain that runs the whole pipeline: retrieve, build the prompt, generate, parse
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
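The chain above pipes the output of each step into the next one. Written as ordinary function calls, the same flow looks roughly like the sketch below. All names (`run_pipeline`, `retrieve`, `render_prompt`, `generate`) are hypothetical stand-ins, and joining the retrieved texts with blank lines is an assumption made for readability, not what the chain does internally.

```python
from typing import Callable


def run_pipeline(
    question: str,
    retrieve: Callable[[str], list[str]],
    render_prompt: Callable[..., str],
    generate: Callable[[str], str],
) -> str:
    """Retrieve context, fill the prompt, and return the model's text answer."""
    # Step 1: the question goes to the retriever and is also passed through unchanged.
    context = "\n\n".join(retrieve(question))
    # Step 2: both values fill the prompt template.
    prompt = render_prompt(context=context, question=question)
    # Step 3: the model turns the prompt into an answer string.
    return generate(prompt)


if __name__ == "__main__":
    # Stand-ins so the sketch runs without a retriever or an LLM.
    answer = run_pipeline(
        "What is ReAct?",
        retrieve=lambda q: ["chunk one", "chunk two"],
        render_prompt=lambda context, question: f"Context:\n{context}\nQuestion: {question}",
        generate=lambda prompt: f"(echo) {prompt}",
    )
    print(answer)
```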

In [20]:
# Call the whole pipeline, i.e. do RAG
# `input("")` prompts you to type your question
response = chain.invoke(input(""))

 what is this about?


In [21]:
response

'This appears to be an abstract or description of a research paper or conference presentation, likely related to artificial intelligence and reinforcement learning. The text mentions "ReAct" which seems to be a type of behavior correction algorithm for reinforcement learning. Specifically, it describes a scenario where ReAct fails to produce desirable reasoning traces and actions due to a "hallucinating thought", but is corrected by a human editor who manually edits the thoughts to improve the outcome.'

In [22]:
resp2 = chain.invoke("What are the main 5 take aways from the document?")
resp2

"Here are the 5 main takeaways from the document:\n\n1. **ReAct outperforms Act in finding relevant products**: In the Webshop task, ReAct is able to find products that satisfy all target attributes, while Act relies on search and filtering.\n\n2. **ReAct uses reasoning to solve tasks**: The example trajectories show that ReAct uses reasoning to solve tasks, such as thinking about the product's flavor name and features before taking actions.\n\n3. **ReAct's thought process is explicit**: In contrast to Act, which appears to have a more implicit thought process, ReAct's thoughts are explicitly stated in its trajectory, allowing for a clearer understanding of how it arrives at a solution.\n\n4. **ReAct struggles with certain tasks due to limitations in its thought process**: The example trajectories also show that ReAct gets stuck trying to put a knife on a countertop, highlighting potential limitations in its thought process.\n\n5. **ReAct's strengths lie in its ability to reason and th

In [23]:
# Delete all collections in the vector db 
# (aka clean memory because chroma was ephemeral)
vdb.delete_collection()