# PDF Assistant

## Load PDF documents

In [1]:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader("./data/sample.pdf")
pages = loader.load_and_split()

In [2]:
pages[0]

Document(page_content='Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition\nChapter 4:  Threads', metadata={'source': './data/sample.pdf', 'page': 0})
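
`load_and_split()` returns the PDF as a list of chunks rather than one long string. The sketch below illustrates the general idea of chunking with overlap in plain Python. It is a simplified assumption (fixed-size character windows), not the splitter LangChain actually uses, and `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, overlap):
    """Cut text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a boundary still appears whole in at least one of them when the overlap is large enough.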

In [3]:
print(f"Page count: {len(pages)}\n")
for page in pages:
    print(f"Page {page.metadata['page'] + 1}: {page.page_content}\n")

Page count: 16

Page 1: Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition
Chapter 4:  Threads

Page 2: 4.2 Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition
Chapter 4: Threads
Overview
Multicore Programming
Multithreading Models
Threading Issues
Operating System Examples

Page 3: 4.3 Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition
Objectives
To introduce the notion of a thread —a fundamental unit of CPU 
utilization that forms the basis of multithreaded computer 
systems
To examine issues related to multithreaded programming
To cover operating system support for threads in Windows and 
Linux

Page 4: 4.4 Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition
Motivation
Most modern applications are multithreaded
Threads run within application
Multiple tasks with the application can be implemented by 
separate threads
Update display
Fetch data
Spell checking
Answer a network re

## Create vector database

In [4]:
import os
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

persistency_directory = "./data/chroma"
embedding_function = OpenAIEmbeddings()

if os.path.exists(persistency_directory):
    print("Loading Chroma from disk...")
    db = Chroma(
        persist_directory=persistency_directory,
        embedding_function=embedding_function,
    )
else:
    print("Creating Chroma from text embeddings...")
    db = Chroma.from_documents(
        pages,
        embedding=embedding_function,
        persist_directory=persistency_directory,
    )

Loading Chroma from disk...


### Test similarity search

In [5]:
query = "What are the objectives of the presentation?"
docs = db.similarity_search(query)

In [6]:
print(docs[0].page_content)

4.3 Silberschatz, Galvin and Gagne ©2013 Operating System Concepts –9thEdition
Objectives
To introduce the notion of a thread —a fundamental unit of CPU 
utilization that forms the basis of multithreaded computer 
systems
To examine issues related to multithreaded programming
To cover operating system support for threads in Windows and 
Linux
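
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose embeddings lie closest to it. The sketch below shows that idea with hand-made 3-dimensional vectors and cosine similarity. The vectors, the labels, and the `top_k` helper are hypothetical; Chroma works with real embeddings and a vector index rather than this brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (label, score) pairs most similar to query_vec, best first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings", made up for illustration (not produced by OpenAIEmbeddings).
toy_store = {
    "Objectives": [0.9, 0.1, 0.0],
    "Motivation": [0.2, 0.8, 0.1],
    "Multicore Programming": [0.0, 0.3, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

results = top_k(toy_query, toy_store, k=2)
for label, score in results:
    print(f"{label}: {score:.3f}")
```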


## Create the agent

In [7]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0)

ppt_retriever = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
)

tools = [
    Tool(
        name="Presentation Content Retriever",
        func=ppt_retriever.run,
        description="useful for finding content inside the presentation",
    )
]

memory = ConversationBufferMemory(
    return_messages=True, memory_key="chat_history"
)

agent = initialize_agent(
    tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
)
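
With `chain_type="stuff"`, the retrieved documents are concatenated ("stuffed") into a single prompt together with the question. The sketch below shows only that idea; `build_stuff_prompt` and the prompt wording are hypothetical and are not LangChain's actual template.

```python
def build_stuff_prompt(documents, question):
    """Join all retrieved texts into one context block, then append the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Example inputs, shortened from the slide text for illustration.
retrieved = [
    "Objectives\nTo examine issues related to multithreaded programming",
    "Motivation\nMost modern applications are multithreaded",
]
prompt = build_stuff_prompt(retrieved, "What are the objectives?")
print(prompt)
```

Because every retrieved chunk ends up in one prompt, this approach suits a small number of short chunks, such as slide pages; a large result set could exceed the model's context window.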

In [8]:
query = "What are the stated objectives of the lesson? Format your answer as a list of bullet points."
response = agent.run(query)
print(response)



> Entering new AgentExecutor chain...
{
    "action": "Presentation Content Retriever",
    "action_input": "stated objectives of the lesson"
}
Observation: The stated objectives of the lesson are:

1. To introduce the notion of a thread - a fundamental unit of CPU utilization that forms the basis of multithreaded computer systems.
2. To examine issues related to multithreaded programming.
3. To cover operating system support for threads in Windows and Linux.
Thought:{
    "action": "Final Answer",
    "action_input": "The stated objectives of the lesson are:\n\n1. To introduce the notion of a thread - a fundamental unit of CPU utilization that forms the basis of multithreaded computer systems.\n2. To examine issues related to multithreaded programming.\n3. To cover operating system support for threads in Windows and Linux."
}

> Finished chain.
The stated objectives of the lesson are:

1. To introduce the notion of a