![](2023-07-24-10-52-10.png)

# Building a Simple QA System for Chatting with a PDF

This part of the training is mostly hands-on: we walk through the code for building a PDF question-answering (QA) system with LangChain.

In [1]:
!pip install langchain==0.2.14
!pip install langchain-openai==0.1.8
!pip install langchainhub==0.1.20
!pip install pypdf==4.2.0
!pip install chromadb==0.5.0

In [2]:
import os

# Set the OpenAI API key (replace the placeholder; avoid committing a real key)

os.environ["OPENAI_API_KEY"] = "your openai key"

# OR (load from .env file)

# from dotenv import load_dotenv
# load_dotenv("./.env")
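
A third option, sketched here as an assumption rather than part of the original notebook, is to prompt for the key at runtime so it never appears in the notebook file. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical; only `os.environ` and `getpass` come from the standard library.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting once if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key is not echoed into the cell output.
        key = prompt_fn(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` at the top of the notebook would then replace the hard-coded assignment; if the variable is already set (for example via a `.env` file), no prompt is shown.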

In [40]:
from langchain import hub
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

In [14]:
MODEL="gpt-4o-mini"

In [3]:
pdf_path = "./assets-resources/paper-llm-components.pdf"
loader = PyPDFLoader(pdf_path) # LOAD

In [4]:
pdf_docs = loader.load_and_split() # SPLIT
pdf_docs

[Document(metadata={'source': './assets-resources/paper-llm-components.pdf', 'page': 0}, page_content='A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Large Language Models (GLMs or LLMs)\nhave acquired extensive general knowledge and\nhuman-like reaso

In [5]:
doc_obj = pdf_docs[0]
doc_obj

Document(metadata={'source': './assets-resources/paper-llm-components.pdf', 'page': 0}, page_content='A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Large Language Models (GLMs or LLMs)\nhave acquired extensive general knowledge and\nhuman-like reason

In [6]:
type(doc_obj)

langchain_core.documents.base.Document

In [7]:
doc_obj.page_content

'A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Large Language Models (GLMs or LLMs)\nhave acquired extensive general knowledge and\nhuman-like reasoning capabilities (Santurkar et al.,\n2023; Wang et al., 2022; Zhong et al., 2022, 2023),\npositioning

In [8]:
len(pdf_docs)

28
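
The count above is the number of chunks returned by `load_and_split()`, which applies a text splitter after loading. To make the idea of splitting concrete, here is a toy character-window splitter with overlap. It is a simplified stand-in, not LangChain's splitter (which, as far as I know, also tries to break on separators such as paragraphs); the function name and the sizes are hypothetical.

```python
def split_with_overlap(text, chunk_size=40, overlap=10):
    """Cut text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "Recent advancements in Large Language Models have catalyzed the development of agents."
chunks = split_with_overlap(sample, chunk_size=40, overlap=10)
for i, chunk in enumerate(chunks):
    print(i, repr(chunk))
```

The overlap means the last characters of one chunk reappear at the start of the next, which reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.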

In [9]:
from IPython.display import display, Markdown

Markdown(doc_obj.page_content)

A Survey on LLM-Based Agents: Common Workflows and Reusable
LLM-Profiled Components
Xinzhe Li
School of IT, Deakin University, Australia
lixinzhe@deakin.edu.au
Abstract
Recent advancements in Large Language Mod-
els (LLMs) have catalyzed the development of so-
phisticated frameworks for developing LLM-based
agents. However, the complexity of these frame-
works r poses a hurdle for nuanced differentiation
at a granular level, a critical aspect for enabling
efficient implementations across different frame-
works and fostering future research. Hence, the
primary purpose of this survey is to facilitate a co-
hesive understanding of diverse recently proposed
frameworks by identifying common workflows and
reusable LLM-Profiled Components (LMPCs).
1 Introduction
Generative Large Language Models (GLMs or LLMs)
have acquired extensive general knowledge and
human-like reasoning capabilities (Santurkar et al.,
2023; Wang et al., 2022; Zhong et al., 2022, 2023),
positioning them as pivotal in constructing AI agents
known as LLM-based agents. In the context of this
survey, LLM-based agents are defined by their abil-
ity to interact actively with external tools (such as
Wikipedia) or environments (such as householding en-
vironments) and are designed to function as integral
components of agency, including acting, planning, and
evaluating.
Purpose of the Survey The motivation behind this
survey stems from the observation that many LLM-
based agents incorporate similar workflows and com-
ponents, despite the presence of a wide variety of
technical and conceptual challenges, e.g., search algo-
rithms (Yao et al., 2023a), tree structures (Hao et al.,
2023), and Reinforcement Learning (RL) components
(Shinn et al., 2023). (Wu et al., 2023) offer a modular
approach but lack integration with prevalent agentic
workflows. Wang et al. (2024) provide a comprehen-
sive review of LLM agents, exploring their capabil-
ities across profiling, memory, planning, and action.
In contrast, our survey does not attempt to cover all
components of LLM-based agents comprehensively.
Instead, we concentrate on the involvement of LLMswithin agentic workflows and aim to clarify the roles
of LLMs in agent implementations. We create com-
mon workflows incorporating reusable LLM-Profiled
Components (LMPCs), as depicted in Figure 1.
Contributions This survey offers the following con-
tributions. 1) Alleviating the understanding of com-
plex frameworks : The complexity of existing frame-
works can be simplified into implementable workflows,
especially when they are extracted for specific tasks.
This survey emphasizes reusable workflows and LM-
PCs across popular frameworks, such as ReAct (Yao
et al., 2023b), Reflexion (Shinn et al., 2023) and Tree-
of-Thoughts (Yao et al., 2023a). Specifically, based
on the interaction environments (§2) and the use of
common LMPCs (§3), we categorize and detail vari-
ous workflows, e.g., tool-use workflows, search work-
flows, and feedback-learning workflows. Many ex-
isting frameworks are composed of these workflows
and LMPCs, along with some specific non-LLM com-
ponents. 2) Helping researchers/practitioners as-
sess current frameworks at a more granular and
cohesive level : Section 4 categorizes prominent frame-
works and demonstrates how they are assembled by the
common workflows and LMPCs, as summarized in Ta-
ble 21.3) Facilitating further extensions of existing
frameworks : Existing frameworks could be modi-
fied by changing the implementations of LMPCs. To
enable this, we not only summarize implementations
of LMPCs but also their applicability across diverse
workflows and tasks in Section 5.
2 Task Environments And Tool
Environments
This section explores task environments and tool envi-
ronments, which present different settings compared to
traditional AI and reinforcement learning (RL) agent
frameworks (Russell and Norvig, 2010; Sutton and
Barto, 2018) . After a brief overview of standard logic-

In [10]:
embeddings = OpenAIEmbeddings() # EMBED
embeddings

OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x11ad38d90>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x11ad3cc10>, model='text-embedding-ada-002', dimensions=None, deployment='text-embedding-ada-002', openai_api_version='', openai_api_base='https://api.openai.com/v1', openai_api_type='', openai_proxy='', embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)

In [11]:
vectordb = Chroma.from_documents(pdf_docs, embedding=embeddings) # STORE
vectordb

<langchain_community.vectorstores.chroma.Chroma at 0x11a1ced90>
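
Conceptually, a vector store keeps one embedding vector per chunk and answers a query by returning the chunks whose vectors are closest to the query vector. The sketch below illustrates that idea with hand-written three-dimensional vectors and cosine similarity. It is an assumption-laden toy: the vectors and texts are made up, and Chroma's actual index and distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical (text, vector) pairs standing in for embedded PDF chunks.
toy_store = [
    ("chunk about agents", [0.9, 0.1, 0.0]),
    ("chunk about search workflows", [0.1, 0.9, 0.0]),
    ("chunk about evaluators", [0.0, 0.2, 0.9]),
]

print(top_k([1.0, 0.0, 0.0], toy_store, k=2))
# → ['chunk about agents', 'chunk about search workflows']
```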

Definition of a [retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/#:~:text=A%20retriever%20is,Document's%20as%20output.):

> A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used as the backbone of a retriever, but there are other types of retrievers as well.

In [12]:
retriever = vectordb.as_retriever() 
retriever

VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x11a1ced90>)
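
To illustrate why a retriever is more general than a vector store, here is a hypothetical retriever that uses no embeddings at all: it ranks texts by how many words they share with the query. The class name and its `invoke` method mirror the calling convention used in this notebook, but this is a plain Python sketch, not a LangChain class.

```python
import re


class KeywordRetriever:
    """Hypothetical retriever: ranks texts by the number of words shared with the query."""

    def __init__(self, texts, k=2):
        self.texts = list(texts)
        self.k = k

    @staticmethod
    def _tokens(text):
        # Lowercase alphanumeric tokens; punctuation is ignored.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def invoke(self, query):
        query_tokens = self._tokens(query)
        scored = [(len(query_tokens & self._tokens(t)), t) for t in self.texts]
        # sorted() is stable, so ties keep their original order.
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [text for score, text in ranked[: self.k] if score > 0]


retriever_demo = KeywordRetriever(
    [
        "LLM-profiled policy generates actions",
        "Evaluators provide feedback on actions",
        "Chroma stores embedding vectors",
    ]
)
print(retriever_demo.invoke("Which components produce actions?"))
```

Anything that maps a query string to a list of documents can play the retriever role in the chain; the vector-store-backed retriever above is just one such implementation.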

In [15]:
llm = ChatOpenAI(model=MODEL, temperature=0)

In [26]:
# source: https://python.langchain.com/v0.2/docs/tutorials/pdf_qa/#question-answering-with-rag

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

prompt

ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])

In [35]:
question_answer_chain = create_stuff_documents_chain(llm, prompt)

question_answer_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), config={'run_name': 'format_inputs'})
| ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])
| ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x12bf88150>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x12bff0890>, model_name='gpt-4o-mini', temperature=0.0, openai_api_key=SecretStr('**********'), openai_api_base='https://api.openai.com/v1', openai_proxy='')
| StrOutputParser()

The `create_stuff_documents_chain` function [returns an LCEL runnable](https://arc.net/l/quote/bnsztwth), which is why it can be passed directly to `create_retrieval_chain` below.
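
"Stuffing" means concatenating the text of all retrieved documents and placing the result in the prompt's `{context}` slot. The following plain-Python sketch shows that step in isolation. The function names are hypothetical, the template is shortened, and the `"\n\n"` separator is an assumption about a sensible default rather than a statement about the library's internals.

```python
SYSTEM_TEMPLATE = (
    "Use the following pieces of retrieved context to answer the question."
    "\n\n"
    "{context}"
)


def stuff_documents(page_contents, separator="\n\n"):
    """Join the text of every retrieved chunk into one context string."""
    return separator.join(page_contents)


def build_messages(question, page_contents):
    """Fill the system template with the stuffed context and add the user question."""
    context = stuff_documents(page_contents)
    return [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]


messages = build_messages("What are LMPCs?", ["first chunk", "second chunk"])
for role, content in messages:
    print(f"[{role}]\n{content}\n")
```

Because every retrieved chunk goes into a single prompt, this approach only works while the combined text fits in the model's context window.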

In [20]:
query = "According to this paper what are these reusable LLM-profiled components?"

In [28]:
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

results = rag_chain.invoke({"input": query})

results

{'input': 'According to this paper what are these reusable LLM-profiled components?',
 'context': [Document(metadata={'page': 0, 'source': './assets-resources/paper-llm-components.pdf'}, page_content='A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Lar

In [29]:
from IPython.display import Markdown

final_answer = results["answer"]

Markdown(final_answer)

Reusable LLM-Profiled Components (LMPCs) include various task-agnostic components commonly used across different workflows, such as LLM-Profiled Policy (glmpolicy), LLM-Profiled Evaluators (glmeval), and LLM-Profiled Dynamic Models (glmdynamic). These components leverage the internal knowledge and reasoning abilities of LLMs to generate actions, provide feedback, and predict changes in the environment. The survey emphasizes the importance of these components in simplifying complex frameworks and facilitating the development of LLM-based agents.

In [30]:
query_summary = "Write a simple bullet points summary about this paper"

# Each invoke call is independent: no chat history is passed to the chain here
output = rag_chain.invoke({"input": query_summary})

Markdown(output["answer"])

- The paper discusses the integration of task-specific LMPCs and non-LLM components to enhance agentic workflows.
- It focuses on common LLM-profiled components while omitting memory design and peripheral component integration.
- Various workflows, such as Tree-of-Thoughts and RAP, are explored, highlighting their search strategies and feedback-learning mechanisms.

The final output is easy to verify: the metadata printed below shows that the context chunks for this answer came from pages 0, 5, 7, and 16 of the source PDF (page indices as reported by the loader).

In [34]:
for doc in output['context']:
    print(doc.metadata)

{'page': 7, 'source': './assets-resources/paper-llm-components.pdf'}
{'page': 0, 'source': './assets-resources/paper-llm-components.pdf'}
{'page': 5, 'source': './assets-resources/paper-llm-components.pdf'}
{'page': 16, 'source': './assets-resources/paper-llm-components.pdf'}
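
When presenting sources to a reader, a sorted, de-duplicated page list is often easier to scan than raw metadata. The helper below is a small sketch of that; `cited_pages` is a hypothetical name, and the stand-in objects only mimic the `.metadata` attribute of the documents shown above. It assumes the `page` values are zero-based, as the title page being reported as page 0 suggests.

```python
from types import SimpleNamespace


def cited_pages(context_docs):
    """Return the sorted, unique page indices found in the documents' metadata."""
    return sorted({doc.metadata["page"] for doc in context_docs})


# Stand-ins with the same metadata as the output above, so the sketch runs on its own.
stand_in_context = [
    SimpleNamespace(metadata={"page": p, "source": "./assets-resources/paper-llm-components.pdf"})
    for p in (7, 0, 5, 16)
]

pages = cited_pages(stand_in_context)
print(pages)                   # → [0, 5, 7, 16]
print([p + 1 for p in pages])  # → [1, 6, 8, 17] (1-based, as shown in a PDF viewer)
```

With the real chain, `cited_pages(output["context"])` should work the same way, since each retrieved document exposes a `metadata` dictionary.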


Let's now dig deeper into RAG over a PDF and construct this RAG chain ourselves.

In [36]:


system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages([
    ('system', system_prompt),
    ('human', '{input}')
])

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain_from_docs = (
    {
        'input': lambda x: x['input'],
        'context': lambda x: format_docs(x['context']), 
    }
    | prompt
    | llm
    | StrOutputParser()
)

In [39]:
# passing the input query to the retriever
retrieve_docs = (lambda x: x['input']) | retriever

In [41]:
chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain_from_docs
)
chain

RunnableAssign(mapper={
  context: RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x11a1ced90>)
})
| RunnableAssign(mapper={
    answer: {
              input: RunnableLambda(...),
              context: RunnableLambda(...)
            }
            | ChatPromptTemplate(input_variables=['context', 'input'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.\n\n{context}")), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])
            | ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x12bf88150>, async_cli
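
The two chained `.assign(...)` calls can be read as "keep everything already in the dictionary and add one computed key". The plain-Python sketch below mimics that behavior to show how `input`, `context`, and `answer` accumulate. The `assign` function and the fake retrieve/answer steps are hypothetical stand-ins, not LangChain code.

```python
def assign(inputs, **producers):
    """Return a copy of `inputs` with one new key per producer function."""
    new_values = {name: fn(inputs) for name, fn in producers.items()}
    return {**inputs, **new_values}


def fake_retrieve(state):
    # Stand-in for the retriever: uses only the question.
    return [f"chunk matching: {state['input']}"]


def fake_answer(state):
    # Stand-in for prompt | llm | parser: sees both the question and the context.
    return f"answer built from {len(state['context'])} chunk(s)"


state = {"input": "What are LMPCs?"}
state = assign(state, context=fake_retrieve)  # adds 'context'
state = assign(state, answer=fake_answer)     # adds 'answer', can read 'context'
print(state)
```

The ordering matters: the second step can only format the context because the first step has already added it to the dictionary.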

In [42]:
query = "According to this paper what are these reusable LLM-profiled components?" 
chain.invoke({'input': query})

{'input': 'According to this paper what are these reusable LLM-profiled components?',
 'context': [Document(metadata={'page': 0, 'source': './assets-resources/paper-llm-components.pdf'}, page_content='A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Lar

Adding structured sources:

In [43]:
# source: https://python.langchain.com/v0.2/docs/how_to/qa_sources/
from typing import List

from langchain_core.runnables import RunnablePassthrough
from typing_extensions import Annotated, TypedDict


# Desired schema for response
class AnswerWithSources(TypedDict):
    """An answer to the question, with sources."""

    answer: str
    sources: Annotated[
        List[str],
        ...,
        "List of sources (author + year) used to answer the question",
    ]


# Our rag_chain_from_docs has the following changes:
# - add `.with_structured_output` to the LLM;
# - remove the output parser
rag_chain_from_docs = (
    {
        "input": lambda x: x["input"],
        "context": lambda x: format_docs(x["context"]),
    }
    | prompt
    | llm.with_structured_output(AnswerWithSources)
)

retrieve_docs = (lambda x: x["input"]) | retriever

chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain_from_docs
)

response = chain.invoke({"input": query})
response

{'input': 'According to this paper what are these reusable LLM-profiled components?',
 'context': [Document(metadata={'page': 0, 'source': './assets-resources/paper-llm-components.pdf'}, page_content='A Survey on LLM-Based Agents: Common Workflows and Reusable\nLLM-Profiled Components\nXinzhe Li\nSchool of IT, Deakin University, Australia\nlixinzhe@deakin.edu.au\nAbstract\nRecent advancements in Large Language Mod-\nels (LLMs) have catalyzed the development of so-\nphisticated frameworks for developing LLM-based\nagents. However, the complexity of these frame-\nworks r poses a hurdle for nuanced differentiation\nat a granular level, a critical aspect for enabling\nefficient implementations across different frame-\nworks and fostering future research. Hence, the\nprimary purpose of this survey is to facilitate a co-\nhesive understanding of diverse recently proposed\nframeworks by identifying common workflows and\nreusable LLM-Profiled Components (LMPCs).\n1 Introduction\nGenerative Lar
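
Assuming the structured call returns a dictionary matching `AnswerWithSources`, the `answer` entry of the response is itself a dictionary with `answer` and `sources` keys. The sketch below formats such a response for display. `render_answer` is a hypothetical helper, and the stand-in response uses placeholder strings rather than real model output.

```python
def render_answer(response):
    """Format the structured answer and its sources as Markdown-style text."""
    structured = response["answer"]
    lines = [structured["answer"], "", "Sources:"]
    lines += [f"- {source}" for source in structured["sources"]]
    return "\n".join(lines)


# Placeholder response with the same shape as the chain output (values are not real).
stand_in_response = {
    "input": "<question>",
    "context": [],
    "answer": {
        "answer": "<model answer text>",
        "sources": ["<author, year>", "<author, year>"],
    },
}

print(render_answer(stand_in_response))
```

In the notebook, `Markdown(render_answer(response))` would then display the answer with its source list, provided the response has the shape assumed here.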

# References

Below are notebooks and guides from OpenAI on search and embeddings, followed by related papers:
- https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb
- https://github.com/openai/openai-cookbook/blob/main/examples/Get_embeddings.ipynb
- https://github.com/openai/openai-cookbook/blob/main/examples/Code_search.ipynb
- https://github.com/openai/openai-cookbook/blob/main/examples/Customizing_embeddings.ipynb
- https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_Wikipedia_articles_for_search.ipynb
- https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
- [In-context learning abilities of ChatGPT models](https://arxiv.org/pdf/2303.18223.pdf)
- [Issue with long context](https://arxiv.org/pdf/2303.18223.pdf)