[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mongodb-developer/GenAI-Showcase/blob/main/notebooks/agents/agent_fireworks_ai_langchain_mongodb.ipynb)


## Install Libraries


In [145]:
!pip install langchainhub langchain-fireworks langchain-huggingface langchain-mongodb arxiv pymupdf datasets pymongo tqdm


Collecting langchainhub
  Downloading langchainhub-0.1.17-py3-none-any.whl.metadata (621 bytes)
Collecting types-requests<3.0.0.0,>=2.31.0.2 (from langchainhub)
  Downloading types_requests-2.32.0.20240523-py3-none-any.whl.metadata (1.8 kB)
Downloading langchainhub-0.1.17-py3-none-any.whl (4.8 kB)
Downloading types_requests-2.32.0.20240523-py3-none-any.whl (15 kB)
Installing collected packages: types-requests, langchainhub
Successfully installed langchainhub-0.1.17 types-requests-2.32.0.20240523


## Set Environment Variables


In [93]:
import getpass
import os

MONGODB_URI = getpass.getpass("Enter your MongoDB connection string:")

In [94]:
os.environ["FIREWORKS_API_KEY"] = getpass.getpass("Enter Fireworks API key:")

## Ingest Data into MongoDB Vector Database


In [95]:
import pandas as pd
from datasets import load_dataset

data = load_dataset("mongodb-eai/arxiv-embeddings")
dataset_df = pd.DataFrame(data["train"])



In [96]:
dataset_df.head()

Unnamed: 0,id,submitter,authors,title,comments,journal-ref,doi,report-no,categories,license,abstract,versions,update_date,authors_parsed,embedding
0,704.0001,Pavel Nadolsky,"C. Bal\'azs, E. L. Berger, P. M. Nadolsky, C.-...",Calculation of prompt diphoton production cros...,"37 pages, 15 figures; published version","Phys.Rev.D76:013009,2007",10.1103/PhysRevD.76.013009,ANL-HEP-PR-07-12,hep-ph,,A fully differential calculation in perturba...,"[{'version': 'v1', 'created': 'Mon, 2 Apr 2007...",1227657600000,"[[Balázs, C., ], [Berger, E. L., ], [Nadolsky,...","[0.2324569076, -0.894839108, -0.242858842, 0.1..."
1,704.0002,Louis Theran,Ileana Streinu and Louis Theran,Sparsity-certifying Graph Decompositions,To appear in Graphs and Combinatorics,,,,math.CO cs.CG,http://arxiv.org/licenses/nonexclusive-distrib...,"We describe a new algorithm, the $(k,\ell)$-...","[{'version': 'v1', 'created': 'Sat, 31 Mar 200...",1229126400000,"[[Streinu, Ileana, ], [Theran, Louis, ]]","[0.6949232221, 0.3588359952, 0.1817755997, 0.7..."
2,704.0003,Hongjun Pan,Hongjun Pan,The evolution of the Earth-Moon system based o...,"23 pages, 3 figures",,,,physics.gen-ph,,The evolution of Earth-Moon system is descri...,"[{'version': 'v1', 'created': 'Sun, 1 Apr 2007...",1200182400000,"[[Pan, Hongjun, ]]","[0.1294624656, 1.1964389086, 0.8928941488, -0...."
3,704.0004,David Callan,David Callan,A determinant of Stirling cycle numbers counts...,11 pages,,,,math.CO,,We show that a determinant of Stirling cycle...,"[{'version': 'v1', 'created': 'Sat, 31 Mar 200...",1179878400000,"[[Callan, David, ]]","[-0.0994227678, -0.364127785, 0.5390082002, -0..."
4,704.0005,Alberto Torchinsky,Wael Abu-Shammala and Alberto Torchinsky,From dyadic $\Lambda_{\alpha}$ to $\Lambda_{\a...,,"Illinois J. Math. 52 (2008) no.2, 681-689",,,math.CA math.FA,,In this paper we show how to compute the $\L...,"[{'version': 'v1', 'created': 'Mon, 2 Apr 2007...",1381795200000,"[[Abu-Shammala, Wael, ], [Torchinsky, Alberto, ]]","[0.0711007342, 0.5356642008, 0.5095595121, 0.4..."


In [97]:
from pymongo import MongoClient

# Initialize MongoDB python client
client = MongoClient(MONGODB_URI)

DB_NAME = "agent_demo"
COLLECTION_NAME = "knowledge"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "vector_index"
collection = client[DB_NAME][COLLECTION_NAME]

In [6]:
# Delete any existing records in the collection
collection.delete_many({})

# Data Ingestion
records = dataset_df.to_dict("records")
collection.insert_many(records)

print("Data ingestion into MongoDB completed")

Data ingestion into MongoDB completed
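
Before ingesting, it can help to confirm that every record carries an embedding of the length the vector index expects (`numDimensions` is 1024 in this notebook's index definition). The following is a minimal sketch under that assumption; `find_bad_embeddings` is a hypothetical helper written for illustration, not part of PyMongo or `datasets`.

```python
from typing import Any, Dict, List, Sequence

EXPECTED_DIMS = 1024  # assumption: matches numDimensions in the vector index definition


def find_bad_embeddings(
    records: Sequence[Dict[str, Any]],
    field: str = "embedding",
    expected_dims: int = EXPECTED_DIMS,
) -> List[int]:
    """Return positions of records whose embedding is missing or has the wrong length."""
    bad = []
    for i, rec in enumerate(records):
        vec = rec.get(field)
        if vec is None or len(vec) != expected_dims:
            bad.append(i)
    return bad


# Tiny illustrative records (3 dimensions instead of 1024 to keep the example short)
sample = [
    {"id": "a", "embedding": [0.1, 0.2, 0.3]},
    {"id": "b", "embedding": [0.1, 0.2]},
    {"id": "c"},
]
print(find_bad_embeddings(sample, expected_dims=3))  # prints [1, 2]
```

In the notebook this could be called as `find_bad_embeddings(records)` just before `collection.insert_many(records)`; an empty list means every record is safe to index.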


## Create Vector Search Index Definition

```
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1024,
      "similarity": "cosine"
    }
  ]
}
```
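
The definition above can be pasted into the Atlas UI, but it may also be possible to create the index from Python. The sketch below assumes PyMongo 4.7 or later (where `SearchIndexModel` accepts `type="vectorSearch"`) and an Atlas cluster that supports vector search. `build_vector_index_definition` and `create_vector_index` are hypothetical helper names, not library functions.

```python
from typing import Any, Dict


def build_vector_index_definition(
    path: str = "embedding", num_dimensions: int = 1024, similarity: str = "cosine"
) -> Dict[str, Any]:
    """Build the same index definition shown above as a Python dict."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


def create_vector_index(collection, name: str = "vector_index") -> str:
    """Create the index on an Atlas collection and return its name.

    Assumes PyMongo 4.7+ and an Atlas deployment; not exercised in this sketch.
    """
    # Imported lazily so the definition helper works without pymongo installed
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=build_vector_index_definition(),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model=model)


definition = build_vector_index_definition()
print(definition["fields"][0]["numDimensions"])  # prints 1024
```

With the notebook's variables, the call would be `create_vector_index(collection, ATLAS_VECTOR_SEARCH_INDEX_NAME)`. Index builds are asynchronous, so queries may return nothing until the index is ready.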


## Create MongoDB Vector Store Retriever


In [178]:
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_mongodb import MongoDBAtlasVectorSearch

embedding_model = HuggingFaceEmbeddings(model_name="mixedbread-ai/mxbai-embed-large-v1")

# Vector Store Creation
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
    connection_string=MONGODB_URI,
    namespace=DB_NAME + "." + COLLECTION_NAME,
    embedding=embedding_model,
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
    text_key="abstract",
)

retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 5})
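
To make `search_type="similarity"` with `k=5` concrete, here is a brute-force sketch of what a cosine-similarity top-k lookup computes. It is illustrative only: Atlas Vector Search uses an approximate nearest-neighbor index rather than scanning every document, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], docs: Sequence[Tuple[str, Sequence[float]]], k: int = 5
) -> List[str]:
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


docs = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(top_k([1.0, 0.0], docs, k=2))  # prints ['a', 'c']
```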



In [177]:
from langchain_fireworks import ChatFireworks

llm = ChatFireworks(
    model="accounts/fireworks/models/firefunction-v1", temperature=0.0, max_tokens=1024
)

## Create Agent Tools


In [179]:
from langchain.tools import tool
from langchain_community.document_loaders import ArxivLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough


@tool
def get_paper_metadata_from_arxiv(topic: str) -> list:
    """
    Fetch and return paper metadata for up to 5 arXiv papers matching the given topic, for example: Retrieval Augmented Generation.

    Args:
    topic (str): The topic to find papers for on arXiv.

    Returns:
    list: Metadata about the papers matching the topic.
    """
    docs = ArxivLoader(query=topic, load_max_docs=5).load()
    # Extract just the metadata from each document
    metadata = [doc.metadata for doc in docs]
    return metadata


@tool
def get_paper_summary_from_arxiv(id: str) -> str:
    """
    Fetch and return the summary for a single research paper from arXiv given the paper ID, for example: 1605.08386.

    Args:
    id (str): The paper ID.

    Returns:
    str: Summary of the paper.
    """
    doc = ArxivLoader(query=id, load_max_docs=1).get_summaries_as_docs()
    if len(doc) == 0:
        return "No summary found for this paper."
    return doc[0].page_content


@tool
def answer_questions_about_topics(query: str) -> str:
    """
    Answer questions about a given topic based on information in the knowledge base.

    Args:
    query (str): User query about a topic.

    Returns:
    str: Information about the topic.
    """
    retrieve = {
        "context": retriever
        | (lambda docs: "\n\n".join([d.page_content for d in docs])),
        "question": RunnablePassthrough(),
    }
    template = """Answer the question based only on the following context. If no context is provided, say I do not know: \
    {context}

    Question: {question}
    """
    # Defining the chat prompt
    prompt = ChatPromptTemplate.from_template(template)
    # Parse output as a string
    parse_output = StrOutputParser()
    # Retrieval chain
    retrieval_chain = retrieve | prompt | llm | parse_output

    print(retrieval_chain)

    answer = retrieval_chain.invoke(query)

    return answer

In [110]:
get_paper_metadata_from_arxiv.invoke("Retrieval Augmented Generation")

[{'Published': '2022-02-13',
  'Title': 'A Survey on Retrieval-Augmented Text Generation',
  'Authors': 'Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu',
  'Summary': 'Recently, retrieval-augmented text generation attracted increasing attention\nof the computational linguistics community. Compared with conventional\ngeneration models, retrieval-augmented text generation has remarkable\nadvantages and particularly has achieved state-of-the-art performance in many\nNLP tasks. This paper aims to conduct a survey about retrieval-augmented text\ngeneration. It firstly highlights the generic paradigm of retrieval-augmented\ngeneration, and then it reviews notable approaches according to different tasks\nincluding dialogue response generation, machine translation, and other\ngeneration tasks. Finally, it points out some important directions on top of\nrecent methods to facilitate future research.'},
 {'Published': '2024-05-12',
  'Title': 'DuetRAG: Collaborative Retrieval-Augmented Gene

In [111]:
get_paper_summary_from_arxiv.invoke("1808.09236")

'We determine the non-perturbatively renormalized axial current for O($a$)\nimproved lattice QCD with Wilson quarks. Our strategy is based on the chirally\nrotated Schr\\"odinger functional and can be generalized to other finite (ratios\nof) renormalization constants which are traditionally obtained by imposing\ncontinuum chiral Ward identities as normalization conditions. Compared to the\nlatter we achieve an error reduction up to one order of magnitude. Our results\nhave already enabled the setting of the scale for the $N_{\\rm f}=2+1$ CLS\nensembles [1] and are thus an essential ingredient for the recent $\\alpha_s$\ndetermination by the ALPHA collaboration [2]. In this paper we shortly review\nthe strategy and present our results for both $N_{\\rm f}=2$ and $N_{\\rm f}=3$\nlattice QCD, where we match the $\\beta$-values of the CLS gauge configurations.\nIn addition to the axial current renormalization, we also present precise\nresults for the renormalized local vector current.'

In [113]:
get_paper_summary_from_arxiv.invoke("808.09236")

'No summary found for this paper.'
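
The second call returns nothing because `808.09236` is not a well-formed identifier. A cheap format check before calling the tool can catch such typos early. The sketch below covers only new-style arXiv identifiers (`YYMM.NNNN` or `YYMM.NNNNN`, with an optional version suffix); old-style identifiers such as `hep-th/9901001` are deliberately out of scope. `looks_like_arxiv_id` is a hypothetical helper.

```python
import re

# New-style arXiv identifiers: four digits, a dot, then four or five digits,
# optionally followed by a version suffix such as v2.
ARXIV_ID_PATTERN = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def looks_like_arxiv_id(paper_id: str) -> bool:
    """Return True if the string has the shape of a new-style arXiv identifier."""
    return bool(ARXIV_ID_PATTERN.match(paper_id.strip()))


print(looks_like_arxiv_id("1808.09236"))  # prints True
print(looks_like_arxiv_id("808.09236"))   # prints False
```

A passing check only means the string is shaped like an identifier; the paper may still not exist, so the empty-result branch in the tool remains necessary.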

In [120]:
answer_questions_about_topics.invoke("What are partial cubes?")

"Partial cubes are isometric subgraphs of hypercubes. They are characterized by structures on a graph defined by means of semicubes, and Djokovi\\'{c}'s and Winkler's relations. These structures are employed in the paper to characterize bipartite graphs and partial cubes of arbitrary dimension."

In [116]:
tools = [
    get_paper_metadata_from_arxiv,
    get_paper_summary_from_arxiv,
    answer_questions_about_topics,
]

## Basic Agent


In [199]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools.render import render_text_description

system_message = f"""Answer the following questions as best you can.
You can answer directly if the user is greeting you or similar.
Otherwise, you have access to the following tools:

{render_text_description(tools)}
"""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_message),
        ("human", "{input}"),
        # Placeholders fill up a **list** of messages
        MessagesPlaceholder("agent_scratchpad"),
    ]
)

agent = create_tool_calling_agent(llm, tools, prompt)

agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
)

agent_executor.invoke({"input": "Give me papers on the topic prompt compression."})



> Entering new AgentExecutor chain...

Invoking: `get_paper_metadata_from_arxiv` with `{'topic': 'prompt compression'}`


[{'Published': '2024-03-30', 'Title': 'PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression', 'Authors': 'Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang', 'Summary': "Large language models (LLMs) have shown exceptional abilities for multiple\ndifferent natural language processing tasks. While prompting is a crucial tool\nfor LLM inference, we observe that there is a significant cost associated with\nexceedingly lengthy prompts. Existing attempts to compress lengthy prompts lead\nto sub-standard results in terms of readability and interpretability of the\ncompressed prompt, with a detrimental impact on prompt utility. To address\nthis, we propose PROMPT-SAW: Prompt compresSion via Relation AWare graphs, an\neffective strategy for prompt compressi

{'input': 'Give me papers on the topic prompt compression.',
 'output': 'Here are some papers on the topic of prompt compression:\n\n1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang. Published on 2024-03-30.\n\n2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu. Published on 2024-02-25.\n\n3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava. Published on 2023-10-10.\n\n4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen. Published on 2024-04-26.\n\n5. "Learnin

## Bonus: Without using `create_tool_calling_agent`


In [None]:
# from langchain.agents.output_parsers.tools import ToolsAgentOutputParser
# from langchain.agents.format_scratchpad.tools import (
#     format_to_tool_messages,
# )

# llm_with_tools = llm.bind_tools(tools)

# agent = (
#     RunnablePassthrough.assign(
#         agent_scratchpad=lambda x: format_to_tool_messages(x["intermediate_steps"])
#     )
#     | prompt
#     | llm_with_tools
#     | ToolsAgentOutputParser()
# )

# agent_executor = AgentExecutor(
#     agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
# )

# agent_executor.invoke({"input": "Give me papers on the topic prompt compression."})

## ReAct Agent


In [207]:
from langchain.agents import create_react_agent
from langchain import hub

prompt = hub.pull("hwchase17/react")
prompt.pretty_print()

Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
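
The format in this prompt is what the agent's output parser has to recognize on every turn: either an `Action` / `Action Input` pair or a `Final Answer`. The following simplified stand-in shows the idea. It is not LangChain's own parser, and `parse_react_step` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Matches "Action: <tool>" followed on a later line by "Action Input: <input>"
ACTION_RE = re.compile(r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL)


def parse_react_step(text: str) -> Tuple[str, Optional[str], str]:
    """Classify one model turn as ("final", None, answer) or ("action", tool, tool_input)."""
    if "Final Answer:" in text:
        return ("final", None, text.split("Final Answer:", 1)[1].strip())
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError("Output does not follow the ReAct format")
    return ("action", match.group(1).strip(), match.group(2).strip())


step = (
    "Thought: I should look up the summary.\n"
    "Action: get_paper_summary_from_arxiv\n"
    "Action Input: 1808.09236"
)
print(parse_react_step(step))
# prints ('action', 'get_paper_summary_from_arxiv', '1808.09236')
```

When the model's text matches neither shape, a real executor needs a recovery path, which is the role `handle_parsing_errors` plays in the `AgentExecutor`.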


In [None]:
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors="Check your output. Make an observation in order to determine whether or not you have the final answer.\
        If you do, use the exact characters `Final Answer` and exit.",
)
agent_executor.invoke({"input": "Give me the summary for the paper 1808.09236."})

## Adding Memory to the Agent


In [243]:
from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
from langchain.memory import ConversationBufferMemory


def get_session_history(session_id: str) -> MongoDBChatMessageHistory:
    return MongoDBChatMessageHistory(
        MONGODB_URI, session_id, database_name=DB_NAME, collection_name="history"
    )


# memory = ConversationBufferMemory(
#     memory_key="chat_history", chat_memory=get_session_history("my-session")
# )

## Tool Calling Agent with Memory


In [252]:
from langchain_core.runnables.history import RunnableWithMessageHistory

system_message = f"""Answer the following questions as best you can.
You can answer directly if the user is greeting you or similar.
Otherwise, you have access to the following tools:

{render_text_description(tools)}
"""

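prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_message),
        # Prior turns injected by RunnableWithMessageHistory (history_messages_key)
        MessagesPlaceholder("chat_history", optional=True),
        ("human", "{input}"),
        # Placeholders fill up a **list** of messages
        MessagesPlaceholder("agent_scratchpad"),
    ]
)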

agent = create_tool_calling_agent(llm, tools, prompt)

# agent_executor = AgentExecutor(
#     agent=agent,
#     tools=tools,
#     verbose=True,
#     handle_parsing_errors=True,
#     memory=memory,
# )
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    # Factory that returns the MongoDB-backed chat history for the session ID
    # passed in via config={"configurable": {"session_id": ...}}
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
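
The history factory is expected to hand back a separate message history for each session ID, so that conversations do not leak into one another. The sketch below illustrates that contract with a plain in-memory dictionary. It is a hypothetical stand-in for `MongoDBChatMessageHistory` and uses no LangChain classes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# In-memory stand-in for a per-session message store (assumption for illustration;
# the notebook persists messages in MongoDB instead).
_store: Dict[str, List[Tuple[str, str]]] = defaultdict(list)


def get_history(session_id: str) -> List[Tuple[str, str]]:
    """Return the (role, text) messages recorded for one session."""
    return _store[session_id]


def add_turn(session_id: str, human: str, ai: str) -> None:
    """Append one human/AI exchange to the session's history."""
    history = get_history(session_id)
    history.append(("human", human))
    history.append(("ai", ai))


add_turn("session-1", "Find papers on prompt compression", "Here are five papers...")
print(len(get_history("session-1")))  # prints 2
print(get_history("session-2"))       # prints []
```

If the factory ignores its `session_id` argument, every conversation reads and writes the same history regardless of the ID supplied in `config`.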

In [254]:
agent_with_chat_history.invoke(
    {"input": "Get me a list of research papers on the topic Prompt Compression"},
    config={"configurable": {"session_id": "session-2"}},
)




> Entering new AgentExecutor chain...





Invoking: `get_paper_metadata_from_arxiv` with `{'topic': 'Prompt Compression'}`



[{'Published': '2024-03-30', 'Title': 'PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression', 'Authors': 'Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang', 'Summary': "Large language models (LLMs) have shown exceptional abilities for multiple\ndifferent natural language processing tasks. While prompting is a crucial tool\nfor LLM inference, we observe that there is a significant cost associated with\nexceedingly lengthy prompts. Existing attempts to compress lengthy prompts lead\nto sub-standard results in terms of readability and interpretability of the\ncompressed prompt, with a detrimental impact on prompt utility. To address\nthis, we propose PROMPT-SAW: Prompt compresSion via Relation AWare graphs, an\neffective strategy for prompt compression over task-agnostic and task-aware\nprompts. PROMPT-SAW uses the prompt's textual information to build a graph,\nlater extracts key information ele



Here are some research papers on the topic Prompt Compression:

1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang

2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu

3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava

4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen

5. "Learning to Compress Prompt in Natural Language Formats" by Yu-Neng Chuang, Tianwei Xing, Chia-Yuan Chang, Zirui Liu, Xun Chen, Xia Hu

> Finished chain.


{'input': 'Get me a list of research papers on the topic Prompt Compression',
 'chat_history': [],
 'output': 'Here are some research papers on the topic Prompt Compression:\n\n1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang\n\n2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu\n\n3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava\n\n4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen\n\n5. "Learning to Compress Prompt in Natural Language Formats" by Yu-Neng

In [255]:
agent_with_chat_history.invoke(
    {"input": "WHat is the name of the first paper in that list?"},
    config={"configurable": {"session_id": "session-2"}},
)




> Entering new AgentExecutor chain...





Invoking: `get_paper_metadata_from_arxiv` with `{'topic': 'Retrieval Augmented Generation'}`



[{'Published': '2022-02-13', 'Title': 'A Survey on Retrieval-Augmented Text Generation', 'Authors': 'Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu', 'Summary': 'Recently, retrieval-augmented text generation attracted increasing attention\nof the computational linguistics community. Compared with conventional\ngeneration models, retrieval-augmented text generation has remarkable\nadvantages and particularly has achieved state-of-the-art performance in many\nNLP tasks. This paper aims to conduct a survey about retrieval-augmented text\ngeneration. It firstly highlights the generic paradigm of retrieval-augmented\ngeneration, and then it reviews notable approaches according to different tasks\nincluding dialogue response generation, machine translation, and other\ngeneration tasks. Finally, it points out some important directions on top of\nrecent methods to facilitate future research.'}, {'Published': '2024-05-12', 'Title': 'DuetRAG: Collaborative Retrieval-Augmented 



The first paper in the list is titled "A Survey on Retrieval-Augmented Text Generation" and was published on 2022-02-13.

> Finished chain.


{'input': 'WHat is the name of the first paper in that list?',
 'chat_history': [],
 'output': 'The first paper in the list is titled "A Survey on Retrieval-Augmented Text Generation" and was published on 2022-02-13.'}

## Agent Execution


In [247]:
agent_executor.invoke(
    {"input": "Get me a list of research papers on the topic Prompt Compression"},
)



> Entering new AgentExecutor chain...

Invoking: `get_paper_metadata_from_arxiv` with `{'topic': 'Prompt Compression'}`


[{'Published': '2024-03-30', 'Title': 'PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression', 'Authors': 'Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang', 'Summary': "Large language models (LLMs) have shown exceptional abilities for multiple\ndifferent natural language processing tasks. While prompting is a crucial tool\nfor LLM inference, we observe that there is a significant cost associated with\nexceedingly lengthy prompts. Existing attempts to compress lengthy prompts lead\nto sub-standard results in terms of readability and interpretability of the\ncompressed prompt, with a detrimental impact on prompt utility. To address\nthis, we propose PROMPT-SAW: Prompt compresSion via Relation AWare graphs, an\neffective strategy for prompt compressi

{'input': 'Get me a list of research papers on the topic Prompt Compression',
 'chat_history': '',
 'output': 'Here are some research papers on the topic Prompt Compression:\n\n1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang\n\n2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu\n\n3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava\n\n4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen\n\n5. "Learning to Compress Prompt in Natural Language Formats" by Yu-Neng

In [250]:
agent_executor.invoke({"input": "What is the name of the first paper in the list?"})



> Entering new AgentExecutor chain...

Invoking: `get_paper_metadata_from_arxiv` with `{'topic': 'Retrieval Augmented Generation'}`


[{'Published': '2022-02-13', 'Title': 'A Survey on Retrieval-Augmented Text Generation', 'Authors': 'Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu', 'Summary': 'Recently, retrieval-augmented text generation attracted increasing attention\nof the computational linguistics community. Compared with conventional\ngeneration models, retrieval-augmented text generation has remarkable\nadvantages and particularly has achieved state-of-the-art performance in many\nNLP tasks. This paper aims to conduct a survey about retrieval-augmented text\ngeneration. It firstly highlights the generic paradigm of retrieval-augmented\ngeneration, and then it reviews notable approaches according to different tasks\nincluding dialogue response generation, machine translation, and other\ngeneration tasks. Finally, it points out some i

{'input': 'What is the name of the first paper in the list?',
 'chat_history': 'Human: Get me a list of research papers on the topic Prompt Compression\nAI: Here are some research papers on the topic Prompt Compression:\n\n1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang\n\n2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu\n\n3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava\n\n4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen\n\n5. "Learning to Compress 

In [249]:
agent_executor.invoke({"input": "Are you sure?"})



> Entering new AgentExecutor chain...
Yes, I am sure.

> Finished chain.


{'input': 'Are you sure?',
 'chat_history': 'Human: Get me a list of research papers on the topic Prompt Compression\nAI: Here are some research papers on the topic Prompt Compression:\n\n1. "PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression" by Muhammad Asif Ali, Zhengping Li, Shu Yang, Keyuan Cheng, Yang Cao, Tianhao Huang, Lijie Hu, Lu Yu, Di Wang\n\n2. "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression" by Xinze Li, Zhenghao Liu, Chenyan Xiong, Shi Yu, Yukun Yan, Shuo Wang, Ge Yu\n\n3. "Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt" by Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava\n\n4. "PromptCIR: Blind Compressed Image Restoration with Prompt Learning" by Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen\n\n5. "Learning to Compress Prompt in Natural Language Formats"