In [1]:
!pip install langchain openai tiktoken cohere arxiv duckduckgo-search langchainhub -qU

In [2]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

## LangChain ReAct Agent

### Base LLM Powering RAG

In [3]:
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model="gpt-4-1106-preview")


In [4]:
from langchain.schema import HumanMessage

messages = [
    HumanMessage(
        content="Hello, how are you?"
    )
]

In [5]:
llm.invoke(messages)

AIMessage(content="Hello! I'm just a computer program, so I don't have feelings, but I'm here and ready to help you with any questions or tasks you have. How can I assist you today?", response_metadata={'token_usage': {'completion_tokens': 40, 'prompt_tokens': 13, 'total_tokens': 53}, 'model_name': 'gpt-4-1106-preview', 'system_fingerprint': 'fp_89f117abc5', 'finish_reason': 'stop', 'logprobs': None}, id='run-12f35913-0abc-42c5-aeb2-094bddcf2400-0')

### Tool Belt

In [6]:
from langchain.agents import load_tools

tools = load_tools(["arxiv", "ddg-search"], llm=llm)

### LCEL Agent Construction

In [7]:
from langchain import hub

prompt = hub.pull("hwchase17/react-json")

In [8]:
prompt

ChatPromptTemplate(input_variables=['agent_scratchpad', 'input', 'tool_names', 'tools'], metadata={'lc_hub_owner': 'hwchase17', 'lc_hub_repo': 'react-json', 'lc_hub_commit_hash': '669cf4d6988c3b8994a8189edb3891e07948e1c0abfd500823914548c53afa7c'}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['tool_names', 'tools'], template='Answer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nThe way you use the tools is by specifying a json blob.\nSpecifically, this json should have a `action` key (with the name of the tool to use) and a `action_input` key (with the input to the tool going here).\n\nThe only values that should be in the "action" field are: {tool_names}\n\nThe $JSON_BLOB should only contain a SINGLE action, do NOT return a list of multiple actions. Here is an example of a valid $JSON_BLOB:\n\n```\n{{\n  "action": $TOOL_NAME,\n  "action_input": $INPUT\n}}\n```\n\nALWAYS use the following format:\n\nQuestio

In [9]:
from langchain.tools.render import render_text_description

prompt = prompt.partial(
    tools=render_text_description(tools),
    tool_names=", ".join([t.name for t in tools]),
)
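
For intuition, the two values passed to `prompt.partial` are plain strings. The sketch below mimics them with a hypothetical `FakeTool` class. It assumes `render_text_description` emits one `name: description` line per tool, which may differ across LangChain versions.

```python
from dataclasses import dataclass


@dataclass
class FakeTool:
    """Hypothetical stand-in for a LangChain tool (name and description only)."""
    name: str
    description: str


def render_tools_as_text(tools):
    # Assumed format: one "name: description" line per tool.
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


def render_tool_names(tools):
    # Comma-separated names, matching the tool_names expression in the cell above.
    return ", ".join(t.name for t in tools)


fake_tools = [
    FakeTool("arxiv", "Search arXiv for papers."),
    FakeTool("duckduckgo_search", "Search the web."),
]

print(render_tools_as_text(fake_tools))
print(render_tool_names(fake_tools))
```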

In [10]:
llm_with_stop = llm.bind(stop=["\nObservation"])
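
Binding the stop sequence `"\nObservation"` makes the model halt before it can write an observation of its own, so the observation comes from the real tool instead. The sketch below emulates that effect on an already generated string. `truncate_at_stop` is a hypothetical helper for illustration, not part of LangChain or the OpenAI client.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


# Made-up model output that continues past the action and invents an observation.
raw = (
    "Thought: I should search.\n"
    'Action: {"action": "arxiv"}\n'
    "Observation: (made up by the model)"
)

print(truncate_at_stop(raw, ["\nObservation"]))
```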

In [11]:
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents.output_parsers import ReActJsonSingleInputOutputParser

agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_log_to_str(x["intermediate_steps"]),
    }
    | prompt
    | llm_with_stop
    | ReActJsonSingleInputOutputParser()
)
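
The last stage of the chain turns the model's text into either a tool call or a final answer. The following is a simplified sketch of that idea, not the actual `ReActJsonSingleInputOutputParser` implementation. It assumes the action is a fenced JSON blob and that the prompt uses a `Final Answer:` marker. `parse_react_json` is a hypothetical name.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FINAL_ANSWER_PREFIX = "Final Answer:"  # assumed marker from the ReAct prompt

# Matches a fenced block (optionally tagged "json") that wraps one JSON object.
ACTION_PATTERN = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def parse_react_json(text):
    """Return ("action", tool, tool_input) or ("finish", answer)."""
    match = ACTION_PATTERN.search(text)
    if match:
        blob = json.loads(match.group(1))
        return ("action", blob["action"], blob.get("action_input", ""))
    if FINAL_ANSWER_PREFIX in text:
        return ("finish", text.split(FINAL_ANSWER_PREFIX, 1)[1].strip())
    raise ValueError(f"Could not parse LLM output: {text!r}")


sample = (
    "Thought: I should look this up.\n\n"
    "Action:\n"
    f"{FENCE}json\n"
    '{\n  "action": "arxiv",\n  "action_input": "Retrieval Augmented Generation"\n}\n'
    f"{FENCE}"
)

print(parse_react_json(sample))
print(parse_react_json("Thought: I know this.\nFinal Answer: 42"))
```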

In [12]:
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)

In [13]:
agent_executor.invoke(
    {
        "input" : "What is Retrieval Augmented Generation?"
    }
)



> Entering new AgentExecutor chain...
Thought: Retrieval Augmented Generation (RAG) is a concept in the field of natural language processing and machine learning. It refers to a methodology where a generative model is augmented with a retrieval component to enhance its ability to generate text. The retrieval component can fetch relevant information from a large corpus of text, which the generative model can then use to produce more accurate, informative, or contextually relevant outputs. To provide a detailed and accurate explanation, I can search for scientific articles on arxiv.org that discuss Retrieval Augmented Generation.

Action:
```
{
  "action": "arxiv",
  "action_input": "Retrieval Augmented Generation"
}
```[0m[36;1m[1;3mPublished: 2022-02-13
Title: A Survey on Retrieval-Augmented Text Generation
Authors: Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
Summary: Recently, retrieval-augmented text generation attracted increasing attention
of the 

{'input': 'What is Retrieval Augmented Generation?',
 'output': 'Retrieval Augmented Generation (RAG) is a methodology in natural language processing where a generative model is augmented with a retrieval component. This retrieval component fetches relevant information from a large corpus of text, which the generative model then uses to produce more accurate, informative, or contextually relevant outputs. RAG has been applied to various tasks such as dialogue response generation and machine translation, and recent advancements have focused on improving the retrieval process through context tuning and corrective measures to ensure the relevance and quality of the information retrieved.'}

In [14]:
agent_executor.invoke(
    {
        "input" : "Who is the current QB of the Denver Broncos?"
    }
)



> Entering new AgentExecutor chain...
Thought: To find out the current quarterback of the Denver Broncos, I can use a DuckDuckGo search to get the most recent information.

Action:
```json
{
  "action": "duckduckgo_search",
  "action_input": "current QB of the Denver Broncos"
}
```[0m[33;1m[1;3mNew Broncos starting quarterback is married, a father and has backed up Tom Brady, Cam Newton, Jimmy Garoppolo - and Russell Wilson in his career. Credit: AP Photo/David Zalubowski. Denver ... ENGLEWOOD, Colo. -- The quarterback question is the bottom line of the Denver Broncos' offseason. Coach Sean Payton benched Russell Wilsonwith two games remaining in the 2023 season. The five-year ... Denver Broncos quarterback Teddy Bridgewater (5) throws against the Cincinnati Bengals during the first half of an NFL football game, Sunday, Dec. 19, 2021, in Denver. State of the Broncos roster: QB is the biggest issue and question. What the Denver Broncos do at the quarterback pos

{'input': 'Who is the current QB of the Denver Broncos?',
 'output': "Based on the most recent information available, Jarrett Stidham is believed to be the current starting quarterback for the Denver Broncos, having started the final two games of the 2023 season after Russell Wilson was benched. However, this information may change, and it is recommended to check the latest news or the team's official announcements for the most up-to-date information."}

## OpenAI Assistant API

Helper functions for the Assistant API, adapted from [this gist](https://gist.github.com/assafelovic/579822cd42d52d80db1e1c1ff82ffffd).

In [17]:
from duckduckgo_search import DDGS

def duckduckgo_search(query):
  with DDGS() as ddgs:
    results = [r for r in ddgs.text(query, max_results=5)]
    return "\n".join(result["body"] for result in results)

In [18]:
duckduckgo_search("Who is the current captain of the Winnipeg Jets?")

"Lowry has been a part of the Jets organization since June 25, 2011, when he was selected in the third round, 67 th overall, after putting up 37 points in 36 games with the Swift Current Broncos of ...\nAdam Lowry, who has been a Jet since 2011 when he was drafted 67th overall, is the new captain of the NHL team — its third since relocating to Winnipeg from Atlanta in 2011. Andrew Ladd served ...\nLowry will follow Andrew Ladd and Blake Wheeler to serve as the third captain of the new Winnipeg Jets franchise. - Sep 12, 2023. After a season without a captain, the Winnipeg Jets have named ...\nWinnipeg played without a captain last season after stripping Blake Wheeler of the title in September. Lowry is entering his 10th season with the Jets, who drafted him in the third round in 2011.\nThe Winnipeg Jets have officially made the decision on which player will wear the captain's 'C' for the 2023-24 season, and hopefully onward. That player is 30-year-old centre Adam Lowry."

In [20]:
import arxiv

def arxiv_search(query):
  results = []

  search = arxiv.Search(
      query = query,
      max_results = 5,
      sort_by = arxiv.SortCriterion.Relevance
  )

  for result in arxiv.Client().results(search):
    results.append(result)

  return "\n".join(result.summary for result in results)

In [21]:
arxiv_search("Retrieval Augmented Generation?")

"Existing research on response generation for chatbot focuses on \\textbf{First\nResponse Generation} which aims to teach the chatbot to say the first response\n(e.g. a sentence) appropriate to the conversation context (e.g. the user's\nquery). In this paper, we introduce a new task \\textbf{Second Response\nGeneration}, termed as Improv chat, which aims to teach the chatbot to say the\nsecond response after saying the first response with respect the conversation\ncontext, so as to lighten the burden on the user to keep the conversation\ngoing. Specifically, we propose a general learning based framework and develop\na retrieval based system which can generate the second responses with the\nusers' query and the chatbot's first response as input. We present the approach\nto building the conversation corpus for Improv chat from public forums and\nsocial networks, as well as the neural networks based models for response\nmatching and ranking. We include the preliminary experiments and resu

In [22]:
ddg_function = {
    "name" : "duckduckgo_search",
    "description" : "Answer non-technical questions. Do not use this tool for questions about Machine Learning.",
    "parameters" : {
        "type" : "object",
        "properties" : {
            "query" : {
                "type" : "string",
                "description" : "The search query to use. For example: 'Who is the current Goalie of the Colorado Avalanche?'"
            }
        },
        "required" : ["query"]
    }
}

In [23]:
arxiv_function = {
    "name" : "arxiv_query",
    "description" : "Answer technical questions about the Machine Learning domain.",
    "parameters" : {
        "type" : "object",
        "properties" : {
            "query" : {
                "type" : "string",
                "description" : "The search query to use. For example: 'Retrieval Augmented Generation'"
            }
        },
        "required" : ["query"]
    }
}
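
A stray character in a key, such as `"type:"` instead of `"type"`, is easy to miss in hand-written function schemas because a Python dict accepts any key. A small pre-flight check can catch this before the schema is sent to the API. `find_schema_problems` below is a hypothetical helper and only covers the two checks shown. It is not a full JSON Schema validator.

```python
def find_schema_problems(function_spec):
    """Return a list of human-readable problems found in a function spec."""
    problems = []
    params = function_spec.get("parameters", {})
    properties = params.get("properties", {})

    # Each property should declare its JSON type under the exact key "type".
    for name, prop in properties.items():
        if "type" not in prop:
            problems.append(f"property '{name}' is missing a 'type' key")

    # Each required parameter should be defined under "properties".
    for name in params.get("required", []):
        if name not in properties:
            problems.append(f"required parameter '{name}' is not defined in properties")

    return problems


good_spec = {
    "name": "example_search",  # illustrative name only
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "The search query."}},
        "required": ["query"],
    },
}

bad_spec = {
    "name": "example_search",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type:": "string", "description": "The search query."}},
        "required": ["query"],
    },
}

print(find_schema_problems(good_spec))
print(find_schema_problems(bad_spec))
```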

In [24]:
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Query Assistant",
    instructions="You are a personal assistant. Use the provided functions to answer questions.",
    tools=[
        {"type" : "function",
         "function" : arxiv_function
        },
        {"type": "function",
         "function" : ddg_function
        }
    ],
    model="gpt-4-1106-preview"
)

In [26]:
assistant_id = assistant.id

print(f"Assistant ID: {assistant_id}")

Assistant ID: asst_ClheFCt0y5DG270vcddQXPp4


In [27]:
thread = client.beta.threads.create()
user_input = "What is the capital of Ottawa?"
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=user_input,
)

In [28]:
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant_id,
)

In [29]:
import time

def wait_for_run_completion(thread_id, run_id):
    while True:
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        print(f"Current run status: {run.status}")
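        # Stop on every state that ends polling, so a cancelled or expired run cannot loop forever.
        if run.status in ['completed', 'failed', 'cancelled', 'expired', 'requires_action']: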
            return run
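
The polling loop above has no upper bound on how long it waits. A variant with a deadline can be sketched as follows. `poll_until_done` is a hypothetical helper that takes any zero-argument callable returning a status string, so it can be tried without calling the API. The set of terminal statuses is an assumption based on the run states the Assistants API documents.

```python
import time

# Assumed set of statuses at which polling should stop.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "requires_action"}


def poll_until_done(fetch_status, timeout_s=60.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Run still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Simulated run that needs tool output on the third check; sleeping is disabled for the demo.
fake_statuses = iter(["queued", "in_progress", "requires_action"])
print(poll_until_done(lambda: next(fake_statuses), sleep=lambda _: None))
```

With the real client, `fetch_status` could be a lambda that retrieves the run and returns `run.status`.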

In [30]:
run = wait_for_run_completion(thread.id, run.id)

Current run status: requires_action


In [31]:
import json

def submit_tool_outputs(thread_id, run_id, tools_to_call):
    tool_output_array = []
    for tool in tools_to_call:
        output = None
        tool_call_id = tool.id
        function_name = tool.function.name
        function_args = tool.function.arguments

        if function_name == "duckduckgo_search":
            print("Consulting Duck Duck Go...")
            output = duckduckgo_search(query=json.loads(function_args)["query"])

        if function_name == "arxiv_query":
            print("Consulting Arxiv...")
            output = arxiv_search(query=json.loads(function_args)["query"])

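        # Submit an entry for every requested tool call, with a placeholder when a tool returns nothing.
        tool_output_array.append({"tool_call_id": tool_call_id, "output": output or "No results found."})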

    print(tool_output_array)

    return client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id,
        run_id=run_id,
        tool_outputs=tool_output_array
    )
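
As more tools are added, the chain of `if` checks in `submit_tool_outputs` can be replaced by a lookup table from function name to callable. The sketch below shows that pattern with stub objects in place of real tool calls. `build_tool_outputs` and the `echo` tool are hypothetical, and the stubs only imitate the attributes used above (`id`, `function.name`, `function.arguments`).

```python
import json
from types import SimpleNamespace


def build_tool_outputs(tool_calls, registry):
    """Produce one output entry per tool call, using a name -> callable registry."""
    outputs = []
    for call in tool_calls:
        handler = registry.get(call.function.name)
        if handler is None:
            output = f"Unknown tool: {call.function.name}"
        else:
            # Arguments arrive as a JSON string and are passed as keyword arguments.
            args = json.loads(call.function.arguments)
            output = handler(**args) or "No results found."
        outputs.append({"tool_call_id": call.id, "output": output})
    return outputs


def fake_call(call_id, name, arguments):
    # Stub that imitates the shape of a tool call object.
    return SimpleNamespace(id=call_id, function=SimpleNamespace(name=name, arguments=arguments))


registry = {"echo": lambda query: f"echo: {query}"}
calls = [
    fake_call("call_1", "echo", '{"query": "hi"}'),
    fake_call("call_2", "missing_tool", '{"query": "hi"}'),
]

print(build_tool_outputs(calls, registry))
```

In the notebook's setting, the registry would map `"duckduckgo_search"` and `"arxiv_query"` to the two search helpers defined earlier.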

In [32]:
if run.status == 'requires_action':
    run = submit_tool_outputs(thread.id, run.id, run.required_action.submit_tool_outputs.tool_calls)
    run = wait_for_run_completion(thread.id, run.id)

Consulting Duck Duck Go...
[{'tool_call_id': 'call_7TmXcXzGYmXxsbrLQul8DNlg', 'output': "Ottawa is the capital city of Canada.It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa River and the Rideau River.Ottawa borders Gatineau, Quebec, and forms the core of the Ottawa-Gatineau census metropolitan area (CMA) and the National Capital Region (NCR). As of 2021, Ottawa had a city population of 1,017,449 and a metropolitan population of ...\nOttawa, city, capital of Canada, located in southeastern Ontario. In the eastern extreme of the province, Ottawa is situated on the Ottawa River across from Gatineau, Quebec, at the confluence of the Ottawa, Gatineau, and Rideau rivers. The Ottawa River was a key factor in the city's settlement.\nOttawa, Ontario, incorporated as a city in 1855, population 1,017,449 (2021 census), 934,243 (2016 census).The City of Ottawa is the capital of Canada and is located on the Ottawa River on Ontario's eastern boundar

In [33]:
def print_messages_from_thread(thread_id):
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    for msg in messages:
        print(f"{msg.role}: {msg.content[0].text.value}")
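
The thread listing in this notebook comes back newest first, which is why the assistant's reply is printed before the user's question. A chronological transcript can be sketched by reversing the list and keeping only text parts. `format_thread` and `fake_message` are hypothetical helpers, and the stubs imitate only the attributes used above (`role`, `content[i].text.value`) plus an assumed `type` field.

```python
from types import SimpleNamespace


def fake_message(role, text):
    # Stub that imitates a thread message holding a single text part.
    part = SimpleNamespace(type="text", text=SimpleNamespace(value=text))
    return SimpleNamespace(role=role, content=[part])


def format_thread(messages_newest_first):
    """Return the conversation as 'role: text' lines, oldest message first."""
    lines = []
    for msg in reversed(list(messages_newest_first)):
        # Skip non-text parts (for example images) instead of failing on them.
        texts = [p.text.value for p in msg.content if getattr(p, "type", None) == "text"]
        lines.append(f"{msg.role}: {' '.join(texts)}")
    return "\n".join(lines)


listing = [
    fake_message("assistant", "Ottawa is the capital city of Canada."),
    fake_message("user", "What is the capital of Ottawa?"),
]

print(format_thread(listing))
```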

In [34]:
print_messages_from_thread(thread.id)

assistant: Ottawa itself is the capital city of Canada and does not have a capital. It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa and Rideau rivers. Ottawa also borders Gatineau, Quebec, and forms the core of the Ottawa-Gatineau census metropolitan area and the National Capital Region.
user: What is the capital of Ottawa?


In [35]:
def use_assistant(query):
  thread = client.beta.threads.create()

  message = client.beta.threads.messages.create(
      thread_id=thread.id,
      role="user",
      content=query,
  )

  print("Creating Assistant ")

  run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant_id,
  )

  print("Querying OpenAI Assistant Thread.")

  run = wait_for_run_completion(thread.id, run.id)

  if run.status == 'requires_action':
    run = submit_tool_outputs(thread.id, run.id, run.required_action.submit_tool_outputs.tool_calls)
    run = wait_for_run_completion(thread.id, run.id)

  print_messages_from_thread(thread.id)

In [36]:
use_assistant("What is QLoRA?")

Creating Assistant 
Querying OpenAI Assistant Thread.
Current run status: in_progress
Current run status: in_progress
Current run status: requires_action
Consulting Arxiv...
[{'tool_call_id': 'call_WMktx38kZjaVPgE3i6hmxRjW', 'output': "We present QLoRA, an efficient finetuning approach that reduces memory usage\nenough to finetune a 65B parameter model on a single 48GB GPU while preserving\nfull 16-bit finetuning task performance. QLoRA backpropagates gradients through\na frozen, 4-bit quantized pretrained language model into Low Rank\nAdapters~(LoRA). Our best model family, which we name Guanaco, outperforms all\nprevious openly released models on the Vicuna benchmark, reaching 99.3% of the\nperformance level of ChatGPT while only requiring 24 hours of finetuning on a\nsingle GPU. QLoRA introduces a number of innovations to save memory without\nsacrificing performance: (a) 4-bit NormalFloat (NF4), a new data type that is\ninformation theoretically optimal for normally distributed weig

In [39]:
print_messages_from_thread(thread.id)

assistant: Ottawa itself is the capital city of Canada and does not have a capital. It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa and Rideau rivers. Ottawa also borders Gatineau, Quebec, and forms the core of the Ottawa-Gatineau census metropolitan area and the National Capital Region.
user: What is the capital of Ottawa?


In [40]:
use_assistant("What is a meme?")

Creating Assistant 
Querying OpenAI Assistant Thread.
Current run status: in_progress
Current run status: requires_action
Consulting Duck Duck Go...
[{'tool_call_id': 'call_XG1KGzf7hS2OU7DXinK9E2Ud', 'output': 'A meme is an item or genre of items that is spread widely online, such as a captioned picture or video, or an idea, behavior, style, or usage that spreads from person to person within a culture. The word comes from the Greek mimeme, meaning "imitation". Learn more about the origin, history, and examples of memes from the Merriam-Webster dictionary.\nA meme (/ m iː m /; MEEM) is an idea, behavior, or style that spreads by means of imitation from person to person within a culture and often carries symbolic meaning representing a particular phenomenon or theme. A meme acts as a unit for carrying cultural ideas, symbols, or practices, that can be transmitted from one mind to another through writing, speech, gestures, rituals, or other ...\nA meme is a cultural shorthand that portray