In [118]:
"""
Language models are transformer networks that take prompt text as input and produce response text as output.
Internally, they apply many matrix multiplications to the vector representation of the prompt to produce the response.

See Chapter 5 of https://arxiv.org/pdf/2303.12712.pdf

Prompting a language model and reading back its response is a widely used interaction pattern in applications.
However, language models have fundamental limitations: they lack up-to-date world knowledge,
they handle symbolic representations such as mathematics poorly, and they cannot execute code on their own.
More sophisticated approaches to using LLMs are evolving. One of them is to build agents powered by LLMs.

Agents use an LLM as a reasoning engine to decide which steps to take and in what order.
This differs from the fixed action sequences of traditional applications such as RAG pipelines, where
retrieval and generation always occur one after the other.

Agents perform actions through the interface provided by _tools_.
A tool is essentially a Python function that performs a specific task.
One example is a search tool that queries the internet for a given prompt.
Another is a retriever tool that queries a document index for a given prompt.
A variety of open-source tools, such as a calculator, a Slack integration, SparkSQL, and Wikipedia, are already available.

Under the hood, the language model is given a description of each tool.
If the user query requires a tool, the language model emits a request to call it;
the agent framework runs the tool function and feeds its output back to the model for the next step.

In this application, let us look at how to create an LLM agent that can search the web.

"""

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_community.callbacks import StreamlitCallbackHandler
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent, create_json_chat_agent
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from langchain_openai import OpenAI

import streamlit as st
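The loop described in the docstring can be sketched without any framework. The following is a minimal illustration, not LangChain's implementation: `SimpleTool`, `fake_search`, `fake_reasoner`, and `run_agent` are hypothetical names, and `fake_reasoner` stands in for the LLM with a hard-coded policy.

```python
from typing import Callable, NamedTuple


class SimpleTool(NamedTuple):
    name: str
    description: str  # this text is what the model would read to pick a tool
    func: Callable[[str], str]


def fake_search(query: str) -> str:
    # Stand-in for a real web search; returns a canned string.
    return f"(pretend search results for: {query})"


def fake_reasoner(question: str, observations: list[str]) -> tuple[str, str]:
    """Hypothetical stand-in for the LLM: returns (action, action_input)."""
    if not observations:
        return "search", question
    return "final", f"Answer based on {observations[-1]}"


def run_agent(question: str, tools: dict[str, SimpleTool], max_steps: int = 3) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        action, action_input = fake_reasoner(question, observations)
        if action == "final":
            return action_input
        # Run the chosen tool and feed its output back into the next step.
        observations.append(tools[action].func(action_input))
    return "Stopped: step limit reached."


demo_tools = {"search": SimpleTool("search", "Look things up on the web.", fake_search)}
print(run_agent("Who is Obama?", demo_tools))
```

The point of the sketch is the control flow: the reasoner decides the next action at every step, rather than the application following a fixed retrieve-then-generate sequence.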



In [89]:
model_service = os.getenv("MODEL_SERVICE_ENDPOINT", "http://localhost:8000/v1")

llm = ChatOpenAI(base_url=model_service, 
                 api_key="EMPTY",
                 streaming=True)

In [76]:
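# Alternative: use the hosted OpenAI completion model instead of the local endpoint.
# `os` and `OpenAI` (from langchain_openai) are already imported in the first cell.
os.environ["OPENAI_API_KEY"] = "***"

llm = OpenAI()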

In [90]:
wrapper = DuckDuckGoSearchAPIWrapper(region="us-en", time="d", max_results=1)
search = DuckDuckGoSearchResults(api_wrapper=wrapper)

In [91]:
search.run('Who is Obama?')

'[snippet: Students at Obama Academy will research World War I for the project. "This sort of project is an exciting new endeavor for us," said Matthew Falcone, president of Preservation Pittsburgh., title: World War I memorial at Pittsburgh\'s Obama Academy to be restored for ..., link: https://triblive.com/local/world-war-i-memorial-at-pittsburghs-obama-academy-to-be-restored-for-100th-anniversary/], [snippet: Ben Rhodes, who was a top adviser and speechwriter for U.S. President Barack Obama, argues that Russian President Vladimir Putin faces numerous, still-unknown threats despite his efforts to ..., title: Former Top Obama Adviser Foresees \'A Lot Of Challenges\' For Putin, link: https://www.rferl.org/a/ben-rhodes-obama-security-adviser-putin-ukraine-war-sanctions-economy-challenge/32837343.html], [snippet: The Obama/Biden approach does the opposite. Biden Goes After Indemnity Insurance Another insurance option the Biden administration wants to restrict is called indemnity insuranc
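`DuckDuckGoSearchResults.run` returns one flat string, as the output above shows. If structured results are needed downstream, a regular expression can split that string into records. This is a sketch that assumes the `[snippet: ..., title: ..., link: ...]` layout seen above; `parse_search_results` is a hypothetical helper, and it would break if a snippet itself contained `, title: `.

```python
import re

# Matches one "[snippet: ..., title: ..., link: ...]" record in the flat result string.
RESULT_PATTERN = re.compile(
    r"\[snippet: (?P<snippet>.*?), title: (?P<title>.*?), link: (?P<link>[^\]\s]+)\]",
    re.DOTALL,
)


def parse_search_results(raw: str) -> list[dict[str, str]]:
    """Split the flat search output into a list of snippet/title/link dicts."""
    return [match.groupdict() for match in RESULT_PATTERN.finditer(raw)]


sample = (
    "[snippet: A, b, title: T one, link: https://example.com/a], "
    "[snippet: C, title: T two, link: https://example.com/b]"
)
for record in parse_search_results(sample):
    print(record["title"], "->", record["link"])
```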

In [92]:
from langchain.agents import Tool
#from langchain.tools import DuckDuckGoSearchTool
search = DuckDuckGoSearchResults(api_wrapper=wrapper)

search_tool = Tool(
    name="Duck Duck Go Search",
    func=search.run,
    description="useful for when you need to search the internet to answer a question"
)
tools = [search_tool]

In [93]:
search_tool.func

<bound method BaseTool.run of DuckDuckGoSearchResults(api_wrapper=DuckDuckGoSearchAPIWrapper(region='us-en', safesearch='moderate', time='d', max_results=1, backend='api', source='text'))>

In [53]:

# # Get the prompt to use - you can modify this!
# prompt = hub.pull("hwchase17/openai-functions-agent")
# # Construct the ReAct agent
# from langchain.agents import create_openai_functions_agent

# agent = create_openai_functions_agent(llm, tools, prompt)
# print(prompt)

input_variables=['agent_scratchpad', 'input'] input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]], 'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')), MessagesPlaceholder(variable_name='chat_history', optional=True), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')), MessagesPlaceholder(variable_name='agent_

In [100]:
# `prompt` was pulled earlier with hub.pull("hwchase17/react")
print(prompt.template)

Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
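Before the model sees this template, the agent fills `{tools}` and `{tool_names}` from the tool list. The sketch below mimics that step with plain `str.format`; the `name: description` rendering is an assumption about roughly what LangChain produces, and `fill_react_prompt` is a hypothetical helper.

```python
# Shortened copy of the ReAct template printed above.
REACT_TEMPLATE = (
    "You have access to the following tools:\n\n"
    "{tools}\n\n"
    "Action: the action to take, should be one of [{tool_names}]\n\n"
    "Question: {input}\n"
    "Thought:{agent_scratchpad}"
)


def fill_react_prompt(template, tools, question, scratchpad=""):
    """Render (name, description) pairs into the template placeholders."""
    return template.format(
        tools="\n".join(f"{name}: {description}" for name, description in tools),
        tool_names=", ".join(name for name, _ in tools),
        input=question,
        agent_scratchpad=scratchpad,
    )


demo_tools = [
    ("Duck Duck Go Search",
     "useful for when you need to search the internet to answer a question"),
]
filled = fill_react_prompt(REACT_TEMPLATE, demo_tools, "Who is Obama?")
print(filled)
```

Because the model has to repeat the tool name exactly after `Action:`, a short name without spaces may be easier for smaller models to reproduce.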


In [138]:
# special tokens used by llama 2 chat
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

# create the system message
sys_msg = "<s>" + B_SYS + """
Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}""" + E_SYS
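For reference, the Llama 2 chat format places the system block inside the first `[INST] ... [/INST]` turn, followed by the user message. The sketch below builds a single-turn prompt that way; `build_llama2_prompt` is a hypothetical helper, and whether the leading `<s>` should be added by hand depends on the serving stack, which may insert it (or apply the whole chat template) itself.

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def build_llama2_prompt(system: str, user: str, add_bos: bool = True) -> str:
    """Wrap a system message and one user message in Llama 2 chat markers."""
    bos = "<s>" if add_bos else ""
    return f"{bos}{B_INST} {B_SYS}{system.strip()}{E_SYS}{user.strip()} {E_INST}"


print(build_llama2_prompt("Answer as briefly as you can.", "Who is Obama?"))
```

Under this layout, the tool instructions would go in the system part and the `Question: {input}` portion in the user part, rather than everything sitting inside `<<SYS>>`.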



In [139]:
sys_msg

'<s><<SYS>>\n\nAnswer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}\n<</SYS>>\n\n'

In [94]:
llm

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x1747b1d80>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x174794bb0>, openai_api_key=SecretStr('**********'), openai_api_base='http://localhost:8000/v1', openai_proxy='', streaming=True)

In [143]:
prompt.template = sys_msg

In [144]:
prompt

PromptTemplate(input_variables=['agent_scratchpad', 'input', 'tool_names', 'tools'], template='<s><<SYS>>\n\nAnswer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}\n<</SYS>>\n\n')

In [112]:
# Get the prompt to use - you can modify this!
#prompt = hub.pull("hwchase17/react")

# Get the prompt to use - you can modify this!
prompt_json = hub.pull("hwchase17/react-chat-json")

In [145]:
prompt

PromptTemplate(input_variables=['agent_scratchpad', 'input', 'tool_names', 'tools'], template='<s><<SYS>>\n\nAnswer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}\n<</SYS>>\n\n')

In [146]:
print(prompt.template)

<s><<SYS>>

Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
<</SYS>>




In [149]:


agent = create_react_agent(llm, tools, prompt)

#agent = create_json_chat_agent(llm, tools, prompt)
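A ReAct agent only works if the model's reply can be parsed back into either a tool call or a final answer. The sketch below shows the kind of parsing involved; it is a simplified illustration, not LangChain's actual parser, and `parse_react_output` is a hypothetical name.

```python
import re

# Captures the tool name after "Action:" and everything after "Action Input:".
ACTION_RE = re.compile(r"Action\s*:\s*(.*?)\s*\n\s*Action\s+Input\s*:\s*(.*)", re.DOTALL)


def parse_react_output(text: str) -> dict[str, str]:
    """Turn a ReAct-style reply into a finish or action record."""
    if "Final Answer:" in text:
        return {"type": "finish", "output": text.split("Final Answer:")[-1].strip()}
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"could not parse model output: {text!r}")
    return {
        "type": "action",
        "tool": match.group(1).strip(),
        "tool_input": match.group(2).strip(),
    }


reply = "I should search.\nAction: Duck Duck Go Search\nAction Input: news Feb 24 2024"
print(parse_react_output(reply))
```

A model that ignores the `Thought/Action/Action Input` layout produces text that fails this step, which is one plausible reason an agent stalls or loops with models not tuned for the format.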

In [125]:
search.run('Give news for Feb 24th, 2024?')

'[snippet: Live updates and analysis from POLITICO\'s Congress team, title: Inside Congress Live, Feb 29, 2024 - Live Updates - POLITICO, link: https://www.politico.com/live-updates/2024/02/29/congress], [snippet: Feb 29, 2024 In other news from Texas, prison officials executed 50-year-old Dallas native Ivan Abner Cantu Wednesday despite major doubts over his conviction and a high-profile campaign to save him., title: Headlines for February 29, 2024 | Democracy Now!, link: https://www.democracynow.org/2024/2/29/headlines], [snippet: Also: News about International Womens Day events, a deal on pancakes, and Crumbl\'s venture into pies. ... Feb. 24, 2024. ... Which candy do Utahns give on Valentine\'s Day?, title: Utah Eats: Eat with your hands at Ethiopian restaurant Oromian, link: https://www.sltrib.com/artsliving/food/2024/02/29/utah-eats-eat-with-your-hands-this/], [snippet: Today\'s early morning highlights from the major news organizations. Skip to content. Toggle navigation Donate.

In [126]:
tools[0]

Tool(name='Duck Duck Go Search', description='useful for when you need to search the internet to answer a question', func=<bound method BaseTool.run of DuckDuckGoSearchResults(api_wrapper=DuckDuckGoSearchAPIWrapper(region='us-en', safesearch='moderate', time='d', max_results=1, backend='api', source='text'))>)

In [150]:
prompt

PromptTemplate(input_variables=['agent_scratchpad', 'input', 'tool_names', 'tools'], template='<s><<SYS>>\n\nAnswer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}\n<</SYS>>\n\n')

In [152]:
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
agent_executor.invoke({"input": "Give news for Feb 24th, 2024?"})



> Entering new AgentExecutor chain...


KeyboardInterrupt: 
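The run above had to be interrupted by hand. `AgentExecutor` accepts `max_iterations` and `max_execution_time`, and `ChatOpenAI` accepts `timeout` and `max_retries`, which together can bound a run. The values below are illustrative assumptions, not tuned settings.

```python
# Illustrative limits; the numbers are assumptions, not recommendations.
llm_limits = {
    "timeout": 30,     # seconds allowed for a single request to the model endpoint
    "max_retries": 1,  # retry a failed request at most once
}
executor_limits = {
    "max_iterations": 5,       # stop after five Thought/Action rounds
    "max_execution_time": 60,  # wall-clock budget in seconds, checked between steps
    "handle_parsing_errors": True,
    "verbose": True,
}

# Intended use with the objects defined in this notebook:
#   llm = ChatOpenAI(base_url=model_service, api_key="EMPTY", streaming=True, **llm_limits)
#   agent_executor = AgentExecutor(agent=agent, tools=tools, **executor_limits)
print(sorted(executor_limits))
```

Since the execution-time budget is only checked between steps, the per-request `timeout` on the model is what guards against a single call that never returns.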

In [None]:
# Follow the calculator demo first. 
# Then try search

The built-in tools and agent prompts work with OpenAI models because they were designed around them. To use them with Llama models, the prompt has to be adapted to the Llama chat format, and the output parser class may need updating as well. The next step is to try a prompt similar to the one in the Pinecone example.

## Calculator demo example

In [154]:
pip install numexpr

Collecting numexpr
  Downloading numexpr-2.9.0-cp310-cp310-macosx_11_0_arm64.whl.metadata (7.9 kB)
Downloading numexpr-2.9.0-cp310-cp310-macosx_11_0_arm64.whl (91 kB)
   91.8/91.8 kB 4.6 MB/s eta 0:00:00
Installing collected packages: numexpr
Successfully installed numexpr-2.9.0
Note: you may need to restart the kernel to use updated packages.


In [163]:
from langchain.memory import ConversationBufferWindowMemory
from langchain.agents import load_tools

memory = ConversationBufferWindowMemory(
    memory_key="chat_history", k=5, return_messages=True, output_key="output"
)
tools = load_tools(["llm-math"], llm=llm)

In [180]:
from langchain.agents import AgentOutputParser
from langchain.agents.conversational_chat.prompt import FORMAT_INSTRUCTIONS
from langchain.output_parsers.json import parse_json_markdown
from langchain.schema import AgentAction, AgentFinish

class OutputParser(AgentOutputParser):
    def get_format_instructions(self) -> str:
        return FORMAT_INSTRUCTIONS

    def parse(self, text: str) -> AgentAction | AgentFinish:
        try:
            # this will work IF the text is a valid JSON with action and action_input
            response = parse_json_markdown(text)
            action, action_input = response["action"], response["action_input"]
            if action == "Final Answer":
                # this means the agent is finished so we call AgentFinish
                return AgentFinish({"output": action_input}, text)
            else:
                # otherwise the agent wants to use an action, so we call AgentAction
                return AgentAction(action, action_input, text)
        except Exception:
            # sometimes the agent will return a string that is not a valid JSON
            # often this happens when the agent is finished
            # so we just return the text as the output
            return AgentFinish({"output": text}, text)

    @property
    def _type(self) -> str:
        return "conversational_chat"

# initialize output parser for agent
parser = OutputParser()
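The parser above relies on the model wrapping a JSON object in a fenced block. The sketch below shows that extraction step on its own, using only the standard library; `extract_action` is a hypothetical helper and is simpler than LangChain's `parse_json_markdown`.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example contains no literal fence
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_action(text: str) -> tuple[str, str]:
    """Return (action, action_input) from a fenced or bare JSON reply."""
    match = FENCED_BLOCK.search(text)
    payload = match.group(1) if match else text
    data = json.loads(payload)  # raises json.JSONDecodeError on non-JSON text
    return data["action"], data["action_input"]


reply = f'  {FENCE}json\n{{\n"action": "Calculator",\n"action_input": "4**2.1"}}\n{FENCE}'
print(extract_action(reply))
```

Free-form replies raise an exception here, which is the case the `except` branch in `OutputParser.parse` turns into an `AgentFinish`.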

In [181]:
from langchain.agents import initialize_agent

# initialize agent
agent = initialize_agent(
    agent="chat-conversational-react-description",
    tools=tools,
    llm=llm,
    verbose=True,
    early_stopping_method="generate",
    memory=memory,
    agent_kwargs={"output_parser": parser}
)

In [182]:
agent.agent.llm_chain.prompt

ChatPromptTemplate(input_variables=['agent_scratchpad', 'chat_history', 'input'], input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]], 'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='Assistant is a large language model trained by OpenAI.\n\nAssistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and 

In [183]:
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

In [184]:
sys_msg = B_SYS + """Assistant is an expert JSON builder designed to assist with a wide range of tasks.

Assistant is able to respond to the User and use tools using JSON strings that contain "action" and "action_input" parameters.

All of Assistant's communication is performed using this JSON format.

Assistant can also use tools by responding to the user with tool use instructions in the same "action" and "action_input" JSON format. Tools available to Assistant are:

- "Calculator": Useful for when you need to answer questions about math.
  - To use the calculator tool, Assistant should write like so:
    ```json
    {{"action": "Calculator",
      "action_input": "sqrt(4)"}}
    ```

Here are some previous conversations between the Assistant and User:

User: Hey how are you today?
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "I'm good thanks, how are you?"}}
```
User: I'm great, what is the square root of 4?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "sqrt(4)"}}
```
User: 2.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 2!"}}
```
User: Thanks could you tell me what 4 to the power of 2 is?
Assistant: ```json
{{"action": "Calculator",
 "action_input": "4**2"}}
```
User: 16.0
Assistant: ```json
{{"action": "Final Answer",
 "action_input": "It looks like the answer is 16!"}}
```

Here is the latest conversation between Assistant and User.""" + E_SYS
new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)
agent.agent.llm_chain.prompt = new_prompt
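The doubled braces in the system message are there because the prompt is later rendered with `str.format`-style substitution, where `{{` and `}}` produce literal braces. A small standalone demonstration, using a made-up one-line template:

```python
# Doubled braces survive formatting as single literal braces.
template = 'Reply like {{"action": "Calculator", "action_input": "sqrt(4)"}}\nUser: {input}'
rendered = template.format(input="what is 4 to the power of 2.1?")
print(rendered)

# With single braces, the JSON example is mistaken for a placeholder and formatting fails.
try:
    '{"action": "Calculator"}\nUser: {input}'.format(input="hi")
    unescaped_ok = True
except (KeyError, ValueError, IndexError):
    unescaped_ok = False
print("unescaped template renders:", unescaped_ok)
```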

In [185]:
instruction = B_INST + " Respond to the following in JSON with 'action' and 'action_input' values " + E_INST
human_msg = instruction + "\nUser: {input}"

agent.agent.llm_chain.prompt.messages[2].prompt.template = human_msg

In [186]:
agent("what is 4 to the power of 2.1?")



> Entering new AgentExecutor chain...
  ```json
{
"action": "Calculator",
"action_input": "4**2.1"}
```

ValueError: unknown format from LLM: Sure, I can help you translate the math problems into expressions that can be evaluated using Python's `numexpr` library. Here are the steps for each problem:
1. What is 37593 * 67?
Expression: `37593 * 67`
Python Code: `numexpr.evaluate("37593 * 67")`
Output: `2518731`
2. 37593^(1/5)
Expression: `37593**(1/5)`
Python Code: `numexpr.evaluate("37593**(1/5)")`
Output: `8.22831614237718`
3. 4**2.1

Expression: `4**2.1`

Python Code: `numexpr.evaluate("4**2.1")`
Output: `16.5`

Now, you can use these expressions to calculate the answers to the questions. For example, to calculate the answer to the first question, you can use the following code:
`print(numexpr.evaluate("37593 * 67"))`
Which will output `2518731`.

In [173]:
4**2.1

18.37917367995256
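The correct value above comes from plain Python arithmetic, which needs no second LLM call. One possible workaround, offered as an assumption rather than the notebook's approach, is a deterministic calculator function that could be wrapped in a `Tool` in place of `llm-math`. The sketch below (`safe_calculate` is a hypothetical name) evaluates a small arithmetic subset through the `ast` module instead of `eval`.

```python
import ast
import math
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}
_FUNCTIONS = {"sqrt": math.sqrt}  # whitelist of callable names


def safe_calculate(expression: str) -> float:
    """Evaluate numbers, + - * / % **, unary signs, and sqrt(); reject anything else."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCTIONS
                and not node.keywords):
            return _FUNCTIONS[node.func.id](*[_eval(arg) for arg in node.args])
        raise ValueError(f"unsupported expression: {expression!r}")

    return float(_eval(ast.parse(expression, mode="eval")))


print(safe_calculate("4**2.1"))
```

With a function like this as the tool's `func`, the agent's `action_input` string (`"4**2.1"`) would be evaluated directly, so the result no longer depends on the model formatting a second reply correctly.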

In [None]:
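## Even this doesn't work: the agent emits the right Calculator action, but the run still fails.
## The ValueError ("unknown format from LLM") appears to be raised inside the llm-math tool,
## which sends the expression back to the LLM and expects a specific reply format.
## The model's own estimate (16.5) is also wrong; the correct value is about 18.38.
## Next step: dig into the custom agent outputs and check whether the right function
## is being called, and whether that happens consistently across multiple prompts.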