Simple LLM Chat Application using OpenAI APIs and HuggingFace Open-Source Models

https://www.notion.so/prakhar-pm/LLM-AI-104519d551428049a63ac29fda34cc81?pvs=4

In [1]:
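import os
import re

import httpx
import openai
from openai import OpenAI
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import HuggingFaceHub
from langchain_openai import ChatOpenAI  # used by the LangChain cells below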

In [2]:
# Read the key from the environment instead of hardcoding it in the notebook
openai_api_key = os.environ["OPENAI_API_KEY"]
client = OpenAI(api_key=openai_api_key)

## Simple LLM Chat Application using OpenAI API

In [3]:
# Supported roles: system, user, assistant, tool, function
model = "gpt-3.5-turbo"
chat_completion = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Tell me a joke"}]
)
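The `messages` argument can carry more than a single user turn. A minimal sketch of how a system instruction and earlier turns might be assembled before the call; `build_messages` is a hypothetical helper written for this note, not part of the OpenAI SDK:

```python
def build_messages(system, history, user_input):
    """Assemble a chat payload: optional system prompt, prior turns, new user turn.

    `history` is a list of (user_text, assistant_text) pairs. This helper is
    hypothetical and only illustrates the shape of the `messages` argument.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_input})
    return messages


messages = build_messages(
    system="You are a concise assistant.",
    history=[("Tell me a joke", "Why don't scientists trust atoms? Because they make up everything!")],
    user_input="Explain why that is funny in one sentence.",
)
# The list can then be passed as messages=messages to client.chat.completions.create(...)
```

Because the API is stateless, the earlier turns have to be resent on every call for the model to see them.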

In [30]:
chat_completion.choices[0].message.content

"Why don't scientists trust atoms?\n\nBecause they make up everything!"

## LLM Chat Application using Langchain

In [31]:
llm = ChatOpenAI(
    model=model,
    temperature=0.5,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key=openai_api_key,  # pass the key directly instead of relying on environment variables
)

In [32]:
print(llm.invoke("Tell me a joke about data scientist in 20 words"))

content="Why did the data scientist break up with their calculator? It couldn't handle their complex relationship with numbers." response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 18, 'total_tokens': 39}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-1aba274e-3297-46e5-a931-38e888b492da-0' usage_metadata={'input_tokens': 18, 'output_tokens': 21, 'total_tokens': 39}


## Simple LLM application using Huggingface LLM model

In [33]:
huggingface_api_key = os.environ["HUGGINGFACEHUB_API_TOKEN"]  # read from the environment; never hardcode tokens
llm_hf = HuggingFaceHub(repo_id="google/flan-t5-large", huggingfacehub_api_token=huggingface_api_key)


In [34]:
print(llm_hf.invoke("Tell me a joke about data scientist"))

a data scientist is a person who is a data scientist


In [35]:
#Pass multiple prompts at once
llm_hf.generate(['Tell me a joke about data scientist', 'Tell me a joke about recruiter', 'Tell me a joke about psychologist'])

LLMResult(generations=[[Generation(text='a data scientist is a person who is a data scientist')], [Generation(text='recruiter snoops around the office')], [Generation(text='psychologists are not always right')]], llm_output=None, run=[RunInfo(run_id=UUID('9f0c06c1-66ae-4950-895e-071c96831c14')), RunInfo(run_id=UUID('4ad78cc5-b317-494b-bfa7-3e877911ddb9')), RunInfo(run_id=UUID('77f3359c-7a48-4bc7-b6c5-555fbfe364b8'))])

## Generate desired response from LLM using Prompt Template

In [36]:
template = """ I am travelling to {location}. What are the top 3 things I can do while I am there. Be very specific and respond as three bullet points """

In [37]:
prompt = PromptTemplate(input_variables=["location"], template = template)

In [38]:
# In a business setting, this entity would be extracted deterministically by an upstream application
user_input = "Paris"
final_prompt = prompt.format(location=user_input)
llm.invoke(final_prompt)

AIMessage(content='1. Visit the Eiffel Tower and take a ride to the top for stunning views of the city.\n2. Explore the Louvre Museum and see iconic artworks such as the Mona Lisa and Venus de Milo.\n3. Stroll through the charming streets of Montmartre, visit the Sacré-Cœur Basilica, and enjoy a meal at a traditional French bistro.', response_metadata={'token_usage': {'completion_tokens': 76, 'prompt_tokens': 38, 'total_tokens': 114}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f542e5ab-2883-40ed-8200-44e1c0f59271-0', usage_metadata={'input_tokens': 38, 'output_tokens': 76, 'total_tokens': 114})
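A template only works when its placeholders match the variables that are declared for it. The following is a standard-library sketch of that check, not how `PromptTemplate` is implemented; `template_fields` and `check_template` are hypothetical names introduced here:

```python
from string import Formatter


def template_fields(template):
    """Return the set of named placeholders in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_template(template, input_variables):
    """Raise ValueError when declared variables and placeholders disagree."""
    found = template_fields(template)
    declared = set(input_variables)
    if found != declared:
        raise ValueError(f"placeholders {sorted(found)} != declared {sorted(declared)}")
    return True


template = "I am travelling to {location}. What are the top 3 things I can do while I am there?"
check_template(template, ["location"])            # passes silently
final_prompt = template.format(location="Paris")  # fill in the placeholder
```

A check like this catches a misspelled placeholder before any tokens are spent on a model call.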

## Combining LLM and Prompts in Multi-Step Workflows

In [39]:
from langchain.chains import LLMChain, SimpleSequentialChain

In [40]:
template="What is the most popular city in {country} for tourists? Just return the name of the city"
prompt1 = PromptTemplate(input_variables = ["country"], template=template)
chain1 = LLMChain(llm=llm, prompt=prompt1)

template="What are the top 3 things to do in the {city} for tourists? Just return the answer as three bullet points."
prompt2 = PromptTemplate(input_variables=["city"], template=template)
chain2 = LLMChain(llm=llm, prompt=prompt2)

overallChain = SimpleSequentialChain(chains = [chain1, chain2], verbose=True)


In [41]:
final_answer = overallChain.run("Canada")



> Entering new SimpleSequentialChain chain...
Toronto
- Visit the CN Tower for panoramic views of the city
- Explore the Royal Ontario Museum for a diverse collection of art, culture, and natural history
- Stroll around the Distillery District for its charming cobblestone streets, art galleries, and trendy shops and restaurants

> Finished chain.


- Here the LLM is asked to complete every task in the workflow. This is not a pragmatic approach, because real-world workflows can require different kinds of tasks.
- LangChain excels at prompts, models, memory, chains, and agents.
- To do: use memory and agents with LangChain.
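The sequential chain above amounts to function composition: the output of one step becomes the input of the next. A minimal sketch with plain functions standing in for the two LLM calls; `run_sequential`, `most_popular_city`, and `top_things_prompt` are hypothetical names, and the stubs return canned text rather than calling a model:

```python
def run_sequential(steps, first_input):
    """Feed the output of each step into the next, like a simple sequential chain."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Hypothetical stand-ins for the two LLM calls; a real step would invoke the model.
def most_popular_city(country):
    return {"Canada": "Toronto"}.get(country, "unknown")


def top_things_prompt(city):
    return f"What are the top 3 things to do in {city} for tourists?"


result = run_sequential([most_popular_city, top_things_prompt], "Canada")
```

Swapping a stub for a deterministic lookup or a different model is a one-line change, which is one way to avoid sending every task to the same LLM.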

## LlamaIndex
- Implementing RAG functionality: indexing and storing data, vector stores, embeddings, querying, retrieval, post-processing, and response synthesis.

## LLM with Constrained Generation - Decoding Strategy

## LLM with RAG

## Agent based application from Scratch

In [94]:
class Agent:
    def __init__(self, system=""):
        self.system = system
        self.messages = []
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content":result})
        return result

    def execute(self):
        completion = client.chat.completions.create(
                                    model=model,
                                    temperature=0,
                                    messages=self.messages)
        return completion.choices[0].message.content        

In [95]:
#prompt = """ Instruction - Role or Behavior of Agent, Tools required, Examples (One shot, 2 shot, Few shot learning) """
prompt = """You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.

Your available actions are:

calculate:
e.g. calculate: 4 * 7 / 31
Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary

average_dog_weight:
e.g. average_dog_weight: Collie
returns average weight of a dog when given the breed

Example session:

Question: How much does a Bulldog weigh?
Thought: I should look the dogs weight using average_dog_weight
Action: average_dog_weight: Bulldog
PAUSE

You will be called again with this:

Observation: A Bulldog weights 51 lbs

You then output:

Answer: A bulldog weights 51 lbs
""".strip()

In [96]:
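def calculate(what):
    # eval() executes arbitrary Python, so only use it with trusted input in a demo
    return eval(what)

def average_dog_weight(name):
    # `name in "..."` is a case-sensitive substring test, so "Collie" matches "Border Collie"
    if name in "Scottish Terrier":
        return("Scottish Terriers average 20 lbs")
    elif name in "Border Collie":
        return("a Border Collies average weight is 37 lbs")
    elif name in "Toy Poodle":
        return("a toy poodles average weight is 7 lbs")
    else:
        # Fallback for any breed not listed above
        return("An average dog weights 50 lbs")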
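The `calculate` tool hands model-generated text to `eval()`, which will run any Python expression it is given. One possible safer variant, sketched under the assumption that only basic arithmetic on numeric literals is needed, walks the parsed expression tree instead; `safe_calculate` is a hypothetical replacement, not part of the original notebook:

```python
import ast
import operator

# Arithmetic operators this sketch allows; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""

    def _eval(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


combined = safe_calculate("37 + 20")
```

Function calls, attribute access, and names all fall through to the `ValueError`, so an input such as `__import__('os')` is rejected rather than executed.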

In [97]:
#Equivalent to Tools
known_actions = {
    "calculate": calculate, 
    "average_dog_weight": average_dog_weight
}

In [98]:
# Initialize an agent with clear role instructions and a few examples,
# similar to knowledge transfer (KT) for a new employee joining a team
agent1 = Agent(prompt)
# Ask the agent to work on a task
result = agent1("How much does a toy poodle weigh?")
result

'Thought: I should look up the average weight of a Toy Poodle using the function average_dog_weight.\nAction: average_dog_weight: Toy Poodle\nPAUSE'

In [99]:
# Multi-step query that demonstrates the agent's reasoning and how existing tools can be reused for different use cases.
# Each step here is driven by hand; a few lines of code are enough to automate the loop.
agent1 = Agent(prompt)
question = """I have 2 dogs, a border collie and a scottish terrier. What is their combined weight"""
result = agent1(question)
print(result)

Thought: I can find the average weight of a Border Collie and a Scottish Terrier using the average_dog_weight action, and then calculate their combined weight.

Action: average_dog_weight: Border Collie
PAUSE


In [100]:
#Preparing next Prompt that needs to be sent to agent - Find Average dog weight of Border Collie
next_prompt = "Observation: {}".format(average_dog_weight("Border Collie"))
print(next_prompt)
agent1(next_prompt)

Observation: a Border Collies average weight is 37 lbs


'Action: average_dog_weight: Scottish Terrier\nPAUSE'

In [101]:
#Preparing next Prompt that needs to be sent to agent - Find Average dog weight of Scottish Terrier
next_prompt = "Observation: {}".format(average_dog_weight("Scottish Terrier"))
print(next_prompt)
result = agent1(next_prompt)
print(result)

Observation: Scottish Terriers average 20 lbs
Action: calculate: 37 + 20
PAUSE


In [102]:
# Prepare the next prompt for the agent: compute the combined weight of the two dogs
next_prompt = "Observation: {}".format(calculate("37 + 20"))
print(next_prompt)
agent1(next_prompt)

Observation: 57


'Answer: The combined weight of a Border Collie and a Scottish Terrier is 57 lbs'

In [103]:
print(agent1.messages)

[{'role': 'system', 'content': 'You run in a loop of Thought, Action, PAUSE, Observation.\nAt the end of the loop you output an Answer\nUse Thought to describe your thoughts about the question you have been asked.\nUse Action to run one of the actions available to you - then return PAUSE.\nObservation will be the result of running those actions.\n\nYour available actions are:\n\ncalculate:\ne.g. calculate: 4 * 7 / 31\nRuns a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary\n\naverage_dog_weight:\ne.g. average_dog_weight: Collie\nreturns average weight of a dog when given the breed\n\nExample session:\n\nQuestion: How much does a Bulldog weigh?\nThought: I should look the dogs weight using average_dog_weight\nAction: average_dog_weight: Bulldog\nPAUSE\n\nYou will be called again with this:\n\nObservation: A Bulldog weights 51 lbs\n\nYou then output:\n\nAnswer: A bulldog weights 51 lbs'}, {'role': 'user', 'content': 'I have 2 dogs, a b

## Automating the Agent system

In [105]:
import re
action_re = re.compile(r'^Action: (\w+): (.*)$')   # raw-string regex that selects the action line
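To see what the pattern captures, here is a small sketch that applies it to a reply in the format the agent produced earlier; `parse_action` is a hypothetical helper written for illustration:

```python
import re

action_re = re.compile(r'^Action: (\w+): (.*)$')


def parse_action(reply):
    """Return (action, action_input) for the first Action line, or None."""
    for line in reply.split('\n'):
        match = action_re.match(line)
        if match:
            return match.groups()
    return None


reply = "Thought: I should look up the weight.\nAction: average_dog_weight: Toy Poodle\nPAUSE"
parsed = parse_action(reply)          # ('average_dog_weight', 'Toy Poodle')
no_action = parse_action("Answer: 57")  # None, so the loop can stop
```

The first group is the tool name and the second is everything after it on the line, which is passed to the tool unchanged.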


In [106]:
def query(question, max_turns=5):
    bot = Agent(prompt)
    next_prompt = question
    for _ in range(max_turns):
        result = bot(next_prompt)  # query the agent and prompt it to reason
        print(result)
        # Keep every line of the reply that matches "Action: <name>: <input>"
        actions = [m for m in map(action_re.match, result.split('\n')) if m]
        if not actions:
            return  # no action requested, so the agent has given its answer
        action, action_input = actions[0].groups()
        print(action, action_input)
        if action not in known_actions:
            raise Exception("Unknown action: {}: {}".format(action, action_input))
        print("-- running {} {}".format(action, action_input))
        observation = known_actions[action](action_input)
        print("Observation:", observation)
        next_prompt = "Observation: {}".format(observation)

In [107]:
question = """I have 2 dogs, a border collie and a scottish terrier. What is their combined weight"""
query(question)

Thought: I can find the average weight of a Border Collie and a Scottish Terrier using the average_dog_weight action, then calculate their combined weight.

Action: average_dog_weight: Border Collie
PAUSE
average_dog_weight Border Collie
-- running average_dog_weight Border Collie
Observation: a Border Collies average weight is 37 lbs
Action: average_dog_weight: Scottish Terrier
PAUSE
average_dog_weight Scottish Terrier
-- running average_dog_weight Scottish Terrier
Observation: Scottish Terriers average 20 lbs
Action: calculate: 37 + 20
PAUSE
calculate 37 + 20
-- running calculate 37 + 20
Observation: 57
Answer: The combined weight of a Border Collie and a Scottish Terrier is 57 lbs


## Agent based application using LangGraph Components

In [108]:
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults
from langgraph.graph import StateGraph, END

In [109]:
# Read keys from the environment instead of hardcoding them in the notebook
tavily_api_key = os.environ.get("TAVILY_API_KEY")
openai_api_key = os.environ["OPENAI_API_KEY"]
model = "gpt-3.5-turbo"

In [110]:
import os
import getpass

if not os.environ.get("TAVILY_API_KEY"):
    os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API key:\n")

Tavily API key:
 ········


In [112]:
tool = TavilySearchResults(max_results=4) #increased number of results
print(type(tool))
print(tool.name)

<class 'langchain_community.tools.tavily_search.tool.TavilySearchResults'>
tavily_search_results_json


In [114]:
class AgentState(TypedDict):
    messages: Annotated[list[AnyMessage], operator.add]
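The `operator.add` in the annotation acts as a reducer: when a node returns `{'messages': [...]}`, the new list is concatenated onto the existing one instead of replacing it. A simplified standard-library illustration of that idea follows; `apply_update` is a hypothetical function and this is not LangGraph's actual implementation:

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class AgentState(TypedDict):
    messages: Annotated[list, operator.add]


def apply_update(state, update):
    """Merge a node's partial update into the state using each key's reducer.

    Keys without a reducer in their annotation are simply overwritten.
    """
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        merged[key] = reducer(state[key], value) if reducer else value
    return merged


state = {"messages": ["What is the weather in sf?"]}
state = apply_update(state, {"messages": ["<tool call>"]})
```

This is why each node in the agent below returns only its new messages rather than the full history.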

In [115]:
class Agent:
    # Configure the agent: the chat model (e.g. gpt-3.5-turbo, gpt-4o), the available tools, and the system prompt
    def __init__(self, model, tools, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        graph.add_conditional_edges(
            "llm",
            self.exists_action,
            {True: "action", False: END}
        )
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        self.graph = graph.compile()
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            if t['name'] not in self.tools:      # check for bad tool name from LLM
                print("\n ....bad tool name....")
                result = "bad tool name, retry"  # instruct LLM to retry if bad
            else:
                result = self.tools[t['name']].invoke(t['args'])
            results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
        print("Back to the model!")
        return {'messages': results}
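The graph wired up in `__init__` encodes a loop: call the model, run any requested tools, and return to the model until it stops asking for tools. A plain-Python sketch of that control flow, with a scripted stub in place of the real model; `run_graph`, `scripted_model`, and the `search` tool are hypothetical and use simple dicts rather than LangChain message classes:

```python
def run_graph(call_model, tools, messages, max_steps=10):
    """Loop: model -> (tool calls? run them and go back : stop)."""
    messages = list(messages)
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:  # same test as exists_action
            return messages
        for call in reply["tool_calls"]:
            tool = tools.get(call["name"])
            content = tool(call["args"]) if tool else "bad tool name, retry"
            messages.append({"role": "tool", "name": call["name"], "content": str(content)})
    return messages


def scripted_model(messages):
    """Stub model: request one search, then answer once a tool result exists."""
    if not any(m["role"] == "tool" for m in messages):
        call = {"name": "search", "args": {"query": "weather in San Francisco"}}
        return {"role": "ai", "content": "", "tool_calls": [call]}
    return {"role": "ai", "content": "done", "tool_calls": []}


tools = {"search": lambda args: f"results for {args['query']}"}
final = run_graph(scripted_model, tools, [{"role": "human", "content": "What is the weather in sf?"}])
```

The `max_steps` cap is an addition in this sketch; it guards against a model that keeps requesting tools indefinitely.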

In [116]:
prompt = """You are a smart research assistant. Use the search engine to look up information. \
You are allowed to make multiple calls (either together or in sequence). \
Only look up information when you are sure of what you want. \
If you need to look up some information before asking a follow up question, you are allowed to do that!
"""

llm = ChatOpenAI(model=model, api_key=openai_api_key)  #reduce inference cost
agent = Agent(llm, [tool], system=prompt)

In [119]:
#from IPython.display import Image
#Image(agent.graph.get_graph().draw_png())

In [122]:
messages = [HumanMessage(content="What is the weather in sf?")]
result = agent.graph.invoke({"messages": messages})
result 

Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_xksSO8kyhz0FUW7b8hYTmUpa', 'type': 'tool_call'}
Back to the model!


{'messages': [HumanMessage(content='What is the weather in sf?'),
  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_xksSO8kyhz0FUW7b8hYTmUpa', 'function': {'arguments': '{"query":"weather in San Francisco"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 153, 'total_tokens': 174, 'completion_tokens_details': {'reasoning_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-0d9aa5fc-e100-4063-96c7-7bd1cbe7f1d9-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_xksSO8kyhz0FUW7b8hYTmUpa', 'type': 'tool_call'}], usage_metadata={'input_tokens': 153, 'output_tokens': 21, 'total_tokens': 174}),
  ToolMessage(content='[{\'url\': \'https://www.weatherapi.com/\', \'content\': "{\'location\': {\'name\': \'San Francisco\', \'region\': \'Califo

In [119]:
#Example of parallel tool calling
messages = [HumanMessage(content="What is the weather in SF and LA?")]
result = agent.graph.invoke({"messages": messages})

Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'current weather in San Francisco'}, 'id': 'call_qaNPcHMkhSpO71c4KJk6BWRt', 'type': 'tool_call'}
Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'current weather in Los Angeles'}, 'id': 'call_l72Hxb1VCXpyw3hQRorZhHwE', 'type': 'tool_call'}
Back to the model!


In [121]:
result

{'messages': [HumanMessage(content='What is the weather in SF and LA?'),
  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_qaNPcHMkhSpO71c4KJk6BWRt', 'function': {'arguments': '{"query": "current weather in San Francisco"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_l72Hxb1VCXpyw3hQRorZhHwE', 'function': {'arguments': '{"query": "current weather in Los Angeles"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 60, 'prompt_tokens': 153, 'total_tokens': 213}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-b2a6f525-c7ab-4b69-b1b3-a4c7d4651a43-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'current weather in San Francisco'}, 'id': 'call_qaNPcHMkhSpO71c4KJk6BWRt', 'type': 'tool_call'}, {'name': 'tavily_search_results_json', 'args': {'query': 'current weather i

In [136]:
# Note: the query was modified to produce more consistent results.
# Results may vary per run and over time as search information and models change.
# Example of a sequential query, where each answer feeds the next search
# (named `question` so that it does not shadow the query() function defined earlier)
question = "Who won the super bowl in 2024? In what state is the winning team headquarters located? \
What is the GDP of that state? Answer each question."
messages = [HumanMessage(content=question)]

#llm = ChatOpenAI(model="gpt-4o")  # requires more advanced model
agent1 = Agent(llm, [tool], system=prompt)
result = agent1.graph.invoke({"messages": messages})

Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'Super Bowl 2024 winner'}, 'id': 'call_LUDWsilpc6lSNVARKkPSGG78', 'type': 'tool_call'}
Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'headquarters location of Super Bowl 2024 winner'}, 'id': 'call_bot40mNk46lVUe8cbnA2Z9fY', 'type': 'tool_call'}
Back to the model!
Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'GDP of Missouri'}, 'id': 'call_NcA9LboOnEkOjFvX5PWavdpn', 'type': 'tool_call'}
Back to the model!


In [138]:
result

{'messages': [HumanMessage(content='Who won the super bowl in 2024? In what state is the winning team headquarters located? What is the GDP of that state? Answer each question.'),
  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_LUDWsilpc6lSNVARKkPSGG78', 'function': {'arguments': '{"query": "Super Bowl 2024 winner"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_bot40mNk46lVUe8cbnA2Z9fY', 'function': {'arguments': '{"query": "headquarters location of Super Bowl 2024 winner"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 178, 'total_tokens': 243}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-712efe51-a4b1-4993-a1df-d222f4b1ff8a-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'Super Bowl 2024 winner'}, 'id': 'call_LUDWsilpc6lSNVARKkPSGG78', 'ty

## Regular Search using Tavily Library

In [227]:
from tavily import TavilyClient

In [228]:
# Use a separate name so the OpenAI `client` created earlier is not overwritten
tavily_client = TavilyClient(api_key=os.environ.get("TAVILY_API_KEY"))

In [229]:
result = tavily_client.search("What is the weather of SF?")

## Persistence & Streaming of Tokens

In [149]:
from langgraph.checkpoint.sqlite import SqliteSaver
memory = SqliteSaver.from_conn_string(":memory:")

In [140]:
class Agent:

    def __init__(self, model, tools, checkpointer, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        graph.add_conditional_edges(
            "llm",
            self.exists_action,
            {True: "action", False: END}
        )
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        self.graph = graph.compile(checkpointer=checkpointer)
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            if t['name'] not in self.tools:      # check for bad tool name from LLM
                print("\n ....bad tool name....")
                result = "bad tool name, retry"  # instruct LLM to retry if bad
            else:
                result = self.tools[t['name']].invoke(t['args'])
            results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
        print("Back to the model!")
        return {'messages': results}

In [150]:
prompt = """You are a smart research assistant. Use the search engine to look up information. \
You are allowed to make multiple calls (either together or in sequence). \
Only look up information when you are sure of what you want. \
If you need to look up some information before asking a follow up question, you are allowed to do that!
"""

agent = Agent(llm, [tool], system=prompt, checkpointer=memory)

In [152]:
messages = [HumanMessage(content="What is the weather in LA?")]
thread = {"configurable": {"thread_id": "4"}}
for event in agent.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

{'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_S9mFnaYJn2qeakkq3RQ2FrLk', 'function': {'arguments': '{"query":"weather in Los Angeles"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 153, 'total_tokens': 174}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-88d43e79-213b-4d91-a3eb-e9ec9a5294d3-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in Los Angeles'}, 'id': 'call_S9mFnaYJn2qeakkq3RQ2FrLk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 153, 'output_tokens': 21, 'total_tokens': 174})]}
Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'weather in Los Angeles'}, 'id': 'call_S9mFnaYJn2qeakkq3RQ2FrLk', 'type': 'tool_call'}
Back to the model!
{'messages': [ToolMessage(content='[{\'url\': \'https://www.weatherapi.com/\', \'cont

In [153]:
messages = [HumanMessage(content="What about in SF?")]
thread = {"configurable": {"thread_id": "4"}}
for event in agent.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

{'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ltyDoGnHgUrBOQ6EwuqKlmSa', 'function': {'arguments': '{"query":"weather in San Francisco"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 1131, 'total_tokens': 1152}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-c60978b0-3eee-477c-aea9-f390425f6ddb-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_ltyDoGnHgUrBOQ6EwuqKlmSa', 'type': 'tool_call'}], usage_metadata={'input_tokens': 1131, 'output_tokens': 21, 'total_tokens': 1152})]}
Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_ltyDoGnHgUrBOQ6EwuqKlmSa', 'type': 'tool_call'}
Back to the model!
{'messages': [ToolMessage(content='[{\'url\': \'https://www.weatherapi.com/

In [154]:
messages = [HumanMessage(content="Which one is hotter?")]
thread = {"configurable": {"thread_id": "4"}}
for event in agent.graph.stream({"messages": messages}, thread):
    for v in event.values():
        print(v)

{'messages': [AIMessage(content='Based on the current weather information:\n- Los Angeles: Temperature is 78.2°F (25.7°C)\n- San Francisco: Temperature is 57.6°F (14.2°C)\n\nLos Angeles is hotter than San Francisco at the moment. If you have any more questions or need comparisons, feel free to ask!', response_metadata={'token_usage': {'completion_tokens': 68, 'prompt_tokens': 1952, 'total_tokens': 2020}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b9dc1c4d-599d-4f44-9166-dcabd1f9464e-0', usage_metadata={'input_tokens': 1952, 'output_tokens': 68, 'total_tokens': 2020})]}
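The follow-up questions work because all three calls share `thread_id` "4", so the checkpointer restores the earlier messages before each run. The growing `prompt_tokens` in the outputs above (153, 1131, 1952) is consistent with the whole thread history being sent on every turn. A toy stand-in for that behaviour, with placeholder strings instead of real messages; `InMemoryThreads` is hypothetical and far simpler than a real checkpointer:

```python
from collections import defaultdict


class InMemoryThreads:
    """Toy stand-in for a checkpointer: one message list per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id, new_messages):
        """Add messages to a thread and return the full history so far."""
        self._threads[thread_id].extend(new_messages)
        return list(self._threads[thread_id])


store = InMemoryThreads()
store.append("4", ["What is the weather in LA?", "<LA answer>"])
store.append("4", ["What about in SF?", "<SF answer>"])
history = store.append("4", ["Which one is hotter?"])  # sees both earlier turns
other = store.append("5", ["Which one is hotter?"])    # a new thread starts empty
```

Asking "Which one is hotter?" under a different `thread_id` would give the model no context to compare.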


In [156]:
from langgraph.checkpoint.aiosqlite import AsyncSqliteSaver
memory = AsyncSqliteSaver.from_conn_string(":memory:")
agent = Agent(llm, [tool], system=prompt, checkpointer=memory)

In [157]:
messages = [HumanMessage(content="What is the weather in SF?")]
thread = {"configurable": {"thread_id": "4"}}
async for event in agent.graph.astream_events({"messages": messages}, thread, version="v1"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        content = event["data"]["chunk"].content
        if content:
            # Empty content in the context of OpenAI means
            # that the model is asking for a tool to be invoked.
            # So we only print non-empty content
            print(content, end="|")

Calling: {'name': 'tavily_search_results_json', 'args': {'query': 'weather in San Francisco'}, 'id': 'call_nNLUlX3ZVFKnImnsN2l93glD', 'type': 'tool_call'}
Back to the model!
The| current| weather| in| San| Francisco| is| clear| with| a| temperature| of| |57|.|6|°F| (|14|.|2|°C|).| The| wind| speed| is| |13|.|0| km|/h| coming| from| the| W|SW| direction|.| The| humidity| is| at| |83|%,| and| there| is| no| precipitation| at| the| moment|.|
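The `|` separators in the output mark token boundaries as chunks arrive. The consumption pattern can be sketched with the standard library alone; `fake_token_stream` is a hypothetical stand-in for the model stream and splits on whitespace rather than real tokens:

```python
import asyncio


async def fake_token_stream(text):
    """Hypothetical stand-in for a model stream: yield one word at a time."""
    for token in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop
        yield token


async def collect(text):
    """Print each token as it arrives and return them in order."""
    tokens = []
    async for token in fake_token_stream(text):
        print(token, end="|")
        tokens.append(token)
    return tokens


tokens = asyncio.run(collect("The current weather in San Francisco is clear"))
```

Inside a notebook, where an event loop is already running, `await collect(...)` would be used in place of `asyncio.run(...)`.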

## Human in the Loop

# Essay Writer Project