# Ben Needs a Friend - LLM Agent for event listings
This is part of the "Ben Needs a Friend" tutorial.  See all the notebooks and materials [here](https://github.com/bpben/ben_friend).

This notebook is intended to be run in Kaggle Notebooks.  Access that version [here](https://www.kaggle.com/code/bpoben/ben-needs-a-friend-llm-agent). 

In this notebook, we set up a simple workflow for an agent to suggest some cool events from the [Boston Calendar](https://www.thebostoncalendar.com/).  This can be run locally or on Kaggle, but requires you to have access to [OpenAI's API](https://openai.com/blog/openai-api). 

In [1]:
# install requirements (required for Kaggle setup)
%pip install -q langchain langchain_openai
# only if you plan to use the HF setup here
#%pip install -q bitsandbytes accelerate

[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
keras-cv 0.8.2 requires keras-core, which is not installed.
keras-nlp 0.8.2 requires keras-core, which is not installed.
tensorflow-decision-forests 1.8.1 requires wurlitzer, which is not installed.
apache-beam 2.46.0 requires dill<0.3.2,>=0.3.1.1, but you have dill 0.3.8 which is incompatible.
apache-beam 2.46.0 requires numpy<1.25.0,>=1.14.3, but you have numpy 1.26.4 which is incompatible.
apache-beam 2.46.0 requires pyarrow<10.0.0,>=3.0.0, but you have pyarrow 15.0.2 which is incompatible.
google-cloud-bigquery 2.34.4 requires packaging<22.0dev,>=14.3, but you have packaging 23.2 which is incompatible.
jupyterlab 4.1.5 requires jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1 which is incompatible.
jupyterlab-lsp 5.1.0 requires jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1 which is incompatible.
li

In [2]:
from langchain_openai import ChatOpenAI
from langchain.agents import (
    tool, 
    create_react_agent,
    AgentExecutor
)
from langchain_core.prompts import PromptTemplate
from datetime import datetime, date
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers.string import StrOutputParser
from langchain.tools.render import render_text_description

# this is for simple scraping tool
import requests
from random import sample
from bs4 import BeautifulSoup

## Setting up the agent
Here are three ways for setting up your LLM.  For more details, see the README! 

1) OpenAI GPT 3.5 - Requires an API key (not free!) You will also need to have an .env file that looks like: `OPENAI_API_KEY=<API key>`

2) Ollama - See the README for setup instructions.  Only tested with Mac M1 OS.

3) Mistral + Kaggle notebook - Designed for use of free GPU, will need installation of packages above.

In [3]:
# setting up GPT-3.5 configuration
# # using .env file for GPT API key
# from dotenv import load_dotenv
# load_dotenv()
# llm_model = "gpt-3.5-turbo"
# openai = ChatOpenAI(model=llm_model)

# for use in the Kaggle notebook
# you will need to access "Secrets" under the "Add-ons" menu
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("OPENAI_API_KEY")
llm_model = "gpt-3.5-turbo"
openai = ChatOpenAI(api_key=api_key, model=llm_model)

In [None]:
# running with local Ollama
# see setup instructions - this is different from use in Kaggle
from langchain.llms import Ollama
ollama_instruct_model = 'mistral'

# load pre-trained model
ollama = Ollama(model=ollama_instruct_model)

In [None]:
# running with Mistral in Kaggle notebook
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from transformers import BitsAndBytesConfig
from langchain_community.llms import HuggingFacePipeline
model_name = '/kaggle/input/mistral/pytorch/7b-instruct-v0.1-hf/1'

# Configure quantization
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

# will use HF's pipeline and LC's wrapping
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name,
                                            device_map='auto',# makes use of the GPUs
                                            quantization_config=quantization_config, # for speed/memory
                                            )
pipe = pipeline("text-generation", 
                model=model, tokenizer=tokenizer,
                max_new_tokens=20, # arbitrary, for short answers
               device_map='auto',)

hf_pipe = HuggingFacePipeline(pipeline=pipe)

In [4]:
# choose your own adventure - for consistency on the llm we'll be using
llm = openai

## Early experiments with agents

If we ask our model to tell us the day, it's not particularly useful.  It has no knowledge of the current date!

In [5]:
print(llm.invoke('What day is it today?'))

content='Today is Sunday.' response_metadata={'token_usage': {'completion_tokens': 4, 'prompt_tokens': 13, 'total_tokens': 17}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None} id='run-c10c03f8-3c53-4ea9-b8de-1f42b1f43593-0'


 But if we design a function and provided the details to the model, it can suggest what steps to take to get the information it needs.

 We'll use LangChain's capability here.  The `@tool` decorator wraps the function that makes it easier to process the text description into input into the model.

In [6]:
# notice the detailed docstrings - this is to allow the LLM to understand its purpose
# this is a tool decorator for LangChain, it enables it to be parsed 
@tool
def today(text: str) -> str:
    """Returns today's date, use this when you need to get today's date.\
    The input should always be an empty string."""
    return str(date.today())

# render the tool for a prompt (requires input as a list)
render_text_description([today])

"today: today(text: str) -> str - Returns today's date, use this when you need to get today's date.    The input should always be an empty string."

#### Try it: Construct a tool-aware prompt
The previous examples with LC should give us some idea how we might plug this tool description into the LLM input so it knows the tool is available.  An example is provided below.

In [7]:
template = """Answer the following questions as best you can. \
You will need to break your response into steps, each which may use a different tool. \
You have access to the following tools:

{tools}

Question: {input}
"""

prompt = PromptTemplate.from_template(template)

chain = prompt | llm | StrOutputParser()

response = chain.invoke(input={
    "input": "What day is it today?",
    "tools": [today]
    })
print(response)

Step 1: Use the today tool to get today's date
today()

Step 2: Provide the response based on the result obtained from step 1
Today is [insert today's date].


### Setting up the Reasoning + Act (ReAct) prompt
This is an adapted version of [LangChain's example](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html).  It provides a structure for the model to generate its outputs as well as placeholders for the tool descriptions and names.

The "scratchpad" is essentially the model's "memory; where all the things its thought and done will be entered.


In [8]:
# prompt is adapted from LangChain's example: https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html
# action input is required to be a string - otherwise I found it would use function calls as inputs
template = '''You will be a answering questions about upcoming events. \
You will need to break your response into steps, each which may use a different tool. \
You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: an input to the action, usually an empty string
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question. 

Think step by step! Begin!

Question: {input}
Thought:{agent_scratchpad}'''

prompt = PromptTemplate.from_template(template)

#### Simple workflow
LC has its own implementation of the ReAct framework.  When you call `create_react_agent`, it will create a pipe with several components.  Let's look at those components:

In [9]:
# copying this here for reference
@tool
def today(text: str) -> str:
    """Returns today's date, use this when you need to get today's date.
    The input should always be an empty string."""
    return str(date.today())

In [10]:
tools = [today]
agent = create_react_agent(llm, tools, prompt)
for stage, component in agent:
    if component is not None:
        print(stage)
        if stage=='middle':
            for c in component:
                print(c)
                print('--')
        else:
            print(component.__repr__())

first
RunnableAssign(mapper={
  agent_scratchpad: RunnableLambda(lambda x: format_log_to_str(x['intermediate_steps']))
})
middle
input_variables=['agent_scratchpad', 'input'] partial_variables={'tools': "today: today(text: str) -> str - Returns today's date, use this when you need to get today's date.\n    The input should always be an empty string.", 'tool_names': 'today'} template='You will be a answering questions about upcoming events. You will need to break your response into steps, each which may use a different tool. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: an input to the action, usually an empty string\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the 

Step by step through the chain above:
- First: The `format_log_to_str` takes the steps that have been performed so far (i.e. thoughts + actions) and formats them into the model input
- Middle
  - First is the prompt, which has "partial variables" corresponding to the tools with passed it.  These will be populated in the tools an tool_names fields in the prompt when the agent runs
  - Then, we have the model itself
    - Note here - there is a "stop" argument - this tells the agent workflow when to stop, so that actions can be parsed 
- Last: The "output parser" which parses the tools and the parameters

All this gets wrappered in an `AgentExecutor`, which controls the flow through the chain (described above) and actually executes the tools, feeding their output back to the LLM.

`verbose`: Enables us to see all the intermediate steps.  
`handle_parsing_errors`: Errors in execution will be fed back to the model to see if it can correct iself.

You'll see in green the text produced by the agent and in blue the output from the function.  

In [11]:
# run the agent
agent_executor = AgentExecutor(agent=agent, tools=tools, 
                               verbose=True, 
                               handle_parsing_errors=True,
                              max_iterations=5)
print(agent_executor.invoke({"input": "What day is it today?"})['output'])



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out today's date.
Action: today
Action Input: [0m[36;1m[1;3m2024-04-16[0m[32;1m[1;3mI now know today's date is April 16, 2024.
Final Answer: Today is April 16, 2024.[0m

[1m> Finished chain.[0m
Today is April 16, 2024.


With the smaller models, you'll see some weird answers.  With GPT 3.5, you'll usually get the workflow you're looking for.

#### Local events workflow
These next functions provide two new tools for our agent; one that provides the current "weekend number" in the month and one that scrapes Boston Calendar for event information.  Most of this is just scraping and formatting Boston Calendar, but the functions with the tool decorators will be provided to the model.

Then, we set up our ReAct agent as before.

A note here - this *usually* works with GPT 3.5.  With other models, it's hit and miss.  That's likely due both to the complexity of the task and the fact that LangChain is...[complicated](https://minimaxir.com/2023/07/langchain-problem/) in how it builds these workflows.

In [12]:

def parse_for_calendar(text: str) -> str:
    # utility for converting text input to boston calendar URL
    if len(text) == 1:
        # will assume today's month if we're looking at weekend number
        today = datetime.now()
        month = today.month
        day = today.day
        url = f'https://www.thebostoncalendar.com/events?day={day}&month={month}&weekend={text}&year=2024'
    # this will fail if the string provided is not a date
    else:
        try:
            f_date = datetime.strptime(text, '%Y-%M-%d')
            day_of_month = f_date.day
            month = f_date.month
            url = f'https://www.thebostoncalendar.com/events?day={day_of_month}&month={month}&year=2024'
        except ValueError:
            return 
    return url

@tool
def weekend(text: str) -> str:
    """Returns the single-digit weekend number for this weekend, \
    use this for any questions related to the weekend date. \
    The input should always be an empty string, \
    and this function will always return the weekend number."""
    today = datetime.now()
    
    # Calculate the weekend number within the month
    first_day_of_month = today.replace(day=1)
    weekend_number_within_month = (today - first_day_of_month).days // 7 + 1

    return weekend_number_within_month

@tool
def get_events(text: str) -> str:
    """Returns information about local events. \ 
    The input is either a date string in the format YYYY-MM-DD, \
    or it is a single-digit weekend number.\
    This function will return a list where \
    each element contains an event name, date and location as a tuple.\
    This function should be used to provide complete information about events."""
    # use the parsing utility to get a formatted url
    url = parse_for_calendar(text)
    if url is None:
        # give the LLM a useful response
        return f'Input "{text}" is not in the right format - it needs to be a date string or a weekend number'
    response = requests.get(url)     
    
    # Parse the HTML content
    soup = BeautifulSoup(response.content, 'html.parser')
    
    # Extract data
    events = soup.find_all('div', class_='info')

    all_events = []
    for event in events:
        title = event.find('h3').text.strip()
        date = event.find('p', class_='time').text.strip()
        location = event.find('p', class_='location').text.strip()
        all_events.append((title, date, location))

    # randomly select a few, provide as list
    # just reducing the amount of tokens being passed to the model, this gives the idea
    choices = sample(all_events, 3)
    
    return choices


In [13]:
tools = [today, weekend, get_events]
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, 
                               verbose=True, 
                               handle_parsing_errors=True,
                               max_iterations=5)
agent_executor.invoke({"input": "What is going on this weekend?"})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out the weekend number first before getting information on events happening this weekend.
Action: weekend
Action Input: [0m[33;1m[1;3m3[0m[32;1m[1;3mNow that I have the weekend number, I can use it to get information on events happening this weekend.
Action: get_events
Action Input: 3[0m[38;5;200m[1;3m[('Latin Fiesta at OLEspana: Dance the Night Away Under the Stars!', 'Saturday, Apr 20, 2024 10:30p', 'OLESPANA WHISKEY & TAPAS'), ('Spotlight Tour: Seeing In/Looking Out, with Sophia Pasalis ’25', 'Saturday, Apr 20, 2024 11:00a', 'Harvard Art Museums'), ('Zosha Warpeha in Concert', 'Saturday, Apr 20, 2024 1:00p', 'Scandinavian Cultural Center')][0m[32;1m[1;3mI now know the final answer.
Final Answer: This weekend, you can attend the Latin Fiesta at OLEspana, the Spotlight Tour at Harvard Art Museums, or the Zosha Warpeha in Concert at the Scandinavian Cultural Center.[0m

[1m> Finished chain.[0m


{'input': 'What is going on this weekend?',
 'output': 'This weekend, you can attend the Latin Fiesta at OLEspana, the Spotlight Tour at Harvard Art Museums, or the Zosha Warpeha in Concert at the Scandinavian Cultural Center.'}

Again, using the smaller models will yield weird results, but GPT is usually able to do this successfully.

We can see here that the output provides three random examples of events coming up this weekend.  One note - if you run this during the weekend, it will return information for the current weekend.

Our "today" tool also enables the agent to get today's events.

In [14]:
tools = [today, weekend, get_events]
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, 
                               verbose=True, 
                               handle_parsing_errors=True,
                              max_iterations=5)
agent_executor.invoke({"input": "What is going on today?"})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out today's date to answer this question.
Action: today
Action Input: [0m[36;1m[1;3m2024-04-16[0m[32;1m[1;3mNow that I know today's date, I can check for events happening today.
Action: get_events
Action Input: 2024-04-16[0m[38;5;200m[1;3m[('Virtual Lecture: "Medicine for Women, Medicine by Women: The Founding of Boston\'s Vincent Memorial Hospital"', 'Tuesday, Jan 16, 2024 6:00p', 'Online'), ('Boomerangs Thrift Stores | Jamaica Plain, Central Square, South End', 'Tuesday, Jan 16, 2024 goes until 06/30', 'Boomerangs'), ('Source: Gastro Pub and Pizza Bar | Harvard Square', 'Tuesday, Jan 16, 2024 goes until 06/30', 'Source')][0m[32;1m[1;3mI now know the events happening today.[0mInvalid Format: Missing 'Action:' after 'Thought:[32;1m[1;3mI need to provide the final answer to the original question.
Final Answer: There are several events happening today, including a virtual lecture, a thrift store vi

{'input': 'What is going on today?',
 'output': 'There are several events happening today, including a virtual lecture, a thrift store visit, and a pub and pizza bar in different locations.'}

And we can even ask for specific dates!

In [15]:
agent_executor.invoke({"input": "What is going on 4/28/24?"})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out the events happening on a specific date.
Action: get_events
Action Input: 2024-04-28[0m[38;5;200m[1;3m[('Magic Fred the Magician', 'Sunday, Jan 28, 2024 goes until 01/28', 'Wenham Museum'), ('Lamplighter CX Patio', 'Sunday, Jan 28, 2024 goes until 06/30', 'Lamplighter CX'), ('Amar: Portuguese cuisine with Boston flair | Back Bay', 'Sunday, Jan 28, 2024 goes until 06/30', 'Amar at the Raffles Hotel')][0m[32;1m[1;3mI now know the final answer
Final Answer: On 4/28/24, there are events such as "Magic Fred the Magician" at Wenham Museum, "Lamplighter CX Patio" at Lamplighter CX, and "Amar: Portuguese cuisine with Boston flair | Back Bay" at Amar at the Raffles Hotel.[0m

[1m> Finished chain.[0m


{'input': 'What is going on 4/28/24?',
 'output': 'On 4/28/24, there are events such as "Magic Fred the Magician" at Wenham Museum, "Lamplighter CX Patio" at Lamplighter CX, and "Amar: Portuguese cuisine with Boston flair | Back Bay" at Amar at the Raffles Hotel.'}

### "Friendly" style
In the above example, we get a sort of generic tone in our response.  Our Friend would never talk like that, would they?


#### Try it: Make the response more friendly
Try and use what you know so far to come up with a way to make the output more in the style of the Friend we designed.

There's a couple ways to do this

1) Edit the ReAct prompt to generate something more friendly

2) Pipe the output of the agent flow into a "styling" prompt.


In [16]:
# "friendly" version of the ReAct prompt
friend_template = '''Answer the following questions as best you can. \
You will need to break your response into steps, each which may use a different tool. \
You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: an input to the action, usually an empty string
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question. 
Output the final answer as if you're having a fun conversation with a good friend.

Think step by step! Begin!

Question: {input}
Thought:{agent_scratchpad}'''

friend_prompt = PromptTemplate.from_template(friend_template)

In [17]:
agent = create_react_agent(llm, tools, friend_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, 
                               verbose=True, 
                               handle_parsing_errors=True,
                              max_iterations=5)
agent_executor.invoke({"input": "What is going on today?"})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out today's date.
Action: today
Action Input: ""[0m[36;1m[1;3m2024-04-16[0m[32;1m[1;3mNow that I know today's date, I can see what events are happening.
Action: get_events
Action Input: 2024-04-16[0m[38;5;200m[1;3m[('Live Music at Aeronaut Brewery | Somerville', 'Tuesday, Jan 16, 2024 goes until 06/30', 'Aeronaut Brewery Somerveille'), ('$1 Oysters at Limani Seaport', 'Tuesday, Jan 16, 2024 goes until 01/17', 'Seaport'), ("D's Keys Dueling Pianos & Sing Along Bar", 'Tuesday, Jan 16, 2024 goes until 06/30', "D's Keys")][0m[32;1m[1;3mThere are some exciting events happening today, like live music at Aeronaut Brewery and $1 oysters at Limani Seaport! 
Final Answer: There are events such as live music and $1 oysters happening today.[0m

[1m> Finished chain.[0m


{'input': 'What is going on today?',
 'output': 'There are events such as live music and $1 oysters happening today.'}

Sometimes this will confuse the flow and you may have less success even with GPT.  With the smaller models, it definitely becomes a challenge.  Another way to get a friendly response would be to do a separate "rewrite" call. 

In [22]:
# creating a styling prompt
styling_prompt = """Your name is Friend.  You are having a conversation with your close friend Ben. 
You and Ben are sarcastic and poke fun at one another. 
But you care about each other and support one another.

You know the following information:
{events_output}

Ben: {input}
Provide your response:"""

styling_template = PromptTemplate.from_template(styling_prompt)

In [23]:

styling_agent = {'events_output': agent_executor, 'input': RunnablePassthrough()} | \
styling_template | llm | StrOutputParser()

print(styling_agent.invoke({"input": "What is going on today?"}))



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to find out today's date to see what events are happening.
Action: today
Action Input: [0m[36;1m[1;3m2024-04-16[0m[32;1m[1;3mNow that I know today's date, I can use it to find out what events are happening.
Action: get_events
Action Input: 2024-04-16[0m[38;5;200m[1;3m[('The Arnold Arboretum of Harvard University | "Museum of Trees"', 'Tuesday, Jan 16, 2024 goes until 06/30', 'Arnold Arboretum'), ('Neighbors Who Care Thrift Shop', 'Tuesday, Jan 16, 2024 goes until 06/30', 'Neighbors Who Care Thrift Shop'), ('Getting Around Town: Four Centuries of Mapping Boston in Transit', 'Tuesday, Jan 16, 2024 goes until 04/27', 'Boston Public Library')][0m[32;1m[1;3mThere are some cool events happening today like "Museum of Trees" at The Arnold Arboretum of Harvard University, "Neighbors Who Care Thrift Shop", and "Getting Around Town: Four Centuries of Mapping Boston in Transit" at the Boston Public Library. Looks like 