5.1 LLM agents
==========
**Author:** Polina Tsvilodub

This sheet takes a closer look at more complex LLM-based systems and *LLM agents*. Specifically, we will use the package [`langchain`](https://python.langchain.com/v0.2/docs/tutorials/) and its extensions to build our own LLM systems and explore their functionality. The learning goals for this sheet are:

* understanding basics of langchain 
* trying out langchain agents and tools 
* understanding the basics of output processing 
* familiarization with basic handling of agent memory

Langchain is under heavy development. Sometimes examples provided in the docs break with version updates, so one needs to be somewhat patient. 

**NOTE**: By now, langchain offers extensive functionality (and correspondingly extensive docs). Of course, we do NOT expect you to study or understand all of it. The examples below link to some relevant parts of the documentation. They are meant as a small demo and inspiration of what is out there, and as a starting point for learning more if you are interested.

## LangChain

The lecture discussed that modern LLMs can be viewed as building blocks of larger systems, be it for engineering or research purposes. In particular, one might want to make several calls (i.e., several inference passes) to an LLM and combine the predicted results to complete one's task. Note that when we talk about such systems, we (almost always) use the LLM for *inference*, i.e., the LLM is already pretrained / fine-tuned.

Using the terminology of langchain, a sequence of such LLM calls is called a *chain*. For each call, one minimally needs to specify a (pretrained / fine-tuned) LLM and a prompt that states what exactly the call should accomplish. For the prompt, *prompt templates* are often used. These templates usually specify *variables* which are filled with inputs when the respective LLM call is invoked. The idea is that the calls can be re-used, e.g., with various user inputs, without having to re-type the entire prompt. Further, these inputs may come from a previous LLM call. One neat feature of langchain is that it allows us to seamlessly chain LLM calls and pass the outputs of one call into the next. Specific types of templates (e.g., chat prompt templates) also take care of formatting the text in the way expected by the model, e.g., adding the special tokens and the format required by chat models.
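To make the idea of templates and chaining concrete, here is a minimal plain-Python sketch that does not use langchain at all. It only illustrates the principle: variables in a template are filled in, the prompt goes to a model, and the output is stored under a name so that a later template can use it. The names `fill`, `run_chain` and `fake_llm` are hypothetical helpers invented for this illustration; `fake_llm` is a stand-in for a real model call and simply echoes the first line of its prompt.

```python
from typing import Callable, Dict, List, Tuple


def fill(template: str, **variables: str) -> str:
    """Fill the {variables} of a prompt template with concrete inputs."""
    # str.format ignores keyword arguments that the template does not use
    return template.format(**variables)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: wraps the first prompt line."""
    return "reply[" + prompt.splitlines()[0] + "]"


def run_chain(
    steps: List[Tuple[str, str]],
    llm: Callable[[str], str],
    inputs: Dict[str, str],
) -> Dict[str, str]:
    """Run the templates in order and store each output under its name."""
    state = dict(inputs)
    for output_name, template in steps:
        prompt = fill(template, **state)
        state[output_name] = llm(prompt)  # later templates can refer to this
    return state


steps = [
    ("appetizer", "Ingredients: {ingredients}\nSuggest an appetizer."),
    ("main_course", "Appetizer: {appetizer}\nSuggest a main course."),
]
result = run_chain(steps, fake_llm, {"ingredients": "cauliflower, tomatoes"})
print(result["appetizer"])
print(result["main_course"])
```

The second printed line contains the first one, which shows that the output of the first call has been inserted into the prompt of the second call.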

[Disclaimer: Not sponsored by LangChain -- there are other very useful tools for doing such things, for instance, XXXX. This is just one popular example.]

Below, we will first look at an example of a simple sequence of LLM calls. In particular, we will build a system that helps us come up with a dinner menu, given some ingredients that we already have.

We will be using the OpenAI API to get good performance (specifically, the GPT-3.5-turbo model). Instructions for retrieving an API key will be provided in class.

In [None]:
# please install the following packages and versions

#!pip install langchain==0.2.2 langchain-core==0.2.4 langchain-openai==0.1.7 wikipedia==1.4.0 langchainhub==0.1.17

In [33]:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

In [35]:
# set some hyperparameters for the generation
temperature = 0.7
kwargs = {
    "max_tokens": 100,
}

ingredients = "cauliflower, tomatoes."

instructions_text_appetizer = "I have the following ingredients in my fridge: \n{ingredients}\n\nWhich Italian appetizer can I make for dinner with these ingredients?"

instructions_text_main = "I am planning to make the following appetizer: \n{appetizer}\n\nWhich Italian main course can I make for my dinner?"

instructions_menu_summary = "I am planning the following recipes for my dinner: \nAppetizer: {appetizer}\nMain course: {main_course}\n\nPlease write a menu summary for my dinner."

In [36]:
# load env file with the OpenAI API key
load_dotenv()

# instantiate model
model = ChatOpenAI(
    model="gpt-3.5-turbo-0125",
    openai_api_key=os.getenv("OPENAI_API_KEY"),
    temperature=temperature,
    **kwargs   
) 

# construct prompts for our calls
prompt_template_appetizer = PromptTemplate(
    template = instructions_text_appetizer,
    input_variables = ['ingredients'],
)
prompt_template_main = PromptTemplate(
    template = instructions_text_main,
    input_variables = ['appetizer'],
)
prompt_template_summary = PromptTemplate(
    template = instructions_menu_summary,
    input_variables = ['appetizer', 'main_course'],
)
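# construct sub-chains for each course
# a dict in a chain runs each of its entries on the same input and passes the results on under the given keys
appetizer_chain = prompt_template_appetizer | model | StrOutputParser()
main_chain = {"appetizer": appetizer_chain} | prompt_template_main | model | StrOutputParser()
# note: appetizer_chain occurs in both entries below, so it is invoked twice;
# with temperature > 0 the two generated appetizers are not guaranteed to match
composed_chain = {"appetizer": appetizer_chain, "main_course": main_chain} | prompt_template_summary | model | StrOutputParser()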

# actually call the execution of the entire chain
composed_result = composed_chain.invoke({"ingredients": ingredients})
print("Result: ", composed_result)



Result:  Menu Summary:
- Appetizer: Bruschetta with Cauliflower and Tomato
- Main Course: Spaghetti Carbonara

Enjoy a delicious Italian-inspired dinner with a flavorful cauliflower and tomato bruschetta as the appetizer, followed by a classic spaghetti carbonara as the main course. Bon appétit!


> <strong><span style="color:#D83D2B;">Exercise 5.1.1: LLM chain</span></strong>
>
> 1. Please look at the code above and try to understand what it does. Relevant docs about LLM calls can be found [here](https://python.langchain.com/v0.1/docs/modules/model_io/prompts/quick_start/), about chains [here](https://python.langchain.com/v0.1/docs/expression_language/primitives/sequence/).
> 2. Add a third course to our menu! Of course, you can also play around with the other prompts and the sequence. 

### Agents

In the system above, we have decomposed the task of creating a dinner menu into "bite-sized" pieces for LLM calls ourselves; i.e., we have specified the order of the calls and the specific prompt for each call. Next, we will try to avoid these steps and use an *agent* instead: i.e., we will pass our overall task description to an LLM and let it figure out the necessary substeps on its own. Specifically, we will use a [ReAct agent](https://react-lm.github.io/).

In [37]:
# same task with agent
from langchain import hub
from langchain.agents import AgentExecutor
from langchain.agents import create_react_agent

# initialize the backbone model for the agent
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0.5)

# Get an example prompt from langchain that was constructed for this agent architecture. you can modify this!
prompt = hub.pull("hwchase17/react")
# inspect the prompt
prompt.template

'Answer the following questions as best you can. You have access to the following tools:\n\n{tools}\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}'

In [38]:
# load the agent
agent = create_react_agent(llm, tools=[], prompt=prompt)
# initialize the agent
agent_executor = AgentExecutor(agent=agent, tools=[], verbose=True)
# actually call the agent with the same task as above
agent_executor.invoke({"input": "Please help me come up with a three course Italian dinner menu. It should be vegetarian. I have cauliflower and tomatoes in my fridge."})



> Entering new AgentExecutor chain...
I should think about what dishes I can make with cauliflower and tomatoes to create a well-rounded menu.
Action: [Look up vegetarian Italian recipes using cauliflower and tomatoes]
Action Input: Google search "vegetarian Italian cauliflower tomato recipes"
[Look up vegetarian Italian recipes using cauliflower and tomatoes] is not a valid tool, try one of [].
I should try searching for specific dishes using cauliflower and tomatoes as ingredients instead.
Action: [Search for vegetarian Italian cauliflower tomato recipes]
Action Input: Google search "vegetarian Italian cauliflower tomato recipes"
[Search for vegetarian Italian cauliflower tomato recipes] is not a valid tool, try one of [].
I should try searching for specific Italian vegetarian dishes that include cauliflower and tomatoes.
Action: [Search for vegetarian Italian dishes with cauliflower and tomatoes]
Action Input: Google search "veget

{'input': 'Please help me come up with a three course Italian dinner menu. It should be vegetarian. I have cauliflower and tomatoes in my fridge.',
 'output': 'A three-course Italian vegetarian dinner menu could include pasta primavera as an appetizer, caprese salad as a main course, and eggplant parmesan as a dessert.'}
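The Thought / Action / Action Input / Observation format from the prompt printed above can be turned into a simple control loop. The following is a simplified, library-free sketch of such a loop, not langchain's actual `AgentExecutor` implementation. The names `react_loop` and `make_scripted_llm` are hypothetical, and the "model" is a scripted stand-in that returns prepared replies. The error message for unknown tools is modelled on the one visible in the trace above.

```python
import re
from typing import Callable, Dict, List, Tuple


def react_loop(
    llm: Callable[[str, str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate between model replies and tool calls until a final answer."""
    scratchpad = ""  # grows with every Thought/Action/Observation round
    observations: List[str] = []
    for _ in range(max_steps):
        reply = llm(question, scratchpad)
        final = re.search(r"Final Answer:\s*(.*)", reply, re.DOTALL)
        if final:
            return final.group(1).strip(), observations
        action = re.search(
            r"Action:\s*(.*?)\s*\nAction Input:\s*(.*)", reply, re.DOTALL
        )
        if action is None:
            observation = "Invalid format: expected 'Action:' and 'Action Input:'."
        else:
            name, tool_input = action.group(1).strip(), action.group(2).strip()
            if name in tools:
                observation = tools[name](tool_input)
            else:
                observation = f"{name} is not a valid tool, try one of {list(tools)}."
        observations.append(observation)
        # the observation is fed back so the next reply can take it into account
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    return "Stopped: step limit reached.", observations


def make_scripted_llm(replies: List[str]) -> Callable[[str, str], str]:
    """Hypothetical stand-in for a model: returns prepared replies in order."""
    remaining = iter(replies)
    return lambda question, scratchpad: next(remaining)


script = [
    "I should search for recipes.\nAction: search\nAction Input: cauliflower recipes",
    "I now know the final answer\nFinal Answer: Roasted cauliflower with tomato sauce.",
]
answer, observations = react_loop(make_scripted_llm(script), {}, "Suggest a dish.")
print(observations)
print(answer)
```

With an empty tool dictionary, the first step produces the "not a valid tool" observation, and the loop only ends once a reply contains `Final Answer:`. This mirrors the pattern in the agent output above.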

> <strong><span style="color:#D83D2B;">Exercise 5.1.2: LLM agents</span></strong>
>
> 1. Please look at the code above and try to understand what it does. Relevant information about agents can be found [here](https://python.langchain.com/v0.2/docs/tutorials/agents/). Please also try to get an overview of the ReAct architecture (see link above).
> 2. What steps does the agent (try to) perform in order to accomplish the task? How does it "know" which steps to do when?
> 3. Compare the results to the chain above. Do you observe differences?

### LangChain agent with tools

As you might have seen above, the agent instantiation accepts a list of *tools* (which we left empty). The agent nevertheless tried to call tools, which failed because none were available. This time, let us add a tool to the agent: specifically, we will provide it with a tool that calls the Wikipedia API for real-time searches.

In [11]:
from langchain.agents import load_tools
tools = load_tools(["wikipedia"], llm=llm)
print('tools',  tools)

# create an agent with tools
agent_with_tools = create_react_agent(llm, tools=tools, prompt=prompt)

# instantiate and call the agent
agent_executor = AgentExecutor(agent=agent_with_tools, tools=tools, verbose=True)
agent_executor.invoke({"input": "Please help me come up with a three course Italian dinner menu. It should be vegetarian. I have cauliflower and tomatoes in my fridge."})

tools [WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(wiki_client=<module 'wikipedia' from '/opt/anaconda3/envs/understanding_llms/lib/python3.10/site-packages/wikipedia/__init__.py'>, top_k_results=3, lang='en', load_all_available_meta=False, doc_content_chars_max=4000))]


> Entering new AgentExecutor chain...
You can use Wikipedia to search for vegetarian Italian dishes that include cauliflower and tomatoes.
Action: wikipedia
Action Input: Vegetarian Italian dishes with cauliflower and tomatoes
Page: List of vegetable dishes
Summary: This is a list of vegetable dishes, that includes dishes in which the main ingredient or one of the essential ingredients is a vegetable or vegetables.
In culinary terms, a vegetable is an edible plant or its part, intended for cooking or eating raw. Many vegetable-based dishes exist throughout the world.

Page: Fried cauliflower
Summary: Fried cauliflower is a popular dish in many cuisines of the Middle East, Sou

{'input': 'Please help me come up with a three course Italian dinner menu. It should be vegetarian. I have cauliflower and tomatoes in my fridge.',
 'output': 'For the vegetarian Italian dinner menu, you can start with a Caprese salad with tomatoes and fresh mozzarella, followed by a main course of cauliflower risotto, and finish with a dessert of panna cotta with a tomato compote.'}
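Conceptually, a tool is little more than a name, a natural-language description and a function. The name and the description end up in the `{tools}` and `{tool_names}` slots of the prompt, and this text is all the model knows about the tool when deciding whether to call it. The sketch below illustrates this in plain Python. `SimpleTool`, `render_tools`, `render_tool_names` and the toy `lookup_ingredient` function are hypothetical names for this illustration, and the rendering format is an assumption that only roughly follows what langchain does.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleTool:
    name: str                    # what the model writes after "Action:"
    description: str             # tells the model when the tool is useful
    func: Callable[[str], str]   # what is executed with the "Action Input"


def lookup_ingredient(query: str) -> str:
    """Toy replacement for a real search API, backed by a tiny dictionary."""
    facts = {"cauliflower": "A vegetable of the species Brassica oleracea."}
    return facts.get(query.strip().lower(), "No entry found.")


def render_tools(tools: List[SimpleTool]) -> str:
    """One 'name: description' line per tool, for the {tools} slot."""
    return "\n".join(f"{tool.name}: {tool.description}" for tool in tools)


def render_tool_names(tools: List[SimpleTool]) -> str:
    """Comma-separated tool names, for the {tool_names} slot."""
    return ", ".join(tool.name for tool in tools)


tools = [
    SimpleTool(
        name="ingredient_lookup",
        description="Look up basic facts about a single ingredient.",
        func=lookup_ingredient,
    )
]

prompt_header = (
    "You have access to the following tools:\n\n{tools}\n\n"
    "Action: the action to take, should be one of [{tool_names}]"
)
header = prompt_header.format(
    tools=render_tools(tools), tool_names=render_tool_names(tools)
)
print(header)
print(tools[0].func("Cauliflower"))
```

Because the model only sees the rendered text, a precise description matters as much as the function itself.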

> <strong><span style="color:#D83D2B;">Exercise 5.1.3: LLM agents with tools</span></strong>
>
> 1. Please look at the code above and try to understand what it does. A list of various tools can be found [here](https://python.langchain.com/v0.1/docs/integrations/tools/).
> 2. What steps does the agent (try to) perform in order to accomplish the task? How does it "know" which steps to do when? When does it execute searches?
> 3. Compare the results to the chain above. Do you observe differences?
> 4. Is the Wikipedia tool a good choice for the task at hand? What else might we consider?

## Output parsing

One of the core bottlenecks of chaining LLM calls is the potential necessity to *process outputs* in a specific (structured) way. [This](https://python.langchain.com/v0.2/docs/how_to/structured_output/) page provides an overview of how this can be approached.
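As a library-free illustration of the problem, the sketch below assumes that the model was asked to answer with a JSON object, but wrapped it in some extra text. The parser extracts the span between the first `{` and the last `}`, decodes it and validates the expected fields. The `Dish` schema, the function `parse_dish` and the example string are made up for this illustration. The page linked above describes more robust ways to do this.

```python
import json
from dataclasses import dataclass


@dataclass
class Dish:
    name: str
    vegetarian: bool


def parse_dish(raw_output: str) -> Dish:
    """Extract a JSON object from raw model text and validate its fields."""
    start, end = raw_output.find("{"), raw_output.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError is a subclass of ValueError
    data = json.loads(raw_output[start:end + 1])
    if not isinstance(data.get("name"), str):
        raise ValueError("field 'name' is missing or not a string")
    if not isinstance(data.get("vegetarian"), bool):
        raise ValueError("field 'vegetarian' is missing or not a boolean")
    return Dish(name=data["name"], vegetarian=data["vegetarian"])


# made-up example of a model reply that surrounds the JSON with chatter
raw = 'Sure! Here is the dish:\n{"name": "Caprese salad", "vegetarian": true}\nEnjoy!'
dish = parse_dish(raw)
print(dish)

try:
    parse_dish("I would suggest a Caprese salad.")
except ValueError as error:
    # in a chain, this is the point where one could re-prompt the model
    print("Parsing failed:", error)
```

Once the output is a validated object instead of free text, the next call in a chain can rely on its fields.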

**TODO:** LMQL and https://github.com/microsoft/aici

## Memory handling

One of the main issues of agents is that they are by default *stateless*; i.e., at each step of execution there is no memory of what happened before. This is handled by adding *memory components*. An overview of this can be found [here](https://python.langchain.com/v0.2/docs/tutorials/agents/#adding-in-memory).
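The simplest form of memory is a message history that is replayed in full with every call. The sketch below shows this idea without any library. `ConversationMemory` is a hypothetical class and `stateless_model` is a stand-in for a chat model that only reports what it was shown in the current call.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def stateless_model(messages: List[Message]) -> str:
    """Hypothetical stand-in for a chat model: sees only the given messages."""
    user_turns = [content for role, content in messages if role == "user"]
    return (
        f"I can see {len(user_turns)} user message(s); "
        f"the first was: {user_turns[0]!r}"
    )


class ConversationMemory:
    """Stores all messages and passes the full history to every model call."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def chat(self, model: Callable[[List[Message]], str], user_input: str) -> str:
        self.messages.append(("user", user_input))
        reply = model(self.messages)  # the model receives the whole history
        self.messages.append(("assistant", reply))
        return reply


# without memory, the second call knows nothing about the first one
print(stateless_model([("user", "What can I cook?")]))

# with memory, earlier turns are part of the input of later calls
memory = ConversationMemory()
memory.chat(stateless_model, "I have cauliflower.")
print(memory.chat(stateless_model, "What can I cook?"))
```

Note that the history grows with every turn, so this simple approach eventually runs into the context length limit of the model.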

> <strong><span style="color:#D83D2B;">Exercise 5.1.4: Memory</span></strong>
>
> 1. Take a look at the approach for handling message memory above. Recall the *generative agent* architecture that was discussed in the lecture. What is the difference between this simple approach and the memory implementation in the generative agents? What are the respective (dis)advantages of either approach?