# Lesson 4.3: Common Agent Types

---

In the previous lesson, we learned about the concept of **Agents** and how they use **Tools** to interact with the external world. However, there are different ways an Agent can "think" and "act." This lesson will delve into the most common Agent types in LangChain, focusing on their operational mechanisms and when to use them.

## 1. Zero-shot ReAct Agent

The **Zero-shot ReAct Agent** is one of the most popular and powerful Agent types in LangChain. It is based on a reasoning and acting technique called **ReAct (Reasoning and Acting)**.

### 1.1. Concept and How the ReAct Loop Works

* **ReAct** is a prompting strategy that lets an LLM interleave **Thought**, **Action**, and **Observation** steps.
* **Zero-shot** means the Agent does not require specific examples of how to solve a problem. It relies entirely on the LLM's reasoning capabilities and the descriptions of the tools to decide the next action.

**The ReAct Loop (Thought, Action, Observation):**

1.  **Thought:** The LLM analyzes the user's query and the current state. It reasons about the goal, the information still missing, and which tools could help. The LLM writes this reasoning out as plain text.
2.  **Action:** Based on the thought, the LLM decides which tool to use and what input to give it. It emits this in a fixed text format that the Agent can parse, for example `Action: search_tool` followed by `Action Input: weather Da Nang`.
3.  **Observation:** The Agent Executor runs the chosen tool with the provided input. The result of this tool execution (e.g., search results, calculation results) is called an Observation.
4.  **Loop:** The Observation is fed back to the LLM, which uses it as new information for the next Thought and Action. The loop repeats until the LLM judges that it has enough information and emits a `Final Answer`.
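
The four steps above can be sketched in plain Python without LangChain. This is only an illustration of the control flow, not LangChain's implementation: `fake_llm`, `search_tool`, and `run_react` are hypothetical names, the "LLM" is a scripted stand-in, and the tool returns a canned string.

```python
import re


def fake_llm(transcript: str) -> str:
    """Hypothetical stand-in for a real LLM: returns scripted ReAct text."""
    if "Observation:" not in transcript:
        # First pass: no tool result yet, so ask for a search.
        return (
            "Thought: I need to look up the weather.\n"
            "Action: search_tool\n"
            "Action Input: weather Da Nang"
        )
    # Second pass: an Observation is available, so answer.
    return "Thought: I now know the final answer.\nFinal Answer: It is sunny in Da Nang."


def search_tool(query: str) -> str:
    """Hypothetical tool that returns a canned result instead of searching."""
    return f"Sunny, 31 C (canned result for '{query}')"


TOOLS = {"search_tool": search_tool}


def run_react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = fake_llm(transcript)  # Thought, plus an Action or a Final Answer
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*(.+?)\nAction Input:\s*(.+)", reply)
        if match is None:
            raise ValueError("could not parse an action from the LLM reply")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        observation = TOOLS[tool_name](tool_input)  # the executor runs the tool
        transcript += f"Observation: {observation}\n"  # fed back on the next pass
    raise RuntimeError("no final answer within max_steps")


answer = run_react("What is the weather in Da Nang?")
print(answer)
```

The `max_steps` cap is a design choice in this sketch: it guards against an LLM that never produces a `Final Answer`.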



### 1.2. When to Use Zero-shot ReAct Agent

* **Most Common:** This is a good default choice for most scenarios where you need an Agent with flexible tool-using capabilities.
* **Multi-step Problem Solving:** Suitable for problems requiring multiple reasoning steps and interactions with different tools.
* **No Specific Examples Needed:** The Agent can figure out how to use tools on its own without you providing examples of problem-solving.
* **Easy to Debug:** With `verbose=True`, you can follow the LLM's thought process, making it easier to understand why it made certain decisions.


---

## 2. Conversational Agent

A **Conversational Agent** is specifically designed to maintain a fluid and contextual conversation while still being able to use tools to answer complex questions.

### 2.1. Concept and Ability to Maintain Conversation

* **Goal:** Allow the Agent to recall previous turns, understand the context of the conversation, and use that information to answer follow-up questions, while still being able to invoke tools when necessary.
* **Difference from Standard ReAct:** A basic ReAct Agent does not automatically maintain chat history. A Conversational Agent integrates a **Memory** mechanism to do so.

### 2.2. Integrating Memory with an Agent

For an Agent to maintain a conversation, it needs a **Memory** component. Memory stores previous conversation turns (user questions and Agent answers) and provides them back to the LLM in subsequent turns.

* **Mechanism:** The chat history is typically included in the Agent's Prompt through a placeholder (e.g., `MessagesPlaceholder(variable_name="chat_history")`). Each time the Agent Executor runs, the placeholder is filled with the history, either by a Memory object attached to the executor or by passing the history explicitly in the input.
* **Benefits:**
    * **Contextual Understanding:** The LLM can refer to previously discussed information.
    * **More Natural Responses:** The conversation becomes more fluid and coherent.
    * **Handles Context-Dependent Questions:** For example: "What's the weather like there?" after asking "What's the weather like in Da Nang today?"
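
The mechanism can be sketched in plain Python, independent of any LangChain class. The helper names `build_messages` and `record_turn` are hypothetical, and the agent's answer is hard-coded; the point is only to show how earlier turns are placed between the system prompt and the new question.

```python
def build_messages(system_prompt, chat_history, user_input):
    """Fill the 'chat_history' slot between the system prompt and the new question."""
    return [("system", system_prompt), *chat_history, ("human", user_input)]


def record_turn(chat_history, user_input, answer):
    """Store one completed turn so that later questions can refer back to it."""
    chat_history.extend([("human", user_input), ("ai", answer)])


chat_history = []

# Turn 1: nothing in memory yet, so the prompt holds only system + question.
first = build_messages("You are a helpful assistant.", chat_history,
                       "What's the weather like in Da Nang today?")
record_turn(chat_history, "What's the weather like in Da Nang today?",
            "It is sunny in Da Nang.")  # hard-coded answer for illustration

# Turn 2: "there" can be resolved because turn 1 is now part of the prompt.
second = build_messages("You are a helpful assistant.", chat_history,
                        "What should I wear there?")
for role, text in second:
    print(f"{role}: {text}")
```

Without the recorded history, the second prompt would contain no mention of Da Nang, and the LLM would have nothing to resolve "there" against.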




---

## 3. Other Agent Types (Introduction)

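LangChain provides various other Agent types, each suited to specific use cases and LLMs. The `AgentType` names below come from the legacy `initialize_agent` API; recent LangChain releases expose equivalent constructor functions such as `create_openai_functions_agent` and `create_structured_chat_agent`.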

* **`AgentType.OPENAI_FUNCTIONS`:**
    * **Concept:** This Agent type leverages the function calling capabilities of OpenAI models (such as `gpt-3.5-turbo-0613` and later, `gpt-4`). Instead of the LLM generating text to describe an action, it will generate a specific JSON structure to call a function/tool.
    * **Benefits:** Often more reliable and efficient at tool invocation compared to prompt-based methods.
    * **When to Use:** When you are using OpenAI models that support function calling and want a reliable way for the Agent to interact with Tools.

* **`AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION`:**
    * **Concept:** A variant of the ReAct Agent designed to work better with Chat Models (LLMs optimized for conversational turn-taking) and tools that have structured inputs (e.g., requiring multiple parameters).
    * **Benefits:** Improved ability to understand and use more complex tools.
    * **When to Use:** When you are using Chat Models and your tools require structured inputs (e.g., Pydantic models).
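
To make the idea of a structured tool call concrete, here is a minimal sketch of how an executor might dispatch one. The dictionary shape (a tool name plus JSON-encoded arguments) is modeled loosely on function-calling responses and is an assumption for illustration; `get_weather`, `TOOLS`, and `dispatch` are hypothetical names, not LangChain or OpenAI APIs.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> str:
    """Hypothetical tool with a structured, multi-parameter input."""
    return f"Weather in {city}: 31 degrees {unit} (canned)"


TOOLS = {"get_weather": get_weather}

# Assumed shape of a structured tool call produced by the model:
# the tool name plus its arguments as a JSON string.
tool_call = {
    "name": "get_weather",
    "arguments": '{"city": "Da Nang", "unit": "celsius"}',
}


def dispatch(call: dict) -> str:
    func = TOOLS[call["name"]]            # look the tool up by name
    kwargs = json.loads(call["arguments"])  # decode the structured arguments
    return func(**kwargs)                 # call the tool with named parameters


result = dispatch(tool_call)
print(result)
```

Because the arguments arrive as JSON rather than free text, there is no `Action:` / `Action Input:` string to parse, which is one reason this style tends to be more reliable for tools with several parameters.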


---

## 4. Practical Example: Building an Agent that Can Search the Web and Perform Calculations

We will build a simple Zero-shot ReAct Agent that can use a web search tool (SerpAPI) and a calculator tool to answer questions.

**Preparation:**
* Ensure you have the necessary libraries installed: `langchain`, `langchain-community`, `langchain-openai`, `google-search-results`, `numexpr`.
* Set the `OPENAI_API_KEY` and `SERPAPI_API_KEY` environment variables.

```python
# Install libraries if not already installed
# pip install langchain langchain-community langchain-openai google-search-results numexpr

import os

import numexpr
from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.utilities import SerpAPIWrapper
from langchain_core.messages import AIMessage, HumanMessage, get_buffer_string
from langchain_core.prompts import PromptTemplate
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI

# Set environment variables for OpenAI and SerpAPI keys
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
# os.environ["SERPAPI_API_KEY"] = "YOUR_SERPAPI_API_KEY"

# 1. Initialize LLM
# Use temperature=0 for more consistent Agent decisions
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# 2. Initialize Tools
# Web search tool
search_tool = Tool(
    name="Google Search",
    func=SerpAPIWrapper().run,
    description="Useful when you need to search for information on Google about current events or factual data.",
)


# Calculator tool: evaluates a plain arithmetic expression with numexpr
def calculate(expression: str) -> str:
    return str(numexpr.evaluate(expression.strip().strip('"')).item())


calculator_tool = Tool(
    name="Calculator",
    func=calculate,
    description="Useful for arithmetic. Input must be a single numeric expression such as (123 * 45) + 678.",
)

tools = [search_tool, calculator_tool]

# 3. Define Prompt for the Agent
# create_react_agent expects a text prompt containing {tools}, {tool_names},
# and {agent_scratchpad}. The scratchpad is where the Agent's earlier
# Thoughts, Actions, and Observations are written back on each loop iteration.
prompt = PromptTemplate.from_template(
    """You are a helpful assistant. Answer the following questions as best you can.
You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Previous conversation:
{chat_history}

Question: {input}
Thought:{agent_scratchpad}"""
)

# 4. Create Agent
# create_react_agent creates a Zero-shot ReAct Agent
agent = create_react_agent(llm, tools, prompt)

# 5. Create Agent Executor
# The Agent Executor orchestrates the Agent's thought and action process.
# verbose=True prints each step; handle_parsing_errors=True lets the Agent
# retry when the LLM output does not follow the expected format.
agent_executor = AgentExecutor(
    agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
)

# --- Execute the Agent ---
print("--- Starting Agent Executor ---")
chat_history = []  # Empty chat history at the start of this example


def ask(query: str) -> str:
    """Run one turn and append it to chat_history."""
    print(f"\nUser: {query}")
    # The text prompt needs the history as a string, not a list of messages
    response = agent_executor.invoke(
        {"input": query, "chat_history": get_buffer_string(chat_history)}
    )
    chat_history.extend(
        [HumanMessage(content=query), AIMessage(content=response["output"])]
    )
    print(f"Agent: {response['output']}")
    return response["output"]


# Question 1: Requires web search
ask("Who is the current president of the United States?")

# Question 2: Requires calculation
ask("What is the result of (123 * 45) + 678?")

# Question 3: Combines search and calculation (if the LLM chains both tools)
ask("What is the current population of Japan and what would it be if it increased by 1%?")

print("--- Agent Executor Ended ---")
```

**Explanation:**
* The Agent is provided with two tools: `Google Search` and `Calculator`.
* When it receives a question, the LLM will analyze it and decide which tool is appropriate.
    * For "Who is the current president of the United States?", the LLM will choose `Google Search`.
    * For "What is the result of (123 * 45) + 678?", the LLM will choose `Calculator`.
    * For the third question, the LLM might need to use `Google Search` first to find the population, then use `Calculator` to compute the 1% increase.
* `verbose=True` allows you to observe each thought and action step of the Agent, which is very useful for debugging and understanding how the Agent is operating.
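
When reading the verbose trace, it helps to know what the calculator step should return. The following is a dependency-free sketch for checking that arithmetic by hand; `safe_eval` is a hypothetical helper written for this check and is not the tool used by the Agent.

```python
import ast
import operator

# Only the four basic binary operators are supported in this sketch.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expression: str):
    """Evaluate a numeric expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_eval("(123 * 45) + 678"))  # 6213
```

If the Agent's answer to the second question differs from this value, the trace shows whether the LLM skipped the tool or passed it a malformed expression.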


---

## Lesson Summary

This lesson introduced common **Agent** types in LangChain. You examined the **Zero-shot ReAct Agent** and its **Reasoning** and **Acting** mechanism through the **Thought, Action, Observation** loop. You also explored the **Conversational Agent** and the importance of integrating **Memory** to maintain conversational context. The lesson then briefly introduced other Agent types such as `AgentType.OPENAI_FUNCTIONS` and `AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION`. Finally, you practiced **building a Zero-shot ReAct Agent** capable of web searching and performing calculations, illustrating how an Agent can solve multi-step problems by choosing and chaining tools.