# Objective

This notebook presents an overview of four important agentic AI patterns: Reflection, Tool Use, Planning, and Multi-Agent Collaboration.

# Setup

## Installation

In [None]:
%pip install -q groq openai openai-swarm langchain langchain-openai langchain-groq langchain-experimental ipykernel
# !pip install -q openai==1.55.3 langchain==0.3.7 langchain-openai==0.2.9 langchain-experimental==0.3.3

## Imports

In [1]:
import json, os

from dotenv import load_dotenv
from groq import Groq
from langchain import hub

from langchain.agents import create_react_agent, Tool, AgentExecutor
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, ToolMessage, SystemMessage
from langchain_openai import AzureChatOpenAI
from langchain_groq import ChatGroq
from langchain_experimental.utilities import PythonREPL

from swarm import Swarm, Agent

MODEL_NAME = 'gemma2-9b-it'  # alternatives: 'qwen/qwen3-32b', 'openai/gpt-oss-20b'
load_dotenv()

True

In [2]:
# with open('config-azure.json') as f:
#     configs = f.read()

# creds = json.loads(configs)
groq_api_key = os.environ.get('GROQ_API_KEY')

In [3]:
# Method 1: the old way (native Groq client)
client_groq = Groq(api_key=groq_api_key)
# Method 2: the new way (LangChain wrapper)
# client_groq_2 = ChatGroq(api_key=groq_api_key, model=MODEL_NAME)

In [4]:
swarm_client_groq = Swarm(client_groq)

In [5]:
# client = AzureOpenAI(api_key=creds['AZURE_OPENAI_KEY'], azure_endpoint=creds['AZURE_OPENAI_ENDPOINT'], api_version='2024-02-01')

# llm = AzureChatOpenAI(azure_endpoint=creds['AZURE_OPENAI_ENDPOINT'], api_key=creds['AZURE_OPENAI_KEY'], api_version="2024-02-01", model="gpt-4o-mini", temperature=0)

# llm = AzureChatOpenAI(azure_endpoint='https://azuse-mdpfnp0w-swedencentral.openai.azure.com/openai/deployments/gpt-4o-2/chat/completions?api-version=2025-01-01-preview', api_key=groq_api_key, api_version="2025-01-01-preview", model="gpt-4o-2", temperature=0)

# swarm_client = Swarm(client)

# Pattern 1: Reflection

Self-reflection: ask the LLM to critique its own work and use that critique to improve its answer.

Consider the following use case, where the LLM is tasked with extracting structured information from medical notes. Rather than accepting the generator LLM's first answer as final, we pass that answer to a reflector LLM (in this case, the same model). The generator then uses the reflector's feedback to improve its answer.
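The generator/reflector exchange described above can be sketched as a small loop. This is a minimal illustration rather than the notebook's actual implementation: `reflect_and_revise`, `fake_generate`, `fake_critique`, and the `"no issues"` stop phrase are all hypothetical names and conventions, and the two stand-in functions replace real LLM calls so that the sketch runs without an API key.

```python
from typing import Callable


def reflect_and_revise(
    generate: Callable[[list], str],
    critique: Callable[[str], str],
    user_input: str,
    max_rounds: int = 3,
    stop_phrase: str = "no issues",  # assumed convention for a clean critique
) -> str:
    """Alternate between a generator and a reflector until the critique is clean."""
    messages = [{"role": "user", "content": user_input}]
    answer = generate(messages)
    for _ in range(max_rounds):
        feedback = critique(answer)
        if stop_phrase in feedback.lower():
            break  # the reflector has nothing left to flag
        # Append the attempt and its critique so the generator can revise
        messages += [
            {"role": "assistant", "content": answer},
            {"role": "user", "content": feedback},
        ]
        answer = generate(messages)
    return answer


# Stand-in callables (hypothetical) so the sketch runs offline
def fake_generate(messages: list) -> str:
    # The first attempt adds chatter; once feedback is present, return bare JSON
    if len(messages) > 1:
        return '{"age": 45}'
    return '{"age": 45}\nLet me know if you need more!'


def fake_critique(answer: str) -> str:
    if answer.strip().endswith("}"):
        return "No issues found."
    return "The input contains extraneous information."


final_answer = reflect_and_revise(fake_generate, fake_critique, "note text")
print(final_answer)
```

In practice, `generate` and `critique` would wrap chat-completion calls with the generator and reflector system messages, and `max_rounds` caps the cost if the critique never comes back clean.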

In [6]:
MAIN_MODEL = CRITIQUE_MODEL = 'gemma2-9b-it'

The system message for the generator is below.

Notice how the system message here explicitly acknowledges that feedback might be provided and should be used to improve the answer.

In [7]:
medical_note_data = """
Medical Notes:
---
Patient Name: Ms. Krishnaveni
Age: 45 years
Gender: Female

Chief Complaint:
Ms. Krishnaveni presented with complaints of persistent abdominal pain, bloating, and changes in bowel habits over the past two months.

History of Present Illness:
Ms. Krishnaveni reports experiencing intermittent abdominal pain, predominantly in the lower abdomen, accompanied by bloating and alternating episodes of diarrhea and constipation. She describes the pain as crampy in nature, relieved partially by defecation but worsening after meals. There is no association with specific food items. She denies any rectal bleeding, unintended weight loss, or fever.

Past Medical History:
Ms. Krishnaveni has a history of irritable bowel syndrome (IBS), diagnosed five years ago, managed with dietary modifications and occasional use of over-the-counter antispasmodics.

Medications:
She occasionally takes over-the-counter antispasmodics for symptomatic relief of abdominal discomfort related to IBS.

Family History:
There is no significant family history of gastrointestinal disorders or malignancies.

Social History:
Ms. Krishnaveni is a non-smoker and does not consume alcohol. She works as a teacher in a local school.
"""

system_message = """
You are an expert assistant to a hospital administration team working on extracting important information from medical notes made by doctors.
Extract relevant information from the note presented by the user with the following schema.
- age: integer, age of the patient
- gender: string, can be one of male, female or other
- diagnosis: string, can be one of migraine, diabetes, arthritis and acne
- weight: integer, weight of the patient
- smoking: string, can be one of yes or no
Use information ONLY from the medical note to come up with the JSON output.

If you receive feedback from the user, use it to provide a revised version of your answer.
"""

# f-string so that {medical_note_data} at the end is actually substituted into the prompt
reflection_system_message = f"""
You are an expert assistant to a hospital administration team who is tasked to generate critique and recommendations for output from an LLM.
The input will contain an attempt by an LLM to extract relevant information in a JSON format of a medical note presented further below.
The LLM was instructed that the JSON output needs to be extracted according to the following schema.
- age: integer, age of the patient
- gender: string, can be one of male, female or other
- diagnosis: string, can be one of migraine, diabetes, arthritis and acne
- weight: integer, weight of the patient
- smoking: string, can be one of yes or no

When you review the LLM attempt ensure that your critique is in accordance with the above schema.
While you are checking the input entered by the user, check if the input contains only the JSON and no additional information.
Provide explicit feedback if you notice additional information apart from the JSON.
Do not provide any suggestions for the output; restrict yourself to feedback.
---
{medical_note_data}
"""

Now let us run the first generation.

In [8]:
first_response = client_groq.chat.completions.create(
    model=MAIN_MODEL,
    messages=[
        {'role': 'system', 'content': system_message},
        {'role': 'user', 'content': medical_note_data}
    ],
    temperature=0.2
).choices[0].message.content

print(first_response)

```json
{
  "age": 45,
  "gender": "Female",
  "diagnosis": null,
  "weight": null,
  "smoking": "no"
}
``` 


Let me know if you have any other notes you'd like me to analyze!



We will now pass this output to the reflector, which will critique it according to our instructions above.

In [9]:
first_critique = client_groq.chat.completions.create(
    model=CRITIQUE_MODEL,
    messages=[
        {'role': 'system', 'content': reflection_system_message},
        {'role': 'user', 'content': first_response}
    ],
    temperature=0
).choices[0].message.content

print(first_critique)

The input contains extraneous information. 

Please provide only the JSON output. 



As the output above shows, the reflector flagged the extra text surrounding the JSON. We can now pass this critique back to the original generator as feedback so that it can amend its response.

In [10]:
second_response = client_groq.chat.completions.create(
    model=MAIN_MODEL,
    messages=[
        {'role': 'system', 'content': system_message},
        {'role': 'user', 'content': medical_note_data},
        {'role': 'assistant', 'content': first_response},
        {'role': 'user', 'content': first_critique}
    ],
    temperature=0.2
).choices[0].message.content

print(second_response)

```json
{
  "age": 45,
  "gender": "Female",
  "diagnosis": null,
  "weight": null,
  "smoking": "no"
}
```


As the output above shows, the revised response drops the trailing remark and returns only the JSON block. The same critique-and-revise step can be repeated over several rounds to keep improving the response.
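Since the revised reply still wraps the JSON in a code fence, downstream code would typically need to strip the fence before parsing. The following is a minimal sketch of such a check, assuming the reply contains a single JSON object; `extract_json`, `schema_problems`, and `ALLOWED_KEYS` are hypothetical helpers, not part of any library used in this notebook.

```python
import json
import re

# Keys taken from the schema in the system message above
ALLOWED_KEYS = {"age", "gender", "diagnosis", "weight", "smoking"}


def extract_json(reply: str) -> dict:
    """Parse the outermost {...} span found in an LLM reply."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))


def schema_problems(record: dict) -> list:
    """Return readable problems; an empty list means the keys match the schema."""
    problems = []
    missing = ALLOWED_KEYS - set(record)
    extra = set(record) - ALLOWED_KEYS
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected keys: {sorted(extra)}")
    return problems


# Same shape as the fenced reply printed above
reply = '```json\n{"age": 45, "gender": "Female", "diagnosis": null, "weight": null, "smoking": "no"}\n```'
record = extract_json(reply)
print(record)
print(schema_problems(record))
```

A deterministic check like this only covers structure (fences, key names); judging whether the extracted values are faithful to the note is still left to the reflector.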

# Pattern 2: Tool Use

Let us see how tool use can augment LLM capabilities with a simple example. First, we begin by defining a series of Python functions that we then wrap as LangChain tools using the `@tool` decorator.

In [11]:
@tool
def add(a: int, b: int) -> int:
    """Adds a and b.

    Args:
        a: first int
        b: second int
    """
    return a + b


@tool
def subtract(a: int, b: int) -> int:
    """Subtract b from a

    Args:
        a: bigger int
        b: smaller int
    """
    return a - b


@tool
def multiply(a: int, b: int) -> int:
    """Multiplies a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

Note that each function's docstring is a critical input: it is passed to the LLM as the tool's description, and the LLM relies on it to decide which function to call when a user input is received.
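To make the role of the docstring concrete, here is a simplified illustration of how a function's name, docstring, and type hints can be turned into a description for the LLM. It is a sketch using only the standard library, not LangChain's actual implementation; `describe_function` is a hypothetical helper.

```python
import inspect


def add(a: int, b: int) -> int:
    """Adds a and b."""
    return a + b


def describe_function(fn) -> dict:
    """Build a tool description from a function's name, docstring, and type hints."""
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


print(describe_function(add))
```

A function with a missing or misleading docstring would yield a poor description, which is why the docstrings above spell out what each argument means.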

We then collect these tools into a dictionary with function names as the keys like so:

In [12]:
available_tools = {'add': add, 'multiply': multiply, 'subtract': subtract}

We can now create a basic agent by binding the LLM with these three tools.

In [13]:
client_groq_3 = ChatGroq(api_key=groq_api_key, model='openai/gpt-oss-20b')
agent_2 = client_groq_3.bind_tools(list(available_tools.values()))

Now our agent is capable of answering questions that can be resolved as evaluations of the three functions available to it as tools. Consider the following user query.

In [14]:
system_message = SystemMessage("Answer the question using the available tools only.")
query = HumanMessage("(3 * 12) > (11 + 49) ? what is absolute difference between the two ?")

messages=[system_message, query]
agent_2.invoke(messages)

AIMessage(content='', additional_kwargs={'reasoning_content': 'We need to compute (3 * 12) > (11 + 49). Compute 3*12=36. 11+49=60. So 36 > 60? No, false. But question: "(3 * 12) > (11 + 49) ? what is absolute difference between the two ?" So compute absolute difference between 36 and 60: |36-60| = 24. So answer: 24. We need to use tools? The question says "Answer the question using the available tools only." We need to use functions to compute. We need to compute 3*12 via multiply, then 11+49 via add, then subtract larger minus smaller or subtract. Let\'s do stepwise: multiply 3,12. Then add 11,49. Then subtract bigger minus smaller. Then output result. Use functions.', 'tool_calls': [{'id': 'fc_b71da6b2-6bc0-4228-8d81-76bc86728edf', 'function': {'arguments': '{"a":3,"b":12}', 'name': 'multiply'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 206, 'prompt_tokens': 259, 'total_tokens': 465, 'completion_time': 0.171271364, 'prompt_time': 0.016658837, 'que

In [15]:
agent_2.invoke(messages).tool_calls

[{'name': 'multiply',
  'args': {'a': 3, 'b': 12},
  'id': 'fc_1801fa29-d146-487f-bb4b-e48df847029b',
  'type': 'tool_call'}]

In [16]:
# Diagnostic: Check agent tool registration and show available tools
print('Agent tools:', agent_2.tools if hasattr(agent_2, 'tools') else 'No tools found')
print('Agent config:', agent_2)

# Correct invocation using LangChain message objects
from langchain_core.messages import HumanMessage, SystemMessage

system_message = SystemMessage(content="Answer the question using the available tools only.")
user_message = HumanMessage(content="(3 * 12) > (11 + 49) ? what is absolute difference between the two ?")
messages = [system_message, user_message]

response = agent_2.invoke(messages)
print('Agent response:', response)

Agent tools: No tools found
Agent config: bound=ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7cedf9ac1d90>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7cedf9a622a0>, model_name='openai/gpt-oss-20b', model_kwargs={}, groq_api_key=SecretStr('**********')) kwargs={'tools': [{'type': 'function', 'function': {'name': 'add', 'description': 'Adds a and b.\n\n    Args:\n        a: first int\n        b: second int', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}, {'type': 'function', 'function': {'name': 'multiply', 'description': 'Multiplies a and b.\n\n    Args:\n        a: first int\n        b: second int', 'parameters': {'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}}, 'required': ['a', 'b'], 'type': 'object'}}}, {'type': 'function', 'function': {'name': 'subtract', 'description': 'Subtract b from a\n\n    Args:\n        a: bigger int

Notice how the LLM's behavior changed. Instead of answering the question directly, it returned a tool call, which is an intermediate step toward answering the user's question.

Specifically, it recognized that it first has to call `multiply` with the arguments `a=3` and `b=12`; its reasoning trace shows that it plans to call `add` and `subtract` next.

Note that this is still a partial execution of a tool-calling agent. We will see an end-to-end execution of the tool-calling agent in a future session.
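As a small preview of what executing a tool call involves, the sketch below looks up the requested function by name and calls it with the supplied arguments. It assumes plain Python functions in place of the `@tool`-wrapped versions; `tool_registry` and `run_tool_call` are hypothetical names, and the follow-up calls are written by hand rather than produced by the model.

```python
def add(a: int, b: int) -> int:
    return a + b


def subtract(a: int, b: int) -> int:
    return a - b


def multiply(a: int, b: int) -> int:
    return a * b


# Plain functions stand in for the @tool-wrapped versions defined earlier
tool_registry = {"add": add, "subtract": subtract, "multiply": multiply}


def run_tool_call(tool_call: dict) -> int:
    """Look up the requested tool by name and call it with the supplied arguments."""
    fn = tool_registry[tool_call["name"]]
    return fn(**tool_call["args"])


# Same shape as the `tool_calls` entry printed above (the id is a placeholder)
first_call = {"name": "multiply", "args": {"a": 3, "b": 12}, "id": "example-id", "type": "tool_call"}
product = run_tool_call(first_call)

# The remaining steps from the model's reasoning trace, written out by hand
total = run_tool_call({"name": "add", "args": {"a": 11, "b": 49}})
difference = run_tool_call({"name": "subtract", "args": {"a": total, "b": product}})
print(product, total, difference)
```

In a full agent, each result would be sent back to the model as a tool message so that it can decide on the next call or produce the final answer.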

# Pattern 3: Planning

Planning agents follow a specified algorithm to plan and structure their work toward a business objective. Let us see an example of a Reasoning and Acting (ReAct) agent. We will take a much deeper look at ReAct agents in upcoming sessions.

In [None]:
react_prompt = hub.pull("hwchase17/react")
print(react_prompt.template)

As the prompt template above indicates, the LLM is asked to 'think' before answering, working through a Thought/Action/Observation sequence until a final answer is reached.
With this prompt, let us now create a simple Python agent that will always use the Python interpreter to answer user queries.

In [None]:
python_repl = PythonREPL()
repl_tool = Tool(
    name="python_repl",
    description="A Python shell used to execute python commands. Input should be a valid python command.",
    func=python_repl.run,
)

As we have seen in previous examples, the `repl_tool` is basically a wrapper around the `python_repl.run` method. We can now create the ReAct agent by passing this `repl_tool` to the LLM like so:

In [None]:
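llm = ChatGroq(api_key=groq_api_key, model=MAIN_MODEL, temperature=0)  # `llm` is otherwise only defined in the commented-out Azure cell
react_agent = create_react_agent(llm=llm, tools=[repl_tool], prompt=react_prompt)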

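In `LangChain`, agent execution is handled by an executor that runs the agent loop: it calls the agent, executes the tool the agent selects, and feeds the resulting observation back to the agent until a final answer is produced.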

In [None]:
react_agent_executor = AgentExecutor(agent=react_agent, tools=[repl_tool], verbose=True)
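Conceptually, an executor of this kind runs a Thought/Action/Observation loop. The sketch below illustrates that loop under simplifying assumptions and is not LangChain's implementation: `fake_llm` replays scripted replies in place of a real model, `calculator` is a toy tool, and `run_react` is a hypothetical name.

```python
import re


def calculator(expression: str) -> str:
    """Toy tool: evaluate a simple arithmetic expression (illustration only)."""
    return str(eval(expression, {"__builtins__": {}}, {}))


TOOLS = {"calculator": calculator}

# Scripted replies stand in for real LLM calls so the sketch runs offline
SCRIPTED_REPLIES = iter([
    "Thought: I should compute the product.\nAction: calculator\nAction Input: 3 * 12",
    "Thought: I now know the final answer\nFinal Answer: 36",
])


def fake_llm(prompt: str) -> str:
    return next(SCRIPTED_REPLIES)


def run_react(question: str, max_steps: int = 5) -> str:
    scratchpad = ""
    for _ in range(max_steps):
        reply = fake_llm(f"Question: {question}\n{scratchpad}")
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)\nAction Input:\s*(.*)", reply)
        if action is None:
            raise ValueError("reply had neither an action nor a final answer")
        tool_name, tool_input = action.group(1).strip(), action.group(2).strip()
        observation = TOOLS[tool_name](tool_input)
        # Feed the observation back so the next step can build on it
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("no final answer within max_steps")


answer = run_react("What is 3 * 12?")
print(answer)
```

The `max_steps` cap plays the same role as an iteration limit in a real executor: it stops the loop if the model never reaches a final answer.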

Let us now test our Python tool-calling agent with a non-trivial math problem.

In [None]:
user_input = "If USD 450 amounts to USD 630 in 6 years, what will it amount to in 2 years at the same interest rate?"
react_agent_executor.invoke({'input': user_input})
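To sanity-check whatever the agent returns, the arithmetic can be worked out by hand. The question does not say whether the interest is simple or compound, so the sketch below computes both under stated assumptions (annual compounding for the compound case); the variable names are illustrative.

```python
principal, amount_after_6, years_known, years_asked = 450.0, 630.0, 6, 2

# Simple interest: the same amount of interest accrues every year
yearly_interest = (amount_after_6 - principal) / years_known
simple_amount = principal + yearly_interest * years_asked

# Compound interest (assumed annual compounding): a fixed growth factor per year
yearly_factor = (amount_after_6 / principal) ** (1 / years_known)
compound_amount = principal * yearly_factor ** years_asked

print(f"simple: {simple_amount:.2f}, compound: {compound_amount:.2f}")
```

Under simple interest the amount after 2 years is USD 510; under annual compounding it is roughly USD 503.41. An agent answer close to either figure is consistent with one of the two readings of the question.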

# Pattern 4: Multi-agent Collaboration

In one pattern of multi-agent collaboration, called `Triage` mode, a focal agent is tasked with handing off tasks to the appropriate agents. As an example, consider the following scenario with two agents, A and B. Agent A is the main agent and has two tools: a function to greet customers and a function to transfer control to Agent B. Agent B speaks only in Hindi and can be reached only when Agent A hands off control.

Whether the control needs to reach Agent B or not is decided by Agent A depending on the user query.

We will look at many more patterns of multi-agent collaborations in an upcoming session.

In [None]:
def transfer_to_agent_b():
    return agent_b


def transfer_to_agent_a():
    return agent_a


def greet_customer():
    return "Hello, how can I help you?"

In [None]:
agent_a = Agent(name="Agent A", instructions="You are a helpful agent.", model='openai/gpt-oss-20b', functions=[transfer_to_agent_b, greet_customer])
agent_b = Agent(name="Agent B", instructions="Only speak in Hindi.", model='openai/gpt-oss-20b', functions=[transfer_to_agent_a])

In [None]:
# `swarm_client` (Azure) is commented out in Setup, so use the Groq-backed Swarm client with a Groq-hosted model
response = swarm_client_groq.run(agent=agent_a, messages=[{"role": "user", "content": "I need to speak to Agent B"}])
print(response.messages[-1]["content"])

In [None]:
len(response.messages)

In [None]:
for message in response.messages:
    print(f"{message}\n----")  # each message is a dict, so format it instead of concatenating
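The hand-off mechanic used by the transfer functions above (a function that returns an agent passes control to that agent) can be illustrated without any LLM. The following is a simplified sketch, not Swarm's implementation: `MiniAgent` and `run_turn` are hypothetical, and the function to call is hard-coded where a real agent would let the model choose.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class MiniAgent:
    name: str
    functions: Dict[str, Callable] = field(default_factory=dict)


def run_turn(agent: MiniAgent, requested_function: str):
    """Call one function; if it returns an agent, control transfers to that agent."""
    result = agent.functions[requested_function]()
    if isinstance(result, MiniAgent):
        return result, f"handed off to {result.name}"
    return agent, result


mini_b = MiniAgent(name="Agent B")
mini_a = MiniAgent(
    name="Agent A",
    functions={
        "greet_customer": lambda: "Hello, how can I help you?",
        "transfer_to_agent_b": lambda: mini_b,
    },
)

# In the real flow the LLM picks the function; here the choice is hard-coded
active, message = run_turn(mini_a, "greet_customer")
print(active.name, "|", message)
active, message = run_turn(active, "transfer_to_agent_b")
print(active.name, "|", message)
```

After the second turn the active agent is Agent B, which mirrors what the triage example is expected to do when the user asks to speak to Agent B.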