### Implementing a ReAct Agent with a Custom TAO Parser

When an LLM does not natively support structured function calling, or when more direct control over the "Thought," "Action," and "Observation" (TAO) flow is desired, a custom parser is needed. This approach prompts the LLM to write its reasoning and intended actions in a fixed text format, which the agent's code then parses.

#### When to Use a Custom Parser

A custom TAO parser is suitable when:

- The chosen LLM lacks native function calling capabilities (e.g., many open-source models or older proprietary models).   
- A model-agnostic approach to tool use is preferred, allowing easier switching between different LLMs.
- Fine-grained control over the textual representation of thoughts and actions is required, potentially deviating from standard function call JSON structures.
- The explicit generation of "Thought" strings before "Action" strings is a primary design goal, aligning closely with the original ReAct paper's emphasis on interpretability.
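
The model-agnostic point above can be sketched by having the agent depend only on a plain "prompt in, text out" callable. This is a minimal sketch under assumptions: `LLMFn`, `with_stop_emulation`, and `fake_llm` are hypothetical names introduced for illustration, not part of any library. The wrapper cuts the completion at the first stop string, which approximates a `stop` parameter for backends that do not offer one.

```python
from typing import Callable, Sequence

# Hypothetical type alias: any backend reduced to "prompt in, text out".
LLMFn = Callable[[str], str]


def with_stop_emulation(llm: LLMFn, stop: Sequence[str] = ("Observation:",)) -> LLMFn:
    """Wrap an LLM callable so its output is cut at the earliest stop string."""

    def wrapped(prompt: str) -> str:
        text = llm(prompt)
        # Collect the position of every stop string that actually occurs.
        cut_points = [idx for s in stop if (idx := text.find(s)) != -1]
        if cut_points:
            text = text[: min(cut_points)]
        return text.strip()

    return wrapped


def fake_llm(prompt: str) -> str:
    """Stand-in backend for illustration; it invents an Observation on its own."""
    return (
        " I should look this up.\n"
        "Action: web_search\n"
        "Action Input: capital of France\n"
        "Observation: (hallucinated result)"
    )


llm = with_stop_emulation(fake_llm)
completion = llm("Question: What is the capital of France?\nThought:")
print(completion)
```

With this shape, switching models only means supplying a different callable; the parsing and tool-dispatch code does not need to change.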

In [None]:
import os
from openai import AzureOpenAI
from dotenv import load_dotenv
import re
from tools import web_search, calculator, weather_search, get_current_time

SYSTEM_PROMPT_TEMPLATE = """You are an expert assistant designed to solve complex tasks by reasoning step-by-step and interacting with available tools.

Available Tools:
{available_tools}

Follow this format strictly:
Question: The user's input question.
Thought: Your reasoning about the current state, the goal, and what action to take next.
Action: The action to take. Choose one tool from the Available Tools list or use 'Final Answer'.
Action Input: The input to the action. If the action is 'Final Answer', this should be your final response to the user.
Observation: The result of the action. (This will be provided to you after the action is performed)

--- Example ---
Question: What is the capital of France and what is 2+2?
Thought: I need to find the capital of France and calculate 2+2. I will first find the capital of France using web_search.
Action: web_search
Action Input: capital of France
Observation: Paris is the capital of France.
Thought: Now I have the capital of France. Next, I need to calculate 2+2 using the calculator.
Action: calculator
Action Input: 2+2
Observation: 4
Thought: I have found the capital of France (Paris) and calculated 2+2 (4). I can now provide the final answer.
Action: Final Answer
Action Input: The capital of France is Paris, and 2+2 equals 4.
--- End Example ---

--- Previous Steps ---
{history}
--- Current Step ---
Question: {current_user_question}
Thought:"""
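
To make the placeholders concrete, the following sketch fills an abbreviated stand-in for the template with `str.format`. `MINI_TEMPLATE`, `render_history`, and the sample step are illustrative assumptions; only the three placeholder names come from the template above. Because the rendered prompt ends with `Thought:`, the model's completion may continue the reasoning directly without repeating that label.

```python
# Abbreviated, hypothetical stand-in for SYSTEM_PROMPT_TEMPLATE; it keeps the
# same three placeholders but omits the instructions and the worked example.
MINI_TEMPLATE = """Available Tools:
{available_tools}

--- Previous Steps ---
{history}
--- Current Step ---
Question: {current_user_question}
Thought:"""


def render_history(steps):
    """Turn (thought, action, action_input, observation) tuples into TAO text."""
    lines = []
    for thought, action, action_input, observation in steps:
        lines.append(f"Thought: {thought}")
        lines.append(f"Action: {action}")
        lines.append(f"Action Input: {action_input}")
        lines.append(f"Observation: {observation}")
    return "\n".join(lines)


steps = [("I need the sum first.", "calculator", "2+2", "4")]
prompt = MINI_TEMPLATE.format(
    available_tools="- calculator: Evaluate an arithmetic expression.",
    history=render_history(steps),
    current_user_question="What is 2+2, doubled?",
)
print(prompt)
```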

In [None]:
tools_map = {
    "web_search": web_search,
    "calculator": calculator,
    "weather_search": weather_search,
    "get_current_time": get_current_time,
}

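def call_tool(tool_name: str, tool_input: str) -> str:
  """Dispatch to a registered tool and return its output as the observation."""
  if tool_name in tools_map:
    return tools_map[tool_name](tool_input)
  elif tool_name == "Final Answer":
    return f"Final Answer provided: {tool_input}"  # Or handle differently
  # Unknown names are reported back as text so the LLM can correct itself
  return f"Error: Unknown tool '{tool_name}'."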

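def parse_llm_response(response_text: str):
  """Extract a (thought, action, action_input) tuple from one LLM completion.

  thought is None when the completion contains no literal "Thought:" label;
  action is None when no "Action:" line is found.
  """
  thought = None
  action = None
  action_input = None

  # Keywords are matched case-insensitively; the thought runs up to the next Action line
  thought_match = re.search(
      r"Thought:\s*(.*?)(?=\nAction:|\nFinal Answer:|$)", response_text, re.IGNORECASE | re.DOTALL)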
  if thought_match:
    thought = thought_match.group(1).strip()

  final_answer_match = re.search(
      r"Action:\s*Final Answer\s*\nAction Input:\s*(.*)", response_text, re.IGNORECASE | re.DOTALL)
  if final_answer_match:
    action = "Final Answer"
    action_input = final_answer_match.group(1).strip()
    return thought, action, action_input

  # More specific match for Action and Action Input to avoid over-matching
  action_match = re.search(
      r"Action:\s*([a-zA-Z0-9_]+)\s*\nAction Input:\s*(.*?)(?=\nObservation:|\nThought:|$)", response_text, re.IGNORECASE | re.DOTALL)
  if action_match:
    action = action_match.group(1).strip()
    action_input = action_match.group(2).strip()
  else:  # Fallback for Action without explicit Action Input line, or simple Final Answer
    action_match_simple = re.search(
        r"Action:\s*(.*)", response_text, re.IGNORECASE)
    if action_match_simple:
      action_line = action_match_simple.group(1).strip()
      if action_line.lower() == "final answer":  # If LLM just says "Action: Final Answer"
        # Try to find a subsequent line as input, or assume thought is the answer
        # This part needs careful prompting to ensure LLM provides input for Final Answer
        # For simplicity here, we assume input is part of the thought or needs specific handling
        action = "Final Answer"
        action_input = thought  # Or a dedicated extraction if format is different
      else:
        action = action_line  # Assumes action name is the whole line
        action_input = ""  # No explicit input found

  return thought, action, action_input


# Build the "Available Tools" section of the prompt from tools_map
def generate_tools_description(tools_map):
  """Generate a formatted description of available tools from the tools map."""
  tool_descriptions = []
  for tool_name, tool_function in tools_map.items():
    # Use the function's docstring as the description, with a generic fallback
    description = tool_function.__doc__ or f"Use this tool to {tool_name.lower()}."
    # Format: "- tool_name: description"
    tool_descriptions.append(f"- {tool_name}: {description.strip()}")

  return "\n".join(tool_descriptions)


def run_react_agent(user_query: str, llm_client, base_prompt_template: str, tools_map: dict, max_steps: int = 5):
  tools_description = generate_tools_description(tools_map)
  history = ""
  print(f"User: {user_query}\n")

  for step in range(max_steps):
    print(f"*** Step {step + 1} of {max_steps}: ***\n")
    # Construct the prompt for the LLM with the available tools
    prompt_to_llm = base_prompt_template.format(
        available_tools=tools_description,
        history=history,
        current_user_question=user_query
    )

    response = llm_client.chat.completions.create(
        model=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
        messages=[{"role": "user", "content": prompt_to_llm}],
        temperature=0.0,  # For more deterministic output
        max_tokens=4000,
        stop=["Observation:"]  # Stop before the model invents its own observation
    )
    # message.content can be None, so fall back to an empty string before stripping
    llm_response_text = (response.choices[0].message.content or "").strip()

    print(f"LLM Raw Output:\n{llm_response_text}\n")

    thought, action, action_input = parse_llm_response(
        llm_response_text)

    print(f"Parsed Thought: {thought}")
    if action:
      print(f"Parsed Action: {action}")
      print(f"Parsed Action Input: {action_input}\n")
    else:
      print("No action parsed. Agent might be stuck or prompt needs adjustment.\n")
      # Potentially add the raw response as an observation of confusion
      observation = f"Agent seemed confused. LLM Output: {llm_response_text}"
      history += f"Thought: {thought}\nObservation: {observation}\n"
      continue

    if action == "Final Answer":
      print(f"Final Answer from Agent: {action_input}\n")
      return action_input

    # An action is always present here (the no-action case continued above).
    # Fall back to an empty string when no input was parsed, which also covers
    # tools that take no input.
    observation = call_tool(
        action, action_input if action_input is not None else "")
    print(f"Observation (tool output): {observation}\n")

    # Update history for the next iteration
    history += f"Thought: {thought}\n"
    if action:  # Only include action/input if an action was parsed
      history += f"Action: {action}\n"
      history += f"Action Input: {action_input}\n"
    history += f"Observation: {observation}\n"  # Always include observation

  print("Max steps reached without a Final Answer.")
  return "Agent could not determine a final answer within the allowed steps."
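
Because the prompt ends with `Thought:`, a completion may begin with the reasoning text and omit the label, in which case a regex anchored on `Thought:` finds nothing. One possible way to handle this, shown as a sketch rather than the implementation used above, is to treat everything before the first `Action:` line as the thought. `parse_tao` and `ParsedStep` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class ParsedStep(NamedTuple):
    thought: Optional[str]
    action: Optional[str]
    action_input: Optional[str]


# "Action:" at the start of a line, then the rest of that line as the action name.
_ACTION_RE = re.compile(r"^Action:[ \t]*(.+?)[ \t]*$", re.IGNORECASE | re.MULTILINE)
# "Action Input:" and everything up to an Observation/Thought line or the end.
_INPUT_RE = re.compile(
    r"^Action Input:[ \t]*(.*?)(?=^Observation:|^Thought:|\Z)",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)
# An optional leading "Thought:" label, removed if present.
_THOUGHT_LABEL_RE = re.compile(r"^\s*Thought:\s*", re.IGNORECASE)


def parse_tao(text: str) -> ParsedStep:
    """Parse one Thought/Action/Action Input step, with or without a Thought label."""
    action_match = _ACTION_RE.search(text)
    # Everything before the first Action line counts as the thought.
    head = text[: action_match.start()] if action_match else text
    thought = _THOUGHT_LABEL_RE.sub("", head, count=1).strip() or None

    if not action_match:
        return ParsedStep(thought, None, None)

    input_match = _INPUT_RE.search(text, action_match.end())
    action_input = input_match.group(1).strip() if input_match else ""
    return ParsedStep(thought, action_match.group(1), action_input)


step = parse_tao(
    "I should search first.\n"
    "Action: web_search\n"
    "Action Input: Bill Gates date of birth"
)
print(step)
```

A named tuple keeps the three fields unpackable in the same order as the original return value, so a function like this could be swapped in without changing the calling loop.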

In [6]:
load_dotenv()

# Set up Azure OpenAI client
client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
)

# Run the agent with the prompt template and the tool registry (tools_map)
final_answer = run_react_agent(
    "Where was the Microsoft founder was born? And how old is he/she now?",
    client,
    SYSTEM_PROMPT_TEMPLATE,
    tools_map,
    max_steps=10)
print(f"Final Answer: {final_answer}")


User: Where was the Microsoft founder was born? And how old is he/she now?

*** Step 1 of 10: ***

LLM Raw Output:
To answer this, I need to identify the founder of Microsoft, find their birthplace, and calculate their current age. I will start by searching for the founder of Microsoft and their birthplace.
Action: web_search
Action Input: Microsoft founder birthplace and date of birth

Parsed Thought: None
Parsed Action: web_search
Parsed Action Input: Microsoft founder birthplace and date of birth

Observation (tool output): Search failed with status code 202

*** Step 2 of 10: ***

LLM Raw Output:
Thought: I need to find the birthplace and current age of the founder of Microsoft. The founder of Microsoft is Bill Gates. I will search for Bill Gates' birthplace and date of birth, then calculate his current age.
Action: web_search
Action Input: Bill Gates birthplace and date of birth

Parsed Thought: I need to find the birthplace and current age of the founder of Microsoft. The founder