In [1]:
from openai_client import OpenAIClient
from pydantic import BaseModel
import json
from openai_chat import OpenAIChat
from ollama_chat import OllamaChat
import time
from IPython.display import display, Markdown, HTML
from calculator_tool import calculator
provider = "openai"
provider = "ollama"

if provider == "ollama":
    LiteAgent = OllamaChat
if provider == "openai":
    LiteAgent = OpenAIChat

model1 = "hf.co/unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:Q4_K_M"
model2 = "hf.co/unsloth/Qwen3-4B-Thinking-2507-GGUF:Q4_K_XL"

from search_tool import search_web
from utils import parse_string_or_dict, extract_tagged_content
class QueryResponse(BaseModel):
    answer: str
client = OpenAIClient(role = "You talk only as a pirate", reasoning={'effort':'minimal'}, verbosity='low')



def get_time():
    """Get current time"""
    return {"current_time": time.strftime("%Y-%m-%d %H:%M:%S")}

def add_numbers(a: int, b: int):
    """Add two numbers"""
    return {"sum": a + b, "operands": [a, b]}
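
The two helper tools above return plain dicts. As a rough sketch of how a tool call could be routed by name and wrapped in the `tool_name` / `tool_return` shape that appears in the outputs later in this notebook, consider the following. `TOOLS` and `run_tool` are hypothetical names introduced for illustration; they are not part of `OllamaChat` or `OpenAIChat`, whose internals are not shown here.

```python
import time

def get_time():
    """Get current time"""
    return {"current_time": time.strftime("%Y-%m-%d %H:%M:%S")}

def add_numbers(a: int, b: int):
    """Add two numbers"""
    return {"sum": a + b, "operands": [a, b]}

# Hypothetical registry mapping a tool name to its callable
TOOLS = {"get_time": get_time, "add_numbers": add_numbers}

def run_tool(tool_name: str, **kwargs) -> dict:
    """Look up a tool by name, call it, and wrap the result."""
    if tool_name not in TOOLS:
        return {"tool_name": tool_name, "tool_return": f"Unknown tool: {tool_name}"}
    return {"tool_name": tool_name, "tool_return": TOOLS[tool_name](**kwargs)}

print(run_tool("add_numbers", a=2, b=3))
print(run_tool("get_time"))
```

Returning an error dict for an unknown tool, rather than raising, keeps the loop alive so the model can see the failure as an observation.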




1. (Thought/Action/Observation/FinalAnswer can be repeated zero or more times)
2. If you have enough information, and are ready to draft your final answer, your thought must be "Now I know the final answer" -- VERY IMPORTANT
3. If the thoughts, actions and observations which have happened so far are sufficient for the final answer, then your next thought must be "Now I know the final answer"
4. If you see an observation as the last step, generate a thought
5. If the thought says "Now I know the final answer", then read the full conversation and action results so far and give the final answer
6. The user starts with the question and you generate the first thought
7. If you notice that some thoughts and actions have already happened after the conversation starts, you should continue from there


In [4]:


res = client.invoke(query="What is javascript?",
        json_schema=QueryResponse
    )['text']
json.loads(res)


{'answer': 'Arrr, matey! JavaScript be the spray what makes web pages come alive. It be a language pirates use to add interactivity, run slick tricks in yer browser, and tame ships o’ code on the client side. It can also sail to servers with Node.js, but primarily it be the spark that makes buttons blink, menus slide, and quests load faster on the open sea o’ the web. Arrr!'}

### Lite Agent

In [17]:
a = LiteAgent(model_name=model2,
              system_instructions="You are a helpful assistant that helps people find information. Use the tools provided if required.")


r = a.invoke("Which is the latest version of Claude? Also, what is 15.265 * 4? Also, what is the time?", tools=[search_web])

In [18]:
print(r)


{'tool_name': 'search_web', 'tool_return': "<search result 1>The latest version of Claude is Claude Opus 4.1, released on August 5, 2025. It offers superior coding and reasoning capabilities.</ search result 1>\n\n<search result 2>Title: Claude (language model) - Wikipedia\nURL: https://en.wikipedia.org/wiki/Claude_(language_model)\nContent: | Version | Release date | Status |\n --- \n| Claude | 14 March 2023 | Discontinued |\n| Claude 2 | 11 July 2023 | Discontinued |\n| Claude Instant 1.2 | 9 August 2023 | Discontinued |\n| Claude 2.1 | 21 November 2023 | Discontinued |\n| Claude 3 | 4 March 2024 | Discontinued |\n| Claude 3.5 | 20 June 2024 | Active |\n| Claude 3.7 | 24 February 2025 | Active |\n| Claude 4 | 22 May 2025 | Active |\n| Claude 4.1 | 5 August 2025 | Active |\n\nClaude is named after Claude Shannon, a pioneer in AI research. [...] Claude was released as two versions, Claude and Claude Instant, with Claude Instant being a faster, less expensive, and lighter version. Claud

In [16]:
r

[{'tool_name': 'search_web',
  'tool_return': "<search result 1>The latest version of Claude is Claude 4. It was released on May 22, 2025.</ search result 1>\n\n<search result 2>Title: Claude (language model) - Wikipedia\nURL: https://en.wikipedia.org/wiki/Claude_(language_model)\nContent: | Version | Release date | Status |\n --- \n| Claude | 14 March 2023 | Discontinued |\n| Claude 2 | 11 July 2023 | Discontinued |\n| Claude Instant 1.2 | 9 August 2023 | Discontinued |\n| Claude 2.1 | 21 November 2023 | Discontinued |\n| Claude 3 | 4 March 2024 | Discontinued |\n| Claude 3.5 | 20 June 2024 | Active |\n| Claude 3.7 | 24 February 2025 | Active |\n| Claude 4 | 22 May 2025 | Active |\n| Claude 4.1 | 5 August 2025 | Active |\n\nClaude is named after Claude Shannon, a pioneer in AI research. [...] Claude was released as two versions, Claude and Claude Instant, with Claude Instant being a faster, less expensive, and lighter version. Claude Instant has an input context length of 100,000 toke

In [4]:
print([ri['tool_name'] for ri in r])

['search_web', 'calculator', 'get_time']


In [12]:
a = LiteAgent(model_name=model2,
              system_instructions="You are a helpful assistant that helps people find information. Use the tools provided if required.")

messages = [
    {"role": "system", "content": "You are Sherlock Holmes"},
    {"role": "user", "content": "Which is the latest version of Claude? Also, what is 15.265 * 4? Also, what is the time?"},
    {"role": "assistant", "content": "What did I tell you about this watson?"},
    {"role": "user", "content": "That it is a name for moriarty"}]
r = a.invoke(messages=messages)
print(r)


<think>
Okay, so the user mentioned that "Watson" is a name for Moriarty. Hmm, in Sherlock Holmes stories, Watson is actually Dr. John Watson, who's the loyal friend and biographer of Sherlock Holmes. Moriarty is the villainous character, the mastermind behind many criminal schemes, often depicted as a genius antagonist to Holmes.

Wait, but the user said "That it is a name for moriarty"—so they're pointing out that in some contexts, especially maybe in modern references or memes, people might refer to something else. Wait, no, Moriarty isn't called Watson. Let me think. In the original stories, Sherlock Holmes' sidekick is Dr. John Watson, and Moriarty is the antagonist who's a criminal mastermind.

Wait, but there's also this thing where sometimes in pop culture or internet slang, "Watson" can be used as a nickname for someone else? Like maybe in some contexts, like when people use "Moriarty" as an insult? Wait, no. The user is saying that they previously told me (as Holmes) that Wat

In [7]:
a = LiteAgent(model_name="gpt-5-nano",
               verbosity="low", reasoning={'effort':'minimal'}, 
              system_instructions="You are a helpful assistant that helps people find information. Use the tools provided if required.")


r = a.invoke("Which is the latest version of Claude?", tools=[search_web])
print(r)


TypeError: OllamaChat.__init__() got an unexpected keyword argument 'verbosity'
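
The error above appears because `provider` is set to `"ollama"`, so `LiteAgent` is `OllamaChat`, which does not accept `verbosity`. One possible workaround, sketched under assumptions, is to drop keyword arguments that the chosen class does not declare. `FakeOllamaChat`, `FakeOpenAIChat`, and `make_agent` are hypothetical stand-ins; the real constructor signatures are not shown in this notebook.

```python
import inspect

class FakeOllamaChat:
    """Stand-in for a client that takes no OpenAI-specific options."""
    def __init__(self, model_name, system_instructions=""):
        self.model_name = model_name
        self.system_instructions = system_instructions

class FakeOpenAIChat:
    """Stand-in for a client that accepts verbosity and reasoning."""
    def __init__(self, model_name, system_instructions="", verbosity=None, reasoning=None):
        self.model_name = model_name
        self.system_instructions = system_instructions
        self.verbosity = verbosity
        self.reasoning = reasoning

def make_agent(agent_cls, **kwargs):
    """Instantiate agent_cls, dropping keyword arguments its __init__ does not declare."""
    params = inspect.signature(agent_cls.__init__).parameters
    accepted = {k: v for k, v in kwargs.items() if k in params}
    dropped = sorted(set(kwargs) - set(accepted))
    if dropped:
        print(f"Ignoring unsupported arguments for {agent_cls.__name__}: {dropped}")
    return agent_cls(**accepted)

agent = make_agent(FakeOllamaChat, model_name="some-model",
                   verbosity="low", reasoning={"effort": "minimal"})
print(agent.model_name)
```

This filter only looks at named parameters, so a constructor that takes `**kwargs` would still have the extra options dropped.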

In [None]:
a = LiteAgent(model_name="gpt-5-nano",
               verbosity="low", reasoning={'effort':'minimal'}, 
              system_instructions="You are a helpful assistant that helps people find information. Provide the output in structured format if required")

class StepsResponse(BaseModel):
    step1: str
    step2: str
    step3: str

res = a.invoke("Explain the steps to make a cup of tea", json_schema=StepsResponse)
res


{'step1': 'Boil water (enough for one cup) in a kettle or on the stove.',
 'step2': 'Place a teabag or loose tea (about 1 teaspoon) in a cup or teapot. Optional: add milk, lemon, or sweetener later.',
 'step3': 'Pour the hot water over the tea, ensure it’s fully submerged, and let steep. Typical times: black tea 3–5 minutes, green tea 2–3 minutes, herbal tea 5–7 minutes. Adjust to taste; avoid over-steeping to prevent bitterness if using black tea or some greens and herbs.'}

In [14]:
type(res)

dict

In [None]:
# use input in notebook to interactively chat with user using chatbot
while True:
    user_input = input("User: ")
    if user_input.lower() in ['exit', 'quit']:
        print("Exiting chat.")
        break
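    response = client.invoke(query=user_input, json_schema=QueryResponse)
    # invoke returns a dict whose 'text' field holds the JSON string (as in the first client.invoke cell)
    print(f"AI: {json.loads(response['text'])['answer']}")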

## Traditional ReAct with models that have poor reasoning capabilities

In [None]:
react_prompt = """Answer the question asked by the user as best you can. You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. Use this whenever you are unsure.

<Instructions>
The user shares a question. If you only see a question, generate ONLY THE FIRST THOUGHT ON THE FIRST STEP. If you don't need to call a tool, then add the following sentence to your first thought: "I can generate the final answer now"
Then, you will be shown the Question and the thought. Now decide if you need to call a tool.
Lastly, you will be shown the Question, the thought, the action and the observation.
Now, you NEED TO GENERATE the next thought.
After the thought, you will decide if you need to call a tool again.

This goes on. Eventually, when you look at the conversation so far, if you think you have enough information to answer the original user question,
your thought must be "Now I know the final answer".
Once you see the thought as "Now I know the final answer", you will generate the final answer as JSON with the final_answer key mapped to the final answer.

At any point, you will see the conversation so far. Think only the next step as outlined above, not two. This is important.

<Rules>
1. Refrain from asking the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>

Use the following format to think, use tools and to stepwise arrive at the solution.
<example>
user Question: the input question you must answer - this is what the user gives you and should be what you aim to answer or do


Thought: you should always think about what to do. 
Action: The next action to take, either a tool or a generation or a thought
Observation: the result of the action
FinalAnswer: None


Thought: Next step based on the observation. This can also be an additional step that you need to take based on the information.
Action: The next action to take, either a tool or a generation or a thought
Observation: the result of the action
FinalAnswer: None

Thought: Now I know the final answer
FinalAnswer: The final answer to the original input question based on all the evidence so far. 
</example>


The conversation Begins now!"""

class Thought(BaseModel):
    thought: str

class FinalAnswer(BaseModel):
    final_answer: str
    
max_iter = 3
convo_so_far = ""
model = "gpt-5-mini"
effort = "minimal"
verbosity = "low"
user_question = "What are the latest developments in artificial intelligence? Also tell me which topics are part of top AI Research and curriculums?"
user_question = "Where to stay in Bangalore? I am looking for places in bangalore to stay in - I am moving from chennai. pull the latest information"

for i in range(max_iter):
    # Think
    thought_agent = LiteAgent(model_name=model,
               verbosity=verbosity, reasoning={'effort':effort}, 
              system_instructions=react_prompt)
    if i == 0:
        convo_so_far = f"Question: {user_question}\n"
    thought_response = thought_agent.invoke(query=convo_so_far, json_schema=Thought)
    thought_response = parse_string_or_dict(thought_response)
    thought_text = thought_response['thought']

    convo_so_far += f"Thought: {thought_text}\n"
    print(f"Thought: {thought_text}")

    # Now, if the thought is not final, then move on to the tool call
    if "final answer" in thought_text.lower():
        # Final answer reached
        final_answer_agent = LiteAgent(model_name=model,
               verbosity=verbosity, reasoning={'effort':"medium"}, 
              system_instructions=react_prompt)
        final_response = final_answer_agent.invoke(query=convo_so_far, json_schema=FinalAnswer)
        final_response = parse_string_or_dict(final_response)
        print(f"Final Answer: {final_response['final_answer']}")
        break

    else:
        # Use tool - here we assume only one tool call per thought for simplicity
        tool_agent = LiteAgent(model_name=model,
               verbosity=verbosity, reasoning={'effort':effort}, 
              system_instructions=react_prompt)
        tool_response = parse_string_or_dict(tool_agent.invoke(query=convo_so_far, tools=[search_web]))
        #observation_text = tool_response['text']
        tool_name = tool_response.get('tool_name', "No Tools")
        tool_return = tool_response.get('tool_return', "No Return")
        convo_so_far += f"Action: {tool_name}\nObservation: {tool_return}\n"
        print(f"Calling tool: {tool_name}, {tool_return}")
        # If max iterations reached, break
        if i == max_iter - 1:
            print("Max iterations reached. Ending.")
    
else:
    # for-else: runs only when the loop used up max_iter without hitting `break`,
    # so a final answer found on the last iteration is not generated twice
    print("Max iterations reached. Final response:")
    final_answer_agent = LiteAgent(model_name=model,
               verbosity="medium", reasoning={'effort':"medium"}, 
              system_instructions=react_prompt)
    final_response = parse_string_or_dict(final_answer_agent.invoke(query=convo_so_far + "\n\n Instruction: Based on the above conversation, answer the original user question to the best of your ability. Provide the output as a JSON with key as final_answer", json_schema=FinalAnswer))
    print(f"Final Answer: {final_response['final_answer']}")


Thought: Thought: Determine whether I need to call a tool for latest info. Since user asked for latest information about where to stay in Bangalore, I should use the web search tool to fetch up-to-date neighborhood and housing advice. Action: Use Web Search Tool to find current popular neighborhoods, rental trends, commute considerations, costs, and recommended areas for movers from Chennai. Observation: None
Calling tool: search_web, <search result 1>Koramangala, Indiranagar, and Whitefield are top choices for IT professionals in Bangalore, with rental prices ranging from ₹15,000 to ₹1,50,000 per month. These areas offer good connectivity and amenities.</ search result 1>

<search result 2>Title: Top 10 Best Areas To Rent In Bengaluru For Tech Professionals
URL: https://timesproperty.com/article/post/top-10-best-areas-to-rent-in-bengaluru-for-tech-professionals-blid10005
Content: FAQ

#### Which areas in Bengaluru are most popular for tech professionals to rent?

Popular areas include

In [53]:
print(convo_so_far)

Question: Where to stay in Bangalore? I am looking for places in bangalore to stay in - I am moving from chennai



In [5]:
print(final_response['final_answer'])

Recommended neighborhoods (2025):

- Koramangala: Popular with young professionals, vibrant dining/entertainment, well‑connected to Hosur Road and Koramangala blocks. Typical rents: 1BHK ₹20k–40k, 2BHK ₹30k–70k; premium 3BHKs much higher.

- Indiranagar: Upscale, nightlife, cafes, good metro access (Purple Line). Rents similar to Koramangala or higher for prime streets.

- HSR Layout: Planned residential area, quieter than Koramangala/Indiranagar, good for families and professionals working in SE Bengaluru. 1BHK ₹15k–35k.

- Whitefield: Major IT hub (ITPL) — ideal if you work in East Bangalore. Newer apartment complexes and malls; 1BHK ₹20k–35k, 2–3BHK higher.

- Electronic City: Best if your office is in Electronic City — lower rents, extensive tech parks, but longer travel to central areas.

- Jayanagar / J P Nagar / Banashankari: South Bangalore, family‑friendly, established schools/hospitals; moderate rents.

- BTM Layout / Sarjapur Road / Bellandur: Good for IT commuters (Outer Ri

### Traditional ReAct with non-reasoning Ollama models

In [49]:
react_prompt = """Answer the question asked by the user as best you can. You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. Use this whenever you are unsure.

<Instructions>
The user shares a question. If you only see a question, generate ONLY THE FIRST THOUGHT ON THE FIRST STEP. If you don't need to call a tool, then add the following sentence to your first thought: "I can generate the final answer now"
Then, you will be shown the Question and the thought. Now decide if you need to call a tool.
Lastly, you will be shown the Question, the thought, the action and the observation.
Now, you NEED TO GENERATE the next thought.
After the thought, you will decide if you need to call a tool again.

This goes on. Eventually, when you look at the conversation so far, if you think you have enough information to answer the original user question,
your thought must be "Now I know the final answer".
Once you see the thought as "Now I know the final answer", you will generate the final answer as JSON with the final_answer key mapped to the final answer.

At any point, you will see the conversation so far. Think only the next step as outlined above, not two. This is important.

<Rules>
1. Refrain from asking the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>

Use the following format to think, use tools and to stepwise arrive at the solution.
<example>
user Question: the input question you must answer - this is what the user gives you and should be what you aim to answer or do


Thought: you should always think about what to do. 
Action: The next action to take, either a tool or a generation or a thought
Observation: the result of the action
FinalAnswer: None


Thought: Next step based on the observation. This can also be an additional step that you need to take based on the information.
Action: The next action to take, either a tool or a generation or a thought
Observation: the result of the action
FinalAnswer: None

Thought: Now I know the final answer
FinalAnswer: The final answer to the original input question based on all the evidence so far. 
</example>


The conversation Begins now!"""

class Thought(BaseModel):
    thought: str

class FinalAnswer(BaseModel):
    final_answer: str
    
max_iter = 3
convo_so_far = ""
model = "mistral-small3.2:latest"
effort = "minimal"
verbosity = "low"
user_question = "What are the latest developments in artificial intelligence? Also tell me which topics are part of top AI Research and curriculums?"
user_question = "Where to stay in Bangalore? I am looking for places in bangalore to stay in - I am moving from chennai. pull the latest information"
user_question = "Where to stay in Bangalore recently and what are the house rents in 2025? cite your references. Also fetch me the latest India US relations news. Base your answer on the latest information"

for i in range(max_iter):
    # Think
    thought_agent = OllamaChat(model_name=model,
              system_instructions=react_prompt)
    if i == 0:
        convo_so_far = f"Question: {user_question}\n"
    thought_response = thought_agent.invoke(query=convo_so_far, json_schema=Thought)
    thought_response = parse_string_or_dict(thought_response)
    thought_text = thought_response['thought']

    convo_so_far += f"Thought: {thought_text}\n"
    print(f"Thought: {thought_text}")

    # Now, if the thought is not final, then move on to the tool call
    if "final answer" in thought_text.lower():
        # Final answer reached
        final_answer_agent = OllamaChat(model_name=model,
              system_instructions=react_prompt)
        final_response = final_answer_agent.invoke(query=convo_so_far, json_schema=FinalAnswer)
        final_response = parse_string_or_dict(final_response)
        print(f"Final Answer: {final_response['final_answer']}")
        break

    else:
        # Use tool - here we assume only one tool call per thought for simplicity
        tool_agent = OllamaChat(model_name=model,
               system_instructions=react_prompt)
        tool_response = parse_string_or_dict(tool_agent.invoke(query=convo_so_far, tools=[search_web]))
        #observation_text = tool_response['text']
        tool_name = tool_response.get('tool_name', "No Tools")
        tool_return = tool_response.get('tool_return', "No Return")
        convo_so_far += f"Action: {tool_name}\nObservation: {tool_return}\n"
        print(f"Calling tool: {tool_name}, {tool_return}")
        # If max iterations reached, break
        if i == max_iter - 1:
            print("Max iterations reached. Ending.")
    
else:
    # for-else: runs only when the loop used up max_iter without hitting `break`,
    # so a final answer found on the last iteration is not generated twice
    print("Max iterations reached. Final response:")
    final_answer_agent = OllamaChat(model_name=model,
              system_instructions=react_prompt)
    final_response = parse_string_or_dict(final_answer_agent.invoke(query=convo_so_far + "\n\n Instruction: Based on the above conversation, answer the original user question to the best of your ability. Provide the output as a JSON with key as final_answer", json_schema=FinalAnswer))
    print(f"Final Answer: {final_response['final_answer']}")


Thought: I need to find the latest information on where to stay in Bangalore and the house rents in 2025. I can start by using a web search tool to gather recent data on these topics. Also, I need to fetch the latest India US relations news. I can generate the final answer now
Final Answer: Thought: I need to find the latest information on where to stay in Bangalore and the house rents in 2025. I can start by using a web search tool to gather recent data on these topics. Also, I need to fetch the latest India US relations news. I can generate the final answer now
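
In the output above, the thought says a web search is still needed but also ends with "I can generate the final answer now", so the substring test `"final answer" in thought_text.lower()` sends the loop straight to the final-answer branch. A stricter check, sketched here as one possible alternative, treats a thought as final only when it is a sentinel phrase and nothing else. `SENTINELS` and `is_final_thought` are hypothetical names, not part of the notebook's helpers.

```python
# Sentinel phrases taken from the prompts in this notebook, lowercased
SENTINELS = (
    "now i know the final answer",
    "i can generate the final answer now",
)

def is_final_thought(thought: str) -> bool:
    """True only when the thought is a sentinel phrase and nothing else."""
    # Trim whitespace, then surrounding quotes and full stops, before comparing
    normalized = thought.strip().strip('."\'').strip().lower()
    return normalized in SENTINELS

mixed = ("I need to find the latest information on where to stay in Bangalore. "
         "I can generate the final answer now")
print(is_final_thought("Now I know the final answer."))
print(is_final_thought(mixed))
```

The trade-off is that a model which never emits the bare sentinel will keep calling tools until `max_iter` is reached.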


## Models with better reasoning

In [30]:
react_prompt = """Answer the question asked by the user as best you can. You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, evaluate the sub-expression with the higher precedence first
<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be to say "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be to say "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Enclose the final answer in <final_answer> ... </final_answer> tags mandatorily
<Rules>
1. Refrain from asking the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>



The conversation Begins now!"""

react_prompt = """You are a helpful agent working with me. Answer the question asked by the user as best you can by generating thoughts, tool calls (actions) and observations. You should look at the question, thoughts and actions (tool calls) that have happened so far (I will append the observations - you don't have to fill in the observations). You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, evaluate the sub-expression with the higher precedence first
<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be to say "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be to say "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Enclose the final answer in <final_answer> ... </final_answer> tags mandatorily
<Rules>
1. Refrain from asking the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>


You will converse with a user who will give you a question to solve.
The conversation Begins now!"""


class Thought(BaseModel):
    thought: str

class FinalAnswer(BaseModel):
    final_answer: str
    
max_iter = 5
convo_so_far = ""
model = "gpt-5-nano"
model = "gpt-5-mini"
effort = 'minimal'
effort = "medium"
verbosity = "low"
user_question = "What are the latest developments in artificial intelligence? Also tell me which topics are part of top AI Research and curriculums?"
user_question = "Where to stay in Bangalore? I am looking for places in bangalore to stay in - I am moving from chennai. pull the latest information"
user_question = "what is the square root of tanh(4.35) and also search for the latest research on AI"
user_question = "Where to stay in Bangalore recently and what are the house rents in 2025? cite your references. Also fetch me the latest India US relations news. Base your answer on the latest information"

for i in range(max_iter):
    # Think
    print(f"Iteration: {i}")
    thought_agent = LiteAgent(model_name=model,
               verbosity=verbosity, reasoning={'effort':effort}, 
              system_instructions=react_prompt)
    if i == 0:
        convo_so_far = f"Question: {user_question}\n"
    thought_response = thought_agent.invoke(query=convo_so_far, tools=[search_web, calculator])
    #print(thought_response)

    if isinstance(thought_response, dict):
        thought_response = parse_string_or_dict(thought_response)
    #thought_text = thought_response['thought']

    try:
        # Tool-call reply: read every field first so a missing key cannot
        # leave a half-written entry in convo_so_far
        thought = thought_response['text']
        tool_name = thought_response.get('tool_name', 'No Tools')
        tool_return = thought_response.get('tool_return', 'No Return')
    except (KeyError, TypeError, AttributeError):
        # Plain-text reply (no tool call): record the whole reply as the thought
        convo_so_far += f"Thought: {thought_response}\n"
        print(f"Thought: {thought_response}")
    else:
        convo_so_far += f"Thought: {thought}\nAction: {tool_name}\nObservation: {tool_return}\n"
        print(f"Thought: {thought}")
        print(f"Action: {tool_name}")
        print(f"Observation: {tool_return}")



    # Check for final answer in thought
    if isinstance(thought_response, str) and "<final_answer>" in thought_response.lower():
        print("\n\n Final answer found in thought. Ending.")
        answer = extract_tagged_content(text = convo_so_far, tag = "final_answer")
        break



else:  # for-else: runs only when the loop ended without `break`, so a found answer is not overwritten
    print("Max iterations reached. Final response:")
    final_answer_agent = LiteAgent(model_name=model,
               verbosity="medium", reasoning={'effort':"medium"}, 
              system_instructions=react_prompt)
    answer = parse_string_or_dict(final_answer_agent.invoke(query=convo_so_far + "\n\n Instruction: Based on the above conversation, answer the original user question to the best of your ability. Provide the output as a JSON with key as final_answer", json_schema=FinalAnswer))
    answer = answer["final_answer"]


Iteration: 0
Thought: Thought: I will search the web for current (2025) Bangalore neighborhood recommendations and rent levels, and for the latest India–US relations news. I'll run both searches in parallel.
Action: search_web
Observation: <search result 1>India and the US continue trade negotiations despite tariffs, with both sides emphasizing strategic partnership. Prime Minister Modi and President Trump express optimism. A defense deal for jet engines signals progress.</ search result 1>

<search result 2>Title: At the Crossroads: India's Relations with the U.S., China, and Russia
URL: https://www.asiapacific.ca/publication/crossroads-indias-relations-us-china-and-russia
Content: Although India initially appeared to be a frontrunner in securing a trade deal with Washington after the announcement of President Trump’s “reciprocal tariffs” in April 2025, negotiations have hit a dead end. A U.S. trade delegation, scheduled to visit India in August, was reportedly cancelled with no sched

In [32]:
print(answer)

<final_answer>Where to stay in Bengaluru (Bangalore) in 2025 — quick guide and typical rents

Recommended neighbourhoods (by type) and typical monthly rents (2025 estimates)
- Premium/central (good for nightlife, dining, short commutes to central offices): Indiranagar, Koramangala, MG Road, Jayanagar, HSR Layout
  - Typical rents: 1BHK ≈ ₹20,000–40,000; 2BHK ≈ ₹35,000–70,000; 3BHK ≈ ₹50,000–100,000+ depending on building and exact location. (Sources: NoBroker; RentMeUp) 
  - References: NoBroker (cost-of-living & neighbourhoods) https://www.nobroker.in/blog/cost-of-living-in-bangalore/; RentMeUp (2025 trends) https://rentmeup.in/blog/bangalore-rental-market-trends-2025

- IT / Suburban hubs (good for proximity to tech parks, often newer apartments): Whitefield, Sarjapur Road, Bellandur, Hebbal
  - Typical rents: 1BHK ≈ ₹15,000–30,000; 2BHK ≈ ₹25,000–55,000; 3BHK ≈ ₹40,000–80,000. Whitefield and Sarjapur often command higher rents because of IT campus access. (Sources: bengaluruhousing,
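
The loop above relies on `extract_tagged_content` from `utils`, whose implementation is not shown in this notebook. As a rough idea of how content between `<final_answer>` tags could be pulled out of a transcript, here is a sketch using a regular expression. `extract_tag` is a hypothetical name, and its behaviour (returning the last match, stripped) is an assumption that may differ from the real helper.

```python
import re

def extract_tag(text: str, tag: str):
    """Return the text inside the last <tag>...</tag> pair, or None if absent."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    # DOTALL lets the answer span several lines; IGNORECASE tolerates <Final_Answer>
    matches = re.findall(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    return matches[-1].strip() if matches else None

transcript = "Thought: done\n<final_answer>\nKoramangala and HSR Layout\n</final_answer>"
print(extract_tag(transcript, "final_answer"))
```

Taking the last match is a deliberate choice for transcripts, where earlier text may mention the tag before the real answer arrives.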

### Ollama reasoning version of ReAct

In [27]:
react_prompt1 = """Answer the question asked by the user as best you can. You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information or if the user asks to do a web search. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, evaluate the sub-expression with the higher precedence first
<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be to say "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be to say "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Enclose the final answer in <final_answer> ... </final_answer> tags mandatorily
<Rules>
1. Refrain from asking the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>



The conversation Begins now!"""
react_prompt = """You are a helpful agent working with me. 

1. Answer the question asked by the user as best you can by generating thoughts, tool calls (actions) and observations. 
2. Look at the question, thoughts, and actions (tool calls) that have happened so far (I will append the observations - you don't have to fill in the observations).
3. You will receive feedback from an expert agent on what is missing in your answer. The feedback will be enclosed in <feedback> ... </feedback> tags. You must incorporate the feedback in your next thought and actions(tool calls). Then, generate the final answer.
You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. If looking for any info in 2025 use this tool. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time. Use it if the user asks you to.
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, first evaluate the part of the expression with the higher precedence

<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Always enclose the final answer in <final_answer> ... </final_answer> tags
<Rules>
1. Do not ask the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
3. If the user has a multi-part question requiring multiple tool calls, call the first tool in the first iteration. Then, in the second iteration, call the second tool, etc. Remember to make all the required tool calls.
</Rules>
</Instructions>


You will converse with a user who will give you a question to solve.
The conversation Begins now!"""

react_prompt1 = """You are a helpful agent working, thinking, and using tools to solve the question that I give you. 

1. Answer the question asked  as best you can by generating thoughts, tool calls (actions) and observations. 
2. Look at the question, thoughts, and actions (tool calls) that have happened so far (I will append the observations - you don't have to fill in the observations).

You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. If looking for any info in 2025 use this tool. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time. Use it if the user asks you to.
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, first evaluate the part of the expression with the higher precedence
3. get_time - use this to get the current time

<Instructions>
First generate a thought and a tool call if required. You can call multiple tools at once.
At any point, if you think you have enough information to answer the original user question,
your thought must be "Now I know the final answer" and then generate the final answer enclosed in <final_answer> ... </final_answer> tags
Your End objective is to answer all of the user's questions comprehensively. Constantly check if you need to do something more or call more tools, etc at every step

You may see:
thought: this is the thought process taken at a previous time
action: which tool call to make. if no tool calls, then go to final answer
observation:result of tool call
final_answer: <final_answer> ... </final_answer>
<Rules>
1. Do not ask the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
3. If the user has a multi-part question requiring multiple tool calls, call as many tools as required at any step
</Rules>

</Instructions>
"""

class Thought(BaseModel):
    thought: str

class FinalAnswer(BaseModel):
    final_answer: str
    
class Feedback(BaseModel):
    feedback: str


feedback_prompt = """ You are an expert agent reviewing the conversation between a user and a helpful agent. The helpful agent is trying to answer the user's question by generating thoughts, tool calls (actions) and observations.
You will see the conversation so far. Your task is to provide feedback on what is missing in the answer so far. 
Provide the feedback as JSON, with feedback as the key and the feedback text as the value."""


max_iter = 9
convo_so_far = ""
model = "gpt-5-nano"
model = "gpt-oss:latest"
effort = 'minimal'
effort = "low"
verbosity = "low"
model = "hf.co/unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:Q4_K_M"
model = model2
user_question = "What are the latest developments in artificial intelligence? Also tell me which topics are part of top AI Research and curriculums?"
user_question = "what is the square root of tanh(4.35) and also search wikipedia for the latest research on AI"
user_question = "what is the difference between NPD and BPD? Analyse the attachment styles view of those. What does DSM-5 have to say about both? Also, what are the latest research papers on this topic? Summarise the findings in those papers"
user_question = "Where to stay in Bangalore recently and what are the house rents in 2025? cite your references. Also fetch me the latest India US relations news from 2025. Base your answer on the latest information"
user_question = """You are an LLM Expert. I am trying to build a react agent - using Google's react framework for building agents. However, I am stuck between 2 alternatives of showing the updated context to the agent as the iterations progress. 1. Concatenate the entire history of Thought, Action, Observation done n times as a single string and show it to the agent to help it decide the next step 2. Provide it as ["type": "user", "content": "history so far"] Here is what I want you to help me with - after you think about it and refer to material online (authentic ones) 1. Which one of the above 2 is preferred and why - get this information from material online and your own reasoning 2. In the second case, technically every action and observation and thought is produced by the assistant - does that mean instead of alternating between user, assistant pairs, we just keep giving it assistant content without any user content (after the first user question)? 3. Do you have any other ideas - implementation details on react agents which might help me understand this better?"""
for i in range(max_iter):
    # Think
    print(f"Iteration: {i}")
    thought_agent = OllamaChat(model_name=model,
              system_instructions=react_prompt1)
    if i == 0:
        convo_so_far = f"Question: {user_question}\n"
    # react_prompt1 advertises get_time as a tool, so pass it along with the others
    thought_response = thought_agent.invoke(query=convo_so_far, tools=[search_web, calculator, get_time])
    #print(thought_response)

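    if isinstance(thought_response, dict):
        thought_response = parse_string_or_dict(thought_response)
    #thought_text = thought_response['thought']

    if isinstance(thought_response, dict):
        # Structured reply: record the thought plus any tool call and its result.
        # .get() with defaults avoids a KeyError midway, which previously
        # caused the same thought to be appended to convo_so_far twice.
        thought_text = thought_response.get('text', '')
        tool_name = thought_response.get('tool_name', 'No Tools')
        tool_return = thought_response.get('tool_return', 'No Return')
        convo_so_far += f"Thought: {thought_text}\n"
        convo_so_far += f"Action: {tool_name}\n"
        convo_so_far += f"Observation: {tool_return}\n"
        print(f"Thought: {thought_text}")
        print(f"Action: {tool_name}")
        print(f"Observation: {tool_return}")
    else:
        # Plain-text reply (no tool call)
        convo_so_far += f"Thought: {thought_response}\n"
        print(f"Thought: {thought_response}")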



    # Check for final answer in thought
    if isinstance(thought_response, str) and ("<final_answer>" in thought_response.lower() or thought_response.strip() == ""):
        print("\n\n Final answer found in thought. Ending.")
        answer = extract_tagged_content(text = convo_so_far, tag = "final_answer")
        break



else:  # for/else: runs only when the loop ends without break, i.e. no final answer was found
    print("Max iterations reached. Final response:")
    final_answer_agent = OllamaChat(model_name=model,
              system_instructions="You are a helpful agent specializing in watching a conversation and providing the answer to a user question based on the following conversation consisting of thoughts of an agent, actions(which can be tool calls) and the outputs of the tools")
    answer = parse_string_or_dict(final_answer_agent.invoke(query=convo_so_far + "\n\n Instruction: Based on the above conversation, answer the original user question to the best of your ability. Provide the output as JSON with the key final_answer. Write the answer in roughly 500 words.", json_schema=FinalAnswer))
    answer = answer["final_answer"]


Iteration: 0
Thought: <think>
Okay, let's tackle this problem step by step. The user is building a React agent using Google's React framework and is stuck between two approaches for showing updated context to the agent as iterations progress.

First, I need to figure out what each option entails:

Option 1: Concatenate all history (Thought, Action, Observation) into a single string and show it to the agent. This would be like giving the entire conversation history in one go.

Option 2: Provide structured data as ["type": "user", "content": "history so far"]. So each message is tagged with user or assistant, but the user here might be confused because they're generating the content themselves.

The first question asks which approach is preferred based on online materials and my reasoning. I should check what existing React agent frameworks do. Wait, Google's React framework for agents—maybe they refer to something like the OpenAI’s API or specific React libraries? Hmm, maybe it's a typo
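
The thought printed above still carries the model's `<think>` text, and the loop looks for `<final_answer>` tags in the reply. A minimal sketch of how both could be handled with regular expressions follows. The helper names `strip_think` and `extract_tag` are hypothetical and are not the implementation of `extract_tagged_content` in `utils`, which is not shown in this notebook.

```python
import re

def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks, including an unterminated trailing one."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL | re.IGNORECASE)
    # A reply cut off mid-reasoning has an opening tag but no closing tag
    text = re.sub(r"<think>.*\Z", "", text, flags=re.DOTALL | re.IGNORECASE)
    return text.strip()

def extract_tag(text: str, tag: str):
    """Return the content of the last <tag>...</tag> pair, or None if there is none."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    matches = re.findall(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    return matches[-1].strip() if matches else None
```

Taking the last match is a design choice made here: earlier text may mention the tags without holding the actual answer.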

In [28]:
Markdown(answer)


1. **Preferred approach**: Option 2 ("Provide as ["type": "user", "content": ...]") is preferred over concatenation because:
   - It maintains contextual coherence by preserving message ownership (who generated what)
   - Follows standard ReAct framework patterns used in industry
   - Prevents token overflow issues that occur when concatenating all history into a single string
   - Aligns with how most LLM agent implementations handle step-by-step reasoning

2. **On message types**: After the first user question, you should **always use type="assistant"** for your agent's thoughts/actions/observations. Using "user" would mislead the model (it would treat them as new human inputs rather than system-generated context).

3. **Additional implementation ideas**:
   - Implement a ReAct cycle pattern: Reason → Act → Observe
   - Use state management with React hooks to track conversation history
   - For complex tasks, consider isolating context into separate threads (as mentioned in Result 4)
   - Store temporary context locally using browser APIs like IndexedDB for session persistence
   - Always maintain clear separation between user inputs and agent-generated content via message types

This approach follows industry standards from frameworks like LangChain and ReAct implementations that prioritize both model performance and coherence.
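
The two context formats compared in the answer above can be sketched side by side. This is an illustration only: the `steps` structure (a list of dicts with `thought`, `action`, `observation` keys) and both function names are assumptions, and the `tool` role for observations mirrors what the message-based cell in this notebook does.

```python
def as_single_string(question, steps):
    """Option 1: concatenate the whole Thought/Action/Observation history."""
    lines = [f"Question: {question}"]
    for step in steps:
        lines.append(f"Thought: {step['thought']}")
        lines.append(f"Action: {step['action']}")
        lines.append(f"Observation: {step['observation']}")
    return "\n".join(lines)

def as_messages(system_prompt, question, steps):
    """Option 2: one role-tagged message per turn."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
    for step in steps:
        # Thought and action come from the model, so they get the assistant role
        messages.append({
            "role": "assistant",
            "content": f"Thought: {step['thought']}\nAction: {step['action']}",
        })
        # The observation is the tool's output, not something the model wrote
        messages.append({"role": "tool", "content": str(step["observation"])})
    return messages
```

Both functions carry the same text, so the second format mainly changes who each piece is attributed to rather than how much context is sent.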


In [29]:
print(answer, len(convo_so_far)//4)


1. **Preferred approach**: Option 2 ("Provide as ["type": "user", "content": ...]") is preferred over concatenation because:
   - It maintains contextual coherence by preserving message ownership (who generated what)
   - Follows standard ReAct framework patterns used in industry
   - Prevents token overflow issues that occur when concatenating all history into a single string
   - Aligns with how most LLM agent implementations handle step-by-step reasoning

2. **On message types**: After the first user question, you should **always use type="assistant"** for your agent's thoughts/actions/observations. Using "user" would mislead the model (it would treat them as new human inputs rather than system-generated context).

3. **Additional implementation ideas**:
   - Implement a ReAct cycle pattern: Reason → Act → Observe
   - Use state management with React hooks to track conversation history
   - For complex tasks, consider isolating context into separate threads (as mentioned in Result 
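
The `len(convo_so_far)//4` in the cell above appears to be a rough token estimate, assuming about four characters per token. A sketch of using the same heuristic to keep a message history under a budget follows. The function names and the policy of always keeping the first two messages are assumptions for illustration, and a real tokenizer would give different counts.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token count: characters divided by an assumed average token length."""
    return len(text) // chars_per_token

def trim_to_budget(messages, max_tokens):
    """Keep the first two messages (system, user question), then as many of the
    most recent messages as fit in the remaining budget."""
    head, tail = messages[:2], messages[2:]
    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in head)
    kept = []
    for message in reversed(tail):
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break  # older messages are dropped once one no longer fits
        kept.append(message)
        budget -= cost
    return head + kept[::-1]
```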

In [13]:
print("Max iterations reached. Final response:")
final_answer_agent = OllamaChat(model_name=model,
            system_instructions="You are a helpful agent specializing in watching a conversation and providing the answer to a user question based on the following conversation consisting of thoughts of an agent, actions(which can be tool calls) and the outputs of the tools")
answer = parse_string_or_dict(final_answer_agent.invoke(query=convo_so_far + "\n\n Instruction: Based on the above conversation, answer the original user question to the best of your ability. Provide the output as JSON with the key final_answer. Write the answer in roughly 500 words.", json_schema=FinalAnswer))
answer = answer["final_answer"]

Max iterations reached. Final response:


In [18]:
Markdown(answer)

Based on current data and web searches:

**Where to stay in Bangalore recently:** Top accommodations include Taj Bangalore (MG Road), Hyatt Centric MG Road, The Leela Palace Bengaluru, and Hilton Garden Inn Embassy. These properties are rated highly for their modern amenities, strategic locations near business hubs, and luxury offerings.

*References:* Tripadvisor's 2024 rankings ([URL](https://www.tripadvisor.com/Hotels-g297628-Bengaluru_Bangalore_District_Karnataka-Hotels.html)) and The Hotel Guru's Bengaluru hotel list (2025 projections).

**House rents in 2025:** There are no official rental price forecasts for 2025 as this year has not yet occurred. Current real estate data only projects trends up to mid-2024 with short-term estimates (1-3 years ahead). While some hotels reference '2025' dates in their booking systems, these indicate future stays rather than rent projections.

*Key insight:* Bangalore's housing market shows 10-15% annual growth for apartments near tech hubs but no formalized long-term projections beyond current economic indicators.

**Latest India-US relations news (late 2024):** Recent developments include:
- Strengthened cooperation through I2U2 initiative and semiconductor supply chains
- Trade tensions due to U.S. tariff policies on Indian exports ($45.7B trade surplus in 2024)
- Artemis Accords for space exploration
- Security agreements on defense supplies (2024)
- Recent challenges under potential Trump administration policies including threats of up to 100% tariffs on India's imports

*References:* Carnegie Endowment report ([URL](https://carnegieendowment.org/posts/2024/09/india-us-relations-beyond-the-modi-biden-dynamic)), Al Jazeera analysis (August 2025).

The data shows a complex relationship with opportunities in technology and trade but ongoing challenges from policy disagreements. For future rent projections beyond current economic models, specialized real estate forecasting tools would be required.

### Using Message System

In [None]:
react_prompt1 = """Answer the question asked by the user as best you can. You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information or if the user asks to do a web search. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, first evaluate the part of the expression with the higher precedence
<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Always enclose the final answer in <final_answer> ... </final_answer> tags
<Rules>
1. Do not ask the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
</Rules>
</Instructions>



The conversation Begins now!"""
react_prompt = """You are a helpful agent working with me. 

1. Answer the question asked by the user as best you can by generating thoughts, tool calls (actions) and observations. 
2. Look at the question, thoughts, and actions (tool calls) that have happened so far (I will append the observations - you don't have to fill in the observations).
3. You will receive feedback from an expert agent on what is missing in your answer. The feedback will be enclosed in <feedback> ... </feedback> tags. You must incorporate the feedback in your next thought and actions(tool calls). Then, generate the final answer.
You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. If looking for any info in 2025 use this tool. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time. Use it if the user asks you to.
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, first evaluate the part of the expression with the higher precedence

<Instructions>
Answer as per the scenario mentioned:
Scenario 1: You see the user question only:
    - generate the Thought AND a TOOL Call. Generate both. If you don't need to call a tool, then your thought must be "I can generate the final answer now"

Scenario 2: You see the user question, (thought, action, observation) repeated n times:
    - generate the next Thought. If you don't need to call a tool (which means you already feel you know enough to answer the user question), then your thought must be "I can generate the final answer now"

Scenario 3: You see the user question, (thought, action, observation) repeated n times and the last thought is "I can generate the final answer now":
    - generate the Final Answer. This final answer must use all the information you have gathered so far to answer the original user question comprehensively.
    - Always enclose the final answer in <final_answer> ... </final_answer> tags
<Rules>
1. Do not ask the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
3. If the user has a multi-part question requiring multiple tool calls, call the first tool in the first iteration. Then, in the second iteration, call the second tool, etc. Remember to make all the required tool calls.
</Rules>
</Instructions>


You will converse with a user who will give you a question to solve.
The conversation Begins now!"""

react_prompt1 = """You are a helpful agent working, thinking, and using tools to solve the question that I give you. 

1. Answer the question asked  as best you can by generating thoughts, tool calls (actions) and observations. 
2. Look at the question, thoughts, and actions (tool calls) that have happened so far (I will append the observations - you don't have to fill in the observations).
3. Very important: **Reflect on what you have done. Identify whether anything more is needed to answer the full user question. If yes, do it.** **At every step, check whether you need to do more or call more tools.**

You have access to the following tools:
1. Web Search Tool - use this when looking for new information or latest information on something. If looking for any info in 2025 use this tool. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time. Use it if the user asks you to.
2. Calculator - Use this whenever you need to do mathematical calculations - like adding, subtracting, multiplying, dividing, square roots, powers, logarithms, trigonometric functions etc. When using this, use only an expression with a maximum of 2 variables. For more than that, first evaluate the part of the expression with the higher precedence
3. get_time - use this to get the current time

<Instructions>
First generate a thought and a tool call if required. You can call multiple tools at once.
At any point, if you think you have enough information to answer the original user question,
your thought must be "Now I know the final answer" and then generate the final answer enclosed in <final_answer> ... </final_answer> tags
Your End objective is to answer all of the user's questions comprehensively. Constantly check if you need to do something more or call more tools, etc at every step

You may see:
thought: this is the thought process taken at a previous time
action: which tool call to make. if no tool calls, then go to final answer
observation:result of tool call
final_answer: <final_answer> ... </final_answer>
<Rules>
1. Do not ask the user for more information. If you don't have enough information, assume the user wants a comprehensive answer.
2. If you are not sure, use the web search tool
3. If the user has a multi-part question requiring multiple tool calls, call as many tools as required at any step
4. Reflect on what you have done. Identify whether anything more is needed to answer the full user question. If yes, do it.
</Rules>

</Instructions>
"""

class Thought(BaseModel):
    thought: str

class FinalAnswer(BaseModel):
    final_answer: str
    
class Feedback(BaseModel):
    feedback: str


feedback_prompt = """ You are an expert agent reviewing the conversation between a user and a helpful agent. The helpful agent is trying to answer the user's question by generating thoughts, tool calls (actions) and observations.
You will see the conversation so far. Your task is to provide feedback on what is missing in the answer so far. 
Provide the feedback as JSON, with feedback as the key and the feedback text as the value."""


max_iter = 9
convo_so_far = ""
model = "gpt-5-nano"
model = "gpt-oss:latest"
effort = 'minimal'
effort = "low"
verbosity = "low"
model = "hf.co/unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:Q4_K_M"
model = model2
user_question = "What are the latest developments in artificial intelligence? Also tell me which topics are part of top AI Research and curriculums?"
user_question = "what is the square root of tanh(4.35) and also search wikipedia for the latest research on AI"
user_question = "what is the difference between NPD and BPD? Analyse the attachment styles view of those. What does DSM-5 have to say about both? Also, what are the latest research papers on this topic? Summarise the findings in those papers"
user_question = "Where to stay in Bangalore recently and what are the house rents in 2025? cite your references. Also fetch me the latest India US relations news from 2025. Base your answer on the latest information"
user_question = """You are an LLM Expert. I am trying to build a react agent - using Google's react framework for building agents. However, I am stuck between 2 alternatives of showing the updated context to the agent as the iterations progress. 1. Concatenate the entire history of Thought, Action, Observation done n times as a single string and show it to the agent to help it decide the next step 2. Provide it as ["type": "user", "content": "history so far"] Here is what I want you to help me with - after you think about it and refer to material online (authentic ones) 1. Which one of the above 2 is preferred and why - get this information from material online and your own reasoning 2. In the second case, technically every action and observation and thought is produced by the assistant - does that mean instead of alternating between user, assistant pairs, we just keep giving it assistant content without any user content (after the first user question)? 3. Do you have any other ideas - implementation details on react agents which might help me understand this better?"""

overall_history = []
overall_history.append({"role":"system", "content": react_prompt1})
overall_history.append({"role":"user", "content": user_question})

for i in range(max_iter):
    # Think
    print(f"Iteration: {i}")
    thought_agent = OllamaChat(model_name=model,
              system_instructions=react_prompt1)
    if i == 0:
        convo_so_far = f"Question: {user_question}\n"
    # react_prompt1 advertises get_time as a tool, so pass it along with the others
    thought_response = thought_agent.invoke(messages = overall_history, tools=[search_web, calculator, get_time])

    #print(thought_response)

    # Normalize a single structured reply to a one-item list
    if isinstance(thought_response, dict):
        thought_response = [parse_string_or_dict(thought_response)]

    if isinstance(thought_response, list):
        # Parse any string items so every entry is a dict
        thought_response = [parse_string_or_dict(t) if isinstance(t, str) else t for t in thought_response]
        thoughts = [t.get('text', '') for t in thought_response]
        # Keyed by tool name, so repeated calls to the same tool keep only the last result
        tool_call_results = {t['tool_name']: t.get('tool_return', 'No Return') for t in thought_response if 'tool_name' in t}
        tool_names_called = [t['tool_name'] for t in thought_response if 'tool_name' in t]
        overall_history.append({"role":"assistant", "content": f"Thought: {thoughts}, \n\n Action(tool(s) called): {tool_names_called}"})
        overall_history.append({"role":"tool", "content": f"{tool_call_results}"})
    elif isinstance(thought_response, str):
        # Plain-text reply (no tool call): record it so later turns and the final summary can see it
        overall_history.append({"role":"assistant", "content": thought_response})

    #print(f"Thought: {thought_response['text']}")
    #print(f"Action: {thought_response['tool_name']}")
    #print(f"Observation: {thought_response['tool_return']}")
    print(*overall_history, sep = "\n")
    print("\n\n\n\n\n\n")




    # Check for final answer in thought
    if isinstance(thought_response, str) and ("<final_answer>" in thought_response.lower() or thought_response.strip() == ""):
        print("\n\n Final answer found in thought. Ending.")
        #answer = extract_tagged_content(text = convo_so_far, tag = "final_answer")
        final_user_message = f"Based on the above conversation, give the final answer to the original user question as JSON with the key final_answer. Write the answer in 500 - 1000 words. Original question: {user_question}"
        overall_history.append({"role":"user", "content": final_user_message})

        
        final_answer_agent = OllamaChat(model_name=model,
                system_instructions="You are a helpful agent specializing in watching a conversation and providing the answer to a user question based on the following conversation consisting of thoughts of an agent, actions(which can be tool calls) and the outputs of the tools")
        answer = parse_string_or_dict(final_answer_agent.invoke(messages = overall_history, json_schema=FinalAnswer))
        answer = answer["final_answer"]
        
        break



else:  # for/else: runs only when the loop ends without break, i.e. no final answer was found
    print("Max iterations reached. Final response:")
    # Prepare the last user message asking to give the final result
    final_user_message = f"Based on the above conversation, give the final answer to the original user question as JSON with the key final_answer. Write the answer in 500 - 1000 words. Original question: {user_question}"
    overall_history.append({"role":"user", "content": final_user_message})

    
    final_answer_agent = OllamaChat(model_name=model,
              system_instructions="You are a helpful agent specializing in watching a conversation and providing the answer to a user question based on the following conversation consisting of thoughts of an agent, actions(which can be tool calls) and the outputs of the tools")
    answer = parse_string_or_dict(final_answer_agent.invoke(messages = overall_history, json_schema=FinalAnswer))
    answer = answer["final_answer"]


Iteration: 0
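
When the model calls the same tool more than once in a step, results stored in a dict keyed by tool name overwrite each other. A minimal sketch of recording each call as its own `tool` message follows. The function name `record_step` is hypothetical, and the response shape (dicts with `text`, `tool_name`, `tool_return`) is an assumption based on the keys used in this notebook.

```python
def record_step(history, responses):
    """Append one assistant message, then one tool message per tool call."""
    thoughts = [r.get("text", "") for r in responses]
    # A list of pairs keeps repeated calls to the same tool distinct
    calls = [
        (r["tool_name"], r.get("tool_return", "No Return"))
        for r in responses
        if "tool_name" in r
    ]
    names = [name for name, _ in calls]
    thought_text = " ".join(t for t in thoughts if t)
    history.append({
        "role": "assistant",
        "content": f"Thought: {thought_text}\nAction (tools called): {names}",
    })
    for name, result in calls:
        history.append({"role": "tool", "content": f"{name}: {result}"})
    return history
```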


In [32]:
overall_history

[{'role': 'system',
  'content': 'You are a helpful agent working with me. \n\n1. Answer the question asked by the user as best you can by generating thoughts, tool calls (actions) and observations. \n2. You should look at the thoughts, question, actions(tool calls) that happened so far (I will append the observations - you dont have to fill in the observations). \n3. You will receive feedback from an expert agent on what is missing in your answer. The feedback will be enclosed in <feedback> ... </feedback> tags. You must incorporate the feedback in your next thought and actions(tool calls). Then, generate the final answer.\nYou have access to the following tools:\n1. Web Search Tool - use this when looking for new information or latest information on something. If looking for any info in 2025 use this tool. Use this whenever you are unsure. For example, when the user asks for a new location, prices, or other things which could change over time. Use it if the user asks you to.\n2. Calc

In [34]:
Markdown(answer)

I can generate the final answer now
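
A minimal sketch of the loop's termination logic, with the model replaced by a stub so it runs without Ollama. Everything here is an assumption for illustration: `model_step` is a hypothetical callable that returns either a plain string or a list of tool-call dicts, and `run_loop` is not part of `OllamaChat`. The sketch records plain-text replies in the history before checking for the `<final_answer>` tag, so a later summarizing call can see them.

```python
def run_loop(model_step, history, max_iter=9):
    """Return ("final", history) once a reply contains <final_answer>,
    or ("max_iter", history) if the iteration budget runs out."""
    for _ in range(max_iter):
        reply = model_step(history)
        if isinstance(reply, str):
            # Record the plain-text reply before deciding whether to stop
            history.append({"role": "assistant", "content": reply})
            if "<final_answer>" in reply.lower():
                return "final", history
            continue
        for call in reply:
            history.append({"role": "assistant", "content": f"Action: {call['tool_name']}"})
            history.append({"role": "tool", "content": str(call.get("tool_return"))})
    return "max_iter", history

# Stubbed model: one tool call, then a final answer
replies = iter([
    [{"tool_name": "calculator", "tool_return": "2"}],
    "Now I know the final answer <final_answer>2</final_answer>",
])
status, messages = run_loop(lambda h: next(replies), [{"role": "user", "content": "1+1?"}])
```

Returning a status makes the two exits explicit, so the caller can decide whether a separate summarizing call is still needed.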