<a href="https://colab.research.google.com/github/vishnupancharatnala/Agents-from-scratch/blob/main/agents_from_scratch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Agents work in a Thought, Action and Observation cycle
#### 1) Thought:

*   Understanding the user request (requires natural language understanding)
*   Reasoning (thinking about what it can do)
*   Planning how to fulfill the requirements

#### 2) Action:

*   Interacting with the environment to check which tools will fulfill the task
*   Invoking the tools

#### 3) Observation:

*   Output of the invoked tool, fed back to the LLM
*   If the response is wrong or incomplete, the agent goes back to Thought
*   If the response is correct, the agent returns it as the output


# **Thought implementation**

#### Thought is the brain of the agent: it has to understand the request and reason about it.
#### We use an LLM for this purpose.
#### We have to give the instructions in a proper format.
#### For the agent to think, the prompt has to tell it to think step by step.

#### For our agent, the prompt has to contain:
#### 1) The role
#### 2) What it has to do
#### 3) The tools available
#### 4) The Thought/Action/Observation cycle
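
The four prompt parts listed above can be assembled programmatically. The following is only a minimal sketch: `build_system_prompt` and its parameters are hypothetical names introduced for illustration, and the wording of each part is borrowed from the system message used in this notebook.

```python
# Hypothetical helper: assembles a system prompt from the four parts
# (role, task, tools, Thought/Action/Observation cycle).
def build_system_prompt(role: str, task: str, tool_lines: list) -> str:
    tools_section = "\n".join(tool_lines)
    cycle = (
        "You should think step by step in order to fulfill the objective, with a "
        "reasoning divided into Thought/Action/Observation steps that can be "
        "repeated multiple times if needed."
    )
    return (
        f"{role} {task}\n\n"
        f"You have access to the following tools:\n\n{tools_section}\n\n"
        f"{cycle}"
    )

prompt = build_system_prompt(
    role="You are an AI assistant designed to help users efficiently and accurately.",
    task="Your primary goal is to provide helpful, precise, and clear responses.",
    tool_lines=[
        "Tool Name: calculator, Description: Multiply two integers., "
        "Arguments: a: int, b: int, Output: int"
    ],
)
print(prompt)
```

Keeping the tool lines as a separate argument means the tools section can later be generated from the tool functions instead of being typed by hand.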


In [None]:
system_message="""You are an AI assistant designed to help users efficiently and accurately. Your primary goal is to provide helpful, precise, and clear responses.

You have access to the following tools:

Tool Name: calculator, Description: Multiply two integers., Arguments: a: int, b: int, Output: int

You should think step by step in order to fulfill the objective with a reasoning divided into Thought/Action/Observation steps that can be repeated multiple times if needed.

You should first reflect on the current situation using 'Thought: {your_thoughts}', then (if necessary), call a tool with the proper JSON formatting 'Action: {JSON_BLOB}',
or print your final answer starting with the prefix Final Answer:"""

#### The above is the format in which we give instructions to the agent.
#### The Thought step follows the ReAct (reason and act) pattern.
#### Notice that for the tool we are manually writing out the tool name, description, arguments and output so that the agent can invoke it.
#### If there are many tools, this becomes difficult to maintain.
#### Using Python's introspection functions (the `inspect` module) we can extract all of these details from the function itself.

In [None]:
# function to extract tool_name,description,arguments,output_type from the given function(tool)

import inspect

def extract_tool_info(func):
    # Get the function signature
    signature = inspect.signature(func)

    # Extract (param_name, param_annotation) pairs for inputs
    arguments = []
    for param in signature.parameters.values():
        annotation_name = (
            param.annotation.__name__
            if hasattr(param.annotation, '__name__')
            else str(param.annotation)
        )
        arguments.append((param.name, annotation_name))

    # To find the output type
    return_annotation = signature.return_annotation
    if return_annotation is inspect.Signature.empty:
        outputs = "No return annotation"
    else:
        outputs = (
            return_annotation.__name__
            if hasattr(return_annotation, '__name__')
            else str(return_annotation)
        )

    # extract the description
    description = func.__doc__ or "No description provided."

    # extract tool name ...here function name becomes tool name
    name = func.__name__

    return f"Tool Name: {name}, Description: {description}, Arguments: {arguments}, Output: {outputs}"


In [None]:

def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

print(extract_tool_info(calculator))

Tool Name: calculator, Description: Multiply two integers., Arguments: [('a', 'int'), ('b', 'int')], Output: int
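
One edge case worth knowing: when a parameter has no type annotation, `inspect` reports the sentinel `inspect.Parameter.empty`, whose `__name__` is the internal string `_empty`. The sketch below shows one possible way to handle that. `annotation_name` and `greet` are hypothetical names, and reporting `"Any"` for a missing annotation is an assumption, not something the notebook prescribes.

```python
import inspect

def annotation_name(annotation) -> str:
    """Return a readable name for a type annotation (hypothetical helper)."""
    # inspect uses a sentinel class for "no annotation"; without this check
    # its internal name "_empty" would end up in the tool description.
    if annotation is inspect.Parameter.empty:
        return "Any"
    return getattr(annotation, "__name__", str(annotation))

# Example function with one unannotated and one annotated parameter
def greet(name, times: int = 1):
    return "hello " * times + name

params = inspect.signature(greet).parameters
described = [(p.name, annotation_name(p.annotation)) for p in params.values()]
print(described)  # [('name', 'Any'), ('times', 'int')]
```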


In [None]:
# With the function above we have to pass each tool to extract_tool_info manually.
# Instead, we use a decorator and wrap the tool in it.
# Inside the decorator we collect all the tool info and pass it to an instance of the Tool class, whose __call__ method invokes the wrapped function (the tool).

from typing import Callable


class Tool:
    """
    A class representing a reusable piece of code (Tool).

    Attributes:
        name (str): Name of the tool.
        description (str): A textual description of what the tool does.
        func (callable): The function this tool wraps.
        arguments (list): A list of (argument name, argument type) pairs.
        outputs (str): The return type of the wrapped function.
    """
    def __init__(self,
                 name: str,
                 description: str,
                 func: Callable,
                 arguments: list,
                 outputs: str):
        self.name = name
        self.description = description
        self.func = func
        self.arguments = arguments
        self.outputs = outputs

    def to_string(self) -> str:
        """
        Return a string representation of the tool,
        including its name, description, arguments, and outputs.
        """
        args_str = ", ".join([
            f"{arg_name}: {arg_type}" for arg_name, arg_type in self.arguments
        ])

        return (
            f"Tool Name: {self.name},"
            f" Description: {self.description},"
            f" Arguments: {args_str},"
            f" Outputs: {self.outputs}"
        )

    def __call__(self, *args, **kwargs):
        """
        Invoke the underlying function (callable) with provided arguments.
        """
        return self.func(*args, **kwargs)

In [None]:
calculator_tool = Tool(
    "calculator",                   # name
    "Multiply two integers.",       # description
    calculator,                     # function to call
    [("a", "int"), ("b", "int")],   # inputs (names and types)
    "int",                          # output
)

In [None]:
# decorator code
import inspect

def tool(func):
    """
    A decorator that creates a Tool instance from the given function.
    """
    # Get the function signature
    signature = inspect.signature(func)

    # Extract (param_name, param_annotation) pairs for inputs
    arguments = []
    for param in signature.parameters.values():
        annotation_name = (
            param.annotation.__name__
            if hasattr(param.annotation, '__name__')
            else str(param.annotation)
        )
        arguments.append((param.name, annotation_name))

    # Determine the return annotation
    return_annotation = signature.return_annotation
    if return_annotation is inspect.Signature.empty:
        outputs = "No return annotation"
    else:
        outputs = (
            return_annotation.__name__
            if hasattr(return_annotation, '__name__')
            else str(return_annotation)
        )

    # Use the function's docstring as the description (default if None)
    description = func.__doc__ or "No description provided."

    # The function name becomes the Tool name
    name = func.__name__

    # Return a new Tool instance
    return Tool(
        name=name,
        description=description,
        func=func,
        arguments=arguments,
        outputs=outputs
    )

In [None]:
@tool
def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

print(calculator.to_string())

Tool Name: calculator, Description: Multiply two integers., Arguments: a: int, b: int, Outputs: int
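
With the decorator in place, the tools section of the system prompt no longer has to be typed by hand when there are many tools. The sketch below is self-contained, so it uses a compact stand-in (`MiniTool`, `mini_tool`) rather than the `Tool` class and `tool` decorator above. Those two names and the second tool `add` are assumptions made for illustration.

```python
import inspect
from dataclasses import dataclass
from typing import Callable

@dataclass
class MiniTool:
    """Compact stand-in for the Tool class above (illustration only)."""
    name: str
    description: str
    func: Callable
    arguments: list
    outputs: str

    def to_string(self) -> str:
        args = ", ".join(f"{n}: {t}" for n, t in self.arguments)
        return (f"Tool Name: {self.name}, Description: {self.description}, "
                f"Arguments: {args}, Outputs: {self.outputs}")

    def __call__(self, *args, **kwargs):
        # Forward the call to the wrapped function
        return self.func(*args, **kwargs)

def mini_tool(func):
    """Build a MiniTool from a function's signature and docstring."""
    sig = inspect.signature(func)
    arguments = [(p.name, getattr(p.annotation, "__name__", str(p.annotation)))
                 for p in sig.parameters.values()]
    outputs = getattr(sig.return_annotation, "__name__", str(sig.return_annotation))
    return MiniTool(func.__name__, func.__doc__ or "No description provided.",
                    func, arguments, outputs)

@mini_tool
def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

@mini_tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

# One line per tool, ready to paste into the system prompt
tools_section = "\n".join(t.to_string() for t in (calculator, add))
print(tools_section)
print(calculator(6, 7))  # the decorated object is still callable
```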


# **Action**
#### We are done with the Thought part, where the agent understands the user query, reasons and plans.
#### In the system prompt above we saw how the agent is given information about the tool.
#### Once the agent knows which tool to use (with the details extracted through Python introspection), we want it to invoke the tool.
#### The agent can't invoke tools directly, but it can generate text that is then used to invoke them. The tools themselves are invoked by the Python controller.
#### Generating the text that invokes a tool is called an Action.

### For example :

#### User gives input query:

#### What is 6 times 7?

#### Agent Reasoning Output

```
# The agent replies with:

Thought: I need to multiply two numbers.
Action:
{"tool": "calculator", "tool_input": {"a": 6, "b": 7}}
```


#### The agent is not literally calling a Python function. It is saying:

#### “Hey, someone please run the calculator tool with a=6, b=7.”

#### That is exactly like an API request:

#### The agent writes the request.

#### You (as the developer) or the framework play the role of the API server.

#### The server reads the request, runs the logic, and sends back a response. Here, the Python controller executes it.

#### The Python controller takes the JSON string (text) produced by the agent and parses it, converting the JSON string into a Python dictionary.




```
# code:
import json

json_text = '{"tool": "calculator", "tool_input": {"a": 6, "b": 7}}'

# Parse it into a Python dict
parsed = json.loads(json_text)
print(parsed)
# Output: {'tool': 'calculator', 'tool_input': {'a': 6, 'b': 7}}

# Now we can use parsed['tool'] to get the name of the tool

# Minimal registry for illustration (ToolRegistry is not defined elsewhere in this notebook)
class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def add_tool(self, tool):
        self._tools[tool.name] = tool  # Tool instances carry their own name

    def get_tool(self, name):
        return self._tools[name]

# All the tools live in the tool registry
tools = ToolRegistry()
tools.add_tool(calculator)  # the @tool-decorated calculator defined above

# The Python controller looks up the tool in the registry and calls it
tool_name = parsed["tool"]
tool_input = parsed["tool_input"]

# Look up the tool from the registry
tool_fn = tools.get_tool(tool_name)  # Returns the `calculator` Tool

# Call the tool
result = tool_fn(**tool_input)  # result = calculator(a=6, b=7) => 42
```




# **Observation**  
#### Now the Python controller passes the output of the tool back to the agent as an observation:
#### Observation: 42

#### Now the agent continues reasoning:


```
Thought: Now I know the result of 6 * 7.
Final Answer: The result is 42.

```



#### 🧠 So to be super clear:

#### The agent never calls a Python function. It only outputs JSON saying “I want to use tool X with arguments Y”.

#### Your Python controller must:

#### Parse that JSON.

#### Look up the right tool.

#### Call it.

#### Return the result as an Observation.
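
The four controller steps above can be sketched as a single function that also turns failures into observations, so the agent gets a chance to correct itself instead of the program crashing. `dispatch_action`, the plain-dict `TOOLS` registry and the wording of the error messages are all assumptions made for this illustration.

```python
import json

def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# Hypothetical registry: a plain dict from tool name to callable.
TOOLS = {"calculator": calculator}

def dispatch_action(json_text: str, tools: dict) -> str:
    """Parse an action, run the tool, and return an Observation string."""
    try:
        action = json.loads(json_text)            # 1. parse the JSON
        tool_fn = tools[action["tool"]]           # 2. look up the tool
        result = tool_fn(**action["tool_input"])  # 3. call it
    except json.JSONDecodeError as exc:
        return f"Observation: Error: invalid JSON ({exc.msg})"
    except KeyError as exc:
        return f"Observation: Error: unknown tool or missing key {exc}"
    except TypeError as exc:
        return f"Observation: Error: bad arguments ({exc})"
    return f"Observation: {result}"               # 4. report the result

ok = dispatch_action('{"tool": "calculator", "tool_input": {"a": 6, "b": 7}}', TOOLS)
bad = dispatch_action('{"tool": "search", "tool_input": {}}', TOOLS)
print(ok)
print(bad)
```

Returning the error as an observation is a design choice: the text goes back into the model's context, and the next Thought can react to it.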

# Flow of the agent

In [None]:
"""

+-------------------+
|   User Input      |
|  "6 times 7?"     |
+---------+---------+
          |
          v
+-------------------------------+
|       Agent Model             |
| - Receives system prompt      |
| - Thinks using tools          |
+-------------------------------+
          |
          | Text output:
          v
   Thought: I need to multiply...
   Action: {"tool": "calculator", "tool_input": {"a": 6, "b": 7}}

          |
          v
+-------------------------------+
|     Python Controller         |
| - Parses JSON                 |
| - Finds tool                  |
| - Calls calculator(6, 7)      |
| - Gets result = 42            |
+-------------------------------+
          |
          v
Text injected back to agent:
Observation: 42

Then agent may say:
Final Answer: 42


"""

| Step              | What Happens                                | Who Does It       |
| ----------------- | ------------------------------------------- | ----------------- |
| Tool Definition   | You define tool functions and decorate them | Developer (you)   |
| Tool Registration | Tools are added to registry                 | Framework / You   |
| Prompt Generation | Tool info is added to system message        | Framework         |
| Reasoning         | Agent generates `Thought` and `Action`      | Agent model       |
| Tool Execution    | JSON parsed, tool invoked                   | Python controller |
| Result Injected   | `Observation` added back to agent           | Python controller |
| Final Answer      | Agent produces answer using tool result     | Agent model       |


<br>

---
<br>
<br>


#### What is a Python controller?
#### This is very important.
#### A Python controller is the logic in your program that manages the whole agent loop.
#### It does things like:

*   Sends the user’s input to the agent model.
*   Gets back the agent’s response.
*   If the agent outputs `Action: {...}`, it:
    *   Parses the JSON
    *   Finds the tool (function)
    *   Calls the tool
    *   Gets the result
    *   Sends it back as `Observation: ...`
*   Repeats until the agent gives a Final Answer.

In [None]:
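# Pseudo-controller for demonstration purposes.
# `generate`, `registry` and `extract_json` are placeholders: pass in your own
# LLM call, tool registry and JSON extractor.
import json

def agent_loop(user_input, generate, registry, extract_json, max_steps=5):
    context = [user_input]  # Chat history, starting with the user's question

    for _ in range(max_steps):  # Cap the number of cycles to avoid an endless loop
        # 1. Send the context to the agent
        response = generate(context)

        # The reply normally starts with "Thought:", so search the whole text
        if "Final Answer:" in response:
            print(response)
            return response

        elif "Action:" in response:
            # 2. Extract the JSON that follows "Action:"
            json_block = extract_json(response)
            action = json.loads(json_block)

            tool_name = action["tool"]
            tool_args = action["tool_input"]

            # 3. Call the tool
            tool_fn = registry.get_tool(tool_name)
            result = tool_fn(**tool_args)

            # 4. Add the tool result as an Observation
            observation = f"Observation: {result}"
            context.append(response)  # Agent's thought and action
            context.append(observation)

        else:
            # Unexpected response
            print("Unexpected agent output:", response)
            break

    return None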
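
To watch the loop run end to end without loading a real model, the LLM can be replaced by a scripted stand-in. Everything in this sketch is an assumption made for illustration: `fake_generate`, `SCRIPT`, `run_loop`, the dict-based `TOOLS` registry, and the simplified reply format in which the JSON follows directly after `Action:`.

```python
import json

def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

TOOLS = {"calculator": calculator}

# Scripted stand-in for an LLM: returns canned replies in order.
SCRIPT = [
    'Thought: I need to multiply two numbers.\n'
    'Action: {"tool": "calculator", "tool_input": {"a": 6, "b": 7}}',
    'Thought: Now I know the result of 6 * 7.\n'
    'Final Answer: The result is 42.',
]

def fake_generate(context: list) -> str:
    # Each completed tool call adds two entries (agent reply, observation),
    # so the context length tells us which scripted reply comes next.
    step = (len(context) - 1) // 2
    return SCRIPT[step]

def run_loop(question: str, max_steps: int = 5) -> str:
    context = [question]
    for _ in range(max_steps):
        reply = fake_generate(context)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        # Everything after "Action:" is the JSON request in this toy format.
        action = json.loads(reply.split("Action:", 1)[1])
        result = TOOLS[action["tool"]](**action["tool_input"])
        context += [reply, f"Observation: {result}"]
    raise RuntimeError("No final answer within max_steps")

answer = run_loop("What is 6 times 7?")
print(answer)  # The result is 42.
```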


# **Complete agent code from scratch**
####  🔧 What We're Building
#### ✅ A calculator(a, b) tool (you already have it)
#### ✅ A full agent prompt (LLM system prompt)
#### ✅ LLM call to generate Thought + Action (we stop before Observation:)
#### ✅ Controller executes tool call
#### ✅ Add the real observation
#### ✅ Continue LLM generation to get final answer



In [None]:
###########################    1. Tool Setup
# in tools.py
from typing import Callable
import inspect

class Tool:
    def __init__(self, name: str, description: str, func: Callable, arguments: list, outputs: str):
        self.name = name
        self.description = description
        self.func = func
        self.arguments = arguments
        self.outputs = outputs

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def tool(func):
    signature = inspect.signature(func)
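    arguments = [
        (param.name, getattr(param.annotation, "__name__", str(param.annotation)))
        for param in signature.parameters.values()
    ]
    # An unannotated return type is marked with the inspect.Signature.empty sentinel
    return_annotation = signature.return_annotation
    if return_annotation is inspect.Signature.empty:
        return_type = "Any"
    else:
        return_type = getattr(return_annotation, "__name__", str(return_annotation))
    return Tool(name=func.__name__, description=func.__doc__, func=func, arguments=arguments, outputs=return_type)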


@tool
def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


In [None]:
################################## 🧠 2. System Prompt (LLM Instruction)

# in prompts.py
SYSTEM_PROMPT = """Answer the following questions as best you can. You have access to the following tools:

calculator: Multiply two integers. args: {"a": {"type": "int"}, "b": {"type": "int"}}

The way you use the tools is by specifying a json blob.
Specifically, this json should have an `action` key (with the name of the tool to use) and an `action_input` key (with the tool's inputs).

Use ONLY the following format:

Question: the input question you must answer
Thought: you should always think about one action to take
Action:

```json
{
  "action": "calculator",
  "action_input": { "a": 6, "b": 7 }
}
```

Observation: the result of the action. This Observation is unique, complete, and the source of truth.

... (repeat Thought/Action/Observation as needed)

You must always end your output with the following format:

Thought: I now know the final answer
Final Answer: the final answer to the original input question

Now begin! Reminder to ALWAYS use the exact characters Final Answer: when you provide a definitive answer."""





In [None]:
#################################### 🤖 3. LLM Generation
#  in agent_llm.py

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from prompts import SYSTEM_PROMPT

# LLaMA-style tokenizer (use another if you prefer)
MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

def build_prompt(user_question: str):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]
    return tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)


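def get_llm_action(prompt):
    response = pipe(prompt, max_new_tokens=200, return_full_text=False)[0]["generated_text"]
    # Keep only the text before the first "Observation:" so that the controller,
    # not the model, supplies the real observation.
    return response.split("Observation:", 1)[0]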



In [None]:
##################################### 🕹️ 4. Controller to Run Tool
# in controller.py

import json
import re
from tools import calculator
from agent_llm import build_prompt, get_llm_action

TOOLS = {
    "calculator": calculator,
}

def extract_json_block(text):
    # Get JSON inside ```json block
    match = re.search(r"```json\s*({.*?})\s*```", text, re.DOTALL)
    if match:
        return json.loads(match.group(1))
    raise ValueError("No JSON action block found")


def run_agent(question):
    # STEP 1: Create full prompt
    base_prompt = build_prompt(question)

    # STEP 2: Ask model what to do (up to Action)
    print("\n>>> LLM Generating Thought + Action ...")
    response = get_llm_action(base_prompt)
    print(response)

    # STEP 3: Parse JSON and run tool
    parsed = extract_json_block(response)
    tool_name = parsed["action"]
    tool_input = parsed["action_input"]
    result = TOOLS[tool_name](**tool_input)

    print(f"\n>>> Tool Output (Observation): {result}")

    # STEP 4: Build new prompt with Observation and continue
    final_prompt = base_prompt + response + f"\nObservation: {result}\n"
    final_response = get_llm_action(final_prompt)

    print("\n>>> Final Output:")
    print(final_response)
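
The regular expression in `extract_json_block` only matches when the model closes its json fence. If generation is cut off before the closing fence, the function raises. A more tolerant alternative is to decode the first JSON object found in the text with the standard library's `json.JSONDecoder.raw_decode`. `extract_first_json` is a hypothetical helper name, and this is a sketch of one possible approach rather than the notebook's own method.

```python
import json

def extract_first_json(text: str) -> dict:
    """Decode the first JSON object found in text (hypothetical helper)."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    raise ValueError("No JSON action block found")

# A reply whose json fence was never closed
reply = (
    'Thought: I need to multiply 6 and 7.\n'
    'Action:\n'
    '```json\n'
    '{"action": "calculator", "action_input": {"a": 6, "b": 7}}\n'
)
parsed = extract_first_json(reply)
print(parsed["action"], parsed["action_input"])
```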


In [None]:
####################################### 🚀 5. Run the System
# in main.py
from controller import run_agent

if __name__ == "__main__":
    run_agent("What is 6 times 7?")

In [None]:
# ✅ Final Output Should Look Like:

# >>> LLM Generating Thought + Action ...
#
# Thought: I need to multiply 6 and 7 to get the answer.
# Action:
# ```json
# {
#   "action": "calculator",
#   "action_input": {
#     "a": 6,
#     "b": 7
#   }
# }
# ```
#
# >>> Tool Output (Observation): 42
#
# >>> Final Output:
# Thought: I now know the final answer.
# Final Answer: 42


---

## ✅ Summary

 | Component         | Purpose |
 |------------------|---------|
 | `tools.py`       | Tool definition (`calculator`) |
 | `prompts.py`     | LLM system prompt |
 | `agent_llm.py`   | Prompt builder + first step of LLM generation |
 | `controller.py`  | Executes reasoning cycle, extracts tool calls, and handles real observations |
 | `main.py`        | Entry point |


---

