- Install dependencies and export the Gemini API key
- Get a Gemini API key from [Google AI Studio](https://aistudio.google.com/app/api-keys)

In [None]:
!uv pip install -q -U google-genai

# %env sets the variable inside the kernel process; replace the placeholder with your key
%env GEMINI_API_KEY=<Gemini-API-KEY>

# 1. Introduction

## 1.1 LLM Inference
![LLM-Inference](./images/1.1-LLM-Inference.png)

In [31]:
from google import genai
from google.genai import types

client = genai.Client()
config = types.GenerateContentConfig(
    system_instruction="You are a helpful, knowledgeable assistant that can answer any question. Answer any question the user asks in 1-2 sentences."
)

def respond(history, config=config):
    """Send the conversation to the model and return the full response object."""
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=history,
        config=config
    )
    return response


respond("What are the must-see places in Pokhara?").text

"Some must-see places in Pokhara include Phewa Lake, where you can enjoy boating and lakeside views, and the World Peace Pagoda, offering panoramic views of the Himalayas and the city. Don't miss Devi's Fall, a unique waterfall, and the serene Begnas Lake for a peaceful retreat.\n"

## 1.2 The Loop (Birth of Chatbot)
![The-Loop](./images/1.2-The-Loop.png)


In [None]:
while True:
    query = input()
    print("User:", query)
    
    if not query or query == "exit":
        break

    print("AI:", respond(query).text)

User: What are the must-see places in Pokhara?
AI: Some must-see places in Pokhara include Phewa Lake, World Peace Pagoda, and Sarangkot. You should also consider visiting Devi's Fall, Gupteshwor Cave, and the International Mountain Museum.

User: I prefer peaceful spots - can you adjust the plan?
AI: I can definitely adjust the plan to prioritize peaceful spots. To do this best, could you tell me what kind of activities you enjoy in peaceful settings and where you're planning to go?



- Each call sees only the current query, so the model cannot revise its previous answer. Let's give it memory by passing the conversation history as context.

## 1.3 Conversation History (Memory Emerges)
![](images/1.3-History.png)


In [None]:
history = []
    
while True:
    query = input()
    print("User: ", query)
    
    if query.lower() in ['exit', 'quit', '']:
        break
    
    # Add user query to history
    history.append({"role": "user", "parts": [{"text": query}]})
    
    # Get response
    response = respond(history).text
    print(f"LLM: {response}\n")
    
    # Add model response to history
    history.append({"role": "model", "parts": [{"text": response}]})


User:  What are the must-see places in Pokhara?
LLM: Some must-see places in Pokhara include Phewa Lake, the World Peace Pagoda, and the Begnas Lake. You might also enjoy Devi's Fall and the Gupteshwor Cave.


User:  I prefer peaceful spots — can you adjust the plan?
LLM: For peaceful spots in Pokhara, consider visiting the World Peace Pagoda for panoramic views and serenity, or head to the less crowded Begnas Lake for a tranquil boating experience. You might also enjoy exploring the serene monasteries and temples around the valley.


User:  What is the weather in Pokhara right now?
LLM: Unfortunately, I do not have real-time access to weather information. You can check a reliable weather app or website for the current weather conditions in Pokhara.


User:  


- As we can see, the model has no access to current weather information.

## 1.4 Tools (Agency Appears)
- Let's write a function that takes `place_name` and returns random (dummy) weather data.

![](images/1.4-Tool-Use.png)


In [23]:
# Imports

from google import genai
from google.genai import types
import random

client = genai.Client()

In [24]:
# 1. Tool Implementation
def get_weather(place_name):
    """Return dummy random weather for the given place_name."""
    
    print(f"Checking weather for '{place_name}'...")

    conditions_list = ["Sunny", "Cloudy", "Rainy", "Thunderstorm", "Snowy", "Windy", "Foggy"]
    
    temp_min = random.randint(10, 25)       # minimum temperature
    temp_max = random.randint(temp_min, temp_min + 15)  # max is at least min
    
    return {
        "location": place_name,
        "temp_max": str(temp_max),
        "temp_min": str(temp_min),
        "conditions": random.choice(conditions_list)
    }

get_weather("Pokhara")

Checking weather for 'Pokhara'...


{'location': 'Pokhara',
 'temp_max': '31',
 'temp_min': '21',
 'conditions': 'Thunderstorm'}

- Let's give this function to our LLM as a tool.
- Refer to the [gemini-api-docs](https://ai.google.dev/gemini-api/docs/function-calling) for more information on function calling.

- The code below is a modified version of the template given on the [gemini-api-docs](https://ai.google.dev/gemini-api/docs/function-calling) page.

In [25]:
# 2. Tool definitions
weather_function = {
    "name": "get_weather",
    "description": "Gets weather forecast for a specified location",
    "parameters": {
        "type": "object",
        "properties": {
            "place_name": {
                "type": "string",
                "description": "Name of the city or location"
            }
        },
        "required": ["place_name"]
    }
}

tools = types.Tool(function_declarations=[weather_function])
config = types.GenerateContentConfig(
    tools=[tools],
    system_instruction="You are a general-purpose helpful assistant. Answer any question the user asks in 1-2 short sentences. You have a weather tool available - use it ONLY when users specifically ask about weather/temperature/forecast. For all other questions, answer directly without using tools."
)
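
The declaration above repeats information that already lives in the function's signature and docstring. As a rough sketch, a small helper could derive the same dict with `inspect`. The names `declaration_from_function` and `get_weather_stub`, and the type mapping, are illustrative assumptions and not part of the google-genai SDK.

```python
import inspect

# Assumed mapping from Python annotations to the schema type names used above.
_TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def declaration_from_function(func, param_descriptions=None):
    """Build a function-declaration dict from a signature and docstring.

    Hypothetical helper: parameters without a known annotation become "string".
    """
    param_descriptions = param_descriptions or {}
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            "type": _TYPE_NAMES.get(param.annotation, "string"),
            "description": param_descriptions.get(name, name),
        }
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or func.__name__,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_weather_stub(place_name: str):
    """Gets weather forecast for a specified location"""
    return {"location": place_name}


declaration = declaration_from_function(
    get_weather_stub, {"place_name": "Name of the city or location"}
)
print(declaration)
```

The resulting dict has the same shape as `weather_function`, so it could be passed to `types.Tool(function_declarations=[...])` in the same way.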

- Now let's add another function that generates a response with tool use.

In [29]:
# 3. Giving tool to LLM
def respond(history, config):
    # print(f"\n\n history:{history}")
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=history,
        config=config
    )
    
    candidate = response.candidates[0]
    if not candidate.content or not candidate.content.parts:
        return "I couldn't generate a response."
    
    first_part = candidate.content.parts[0]
    
    # Check if it's a function call
    if hasattr(first_part, 'function_call') and first_part.function_call:
        function_call = first_part.function_call
        
        if function_call.name == "get_weather":
            result = get_weather(**function_call.args)
            
            # Add function call results to history
            history.append({"role": "model", "parts": [{"function_call": function_call}]})
            history.append({"role": "user", "parts": [{"function_response": {
                "name": "get_weather",
                "response": result
            }}]})
            
            # Re-invoke the LLM with the tool result
            # print(f"\n\n history:{history}")
            final_response = client.models.generate_content(
                model="gemini-2.5-flash",
                contents=history,
                config=config
            )
            
            if final_response.candidates and final_response.candidates[0].content.parts:
                return final_response.text
            return "I couldn't generate a response with the weather data."
    
    # Check if it's text
    if hasattr(first_part, 'text'):
        return response.text
    
    return "I couldn't understand the response."
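
`respond` inspects only `parts[0]`. If a reply were to carry a text part ahead of the function call (an assumption about the response shape, not something the outputs here show), scanning every part would still find the call. A minimal sketch, using stand-in objects rather than real SDK types:

```python
from types import SimpleNamespace


def first_function_call(parts):
    """Return the first function_call found in parts, or None if there is none."""
    for part in parts:
        call = getattr(part, "function_call", None)
        if call:
            return call
    return None


# Stand-ins that mimic only the attributes respond() reads; not SDK classes.
text_part = SimpleNamespace(text="Let me check.", function_call=None)
call_part = SimpleNamespace(
    text=None,
    function_call=SimpleNamespace(name="get_weather", args={"place_name": "Pokhara"}),
)

found = first_function_call([text_part, call_part])
print(found.name, found.args)
```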

In [30]:
# 4. The Loop
history = []

while True:
    query = input()
    print("User: ", query)
    
    if query.lower() in ['exit', 'quit', '']:
        break
    
    # Add user query to history
    history.append({"role": "user", "parts": [{"text": query}]})
    
    # Get response
    response = respond(history, config)
    print(f"LLM: {response}\n")
    
    # Add model response to history
    history.append({"role": "model", "parts": [{"text": response}]})


User:  What are the must-see places in Pokhara?
LLM: Pokhara is known for its stunning natural beauty. Some must-see places include Phewa Lake, Sarangkot, Davis Falls, and the World Peace Pagoda.

User:  I prefer peaceful spots — can you adjust the plan?
LLM: For a more peaceful experience in Pokhara, consider spending more time around Phewa Lake, visiting the World Peace Pagoda for serene views, and exploring the less crowded areas of Begnas Lake. You might also enjoy a quiet boat ride on Phewa Lake early in the morning.

User:  What is the weather in Pokhara right now?
Checking weather for 'Pokhara'...
LLM: The weather in Pokhara right now is rainy, with temperatures ranging from 14 to 15 degrees Celsius.

User:  


- **Exercise:** Replace the `get_weather` function with the code below and see whether it returns the actual weather.

```
import requests

def get_weather(place_name):
    """Get weather for given place_name."""
    
    print("Checking weather...")

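    # wttr.in returns JSON when called with format=j1
    response = requests.get(f"https://wttr.in/{place_name}?format=j1", timeout=10)
    data = response.json()
    tomorrow = data["weather"][1]  # index 0 is today, index 1 is tomorrow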
    
    return {
        "location": place_name,
        "temp_max": tomorrow["maxtempC"],
        "temp_min": tomorrow["mintempC"],
        "conditions": tomorrow["hourly"][0]["weatherDesc"][0]["value"]
    }
```

# 2. Coding Agent: A Minimal Claude-like CLI

- In this section, we will:
  1. Implement tools to read, write, and list files
  2. Define the tools for Gemini (as described on the [gemini-api-docs](https://ai.google.dev/gemini-api/docs/function-calling) page)
  3. Write a `respond` function that can execute tool calls (similar to the last section)
  4. Put it in a loop


In [85]:
# Imports
from google import genai
from google.genai import types
import os

In [86]:
## 2.1 Tool implementations
def read_file(path, offset=0, limit=None):
    """Read file with line numbers."""
    
    print(f"Reading file: {path} (offset={offset}, limit={limit})")

    
    with open(path, 'r') as f:
        lines = f.readlines()
    
    if limit is None:
        limit = len(lines)
    
    selected = lines[offset : offset + limit]
    return "".join(f"{offset + idx + 1:4}| {line}" for idx, line in enumerate(selected))


def write_file(path, content):
    """Write content to file."""
    with open(path, 'w') as f:
        f.write(content)
    return "ok"


def list_directory(path):
    """List files and directories."""
    items = os.listdir(path)
    result = []
    for item in sorted(items):
        full_path = os.path.join(path, item)
        if os.path.isdir(full_path):
            result.append(f"[DIR]  {item}")
        else:
            result.append(f"[FILE] {item}")
    return "\n".join(result)
 

In [87]:
## 2.2 Tool definitions
read_function = {
    "name": "read_file",
    "description": "Read file with line numbers (file path, not directory)",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File path to read"},
            # Integers, because read_file uses these values as slice indices
            "offset": {"type": "integer", "description": "Starting line number (0-indexed)"},
            "limit": {"type": "integer", "description": "Number of lines to read"}
        },
        "required": ["path"]
    }
}

write_function = {
    "name": "write_file",
    "description": "Write content to file",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File path to write"},
            "content": {"type": "string", "description": "Content to write"}
        },
        "required": ["path", "content"]
    }
}

list_function = {
    "name": "list_directory",
    "description": "List files and directories at path",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Directory path to list"}
        },
        "required": ["path"]
    }
}

tools = types.Tool(function_declarations=[read_function, write_function, list_function])
config = types.GenerateContentConfig(
    tools=[tools],
    system_instruction=f"Concise coding assistant. cwd: {os.getcwd()}"
)

In [None]:
## 2.3 Respond function (that can call the tool)
def respond(history, config):
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=history,
        config=config
    )
    
    if not response.candidates:
        return "I couldn't generate a response."
    
    candidate = response.candidates[0]
    if not candidate.content or not candidate.content.parts:
        return "I couldn't generate a response."
    
    first_part = candidate.content.parts[0]
    
    # Check if it's a function call
    if hasattr(first_part, 'function_call') and first_part.function_call:
        function_call = first_part.function_call
        
        # Execute the appropriate function
        try:
            if function_call.name == "read_file":
                result = read_file(**function_call.args)
            elif function_call.name == "write_file":
                result = write_file(**function_call.args)
            elif function_call.name == "list_directory":
                result = list_directory(**function_call.args)
            else:
                result = f"Unknown function: {function_call.name}"
        except Exception as err:
            result = f"error: {err}"
        
        # Add function call and result to history
        history.append({"role": "model", "parts": [{"function_call": function_call}]})
        history.append({"role": "user", "parts": [{"function_response": {
            "name": function_call.name,
            "response": {"result": result}
        }}]})
        
        # Get final response with the tool result
        final_response = client.models.generate_content(
            model="gemini-2.5-flash",
            contents=history,
            config=config
        )
        
        if final_response.candidates and final_response.candidates[0].content.parts:
            return final_response.text
        return "I couldn't generate a response with the tool result."
    
    # Check if it's text
    if hasattr(first_part, 'text'):
        return response.text
    
    return "I couldn't understand the response."
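
As more tools are added, the `if`/`elif` chain in `respond` grows with them. One common alternative, sketched here as an assumption rather than the notebook's approach, is a registry dict that maps tool names to callables. `dispatch_tool` and `demo_registry` are hypothetical names.

```python
def dispatch_tool(name, args, registry):
    """Look up a tool by name and call it; return its result or an error string."""
    func = registry.get(name)
    if func is None:
        return f"Unknown function: {name}"
    try:
        return func(**args)
    except Exception as err:
        return f"error: {err}"


# Stand-in tools for illustration. In this notebook the registry would be
# {"read_file": read_file, "write_file": write_file, "list_directory": list_directory}.
demo_registry = {
    "echo": lambda text: text,
    "divide": lambda a, b: a / b,
}

ok_result = dispatch_tool("echo", {"text": "hi"}, demo_registry)
error_result = dispatch_tool("divide", {"a": 1, "b": 0}, demo_registry)
unknown_result = dispatch_tool("missing", {}, demo_registry)
print(ok_result, error_result, unknown_result, sep="\n")
```

With this shape, adding a tool means adding one registry entry and one declaration; the dispatch code stays unchanged.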

In [None]:
## 2.4 The Loop
history = []
    
while True:
    query = input()
    print(f"User: {query}")
    if query.lower() in ['exit', 'quit', '']:
        break
    
    # Add user query to history
    history.append({"role": "user", "parts": [{"text": query}]})
    
    # Get response
    response = respond(history, config)
    print(f"LLM: {response}\n")
    
    # Add model response to history
    history.append({"role": "model", "parts": [{"text": response}]})


User: create file add.py with function to add two numbers
LLM: File add.py created with function to add two numbers.

User: create file substract.py with function to substract two numbers
LLM: File substract.py created with function to substract two numbers.

User: create file math.py that imports these two functions and perform demo calculation
LLM: File math.py created. It imports `add` and `substract` functions and performs demo calculations, printing the results.

User: 


- Let's see what it has created.

In [97]:
!cat add.py

def add(a, b):
    return a + b

In [98]:
!cat substract.py

def substract(a, b):
    return a - b

In [99]:
!cat math.py

from add import add
from substract import substract

result_add = add(5, 3)
print(f"5 + 3 = {result_add}")

result_substract = substract(10, 4)
print(f"10 - 4 = {result_substract}")

In [101]:
# Executing math.py
!python3 math.py

5 + 3 = 8
10 - 4 = 6


- Great, we have our own coding agent.
- We can improve it further with prompt engineering and by adding more tools.
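
As one example of "more tools", an edit tool lets the model change part of a file without rewriting all of it. The sketch below is an assumption about what such a tool could look like. `edit_file` and `edit_function` are hypothetical names that follow the same pattern as `write_file` and `write_function`.

```python
import os
import tempfile


def edit_file(path, old, new):
    """Replace exactly one occurrence of `old` with `new` in the file at `path`."""
    with open(path, "r") as f:
        content = f.read()
    count = content.count(old)
    # Refuse ambiguous edits so the model has to supply a unique snippet.
    if count != 1:
        return f"error: expected exactly 1 match for the old text, found {count}"
    with open(path, "w") as f:
        f.write(content.replace(old, new))
    return "ok"


# Declaration in the same dict format as the other tool definitions.
edit_function = {
    "name": "edit_file",
    "description": "Replace one exact occurrence of a string in a file",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File path to edit"},
            "old": {"type": "string", "description": "Exact text to replace"},
            "new": {"type": "string", "description": "Replacement text"},
        },
        "required": ["path", "old", "new"],
    },
}

# Demo on a temporary file so nothing in the working directory is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "add.py")
    with open(demo_path, "w") as f:
        f.write("def add(a, b):\n    return a + b\n")
    status = edit_file(demo_path, "a + b", "b + a")
    missing_status = edit_file(demo_path, "not present", "x")
    with open(demo_path) as f:
        edited = f.read()

print(status)
print(edited)
```

To wire it in, `edit_function` would be added to `function_declarations` and `respond` would need a matching branch for `"edit_file"`.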


## 3. Autonomous Personal Assistant: Minimal `openClaw` Implementation
- **Minimal openClaw Implementation - from Scratch**

![](images/3-Personal-Assistant.png)


In [None]:
# Imports
from google import genai
from google.genai import types
import os
import subprocess

### 3.1 SKILLS.md: Define What the Agent Can Do

In [None]:
%%writefile SKILLS.md
# SKILLS

## Message User in telegram

1. Get Telegram token and user_id
use run_bash_command to get TELEGRAM_BOT_TOKEN, TELEGRAM_USER_ID from .env file, use bash command: `cat .env` to read them from .env file

2. Sending message
send a post request to endpoint: "https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage" with payload like below
    
```
{
    'chat_id': TELEGRAM_USER_ID,
    'text': <message>
}
```

note: Replace TELEGRAM_BOT_TOKEN, TELEGRAM_USER_ID in above template with actual user token and ids from .env


## Checking new research papers on LLM
send a GET request to `export.arxiv.org` e.g.
https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending



## Tasks
Tasks file: TASKS.md
when user asks to update tasks: append to TASKS.md using bash command `echo "Task" >> TASKS.md`
to read from tasks: use: `cat TASKS.md`
Send result of task execution to telegram, without asking for user confirmation.

### 3.2 Tool: `run_bash_command`

In [None]:
# Tool implementation
def run_bash_command(command):
    """Execute a bash command and return output."""
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30
        )
        output = result.stdout if result.stdout else result.stderr
        return output if output else "Command executed successfully (no output)"
    except subprocess.TimeoutExpired:
        return "Error: Command timed out after 30 seconds"
    except Exception as err:
        return f"Error: {err}"

- Tool definition

In [None]:
# --- Tool definition ---
bash_function = {
    "name": "run_bash_command",
    "description": "Execute a bash command and return its output",
    "parameters": {
        "type": "object",
        "properties": {
            "command": {"type": "string", "description": "Bash command to execute"}
        },
        "required": ["command"]
    }
}

tools = types.Tool(function_declarations=[bash_function])

### 3.3 Respond Function with Iterations

In [None]:

client = genai.Client()

def respond(history, config):
    max_iterations = 10  # Prevent infinite loops
    
    for iteration in range(max_iterations):
        response = client.models.generate_content(
            model="gemini-2.5-pro",
            contents=history,
            config=config
        )
        
        if not response.candidates:
            return "I couldn't generate a response."
        
        candidate = response.candidates[0]
        if not candidate.content or not candidate.content.parts:
            return "I couldn't generate a response."
        
        first_part = candidate.content.parts[0]
        
        # Check if it's a function call
        if hasattr(first_part, 'function_call') and first_part.function_call:
            function_call = first_part.function_call
            
            # Execute the bash command
            try:
                result = run_bash_command(**function_call.args)
            except Exception as err:
                result = f"error: {err}"

            # Redact the Telegram credentials before logging the command.
            # Unset variables are skipped so a missing value cannot raise.
            command = function_call.args.get("command", "unknown")
            for secret, label in (
                (os.environ.get("TELEGRAM_BOT_TOKEN"), "<telegram-token>"),
                (os.environ.get("TELEGRAM_USER_ID"), "<telegram-user-id>"),
            ):
                if secret:
                    command = command.replace(secret, label)
            print(f"[Executing: {command}]")
            
            # Add function call and result to history
            history.append({"role": "model", "parts": [{"function_call": function_call}]})
            history.append({"role": "user", "parts": [{"function_response": {
                "name": function_call.name,
                "response": {"result": result}
            }}]})
            
            # Continue loop to allow more function calls
            continue
        
        # Check if it's text - this means we're done
        if hasattr(first_part, 'text'):
            return response.text
        
        return "I couldn't understand the response."
    
    return "Maximum iterations reached. The assistant may be stuck in a loop."
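
`respond` looks up `TELEGRAM_BOT_TOKEN` and `TELEGRAM_USER_ID` in `os.environ` to hide them from the log, while SKILLS.md keeps them in `.env`. For the redaction to have values to match, the kernel needs those variables loaded. A minimal sketch of one way to do that, assuming `.env` holds plain `KEY=VALUE` lines. `load_env_file` is a hypothetical helper, and the demo uses made-up keys so real credentials are not touched.

```python
import os
import tempfile


def load_env_file(path=".env"):
    """Copy KEY=VALUE lines from a .env file into os.environ; return the keys set."""
    loaded = []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blanks, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            key = key.strip()
            value = value.strip().strip("'\"")  # drop surrounding quotes, if any
            os.environ[key] = value
            loaded.append(key)
    return loaded


# Demo with a throwaway file and made-up keys.
with tempfile.TemporaryDirectory() as tmp:
    demo_env = os.path.join(tmp, ".env")
    with open(demo_env, "w") as f:
        f.write("# demo values only\nDEMO_BOT_TOKEN='abc123'\nDEMO_USER_ID=42\n")
    loaded = load_env_file(demo_env)

print(loaded)
```

In the notebook, calling `load_env_file()` once before the main loop would be enough.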


### 3.4 Configure the LLM

In [None]:
config = types.GenerateContentConfig(
    tools=[tools],
    system_instruction=f"You are a helpful assistant that can execute bash commands. cwd: {os.getcwd()}. Read SKILLS.md and TASKS.md using `cat SKILLS.md` and `cat TASKS.md` before responding anything. You are autonomous agent, **Execute tasks fully without asking for follow up questions or statements**. e.g. never reply with: \"I will now send these to you on Telegram...\""
)

### 3.5 Main Loop
- query: search arxiv for papers on llm and send them to me on telegram

In [None]:
history = []

print("OpenClaw Assistant - Type 'exit' or 'quit' to end\n")

while True:
    query = input()
    print(f"User: {query}")

    if query.lower() in ['exit', 'quit', '']:
        break
    
    # Add user query to history
    history.append({"role": "user", "parts": [{"text": query}]})
    
    # Get response
    response = respond(history, config)
    print(f"Assistant: {response}\n")
    
    # Add model response to history
    history.append({"role": "model", "parts": [{"text": response}]})

OpenClaw Assistant - Type 'exit' or 'quit' to end

User: find research papers on llm from arxiv and send them to me on telegram
[Executing: cat SKILLS.md]
[Executing: cat TASKS.md]
[Executing: curl "https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending"]
[Executing: TELEGRAM_BOT_TOKEN=$(grep TELEGRAM_BOT_TOKEN .env | cut -d '=' -f2)
TELEGRAM_USER_ID=$(grep TELEGRAM_USER_ID .env | cut -d '=' -f2)

MESSAGE="Here are the latest research papers on LLM from arxiv:

**1. AttentionRetriever: Attention Layers are Secretly Long Document Retrievers**
*Summary:* Retrieval augmented generation (RAG) has been widely adopted to help Large Language Models (LLMs) to process tasks involving long documents. However, existing retrieval models are not designed for long document retrieval and fail to address several key challenges of long document retrieval, including context-awareness, causal dependence, and scope of retrieval. In this pape

### 3.6 Heartbeat

#### 3.6.1 TASKS.md
- Let's write `TASKS.md` to define what the agent should do when it wakes up.

In [17]:
%%writefile TASKS.md
find research papers on llm from arxiv and send them to me on telegram

Overwriting TASKS.md


#### 3.6.2 Heartbeat Loop Example

In [None]:
import time
history = []

HEARTBEAT_INTERVAL = 86400 # Run once every 24 hours
while True:
    # query = input("User:")
    query = "Read TASKS.md, and execute the instructions"
    
    if query.lower() in ['exit', 'quit', '']:
        break
    
    # Add user query to history
    history.append({"role": "user", "parts": [{"text": query}]})
    
    # Get response
    response = respond(history, config)
    print(f"Assistant: {response}\n")
    
    # Add model response to history
    history.append({"role": "model", "parts": [{"text": response}]})

    # Note: the sleep below is the only addition to the previous loop
    print("sleeping...")
    time.sleep(HEARTBEAT_INTERVAL)

[Executing: cat SKILLS.md]
[Executing: cat TASKS.md]
[Executing: curl "https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending"]
[Executing: cat .env]
[Executing: 
curl -s -X POST "https://api.telegram.org/bot<telegram-token>/sendMessage" -d "chat_id=<telegram-user-id>" --data-urlencode "text=Here are the latest research papers on LLMs:

**1. AttentionRetriever: Attention Layers are Secretly Long Document Retrievers**
*Link:* http://arxiv.org/abs/2602.12278v1
*Summary:* This paper proposes AttentionRetriever, a novel long document retrieval model that leverages the attention mechanism and entity-based retrieval to build context-aware embeddings for long documents and determine the scope of retrieval.

**2. Agentic Test-Time Scaling for WebAgents**
*Link:* http://arxiv.org/abs/2602.12276v1
*Summary:* This work introduces Confidence-Aware Test-Time Scaling (CATTS), a technique that uses vote-derived uncertainty to dynamicall

#### 3.6.3 Non-Blocking Heartbeat (Background Thread)

```
- Main loop → reactive user input
- Background thread → periodic tasks (heartbeat)
- Both run concurrently, independent of each other
```
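
The pattern can be tried in isolation, without the LLM. The sketch below swaps `time.sleep` for `threading.Event.wait`, which is an assumption on top of this notebook's approach: it lets the main thread stop the heartbeat promptly instead of waiting out the interval. `start_heartbeat` is a hypothetical helper, and the task is a stand-in for the `respond` call.

```python
import threading
import time


def start_heartbeat(task, interval_seconds, stop_event):
    """Run task() every interval_seconds in a daemon thread until stop_event is set."""
    def loop():
        while not stop_event.is_set():
            task()
            # wait() returns as soon as stop_event is set, unlike time.sleep().
            stop_event.wait(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


beats = []
stop = threading.Event()

# Stand-in task: record a timestamp instead of calling the model.
worker = start_heartbeat(lambda: beats.append(time.time()), 0.05, stop)

time.sleep(0.2)  # the main thread is free to handle user input here
stop.set()
worker.join(timeout=1)

print(f"heartbeat ran {len(beats)} times")
```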

In [None]:
import threading
import time

# Background Loop for TASKS.md
# -----------------------------
def heartbeat_loop(HEARTBEAT_INTERVAL):
    while True:
        
        query = "Read TASKS.md, and execute the instructions"
        response = respond([{"role": "user", "parts": [{"text": query}]}], config)
        print(f"Assistant (heartbeat): {response}\n")
        time.sleep(HEARTBEAT_INTERVAL)

# Run heartbeat in background
threading.Thread(target=heartbeat_loop, args=(86400,), daemon=True).start()


# Main reactive loop
# -------------------
while True:
    query = input()
    print("User: ", query)
    if query.lower() in ['exit', 'quit']:
        break
    response = respond([{"role": "user", "parts": [{"text": query}]}], config)
    print(f"Assistant: {response}\n")


[Executing: cat SKILLS.md]
[Executing: cat TASKS.md]
[Executing: curl "https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending"]
User:  Add this to TASKS.md: find research papers on llm from arxiv and send them to me on telegram
[Executing: echo "find research papers on llm from arxiv and send them to me on telegram" >> TASKS.md]




Assistant: I have added the task to `TASKS.md`. Now I will read the `SKILLS.md` and `TASKS.md` files to get a complete understanding of my capabilities and all assigned tasks.

User:  
[Executing: cat .env]
[Executing: cat SKILLS.md]
[Executing: cat TASKS.md]
[Executing: curl "https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending"]
[Executing: curl "https://export.arxiv.org/api/query?search_query=all:llm&start=0&max_results=3&sortBy=submittedDate&sortOrder=descending" > arxiv_papers.xml]
[Executing: sudo apt-get install -y xml-twig-tools]
[Executing: 
MESSAGE="**New Research Papers on LLM**

**1. AttentionRetriever: Attention Layers are Secretly Long Document Retrievers**
*Summary:* Retrieval augmented generation (RAG) has been widely adopted to help Large Language Models (LLMs) to process tasks involving long documents. However, existing retrieval models are not designed for long document retrieval and fail to address s