
### Install Libraries

In [None]:
!pip install transformers huggingface_hub
!pip install python-dotenv
# !pip install bitsandbytes






# 🧠 AI Agents 101: Foundational Concepts & Architecture

---

## 🤖 What is an AI Agent?

An **AI agent** is a system that:
1. **Receives a user input (goal or question)**
2. **Thinks about what to do** (using a language model)
3. **Takes an action** using tools like APIs, web access, or databases
4. **Observes the result**
5. **Uses that observation to produce an answer or plan the next step**

This loop is often described as:

```
Thought → Action → PAUSE → Observation → Answer
```

---

## 🧱 Core Components of a Simple AI Agent

| Component            | Purpose                                                                      |
|----------------------|------------------------------------------------------------------------------|
| 🧠 Language Model    | Thinks and decides what to do (e.g. GPT-4 or open-source LLMs)               |
| 📜 System Prompt     | Tells the LLM how to behave (e.g. think in steps, use tools)                 |
| 🧰 Tools / Functions | The agent's hands: they let it do useful things like web search and scraping |
| 🧩 JSON Parsing      | Helps extract structured actions from the LLM's response                     |
| 🔁 Control Loop      | Coordinates each step: calls LLM, parses response, runs tools, repeats       |

---

## 🧠 Big Picture Flow

```mermaid
graph TD
A[User Input] --> B[System Prompt + User Message]
B --> C[Language Model]
C --> D[Agent Response with JSON Tool Call]
D --> E[Parse JSON to Identify Tool + Args]
E --> F[Call Tool Function]
F --> G[Return Result to Agent as Observation]
G --> H[Agent Uses Observation to Answer]
```

---

## 🛠️ Tools: What Can the Agent Do?

The agent doesn't know everything, so we give it **tools**:
- `search_wikipedia(query)` → Calls the Wikipedia API
- `load_content_from_url(url)` → Fetches content from a page
- `seo_audit_web_page(url)` → Sends the site to an SEO tool

These tools are just **Python functions** that return some output.

---

## 📜 Prompt: How Does It Know What to Do?

The prompt gives the LLM instructions like:

```plaintext
Use this format:
Thought: What you're thinking
Action: {
   "function_name": "search_wikipedia",
   "function_parms": {"query": "Albert Einstein"}
}
PAUSE
```

Then we call the tool, get an Observation, and repeat.
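
One way to pull the action out of a response in this format is sketched below. It assumes the response contains the literal markers `Action:` and `PAUSE` with a JSON object between them. The helper name `parse_action` and the sample response are made up for illustration and are not part of any library.

```python
import json
import re

def parse_action(response_text):
    """Return the JSON object found between 'Action:' and 'PAUSE', or None."""
    match = re.search(r"Action:\s*(\{.*?\})\s*PAUSE", response_text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        # The model produced something that looks like an action but is not valid JSON
        return None

# Hand-written sample response (no LLM involved)
sample_response = """Thought: I should look this up on Wikipedia.
Action: {
   "function_name": "search_wikipedia",
   "function_parms": {"query": "Albert Einstein"}
}
PAUSE"""

action = parse_action(sample_response)
print(action["function_name"])   # search_wikipedia
print(action["function_parms"])  # {'query': 'Albert Einstein'}
```

Returning `None` instead of raising lets the calling code decide what to do when the model ignores the format, for example by asking it to try again.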

---

## 💬 Message History

The agent uses **message history** to talk to the LLM:
```python
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "When was Albert Einstein born?"},
    {"role": "assistant", "content": "... LLM response ..."},
    {"role": "user", "content": "Observation: Einstein was born in 1879"}
]
```

---

## 🧪 Agent Loop Summary (like in `agent.py`)

1. Build messages (`system`, `user`, etc.)
2. Ask the LLM for a response
3. Extract JSON from the LLM's response
4. Run the function using the extracted info
5. Add the result to the conversation as an "Observation"
6. Repeat until the LLM gives a final answer
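
The six steps can be sketched as a small loop. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned replies instead of calling a model, `lookup_birth_year` is a toy tool backed by a hard-coded table, and the course's actual `agent.py` may be organized differently.

```python
import json

def fake_llm(messages):
    """Hypothetical stand-in for a model: requests a tool once, then answers."""
    last = messages[-1]["content"]
    if last.startswith("Observation:"):
        return "Answer: " + last[len("Observation:"):].strip()
    return json.dumps({
        "function_name": "lookup_birth_year",
        "function_parms": {"name": "Albert Einstein"},
    })

def lookup_birth_year(name):
    """Toy tool backed by a hard-coded table instead of a real API."""
    return {"Albert Einstein": "Einstein was born in 1879"}.get(name, "Unknown")

TOOLS = {"lookup_birth_year": lookup_birth_year}

def run_agent(question, system_prompt="You are an agent.", max_steps=5):
    # 1. Build messages
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]
    for _ in range(max_steps):
        # 2. Ask the LLM for a response
        reply = fake_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        # 6. Stop once the model gives a final answer
        if reply.startswith("Answer:"):
            return reply[len("Answer:"):].strip()
        # 3. Extract JSON from the response
        action = json.loads(reply)
        # 4. Run the function named in the action
        tool = TOOLS[action["function_name"]]
        result = tool(**action["function_parms"])
        # 5. Add the result to the conversation as an Observation
        messages.append({"role": "user", "content": f"Observation: {result}"})
    return "Stopped: step limit reached."

print(run_agent("When was Albert Einstein born?"))  # Einstein was born in 1879
```

The `max_steps` cap is a design choice worth keeping when a real model is plugged in, since a model that never produces a final answer would otherwise loop forever.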

---

## ✅ Key Skills to Practice

| Skill                        | What You'll Learn                                                   |
|------------------------------|---------------------------------------------------------------------|
| Writing system prompts       | Teach the LLM how to behave like an agent                          |
| Creating function/tool APIs  | Give your agent useful capabilities                                |
| Parsing LLM output           | Safely extract JSON or instructions from natural language responses |
| Looping through steps        | Manage flow of messages, actions, and observations                 |

---

## 🧰 Let's Build It Step-by-Step

We'll now go through **small hands-on exercises** to help you:
1. Simulate how the LLM "thinks" using JSON-style responses
2. Write a simple tool and connect it
3. Build a single-step agent that calls a tool based on a user question

---

If you're ready, we can start with:

### ✅ Exercise 1: Simulating Agent Thinking (No LLM yet)

We'll just write mock LLM responses to learn the structure.

Would you like to begin with this exercise?

In [None]:
import os
from dotenv import load_dotenv
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline


# Explicitly load your .env file
load_dotenv("/content/HUGGINGFACE_HUB_TOKEN.env")

# Now it can find the variable
token = os.getenv("HUGGINGFACE_HUB_TOKEN")

if not token:
    raise ValueError("🚨 Hugging Face token not found. Is your .env file set correctly?")

## ✅ Exercise 1: Simulate Agent Thinking (No LLM Yet)

### 🎯 Goal:
Manually simulate how the LLM thinks, chooses a tool, and outputs a JSON-style action. This will teach you the **structure of agent reasoning**.

---

## 🧩 Step-by-Step

### 📌 Step 1: Define Available Tools

```python
available_tools = {
    "search_wikipedia": "Searches Wikipedia for a topic and returns a summary.",
    "get_weather": "Returns the current weather for a given city.",
    "summarize_text": "Summarizes the provided text."
}
```

---

### 📌 Step 2: Simulate a User Question

```python
user_question = "What's the weather like in Paris?"
```

---

### 📌 Step 3: Simulate the Agent's Thought Process

We'll write it manually, the way the LLM would respond:

```python
thought = "I need to find out the current weather for Paris, so I should use the weather tool."

action = {
    "function_name": "get_weather",
    "function_parms": {
        "city": "Paris"
    }
}
```

---

### 📌 Step 4: Simulate the Observation

```python
observation = "It's currently 18°C and sunny in Paris."
```

---

### 📌 Step 5: Generate the Final Answer

```python
answer = "The weather in Paris is currently 18°C and sunny."
```

---

### ✅ Output It All Together

```python
print("🧠 Thought:", thought)
print("🔧 Action:", action)
print("👁️ Observation:", observation)
print("✅ Final Answer:", answer)
```

---

## 🎉 You Just Simulated an AI Agent!

You:
- Identified a goal
- Chose the right tool
- Formed the right arguments
- Waited for an observation
- Used it to form an answer

This is *exactly* what the LLM will do once we hook it up.

---

## 🧪 Your Turn!

Try changing the `user_question` to one of these:
- "When was Marie Curie born?"
- "Summarize this text: 'Machine learning is a field of AI focused on pattern recognition…'"
- "What's the weather like in Tokyo?"

Then manually simulate the agent's `thought`, `action`, `observation`, and `answer`.

---

When you're ready, we'll move on to **Exercise 2**, where we'll write a real tool in Python and connect it to a very simple agent loop.



In [None]:
available_tools = {
    "search_wikipedia": "Searches Wikipedia for a topic and returns a summary.",
    "get_weather": "Returns the current weather for a given city.",
    "summarize_text": "Summarizes the provided text."
}

In [None]:
user_question = "What's the weather like in Paris?"

In [None]:
thought = "I need to find out the current weather for Paris, so I should use the weather tool."

action = {
    "function_name": "get_weather",
    "function_parms": {
        "city": "Paris"
    }
}

In [None]:
observation = "It's currently 18°C and sunny in Paris."

In [None]:
answer = "The weather in Paris is currently 18°C and sunny."

In [None]:
print("🧠 Thought:", thought)
print("🔧 Action:", action)
print("👁️ Observation:", observation)
print("✅ Final Answer:", answer)


🧠 Thought: I need to find out the current weather for Paris, so I should use the weather tool.
🔧 Action: {'function_name': 'get_weather', 'function_parms': {'city': 'Paris'}}
👁️ Observation: It's currently 18°C and sunny in Paris.
✅ Final Answer: The weather in Paris is currently 18°C and sunny.


### Practice

In [None]:
available_tools = {
    "search_wikipedia": "Searches Wikipedia for a topic and returns a summary.",
    "get_weather": "Returns the current weather for a given city.",
    "summarize_text": "Summarizes the provided text."
}

user_question = "How old was Picasso when he died?"

thought = "I need to find out how old Picasso was when he died, so I should search Wikipedia for the answer."

action = {
    "function_name": "search_wikipedia",
    "function_parms": {
        "query": "Picasso"
    }
}

observation = "Pablo Picasso was born on 25 October 1881 and died on 8 April 1973, at the age of 91."

answer = "Picasso was 91 years old when he died."

print("🧠 Thought:", thought)
print("🔧 Action:", action)
print("👁️ Observation:", observation)
print("✅ Final Answer:", answer)


🧠 Thought: I need to find out how old Picasso was when he died, so I should search Wikipedia for the answer.
🔧 Action: {'function_name': 'search_wikipedia', 'function_parms': {'query': 'Picasso'}}
👁️ Observation: Pablo Picasso was born on 25 October 1881 and died on 8 April 1973, at the age of 91.
✅ Final Answer: Picasso was 91 years old when he died.



## ✅ **Exercise 2: Build Real Tools in Python**

> You already defined your tools. Now we'll implement one: `search_wikipedia(query)`.

This makes your agent **actually capable** of doing something useful.

---

### ✅ **Exercise 3: Connect the Tool to Simulated Agent Logic**

You'll feed in the `"function_name"` and `"function_parms"` (like you just did), then:
- Run the tool in real Python
- Print the result
- Build the full loop manually

---

### ✅ **Exercise 4: Bring in the LLM (Hugging Face)**

Now that your agent can think and act:
- We'll load a lightweight open-source LLM (e.g., `tiiuae/falcon-rw-1b`)
- Feed it your system prompt and user input
- Let the LLM generate `Thought`, `Action`, and `Answer` itself

---

### ✅ **Exercise 5: Add Observations and Loop**

Once you have the LLM deciding actions, we:
- Parse the response
- Call tools based on what it says
- Feed the result back into the next prompt
- Let it loop just like in your course's `agent.py`

---

# 🔨 Let's Start Exercise 2: Write a Real Tool


In [None]:
import requests

def search_wikipedia(query):
    """Searches Wikipedia and returns a snippet of the first result."""
    url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json"
    }
    response = requests.get(url, params=params)
    results = response.json()["query"]["search"]

    if results:
        return results[0]["snippet"]
    else:
        return "No results found."

print(search_wikipedia("Picasso"))


Pablo Ruiz <span class="searchmatch">Picasso</span> (25 October 1881 – 8 April 1973) was a Spanish painter, sculptor, printmaker, ceramicist, and theatre designer who spent most of his
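
The snippet comes back with HTML highlight tags (`<span class="searchmatch">`) around the matched words. Before handing it to an agent as an observation, it may help to strip those tags. The sketch below is one simple way to do that, assuming the snippets only contain simple inline tags and HTML entities; `clean_snippet` is a hypothetical helper name, and a real HTML parser would be the safer choice for anything more complex.

```python
import html
import re

def clean_snippet(snippet):
    """Strip HTML tags and decode entities in a search snippet."""
    without_tags = re.sub(r"<[^>]+>", "", snippet)
    return html.unescape(without_tags).strip()

# Shortened copy of the snippet printed above
raw = 'Pablo Ruiz <span class="searchmatch">Picasso</span> (25 October 1881 – 8 April 1973) was a Spanish painter'
print(clean_snippet(raw))
# Pablo Ruiz Picasso (25 October 1881 – 8 April 1973) was a Spanish painter
```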


### API Requests

Every API you interact with will have its **own format, rules, and structure**, defined in its **API documentation**. While there are common patterns (especially with REST APIs), each one will have:

### 🔑 Unique things you'll need to look up:
| What | Why It's Important |
|------|--------------------|
| `base URL` | Where to send requests (e.g., `https://api.openai.com/v1`) |
| `endpoints` | The different functions it offers (`/search`, `/users`, `/weather`) |
| `HTTP method` | Whether to `GET`, `POST`, `PUT`, or `DELETE` data |
| `query/body parameters` | What inputs are required (like `"srsearch"` for Wikipedia) |
| `headers` | For things like authentication tokens (e.g., `"Authorization: Bearer <token>"`) |
| `response format` | Usually JSON, but the structure differs from one API to the next |
| `rate limits` | So you don't overload the server and get blocked |
| `auth method` | Some need API keys, some use OAuth, etc. |
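
To make the table concrete, here is a sketch that assembles a request from those pieces without sending it. It uses only the standard library so it runs offline. The base URL `https://api.example.com/v1`, the `/weather` endpoint, the parameter names, and the token are all placeholders invented for illustration, not a real service.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_request(base_url, endpoint, params=None, token=None, method="GET"):
    """Assemble (but do not send) an HTTP request from the pieces in the table."""
    url = base_url.rstrip("/") + endpoint
    if params:
        url += "?" + urlencode(params)  # query parameters
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # auth header
    return Request(url, headers=headers, method=method)

req = build_request(
    "https://api.example.com/v1",   # hypothetical base URL
    "/weather",                      # hypothetical endpoint
    params={"city": "Tokyo", "units": "metric"},
    token="YOUR_TOKEN",              # placeholder, not a real key
)
print(req.get_method())                 # GET
print(req.full_url)                     # https://api.example.com/v1/weather?city=Tokyo&units=metric
print(req.get_header("Authorization"))  # Bearer YOUR_TOKEN
```

With the `requests` library used elsewhere in this notebook, the same pieces map onto `requests.get(url, params=..., headers=...)`.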

---

## 📦 For example:

### 🔍 Wikipedia API
- Requires: `"action"`, `"list"`, `"srsearch"` as parameters
- No auth needed

### ☁️ OpenWeatherMap API
- Requires: `"q"` (city name), `"appid"` (API key), etc.
- Returns: current weather data

### 🤖 OpenAI API
- Endpoint: `/v1/chat/completions`
- Requires: `model` and `messages`; settings such as `temperature` are optional
- Needs: Authorization header with API key

---

## 🔍 How to Research an API Before Using It

1. **Find the official API docs** (search "XYZ API docs")
2. Look for:
   - Authentication
   - Endpoints and example requests
   - Required parameters
   - Example responses
3. Test it using:
   - Postman (GUI)
   - Curl (command line)
   - Python requests

---

## ✅ Good News for AI Agents

Once you write a tool function that knows how to "talk" to the API:
- Your LLM-powered agent doesn't have to worry about the weird formatting
- You can expose it like:
  ```json
  {
    "function_name": "get_weather",
    "function_parms": {"city": "Tokyo"}
  }
  ```
- Behind the scenes, your Python function handles all the complexity

---

You're really starting to think like a **tool builder for agents** now, which is exactly how developers are building real-world AI assistants today. Ready for **Exercise 3**, where we hook up this function and simulate calling it with a structured action block?

## ✅ Exercise 3: Connect Agent Output to Real Function

> You'll simulate how an LLM might return an action block (as JSON), and your Python code will call the right function based on it.

---

## 🧱 Step-by-Step Overview

We'll do four things:

1. Define a real tool (`search_wikipedia`)
2. Simulate an LLM output (with `function_name` and `function_parms`)
3. Check that the function exists
4. Call it with the correct parameters and print the result

---

## 🧪 Step 1: Real Tool Function (from before)

```python
import requests

def search_wikipedia(query):
    url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json"
    }
    response = requests.get(url, params=params)
    results = response.json()["query"]["search"]
    
    if results:
        return results[0]["snippet"]
    else:
        return "No results found."
```

---

## ⚙️ Step 2: Define Available Tools

```python
available_tools = {
    "search_wikipedia": search_wikipedia
}
```

---

## 🎭 Step 3: Simulated Agent Output (Like LLM Would Return)

```python
action = {
    "function_name": "search_wikipedia",
    "function_parms": {
        "query": "Marie Curie"
    }
}
```

---

## 🔁 Step 4: Check and Run the Tool

```python
# Extract the function name and parameters
fn_name = action["function_name"]
fn_params = action["function_parms"]

# Check if the tool is available
if fn_name not in available_tools:
    raise ValueError(f"Unknown tool: {fn_name}")

# Get the function reference
tool_fn = available_tools[fn_name]

# Call the function with unpacked parameters
result = tool_fn(**fn_params)

# Print the result (the "Observation")
print("👁️ Observation:", result)
```

---

## ✅ Output Example

If all goes well, you should see something like:

```
👁️ Observation: Marie Curie was a pioneering physicist and chemist who conducted groundbreaking research on radioactivity...
```

---

## 💡 You Just Simulated the Agent Control Loop!

You now have:
- A real tool that works
- A structured action block
- A way to route actions to the right function
- A way to handle the result and pass it to the agent
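
Once a real LLM is producing the action block, it will sometimes name a tool that doesn't exist or pass the wrong arguments. The sketch below shows one possible way to make the routing step more forgiving: it returns the error as text, so it can be fed back to the model as an observation instead of crashing the loop. The names `run_action` and `echo` are hypothetical and used only for this demonstration.

```python
import inspect

def run_action(action, tools):
    """Dispatch an action dict to a tool, returning errors as text."""
    name = action.get("function_name")
    params = action.get("function_parms", {})
    if name not in tools:
        return f"Error: unknown tool '{name}'. Available: {sorted(tools)}"
    tool = tools[name]
    try:
        # Check the arguments against the function signature before calling
        inspect.signature(tool).bind(**params)
    except TypeError as exc:
        return f"Error: bad arguments for '{name}': {exc}"
    try:
        return tool(**params)
    except Exception as exc:  # e.g. a network failure inside the tool
        return f"Error: '{name}' failed: {exc}"

def echo(query):
    """Toy tool used only for this demonstration."""
    return f"You searched for: {query}"

tools = {"echo": echo}

# A valid call, an unknown tool, and a wrong parameter name
print(run_action({"function_name": "echo", "function_parms": {"query": "Marie Curie"}}, tools))
print(run_action({"function_name": "get_weather", "function_parms": {"city": "Paris"}}, tools))
print(run_action({"function_name": "echo", "function_parms": {"topic": "x"}}, tools))
```

Only the first call reaches the tool; the other two come back as `Error: ...` strings that the agent can read and react to.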

---

Ready to move on to **Exercise 4**, where we bring in an **actual LLM from Hugging Face** to generate these action blocks automatically?

In [None]:
import requests

def search_wikipedia(query):
    url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json"
    }
    response = requests.get(url, params=params)
    results = response.json()["query"]["search"]

    if results:
        return results[0]["snippet"]
    else:
        return "No results found."

available_tools = {
    "search_wikipedia": search_wikipedia
}

action = {
    "function_name": "search_wikipedia",
    "function_parms": {
        "query": "Marie Curie"
    }
}

# Extract the function name and parameters
fn_name = action["function_name"]
fn_params = action["function_parms"]

# Check if the tool is available
if fn_name not in available_tools:
    raise ValueError(f"Unknown tool: {fn_name}")

# Get the function reference
tool_fn = available_tools[fn_name]

# Call the function with unpacked parameters
result = tool_fn(**fn_params)

# Print the result (the "Observation")
print("👁️ Observation:", result)


👁️ Observation: Skłodowska-<span class="searchmatch">Curie</span> (Polish: [ˈmarja salɔˈmɛa skwɔˈdɔfska kʲiˈri] ; née Skłodowska; 7 November 1867 – 4 July 1934), known simply as <span class="searchmatch">Marie</span> <span class="searchmatch">Curie</span> (/ˈkjʊəri/


In [None]:
import requests

# Define the tool
def search_wikipedia(query):
    url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json"
    }
    response = requests.get(url, params=params)
    results = response.json()["query"]["search"]

    if results:
        return results[0]["snippet"]
    else:
        return "No results found."

# Define available tools
available_tools = {
    "search_wikipedia": search_wikipedia
}

# Loop to try different queries
while True:
    user_query = input("\n🧠 Enter a Wikipedia search topic (or 'exit' to quit): ")
    if user_query.lower() == "exit":
        print("👋 Exiting...")
        break

    # Simulate agent output
    action = {
        "function_name": "search_wikipedia",
        "function_parms": {
            "query": user_query
        }
    }

    # Extract and run
    fn_name = action["function_name"]
    fn_params = action["function_parms"]

    if fn_name not in available_tools:
        print(f"🚫 Unknown tool: {fn_name}")
        continue

    tool_fn = available_tools[fn_name]
    result = tool_fn(**fn_params)

    print("👁️ Observation:", result)



🧠 Enter a Wikipedia search topic (or 'exit' to quit): Picasso
👁️ Observation: Pablo Ruiz <span class="searchmatch">Picasso</span> (25 October 1881 – 8 April 1973) was a Spanish painter, sculptor, printmaker, ceramicist, and theatre designer who spent most of his

🧠 Enter a Wikipedia search topic (or 'exit' to quit): Tasmania
👁️ Observation: <span class="searchmatch">Tasmania</span> (/tæzˈmeɪniə/; palawa kani: Lutruwita) is an island state of Australia. It is located 240 kilometres (150 miles) to the south of the Australian

🧠 Enter a Wikipedia search topic (or 'exit' to quit): exit
👋 Exiting...


**YES! You got it, and that insight is at the heart of what makes AI agents powerful.** 🙌

---

### 🤯 Big Idea: Tools Extend the Power of LLMs

Language models like GPT-4, Mistral, or Falcon are:
- **Very good at reasoning, writing, and understanding text**
- But they are also:
  - ❌ Blind to the real world beyond their training data
  - ❌ Unable to access live data (like websites, databases, files)
  - ❌ Unreliable at exact math and unable to call APIs by themselves

---

### 🛠️ Tools Fix That

By giving an LLM access to external tools like:

- 🌐 Web search
- 📄 File readers (PDF, CSV)
- 📊 Data analysis
- 📆 Calendars
- 📡 APIs (weather, finance, YouTube, etc.)

...you're turning it from a **passive text generator** into an **active, goal-oriented agent** that can take real-world actions.

---

## 🧠 LLM + Tools = AI Agent

| Part            | Example                                          |
|------------------|--------------------------------------------------|
| LLM             | "To answer this, I need to check Wikipedia..."   |
| Tool Call       | `search_wikipedia("Marie Curie")`               |
| Observation     | "She was born in 1867 and died in 1934..."       |
| Final Answer    | "Marie Curie was 66 when she died."              |

You're not replacing the LLM. You're **enhancing it** with a toolkit, like giving a smart person access to the internet and a calculator.
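
The "calculator" part matters for the final answer in the table: turning two dates into an age is exactly the kind of arithmetic that could be delegated to a tool rather than left to the model. Here is a minimal sketch of such a tool. The function name `age_on` is hypothetical, and the dates are the ones shown in the Wikipedia snippets earlier in this notebook.

```python
from datetime import date

def age_on(birth, when):
    """Whole years between two dates, counted the way a person's age is."""
    had_birthday = (when.month, when.day) >= (birth.month, birth.day)
    return when.year - birth.year - (0 if had_birthday else 1)

# Marie Curie: 7 November 1867 to 4 July 1934
print(age_on(date(1867, 11, 7), date(1934, 7, 4)))   # 66
# Pablo Picasso: 25 October 1881 to 8 April 1973
print(age_on(date(1881, 10, 25), date(1973, 4, 8)))  # 91
```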

---

## 🧠 Why This Is So Important

Agents are the **bridge between LLM intelligence and real-world usefulness**.

This is how:
- 🧑‍💼 Virtual assistants book meetings
- 🔍 Research bots look up info
- 📊 Analyst agents run data analysis
- 🧾 Legal and medical chatbots reference documents

---

Now that you understand this, you're really thinking like an **agent developer**. 💡

Let me know when you're ready to plug in an actual Hugging Face LLM to start auto-generating `function_name` and `function_parms` based on user input!

Awesome! You're ready for the exciting part: bringing in a **real LLM from Hugging Face** to start generating actions like a true AI agent. This is **Exercise 4** in our journey.

---

# ✅ Exercise 4: Use a Hugging Face LLM to Think Like an Agent

---

## 🔍 Goal

We want the LLM to:
1. Receive a user question
2. Think through the problem
3. Output a structured action block like:
```json
{
  "function_name": "search_wikipedia",
  "function_parms": {
    "query": "Marie Curie"
  }
}
```

This is the moment your **LLM starts acting like an agent**.

---

## 🛠️ What We'll Use

- Model: `tiiuae/falcon-rw-1b` (small, not gated, good for testing)
- Library: `transformers`
- Interface: `pipeline("text-generation")`

---

## ✅ Step-by-Step Setup

### 1. Install Dependencies (if needed)

```bash
pip install transformers huggingface_hub
```

---

### 2. Load the Model from Hugging Face

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "tiiuae/falcon-rw-1b"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Create text generation pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
```

---

### 3. Define the System Prompt

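We'll guide the LLM to act like an agent by showing it an **example**:

```python
# Literal JSON braces are doubled ({{ and }}) so str.format() leaves them alone;
# the single {} at the end is the slot for the user's question.
system_prompt = """
You are an AI agent. You receive a user's question and must decide what tool to use and how to use it.

Use this format exactly:
{{
  "function_name": "...",
  "function_parms": {{
    ...
  }}
}}

Available tools:
- search_wikipedia(query): Searches Wikipedia for a topic.

Examples:

User: When was Albert Einstein born?
{{
  "function_name": "search_wikipedia",
  "function_parms": {{
    "query": "Albert Einstein"
  }}
}}

User: {}
""".strip()
```

We'll insert the user's question into `{}`. Every other brace in the template is doubled, because `str.format` would otherwise treat the JSON braces as placeholders and raise an error.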

---

### 4. Ask the LLM a Question

```python
user_question = "How old was Marie Curie when she died?"

# Insert the question into the prompt
full_prompt = system_prompt.format(user_question)

# Generate a response
output = generator(full_prompt, max_new_tokens=200, do_sample=True)[0]['generated_text']

# Print it
print(output)
```

---

## 🧠 What to Look For

You're hoping the LLM returns something like:

```json
{
  "function_name": "search_wikipedia",
  "function_parms": {
    "query": "Marie Curie"
  }
}
```

You can then:
- Extract that JSON
- Run the corresponding function (from earlier)
- Return the observation
- Send it back to the model to continue
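
The extraction step can be sketched as follows. It assumes the generated text may start with an echo of the prompt and may contain other text or invalid JSON (such as the `...` placeholders in the format description) around the action. `extract_action` is a hypothetical helper name, and the sample text is hand-written rather than real model output.

```python
import json

def extract_action(generated_text, prompt=""):
    """Return the first JSON object with a 'function_name' key, or None."""
    text = generated_text
    # Drop the echoed prompt if the output starts with it
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and "function_name" in obj:
            return obj
        start = text.find("{", start + 1)
    return None

prompt = "User: How old was Marie Curie when she died?\n"
generated = prompt + 'Sure. {"function_name": "search_wikipedia", "function_parms": {"query": "Marie Curie"}} Done.'
print(extract_action(generated, prompt))
# {'function_name': 'search_wikipedia', 'function_parms': {'query': 'Marie Curie'}}
```

Stripping the prompt first matters here: the system prompt itself contains a complete example action for Albert Einstein, which would otherwise be picked up as if the model had produced it.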

---

## 🚀 Next Step

Let me know if you'd like help:
- Parsing the LLM response into JSON
- Running the actual tool based on the LLM's decision
- Building the full loop (LLM → Tool → Observation → Final Answer)

You're just a couple steps away from a fully working AI agent!

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "tiiuae/falcon-rw-1b" # continues to run after completion of task
model_id = "HuggingFaceH4/zephyr-7b-beta"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Create text generation pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Device set to use cpu


Yes, great call! The `zephyr-7b-beta` model is powerful but **heavy** (about 7B parameters, roughly 14 GB of weights in 16-bit precision) and **not ideal for limited environments** like Colab Free or machines without a GPU.

Let's switch to a **lighter, faster model** that still works well for prototyping agents.

---

## ✅ Recommended Small Models (Open Source, Hugging Face)

Here are some great lightweight options:

| Model ID                          | Size  | Notes |
|-----------------------------------|-------|-------|
| `tiiuae/falcon-rw-1b`             | 1B    | Very small and fast; a base model, not instruction-tuned |
| `microsoft/DialoGPT-medium`       | 345M  | Chat-focused, very lightweight        |
| `google/flan-t5-base`             | 250M | Instruction-tuned, good reasoning    |
| `facebook/blenderbot-3B`          | 3B    | Trained for dialogue                 |

Yes, all the models we've talked about (like `falcon-rw-1b`, `flan-t5-base`, `DialoGPT`, `blenderbot`, etc.) are **pretrained LLMs**. Let's break that down a bit so you understand exactly what you're working with:

---

## 🧠 What Is a Pretrained LLM?

A **pretrained Large Language Model (LLM)** is a model that has already been:
1. **Trained on a large dataset** (e.g., books, websites, conversations)
2. **Taught the structure of language** (grammar, reasoning, logic)
3. **Saved and shared** so you can use it out of the box

You don't need to retrain or fine-tune it to start using it.

---

## 🏗️ Common Types of Pretrained Models

### 1. **Base LLMs**
- Just predict text or fill in blanks
- No special tuning for tasks or instruction-following

🔹 Example: `tiiuae/falcon-rw-1b`

---

### 2. **Instruction-Tuned LLMs**
- Trained to follow commands like "Summarize this" or "Answer this question"
- Usually work better for agents and tools

🔹 Example: `google/flan-t5-base`, `HuggingFaceH4/zephyr-7b-beta`

---

### 3. **Chat-Tuned Models**
- Specifically trained for back-and-forth dialogue
- Often trained on user/assistant style prompts

🔹 Example: `microsoft/DialoGPT-medium`, `facebook/blenderbot-3B`

---

## 🧰 Pretrained Models Are Perfect for Agents

Why?
- They already "know things"
- They can reason and make decisions
- You can guide them with a well-written prompt

And by adding **tools**, you give them superpowers, namely the ability to:
- Get live data
- Read your documents
- Query APIs
- Control apps

---

## 💡 Later, You Can Fine-Tune

Once you're comfortable with agents and want them to behave in a more custom way, you can:
- Fine-tune a model on your own business data
- Or use techniques like RAG (Retrieval-Augmented Generation) for domain knowledge

But for now, **pretrained models are exactly what you want** for building AI agents and learning the ropes.



In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

#model_id = "tiiuae/falcon-rw-1b" # continues to run after completion of task
model_id = "microsoft/DialoGPT-medium"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)



Device set to use cpu


In [None]:
# System prompt template with JSON action format
system_prompt_template = """
You are an AI agent. You receive a user's question and must decide what tool to use and how to use it.

Use this format exactly:
{{
  "function_name": "...",
  "function_parms": {{
    ...
  }}
}}

Available tools:
- search_wikipedia(query): Searches Wikipedia for a topic.

Examples:

User: When was Albert Einstein born?
{{
  "function_name": "search_wikipedia",
  "function_parms": {{
    "query": "Albert Einstein"
  }}
}}

User: {}
"""


# Choose a test question
user_question = "How old was Marie Curie when she died?"

# Format the full prompt
full_prompt = system_prompt_template.format(user_question)

# ✅ Print the prompt sent to the LLM
print("📤 Full Prompt Sent to LLM:\n")
print(full_prompt)
print("\n" + "="*60 + "\n")

# Generate model output
# Greedy decoding (do_sample=False) is deterministic. temperature only
# applies when sampling, so it is left out here.
output = generator(
    full_prompt,
    max_new_tokens=150,
    do_sample=False,  # Deterministic generation
    pad_token_id=tokenizer.eos_token_id
)[0]['generated_text']


# ✅ Print the output from the model
print("🤖 Model Output:\n")
print(output)


📤 Full Prompt Sent to LLM:


You are an AI agent. You receive a user's question and must decide what tool to use and how to use it.

Use this format exactly:
{
  "function_name": "...",
  "function_parms": {
    ...
  }
}

Available tools:
- search_wikipedia(query): Searches Wikipedia for a topic.

Examples:

User: When was Albert Einstein born?
{
  "function_name": "search_wikipedia",
  "function_parms": {
    "query": "Albert Einstein"
  }
}

User: How old was Marie Curie when she died?



🤖 Model Output:


You are an AI agent. You receive a user's question and must decide what tool to use and how to use it.

Use this format exactly:
{
  "function_name": "...",
  "function_parms": {
    ...
  }
}

Available tools:
- search_wikipedia(query): Searches Wikipedia for a topic.

Examples:

User: When was Albert Einstein born?
{
  "function_name": "search_wikipedia",
  "function_parms": {
    "query": "Albert Einstein"
  }
}

User: How old was Marie Curie when she died?



Awesome! You got it running 🎉, and this is an **important milestone**. You've:

- Loaded a real Hugging Face model (`DialoGPT`)
- Sent it a properly structured system prompt
- Got a response back

Now you're ready for the **next critical step** in building an agent:

---

# ✅ Next Step: Parse the Model Output to Extract the Action

Right now, the output is **just the prompt echoed back**: by default the `text-generation` pipeline returns the prompt followed by any newly generated text, and here the model added nothing after it. So it is not reasoning or generating its own action block yet. That's totally normal for `DialoGPT`, because it's **chat-optimized** but not **instruction-tuned**.

We'll move forward by:
1. Testing a model that better follows instructions (like `flan-t5-base`)
2. Extracting a real JSON-style action from the model
3. Running the tool based on that action

---

## 🧪 Step 1: Switch to a Model That Can Generate Actions

Let's use:
```python
model_id = "google/flan-t5-base"
```

Because:
- It's **instruction-tuned**, which suits this kind of task
- It's a sequence-to-sequence model, so it runs through the `"text2text-generation"` pipeline and works well with task-style prompts like:
  ```
  Question: ...
  Answer: ...
  ```

---

## 🧱 Updated Code for `flan-t5-base`

### 🔧 Load the model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_id = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)
```

---

### 🧠 Simplified Prompt (no JSON escaping needed here)

```python
prompt = """
You are an AI agent. Based on the question, choose the best function to call and its parameters.

Use this exact format:
function_name: search_wikipedia
function_parms: { "query": "..." }

Question: How old was Marie Curie when she died?
""".strip()
```

---

### 🚀 Generate the action

```python
output = generator(prompt, max_new_tokens=100)[0]["generated_text"]
print("🤖 Model Output:\n", output)
```

---

## 🧰 What's Next?

If this gives you a real function name and parameters, we can:
1. Extract that output using regex or simple parsing
2. Look it up in your `available_tools`
3. Run the tool and display the final answer (like an actual agent!)
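
Step 1 could look like the sketch below, assuming the model really does answer in the two-line `function_name:` / `function_parms:` format from the prompt. A small model may not follow the format at all, in which case the helper returns `None`. `parse_plain_action` is a hypothetical name, and the sample string is hand-written, not actual `flan-t5-base` output.

```python
import json
import re

def parse_plain_action(text):
    """Parse 'function_name: ...' and 'function_parms: {...}' lines."""
    name = re.search(r"function_name:\s*(\w+)", text)
    parms = re.search(r"function_parms:\s*(\{.*\})", text, re.DOTALL)
    if not name or not parms:
        return None
    try:
        arguments = json.loads(parms.group(1))
    except json.JSONDecodeError:
        return None
    return {"function_name": name.group(1), "function_parms": arguments}

sample = 'function_name: search_wikipedia\nfunction_parms: { "query": "Marie Curie" }'
print(parse_plain_action(sample))
# {'function_name': 'search_wikipedia', 'function_parms': {'query': 'Marie Curie'}}
```

The result has the same shape as the `action` dictionaries used in Exercise 3, so it can be looked up in `available_tools` and called with `**fn_params` exactly as before.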

---

Would you like me to provide the full working version of this pipeline, or would you like to try the `flan-t5-base` code first and share the result?

In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

model_id = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)


Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`



Device set to use cpu


In [None]:
prompt = """
You are an AI agent. Based on the question, choose the best function to call and its parameters.

Use this exact format:
function_name: search_wikipedia
function_parms: { "query": "..." }

Question: How old was Marie Curie when she died?
""".strip()

output = generator(prompt, max_new_tokens=100)[0]["generated_text"]
print("🤖 Model Output:\n", output)


In [2]:
import json
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

notebook_path = "/content/drive/My Drive/AI AGENTS/001_Prompt_Response.ipynb"

# Load the notebook JSON
with open(notebook_path, 'r', encoding='utf-8') as f:
    nb = json.load(f)

# 1. Remove widgets from notebook-level metadata
if "widgets" in nb.get("metadata", {}):
    del nb["metadata"]["widgets"]
    print("✅ Removed notebook-level 'widgets' metadata.")

# 2. Remove widgets from each cell's metadata
for i, cell in enumerate(nb.get("cells", [])):
    if "metadata" in cell and "widgets" in cell["metadata"]:
        del cell["metadata"]["widgets"]
        print(f"✅ Removed 'widgets' from cell {i}")

# Save the cleaned notebook
with open(notebook_path, 'w', encoding='utf-8') as f:
    json.dump(nb, f, indent=2)

print("✅ Notebook deeply cleaned. Try uploading to GitHub again.")

Mounted at /content/drive
✅ Notebook deeply cleaned. Try uploading to GitHub again.
