# Multi-Step Agent with Qwen/Qwen3-4B-Instruct-2507

This notebook implements a multi-step agent using the **Qwen/Qwen3-4B-Instruct-2507** model.
It supports:
- **Tool Calling**: Arithmetic, Stock/Crypto Prices (yfinance), News Search (Tavily).
- **Contextual History**: Maintains conversation history for follow-up queries.
- **Gradio UI**: Interactive chat interface.

**Note**: This model requires the latest version of `transformers`.
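
A quick environment check before loading the model can save a failed download. The sketch below assumes that 4.51.0 is the first `transformers` release with Qwen3 support, as the Qwen3 model cards state; treat that number as an assumption and adjust it if needed. `version_tuple` and `supports_qwen3` are hypothetical helper names, not part of any library.

```python
import re

def version_tuple(version):
    """Turn a string such as '4.51.0.dev0' into (4, 51, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '4.51' so comparisons stay consistent.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def supports_qwen3(installed, minimum="4.51.0"):
    # Assumption: 4.51.0 is the minimum release with Qwen3 support.
    return version_tuple(installed) >= version_tuple(minimum)

print(supports_qwen3("4.50.3"))
print(supports_qwen3("4.57.0.dev0"))
```

In the notebook itself, `transformers.__version__` would be passed as `installed`.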

## 1. Install Dependencies

In [1]:
# Install latest transformers from source to support Qwen3
!pip install -q git+https://github.com/huggingface/transformers.git accelerate yfinance tavily-python gradio

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


## 2. Configuration & Model Loading

In [2]:
import os
from getpass import getpass
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# --- API Keys ---
try:
    from google.colab import userdata
    HF_TOKEN = userdata.get('HF_TOKEN')
    TAVILY_API_KEY = userdata.get('TAVILY_API_KEY')
except Exception:  # not running in Colab, or the secret could not be fetched
    HF_TOKEN = os.getenv('HF_TOKEN')
    TAVILY_API_KEY = os.getenv('TAVILY_API_KEY')

if not HF_TOKEN:
    HF_TOKEN = getpass("Enter your Hugging Face Token (HF_TOKEN): ")
if not TAVILY_API_KEY:
    TAVILY_API_KEY = getpass("Enter your Tavily API Key (TAVILY_API_KEY): ")

os.environ['HF_TOKEN'] = HF_TOKEN
os.environ['TAVILY_API_KEY'] = TAVILY_API_KEY

# --- Model Loading ---
MODEL_NAME = "Qwen/Qwen3-4B-Instruct-2507"
print(f"Loading {MODEL_NAME}...")

# Auto-detect device
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
TORCH_DTYPE = "auto" if DEVICE == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, token=HF_TOKEN, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=TORCH_DTYPE,
    device_map="auto",
    token=HF_TOKEN,
    trust_remote_code=True
)

def generate_response(messages, tools=None):
    """Generate response from the model, optionally with tools."""
    # Apply the chat template. Passing the tool schemas through the standard
    # `tools` parameter is the correct format for recent transformers releases;
    # the except branch below falls back to a template call without tools.
    
    try:
        text = tokenizer.apply_chat_template(
            messages,
            tools=tools,
            tokenize=False,
            add_generation_prompt=True
        )
    except Exception as e:
        # fallback
        print(f"Warning: apply_chat_template issue ({e}), falling back.")
        text = tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True
        )

    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Explicitly disable sampling parameters to avoid warnings
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=512,
        do_sample=False,
        temperature=None,
        top_p=None,
        top_k=None
    )
    
    output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
    content = tokenizer.decode(output_ids, skip_special_tokens=False)
    return content

Loading Qwen/Qwen3-4B-Instruct-2507...


Error while fetching `HF_TOKEN` secret value from your vault: 'Requesting secret HF_TOKEN timed out. Secrets can only be fetched when running from the Colab UI.'.
You are not authenticated with the Hugging Face Hub in this notebook.
If the error persists, please let us know by opening an issue on GitHub (https://github.com/huggingface/huggingface_hub/issues/new).


Loading weights:   0%|          | 0/398 [00:00<?, ?it/s]



## 3. Tool Definitions

In [3]:
import yfinance as yf
import re
import json
from tavily import TavilyClient

# --- Helper Functions ---
def resolve_symbol(symbol):
    if not symbol: return None
    symbol = symbol.strip().upper()
    # Basic mappings
    MAPPING = {
        "BITCOIN": "BTC-USD", "BTC": "BTC-USD",
        "ETHEREUM": "ETH-USD", "ETH": "ETH-USD",
        "NVIDIA": "NVDA", "GOOGLE": "GOOGL", "APPLE": "AAPL",
        "AMAZON": "AMZN", "MICROSOFT": "MSFT", "TESLA": "TSLA"
    }
    # Exact name or alias match
    if symbol in MAPPING: return MAPPING[symbol]
    # If it already looks like a ticker (1-5 letters), use it as-is
    if re.match(r'^[A-Z]{1,5}$', symbol):
        return symbol
    # Otherwise look for a known name inside a longer phrase (e.g. "APPLE INC")
    for k, v in MAPPING.items():
        if k in symbol: return v
    # Fallback: return the input unchanged; no yfinance search is performed here
    return symbol

# --- Tools Implementations ---
def arithmetic_tool(op, a, b):
    try:
        a, b = float(a), float(b)
        if op == 'add': return a + b
        if op == 'subtract': return a - b
        if op == 'multiply': return a * b
        if op == 'divide': return a / b if b != 0 else "Error: Div0"
    except (TypeError, ValueError): return "Error: Invalid numbers"
    return "Error: Unknown Op"

def get_price(symbol):
    resolved = resolve_symbol(symbol)
    try:
        ticker = yf.Ticker(resolved)
        # fast_info is often faster/more reliable than history for current price
        price = ticker.fast_info.last_price
        if price:
            return price
        # Fallback to history
        hist = ticker.history(period="1d")
        if not hist.empty:
            return hist['Close'].iloc[-1]
        return f"No price found for {resolved}"
    except Exception as e:
        return f"Error fetching price for {symbol}: {e}"

def get_news(query):
    if not TAVILY_API_KEY:
        return "Error: TAVILY_API_KEY not set."
    try:
        client = TavilyClient(api_key=TAVILY_API_KEY)
        response = client.search(query, search_depth="basic", max_results=3)
        results = response.get('results', [])
        if not results: return "No news found."
        return "\n".join([f"- {r['title']} ({r['url']})" for r in results])
    except Exception as e:
        return f"Error fetching news: {e}"

# --- Tool Schemas (JSON format for Qwen) ---
tools = [
    {
        "type": "function",
        "function": {
            "name": "arithmetic_tool",
            "description": "Perform basic arithmetic operations (add, subtract, multiply, divide).",
            "parameters": {
                "type": "object",
                "properties": {
                    "op": {"type": "string", "enum": ["add", "subtract", "multiply", "divide"]},
                    "a": {"type": "number"},
                    "b": {"type": "number"}
                },
                "required": ["op", "a", "b"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_price",
            "description": "Get the current stock or cryptocurrency price for a given symbol (e.g., AAPL, BTC, Nvidia).",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "The ticker symbol or company name."}
                },
                "required": ["symbol"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_news",
            "description": "Search for the latest news about a topic.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "The search query."}
                },
                "required": ["query"]
            }
        }
    }
]

tool_map = {
    "arithmetic_tool": arithmetic_tool,
    "get_price": get_price,
    "get_news": get_news
}
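
Each schema's `required` list can double as a cheap pre-flight check before a parsed call is dispatched through `tool_map`. The sketch below is self-contained: `demo_tools`, `demo_tool_map`, and `dispatch` are hypothetical stand-ins rather than the notebook's objects, and the check covers only missing or unexpected argument names, not argument types.

```python
def add(a, b):
    return a + b

# Hypothetical one-tool registry in the same shape as the schemas above.
demo_tools = [{
    "type": "function",
    "function": {
        "name": "add",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}]
demo_tool_map = {"add": add}

def dispatch(call, tools, tool_map):
    """Validate argument names against the schema, then call the tool."""
    name = call.get("name")
    args = call.get("arguments") or {}
    schema = next((t["function"] for t in tools if t["function"]["name"] == name), None)
    if schema is None or name not in tool_map:
        return f"Error: Tool {name} not found."
    if not isinstance(args, dict):
        return f"Error: arguments for {name} must be an object."
    params = schema["parameters"]
    missing = [k for k in params.get("required", []) if k not in args]
    unexpected = [k for k in args if k not in params.get("properties", {})]
    if missing or unexpected:
        return f"Error: missing={missing}, unexpected={unexpected}"
    return tool_map[name](**args)

print(dispatch({"name": "add", "arguments": {"a": 2, "b": 3}}, demo_tools, demo_tool_map))
print(dispatch({"name": "add", "arguments": {"a": 2}}, demo_tools, demo_tool_map))
```

Returning the error as a string, rather than raising, mirrors how the tools above report failures, so the model can read the message and retry.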

## 4. Agent Logic

In [4]:
import json
import re

class QwenAgent:
    def __init__(self, tools, tool_map):
        self.tools = tools
        self.tool_map = tool_map

    def parse_tool_calls(self, text):
        """
        Parse tool calls from the model's raw output text.
        """
        calls = []
        text = str(text)
        
        # 1. Try <tool_call> XML-like tags
        tool_call_pattern = r"<tool_call>(.*?)</tool_call>"
        matches = re.findall(tool_call_pattern, text, re.DOTALL)
        
        for m in matches:
            try:
                call_data = json.loads(m.strip())
                calls.append(call_data)
            except Exception as e:
                print(f"Warning: Failed to parse tool call JSON: {e} | Content: {m}")

        if calls:
            return calls

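        # 2. Fallback: scan for bare JSON objects with "name" and "arguments".
        # raw_decode handles nested braces, which a non-greedy regex cannot.
        decoder = json.JSONDecoder()
        pos = text.find("{")
        while pos != -1:
            try:
                data, end = decoder.raw_decode(text, pos)
            except json.JSONDecodeError:
                pos = text.find("{", pos + 1)
                continue
            if isinstance(data, dict) and "name" in data and "arguments" in data:
                calls.append(data)
            pos = text.find("{", end)

        return calls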

    def run(self, user_query, history=None, max_steps=5):
        """
        Run the agent with conversation history.
        """
        messages = []
        # System prompt
        messages.append({"role": "system", "content": "You are a helpful assistant. You can use tools to answer questions. If you need to use a tool, output the function call inside <tool_call> tags."})

        # Add history (Gradio type='messages' provides list of dicts)
        if history:
            messages.extend(history)

        # Add current query
        messages.append({"role": "user", "content": user_query})

        for step in range(max_steps):
            print(f"--- Step {step+1} ---")
            
            response_text = generate_response(messages, tools=self.tools)
            print(f"Agent Raw Output: {response_text}")

            tool_calls = self.parse_tool_calls(response_text)
            
            if not tool_calls:
                # Final Answer
                final_answer = re.sub(r"<tool_call>.*?</tool_call>", "", response_text, flags=re.DOTALL).strip()
                final_answer = final_answer.replace("<|im_end|>", "")
                return final_answer

            # Execute tools
            messages.append({"role": "assistant", "content": response_text})
            
            for call in tool_calls:
                func_name = call.get("name")
                args = call.get("arguments")
                print(f"Calling Tool: {func_name} with {args}")
                
                if func_name in self.tool_map:
                    try:
                        result = self.tool_map[func_name](**args)
                    except Exception as e:
                        result = f"Error executing {func_name}: {e}"
                else:
                    result = f"Error: Tool {func_name} not found."
                
                print(f"Tool Result: {result}")
                
                messages.append({
                    "role": "tool",
                    "name": func_name,
                    "content": str(result)
                })
        
        return "Maximum steps reached."
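
The control flow of `run` can be exercised without loading the model by substituting a scripted generator. In this sketch, `extract_calls`, `run_loop`, and `scripted_generate` are hypothetical names, the replies are hard-coded rather than produced by Qwen, and only the `<tool_call>` tag path is handled.

```python
import json
import re

def extract_calls(text):
    """Return the JSON payloads found inside <tool_call> tags."""
    pattern = r"<tool_call>(.*?)</tool_call>"
    return [json.loads(m) for m in re.findall(pattern, text, re.DOTALL)]

def run_loop(query, generate, tool_map, max_steps=5):
    messages = [{"role": "user", "content": query}]
    for _ in range(max_steps):
        reply = generate(messages)
        calls = extract_calls(reply)
        if not calls:
            return reply, messages  # no tool call means this is the final answer
        messages.append({"role": "assistant", "content": reply})
        for call in calls:
            result = tool_map[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "name": call["name"], "content": str(result)})
    return "Maximum steps reached.", messages

def scripted_generate(messages):
    # Stand-in for the model: request a tool first, then answer from its result.
    if messages[-1]["role"] == "tool":
        return f"The result is {messages[-1]['content']}."
    return '<tool_call>\n{"name": "multiply", "arguments": {"a": 6, "b": 7}}\n</tool_call>'

answer, trace = run_loop(
    "What is 6 times 7?", scripted_generate, {"multiply": lambda a, b: a * b}
)
print(answer)
print([m["role"] for m in trace])
```

The trace ends with the `tool` message because the final answer is returned rather than appended, which matches how `run` above behaves.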


## 5. Gradio UI

In [5]:
import gradio as gr

agent = QwenAgent(tools, tool_map)

def chat_interface(message, history):
    """
    Gradio Chat Interface callback.
    """
    response = agent.run(message, history=history)
    return response

with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("# 🤖 Multi-Step Agent (Qwen3-4B-Instruct)")
    
    chat = gr.ChatInterface(
        fn=chat_interface,
        type="messages", 
        examples=[
            "What is the price of Bitcoin?",
            "Multiply that price by 2.",
            "Search for the latest news about OpenAI."
        ],
    )

demo.launch(share=True, debug=True)


Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://bc7a321b8f0235c4cb.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


--- Step 1 ---
Agent Raw Output: <tool_call>
{"name": "get_price", "arguments": {"symbol": "BTC"}}
</tool_call><|im_end|>
Calling Tool: get_price with {'symbol': 'BTC'}
Tool Result: 87917.0625
--- Step 2 ---
Agent Raw Output: The current price of Bitcoin (BTC) is $87,917.06.<|im_end|>
--- Step 1 ---
Agent Raw Output: <tool_call>
{"name": "arithmetic_tool", "arguments": {"op": "multiply", "a": 87917.06, "b": 2}}
</tool_call><|im_end|>
Calling Tool: arithmetic_tool with {'op': 'multiply', 'a': 87917.06, 'b': 2}
Tool Result: 175834.12
--- Step 2 ---
Agent Raw Output: Multiplying the price of Bitcoin ($87,917.06) by 2 results in $175,834.12.<|im_end|>
Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://bc7a321b8f0235c4cb.gradio.live
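
The second run in the log resolves "that price" only because the earlier exchange is replayed into `messages`. Below is a minimal sketch of how the history could be normalized before that replay, assuming Gradio may attach extra keys (such as `metadata`) to each message dict. `clean_history` is a hypothetical helper, and the source does not establish whether those extra keys actually cause trouble with the chat template.

```python
def clean_history(history, allowed_roles=("user", "assistant")):
    """Keep only role/content pairs with string content from a messages-style history."""
    cleaned = []
    for message in history or []:
        role = message.get("role")
        content = message.get("content")
        if role in allowed_roles and isinstance(content, str):
            cleaned.append({"role": role, "content": content})
    return cleaned

# Example history shaped like the first exchange in the log above.
history = [
    {"role": "user", "content": "What is the price of Bitcoin?", "metadata": None},
    {"role": "assistant", "content": "The current price of Bitcoin (BTC) is $87,917.06.", "options": None},
]
messages = clean_history(history) + [{"role": "user", "content": "Multiply that price by 2."}]
print(messages)
```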


