# Voice AI Assistant with Tool Calling + Text-to-Speech

## Overview

This notebook implements a tool-enabled conversational AI assistant that:

- Maintains conversation memory
- Uses automatic tool calling
- Integrates external APIs
- Converts text responses to speech
- Supports multi-turn conversations

---

## What This Notebook Contains

1. Tool Calling using LLM
2. Time Retrieval Tool
3. Weather API Integration
4. Calculator Tool (Newly Added)
5. Multi-turn Memory
6. Text-to-Speech Output
7. Continuous Conversation Loop

---

## Architecture Flow

User Input → LLM → Tool (if required) → Tool Execution → LLM → Text Response → Text-to-Speech → Audio Output

In [None]:
!pip install requests openai -q

### Observation

We install:
- `openai` → For LLM and TTS
- `requests` → For Weather API calls

In [None]:
from google.colab import userdata
from openai import OpenAI
import json
import requests
from datetime import datetime
from IPython.display import Audio
import os

OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")
OPENAI_BASE_URL = userdata.get("OPENAI_BASE_URL")
WEATHER_API_KEY = userdata.get("WEATHER_API_KEY")
client = OpenAI(
    api_key=OPENAI_API_KEY,
    base_url=OPENAI_BASE_URL
)

print("OpenAI Client Initialized")

### Observation

- OpenAI client connects to LLM & TTS model.
- requests library enables weather API calls.
- datetime module helps fetch system time.
- Audio is used to play speech output.
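
The setup cell reads its keys through `google.colab.userdata`, which is only importable inside Colab. A minimal sketch of a fallback for a local Jupyter session, assuming the same key names are exported as environment variables. `load_secret` is a hypothetical helper introduced here for illustration; it is not part of Colab or the OpenAI SDK.

```python
import os


def load_secret(name, default=None):
    """Return a secret from Colab's userdata when available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            value = userdata.get(name)
            if value:
                return value
        except Exception:
            # Assumption: Colab raises when the secret is missing or access is denied.
            pass

    # Fall back to an environment variable with the same name.
    return os.environ.get(name, default)
```

With this helper, `OPENAI_API_KEY = load_secret("OPENAI_API_KEY")` would behave the same in Colab and pick up the environment variable elsewhere.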

# 1. Initialize Memory


In [None]:
memory = [
    {
        "role": "system",
        "content": """
        You are a smart voice assistant.
        Use tools when required.
        Use:
        - get_time for time/date queries
        - get_weather for weather queries
        - calculate for math expressions
        """
    }
]
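
The `memory` list grows with every turn, and the whole list is sent to the model on each request. One way to bound it, sketched under the assumption that messages are plain dicts as above, is to keep the system prompt plus the most recent user turns. `trim_memory` and its `max_turns` default are hypothetical choices for illustration. Cutting only at a user message keeps each tool result together with the assistant message that requested it.

```python
def trim_memory(messages, max_turns=10):
    """Keep system messages plus everything from the last `max_turns` user turns."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # Positions in `rest` where a new user turn starts.
    user_positions = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(user_positions) <= max_turns:
        return system + rest

    # Start at a user message so tool calls and their results stay paired.
    start = user_positions[-max_turns]
    return system + rest[start:]
```

Calling `memory[:] = trim_memory(memory)` before each request would apply the cap in place.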

# 2. Tool 1: Get Time

In [None]:
def get_time():
    now = datetime.now()
    return f"Current time: {now.strftime('%I:%M:%S %p')}, Date: {now.strftime('%d %B %Y')}"

### Observation

- Uses system clock
- Formats time into readable string
- Returns a formatted string that is passed back to the LLM as the tool result
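
Because the tool reads the system clock, it reports the time of the machine running the notebook, which on a hosted runtime is often UTC rather than the user's local time. A minimal sketch of a variant that shifts the time by a fixed UTC offset, using only the standard library. `get_time_at_offset` and its parameters are hypothetical names, and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def get_time_at_offset(utc_offset_hours=0.0, now_utc=None):
    """Format the current time for a fixed UTC offset, e.g. 5.5 for UTC+05:30."""
    # `now_utc` can be injected for testing; otherwise read the clock in UTC.
    now_utc = now_utc or datetime.now(timezone.utc)
    local = now_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return f"Current time: {local.strftime('%I:%M:%S %p')}, Date: {local.strftime('%d %B %Y')}"
```

The output format matches `get_time`, so the function could be swapped into the tool mapping if the schema gained an offset parameter.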

# 3. Tool 2: Get Weather

In [None]:
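def get_weather(city):
    url = "https://api.weatherapi.com/v1/current.json"
    params = {
        "key": WEATHER_API_KEY,
        "q": city
    }

    try:
        # A timeout keeps the assistant from hanging on a slow network call
        response = requests.get(url, params=params, timeout=10)
        data = response.json()
    except (requests.RequestException, ValueError):
        return f"Could not reach the weather service for {city}."

    # Unknown cities or a bad API key return a payload without these fields
    if "location" not in data or "current" not in data:
        return f"Could not find weather data for {city}."

    location = data["location"]["name"]
    country = data["location"]["country"]
    temp = data["current"]["temp_c"]
    condition = data["current"]["condition"]["text"]

    return f"The current weather in {location}, {country} is {temp}°C with {condition}."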

### Observation

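- Makes an HTTP request to the WeatherAPI current-conditions endpoint
- Extracts location, temperature & condition from the JSON response
- Returns a one-sentence summary the LLM can relay to the user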
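
Separating the response parsing from the network call makes the extraction step testable without an API key. The sketch below assumes that a failed lookup returns a payload shaped like `{"error": {"code": ..., "message": ...}}`; that shape is an assumption to verify against the WeatherAPI documentation, and `describe_weather` is a hypothetical helper.

```python
def describe_weather(data):
    """Turn a decoded current.json payload into a sentence, or an error message."""
    if isinstance(data, dict) and isinstance(data.get("error"), dict):
        # Assumed error shape: {"error": {"code": ..., "message": ...}}
        message = data["error"].get("message", "unknown error")
        return f"Weather lookup failed: {message}"

    try:
        location = data["location"]["name"]
        country = data["location"]["country"]
        temp = data["current"]["temp_c"]
        condition = data["current"]["condition"]["text"]
    except (KeyError, TypeError):
        # Any missing field or wrong type lands here.
        return "Weather lookup failed: unexpected response format."

    return f"The current weather in {location}, {country} is {temp}°C with {condition}."
```

`get_weather` could then fetch the JSON and return `describe_weather(data)`.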

# 4. Tool 3: Calculator Tool

In [None]:
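def calculate(expression):
    # eval() on model-supplied text can run arbitrary code, so accept only
    # digits, arithmetic operators, parentheses, dots and spaces.
    allowed = set("0123456789+-*/%.() ")
    if not set(expression) <= allowed:
        return "Invalid mathematical expression."
    try:
        # Empty builtins: no names or functions are available to the expression
        result = eval(expression, {"__builtins__": {}}, {})
        return f"The result is {result}"
    except Exception:
        return "Invalid mathematical expression."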

### Observation

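- Evaluates mathematical expressions passed in as strings
- Allows queries like:
  - "What is 25 * 8?"
  - "Calculate 100 / 4 + 3"
- Relies on Python's `eval`, so the expression must be validated first; unrestricted `eval` on model-generated text can run arbitrary code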
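
An alternative that avoids `eval` entirely is to parse the expression with the standard `ast` module and evaluate only a fixed set of arithmetic nodes. This is a sketch of that technique, not the notebook's method. `safe_eval` is a hypothetical name, and the exponent cap of 100 is an arbitrary guard against very large powers.

```python
import ast
import operator

# Arithmetic operators the evaluator is willing to apply.
_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression):
    """Evaluate +, -, *, /, //, %, ** on numbers; raise ValueError for anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            left, right = walk(node.left), walk(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > 100:
                raise ValueError("exponent too large")
            return _BINARY[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attributes, strings and so on are all rejected.
        raise ValueError("unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("invalid syntax") from exc
    return walk(tree)
```

A `calculate` built on this would catch `ValueError` and `ZeroDivisionError` and return the same "Invalid mathematical expression." message.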

# 5. Tool Mapping

In [None]:
tool_functions = {
    "get_time": get_time,
    "get_weather": get_weather,
    "calculate": calculate
}
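
The mapping lets the agent look up a Python function by the name the model returns. A minimal sketch of a dispatcher built around such a mapping, assuming arguments arrive as a JSON string. `dispatch_tool` is a hypothetical helper; it turns every failure into a message string so the model can see what went wrong instead of the loop crashing.

```python
import json


def dispatch_tool(name, raw_arguments, registry):
    """Run one tool call and always return a string the model can read."""
    func = registry.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'."
    try:
        # An empty argument string is treated as "no arguments".
        arguments = json.loads(raw_arguments or "{}")
        return str(func(**arguments))
    except json.JSONDecodeError:
        return f"Error: arguments for '{name}' were not valid JSON."
    except TypeError as exc:
        # Missing, unexpected, or non-object arguments end up here.
        return f"Error: bad arguments for '{name}': {exc}"
    except Exception as exc:
        return f"Error: '{name}' failed: {exc}"
```

For example, `dispatch_tool("get_weather", '{"city": "Pune"}', tool_functions)` would call `get_weather(city="Pune")`.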

# 6. Tool Schema Definition

In [None]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_time",
            "description": "Get current time and date",
            "parameters": {"type": "object", "properties": {}}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform mathematical calculations",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string"}
                },
                "required": ["expression"]
            }
        }
    }
]

### Observation

Tool schema tells LLM:

- What tools exist
- What parameters they require
- When to call them

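Tool calling is automatic: with `tool_choice="auto"` set on the request, the model decides for each message whether to answer directly or call one of these tools.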
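
The schema and the function mapping are maintained by hand, so a renamed parameter in one place can silently break tool calls. A small consistency check, sketched with the standard `inspect` module, compares the parameter names declared in each schema with the function's signature. `check_tool_registry` is a hypothetical helper and assumes plain positional or keyword parameters.

```python
import inspect


def check_tool_registry(schemas, registry):
    """Return a list of mismatches between tool schemas and Python functions."""
    problems = []
    schema_names = set()
    for schema in schemas:
        spec = schema["function"]
        name = spec["name"]
        schema_names.add(name)
        func = registry.get(name)
        if func is None:
            problems.append(f"{name}: no function registered")
            continue
        declared = set(spec.get("parameters", {}).get("properties", {}))
        accepted = set(inspect.signature(func).parameters)
        if declared != accepted:
            problems.append(
                f"{name}: schema has {sorted(declared)}, function takes {sorted(accepted)}"
            )
    # Functions that the model can never call because no schema describes them.
    for name in registry:
        if name not in schema_names:
            problems.append(f"{name}: function has no schema")
    return problems
```

Running `check_tool_registry(tools, tool_functions)` after defining both should return an empty list.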

# 7. Agent Execution Logic

In [None]:
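def run_agent(user_text):
    memory.append({"role": "user", "content": user_text})

    response = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=memory,
        tools=tools,
        tool_choice="auto"
    )

    message = response.choices[0].message

    # No tool needed: store the plain reply and return it.
    # The "tool_calls" key is left out here because a null value can be
    # rejected when this history is sent back on the next turn.
    if not message.tool_calls:
        memory.append({"role": "assistant", "content": message.content})
        return message.content

    memory.append({
        "role": "assistant",
        "content": message.content,
        "tool_calls": message.tool_calls
    })

    for tool_call in message.tool_calls:
        tool_name = tool_call.function.name
        print("Tool Called:", tool_name)

        try:
            # Arguments arrive as a JSON string; treat an empty string as no arguments
            arguments = json.loads(tool_call.function.arguments or "{}")
            result = tool_functions[tool_name](**arguments)
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop
            result = f"Tool {tool_name} failed: {exc}"

        # Every tool call must be answered by a tool message with the same id
        memory.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": str(result)
        })

    second_response = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=memory
    )

    final_text = second_response.choices[0].message.content

    memory.append({
        "role": "assistant",
        "content": final_text
    })

    return final_text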

### Observation

This is the core orchestration engine:

1. Sends user message to LLM
2. Checks if tool is required
3. Executes tool
4. Sends tool result back to LLM
5. Generates final response
6. Stores memory
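
The six steps above can be exercised without an API key by replacing the model with a stub. The sketch below is an offline illustration, not the notebook's agent: `run_tool_loop` and `fake_model` are hypothetical, messages are plain dicts, and the loop allows several tool rounds up to an arbitrary `max_rounds` cap.

```python
import json


def run_tool_loop(call_model, tool_registry, messages, max_rounds=3):
    """Call the model, run any requested tools, and repeat until it answers in text.

    `call_model(messages)` must return a dict with "content" and optional "tool_calls".
    """
    for _ in range(max_rounds):
        reply = call_model(messages)
        calls = reply.get("tool_calls") or []

        # Store the assistant turn; include tool_calls only when there are some.
        entry = {"role": "assistant", "content": reply.get("content")}
        if calls:
            entry["tool_calls"] = calls
        messages.append(entry)

        if not calls:
            return reply.get("content")

        # Execute each requested tool and feed the result back to the model.
        for call in calls:
            args = json.loads(call["function"]["arguments"] or "{}")
            result = tool_registry[call["function"]["name"]](**args)
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": str(result)}
            )
    return "Stopped: too many tool rounds."


def fake_model(messages):
    """Hypothetical stand-in for the LLM: request one tool, then answer."""
    if messages[-1]["role"] == "tool":
        return {"content": f"Answer based on: {messages[-1]['content']}"}
    return {
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "echo", "arguments": json.dumps({"text": "hi"})},
        }],
    }
```

Passing a registry such as `{"echo": lambda text: text.upper()}` shows the full round trip: user message, assistant tool request, tool result, final assistant answer.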

# 8. Text-to-Speech

In [None]:
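def text_to_speech(text):
    # Request WAV explicitly so the bytes match the .wav file name
    # (the speech endpoint returns MP3 when no format is given)
    response = client.audio.speech.create(
        model="gpt-4o-mini-tts",
        voice="alloy",
        input=text,
        response_format="wav"
    )

    audio_file = "output.wav"

    with open(audio_file, "wb") as f:
        f.write(response.content)

    return Audio(audio_file)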

### Observation

- Converts text response into speech
- Saves audio file
- Plays it inside notebook
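
Speech endpoints usually cap the length of the input text, so a long reply may need to be split before synthesis. A minimal sketch that breaks text at sentence boundaries. `split_for_tts` is a hypothetical helper, and the 4096-character default is an assumption to check against the documentation of the TTS model in use.

```python
import re


def split_for_tts(text, max_chars=4096):
    """Split text into chunks of at most `max_chars`, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `text_to_speech` in turn and the resulting audio played in order.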

# 9. Main Loop

In [None]:
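while True:
    user_input = input("You: ").strip()

    if user_input.lower() == "quit":
        break

    # Skip empty lines rather than sending them to the model
    if not user_input:
        continue

    reply = run_agent(user_input)

    print("Assistant:", reply)

    # Speak the reply only when there is text to synthesize
    if reply:
        display(text_to_speech(reply))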
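
The loop above is tied to `input` and `print`, which makes it hard to test. A sketch of the same loop with the I/O functions passed in as parameters. `chat_loop` and its parameter names are hypothetical, and the reply function stands in for `run_agent`.

```python
def chat_loop(respond, read=input, write=print, exit_words=("quit", "exit")):
    """Read lines until an exit word; `respond` maps user text to a reply.

    Returns the number of turns that produced a reply.
    """
    turns = 0
    while True:
        try:
            user_input = read("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break
        if user_input.lower() in exit_words:
            break
        if not user_input:
            continue  # ignore empty lines
        try:
            reply = respond(user_input)
        except Exception as exc:
            # Keep the session alive if one turn fails.
            reply = f"Sorry, something went wrong: {exc}"
        write("Assistant:", reply)
        turns += 1
    return turns
```

In the notebook this would be started with `chat_loop(run_agent)`, with speech output added inside a custom `write` function.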

# Final Summary

This notebook demonstrates:

* API Handling  
* Tool Calling Architecture  
* Memory Management  
* Dynamic Function Execution  
* External API Integration  
* Text-to-Speech Conversion  

The same pattern (model call, tool dispatch, follow-up model call, speech output) is the basis of backend AI orchestration services, and this notebook is a compact working prototype of it.