# Additional End of Week Exercise - Week 2

Now use everything you've learned from Week 2 to build a full prototype of the technical question answerer you built in the Week 1 exercise.

This should include a Gradio UI, streaming, use of the system prompt to add expertise, and the ability to switch between models. Bonus points if you can demonstrate use of a tool!
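
As a starting point for the streaming and model-switching requirements, here is a minimal sketch. It assumes an OpenAI-compatible client whose `chat.completions.create(..., stream=True)` yields chunks exposing `choices[0].delta.content`. The `stream_reply` helper and the `FakeClient` stand-in are hypothetical names used only for illustration; the fake lets the generator run without an API key.

```python
from types import SimpleNamespace


def stream_reply(client, model, system_prompt, history, message):
    """Yield the reply accumulated so far, once per streamed chunk."""
    messages = [{"role": "system", "content": system_prompt}, *history,
                {"role": "user", "content": message}]
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    reply = ""
    for chunk in stream:
        if not chunk.choices:  # some providers send chunks with no choices
            continue
        reply += chunk.choices[0].delta.content or ""
        yield reply


class FakeClient:
    """Hypothetical offline stand-in that mimics the streaming response shape."""

    def __init__(self, pieces):
        self._pieces = pieces
        self.last_model = None
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, model, messages, stream):
        self.last_model = model  # record which model the caller selected
        for piece in self._pieces:
            delta = SimpleNamespace(content=piece)
            yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake = FakeClient(["A closure ", "captures ", "variables."])
partials = list(stream_reply(fake, "openai/gpt-4o-mini", "You are a tutor.", [], "What is a closure?"))
print(partials[-1])
```

Gradio accepts generator functions as event handlers, so a generator shaped like this can update the chat as text arrives. Switching models then amounts to passing a different `model` string, for example from a dropdown.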

If you feel bold, see if you can add audio input so you can talk to it, and have it respond with audio. ChatGPT or Claude can help you, or email me if you have questions.

I will publish a full solution here soon - unless someone beats me to it...

There are so many commercial applications for this, from a language tutor, to a company onboarding solution, to a companion AI for a course (like this one!). I can't wait to see your results.

## NOTE

**Run the following in your system terminal (you will be prompted for your password):**

`sudo apt-get update && sudo apt-get install -y libportaudio2 portaudio19-dev`

**For chat:** Set `OPENROUTER_API_KEY` in `.env`. **For TTS:** Set `GROQ_API_KEY` in `.env` (Groq TTS provides audio for the first explanation level only).
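
To confirm the keys are picked up before launching the UI, a quick check along these lines may help. This is only a sketch: `enabled_features` is a hypothetical helper, and it assumes the `.env` values have already been loaded into the environment (for example with `load_dotenv`).

```python
import os


def enabled_features(env=None):
    """Report which features can run, based on which API keys are present."""
    env = os.environ if env is None else env
    return {
        "chat": bool(env.get("OPENROUTER_API_KEY")),
        "tts": bool(env.get("GROQ_API_KEY")),
    }


# Passing a plain dict makes the check easy to try without touching the real environment.
print(enabled_features({"OPENROUTER_API_KEY": "placeholder-key"}))
```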

In [None]:
# Install STT dependencies

! uv pip install moonshine-voice

In [None]:
import os
import json
import sqlite3
import tempfile
from dotenv import load_dotenv
from openai import OpenAI
from groq import Groq
import gradio as gr
from moonshine_voice import (
    Transcriber,
    TranscriptEventListener,
    get_model_for_language,
    load_wav_file,
    ModelArch,
)

In [None]:
load_dotenv(override=True)

MODEL = "openai/gpt-4o-mini"
openrouter_key = os.getenv("OPENROUTER_API_KEY")
groq_key = os.getenv("GROQ_API_KEY")

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key=openrouter_key) if openrouter_key else None
groq_client = Groq(api_key=groq_key) if groq_key else None

DB = "technical_qa.db"

In [None]:
# System prompt and follow-up (3 levels of explanation)

SYSTEM_PROMPT = """You are an expert technical educator. Keep every answer to ONE concise paragraph (no bullet points, no headings, no code blocks).
Answer directly without preamble. Explain in plain text only; if code is essential, describe it verbally.
When asked to go deeper, name the single hardest concept from your previous answer and explain only that concept in one new paragraph. Reference but do not repeat earlier explanations.
Use the lookup_technical_term tool when a precise definition would help."""

FOLLOWUP = "What is the single hardest concept in your last answer? Explain it in one concise paragraph. You may reference what you said before but do not repeat it."

In [None]:
# Tool: technical term lookup (demonstrates tool use)

TECH_TERMS = {
    "decorator": "A decorator is a function that modifies another function. In Python, @decorator wraps a function to add behavior.",
    "closure": "A closure is a function that captures variables from its enclosing scope.",
    "generator": "A generator yields values one at a time using yield, supporting lazy evaluation.",
    "recursion": "Recursion is when a function calls itself; it requires a base case to terminate.",
    "api": "An API defines how software components communicate. REST APIs use HTTP methods on URLs.",
}

def lookup_technical_term(term: str) -> str:
    key = term.strip().lower()
    return TECH_TERMS.get(key, f"No definition for '{term}'.")

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_technical_term",
        "description": "Look up a technical term (decorator, closure, generator, recursion, api).",
        "parameters": {
            "type": "object",
            "properties": {"term": {"type": "string", "description": "Term to look up"}},
            "required": ["term"],
            "additionalProperties": False
        }
    }
}]
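
If more tools are added later, dispatching by name through a registry keeps the handling loop short and gives unknown tool names a defined result. The following is a minimal sketch under that assumption; `TOOL_REGISTRY`, `run_tool`, and `lookup_stub` are hypothetical names, and the error strings are placeholders rather than any API's required format.

```python
import json


def lookup_stub(term: str) -> str:
    """Hypothetical stand-in for a real lookup tool."""
    return f"Definition of {term.strip().lower()}"


# Map tool names (as declared in the tool schema) to Python callables.
TOOL_REGISTRY = {"lookup_technical_term": lookup_stub}


def run_tool(name: str, arguments_json: str) -> str:
    """Run a named tool with JSON-encoded arguments and always return a string."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Unknown tool: {name}"
    try:
        args = json.loads(arguments_json or "{}")
        return func(**args)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or unexpected argument names end up here.
        return f"Bad arguments for {name}: {exc}"


print(run_tool("lookup_technical_term", '{"term": " Closure "}'))
print(run_tool("delete_everything", "{}"))
```

Returning a string in every case means each tool call can be answered with a tool message, even when the requested tool does not exist.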

In [None]:
# SQLite conversation storage and TTS

def setup_database():
    with sqlite3.connect(DB) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS conversations (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                model TEXT,
                user_msg TEXT,
                assistant_msg TEXT,
                timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        """)
        conn.commit()
setup_database()

def save_conversation(model, user_msg, assistant_msg):
    with sqlite3.connect(DB) as conn:
        conn.execute(
            "INSERT INTO conversations (model, user_msg, assistant_msg) VALUES (?, ?, ?)",
            (model, user_msg, assistant_msg)
        )
        conn.commit()

def talker(text):
    """Convert text to speech via Groq. Returns WAV bytes or None."""
    if not groq_client or not text:
        return None
    try:
        r = groq_client.audio.speech.create(
            model="canopylabs/orpheus-v1-english",
            voice="troy",
            input=text[:4000],
            response_format="wav"
        )
        return r.read()
    except Exception as e:
        # Log every failure, not only rate limits, so TTS errors never pass silently.
        kind = "rate limit" if "429" in str(e) or "rate" in str(e).lower() else "error"
        print(f"TTS {kind}:", str(e)[:300])
        return None

In [None]:
# 3-level iterative explanation with tool use

def _get_reply_with_tools(messages):
    """Call the model, resolving tool calls for up to 3 rounds. Returns (full_reply, messages, tools_used)."""
    full_reply = ""
    tools_used = []
    for _ in range(3):
        resp = client.chat.completions.create(model=MODEL, messages=messages, tools=tools, stream=False)
        msg = resp.choices[0].message
        full_reply = msg.content or ""
        if not getattr(msg, "tool_calls", None):
            # No tool requested: this is the final answer for this round.
            messages.append({"role": "assistant", "content": full_reply})
            return full_reply, messages, tools_used
        messages.append({"role": "assistant", "content": full_reply, "tool_calls": msg.tool_calls})
        for tc in msg.tool_calls:
            # Every tool call gets a matching tool message, even for an unknown tool name.
            if getattr(tc.function, "name", None) == "lookup_technical_term":
                term = json.loads(tc.function.arguments or "{}").get("term", "")
                result = lookup_technical_term(term)
                tools_used.append(f"lookup_technical_term(term={term!r})")
            else:
                result = f"Unknown tool: {getattr(tc.function, 'name', None)}"
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
    return full_reply, messages, tools_used

def chat_stream(history, message):
    """Three levels of explanation (What #1, #2, #3). Yields (history, audio) after each level (full output, no chunking)."""
    if not client:
        yield history + [{"role": "assistant", "content": "OpenRouter not configured. Set OPENROUTER_API_KEY in .env"}], None
        return
    api_messages = [{"role": "system", "content": SYSTEM_PROMPT}, {"role": "user", "content": message}]
    current_history = list(history)
    last_audio = None
    all_tools_used = []
    for level in range(3):
        user_msg = message if level == 0 else FOLLOWUP
        if level > 0:
            api_messages.append({"role": "user", "content": FOLLOWUP})
        full_reply, api_messages, tools_used = _get_reply_with_tools(api_messages)
        all_tools_used.extend(tools_used)
        save_conversation(MODEL, user_msg, full_reply)
        content = f"**What #{level + 1}**\n\n{full_reply}"
        level_audio = talker(full_reply) if level == 0 else None
        if level == 0 and level_audio is not None:
            with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
                f.write(level_audio)
                last_audio = f.name
        if level == 0 and level_audio is None and groq_client:
            content += "\n\n_(Audio unavailable: TTS rate limit or error. Try again later or check Groq billing.)_"
        current_history = current_history + [{"role": "assistant", "content": content}]
        if level < 2:
            # Show the implicit follow-up prompt in the chat before the next level runs.
            current_history = current_history + [{"role": "user", "content": "Go deeper (hardest concept from above)"}]
        yield current_history, last_audio
    if all_tools_used:
        print("Tools used this conversation:", all_tools_used)
    with sqlite3.connect(DB) as conn:
        rows = conn.execute("SELECT id, model, user_msg, assistant_msg, timestamp FROM conversations ORDER BY id").fetchall()
    print("Conversation history from DB:")
    for row in rows:
        print(f"  --- id={row[0]} model={row[1]} {row[4]}")
        print(f"    user: {(row[2][:200] + '...') if len(row[2]) > 200 else row[2]}")
        print(f"    assistant: {(row[3][:200] + '...') if len(row[3]) > 200 else row[3]}")
        print()

In [None]:
# STT: Moonshine file-based transcription (first voice use may download the model)

_cached_transcriber = None

def _get_transcriber(language: str = "en"):
    global _cached_transcriber
    if _cached_transcriber is None:
        model_path, model_arch = get_model_for_language(language, ModelArch.BASE)
        _cached_transcriber = Transcriber(model_path=model_path, model_arch=model_arch)
    return _cached_transcriber

class _CollectingListener(TranscriptEventListener):
    def __init__(self):
        self.lines = []
    def on_line_completed(self, event):
        if event.line.text.strip():
            self.lines.append(event.line.text.strip())

def _wav_chunk_generator(wav_path: str, chunk_duration: float = 0.1):
    audio_data, sample_rate = load_wav_file(wav_path)
    chunk_size = int(chunk_duration * sample_rate)
    for i in range(0, len(audio_data), chunk_size):
        yield (audio_data[i : i + chunk_size], sample_rate)

def transcribe_audio_file(audio_path: str | None, language: str = "en") -> str:
    """Transcribe a WAV file to text using Moonshine. Returns '' if path is invalid or transcription fails."""
    # Guard against a missing recording (Gradio passes None when nothing was recorded).
    if not audio_path or not os.path.isfile(audio_path):
        return ""
    try:
        transcriber = _get_transcriber(language)
        listener = _CollectingListener()
        stream = transcriber.create_stream(update_interval=0.5)
        stream.add_listener(listener)
        stream.start()
        for chunk, sample_rate in _wav_chunk_generator(audio_path):
            stream.add_audio(chunk, sample_rate)
        stream.stop()
        stream.close()
    except Exception as e:
        print("Transcription failed:", str(e)[:300])
        return ""
    return " ".join(listener.lines)

## Gradio UI

Technical Q&A with **3 levels of explanation** (What #1 → What #2 → What #3). The first level answers the question; each later level picks the hardest concept from the previous answer and explains it.
- **Full response** per level (no chunk streaming)
- **Tool use**: `lookup_technical_term` for decorator, closure, generator, recursion, api
- **Audio output**: TTS via Groq for the first level only (if `GROQ_API_KEY` is set)
- **SQLite**: Conversation history saved to `technical_qa.db`
- **Voice input**: Record with the microphone; Moonshine transcribes to text, then the same 3-level chat runs (first use may download the model).
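
The three-level flow described above can be sketched independently of any API. This is an illustrative outline rather than the notebook's implementation: `explain_in_levels` and the `ask` callable are hypothetical, with `ask` standing in for whatever function sends the message list to a model and returns its reply text.

```python
FOLLOWUP = "What is the single hardest concept in your last answer? Explain it in one concise paragraph."


def explain_in_levels(ask, question, levels=3):
    """Return one reply per level; each later level asks about the previous answer."""
    messages = [{"role": "user", "content": question}]
    replies = []
    for level in range(levels):
        if level > 0:
            # Later levels reuse the same follow-up prompt on the growing conversation.
            messages.append({"role": "user", "content": FOLLOWUP})
        reply = ask(messages)
        messages.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies


def fake_ask(messages):
    """Offline stand-in: reports how many user turns it has seen."""
    turns = sum(1 for m in messages if m["role"] == "user")
    return f"answer after {turns} user turn(s)"


for number, reply in enumerate(explain_in_levels(fake_ask, "What is recursion?"), start=1):
    print(f"What #{number}: {reply}")
```

Because the full message list is passed on every call, each level sees the earlier answers and can reference them without repeating them.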

In [None]:
# Gradio UI

def chat_from_history(history):
    """Get last user message from history and run chat_stream."""
    if not history or history[-1].get("role") != "user":
        yield history, None
        return
    message = history[-1].get("content", "").strip()
    for h, a in chat_stream(history, message):
        yield h, a

def voice_submit(audio_path, history):
    """Transcribe audio and append as user message, or show error if empty."""
    transcript = transcribe_audio_file(audio_path)
    if not transcript:
        return history + [{"role": "assistant", "content": "Could not transcribe audio."}]
    return history + [{"role": "user", "content": transcript}]

with gr.Blocks(title="Technical Q&A Assistant") as ui:
    with gr.Row():
        chatbot = gr.Chatbot(height=450, type="messages")
    with gr.Row():
        audio_output = gr.Audio(autoplay=True)
    with gr.Row():
        audio_input = gr.Audio(sources=["microphone"], type="filepath", label="Ask by voice (first use may download model)")
    with gr.Row():
        submit_voice_btn = gr.Button("Submit voice")

    submit_voice_btn.click(voice_submit, inputs=[audio_input, chatbot], outputs=[chatbot]).then(
        chat_from_history, inputs=[chatbot], outputs=[chatbot, audio_output]
    )

ui.launch(inbrowser=True)