# Additional End of week Exercise - week 2

Now use everything you've learned from Week 2 to build a full prototype for the technical question/answerer you built in Week 1 Exercise.

This should include a Gradio UI, streaming, use of the system prompt to add expertise, and the ability to switch between models. Bonus points if you can demonstrate use of a tool!

If you feel bold, see if you can add audio input so you can talk to it, and have it respond with audio. ChatGPT or Claude can help you, or email me if you have questions.

I will publish a full solution here soon - unless someone beats me to it...

There are so many commercial applications for this, from a language tutor, to a company onboarding solution, to a companion AI to a course (like this one!) I can't wait to see your results.

# My Tutor: Evolved

## Imports & Setup

In [18]:
import os
import json
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display

from openai import OpenAI
import ollama
import gradio as gr

# handling images
import base64
from io import BytesIO
from PIL import Image

# handling audio
from pydub import AudioSegment
from pydub.playback import play
from pathlib import Path

import sounddevice as sd 
import numpy as np 
import wave 
import threading 
import queue 


## Constants

In [2]:
MODEL_GPT = "gpt-4o-mini"

OLLAMA_API = "http://localhost:11434/api/chat"
LL_HEADERS = {"Content-Type": "application/json"}
MODEL_LLAMA = "llama3.2"

## Environment & Initialization

In [3]:
load_dotenv(override=True)

# OpenAI
openai_api_key = os.getenv('OPENAI_API_KEY')
if openai_api_key:
    print(f"OpenAI key exists and begins with {openai_api_key[:8]}")
else:
    print("OpenAI key not set")

openai = OpenAI()

OpenAI key exists and begins with sk-proj-


## Prompts

In [4]:
system_prompt = "You are an assistant that takes technical questions and responds with an explanation, like a tutor. You should gently encourage the student to ask a question."

In [5]:
def user_prompt_question(question):
    user_prompt = "You are a tutor assistant. Please provide an answer and explanation. Respond in Markdown. \nThe question is as follows: \n\n"
    user_prompt += question
    return user_prompt

In [6]:
example_question = """
Please explain what this code does and why:
yeild from {book.get("author") for book in books if book.get("author")}
"""

In [7]:
print(user_prompt_question(example_question))

You are a tutor assistant. Please provide an answer and explanation. Respond in Markdown. 
The question is as follows: 


Please explain what this code does and why:
yeild from {book.get("author") for book in books if book.get("author")}



## Chat Function w History

### OpenAI

In [8]:
def chat_gpt(message, history):

    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]

    stream = openai.chat.completions.create(model=MODEL_GPT, messages=messages, stream=True)

    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        yield response

In [9]:
gr.ChatInterface(fn=chat_gpt, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.




### Llama 3.2

In [10]:
def chat_llama(message, history):
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]

    stream = ollama.chat(model=MODEL_LLAMA, messages=messages, stream=True)

    response_text = ""
    for chunk in stream:
        response_text += chunk['message']['content']
        yield response_text


In [11]:
gr.ChatInterface(fn=chat_llama, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.




In [19]:
def talker_gpt_realtime(message):
    # Passes audio chunks from the network thread to the playback thread
    audio_queue = queue.Queue()

    # Reads small audio buffers and plays them with sounddevice
    def audio_player(q):
        # OpenAI TTS output usually 24kHz
        samplerate = 24000
        playing = False
        frames = []

        while True:
            chunk = q.get()
            if chunk is None:
                break

            buffer = BytesIO(chunk)


            try:
                with wave.open(buffer, 'rb') as wf:
                    frame_data = wf.readframes(wf.getnframes())
                    if frame_data:
                        frames.append(np.frombuffer(frame_data, dtype=np.int16))

            except wave.Error:
                # Might get partial/incomplete frame data = just skip
                continue

            if not playing and frames:
                audio_data = np.concatenate(frames)
                frames=[]
                sd.play(audio_data, samplerate=samplerate)
                playing = True
            elif playing and frames:
                # append new frames while playing
                audio_data = np.concatenate(frames)
                frames = []
                sd.play(audio_data, samplerate=samplerate)

        if frames:
            audio_data = np.concatenate(frames)
            sd.play(audio_data, samplerate=samplerate)
            sd.wait()

    player_thread = threading.Thread(target=audio_player, args=(audio_queue))
    player_thread.start()


    with openai.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice="onyx",  # can also try alloy
        input=message,
        response_format="wav",
    ) as response:
        # when chunk arrives, it's fed into queue
        for chunk in response.iter_bytes():
                if chunk:
                    audio_queue.put(chunk)

    # Signal player thread to stop, stream has finished
    audio_queue.put(None)
    player_thread.join()

In [20]:
def chat_gpt(message, history):
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    stream = openai.chat.completions.create(model=MODEL_GPT, messages=messages, stream=True)

    response = ""
    buffer = ""

    for chunk in stream:
        delta = chunk.choices[0].delta.content or ''
        response += delta
        buffer += delta
        
        # If buffer is big enough (end of sentence or > 30 characters), speak it
        if any(punct in buffer for punct in ['.', '?', '!']) or len(buffer) > 30:
            talker_gpt_realtime(buffer.strip())
            # clear buffer after speaking
            buffer = ""

        yield response

    # Speak any leftover text
    if buffer.strip():
        talker_gpt_realtime(buffer.strip())


In [21]:
gr.ChatInterface(fn=chat_gpt, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7863

To create a public link, set `share=True` in `launch()`.




Exception in thread Thread-282 (audio_player):
Traceback (most recent call last):
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 766, in run_closure
    _threading_Thread_run(self)
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
TypeError: __main__.talker_gpt_realtime.<locals>.audio_player() argument after * must be an iterable, not Queue
Exception in thread Thread-283 (audio_player):
Traceback (most recent call last):
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 766, in run_closure
    _threading_Thread_run(self)
  File "/home/ksg-dev/anaconda3/envs/llms/lib/pyt

In [20]:
# # More involved Gradio code as we're not using the preset chat interface
# # Passing in inbrowser=True in the last line will cause a Gradio window to pop up immediately

# with gr.Blocks() as ui:
#     with gr.Row():
#         chatbot = gr.Chatbot(height=500, type="messages")
#     with gr.Row():
#         entry = gr.Textbox(label="Chat with your Tutor:")
#     with gr.Row():
#         clear = gr.Button("Clear")

    
#     def do_entry(message, history):
#         history += [{"role": "user", "content": message}]
#         return "", history
    
#     entry.submit(do_entry, inputs=[entry, chatbot], outputs=[entry, chatbot]).then(
#         chat_gpt, inputs=chatbot, outputs=[chatbot]
#     )
#     clear.click(lambda: None, inputs=None, outputs=chatbot, queue=False)

# ui.launch()

* Running on local URL:  http://127.0.0.1:7874

To create a public link, set `share=True` in `launch()`.




Traceback (most recent call last):
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/gradio/queueing.py", line 625, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/gradio/blocks.py", line 2137, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/gradio/blocks.py", line 1675, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ksg-dev/anaconda3/envs/llms/lib/python3.11/site-packages/gradio/utils.py", line 735,