# Flask and question generation

This version of the Flask UI builds on notebook 60 by adding question generation. Questions are generated with the updated prompt, which asks for questions that vary along a set of dichotomous dimensions (concrete vs. abstract, child-focused vs. book-focused, open-ended vs. close-ended).

Issues remaining:
- adding in transcript matching
- using pause detection in speech for more robust question timing
- faster question generation?

## API key

In [1]:
import openai
import os
from getpass import getpass
openai_api_key = getpass()
os.environ["OPENAI_API_KEY"] = openai_api_key
openai.api_key = openai_api_key

## Deal with pyaudio package

The dependencies below are needed in addition to those that ship with Anaconda. PyAudio can be difficult to install and will likely need PortAudio installed first.

In [6]:
# !pip install SpeechRecognition
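# !pip install "openai<1.0"     # app.py uses the pre-1.0 API (openai.ChatCompletion, openai.Audio)
# !pip install openai-whisper   # provides the `whisper` module; the PyPI package named `whisper` is unrelated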
# !pip install langchain

# !conda install pyaudio
# -or-
# !apt-get install portaudio19-dev
# !pip install --upgrade pyaudio
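
Before running the later cells, it can help to confirm that each dependency is importable. The following is a minimal sketch using only the standard library. `REQUIRED_MODULES` and `missing_modules` are names introduced here for illustration, and the module list is an assumption based on the imports used in this notebook.

```python
import importlib.util

# Import names (not pip package names) used by the cells in this notebook.
# This list is an assumption; adjust it to match your environment.
REQUIRED_MODULES = ["speech_recognition", "pyaudio", "openai", "flask", "whisper"]

def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

If `pyaudio` shows up as missing, installing PortAudio first and then reinstalling PyAudio is the usual next step.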

In [2]:
import speech_recognition as sr # https://pypi.org/project/SpeechRecognition/
import queue
import time
import threading
import sys
import openai
import pyaudio
import whisper

## UI design

In [1]:
!mkdir -p templates

### Updated design - better-looking UI?

In [2]:
%%writefile ./templates/index.html

<!DOCTYPE html>
<html>
<head>
    <title>Transcriber</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            background-color: #f4f4f4;
            color: #333;
        }

        h1 {
            text-align: center;
            padding: 20px;
            color: #4a4a4a;
        }

        p {
            padding: 20px;
            font-size: 16px;
        }

        #toggle-button {
            display: block;
            width: 200px;
            height: 50px;
            margin: 20px auto;
            background-color: #3498db;
            color: #fff;
            border: none;
            border-radius: 5px;
            font-size: 18px;
            transition: background 0.2s ease;
        }

        #toggle-button:hover {
            background-color: #2980b9;
            cursor: pointer;
        }

        /* The id is repeated so these rules outrank the #toggle-button rules above */
        #toggle-button.recording {
            background-color: #e74c3c;
        }

        #toggle-button.recording:hover {
            background-color: #c0392b;
        }
    </style>
    <script type="text/javascript">
        var source = new EventSource("/updates");
        source.onmessage = function(event) {
            document.getElementById("updated-text").innerHTML = event.data;
        };
        var gptSource = new EventSource("/gpt_updates");
        gptSource.onmessage = function(event) {
            document.getElementById("gpt-text").innerHTML = event.data;
        };
        function toggleVariable() {
            var xhr = new XMLHttpRequest();
            xhr.open('POST', '/toggle', true);
            xhr.onreadystatechange = function() {
                if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200) {
                    console.log('Toggle success');
                    // Change button color and text
                    var button = document.getElementById("toggle-button");
                    button.classList.toggle("recording");
                    if (button.innerHTML === "Start Recording") {
                        button.innerHTML = "Stop Recording";
                    } else {
                        button.innerHTML = "Start Recording";
                    }
                }
            };
            xhr.send();
        }
    </script>
</head>
<body>
    <h1>Transcriber</h1>
    <button id="toggle-button" onclick="toggleVariable()">Start Recording</button>
    <h1>Transcription</h1>
    <p id="updated-text"></p>
    <h1>Generated Questions</h1>
    <p id="gpt-text"></p>
</body>
</html>

Writing ./templates/index.html
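
The two `EventSource` handlers in the page expect Server-Sent Events: each message is one or more `data:` lines followed by a blank line. A payload that contains a newline has to be split across several `data:` lines, otherwise the message ends early. The following is a minimal sketch of a formatter for this; `format_sse` is a hypothetical helper introduced here and is not part of `app.py`.

```python
def format_sse(payload: str) -> str:
    """Format `payload` as a single Server-Sent Events message.

    Every line of the payload gets its own `data:` field and a blank line
    terminates the message, so embedded newlines cannot cut the event short.
    """
    # Normalise all newline styles before splitting into lines
    lines = payload.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "".join(f"data: {line}\n" for line in lines) + "\n"

print(repr(format_sse("Once upon a time<br>")))
```

On the browser side, `event.data` contains the `data:` lines joined back together with newline characters, so the handlers in `index.html` would not need to change.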


### Create App

In [10]:
%%writefile app.py

####################
# Module-level state shared between the worker threads and the Flask routes
gpt_output = ""       # latest generated questions, as HTML
sentence_counter = 0  # currently unused

import speech_recognition as sr # https://pypi.org/project/SpeechRecognition/
import queue
from flask import Flask, render_template, Response
import time
import threading
import sys
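import os
import openai
# Read the key from the OPENAI_API_KEY environment variable when it is set;
# otherwise fall back to a placeholder to edit by hand
openai.api_key = os.environ.get("OPENAI_API_KEY", "add your key here")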

####################
prompt_string = """
Act as an expert in dialogic reading. I will give you a snippet from a children’s story, and you will give me 5 dialogic 
reading questions for a 4-year-old child.

When generating these questions, I want you to consider a few things. 

Firstly, questions can be either “concrete” or “abstract”. Concrete questions ask about perceptually obvious/physical 
aspects and abstract questions ask about non-obvious/conceptual/cognitive/emotional aspects 
(with the latter typically requiring deeper reasoning/causal inference). 
You can also think of concrete questions as focusing on explicit information, where abstract questions focus 
on implicit information.

Secondly, questions can be child-focused or book-focused. Child-focused questions focus on the child's experiences,
feelings, and connections to the material. The purpose of these questions is to help the child connect the content 
of the book with their own life, emotions, or previous knowledge. These questions are particularly useful for helping 
children build empathy and understanding, as well as developing personal connections with literature. 
Book-focused questions deal directly with the content, themes, or structures within the book itself.
These questions may be about the plot, characters, settings, or author's intent, among other things. 
These questions help children deepen their understanding of the book and develop critical reading skills. 
They can foster comprehension, recall, analysis, and interpretation.

Finally, questions may be open-ended or close-ended. Close-ended questions will typically require one word or phrase
answers, and will have a correct answer. Open-ended questions can require as long of an answer as needed and might not
have a “correct” answer.

Bearing this in mind, please generate questions that fall into these categories. Questions can and should be any
combination of the above dimensions. Please generate 5 questions which range across all dimensions and include variety
in the above dimensions. Do not give me anything other than the list of questions, numbered 1-5. Do not give potential answers.
Do not state which of the above dimensions correspond to each question.

Please also keep in mind that the child that the questions are for is 4 years old, and use an appropriate vocabulary 
and level of difficulty for a child this age.

The story snippet to generate questions for is: 
"""

'''
Set up queues and microphone for recording
'''

# Queues
# Audio queue stores audio data from mic/recorder
# Result queue stores transcribed output
audio_queue  = queue.Queue()
result_queue = queue.Queue()
transcript_queue = queue.Queue()

# variables to control app
global text_all
text_all = "" # output result string
break_threads = False # quit out

# SETUP MICROPHONE RECORDING
# Reference: https://github.com/Uberi/speech_recognition/blob/master/examples/background_listening.py
# General Reference: https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst

# Init mic and recorder
# Recorder will use the mic and callback function
# to constantly listen for audio and add it to the audio_queue
print('Detected Microphones:')
print(sr.Microphone.list_microphone_names()) # list microphones
mic = sr.Microphone()
recorder = sr.Recognizer()
sample_rate = mic.SAMPLE_RATE # use default sample rate for mic, whisper API can handle it

# Mic/Recorder Settings
pause_threshold = .5
energy = 300
dynamic_energy = False

# Represents the energy level threshold for sounds. Values below this threshold are considered silence, and values above this threshold are considered speech
recorder.energy_threshold = energy

# Represents the minimum length of silence (in seconds) that will register as the end of a phrase
# Smaller values result in the recognition completing more quickly, but might result in slower speakers being cut off
recorder.pause_threshold  = pause_threshold

# Automatically increase/decrease energy to account for ambient noise
recorder.dynamic_energy_threshold = dynamic_energy

# Adjusts the energy threshold dynamically
with mic as source:
    recorder.adjust_for_ambient_noise(source)

# Recording flag toggled by the UI button; defined before the listener starts
# so the callback never runs before the name exists
toggle_variable = False

# 1ST THREAD - This is called from the background thread
# takes data from mic and adds directly to audio queue
def record_callback(_, audio: sr.AudioData):
    global toggle_variable
    if toggle_variable:
        # only add data to audio queue if button has been pressed
        data = audio.get_raw_data()
        audio_queue.put_nowait(data)

# Start listening in another thread
# Spawns a thread to repeatedly record phrases from mic
# phrase time limit - maximum length of recorded phrases (seconds)
recorder.listen_in_background(mic, record_callback, phrase_time_limit=10)
print("Microphone ready!")

'''
Define some necessary functions
'''

# Transcribe audio with WHISPER
def transcribe_audio(file_path):
    with open(file_path, "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file, language='en', 
                                             prompt="Children's story" + text_all[-200:])
    return transcript["text"]

# Get audio data stored in audio_queue
# Pulls data from the queue that was put there with record_callback()
# Will pull data until queue is empty or (elapsed-time > min_time)
def get_all_audio(min_time=-1):
    audio = bytes()
    got_audio = False
    time_start = time.time()
    while not got_audio or time.time() - time_start < min_time: # min time unused right now
        # drain whatever is currently in the audio queue
        while not audio_queue.empty():
            audio += audio_queue.get() # pull data from audio queue
            got_audio = True
        time.sleep(0.05) # yield the CPU instead of busy-waiting on the queue

    # raw bytes, sample rate, sample width in bytes (2 = 16-bit audio)
    data = sr.AudioData(audio,sample_rate,2)
    return data

# Get data from audio queue, save it as .wav, and transcribe .wav
# Adds transcribed .wav to result queue
def transcribe_data_from_queue():
    audio_data = get_all_audio() # get audio data from queue

    # Save audio data as .wav
    with open("latest.wav", "wb") as f:
        f.write(audio_data.get_wav_data())

    # Transcribe the saved audio file
    transcript_text = transcribe_audio('./latest.wav')

    # Add transcription to queue
    result_queue.put_nowait(transcript_text)


# Loop to run transcribe_data_from_queue() in its own thread
# Continuously grabs audio, transcribes it, and adds output to result queue
# break_threads is only read here, so it needs no global declaration
def transcribe_loop():
    while True:
        if break_threads:
            break
        else:
            transcribe_data_from_queue()
    sys.exit()

# Loop to pull results and print them
# Continuously grabs transcribed text from result_queue and prints it
def print_result_loop():
    # Both names are assigned below, so they must be declared global;
    # otherwise the assignments would create function-local variables
    global text_all, break_threads
    while True:
        result = result_queue.get() # get data from result queue
        transcript_queue.put_nowait(result)
        print(result)
        text_all += result + '<br>'    # append it to output result string

        # If output result string too long, reset it
        #if len(text_all) > 2000:
        #    text_all = ""

        # Quit if 'stop' is said
        # need better way to quit threads?
        if 'stop' in result.lower():
            #text_all += '. breaking...'
            break_threads = True
            break
    sys.exit()
    
####################
# Send the prompt plus transcript to the chat model (default: gpt-3.5-turbo) and return the reply text
def chat_with_chatgpt(transcript, model="gpt-3.5-turbo"):
    response = openai.ChatCompletion.create(
      model=model,
      messages=[{"role": "user", "content": prompt_string+transcript}]  
    )
    message = response["choices"][0]["message"]["content"]
    return message
    
def generate_loop():
    global gpt_output
    results = []
    while True:
        result = transcript_queue.get()
        results.append(result)

        # Generate questions once three transcribed chunks have accumulated
        if len(results) >= 3:
            # Send all accumulated chunks as the story snippet
            generated_qs = chat_with_chatgpt(' '.join(results))
            results = []

            gpt_output = generated_qs.replace("\n", "<br>") + "<br>"

'''
Run the flask app
'''

app = Flask(__name__)
app.debug = True

# Daemon threads do not keep the process alive, so they end when the main thread ends
threading.Thread(target=print_result_loop, daemon=True).start() # print output thread
threading.Thread(target=transcribe_loop, daemon=True).start()   # transcribe thread
threading.Thread(target=generate_loop, daemon=True).start()     # question generation thread

# Global variable
global toggle_variable
toggle_variable = False

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/toggle', methods=['POST'])
def toggle():
    global toggle_variable
    toggle_variable = not toggle_variable
    return 'Success'

@app.route('/updates')
def updates():
    def generate_updates():
        while True:
            # Generate the updated text here
            global text_all
            global toggle_variable
            updated_text = text_all

            # Yield the SSE-formatted response
            yield f"data: {updated_text}\n\n"
            
            # Wait before sending the next update so the loop does not spin
            time.sleep(0.2)

    return Response(generate_updates(), mimetype='text/event-stream')

####################
@app.route('/gpt_updates')
def gpt_updates():
    def generate_gpt_updates():
        while True:
            # Read the latest generated questions
            global gpt_output
            updated_gpt_output = gpt_output

            # Yield the SSE-formatted response
            yield f"data: {updated_gpt_output}\n\n"

            # Wait before sending the next update so the loop does not spin
            time.sleep(0.2)

    return Response(generate_gpt_updates(), mimetype='text/event-stream')


if __name__ == '__main__':
    app.run(port=5050, use_reloader=False)

# if __name__ == '__main__':
#     app.run(use_reloader=False)

Overwriting app.py
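
`generate_loop` in `app.py` waits until three transcribed chunks have arrived before it asks the model for questions. The batching step can be sketched in isolation as below. This assumes the whole batch should be sent as the story snippet, and `batch_chunks` is a hypothetical helper introduced here, not a function from `app.py`.

```python
def batch_chunks(chunks, batch_size=3):
    """Yield space-joined batches of `batch_size` consecutive chunks.

    A trailing partial batch is held back, mirroring a loop that only
    triggers generation once enough chunks have accumulated.
    """
    batch = []
    for chunk in chunks:
        batch.append(chunk)
        if len(batch) >= batch_size:
            yield " ".join(batch)
            batch = []

story = ["Once upon a time", "there was a small bear", "who loved honey.", "One day"]
for snippet in batch_chunks(story):
    print(snippet)
```

Keeping this step as a pure function makes it possible to check the batching behaviour without a microphone or an API key.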


In [12]:
!python app.py

Detected Microphones:
['MacBook Pro Microphone', 'MacBook Pro Speakers', 'ZoomAudioDevice']
Microphone ready!
 * Serving Flask app 'app'
 * Debug mode: on
 * Running on http://127.0.0.1:5050
Press CTRL+C to quit
^C
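
`app.py` signals its worker loops to quit through the module-level `break_threads` flag, and its comments ask whether there is a better way to stop the threads. One common option is a `threading.Event` combined with a queue timeout, so that a blocked worker wakes up regularly to check for the stop signal. The following is a minimal sketch under that assumption; `stop_event`, `work_queue`, and `worker_loop` are illustrative names, not part of `app.py`.

```python
import queue
import threading

stop_event = threading.Event()   # hypothetical replacement for break_threads
work_queue = queue.Queue()       # stands in for audio_queue / result_queue
processed = []

def worker_loop():
    """Process queued items until stop_event is set."""
    while not stop_event.is_set():
        try:
            # The timeout lets the loop re-check stop_event instead of blocking forever
            item = work_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        processed.append(item.upper())  # placeholder for the real work
        work_queue.task_done()

worker = threading.Thread(target=worker_loop, daemon=True)
worker.start()

work_queue.put("once upon a time")
work_queue.join()        # wait until the queued item has been processed
stop_event.set()         # ask the loop to finish
worker.join(timeout=2)   # returns once the loop has exited
print(processed, worker.is_alive())
```

Unlike a plain boolean, the event needs no `global` declaration in the functions that set it, because it is mutated through a method call rather than reassigned.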
