<a href="https://colab.research.google.com/github/vanderbilt-data-science/AIDA/blob/main/30-whisper-basic-inferencing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project AIDA - Whisper exploration (preliminary)

## Problem Definition

The goal is to generate CROWD questions that parents can ask while reading storybooks to their children.

Expected input: (1) an MP3 file of the storybook; (2) real-time audio of the parent reading the storybook.

Expected output: (1) a real-time transcription of the audio; (2) CROWD questions generated in real time.

Note: Question generation should follow the natural pauses in the story. Parents can also manually choose to generate or skip questions.

Other difficulties: (1) front-end app development; (2) the speed of real-time question generation; (3) deciding where the natural pauses are.


## Libraries

This section installs and imports the libraries used in this notebook, including LangChain, openai, Gradio, and PyPDF2.

In [1]:
%%capture
# install libraries here
# -q flag for "quiet" install
!pip install -q langchain
!pip install -q openai
!pip install -q gradio
!pip install -q PyPDF2
!pip install -q pycryptodome

In [2]:
# import libraries here
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain import ConversationChain, LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import ChatPromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
import openai
import os
from getpass import getpass
import PyPDF2
import requests

## API Keys

Use these cells to load the API keys required for this notebook. The code cell below uses the `getpass` library to read the key without echoing it.

In [3]:
openai_api_key = getpass()
os.environ["OPENAI_API_KEY"] = openai_api_key
openai.api_key = openai_api_key

··········


## Model Setup

This exploration notebook uses GPT-4, but you can substitute GPT-3.5-turbo-16k, the original GPT-3.5 model, or another LLM.

In [30]:
chat = ChatOpenAI(temperature=0.0, model_name='gpt-4')
chat

ChatOpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-4', temperature=0.0, model_kwargs={}, openai_api_key='<redacted>', openai_api_base='', openai_organization='', openai_proxy='', request_timeout=None, max_retries=6, streaming=False, n=1, max_tokens=None)

In [48]:
template_string = """
You are an expert in {expertise}. You will ask different dialogic questions to prompt conversation about a story using CROWD prompts. \
You should ask questions when there are natural pauses in the reading of a book (quote the section and list questions). \

CROWD stands for:
“C” is for completion prompts. These are fill-in-the-blank questions using text from the book. “When Peter ate his soup, he used a ____." and “Sally rode to school in a _____.” are examples of fill-in-the-blank questions. \

“R” is for recall prompts. These are questions that require the child to remember aspects of the book. "What are some of the things that Sally did at school?" is an example of a recall prompt. \

“O” is for open-ended questions. These are statements that encourage the child to respond to the book in his or her own words. “Now it’s your turn to tell me about this page.” and “What do you think will happen next?” are examples of open-ended prompts. \

“W” is for who, what, where, when, and why questions. "What is this called?" and "Why did Peter stay home from school?" are examples of this type of prompt. \

“D” is for distancing prompts. These are questions that require the child to relate the content of the book to aspects of life outside of the book. "Did you ever stay home from school like Peter did?" and “How would you feel if you had to stay home from school?” are examples of distancing prompts. \

Please use simple language that a {age}-year-old would understand when generating the questions. This is the {time} the child has read the book. \

Remember the questions should be designed for the child (questions for parents to ask their kids). \

Use only the following text from the children’s storybook. \

Important notes for the output:
 - Generate {number_of_Q} question(s) for each quote, following the CROWD prompt types
 - Mention whether each question is C, R, O, W, or D
 - Quote the original story sentences that the questions are based on \

The storybook content is listed below: ```{storybook}``` \
"""

In [6]:
prompt_template = ChatPromptTemplate.from_template(template_string)


In [7]:
# prompt_template.messages[0].prompt
prompt_template.messages[0].prompt.input_variables

['age', 'expertise', 'number_of_Q', 'storybook', 'time']
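The list above is the set of placeholders the prompt expects. As a minimal sketch, independent of LangChain, the same check can be done with the standard library before a request is sent. The helper names `placeholder_names` and `missing_inputs` and the short `demo_template` are hypothetical and only serve the illustration.

```python
from string import Formatter


def placeholder_names(template: str) -> list:
    """Return the sorted, de-duplicated {placeholder} names in a format-style template."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text, so those entries are skipped.
    names = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(names)


def missing_inputs(template: str, supplied: dict) -> list:
    """List the placeholders that have no value in `supplied`."""
    return [name for name in placeholder_names(template) if name not in supplied]


# Hypothetical short template standing in for the full CROWD prompt.
demo_template = "You are an expert in {expertise}. Use words a {age}-year-old knows."
print(placeholder_names(demo_template))
print(missing_inputs(demo_template, {"age": 3}))
```

Running a check like this before `format_messages` would surface a forgotten variable such as `time` or `number_of_Q` early, rather than at call time.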

## Data Loading

The data consists of one PDF storybook (read with PyPDF2) and one audio story (transcribed with Whisper), both publicly available.

In [8]:
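def download(url, save_path):
    # Fetch the file and fail loudly on HTTP errors instead of saving an error page
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    with open(save_path, 'wb') as file:
        file.write(response.content)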

In [9]:
pdf_url = 'https://cdn.shopify.com/s/files/1/2081/8163/files/002-GINGER-THE-GIRAFFE-Free-Childrens-Book-By-Monkey-Pen.pdf?v=1589846892'
save_location = 'book1.pdf'
download(pdf_url, save_location)

In [10]:
# Open the PDF file in read-binary mode
with open('book1.pdf', 'rb') as file:
    # Create a PDF reader object
    reader = PyPDF2.PdfReader(file)

    # Concatenate the extracted text of every page
    # (restrict the iteration if only specific pages are needed)
    text = ''.join(page.extract_text() for page in reader.pages)

In [17]:
print(text[500:700])

ry in Africa. Like all 
giraffes, Ginger had a long neck and long legs. Because 
she was so tall, she was able to eat food from the very 
tops of the trees in the savannah.  The savannah in 
Africa is
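The extracted text keeps the hard line breaks of the PDF layout, as the sample above shows. A minimal sketch of one possible cleanup step, assuming that collapsing all whitespace runs into single spaces is acceptable for this book; `normalize_extracted_text` is a hypothetical helper, not part of PyPDF2.

```python
import re


def normalize_extracted_text(raw: str) -> str:
    """Collapse line breaks and repeated spaces left over from PDF extraction."""
    # Any run of whitespace (spaces, tabs, newlines) becomes one space.
    return re.sub(r"\s+", " ", raw).strip()


sample = (
    "Like all \ngiraffes, Ginger had a long neck and long legs. Because \n"
    "she was so tall"
)
print(normalize_extracted_text(sample))
```

Passing the normalized text as `storybook` may make the quoted passages in the model output easier to match against the source, though that has not been verified here.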


In [18]:
audio_url = 'https://drive.google.com/uc?export=download&id=0B2BusX3704XQR1JFY2VIYVNDMWs'
save_path = 'audio1.mp3'
download(audio_url, save_path)

In [34]:
# Use a context manager so the audio file is closed after the request
with open("audio1.mp3", "rb") as file:
    transcription = openai.Audio.transcribe("whisper-1", file)
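The problem definition asks for questions at natural pauses, which plain text alone does not reveal. One possible approach, sketched below, assumes the transcription is requested with segment timestamps (for example via `response_format="verbose_json"`, which this notebook does not use) and treats a long gap between consecutive segments as a pause. The function `find_pauses`, the 1-second threshold, and the timestamps in `sample_segments` are all illustrative assumptions, not measured values.

```python
def find_pauses(segments, min_gap_seconds=1.0):
    """Return (index, gap) pairs where the silence after segments[index] is long enough.

    Each segment is a dict with "start" and "end" times in seconds.
    """
    pauses = []
    for i in range(len(segments) - 1):
        # Silence between the end of this segment and the start of the next one.
        gap = segments[i + 1]["start"] - segments[i]["end"]
        if gap >= min_gap_seconds:
            pauses.append((i, round(gap, 2)))
    return pauses


# Made-up timestamps for illustration only.
sample_segments = [
    {"start": 0.0, "end": 4.2, "text": "Sawyer and Susie stared at the big door in front of them."},
    {"start": 4.4, "end": 9.0, "text": "For twins, they didn't look anything alike."},
    {"start": 10.8, "end": 12.5, "text": "You gonna knock? asked Sawyer."},
]
print(find_pauses(sample_segments))
```

A fixed threshold is the simplest choice; a reader-specific threshold or sentence-boundary cues may work better in practice.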

In [37]:
transcription['text']

"The Purple Rocket Podcast presents Grandpa's Globe. Episode 1. Spain. Sawyer and Susie stared at the big door in front of them. For twins, they didn't look anything alike. Sawyer was blonde and short, Susie was tall with dark hair and freckles, but those were just the smallest differences between them. In fact, they were pretty sure they were born on different planets on opposite ends of the universe. Even so, they were stuck together as usual, and now they stood on their grandpa's front porch, deciding their next move. You gonna knock? asked Sawyer. You knock, insisted Susie. You're taller. You'll be able to hear yours better. Susie shook her head. From what mom says, I don't think you'll be able to hear it either way. His hearing aids are always running out of battery and making that annoying squeaking noise like a dying mouse. Sawyer put a stick of gum in his mouth and chewed. It helped him think better in times of distress. I can't believe they just dropped us off here, he said. I

## Implementation

This section checks how the prompt performs on the Whisper transcription and on the text extracted with PyPDF2.

In [49]:
test_input1 = prompt_template.format_messages(
                      expertise = 'dialogic reading',
                      age = 3,
                      storybook = transcription['text'],
                      number_of_Q = 2,
                      time = 'first time')

In [50]:
test_input2 = prompt_template.format_messages(
                      expertise = 'dialogic reading',
                      age = 3,
                      storybook = text,
                      number_of_Q = 3,
                      time = 'first time')

In [51]:
response1 = chat(test_input1)
print(response1.content)

'The Purple Rocket Podcast presents Grandpa's Globe. Episode 1. Spain. Sawyer and Susie stared at the big door in front of them.'

C: Sawyer and Susie stared at the big ____ in front of them.
R: What were Sawyer and Susie staring at?
O: How do you think Sawyer and Susie feel about the big door?
W: Where are Sawyer and Susie standing?
D: Have you ever stood in front of a big door like Sawyer and Susie?

'For twins, they didn't look anything alike. Sawyer was blonde and short, Susie was tall with dark hair and freckles, but those were just the smallest differences between them.'

C: Sawyer was ____ and short, Susie was tall with ____ hair and freckles.
R: What are some differences between Sawyer and Susie?
O: Can you describe how Sawyer and Susie look?
W: What are Sawyer and Susie?
D: Do you have any siblings? How are you different from them?

'You gonna knock? asked Sawyer. You knock, insisted Susie.'

C: "You gonna ____?" asked Sawyer. "You ____," insisted Susie.
R: What did Sawyer ask

In [52]:
response2 = chat(test_input2)
print(response2.content)

'Once upon a time, there was a giraffe named Ginger.'
C: The animal in the story is a ____.
R: What is the name of the giraffe in the story?
O: Can you tell me something about giraffes?
W: Where did Ginger the giraffe live?
D: Have you ever seen a giraffe at the zoo?

'Ginger lived in Kenya, a country in Africa.'
C: Ginger lived in a country called ____.
R: In which continent is Kenya located?
O: What do you think it's like to live in Africa?
W: What other animals might live in Africa?
D: Can you name another country in Africa?

'She loved the leaves and the new buds of the trees.'
C: Ginger loved to eat ____ and new buds from the trees.
R: What did Ginger like to eat?
O: What kind of food do you think giraffes like to eat?
W: Why do you think giraffes eat leaves and buds?
D: What is your favorite food to eat?

'He looked very tired. “What’s wrong?” asked Ginger.'
C: Mickey the Monkey looked very ____.
R: How did Mickey look when Ginger found him?
O: How do you feel when you are tired?
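For a front end, the free-text responses above would likely need to be turned into structured data. The following is a minimal sketch that assumes the model keeps the layout seen in these two outputs: a quoted passage on one line followed by lines of the form `C: ...`, `R: ...`, and so on. `parse_crowd_response` is a hypothetical helper, and the model is not guaranteed to follow this layout every time.

```python
import re

# A question line starts with one CROWD letter followed by a colon.
QUESTION_LINE = re.compile(r"^([CROWD]):\s*(.+)$")


def parse_crowd_response(response_text):
    """Group 'X: question' lines under the quoted passage that precedes them."""
    blocks = []
    current = None
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = QUESTION_LINE.match(line)
        if match and current is not None:
            current["questions"].append(
                {"type": match.group(1), "question": match.group(2)}
            )
        elif not match:
            # Any non-question line is treated as the quoted story passage.
            current = {"quote": line.strip("'\""), "questions": []}
            blocks.append(current)
    return blocks


sample_response = """'Once upon a time, there was a giraffe named Ginger.'
C: The animal in the story is a ____.
R: What is the name of the giraffe in the story?

'Ginger lived in Kenya, a country in Africa.'
W: What other animals might live in Africa?"""

parsed = parse_crowd_response(sample_response)
print(parsed[0]["quote"], len(parsed[0]["questions"]))
```

With the questions grouped per quote, an app could show only the prompt types a parent selects, or skip a passage entirely.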

## Conclusion

This is an initial exploration notebook on using the Whisper API for the AIDA project.