# Teambuilding: Speech to Text

## Overview

This Colab notebook serves as a template for our teambuilding exercise, which focuses on building a speech-to-text application to assist children with disabilities, particularly those with autism or deafness.

These children often experience sensory overload in traditional classroom environments, making it challenging to process both audio and visual information simultaneously.

By transcribing audio into text, this application aims to provide a more accessible and inclusive learning experience.

### Objective

Using this notebook as a template, achieve the following objectives:
*   Provide an accurate transcription of the given audio file.
*   Distinguish the individual speakers within the audio clip.
*   Design a front end that keeps our personas engaged in their class.




## Pre-requisite setup
This section installs everything we need to run the Streamlit application and to use GenAI for our use case.

In [None]:
# @title Install Pre-requisites
# @markdown Install all the required dependencies and reference files
!pip3 install -q streamlit
!pip3 install -q python-dotenv
!pip3 install -q google-generativeai

!npm install -q localtunnel

# @title Retrieve reference files
!git clone -q "https://github.com/RJDG97/Teambuilding.git"

## Speech to text code
This section contains the code to build your app, including the Streamlit UI and the calls to Gemini.

Feel free to modify the code to improve your app!

In [None]:
# @title Create your app
# @markdown Write the app code below to `app.py`
%%writefile app.py

import os
import sys
import datetime
from dotenv import load_dotenv
import streamlit as st
import google.generativeai as genai
import glob

# Vertex AI imports
import vertexai
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    Part,
    Content,
)

# Environment variables
# @markdown ### Set your environment
# @markdown If the GenAI key does not work please go [here](https://aistudio.google.com/app/apikey) to get your api key
load_dotenv()
GENAI_APIKEY = "" # @param {type:"string"}
genai.configure(api_key=GENAI_APIKEY)

# @markdown ### Choose your AI Model
# Create the model client (this uses the Gemini API through google.generativeai, not Vertex AI)
MODEL_ID = "gemini-1.5-flash-001"  # @param ["gemini-1.5-pro-001", "gemini-1.5-flash-001"] {isTemplate:true}
model = genai.GenerativeModel(MODEL_ID)

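# --- Helper Functions ---
def generate_ai_response(contents):
    """Send the prompt and uploaded files to the model and render its reply."""
    generation_config = {
        "max_output_tokens": 8192,
        "temperature": 0.05,  # low temperature keeps transcriptions consistent
        "top_p": 0.95,
    }

    response = model.generate_content(contents, generation_config=generation_config)
    st.write("Response:")
    try:
        st.write(response.text)
    except ValueError:
        # response.text raises ValueError when the reply contains no text,
        # for example when the response was blocked.
        st.warning("The model returned no text for this request.")
# --- End of Helper functions ---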

# --- App Variables ---
# Sample outputs
audio_sample_output= """
    | Speaker | Text |
    |---|---|
    | **Teacher** | So the inquiry question is: Mauritius, is tourism the way to go? So we have three groups supporting yes, tourism should be done, and we have three groups supporting no, it shouldn't be done. |
    | **Student A** | We are well aware the tourism sector contribute to one quarter of our local economy. |
    | **Student B** | You guys claim that you guys want to invest more money into tourism, correct? Why don't we invest the money into upgrading our sugarcane plantation so that they are more resistant to rising water temperatures which is caused by the increased carbon footprint which is produced by tourist activity? |
    | **Student C** | Sugarcane is affected by natural disasters. We cannot control them. So what do you want the government to do? Did you read the background information properly? Because it's the natural disaster that is affecting sugarcane and not your carbon footprint. Thank you. |
    | **Teacher** | Okay. No, no, no, no more questions. No, no. No back and forth. Okay. So pass down the reflection sheet. Okay, while we discuss for about five minutes, you complete the post-conference reflections. |

    Detailed Summary:

    \n
    This transcription captures a lively classroom debate in Singapore. The central question is whether tourism is the right path for Mauritius.

    \n
    Student A, representing the pro-tourism side, highlights tourism's significant contribution to the local economy.
    Student B, opposing excessive reliance on tourism, challenges the pro-tourism group's proposal to invest more in the sector. They suggest investing in sugarcane plantations instead, aiming to enhance their resilience against climate change impacts, which they argue are exacerbated by tourism.
    Student C, also seemingly against over-dependence on tourism, retorts that natural disasters, not carbon footprint from tourism, are the primary threat to sugarcane. They question the opposing group's grasp of the background information.
    The Teacher, mediating the debate, calls for order, preventing further back-and-forth arguments. They then instruct the students to complete a reflection sheet while a separate discussion takes place.
    The debate showcases critical thinking, with students presenting economic and environmental arguments. The use of Singaporean English is evident in the vocabulary and pronunciation.
    """

text_sample_output ="""
Response:

## Arguments for and Against Investing in Tourism in Mauritius

### Argument For Investing in Tourism

- **Tourism's Contribution to the Economy**:
  - The audio states, "The tourism sector contributes to one-quarter of our local economy."
  - This argument is found on slide 2, in the infographic titled **"SUSTAINABLE TOURISM"**.
  - The infographic highlights:
    - "Tourism is responsible for 235 million jobs in the world."
    - Tourism is the "Main income source for many developing countries."

### Argument Against Investing in Tourism

- **Environmental Impact**:
  - The opposing team suggests investing in upgrading sugarcane plantations to be more resistant to rising water temperatures caused by the increased carbon footprint of tourist activity.
  - This argument is based on the information provided on slide 3, which states:
    - "The growth of tourism and sugar plantations industries has [an] impact on the biodiversity of the country."
    - Highlights the "ecological destruction" caused by tourism-related activities.

- **Impact of Natural Disasters**:
  - The audio mentions the impact of natural disasters on sugarcane plantations. However, this point is refuted as not being directly related to the carbon footprint of tourism.
                    """
# --- End of App Variables ---

# --- Prompts ---
# @markdown ### AI Context Prompt
# @markdown Here you can edit the prompts given to the AI Model to give it context on its job \
# @markdown \
# @markdown Edit the prompt to get a more accurate transcription for the objectives \
# @markdown \
# @markdown **Sample Audio prompt**
# @markdown ```
# @markdown You are a transcriber for a classroom.
# @markdown Provide a transcription that differentiates teachers and students.
# @markdown ```
AUDIO_CONTEXT_PROMPT = "You are a transcriber for a classroom. Provide a transcription that differentiates teachers and students."# @param {type:"string"}
# @markdown **Sample PDF prompt** \
# @markdown ```
# @markdown You are a teacher's assistant helping to make citations.
# @markdown Cite where the arguments made in the audio file can be found in the slides,
# @markdown and give the exact file name and page of each slide.
# @markdown ```
PDF_CONTEXT_PROMPT = "You are a teacher's assistant helping to make citations. Cite where the arguments made in the audio file can be found in the slides, and give the exact file name and page of each slide."# @param {type:"string"}
# --- End of Prompts ---

def main():
# --- App layout ---
    tab1, tab2 = st.tabs(["Document + Audio", "Audio",])
    with tab1:
        st.subheader("Citing information in lessons to the slides", divider='gray')
        st.markdown("""
                    Gemini is able to ingest large amounts of unstructured data such as PDFs and audio
                    to help with the citations of materials. \n
                    """)

        # Selector for pdfs
        pdf_path = "/content/Teambuilding/Samples/pdf/"
        pdf_list = [os.path.basename(x) for x in glob.glob(pdf_path + "*")]
        uc1_reports = st.multiselect(
            "Select the slides you want to include. \n\n",
            pdf_list,
            key="uc1_reports",
        )

        #Selector for audio
        audio_path = "/content/Teambuilding/Samples/audio/"
        audio_list = [os.path.basename(x) for x in glob.glob(audio_path + "*")]
        uc1_audio = st.multiselect(
            "Select the audio you would like to reference. \n\n",
            audio_list,
            key="uc1_audio",
        )

        # Button to Analyze Audio & Doc
        generate_t2t_uc1 = st.button("Analyze my report", key="generate_t2t_uc1")

        # Generate response from given resources when pressed
        if generate_t2t_uc1 and PDF_CONTEXT_PROMPT:
            with st.spinner(f"Analyzing using {MODEL_ID} ..."):
                first_tab1, first_tab2, first_tab3 = st.tabs(["Analysis", "Prompt", "Sample"])
                with first_tab1:
                    # Resources you are providing the AI Model
                    contents =[]
                    # Upload resources to GenAI

                    for report in uc1_reports:
                        # Upload file to genai
                        pdf_file = genai.upload_file(path=pdf_path + report, mime_type="application/pdf",display_name=report)
                        contents.append(pdf_file)

                    for audio in uc1_audio:
                        # Upload audio to genai
                        audio_file_uploaded = genai.upload_file(path=audio_path + audio, mime_type="audio/mp3",display_name=audio)
                        contents.append(audio_file_uploaded)
                    contents.append(PDF_CONTEXT_PROMPT)

                    # Generate a Response from the AI Model
                    generate_ai_response(contents)

                with first_tab2:
                    st.text(PDF_CONTEXT_PROMPT)
                with first_tab3:
                    st.markdown(text_sample_output)

    with tab2:
        # Tab Subheader
        st.subheader(
            "Transcribing audio with Gemini", divider='gray'
        )
        # Tab Description
        st.write(
            """
            Gemini 1.5 can transcribe conversations from audio. It does so with high accuracy and,
            unlike traditional transcription software, identifies multiple languages without any pre-calibration.
            It can return the transcript in a format that can be processed further, as seen below.
            It also has a 1 million token context window, which allows you to process up to 11 hours of audio in a single prompt."""
        )

        # Multiselect for audio
        uc2_audio = st.multiselect(
            "Select the audio you would like to reference. \n\n",
            audio_list,
            key="uc2_audio",
        )

        # Create sub tabs
        tabo1, tabo2, tabo3 = st.tabs(["Response", "Prompt", "Answers"])

        # Button to generate an ai response
        audio_geolocation_description = st.button(
            "Generate", key="audio_geolocation_description"
        )

        # Response tab
        with tabo1:
            if audio_geolocation_description and AUDIO_CONTEXT_PROMPT:
                with st.spinner(f"Analyzing using {MODEL_ID} ..."):
                    for audio in uc2_audio:
                        # Upload audio to genai
                        audio_file_tab_uploaded = genai.upload_file(path=audio_path + audio, mime_type="audio/mp3",display_name=audio)
                        contents = [audio_file_tab_uploaded, AUDIO_CONTEXT_PROMPT,]
                        generate_ai_response(contents)
                        st.markdown("\n\n\n")
        # Prompt tab
        with tabo2:
            st.write("Prompt used:")
            st.write(AUDIO_CONTEXT_PROMPT)
        # Answer tab
        with tabo3:
            st.write("Answers:")
            st.write(
                audio_sample_output
            )
# --- End of App layout ---

if __name__ == "__main__":
    main()

## Play with the app layout
To change the layout of your app, select **'show code'** in the block above and edit the **app layout section** to make it user-friendly for our personas!

Below is some reference code for creating various UI elements in Streamlit.

For additional features [go here](https://docs.streamlit.io/develop/api-reference)
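
Before laying out a transcript, it can help to split the model's reply into rows so that each speaker can be displayed differently. The sketch below assumes the reply contains a two-column Markdown table shaped like the sample output in `app.py` (`Speaker` and `Text` columns). The names `parse_speaker_table`, `group_by_speaker` and `SAMPLE_TABLE` are hypothetical helpers for illustration, not part of the app, and cells that contain a literal `|` are not handled.

```python
from collections import defaultdict

# Hypothetical sample in the same shape as the sample output in app.py.
SAMPLE_TABLE = """
| Speaker | Text |
|---|---|
| **Teacher** | So pass down the reflection sheet. |
| **Student A** | Tourism contributes to our local economy. |
| **Teacher** | No more questions. |
"""


def parse_speaker_table(markdown_table: str) -> list[tuple[str, str]]:
    """Return (speaker, text) pairs from a two-column Markdown table."""
    rows = []
    for line in markdown_table.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip any prose around the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a Speaker/Text row
        speaker, text = cells
        # Skip the header row and the |---|---| separator row.
        if speaker.lower() == "speaker" or set(speaker) <= set("-: "):
            continue
        rows.append((speaker.strip("*"), text))  # drop the bold markers
    return rows


def group_by_speaker(rows: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collect every utterance under its speaker, keeping the original order."""
    grouped = defaultdict(list)
    for speaker, text in rows:
        grouped[speaker].append(text)
    return dict(grouped)
```

In the app, the rows could then be rendered one by one, for example with a separate `st.chat_message` block per speaker.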

### Navigation
References for creating tabs for app navigation.

#### Creating Tabs
To create tabs, use the following code:
```
tab1, tab2 = st.tabs(["sample tab title 1", "sample tab title 2"])
```
Inside the `[]`, list each tab title in quotes. `st.tabs` returns one container per title, assigned here to the variables `tab1` and `tab2`.

To add items to a tab, use `with tab1:` as in the code below:
```
with tab1:
  st.write("This is sample text")
```

### Text Elements
Reference for creating titles, headers and subheader elements for the app.

#### Creating Title
To create a title for your app, use the following code:
```
st.title("sample title")
```

#### Creating Header
Display text in header formatting.
```
st.header("This is a header")
```
Display text in subheader formatting.
```
st.subheader("This is a subheader")
```

### Writing text
Reference for writing and streaming text in the app.

#### Write text
Write arguments to the app.
```
st.write("Hello")
```

#### Write stream text
Stream a generator, iterable, or stream-like sequence to the app.
```
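import time

import numpy as np
import pandas as pd
import streamlit as st

# Placeholder text to stream; replace it with your own content.
_LOREM_IPSUM = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."


def stream_data():
    # Stream the text one word at a time.
    for word in _LOREM_IPSUM.split(" "):
        yield word + " "
        time.sleep(0.02)

    # Non-text items such as DataFrames can be streamed as well.
    yield pd.DataFrame(
        np.random.randn(5, 10),
        columns=["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
    )

    for word in _LOREM_IPSUM.split(" "):
        yield word + " "
        time.sleep(0.02)


if st.button("Stream data"):
    st.write_stream(stream_data)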
```
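
The same idea could be applied to a transcript, so that text appears gradually instead of all at once. The sketch below is one possible generator for that. `stream_words` is a hypothetical helper, not part of `app.py`. It keeps the whitespace that follows each word, so line breaks in the transcript survive.

```python
import re
import time
from typing import Iterator


def stream_words(text: str, delay: float = 0.02) -> Iterator[str]:
    """Yield `text` one word at a time, pausing `delay` seconds between words."""
    # Each chunk is a run of non-space characters plus the whitespace after it,
    # so newlines between words are preserved.
    for chunk in re.findall(r"\S+\s*", text):
        yield chunk
        if delay > 0:
            time.sleep(delay)
```

In the app, this generator could be passed to `st.write_stream`, for example `st.write_stream(stream_words(response.text))`. A slower `delay` may suit learners who are easily overwhelmed by fast-moving text.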

### Media elements
Reference for creating images, audio, videos and dividers for the app.


#### Display image
Display an image or list of images.
```
st.image("sunrise.jpg", caption="Sunrise by the mountains")
```

#### Display audio
Display an audio player.
```
st.audio("cat-purr.mp3", format="audio/mpeg", loop=True)
```

#### Display video
Display a video player.
```
# Read the file as bytes; the with-block closes it afterwards.
with open("myvideo.mp4", "rb") as video_file:
    video_bytes = video_file.read()

st.video(video_bytes)
```

#### Creating Dividers
Display a horizontal rule.
```
st.divider()
```

### Input widgets
Reference for creating buttons, uploaders, checkboxes, links and other input widgets for the app.

#### Button
Display a button widget.
```
st.button("Reset", type="primary")
if st.button("Say hello"):
    st.write("Why hello there")
```

#### Download button
Display a download button widget.
```
text_contents = '''This is some text'''
st.download_button("Download some text", text_contents)
```

#### File Uploader
Display a file uploader widget.
```
uploaded_files = st.file_uploader(
    "Choose a CSV file", accept_multiple_files=True
)
for uploaded_file in uploaded_files:
    bytes_data = uploaded_file.read()
    st.write("filename:", uploaded_file.name)
    st.write(bytes_data)
```
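
If learners upload their own files instead of picking from the cloned samples, the MIME types hard-coded in `app.py` (`application/pdf` and `audio/mp3`) may not match every file. A minimal sketch of an extension lookup follows. `SUPPORTED_MIME_TYPES` and `mime_type_for` are hypothetical names, and every entry other than the two types already used in `app.py` is an assumption that should be checked against the Gemini API documentation.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from file extension to the MIME type passed to upload_file.
# Only ".pdf" and ".mp3" come from app.py; the rest are assumptions.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
    ".flac": "audio/flac",
    ".ogg": "audio/ogg",
}


def mime_type_for(filename: str) -> Optional[str]:
    """Return the MIME type for `filename`, or None if the type is unsupported."""
    extension = Path(filename).suffix.lower()  # ".MP3" and ".mp3" are treated alike
    return SUPPORTED_MIME_TYPES.get(extension)
```

Returning `None` for unknown extensions lets the UI skip the file or show a warning instead of sending an upload that is likely to fail.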

#### Checkbox
Display a checkbox widget.
```
agree = st.checkbox("I agree")
```

#### Feedback
Display a feedback widget.
```
sentiment_mapping = ["one", "two", "three", "four", "five"]
selected = st.feedback("stars")
if selected is not None:
    st.markdown(f"You selected {sentiment_mapping[selected]} star(s).")
```


#### Link button
Display a link button element.
```
st.link_button("Go to gallery", "https://streamlit.io/gallery")
```

#### Page link
Display a link to another page in a multipage app or to an external page.
```
st.page_link("http://www.google.com", label="Google", icon="🌎")
```

#### Select box
Display a select widget.
```
option = st.selectbox(
    "How would you like to be contacted?",
    ("Email", "Home phone", "Mobile phone"),
)
```

#### Multiselect
Display a multiselect widget.
```
options = st.multiselect(
    "What are your favorite colors",
    ["Green", "Yellow", "Red", "Blue"],
    ["Yellow", "Red"],
)
```
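
The app fills its multiselect widgets from the files found in the cloned sample folders. The sketch below shows one way to build such an option list while keeping only the expected file types. `list_sample_files` is a hypothetical helper, not part of `app.py`, and the choice to return an empty list for a missing folder is an assumption about the desired behavior.

```python
from pathlib import Path
from typing import Iterable, List


def list_sample_files(folder: str, extensions: Iterable[str]) -> List[str]:
    """Return the sorted names of files in `folder` with one of `extensions`."""
    wanted = {extension.lower() for extension in extensions}
    directory = Path(folder)
    if not directory.is_dir():
        return []  # missing folder: the selector simply shows no options
    return sorted(
        path.name
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
```

The result can be passed straight to a selector, for example `st.multiselect("Select the audio you would like to reference.", list_sample_files(audio_path, [".mp3"]))`.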

### Status Elements
Reference for creating progress bars, temporary messages and statuses for the app.

#### Creating a progress bar
Display a progress bar.
```
import time  # needed for time.sleep below

progress_text = "Operation in progress. Please wait."
my_bar = st.progress(0, text=progress_text)

for percent_complete in range(100):
    time.sleep(0.01)
    my_bar.progress(percent_complete + 1, text=progress_text)
```

#### Creating temporary messages
Temporarily displays a message while executing a block of code.

```
import time  # needed for time.sleep below

with st.spinner('Wait for it...'):
    time.sleep(5)
st.success("Done!")
```

#### Status message
Insert a [status container](https://docs.streamlit.io/develop/api-reference/status/st.status) to display output from long-running tasks.

```
import time  # needed for time.sleep below

with st.status("Downloading data...", expanded=True) as status:
    st.write("Searching for data...")
    time.sleep(2)
    st.write("Found URL.")
    time.sleep(1)
    st.write("Downloading data...")
    time.sleep(1)
    status.update(
        label="Download complete!", state="complete", expanded=False
    )
```

## Run your app
This section runs the Python file generated above through Streamlit to create your app! \
The cell prints a public link and an IP address. Copy the IP address, then open the link. \
Paste the IP address into the **Tunnel Password** field and submit; you will be redirected to your app.
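
The tunnel password is the public IPv4 address of the machine running the notebook, which the launch cell prints with `curl ipv4.icanhazip.com`. A Python sketch of the same lookup follows, with a check that the value really is an IPv4 address before you paste it. `fetch_public_ip` and `tunnel_password` are hypothetical helper names, and the use of the HTTPS form of the lookup URL is an assumption.

```python
import ipaddress
import urllib.request
from typing import Callable


def fetch_public_ip(url: str = "https://ipv4.icanhazip.com") -> str:
    """Ask the lookup service for this machine's public IPv4 address."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def tunnel_password(fetch: Callable[[], str] = fetch_public_ip) -> str:
    """Return the public IPv4 address with surrounding whitespace removed."""
    candidate = fetch().strip()
    # Raises ValueError if the service returned something other than an IPv4 address.
    ipaddress.IPv4Address(candidate)
    return candidate
```

Calling `print(tunnel_password())` in a notebook cell would show the value to paste. The `fetch` parameter exists so the lookup can be swapped out, for example when there is no network access.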

In [None]:
# @title Launch app
# @markdown Copy the IP address printed below, open the link, and paste the address into the Tunnel Password field to access your app!
! python3 -m streamlit run app.py &>/content/logs.txt & npx localtunnel --port 8501 & curl ipv4.icanhazip.com

## NotebookLM
This section goes through NotebookLM.
Go [here](https://notebooklm.google.com/) to access NotebookLM