# Sonar Mind: Echoes of Knowledge, Made Clear

# Team Composition 

1. 22BCB7267 Patan Mohammed Ibrahim Khan
2. 22BCB7275 Duggireddy Venu
3. 22BCB7132 Maile Hruday Raj
4. 22BCB7003 Sanikommu Divakar Reddy

# Introduction
Educational environments today face significant challenges in accommodating diverse learning needs, especially for students with language barriers, attention difficulties, or hearing impairments. Many students struggle to keep up with live lectures and often find themselves excluded from meaningful classroom participation. To address these pressing concerns, we developed Sonar Mind: The Educational Assistant, a comprehensive AI-powered tool specifically designed to make classroom content accessible and engaging for all learners. Our system transforms traditional lecture audio into interactive, searchable formats through the integration of Generative AI, Retrieval-Augmented Generation technology, and a user-friendly Gradio interface with an appealing aesthetic.

We built this platform around the core understanding that modern classrooms require flexible, adaptive solutions that meet students where they are in their learning journey. Our implementation includes essential AI capabilities such as intelligent question-answering systems, accurate speech-to-text transcription, sophisticated semantic search functionality, and personalized quiz generation features.

# Objectives:
1. Bridge Educational Accessibility Gaps
   1. Eradicate barriers faced by deaf and hard-of-hearing students in traditional lecture-based learning environments
   2. Provide real-time, automated transcription services to replace dependency on manual interpreters and note-takers
   3. Guarantee equal access to educational content regardless of physical abilities or learning differences
2. Transform Passive Learning into Interactive Education
   1. Convert static lecture audio into searchable, interactive digital resources
   2. Enable students to actively engage with course material through AI-powered question-answering capabilities
   3. Create engaging learning experiences that adapt to individual student needs and learning paces
3. Enhance Learning Retention and Self-Assessment
   1. Implement automated quiz generation based on lecture content to reinforce understanding
   2. Provide feedback and contextual answers derived from actual classroom discussions
   3. Support self-paced learning by allowing students to revisit and review specific lecture segments
4. Enhance Inclusive and Equitable Educational Environments
   1. Reduce achievement gaps between students with disabilities and their peers
   2. Create universally beneficial learning tools that enhance the educational experience for all students
   3. Promote educational equity by ensuring that learning barriers do not limit student potential or participation


# Gen AI functionalities:
 
1) Audio understanding: Uses Gemini's speech-to-text capability to transcribe spoken lectures into text.
2) Structured output/JSON mode/controlled generation: Quiz questions and answers follow a strict structured format so that they render consistently.
3) Few-shot prompting: Helps Gemini produce precise responses, even when transcripts contain small transcription errors.
4) Grounding: Instructs the model to base every answer exclusively on the uploaded lecture, so that outside information and hallucinated content are kept out of AI-generated responses.
5) Embeddings & vector databases: For effective vector-based retrieval, lecture text is converted into semantic embeddings and saved in ChromaDB.
6) Retrieval-Augmented Generation (RAG): Combines Gemini Flash's generation capabilities with vector retrieval to provide accurate and contextual responses.
7) Vector search: To find the most pertinent portions of the lecture, natural language queries initiate vector similarity lookups.
8) Function calling (LangChain): Organizes evaluators, chains, and retrievers into a modular pipeline.
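
To make the embedding and vector-search items above concrete, here is a minimal sketch of a similarity lookup in plain Python. The three-dimensional vectors and the names `cosine_similarity` and `top_k` are illustrative assumptions; the project itself relies on Gemini embeddings and ChromaDB for this step.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy, hand-made vectors standing in for real embeddings (assumption).
toy_index = [
    ("Photosynthesis converts light into chemical energy.", [0.9, 0.1, 0.0]),
    ("The French Revolution began in 1789.", [0.0, 0.2, 0.9]),
    ("Chlorophyll absorbs mostly red and blue light.", [0.8, 0.3, 0.1]),
]

print(top_k([1.0, 0.0, 0.0], toy_index, k=2))
```

With these toy vectors, the two biology sentences rank above the history sentence because their vectors point in nearly the same direction as the query.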

# System Workflow
The assistant's architecture follows a simple flow:

1) Audio Upload

Teachers or students can upload recorded lecture audio using Gradio's user-friendly interface. Gemini STT transforms the audio into accurate text transcripts.

2) Transcription and Chunking

LangChain's RecursiveCharacterTextSplitter divides the raw transcript into digestible portions for efficient search and retrieval.

3) Embedding & Indexing

Each chunk is embedded using Gemini's embedding models and stored in ChromaDB for reliable and fast vector-based search.

4) Lecture Content Querying (RAG)

Students type their questions in natural language. The assistant runs a semantic search over ChromaDB, and Gemini Flash then generates grounded, context-aware responses from the retrieved lecture fragments.

5) Quiz Generation

Few-shot prompting is used to automatically generate quizzes from the lecture transcript. It is easy to render outputs into interactive UI elements because they follow a JSON schema.

6) Interactive Feedback Mechanism

Students evaluate themselves through gamified quizzes with streak tracking and real-time scoring, which makes learning fun and inclusive.

# Tech Stack Used 

The project relies on a few key technologies to create a seamless workflow.

1. **Gemini Flash / Pro**: For transcription, text generation, embeddings, and quiz creation.
2. **ChromaDB**: A vector store that facilitates quick semantic searches.
3. **LangChain**: Coordinates model prompting, retrieval, and function calls.
4. **Gradio**: A simple user interface for audio upload, question interaction, and quiz feedback.
5. **Python**: Serves as the foundation for connecting all services and workflows.

# Install Libraries & Packages
The code cell below installs the libraries and packages needed to run this notebook. These include tools for accessing Google’s Generative AI models, creating interactive user interfaces with Gradio, and building LLM applications with LangChain. ChromaDB is used to store and search text efficiently, and FastAPI is included to enable deployment of the project as a web service if needed. Installing these packages ensures that all components of the AI assistant work properly within the notebook environment.

In [None]:
!pip install -q -U google-genai
!pip install langchain_community
!pip install gradio==4.14.0 google-generativeai==0.3.2 --quiet
!pip install -qU langchain-google-genai
!pip install chromadb
!pip install fastapi==0.112.2

# Import Libraries

In [None]:
import numpy as np
import tempfile
import soundfile as sf
import gradio as gr
import json
import random
import os
import logging
from google.api_core.exceptions import GoogleAPIError, NotFound, PermissionDenied
from google import genai
from google.genai import types
from google.colab import userdata
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain.chains.retrieval import create_retrieval_chain
from langchain_core.output_parsers import JsonOutputParser
from langchain.chains.combine_documents import create_stuff_documents_chain

from kaggle_secrets import UserSecretsClient
API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
client = genai.Client(api_key=API_KEY)

# Download Audio Dataset
To test the Inclusive Classroom Assistant thoroughly, the dataset path below contains four types of audio:

1) Short audio with clean background sound
2) Medium-length audio with moderate noise
3) Long audio with clean background sound
4) Long audio with heavy background noise

In [None]:
dataset_path = "/kaggle/input/kaggle-capstone-project-audio-library"
print("Files:", os.listdir(dataset_path))

# Code Walkthrough on Each Function
The project starts by allowing users to upload audio files, which are then converted into text by Gemini's speech-to-text model. This transcript can be formatted, stored, and updated by users as needed. The transcript is then segmented into smaller parts and stored in a vector database using ChromaDB to enable efficient semantic search. This makes it possible for the Retrieval-Augmented Generation (RAG) chain to respond to user inquiries using only the transcript. The system also has a quiz generation feature that creates multiple-choice questions from the transcript in structured JSON format using few-shot prompting to facilitate quiz-based learning. Interactive elements let users choose their responses, receive immediate feedback, and monitor their scores. The Gradio interface combines these features to give students a practical and engaging tool. The GenAI features used throughout the project to keep the assistant's outputs relevant and grounded in the lecture material are grounding, audio understanding, embeddings, vector search, structured output, and few-shot prompting.

## Global State Management
These variables store shared state across the notebook. They act like session memory that enables multiple parts of the app (transcription, RAG, quiz, etc.) to access and update the same data.

| Variable               | Purpose                                                       |
|------------------------|---------------------------------------------------------------|
| full_transcript        | Holds all transcribed chunks from audio                       |
| vector_db              | Stores the Chroma vector database for vector search           |
| rag_chain              | Holds the Retrieval-Augmented Generation pipeline             |
| current_correct_answer | Tracks quiz answers (used in quiz evaluation logic)           |


These variables let the various functions share data and keep the assistant's overall state consistent.

In [None]:
# ------------- Global State -------------
full_transcript = []
lecture_index = []
vector_db = None
rag_chain = None
current_correct_answer = ""

## Transcription Module (Gemini STT)
This section covers uploading audio to the Gemini API, generating transcriptions with a controlled prompt, storing and resetting the full transcript state, and integrating with Gradio (upload, clear, display). It also demonstrates audio understanding, function calling, and grounded prompting (plain text, no hallucinations).


Each function is described below:

* **get_full_transcript_text()**: Combines all the transcribed pieces into one readable paragraph.

* **clear_transcript_data()**: Resets the transcript list, allowing a fresh start when a new audio file is uploaded.

A third function handles the speech-to-text transcription. When a user uploads an audio file, **transcribe_audio_chunk()** sends the file to Google’s Gemini model, which transcribes the audio and returns the text. If anything goes wrong, such as a permission issue or upload failure, the function catches the exception and returns a descriptive error message.
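
The error-handling pattern described above can be sketched without any API access. In this minimal example, `safe_transcribe`, `fake_ok`, and `fake_missing` are hypothetical names, and the built-in exception types stand in for the Google API exceptions; it only illustrates the idea of turning failures into messages instead of crashes.

```python
def safe_transcribe(transcribe_fn, audio_path):
    """Call transcribe_fn(audio_path) and return (ok, message).

    transcribe_fn is a hypothetical stand-in for the real Gemini call.
    """
    try:
        text = transcribe_fn(audio_path)
    except PermissionError as exc:
        return False, f"Permission denied: {exc}"
    except FileNotFoundError as exc:
        return False, f"File not found: {exc}"
    except Exception as exc:  # last-resort guard so the UI never crashes
        return False, f"Unexpected error: {exc}"
    if not text or not text.strip():
        return False, "Transcription returned no text."
    return True, text.strip()

def fake_ok(path):
    # Pretend transcription succeeded.
    return "  Today we cover photosynthesis.  "

def fake_missing(path):
    # Pretend the uploaded file could not be found.
    raise FileNotFoundError(path)

print(safe_transcribe(fake_ok, "lecture.wav"))
print(safe_transcribe(fake_missing, "missing.wav"))
```

Returning an explicit success flag alongside the text is one possible design choice: it lets a caller decide whether a result should be added to the stored transcript or only shown as a status message.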

## Formatting the Output
**format_transcription_result()** simply returns the transcript as is. The function is kept separate in case formatting or cleaning is needed later.

## Gradio Transcription Handler
**handle_transcription_request()** and **handle_clear_transcript()** manage the interaction between the user and the app for transcription.

* **handle_transcription_request()**: This function is triggered when a user uploads an audio file. It sends the audio for transcription, stores the result, and updates the transcript display.

* **handle_clear_transcript()**: Resets the transcript area in the app when the user clicks the "Clear Full Transcript" button.


## Chunking, Embedding, Vector DB & RAG Setup
This section covers chunking the transcript into overlapping segments, embedding the chunks with Gemini and storing them in ChromaDB, building a LangChain-powered RAG pipeline that uses Gemini for Q&A, and adding few-shot prompt examples for more grounded answers. This is where the transcript is prepared for smart search and question answering.

* **chunk_transcript()**: Breaks long text into smaller overlapping pieces.

* **create_vector_db()**: Turns those chunks into vector form using embeddings and stores them in a searchable database called Chroma.

* **setup_rag_chain()**: Connects the Chroma database to a Gemini-powered answering system. It ensures that when a user asks a question, the app can find the most relevant transcript pieces and provide a focused answer.
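
To illustrate what "overlapping pieces" means, here is a simplified sketch of fixed-size chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on natural separators such as paragraphs and sentences); the function name `sliding_window_chunks` and the tiny sizes are assumptions chosen to keep the output readable.

```python
def sliding_window_chunks(text, chunk_size=800, overlap=150):
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
print(sliding_window_chunks(sample, chunk_size=10, overlap=3))
# → ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.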

## Indexing the Transcript
The **handle_indexing_request()** function brings everything together. It chunks the transcript, creates the searchable vector database, and sets up the RAG (Retrieval-Augmented Generation) chain. It must be run before users can start asking questions.

## Question Answering (RAG)
The functions below handle questions submitted by users.

* **query()**: Sends a user question to the RAG system and retrieves an answer.

* **answer_query_using_rag()**: Checks if the system is ready and then processes the user’s question based on the content of the transcript.
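
The two ideas above, checking readiness and answering only from retrieved transcript text, can be sketched as plain string assembly. The names `build_grounded_prompt` and `answer_or_not_ready` are hypothetical, and the prompt wording is a shortened assumption rather than the notebook's actual template.

```python
def build_grounded_prompt(retrieved_chunks, question):
    """Assemble a prompt that restricts the model to the retrieved transcript text."""
    context = "\n\n".join(
        chunk.strip() for chunk in retrieved_chunks if chunk.strip()
    )
    return (
        "Answer using only the transcript excerpts below. "
        'If the answer is not stated there, reply "I don\'t know."\n\n'
        f"Transcript:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )

def answer_or_not_ready(chain_ready, retrieved_chunks, question):
    """Mirror the readiness check: refuse to answer before indexing has run."""
    if not chain_ready:
        return "Please index the transcript first."
    return build_grounded_prompt(retrieved_chunks, question)

print(answer_or_not_ready(
    True,
    ["Mitochondria produce most of the cell's ATP."],
    "What do mitochondria produce?",
))
```

In the real pipeline the assembled prompt would be sent to the model; here it is only printed so the grounding instruction and the retrieved context are visible.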

## Quiz Generation with Few-Shot Prompting
This section sets up an AI-powered quiz generator. **setup_quiz_chain()** gives the model a structured example and a clear output format, then asks it to create 5 multiple-choice questions using only the transcript as reference. The output is returned as clean JSON so that it is easy to display in the interface.
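
Because the interface depends on the JSON structure described above, it can help to check the model's output before using it. The following is a minimal sketch of such a check; `validate_quiz` is a hypothetical helper that is not part of the notebook, and the sample JSON is invented for illustration.

```python
import json

def validate_quiz(raw_json, expected_options=4):
    """Parse model output and keep only well-formed multiple-choice items."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    valid = []
    for item in data:
        if not isinstance(item, dict):
            continue
        question = item.get("question")
        options = item.get("options")
        answer = item.get("answer")
        if (
            isinstance(question, str) and question.strip()
            and isinstance(options, list) and len(options) == expected_options
            and answer in options  # the answer must match one option exactly
        ):
            valid.append(item)
    return valid

# Invented sample output: the second item is rejected because its
# answer does not appear among the options.
sample_output = json.dumps([
    {"question": "What is 2 + 2?", "options": ["3", "4", "5", "6"], "answer": "4"},
    {"question": "Capital of France?", "options": ["Rome", "Oslo", "Bern", "Kyiv"], "answer": "Paris"},
])
print(len(validate_quiz(sample_output)))
```

Filtering out malformed items up front means the answer-checking logic can rely on every question having four options and a matching answer.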


## Global Quiz State
**quiz_state** is used to keep track of the quiz progress. This includes the current question, score, as well as streak of correct answers. It acts like the memory for the quiz system.

## Generate Quiz Questions
This module uses Gemini to generate multiple-choice questions from the transcript. The output is structured JSON for clean parsing, few-shot prompting is used to improve clarity, and LangChain's JsonOutputParser controls the generated format. The module covers the full quiz lifecycle, from generation to answer checking to navigation, and adds scoring and streak logic for a fun, interactive learning experience. The output is displayed in Gradio with buttons, feedback, and scoring badges.

**generate_quiz()** calls the quiz chain to generate new questions based on the transcript. It prepares the quiz state and shows the first question along with 4 answer options.

## Quiz Answer Evaluation
**select_answer()** checks whether the selected option is correct, updates the score, and tracks the user’s answer streak. It also prevents the same question from being answered more than once.
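
The scoring rules described above (one point per correct answer, a streak that resets on a mistake, and no re-answering) can be sketched as a small pure function. `apply_answer` is a hypothetical name, and unlike the notebook's global `quiz_state`, this sketch returns a new dictionary instead of mutating shared state.

```python
def apply_answer(state, selected, correct):
    """Return (new_state, feedback) without mutating the input state."""
    if state["answered"]:
        # A question can only be answered once.
        return state, "Already answered."
    new_state = dict(state, answered=True)  # shallow copy with the flag set
    if selected == correct:
        new_state["score"] += 1
        new_state["streak"] += 1
        return new_state, "Correct!"
    new_state["streak"] = 0  # a wrong answer breaks the streak
    return new_state, f"Incorrect. The correct answer was: {correct}."

start = {"score": 0, "streak": 0, "answered": False}
after_first, feedback = apply_answer(start, "B", "B")
print(after_first, feedback)
```

Keeping the update logic free of UI code makes it straightforward to test the scoring rules on their own.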

## Next Question Handler
**advance_to_next_question()** moves to the next quiz question once the user has answered the current one. When the quiz is done, it calculates the final score as a percentage and shows the result in color-coded text (green for 60% or above, red below that).
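
The final-score calculation can be shown in isolation. This is a minimal sketch in which `final_score_summary` is a hypothetical helper; the 60% threshold matches the one used in the notebook's code.

```python
def final_score_summary(score, total, pass_mark=60):
    """Return (percentage, color) for a finished quiz."""
    if total <= 0:
        raise ValueError("total must be positive")
    percentage = round(score / total * 100)
    color = "green" if percentage >= pass_mark else "red"
    return percentage, color

print(final_score_summary(3, 4))  # → (75, 'green')
print(final_score_summary(1, 4))  # → (25, 'red')
```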




## Quiz and UI Helper Functions
* **generate_quiz_and_buttons()**: Generates a quiz and sets up the option buttons in the UI.

* **select_answer_and_update()**: Checks the selected answer and updates the quiz feedback.

* **load_transcript()**: Currently returns the transcript unchanged; it is kept as a hook that can be extended later.

* **clear_transcript()**: Resets the transcript display area.

* **handle_query_request()**: Handles the user’s question submission and returns an answer based on the indexed transcript.


In [None]:
# ------------- Transcript State Management -------------
def get_full_transcript_text():
    return " ".join(full_transcript)

def clear_transcript_data():
    global full_transcript
    full_transcript = []
    print("Transcript data cleared.")
    return ""

# ------------- Gemini STT Function -------------
def transcribe_audio_chunk(audio_path):
    try:
        client = genai.Client(api_key=API_KEY)
        uploaded_file = client.files.upload(file=audio_path)
        print(f"✅ File uploaded. URI: {uploaded_file.uri}, Name: {uploaded_file.name}")

        prompt = (
            "Please perform speech-to-text transcription for the provided audio file. "
            "Output the transcribed text followed by the key points as a numbered list. "
            "Do not use any JSON formatting—just return plain text."
        )

        print("🚀 Sending transcription request to Gemini...")
        response = client.models.generate_content(
            model='gemini-2.0-flash',
            contents=[prompt, uploaded_file]
        )

        if not response.candidates:
            block_reason = response.prompt_feedback.block_reason if response.prompt_feedback else "Unknown"
            return f"Transcription failed. Block Reason: {block_reason}"

        candidate = response.candidates[0]
        if hasattr(candidate.content, 'parts') and candidate.content.parts:
            transcript = candidate.content.parts[0].text
            print("✅ Transcription successful.")
            return transcript
        else:
            return "Error: Failed to parse transcription response."

    except PermissionDenied as e:
        return f"❌ Permission Denied: {e.message}"
    except NotFound as e:
        return f"❌ Resource Not Found: {e.message}"
    except GoogleAPIError as e:
        return f"❌ API Error: {e.message}"
    except Exception as e:
        return f"❌ Unexpected Error: {str(e)}"

# ------------- Formatting the Output -------------
def format_transcription_result(result_text):
    return result_text

# ------------- Gradio Transcription Handler -------------
def handle_transcription_request(audio_file):
    if audio_file is None:
        return "", get_full_transcript_text(), gr.update(value=None), "Transcription not initiated.", "Input declined. No audio file provided."

    transcript_text = transcribe_audio_chunk(audio_file)
    formatted_chunk = format_transcription_result(transcript_text)
    full_transcript.append(formatted_chunk)

    return (
        formatted_chunk,
        get_full_transcript_text(),
        gr.update(value=None),
        "Transcription successful.",
        "Input accepted. Audio file is being processed."
    )

def handle_clear_transcript():
    clear_transcript_data()
    return "", "Transcript cleared."



# ------------- Chunking, Embedding, Vector DB & RAG -------------

def chunk_transcript(text, chunk_size: int = 800, overlap_size: int = 150):
    # Optionally, you could call: text = correct_transcript_errors(text)
    document = [Document(page_content=text)]
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=overlap_size
    )
    chunks = splitter.split_documents(documents=document)
    print(f"File split into {len(chunks)} chunks.")
    return chunks

import asyncio  # used by create_vector_db for the event-loop guard

def create_vector_db(text_chunks, collection_name="transcription-rag"):
    global vector_db
    try:
        # Ensure we have an event loop in the current thread
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop)

        embeddings = GoogleGenerativeAIEmbeddings(
            model="models/text-embedding-004",
            google_api_key=API_KEY
        )

        vector_db = Chroma.from_documents(
            documents=text_chunks,
            embedding=embeddings,
            collection_name=collection_name,
            persist_directory="/content/chroma_db"  # Ephemeral persist directory
        )
        print(f"✅ Vector DB created with collection_name: {collection_name}")
        return vector_db

    except Exception as e:
        raise Exception(f"Error creating vector DB: {str(e)}")


def setup_rag_chain(vector_db):
    if not vector_db:
        raise ValueError("Vector DB not initialized!")

    try:
        llm = ChatGoogleGenerativeAI(
            model="gemini-2.0-flash",
            temperature=0.1,
            max_tokens=None,
            timeout=None,
            max_retries=2,
            google_api_key=API_KEY
        )

        # Few-shot query rewriting prompt
        query_prompt = PromptTemplate.from_template("""
            You are an AI assistant that helps rephrase queries.

            Example 1:
            Original Question: Who is Master Sito?
            Alternative Queries:
              1. According to the transcript, what is Master Sito's role?
              2. What does the transcript state about Master Sito?
              3. How is Master Sito described in the lecture?

            Example 2:
            Original Question: Who is Master Sito?
            Even if the transcript contains a minor typo (e.g., 'Master Ceto'),
            assume the intended name is Master Sito.

            Now, given the original question: {question}
            Generate three alternative queries:
        """)

        retriever = MultiQueryRetriever.from_llm(
            retriever=vector_db.as_retriever(search_kwargs={"k": 4}),
            llm=llm,
            prompt=query_prompt
        )

        # Main prompt for answering with grounding and few-shot examples
        main_template = """
            You are an educational assistant. Answer the user's question based solely on the transcript context provided.
            Disregard minor transcription errors (for example, if the transcript has "Master Ceto" but context indicates it should be "Master Sito").
            If the answer is explicitly stated, provide it exactly. Otherwise, reply "I don’t know."

            Few-shot examples:
            ---------------------
            Transcript Example 1:
            "Master Sito said: 'Face life with humor.'"
            Q: What did Master Sito say about life?
            A: Face life with humor.
            ---------------------
            Transcript Example 2:
            "According to the lecture, Master Sito is a monk living in seclusion."
            Q: Who is Master Sito?
            A: He is a monk.
            ---------------------
            Now, using the transcript below:
            Transcript:
            {context}

            Question: {question}
            Answer:
        """

        prompt = ChatPromptTemplate.from_template(template=main_template)

        chain = (
            {"context": retriever, "question": RunnablePassthrough()}
            | prompt
            | llm
            | StrOutputParser()
        )

        print("RAG chain setup complete!")
        return chain

    except Exception as e:
        raise Exception(f"Error setting up RAG chain: {str(e)}")

def handle_indexing_request(transcript_text):
    global vector_db, rag_chain
    if not transcript_text or len(transcript_text.strip()) == 0:
        return "⚠️ Transcript is empty. Please transcribe or paste something first."
    try:
        chunks = chunk_transcript(transcript_text)
        vector_db = create_vector_db(chunks)
        rag_chain = setup_rag_chain(vector_db)
        return f"✅ Indexing complete. {len(chunks)} chunks indexed."
    except Exception as e:
        return f"❌ Indexing failed: {str(e)}"

def query(chain, question: str):
    # Fail fast instead of calling .invoke() on a missing chain.
    if not chain:
        raise ValueError("RAG chain not initialized!")
    try:
        return chain.invoke(question)
    except Exception as e:
        raise Exception(f"Error processing query: {str(e)}") from e

def answer_query_using_rag(user_query):
    global rag_chain
    if not rag_chain:
        return "⚠️ Please index the transcript first."
    try:
        result = query(rag_chain, user_query)
        return f"💬 {result}"
    except Exception as e:
        return f"❌ Error: {str(e)}"

#-------------Quiz Generation with few shot prompting---------------------------------------------
def setup_quiz_chain():
    try:
        llm_quiz = ChatGoogleGenerativeAI(
            model="gemini-2.0-flash",
            temperature=0.1,
            # Consider setting a reasonable max_tokens limit, e.g., max_tokens=1024
            max_tokens=None,
            # Consider setting an explicit timeout, e.g., timeout=120
            timeout=None,
            max_retries=2,
            google_api_key=API_KEY
        )

        quiz_template = """
            You are an educational assistant. Your task is to generate 5 multiple-choice quiz questions based only on the transcript provided below.
            Please return the output strictly as valid JSON. Do not include any introductory text or markdown formatting around the JSON object.
            The JSON should be a list containing 5 objects, each following this format:

            {{
              "question": "Your quiz question here.",
              "options": ["Option A", "Option B", "Option C", "Option D"],
              "answer": "The correct option (must exactly match one of the options)"
            }}

            Transcript:
            {transcript}

            JSON Output:
        """ # Added "JSON Output:" hint and refined instructions slightly

        quiz_prompt = PromptTemplate.from_template(quiz_template)
        # For standard JSON:
        parser = JsonOutputParser()

        # Update the chain to use the JsonOutputParser
        chain = (
            {"transcript": RunnablePassthrough()}
            | quiz_prompt
            | llm_quiz
            | parser # <-- Use JsonOutputParser instead of StrOutputParser
        )
        print("Quiz chain setup complete!")
        return chain

    except Exception as e:
        raise Exception(f"Error setting up Quiz chain: {str(e)}")


# --- Global Quiz State ---
quiz_state = None

# --- Function to Generate Quiz ---
def generate_quiz(transcript: str):
    global quiz_state
    if not transcript or transcript.strip() == "":
        return "⚠️ Please provide a transcript.", [], "No quiz generated."
    try:
        chain = setup_quiz_chain()
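        # The chain maps its raw input to {transcript}, so pass the text itself
        # rather than a dict (a dict would be inserted into the prompt as-is).
        output = chain.invoke(transcript)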
        print("DEBUG - Chain output:", output)
        quiz_data = output  # Already parsed JSON from JsonOutputParser.
    except Exception as e:
        return f"Quiz generation failed: {str(e)}", [], "Error occurred."
    if not quiz_data or len(quiz_data) == 0:
        return "⚠️ No quiz questions returned by the model.", [], ""

    # Initialize quiz state, including the 'answered' flag.
    quiz_state = {
        "questions": quiz_data,
        "current_index": 0,
        "score": 0,
        "streak": 0,        # Consecutive correct answers.
        "answered": False,  # Whether the current question has been answered.
    }

    first_question = quiz_data[0]
    return first_question["question"], first_question["options"], ""

# --- Function to Evaluate Answer (without advancing to next question) ---
def select_answer(index: int):
    global quiz_state
    if not quiz_state or "questions" not in quiz_state:
        return "No quiz generated. Please generate a quiz first.", "N/A", "N/A", "N/A", "N/A", "⚠️", "Score: 0 | Streak: 0"

    # Prevent re-answering if the question was already answered.
    if quiz_state.get("answered", False):
        current_question = quiz_state["questions"][quiz_state["current_index"]]
        options = current_question.get("options", [])
        btn_labels = [options[i] if i < len(options) else "N/A" for i in range(4)]
        return (current_question["question"], btn_labels[0], btn_labels[1], btn_labels[2], btn_labels[3],
                "You have already answered. Click 'Next Question' to continue.",
                f"Score: {quiz_state.get('score', 0)} | Streak: {quiz_state.get('streak', 0)}")

    current_question = quiz_state["questions"][quiz_state["current_index"]]
    options = current_question.get("options", [])
    if index >= len(options):
        return "Invalid option selected.", "N/A", "N/A", "N/A", "N/A", "Error", f"Score: {quiz_state.get('score', 0)} | Streak: {quiz_state.get('streak', 0)}"

    selected_option = options[index]

    # Check answer and update score and streak.
    if selected_option == current_question["answer"]:
        feedback = "Correct!"
        quiz_state["score"] += 1
        quiz_state["streak"] += 1
    else:
        feedback = f"Incorrect. The correct answer was: {current_question['answer']}."
        quiz_state["streak"] = 0

    quiz_state["answered"] = True  # Mark the question as answered.
    btn_labels = [options[i] if i < len(options) else "N/A" for i in range(4)]
    score_text = f"Score: {quiz_state['score']} | Streak: {quiz_state['streak']}"
    return (current_question["question"], btn_labels[0], btn_labels[1], btn_labels[2], btn_labels[3],
            feedback, score_text)

# --- Function to Advance to the Next Question ---
def advance_to_next_question():
    global quiz_state
    if not quiz_state or "questions" not in quiz_state:
        return "No quiz generated. Please generate a quiz first.", "N/A", "N/A", "N/A", "N/A", "⚠️", "Score: 0 | Streak: 0"

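    if not quiz_state.get("answered", False):
        # Keep the current question and options on screen; only the feedback changes.
        current_q = quiz_state["questions"][quiz_state["current_index"]]
        options = current_q.get("options", [])
        btn_labels = [options[i] if i < len(options) else "N/A" for i in range(4)]
        return (current_q["question"], btn_labels[0], btn_labels[1], btn_labels[2], btn_labels[3],
                "Please select an answer before proceeding.",
                f"Score: {quiz_state['score']} | Streak: {quiz_state['streak']}")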

    quiz_state["current_index"] += 1
    quiz_state["answered"] = False  # Reset the answered flag.
    if quiz_state["current_index"] < len(quiz_state["questions"]):
        next_q = quiz_state["questions"][quiz_state["current_index"]]
        options = next_q.get("options", [])
        btn_labels = [options[i] if i < len(options) else "N/A" for i in range(4)]
        return (next_q["question"], btn_labels[0], btn_labels[1], btn_labels[2], btn_labels[3],
                "", f"Score: {quiz_state['score']} | Streak: {quiz_state['streak']}")
    else:
        score = quiz_state["score"]
        total = len(quiz_state["questions"])
        percentage = round((score / total) * 100)
        color = "red" if percentage < 60 else "green"
        # Display final score with some HTML styling.
        percent_display = f"<span style='color:{color}; font-weight:bold;'>{percentage}%</span>"
        final_msg = f"Quiz complete! Your final score is {score} out of {total}: {percent_display}."
        quiz_state = None
        return final_msg, "", "", "", "", "", ""


# --- Combined function to update quiz question & button labels on generation ---
def generate_quiz_and_buttons(transcript: str):
    question, options, feedback = generate_quiz(transcript)
    btn_labels = ["N/A", "N/A", "N/A", "N/A"]
    if isinstance(options, list):
        for i in range(min(len(options), 4)):
            btn_labels[i] = options[i]
    score_text = "Score: 0 | Streak: 0"
    return question, btn_labels[0], btn_labels[1], btn_labels[2], btn_labels[3], feedback, score_text

def select_answer_and_update(index: int):
    # (Call our select_answer function.)
    return select_answer(index)

def load_transcript(full_text):
    # For now, simply return the same text.
    # Adjust this function based on your intended behavior.
    return full_text

def clear_transcript():
    # This dummy implementation clears the transcript and returns a cleared status message.
    return "", "Transcript cleared."

def handle_query_request(user_query):
    if not user_query or not user_query.strip():
        return "⚠️ Please enter a valid question about the lecture."

    # Hypothetical function that uses your indexed transcript + LLM:
    return answer_query_using_rag(user_query)

# Gradio Interface

# ------------------ Gradio Interface with Custom Golden Theme ------------------

with gr.Blocks(
    theme="d8ahazard/material_design_rd",
    css="""
    @import url('https://fonts.googleapis.com/css2?family=Playfair+Display:wght@600;700&family=Raleway:wght@400;500&display=swap');

    /* Universal font styles with Golden Theme */
    body,
    .gradio-container,
    .gr-button,
    .gr-markdown,
    .gr-textbox,
    h1, h2, h3, h4, p {
      font-family: 'Raleway', sans-serif !important;
      color: #f5deb3 !important;  /* warm golden beige */
      letter-spacing: 0.05em;
      line-height: 1.6;
      background-color: #1a1a1a !important; /* deep dark background */
      margin: 0;
      padding: 0;
    }

    /* Accent color definition for buttons and highlights */
    .accent-bg {
      background: linear-gradient(135deg, #caa44d, #e6c85d) !important; /* gold gradient */
      font-family: 'Playfair Display', serif !important;
      color: #1a1a1a !important;  /* dark text */
      font-weight: 700;
      border-radius: 6px;
      box-shadow: 0 0 8px rgba(202,164,77,0.7);
      transition: all 0.3s ease-in-out;
    }
    .accent-bg:hover {
      background: linear-gradient(135deg, #e6c85d, #f5e08a) !important;
      box-shadow: 0 0 12px rgba(255,215,0,0.9);
    }

    /* Elegant heading style */
    .golden-heading {
      font-family: 'Playfair Display', serif !important;
      letter-spacing: 0.1em;
      font-size: 30px;
      color: #ffd700 !important;
      text-shadow: 0 0 6px rgba(255,215,0,0.8);
    }

    /* Smooth typewriter effect for header */
    .typewriter {
      overflow: hidden;
      border-right: .15em solid #ffd700;
      white-space: nowrap;
      animation: typing 2.5s steps(30, end), blink-caret 0.75s step-end infinite;
      width: fit-content;
      font-weight: 700;
      line-height: 1.8;
    }
    @keyframes typing {
      from { width: 0; }
      to { width: 100%; }
    }
    @keyframes blink-caret {
      from, to { border-color: transparent; }
      50% { border-color: #ffd700; }
    }

    /* Logo bounce animation */
    #bot-logo img {
      animation: bounce 1.2s ease infinite !important;
      border-radius: 8px;
      width: 90px;
      height: 90px;
      object-fit: contain;
      margin-right: 8px;
    }
    @keyframes bounce {
      0%, 100% { transform: translateY(0); }
      50% { transform: translateY(-8px); }
    }

    /* Dark gold textboxes */
    .gr-textbox, .gr-textbox textarea, .gr-textbox input {
      background-color: #262626 !important;
      color: #f5deb3 !important;
      border: 1px solid #caa44d !important;
      border-radius: 6px;
      padding: 4px 8px;
    }
    ::placeholder {
      color: #d4af37 !important;
      opacity: 0.8;
    }

    /* Header spacing */
    #header {
      margin-bottom: 16px;
    }
    .tab-content {
      padding: 16px;
    }

    /* Gold scoreboard */
    #quiz-scoreboard {
      border: 2px solid #ffd700;
      padding: 8px;
      margin-bottom: 8px;
      font-family: 'Playfair Display', serif;
      color: #ffd700;
      background-color: #1a1a1a;
      text-align: right;
    }

    /* Golden panel for prompts/feedback */
    .golden-panel {
      border: 2px solid #caa44d;
      background-color: #111;
      padding: 8px;
      margin-bottom: 8px;
      font-family: 'Raleway', sans-serif;
      color: #f5deb3 !important;
      text-align: center;
      text-shadow: 0 0 4px rgba(202,164,77,0.7);
    }

    /* Tab label styles */
    .gradio-container .tabs button {
      font-family: 'Playfair Display', serif !important;
      color: #f5deb3 !important;
      background-color: transparent !important;
      border: none !important;
    }
    .gradio-container .tabs button:hover {
      background-color: #333 !important;
      color: #ffd700 !important;
    }
    .gradio-container .tabs button.selected {
      color: #ffd700 !important;
      text-shadow: 0 0 4px rgba(255,215,0,0.8);
    }

    /* Hide audio file icon */
    .gradio-audio .file-drop svg {
        display: none !important;
    }
    """
) as app:

    gr.Markdown('<link href="https://fonts.googleapis.com/css2?family=Playfair+Display:wght@600;700&family=Raleway:wght@400;500&display=swap" rel="stylesheet">')

    with gr.Row():
        with gr.Column(scale=1):
            gr.Markdown(
                """
                <h2 class="golden-heading typewriter" style="margin: 0;">
                    Inclusive Classroom Assistant
                </h2>
                <p style="margin: 4px 0 0 0; font-size: 14px; color: #f5deb3;">
                    Upload audio, transcribe, index, and ask anything about your lecture!
                </p>
                """,
                elem_id="header"
            )

    # Tab 1
    with gr.Tab("🎙️ Transcription & Indexing") as tab1:
        with gr.Row():
            with gr.Column(scale=1):
                gr.Markdown("<h3 style='color:#ffd700;'>Transcription Input</h3>")
                audio_input = gr.Audio(type="filepath", show_label=False)
                transcribe_button = gr.Button("Transcribe Chunk", elem_classes="accent-bg")
                transcription_input_status_textbox = gr.Textbox(label="Transcription Input Status", lines=1, interactive=False)
                latest_chunk_textbox = gr.Textbox(label="Latest Transcript Chunk", lines=10, interactive=False)
                status_textbox = gr.Textbox(label="Status", lines=1, interactive=False)
            with gr.Column(scale=1):
                gr.Markdown("<h3 style='color:#ffd700;'>Full Transcript & Indexing</h3>")
                full_transcript_textbox = gr.Textbox(label="Full Lecture Transcript", lines=20, interactive=False)
                with gr.Row():
                    index_button = gr.Button("Index Transcript for Search", elem_classes="accent-bg")
                    clear_button = gr.Button("Clear Full Transcript", elem_classes="accent-bg")
                indexing_status_display = gr.Textbox(label="Indexing Status", lines=2, interactive=False)

    # Tab 2
    with gr.Tab("💬 Query Lecture Content") as tab2:
        gr.Markdown("<h3 style='color:#ffd700;'>Ask a question about the lecture content</h3>")
        with gr.Row():
            query_input_textbox = gr.Textbox(
                label="Ask a question",
                placeholder="E.g., What lesson did Sam learn?",
                lines=2
            )
            ask_button = gr.Button("Ask Question", elem_classes="accent-bg")
        answer_display = gr.Markdown(
            "💡 Answer will appear here...",
            elem_classes="query-answer-box golden-panel"
        )

    # Tab 3
    with gr.Tab("📝 Quiz Generator") as tab3:
        scoreboard = gr.Markdown("Score: 0 | Streak: 0", elem_id="quiz-scoreboard")
        gr.Markdown("<h3 style='color:#ffd700;'>Generate Quiz from Transcript</h3>")
        gr.Markdown("<p class='golden-panel'>Click <strong>Generate Quiz</strong> to start. Answer each question and review your score and correct answer streak after each question.</p>")
        generate_btn = gr.Button("Generate Quiz", elem_classes="accent-bg")
        quiz_question = gr.Markdown("Question will appear here", elem_classes="golden-panel")
        with gr.Row():
            option_button1 = gr.Button("Option 1", elem_classes="accent-bg")
            option_button2 = gr.Button("Option 2", elem_classes="accent-bg")
            option_button3 = gr.Button("Option 3", elem_classes="accent-bg")
            option_button4 = gr.Button("Option 4", elem_classes="accent-bg")
        feedback_box = gr.Textbox(label="Feedback", interactive=False, elem_classes="golden-panel")
        next_btn = gr.Button("Next Question", elem_classes="accent-bg")

    # Button Callbacks
    transcribe_button.click(
        fn=handle_transcription_request,
        inputs=[audio_input],
        outputs=[latest_chunk_textbox, full_transcript_textbox, audio_input, status_textbox, transcription_input_status_textbox]
    )
    index_button.click(
        fn=handle_indexing_request,
        inputs=[full_transcript_textbox],
        outputs=[indexing_status_display]
    )
    clear_button.click(
        fn=clear_transcript_data,
        inputs=None,
        outputs=[full_transcript_textbox, status_textbox]
    )
    ask_button.click(
        fn=handle_query_request,
        inputs=[query_input_textbox],
        outputs=[answer_display]
    )
    generate_btn.click(
        fn=generate_quiz_and_buttons,
        inputs=[full_transcript_textbox],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )
    option_button1.click(
        fn=lambda: select_answer_and_update(0),
        inputs=[],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )
    option_button2.click(
        fn=lambda: select_answer_and_update(1),
        inputs=[],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )
    option_button3.click(
        fn=lambda: select_answer_and_update(2),
        inputs=[],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )
    option_button4.click(
        fn=lambda: select_answer_and_update(3),
        inputs=[],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )
    next_btn.click(
        fn=advance_to_next_question,
        inputs=[],
        outputs=[quiz_question, option_button1, option_button2, option_button3, option_button4, feedback_box, scoreboard]
    )

app.launch()


# Limitations
Although the Inclusive Classroom Assistant improves accessibility and engagement, it has three main limitations. First, **transcription accuracy** depends on audio clarity, accents, background noise, and overlapping speakers, any of which can introduce errors into the speech-to-text output. Second, the system has **limited contextual awareness**: it generates answers solely from the uploaded lecture content, so it lacks broader subject knowledge. Third, there is a **user dependency on uploads**: the system currently relies on manual audio uploads, although this step could be automated in the future by capturing audio directly from classroom microphones.
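
One way to quantify the transcription accuracy limitation is to compare a generated transcript against a manually corrected one. The sketch below is not part of our implementation; it is a minimal, assumed helper (the name `word_error_rate` is hypothetical) that computes the standard word error rate, i.e. the word-level edit distance divided by the number of words in the reference transcript.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration only; it is not used by the app above.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("The reference transcript must contain at least one word.")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # Deleting all i reference words matches an empty hypothesis.
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    manual = "the cat sat on the mat"
    automatic = "the cat sat on a mat"
    print(f"WER: {word_error_rate(manual, automatic):.2f}")
```

A lower value indicates a closer match; tracking it across recordings with different noise levels or speakers would show how strongly each factor affects the transcript.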