
# RAG Interview Preparation - End-to-End Demo

This notebook demonstrates the end-to-end pipeline:

1. Audio Recording
2. Transcription (Local Whisper or OpenAI API)
3. Document Upload & Embedding
4. Retrieval from Pinecone
5. Feedback Generation (GPT or LLaMA)
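
Steps 3 and 4 above can be sketched without any external services. The block below is a minimal illustration, not the notebook's actual implementation: it assumes a bag-of-words counter as a stand-in for a real embedding model and an in-memory list as a stand-in for the Pinecone index. The names `chunk_text`, `toy_embed`, `cosine_similarity`, and `retrieve` are hypothetical helpers introduced only for this example.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text):
    """Assumption: a bag-of-words count vector replaces a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=1):
    """Return the top_k (score, chunk) pairs, best match first."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Usage: an in-memory list plays the role of the vector index.
documents = [
    "Python lists are mutable sequences.",
    "A vector database stores embeddings for similarity search.",
]
best = retrieve("How does a vector database search embeddings?", documents)
print(best[0][1])
```

An empty query or empty chunk scores 0.0 against everything, so it is worth checking that the transcript is non-empty before running retrieval on it.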


In [1]:
import os
import openai
import whisper
import sounddevice as sd
import numpy as np
from scipy.io.wavfile import write
from dotenv import load_dotenv
import uuid
import time
from langchain.text_splitter import RecursiveCharacterTextSplitter
# ... plus Pinecone, etc.

# Load environment variables
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
PINECONE_API_KEY = os.getenv("PINECONE_API_KEY")
PINECONE_ENV = os.getenv("PINECONE_ENV")

GLOBAL_AUDIO_DATA = []
GLOBAL_STREAM = None
is_recording = False

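def audio_callback(indata, frames, time_, status):
    """Collect each incoming audio block; sounddevice calls this on its audio thread."""
    if status:
        # Report overflows and other stream problems instead of ignoring them
        print(status)
    GLOBAL_AUDIO_DATA.append(indata.copy())

def toggle_recording(filename="candidate_answer.wav", fs=44100):
    """Start recording on the first call; stop and save to `filename` on the next."""
    global is_recording, GLOBAL_AUDIO_DATA, GLOBAL_STREAM
    if not is_recording:
        # Start: reset the buffer and open a mono input stream
        print("Recording started...")
        GLOBAL_AUDIO_DATA = []
        GLOBAL_STREAM = sd.InputStream(samplerate=fs, channels=1, callback=audio_callback)
        GLOBAL_STREAM.start()
        is_recording = True
        return None
    # Stop: release the audio device before writing the file
    GLOBAL_STREAM.stop()
    GLOBAL_STREAM.close()
    GLOBAL_STREAM = None
    is_recording = False
    if not GLOBAL_AUDIO_DATA:
        # np.concatenate would fail on an empty list, so raise a clearer error
        raise RuntimeError("No audio was captured; check the input device.")
    audio_data = np.concatenate(GLOBAL_AUDIO_DATA, axis=0)
    write(filename, fs, audio_data)
    print(f"Recording stopped. Saved to {filename}")
    return filename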

# Example usage in the notebook:
print("Press Enter to start recording...")
input()
toggle_recording()  # starts recording

print("Press Enter to stop recording...")
input()
audio_file = toggle_recording()  # stops recording and returns filename

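# Then transcribe locally with Whisper:
model = whisper.load_model("base")
result = model.transcribe(audio_file)
candidate_text = result["text"].strip()
if not candidate_text:
    # An empty transcript most likely means no speech was captured
    print("Warning: transcription is empty; check the microphone and re-record.")
print("Transcribed text:", candidate_text)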

# The rest is your normal logic for chunking, Pinecone upsert, retrieval, etc.


Recording for 3 seconds...
Recording complete!
Transcribing locally with Whisper...




Candidate's transcribed text: 
Split into 1 chunk(s).
Processing 1 chunks for upsert...
Processed chunk 1/1
Upserted batch 1
Documents have been embedded and upserted successfully!
LW[vppPd best match: 0dE"w!:jXoȊC_K5SS:YX+	jmyV\o
j"B5'M$cG(<f4'K`̩pfiJRGǅUSp7Qq.p~
wѪ
K
S)CHqėhߺTy&2$ӞΔ'j.JɷU&W5JXNAnaǭ8WUC4:ʘ7{1wCmDOU%9Կ)l}MJ\K<{;w@_X׫m`IZ*q1	-˝ݡ=ۓOrZ􌳿Ś5Y\:,9[nJsp>v/5W	D?SMvhRV~Ѕdl2ߚ\d"F3b"b
渺>嵬ǁy䨾gu>bWRxt5|kØƶ[.8RO\VJXj(8frkRtF'c^Jt)z=YKz_ՓI9z{6:.؋v<WvNƾ5~$lcaf9;2̡RUt:3/[Ɂ.~@2|#p?Y]CE
zNw.Yt,·Bgڐ:s^wxC
Q-uB7jy[f5.le1o0xRA0J~cX]"o_Fm5}$33Maw}7˝8īJ{c,ʎ-_09Fi&hLpܦ^ltlh#.)F$5,r,iFj+pl?coF0e~ =E0eL͜KeU|dh|rnJaӆ+hU oDf<||JĻ9Y	@ch55K!$|7EhlJF}=ߌbC-
ѫ+7*d>3)*)F5|~2,D8l\]]E<TM"љRVm_G%[	YDfB7z=~Hn$2޹.RY;^&CҲ[_cOiVu
Feedback:
| Key Point Missed | Details/Explanation |
|------------------|---------------------|
| Coherence and relevance | The candidate's answer is completely unr