<a href="https://colab.research.google.com/github/eunafita/Recomendacao-produtos-por-similaridade/blob/main/interview_assistant_audio.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **🧠 AI-Powered Interview Assistant with Audio (English Practice)**

Welcome! This notebook simulates an English-language job interview based on your own resume and a job description. It uses GPT-4 to generate interview questions, OpenAI Text-to-Speech (TTS) to read them aloud, and Whisper (speech-to-text) to transcribe your spoken answers. The assistant then responds to each answer with feedback or a follow-up question, much as a real interviewer would.

# 🎯 Objectives

* Practice speaking English in a job interview context
* Receive realistic interview questions based on your resume and a job description
* Respond with your voice, not just typing
* Get feedback on your answers from AI
* Prepare more confidently for remote or live interviews

# 🔧 Tools Used

* OpenAI GPT-4 – for generating interview questions and feedback
* OpenAI Whisper – to transcribe your audio answers
* OpenAI TTS (Text-to-Speech) – to hear the interview questions spoken aloud
* Python – for combining all components
* Google Colab – to run it all in your browser

# 📋 How to Use

* 📄 Upload your resume as a PDF (or paste it as text) and paste the job description into the notebook.
* 🧠 GPT-4 will generate customized interview questions.
* 🔊 The AI will speak each question aloud using Text-to-Speech.
* 🎙️ You will record or upload your spoken answer.
* 📝 The assistant will transcribe your answer using Whisper.
* 🧾 GPT-4 will evaluate and respond with feedback or a follow-up question.
* 🔁 Repeat as many times as you'd like.
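
The steps above amount to a simple turn loop. As a rough sketch (not the notebook's actual code), one turn can be written as a function that receives the transcription and question-generation steps as parameters. The names `run_turn`, `fake_transcribe`, and `fake_interviewer` are hypothetical, and the stand-in functions replace the Whisper and GPT-4 calls so the example runs without an API key.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_turn(
    history: List[Message],
    audio_bytes: bytes,
    transcribe: Callable[[bytes], str],
    ask_interviewer: Callable[[List[Message]], str],
) -> str:
    """Run one interview turn and return the interviewer's reply."""
    # 1. Speech-to-text: turn the recorded answer into text.
    answer = transcribe(audio_bytes).strip()
    history.append({"role": "user", "content": answer})
    # 2. Chat model: produce feedback or a follow-up question.
    reply = ask_interviewer(history)
    history.append({"role": "assistant", "content": reply})
    return reply


# Stand-ins (assumptions for illustration only; no API is called).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_interviewer(history: List[Message]) -> str:
    return f"Follow-up to: {history[-1]['content']}"


history: List[Message] = [{"role": "assistant", "content": "Tell me about yourself."}]
reply = run_turn(history, b"I build data pipelines.", fake_transcribe, fake_interviewer)
print(reply)
```

Passing the two steps in as callables keeps the loop itself independent of any particular speech or chat API.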

# ⚠️ Requirements

* An OpenAI API Key with access to GPT-4, Whisper, and TTS
* A microphone or audio file to upload
* Basic understanding of English (intermediate+ recommended)




In [None]:
# 📦 Install required libraries (only needs to be run once per Colab session)
!apt-get install -y ffmpeg
# The code below uses the v1 client interface (`from openai import OpenAI`)
!pip install --quiet "openai>=1.0" pydub pymupdf ipywidgets

# 📦 Import required libraries
import openai
import os
import IPython.display as ipd
from google.colab import files
from pydub import AudioSegment
from pydub.playback import play

# 📚 Additional utilities
import tempfile
import base64
from IPython.display import HTML, Audio
from getpass import getpass


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 34 not upgraded.


# 🔐 API Key Setup


In [None]:
from openai import OpenAI
from getpass import getpass

# 🔑 Securely ask for the API key
api_key = getpass("🔐 Enter your OpenAI API Key: ")

# 🎯 Initialize the OpenAI client with your key
client = OpenAI(api_key=api_key)

# ✅ Test the key by listing available models
try:
    models = client.models.list()
    print("✅ API key is valid. OpenAI models retrieved successfully.")
except Exception as e:
    print("❌ Failed to authenticate. Please check your API key.")
    print("Error:", e)



🔐 Enter your OpenAI API Key: ··········
✅ API key is valid. OpenAI models retrieved successfully.


In [None]:
import ipywidgets as widgets
from IPython.display import display

resume_input_type = widgets.RadioButtons(
    options=["📄 Upload PDF file", "📝 Paste text manually"],
    value="📄 Upload PDF file",
    description='Resume:',
)

display(resume_input_type)




In [None]:
from google.colab import files
import fitz  # PyMuPDF

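resume_text = ""  # Always a plain string; filled from the text box or the PDF

if resume_input_type.value == "📝 Paste text manually":
    resume_textarea = widgets.Textarea(
        placeholder='Paste your resume here...',
        description='Text:',
        layout=widgets.Layout(width='100%', height='200px')
    )

    # Keep resume_text in sync with the text box. Assigning the widget itself
    # would put the widget's repr, not the resume, into the prompt.
    def on_resume_changed(change):
        global resume_text
        resume_text = change["new"]

    resume_textarea.observe(on_resume_changed, names="value")
    display(resume_textarea)
else:
    print("📄 Please upload your resume PDF file below:")
    uploaded = files.upload()

    # Read and extract text from every page of every uploaded PDF
    pages = []
    for fname in uploaded:
        with fitz.open(fname) as doc:
            pages.extend(page.get_text() for page in doc)
    resume_text = "".join(pages)

    if resume_text.strip():
        print("✅ Resume successfully loaded from PDF.")
    else:
        print("⚠️ No text could be extracted from the uploaded file.")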


📄 Please upload your resume PDF file below:


Saving Resume new.pdf to Resume new.pdf
✅ Resume successfully loaded from PDF.
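
Text extracted from a PDF often carries ragged whitespace: runs of spaces, tabs, and stacks of blank lines. A small cleanup pass before building the prompt keeps it shorter and easier to read. The helper below is an optional sketch; `normalize_resume_text` is a hypothetical name and is not part of PyMuPDF or of this notebook.

```python
import re


def normalize_resume_text(raw: str) -> str:
    """Collapse redundant whitespace in text extracted from a PDF."""
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", raw)
    # Drop spaces that sit directly before or after a line break.
    text = re.sub(r" ?\n ?", "\n", text)
    # Allow at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = "Data   Engineer \n\n\n\nPython\t SQL\n"
print(repr(normalize_resume_text(sample)))
```

In the notebook this could be applied as `resume_text = normalize_resume_text(resume_text)` once the resume has been loaded.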


In [None]:
import ipywidgets as widgets
from IPython.display import display, clear_output

# Create TextArea for job description
job_textarea = widgets.Textarea(
    placeholder='Paste the job description here...',
    description='Job:',
    layout=widgets.Layout(width='100%', height='200px')
)

# Create Confirm button
confirm_button = widgets.Button(
    description='✅ Confirm Job Description',
    button_style='success',
    tooltip='Click to store the job description',
)

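# Output area to show result (named job_output so it is not shadowed by the
# google.colab `output` module imported in the recording cell)
job_output = widgets.Output()

# Default value so later cells do not fail if the button is never clicked
job_description = ""

# Handle button click
def on_confirm_clicked(b):
    global job_description  # To make it accessible outside this cell
    job_description = job_textarea.value.strip()
    with job_output:
        clear_output()
        if job_description:
            print("✅ Job description stored successfully!")
        else:
            print("⚠️ The job description is empty. Paste it above and confirm again.")

# Attach handler
confirm_button.on_click(on_confirm_clicked)

# Display everything
display(job_textarea, confirm_button, job_output)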



In [None]:
# 🧠 Prompt to GPT-4 to act as an interviewer
prompt = f"""
You are a professional English-speaking job interviewer.

Based on the candidate's resume and the job description below, ask the FIRST interview question only.

Be realistic and focus on the candidate's fit for the job.

Do not write the answer, only ask the question.

Resume:
{resume_text}

Job Description:
{job_description}

Now, ask your first interview question.
"""

# Create message history for ongoing conversation
chat_history = [
    {"role": "system", "content": "You are a professional English-speaking job interviewer."},
    {"role": "user", "content": prompt}
]

# 🔁 Get the first interview question
response = client.chat.completions.create(
    model="gpt-4",
    messages=chat_history,
    temperature=0.7
)

# 🗣️ Store and display the first question
first_question = response.choices[0].message.content
chat_history.append({"role": "assistant", "content": first_question})
print("🗨️ Interviewer:", first_question)


🗨️ Interviewer: Can you elaborate on your experience with GenAI tools like ChatGPT, Claude, or Perplexity in your previous roles, particularly in relation to boosting productivity at work?
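
Because `chat_history` grows with every turn and is sent in full on each request, a long session can eventually exceed the model's context window. One possible mitigation, sketched here under the assumption that the first two messages are the system message and the setup prompt, is to keep those plus only the most recent turns. `trim_history`, `keep_head`, and `keep_tail` are hypothetical names, not part of the OpenAI client.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(history: List[Message], keep_head: int = 2, keep_tail: int = 6) -> List[Message]:
    """Return the first `keep_head` and the last `keep_tail` messages."""
    if len(history) <= keep_head + keep_tail:
        return list(history)
    # history[-0:] would return the whole list, so handle keep_tail == 0 explicitly.
    tail = history[-keep_tail:] if keep_tail > 0 else []
    return history[:keep_head] + tail


# Demo data shaped like the notebook's chat_history (contents are placeholders).
demo_history: List[Message] = [
    {"role": "system", "content": "You are an interviewer."},
    {"role": "user", "content": "<resume and job description>"},
]
for i in range(5):
    demo_history.append({"role": "assistant", "content": f"Question {i + 1}"})
    demo_history.append({"role": "user", "content": f"Answer {i + 1}"})

trimmed = trim_history(demo_history)
print(len(demo_history), "->", len(trimmed))
```

With an even `keep_tail` and a history that ends on a user answer, the kept tail starts with an interviewer question, so the user/assistant alternation is preserved. The trimmed list would be passed as `messages` in place of the full history.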


# 🎧 Text-to-Speech - OpenAI

In [None]:
import base64
from IPython.display import Audio

# 🗣️ Generate audio from the first interview question using OpenAI TTS
speech_response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",  # Other options: 'echo', 'fable', 'onyx', 'nova', 'shimmer'
    input=first_question
)

# 💾 Save audio to a file
audio_path = "first_question.mp3"
with open(audio_path, "wb") as f:
    f.write(speech_response.content)

# ▶️ Play the audio in notebook
display(Audio(audio_path))
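
The same synthesize-and-save steps are repeated later for each follow-up question, so they could be factored into one helper. The sketch below assumes only the calls already used in this notebook (`client.audio.speech.create` and the response's `.content`); `save_speech` is a hypothetical name, and the fake client exists only so the example runs offline.

```python
import os
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_speech(client, text, path, voice="alloy", model="tts-1"):
    """Synthesize `text` with the client's speech endpoint and save it to `path`."""
    response = client.audio.speech.create(model=model, voice=voice, input=text)
    Path(path).write_bytes(response.content)
    return str(path)


# Offline stand-in that mimics only the attributes used above (illustration only).
def _fake_create(model, voice, input):
    return SimpleNamespace(content=f"{voice}:{input}".encode("utf-8"))


fake_client = SimpleNamespace(
    audio=SimpleNamespace(speech=SimpleNamespace(create=_fake_create))
)

demo_path = os.path.join(tempfile.gettempdir(), "demo_question.mp3")
save_speech(fake_client, "Why this role?", demo_path)
print(Path(demo_path).read_bytes())
```

In the notebook, the real `client` would take the place of `fake_client`, for example `save_speech(client, first_question, "first_question.mp3")` followed by `display(Audio("first_question.mp3"))`.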


In [None]:
from IPython.display import Javascript, display, Audio, FileLink
from google.colab import output
import base64
from IPython.display import HTML

# Updated JS with blinking and state handling
RECORD_JS = """
const startBtn = document.createElement("button");
const stopBtn = document.createElement("button");
const endBtn = document.createElement("button");

startBtn.textContent = "▶️ Start Recording";
stopBtn.textContent = "⏹️ Stop Recording";
endBtn.textContent = "🛑 End Interview";

startBtn.style = stopBtn.style = endBtn.style = "margin: 10px; padding: 10px; font-size: 16px;";
startBtn.style.animation = "";

document.body.appendChild(startBtn);
document.body.appendChild(stopBtn);
document.body.appendChild(endBtn);

const sleep = time => new Promise(resolve => setTimeout(resolve, time));
const b2text = blob => new Promise(resolve => {
  const reader = new FileReader();
  reader.onloadend = () => resolve(reader.result);
  reader.readAsDataURL(blob);
});

let stream;
let recorder;
let chunks = [];

startBtn.onclick = async () => {
  stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  recorder = new MediaRecorder(stream);
  chunks = [];
  recorder.ondataavailable = e => chunks.push(e.data);
  recorder.start();

  startBtn.textContent = "🔴 Recording...";
  startBtn.style.animation = "blinker 1s linear infinite";
  startBtn.style.color = "red";
};

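stopBtn.onclick = async () => {
  // Ignore clicks when no recording is in progress
  if (!recorder || recorder.state !== "recording") return;

  // Register the handler before stopping so the "stop" event cannot be missed
  const stopped = new Promise(resolve => recorder.onstop = resolve);
  recorder.stop();
  await stopped;
  stream.getTracks().forEach(track => track.stop());

  startBtn.textContent = "▶️ Start Recording";
  startBtn.style.animation = "";
  startBtn.style.color = "";

  // Tag the blob with the recorder's MIME type (typically WebM or Ogg, not WAV)
  let blob = new Blob(chunks, { type: recorder.mimeType });
  let base64 = await b2text(blob);
  google.colab.kernel.invokeFunction("notebook.audio_result", [base64], {});
};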

endBtn.onclick = () => {
  google.colab.kernel.invokeFunction("notebook.end_interview", [], {});
};

// CSS blinking animation
const style = document.createElement("style");
style.textContent = `
@keyframes blinker {
  50% { opacity: 0.2; }
}`;
document.head.appendChild(style);
"""

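# Callback: Process audio and continue the chat
def handle_audio(base64_audio):
    # The data URL looks like "data:<mime>;base64,<payload>"
    header, _, encoded = base64_audio.rpartition(",")
    audio_data = base64.b64decode(encoded)

    # Browsers record compressed audio (usually WebM or Ogg), not WAV, so pick
    # the extension from the MIME type and fall back to WebM when it is missing
    extension = "ogg" if "ogg" in header else "mp4" if "mp4" in header else "webm"
    audio_path = f"user_response.{extension}"
    with open(audio_path, "wb") as f:
        f.write(audio_data)
    print("✅ Audio recorded. Transcribing...")

    # Use a context manager so the file handle is closed after the upload
    with open(audio_path, "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="text"
        )
    user_answer = transcription.strip()
    if not user_answer:
        print("⚠️ No speech detected. Please record your answer again.")
        return
    print("📝 You said:", user_answer)

    chat_history.append({"role": "user", "content": user_answer})

    # Ask GPT-4 for feedback or the next question, given the whole conversation
    response = client.chat.completions.create(
        model="gpt-4",
        messages=chat_history,
        temperature=0.7
    )
    next_question = response.choices[0].message.content
    chat_history.append({"role": "assistant", "content": next_question})
    print("\n🗨️ Interviewer:", next_question)

    # Speak the reply aloud
    speech = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=next_question
    )
    with open("next_question.mp3", "wb") as f:
        f.write(speech.content)

    display(Audio("next_question.mp3"))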

# Callback: Export and offer full transcript
def end_interview():
    print("📄 Generating transcript...")

    md_lines = ["# 🧠 Interview Transcript\n"]
    # Skip the first two entries (system message and setup prompt): they hold
    # the instructions, resume, and job description, not interview dialogue
    for entry in chat_history[2:]:
        role = "👤 You" if entry["role"] == "user" else "🗨️ Interviewer"
        md_lines.append(f"**{role}:** {entry['content']}\n")

    md_text = "\n".join(md_lines)

    # Save the file locally (explicit UTF-8 so the emoji are written correctly)
    with open("interview_transcript.md", "w", encoding="utf-8") as f:
        f.write(md_text)

    # Encode for browser-safe download link
    b64 = base64.b64encode(md_text.encode("utf-8")).decode("ascii")
    download_link = f'<a download="interview_transcript.md" href="data:text/markdown;charset=utf-8;base64,{b64}" target="_blank">📥 Click here to download your interview transcript</a>'

    display(HTML(download_link))

# Register both callbacks
output.register_callback("notebook.audio_result", handle_audio)
output.register_callback("notebook.end_interview", end_interview)

# Launch interface
display(Javascript(RECORD_JS))



✅ Audio recorded. Transcribing...
📝 You said: Just testing, I made a Shots Up delivery app and in the end I integrated OpenAI's Whisper and chat in GPT-4 to receive audios from the customers, interpret the audio and adding the prompt for the AI, respond according to the restaurant's menu.

🗨️ Interviewer: That's an excellent approach. How did you manage and measure the success of this integration, and what were the key performance indicators you used to ensure the system was functioning optimally?


📄 Generating transcript...
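
The objectives mention getting feedback on your answers, while the loop above mainly produces follow-up questions. One way to request an explicit review at the end, sketched here as an assumption rather than as part of the original notebook, is to append a final instruction to a copy of the conversation. `build_feedback_messages` and the wording of `FEEDBACK_INSTRUCTION` are hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]

# Hypothetical wording; adjust to the kind of feedback you want.
FEEDBACK_INSTRUCTION = (
    "The interview is over. Give the candidate concise feedback on their answers: "
    "strengths, weaknesses, and corrections to their English."
)


def build_feedback_messages(history: List[Message]) -> List[Message]:
    """Return a copy of the history with a final feedback request appended."""
    return [*history, {"role": "user", "content": FEEDBACK_INSTRUCTION}]


demo_chat: List[Message] = [
    {"role": "assistant", "content": "Tell me about a recent project."},
    {"role": "user", "content": "I built a delivery app with voice ordering."},
]
feedback_messages = build_feedback_messages(demo_chat)
print(feedback_messages[-1]["content"])
```

The result would be sent with the same call used elsewhere in the notebook, `client.chat.completions.create(model="gpt-4", messages=build_feedback_messages(chat_history))`. Working on a copy keeps the feedback request out of the exported transcript.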
