<a href="https://colab.research.google.com/github/tailagos/SpeechModel/blob/main/AI_for_Interviews_HR_Usecase.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**1. Libraries and Dependencies**

In [15]:
!pip install numpy --quiet
!pip install scipy --quiet
!pip install faster-whisper transformers torchaudio TTS --quiet
!pip install networkx --quiet
!pip install pandas --quiet

**2. Load Models**

In [17]:
from faster_whisper import WhisperModel
whisper_model = WhisperModel("small", compute_type="float32")

!pip install -U jax --quiet # upgrade jax
!pip install -U tensorflow --quiet # upgrade tensorflow to ensure compatibility with numpy and jax

# Load FLAN-T5 for language generation
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

t5_tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
t5_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Load Coqui TTS for text-to-speech
from TTS.api import TTS
tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC", progress_bar=False, gpu=False)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`

 > Downloading model to /root/.local/share/tts/tts_models--en--ljspeech--tacotron2-DDC
 > Model's license - apache 2.0
 > Check https://choosealicense.com/licenses/apache-2.0/ for more info.
 > Downloading model to /root/.local/share/tts/vocoder_models--en--ljspeech--hifigan_v2
 > Model's license - apache 2.0
 > Check https://choosealicense.com/licenses/apache-2.0/ for more info.
 > Using model: Tacotron2
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:False
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > pitch_fmin:1.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linea

In [18]:
from IPython.display import Audio, display
import wave

**LOAD SAMPLE AUDIO (LJ Speech)**

In [19]:
# Replace this with your own uploaded file
AUDIO_PATH = "LJSpeech-1.1/wavs/LJ001-0001.wav"

# Get the audio file's sample rate
with wave.open(AUDIO_PATH, 'rb') as wf:
    rate = wf.getframerate()

# Listen to the candidate input, providing the sample rate
Audio(filename=AUDIO_PATH, rate=rate)

FileNotFoundError: [Errno 2] No such file or directory: 'LJSpeech-1.1/wavs/LJ001-0001.wav'
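
The `FileNotFoundError` above shows that the LJ Speech clip is not present in the runtime; no earlier cell downloads or uploads it. A minimal sketch of a guard, using only the standard library, that fails with a clearer message and otherwise returns the sample rate. The helper name `check_wav` is hypothetical and not part of the notebook.

```python
import os
import wave

def check_wav(path):
    """Return the sample rate of a WAV file, or raise a descriptive error."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Audio file not found: {path!r}. Upload a WAV file or fetch the "
            "LJ Speech dataset before running the cells below."
        )
    # wave.open only handles uncompressed WAV, which is what LJ Speech ships
    with wave.open(path, "rb") as wf:
        return wf.getframerate()
```

With this in place, `rate = check_wav(AUDIO_PATH)` could replace the `wave.open` block in the cell above.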

**TRANSCRIBE AUDIO INPUT**

In [20]:
segments, _ = whisper_model.transcribe(AUDIO_PATH)
# Segment texts may carry leading spaces, so strip each one before joining
transcription = " ".join(s.text.strip() for s in segments)
print("Candidate:", transcription)

FileNotFoundError: [Errno 2] No such file or directory: 'LJSpeech-1.1/wavs/LJ001-0001.wav'

**DEFINE STAR-STYLE HR PROMPT**

In [21]:
def build_star_prompt(candidate_input, job_title, stage="Situation"):
    return f"""
You are a professional HR interviewer conducting a structured interview using the STAR method.

The candidate is applying for the role of {job_title}.

Your task is to analyze the candidate's last response and provide a short follow-up question focusing on the '{stage}' part of STAR (Situation, Task, Action, Result).

Candidate said:
\"{candidate_input}\"

Respond in a natural, conversational tone.
"""
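
The prompt embeds the raw transcript, so a long or untidy answer goes straight into the model input. As far as I know, the tokenizer for this checkpoint defaults to a maximum length of 512 tokens, so it may be worth tidying and capping the transcript first. A rough sketch, assuming a word cap is an acceptable proxy for a token cap; `prepare_candidate_text` and its `max_words` default are hypothetical.

```python
def prepare_candidate_text(text, max_words=150):
    """Collapse runs of whitespace and keep at most `max_words` words."""
    words = text.split()
    if len(words) > max_words:
        # Mark the cut so the model sees that the answer was shortened
        words = words[:max_words] + ["..."]
    return " ".join(words)
```

The result could then be passed as `candidate_input` to `build_star_prompt`.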

**GENERATE HR RESPONSE + TTS OUTPUT**

In [22]:
def generate_hr_response(candidate_input, job_title, stage):
    # Build the STAR prompt and tokenize it for FLAN-T5
    prompt = build_star_prompt(candidate_input, job_title, stage)
    input_ids = t5_tokenizer(prompt, return_tensors="pt").input_ids
    # Generate a short follow-up question and decode it to plain text
    output = t5_model.generate(input_ids, max_new_tokens=80)
    response = t5_tokenizer.decode(output[0], skip_special_tokens=True)
    return response

def speak_text(text, filename="ai_response.wav"):
    # Synthesize the reply to a WAV file and return a playable audio widget
    tts.tts_to_file(text=text, file_path=filename)
    return Audio(filename)
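
A small model can return an empty or whitespace-only string, and passing that to a TTS engine is at best silent and may raise an error. A minimal sketch of a fallback, assuming a fixed question per STAR stage is acceptable; `DEFAULT_QUESTIONS` and `safe_reply` are hypothetical names and the wording of the questions is illustrative only.

```python
# Illustrative fallback questions, one per STAR stage (not from the notebook)
DEFAULT_QUESTIONS = {
    "Situation": "Can you describe the situation you were in?",
    "Task": "What was your specific responsibility?",
    "Action": "What actions did you take?",
    "Result": "What was the outcome?",
}

def safe_reply(response, stage):
    """Return the model's reply, or a stock question if the reply is blank."""
    text = response.strip()
    if text:
        return text
    return DEFAULT_QUESTIONS.get(stage, "Could you tell me more about that?")
```

One way to use it is `speak_text(safe_reply(hr_response, stage))`.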

**SINGLE INTERVIEW TURN FUNCTION**

In [23]:
def interview_turn(audio_path, job_title, stage="Result"):
    # Transcribe the candidate's audio; strip each segment before joining
    segments, _ = whisper_model.transcribe(audio_path)
    candidate_text = " ".join(s.text.strip() for s in segments)

    # Generate the interviewer's follow-up and synthesize it as speech
    hr_response = generate_hr_response(candidate_text, job_title, stage)
    audio_response = speak_text(hr_response)

    return candidate_text, hr_response, audio_response

**TEST WITH ONE INTERVIEW TURN**

In [24]:
candidate_text, hr_reply, audio_out = interview_turn(AUDIO_PATH, job_title="Data Analyst", stage="Action")
print("Candidate:", candidate_text)
print("HR AI:", hr_reply)
audio_out

FileNotFoundError: [Errno 2] No such file or directory: 'LJSpeech-1.1/wavs/LJ001-0001.wav'

**FULL STAR ROUND**

In [25]:
star_stages = ["Situation", "Task", "Action", "Result"]
conversation_log = []

for stage in star_stages:
    print(f"\n🔄 STAR Stage: {stage}")
    candidate_text, hr_reply, audio_out = interview_turn(AUDIO_PATH, "Project Manager", stage)
    conversation_log.append({
        "stage": stage,
        "candidate": candidate_text,
        "hr_reply": hr_reply
    })
    display(audio_out)


🔄 STAR Stage: Situation


FileNotFoundError: [Errno 2] No such file or directory: 'LJSpeech-1.1/wavs/LJ001-0001.wav'
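
The loop above passes the same `AUDIO_PATH` to every stage and stops at the first missing file. A sketch of a variant that takes one recording per stage and skips stages with no audio. The turn function is passed in as `turn_fn` so the sketch can run without the models loaded; `run_star_round` and the stage-to-path mapping are assumptions, not part of the notebook.

```python
import os

def run_star_round(stage_to_audio, job_title, turn_fn):
    """Run one turn per STAR stage, skipping stages whose audio is missing.

    `turn_fn(audio_path, job_title, stage)` must return a tuple
    (candidate_text, hr_reply, audio_response), like `interview_turn`.
    """
    log = []
    for stage, path in stage_to_audio.items():
        if not os.path.isfile(path):
            print(f"Skipping {stage}: no audio at {path}")
            continue
        candidate_text, hr_reply, _ = turn_fn(path, job_title, stage)
        log.append({"stage": stage, "candidate": candidate_text, "hr_reply": hr_reply})
    return log
```

In the notebook this might be called as `run_star_round(paths, "Project Manager", interview_turn)`, where `paths` maps each stage name to a WAV file you have uploaded.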

**SAVE LOG for HR Analysis Engine**

In [None]:
import json
with open("interview_log.json", "w") as f:
    json.dump(conversation_log, f, indent=2)

from google.colab import files
files.download("interview_log.json")
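
To confirm the saved log is usable downstream, it can be read back with the standard library alone. A minimal sketch, assuming the log keeps the `stage`, `candidate`, and `hr_reply` keys written above; `load_log` and `stages_covered` are hypothetical helper names.

```python
import json

def load_log(path="interview_log.json"):
    """Read the interview log back as a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def stages_covered(log):
    """List the STAR stages that made it into the log, in order."""
    return [entry["stage"] for entry in log]
```

Comparing `stages_covered(load_log())` with `star_stages` is a quick way to spot a round that stopped early.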