# ITI108 Assignment 1 - Nursharinah/6422706H - Student Answer Template

## **Assignment Overview**

You are tasked with developing an AI solution that audits customer service calls by analyzing audio files against specific criteria related to customer service performance. The project integrates speech recognition with an LLM to assess the quality of interactions between agents and customers. The final product should provide insights into key elements of customer service and generate a detailed audit report.

You will be provided with the audio files and the text files containing the correct transcriptions.

Read the separate document for the assignment details.

Please ensure all results are printed in the cells for marking purposes.

In [None]:
# Code for your solution
# Provide comments so the code is easier to understand when marking

# Audio Transcription - Groq (Whisper Speech-To-Text)

## Install Libraries

In [1]:

!pip install python-dotenv
!pip install gdown
!pip install groq




## Download the Groq API key from Google Drive

In [2]:

import gdown

# URL to download the .txt file containing the API key
file_id = '1GjVKNAQo1KmrZL-qDBVgNnS3rp6Btdn5'
url = f'https://drive.google.com/uc?id={file_id}'

# Specify the path where you want to save the file locally
output_path = 'groqAPIkey.txt'

# Download the file
gdown.download(url, output_path, quiet=False)


Downloading...
From: https://drive.google.com/uc?id=1GjVKNAQo1KmrZL-qDBVgNnS3rp6Btdn5
To: /content/groqAPIkey.txt
100%|██████████| 56.0/56.0 [00:00<00:00, 76.7kB/s]


'groqAPIkey.txt'

## Test the general-purpose LLM (for text-based tasks) with the following question


*   Which is the largest country by area in the world?

In [3]:

# Read the API key from the downloaded file
with open(output_path, 'r') as file:
    api_key = file.read().strip()

# Now use the API key to make requests
print(f"Your API Key: {api_key[:10]}...") # only show first 10 characters for security, to make sure key was properly loaded


Your API Key: gsk_nRqo1i...


In [4]:

import os
from groq import Groq

# Load the Groq API Key from the file
with open("groqAPIkey.txt", "r") as f:
    GROQ_API_KEY = f.read().strip()

# Initialize the Groq Client
client = Groq(api_key=GROQ_API_KEY)

# Select a Groq-compatible model (e.g., "llama3-8b" or "gemma-7b")
model = "llama3-8b-8192"
#model = "mixtral-8x7b-32768"

def get_completion_from_messages(messages, model=model, temperature=0):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,  # Controls randomness
    )
    return response.choices[0].message.content

# Example query
print(get_completion_from_messages([{'role': 'user', 'content': 'Which is the largest country by area in the world?'}]))


The largest country by area in the world is Russia, which covers an area of approximately 17.1 million square kilometers (6.6 million square miles).


## Create function for Extracted audio transcription (Whisper Speech To Text)

In [5]:

from groq import Groq

# Load the Groq API key from the file
with open("groqAPIkey.txt", "r") as f:
    GROQ_API_KEY = f.read().strip()

# Initialize the Groq client
client = Groq(api_key=GROQ_API_KEY)

# Whisper model for speech-to-text
# (kept in its own variable so it does not overwrite the chat `model` defined earlier)
whisper_model = "whisper-large-v3"

def transcribe_audio(audio_file, model=whisper_model, prompt="", temperature=0):
    with open(audio_file, "rb") as audio:
        response = client.audio.transcriptions.create(
            model=model,
            file=audio,
            prompt=prompt,
            temperature=temperature
        )
    return response.text  # Extract transcribed text


## Create function for Ground-truth audio transcription (WER - Original & Normalized)

In [6]:

# Install JiWER
!pip install jiwer




In [7]:

# WER calculation - Original
from jiwer import wer

def calculate_wer_Ori(ground_truth_file, predicted_text):
    """Compute Word Error Rate (WER) using JiWER."""
    with open(ground_truth_file, "r", encoding="utf-8") as f:
        reference_text = f.read().strip()  # Read ground truth transcription

    error_rate_Ori = wer(reference_text, predicted_text)
    return error_rate_Ori


In [8]:

# WER calculation - with Preprocessing (to remove punctuation/case sensitivity)
# Preprocessing by Normalization
from jiwer import wer, Compose, RemovePunctuation, ToLowerCase, RemoveWhiteSpace

# Define a transformation pipeline for normalization
transform = Compose([
    RemovePunctuation(),
    ToLowerCase(),
    RemoveWhiteSpace(replace_by_space=True),
])

def calculate_wer_Norm(ground_truth_file, predicted_text):
    """Compute Word Error Rate (WER) with normalization."""
    with open(ground_truth_file, "r", encoding="utf-8") as f:
        reference_text = f.read().strip()

    # Apply normalization to both reference and hypothesis
    reference_text_Norm = transform(reference_text)
    predicted_text_Norm = transform(predicted_text)

    error_rate_Norm = wer(reference_text_Norm, predicted_text_Norm)
    return error_rate_Norm
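
To make the metric concrete, the following is a minimal pure-Python sketch of what WER measures: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is an illustration only, not jiwer's implementation; `word_error_rate`, `normalize`, and the two sample sentences are hypothetical and not part of the assignment data.

```python
import string


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (all but the current one) and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def normalize(text: str) -> str:
    """Lowercase and drop ASCII punctuation (a simplified stand-in for the jiwer transform)."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


# Hypothetical example: the words match, only case and punctuation differ
reference = "Thank you for calling"
hypothesis = "thank you for calling."

raw_wer = word_error_rate(reference, hypothesis)
normalized_wer = word_error_rate(normalize(reference), normalize(hypothesis))
print("Raw WER:", raw_wer)
print("Normalized WER:", normalized_wer)
```

In this example two of the four reference words differ only by case or a trailing full stop, so the raw score counts them as substitutions while the normalized score does not. This is the same effect that separates the "Original" and "Normalized" figures reported for each recording.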


# Audit Transcription - AzureOpenAI (LLM Prompt Engineering)

## Install Libraries

In [12]:

!pip install langchain openai tiktoken chromadb python-dotenv langchain_community
!pip install -U langchain_openai
!pip install docarray
#!pip install python-dotenv
#!pip install gdown




## Download the Azure OpenAI's .env into the colab virtual drive

In [13]:

#import gdown
url = 'https://drive.google.com/file/d/1U43HPiy3dOLAZNZcw6TNxWmVw4MwM_4F/view?usp=drive_link'
output_path = '.env'
gdown.download(url, output_path, quiet=False,fuzzy=True)


Downloading...
From: https://drive.google.com/uc?id=1U43HPiy3dOLAZNZcw6TNxWmVw4MwM_4F
To: /content/.env
100%|██████████| 222/222 [00:00<00:00, 631kB/s]


'.env'

## Load the .env to extract the API key, endpoint, deployment name, API version.

In [14]:

#import os
from dotenv import load_dotenv

# Load .env file to environment
load_dotenv()

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
print(f"{OPENAI_API_KEY[:10]}...") # only show first 10 characters for security, to make sure key was properly loaded
AZURE_ENDPOINT = os.getenv('AZURE_ENDPOINT')
print(AZURE_ENDPOINT)
DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME')
print(DEPLOYMENT_NAME)
OPENAI_API_VERSION = os.getenv('OPENAI_API_VERSION')
print(OPENAI_API_VERSION)


1ewvrdGS12...
https://nypopenai2.openai.azure.com/
gpt-4o-global
2024-07-01-preview


## Test the general-purpose LLM

In [15]:

from langchain_openai import AzureChatOpenAI
from langchain_core.messages import HumanMessage

# Create the Azure OpenAI chat model from the values loaded from .env
llm = AzureChatOpenAI(
    azure_deployment=DEPLOYMENT_NAME,
    api_version=OPENAI_API_VERSION,
    api_key=OPENAI_API_KEY,
    azure_endpoint=AZURE_ENDPOINT,
    temperature=0.9,
)

msg = HumanMessage(content="Explain step by step. How old is the president of USA?")
print(llm.invoke([msg]))  # prints the full message object, not just the text

print(llm.invoke([{'role':'user', 'content':'Which is the largest country by area in the world?'}]).content)


content="To determine the current age of the President of the United States, follow these steps:\n\n1. **Identify the current President**: As of 2023, the President of the United States is Joe Biden.\n\n2. **Find the birthdate of the President**: Joe Biden was born on November 20, 1942.\n\n3. **Calculate the age**: \n   - Identify the current year: 2023.\n   - Calculate the difference between the current year and the birth year: 2023 - 1942 = 81 years.\n   - Check if the birthdate has occurred this year by comparing the current date to the birth date (November 20):\n     - If today’s date is before November 20, Biden is still 80.\n     - If today’s date is on or after November 20, Biden has turned 81.\n\nLet's assume today's date is October 1, 2023:\n- Since today's date is before November 20, Joe Biden would still be 80 years old.\n\nIf the date is after November 20, such as December 1, 2023:\n- Joe Biden would be 81 years old.\n\nThus, as of October 2023, Joe Biden is 80 years old. H

## Define the Audit Prompt Template

In [16]:

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define a prompt template for auditing customer service conversations
audit_prompt = PromptTemplate(
    input_variables=["transcription"],
    template=(
        "You are an AI auditor analyzing a conversation between a customer service agent and a customer. "
        "Your task is to label the speakers, check compliance with 9 audit criteria, and format the audit report in JSON.\n\n"
        "### Conversation:\n{transcription}\n\n"
        "### Audit Criteria:\n"
        "1. **Introduction** – Did the agent introduce themselves before the conversation?\n"
        "2. **Acquire Customer Information** – Did the agent gather customer details before engaging?\n"
        "3. **Politeness and Respect** – Was the agent polite and respectful during the interaction?\n"
        "4. **Empathy and Understanding** – Did the agent demonstrate empathy while addressing inquiries?\n"
        "5. **Gratitude** – Did the agent express gratitude when the customer showed interest?\n"
        "6. **Provide Conclusion from Customer Request** – Did the agent summarize the customer's request?\n"
        "7. **Clarifying Questions** – Did the agent ask clarifying questions if the customer was unclear?\n"
        "8. **Clarity of Language** – Was the agent’s language clear, concise, and easy to understand?\n"
        "9. **Relevance of Information** – Did the agent provide relevant information to the customer’s request?\n\n"
        "### Audit Output Format (JSON Example):\n"
        # Doubled braces render as one literal brace; quadrupled braces would render as "{{"
        "{{\n"
        '  "audit_results": [\n'
        '    {{"criteria": "Introduction", "audit_reason": "The agent introduced themselves at the start.", "result": "Pass"}},\n'
        '    {{"criteria": "Politeness and Respect", "audit_reason": "The agent was polite and used professional language.", "result": "Pass"}}\n'
        "  ]\n"
        "}}\n\n"
        "### Task Instructions:\n"
        "1. **Label the speakers** as 'Agent:' and 'Customer:'.\n"
        "2. **Check each criterion** and provide an explanation (Audit Reason) for the decision.\n"
        "3. **Format the final audit results as JSON**.\n\n"
        "Now, generate the structured audit report in JSON format."
    ),
)
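
As far as I know, `PromptTemplate` uses `str.format`-style substitution by default, so any literal brace in the JSON example has to be doubled. The sketch below uses plain `str.format` (an assumption standing in for the template engine) with a shortened, hypothetical template to show how the escaping renders.

```python
import json

# Shortened, hypothetical template: one placeholder plus a JSON example
template = (
    "### Conversation:\n{transcription}\n\n"
    "### Output format:\n"
    '{{"audit_results": [{{"criteria": "Introduction", "result": "Pass"}}]}}'
)

rendered = template.format(transcription="Agent: Hello. Customer: Hi.")
print(rendered)

# The last rendered line is valid JSON because each doubled brace became a single brace
example = json.loads(rendered.splitlines()[-1])
print(example["audit_results"][0]["criteria"])

# Escaping twice leaves a doubled brace in the rendered prompt
over_escaped = "{{{{".format()
print(repr(over_escaped))
```

The rule of thumb is one extra brace per literal brace: `{{` and `}}` in the template become `{` and `}` in the prompt the model actually sees.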


In [17]:

# Create a langchain pipeline to process the audit prompt using AzureChatOpenAI
audit_chain = audit_prompt | llm | StrOutputParser()
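
Because the chain ends with `StrOutputParser`, it returns plain text, and chat models often wrap JSON in a Markdown code fence or add a sentence around it. The helper below is a minimal sketch of one way to pull the JSON object out of such a reply; `extract_json` is a hypothetical name, and the sample replies are made up for illustration.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built here to keep this example readable


def extract_json(reply: str) -> dict:
    """Parse the JSON object contained in an LLM reply.

    Handles a reply wrapped in a Markdown fence (with or without a "json" tag)
    and a reply where the JSON is surrounded by extra prose.
    """
    text = reply.strip()
    fenced = re.search(
        re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE),
        text,
        flags=re.DOTALL,
    )
    if fenced:
        text = fenced.group(1).strip()
    else:
        # Fall back to the outermost pair of braces
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            text = text[start:end + 1]
    return json.loads(text)  # raises json.JSONDecodeError if no valid JSON is found


# Made-up replies for illustration
fenced_reply = FENCE + 'json\n{"audit_results": [{"criteria": "Introduction", "result": "Pass"}]}\n' + FENCE
prose_reply = 'Here is the report: {"audit_results": []} Let me know if you need more.'

print(extract_json(fenced_reply))
print(extract_json(prose_reply))
```

Note that `str.strip(chars)` treats its argument as a set of characters rather than an exact prefix or suffix, which is why this sketch searches for the fence explicitly.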


# Test with each of the given audio files

**Ensure you print out the results in each cell for marking**
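
The six sections below repeat the same transcribe-then-score steps for each recording. As an optional alternative, the loop could be sketched as follows. `transcribe_and_score` is a hypothetical helper, and it assumes each ground-truth file sits next to its audio file and shares its name apart from the `.txt` extension.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def transcribe_and_score(
    audio_files: Iterable[str],
    transcribe: Callable[[str], str],
    score: Callable[[str, str], float],
) -> Dict[str, dict]:
    """Transcribe each audio file and score it against its ground-truth file.

    transcribe: takes an audio path and returns the predicted text.
    score: takes (ground_truth_file, predicted_text) and returns a WER value.
    """
    results = {}
    for audio_file in audio_files:
        # Assumed naming convention: Local-Plumber.mp3 -> Local-Plumber.txt
        reference_file = str(Path(audio_file).with_suffix(".txt"))
        text = transcribe(audio_file)
        results[audio_file] = {
            "transcript": text,
            "reference_file": reference_file,
            "wer": score(reference_file, text),
        }
    return results


# Stand-in functions so the sketch runs without audio files or API access
demo = transcribe_and_score(
    ["Local-Plumber.mp3"],
    transcribe=lambda path: "placeholder transcript",
    score=lambda reference_file, text: 0.0,
)
print(demo)
```

In this notebook the real call would pass `transcribe_audio` and one of the `calculate_wer_Ori` / `calculate_wer_Norm` functions in place of the stand-ins.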

## 1. Custom-Home-Builder.mp3

1. Input audio file to generate the transcript - print the transcript

In [10]:

# Example Usage
audio_file1 = "Custom-Home-Builder.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription1 = transcribe_audio(audio_file1)

# Print Result
print("Predicted Transcription:", predicted_transcription1)


Predicted Transcription:  Call is now being recorded. Good afternoon, Elkins Builders. Yeah, hi. I'm calling to speak to someone about building a house on a property I'm looking to purchase. Oh, okay, great. Let me get your name. What's your first name, please? Kenny. And your last name? Lindstrom. It's L-I-N-D-S-T-R-O-M. Thank you. And may I have your callback number? It's 610- That's 610-265-1715. That's 610-265-1715? Yes. And where is the property that you're looking for an estimate on? It's in Westchester. I haven't purchased the land yet. I'd like to see if I could get an estimate or have them take a look at it before I do. Okay, no problem. Is there a good time to reach you with this number, or is that an anytime? time. That's my cell phone. If they could call me back today, that would be great. Okay, no problem. I'll pass your message along and somebody should be getting back to you this afternoon. Great. Thank you so much. You're welcome. And thank you for calling Elkins Builde

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [11]:

# Example Usage
ground_truth_file1 = "Custom-Home-Builder.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score1 = calculate_wer_Ori(ground_truth_file1, predicted_transcription1)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score1)



Word Error Rate (WER) - Original: 0.05202312138728324


In [12]:

# Example Usage
ground_truth_file1 = "Custom-Home-Builder.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score1 = calculate_wer_Norm(ground_truth_file1, predicted_transcription1)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score1)



Word Error Rate (WER) - Normalized: 0.03468208092485549


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [19]:

# Run the Audit on predicted_transcription1
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json1 = audit_chain.invoke({"transcription": predicted_transcription1})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json1)


4. Export the report - print the result

In [20]:

#import json

# Raw output from the LLM
raw_output1 = audit_result_json1

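# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json1 = raw_output1.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()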

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json1)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json1)  # Print the raw output for debugging


{
    "audit_results": [
        {
            "criteria": "Introduction",
            "audit_reason": "The agent did not introduce themselves at the start of the call.",
            "result": "Fail"
        },
        {
            "criteria": "Acquire Customer Information",
            "audit_reason": "The agent gathered the customer's name, phone number, and property location before engaging.",
            "result": "Pass"
        },
        {
            "criteria": "Politeness and Respect",
            "audit_reason": "The agent used polite language throughout the conversation.",
            "result": "Pass"
        },
        {
            "criteria": "Empathy and Understanding",
            "audit_reason": "The agent acknowledged the customer's request and demonstrated understanding by ensuring follow-up.",
            "result": "Pass"
        },
        {
            "criteria": "Gratitude",
            "audit_reason": "The agent thanked the customer at the end of the call.",
 

## 2. Inbound-sales-audio-sample.mp3

1. Input audio file to generate the transcript - print the transcript

In [9]:

# Example Usage
audio_file2 = "Inbound-sales-audio-sample.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription2 = transcribe_audio(audio_file2)

# Print Result
print("Predicted Transcription:", predicted_transcription2)


Predicted Transcription:  Thank you for calling Brentburg. This is Jessica. How may I help you? Hi, Jessica. My name is John and I'm from Sydney, Australia, and I run a tech marketplace business, a startup, and I'm run off my feet at the moment. I'm looking for someone to help me get a virtual assistant. Is that something you can help me with? Yes, definitely, John. Thank you so much for that information. May I know what are the different tasks this virtual assistant would be doing? Yeah, I just really need basic administrative work. I need someone to do my email management. I need someone to manage my calendar, do some scheduling for me, maybe book some travel if I need to from time to time, some data entry, some pretty basic stuff just to help me with work that I shouldn't be focused on while I'm trying to launch this new tech company. Yes, definitely. I can help you out with that. Well, you said that what you're looking for is someone who can do email management, calendar management

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [10]:

# Example Usage
ground_truth_file2 = "Inbound-sales-audio-sample.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score2 = calculate_wer_Ori(ground_truth_file2, predicted_transcription2)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score2)



Word Error Rate (WER) - Original: 0.05514705882352941


In [11]:

# Example Usage
ground_truth_file2 = "Inbound-sales-audio-sample.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score2 = calculate_wer_Norm(ground_truth_file2, predicted_transcription2)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score2)



Word Error Rate (WER) - Normalized: 0.014678899082568808


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [18]:

# Run the Audit on predicted_transcription2
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json2 = audit_chain.invoke({"transcription": predicted_transcription2})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json2)


4. Export the report - print the result

In [19]:

#import json

# Raw output from the LLM
raw_output2 = audit_result_json2

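# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json2 = raw_output2.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()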

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json2)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json2)  # Print the raw output for debugging


{
    "audit_results": [
        {
            "criteria": "Introduction",
            "audit_reason": "The agent introduced themselves at the start.",
            "result": "Pass"
        },
        {
            "criteria": "Acquire Customer Information",
            "audit_reason": "The agent gathered customer details by asking about the tasks the virtual assistant would be doing.",
            "result": "Pass"
        },
        {
            "criteria": "Politeness and Respect",
            "audit_reason": "The agent was polite and used professional language throughout the conversation.",
            "result": "Pass"
        },
        {
            "criteria": "Empathy and Understanding",
            "audit_reason": "The agent demonstrated understanding and attentiveness to the customer's needs and concerns.",
            "result": "Pass"
        },
        {
            "criteria": "Gratitude",
            "audit_reason": "The agent expressed gratitude when the customer shared i

## 3. Local-Plumber.mp3

1. Input audio file to generate the transcript - print the transcript

In [10]:

# Example Usage
audio_file3 = "Local-Plumber.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription3 = transcribe_audio(audio_file3)

# Print Result
print("Predicted Transcription:", predicted_transcription3)


Predicted Transcription:  Call is now being recorded. ABC Plumbing and Heating, this is Betty. Hi Betty, I'm having a problem with my sewer drain. Oh, I'm so sorry to hear that, sir. Would you like me to get a hold of the plumber for you? Do you know how much it's going to cost? Unfortunately, I wouldn't be able to quote prices, but I can get a hold of someone who would be able to give you a better idea. I'm getting a backup throughout the entire house. Are you a client of ABC? I did use them before to put in my garbage disposal, but nothing major like this. Okay. Let me get your name and number, and I can patch you right through to the plumber. Your name, sir? Mike Barry. Would you spell your last name for me, please? That's B-A-R-R-Y. Okay. And your callback number? 610-265-1714. Okay, that's 610-265-1714. Yes, will he call me right back? Actually, Mr. Berry, I'm going to call him right now and patch you directly through to the plumber. Would you stay on the line for a moment? Oh, su

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [11]:

# Example Usage
ground_truth_file3 = "Local-Plumber.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score3 = calculate_wer_Ori(ground_truth_file3, predicted_transcription3)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score3)



Word Error Rate (WER) - Original: 0.05181347150259067


In [12]:

# Example Usage
ground_truth_file3 = "Local-Plumber.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score3 = calculate_wer_Norm(ground_truth_file3, predicted_transcription3)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score3)



Word Error Rate (WER) - Normalized: 0.0


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [19]:

# Run the Audit on predicted_transcription3
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json3 = audit_chain.invoke({"transcription": predicted_transcription3})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json3)


4. Export the report - print the result

In [20]:

#import json

# Raw output from the LLM
raw_output3 = audit_result_json3

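# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json3 = raw_output3.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()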

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json3)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json3)  # Print the raw output for debugging


{
    "conversation": [
        {
            "Agent": "Call is now being recorded. ABC Plumbing and Heating, this is Betty."
        },
        {
            "Customer": "Hi Betty, I'm having a problem with my sewer drain."
        },
        {
            "Agent": "Oh, I'm so sorry to hear that, sir. Would you like me to get a hold of the plumber for you?"
        },
        {
            "Customer": "Do you know how much it's going to cost?"
        },
        {
            "Agent": "Unfortunately, I wouldn't be able to quote prices, but I can get a hold of someone who would be able to give you a better idea."
        },
        {
            "Customer": "I'm getting a backup throughout the entire house."
        },
        {
            "Agent": "Are you a client of ABC?"
        },
        {
            "Customer": "I did use them before to put in my garbage disposal, but nothing major like this."
        },
        {
            "Agent": "Okay. Let me get your name and number, an

## 4. Property-Management-Office.mp3

1. Input audio file to generate the transcript - print the transcript

In [9]:

# Example Usage
audio_file4 = "Property-Management-Office.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription4 = transcribe_audio(audio_file4)

# Print Result
print("Predicted Transcription:", predicted_transcription4)


Predicted Transcription:  Call is now being recorded. Good evening, Kingswood Apartments. This is Alex. How may I help you? Hey, yeah, I'm in apartment 104 on the first floor. I'm calling to complain about my neighbors. Okay, what seems to be the problem, sir? It's very late. I just got my newborn baby to sleep, and they're being loud again. I brought this to their attention several times, but they never, you know, never stops. Okay, I'm very sorry about that. Just let me take down your contact information and I'll contact the landlord right away to get this all straightened out. Okay, my name is Jeff Matthews. Okay, Mr. Matthews, can you spell that for me? First name is Jeff, J-E-F-F. Last name is Matthews, M-A-T-T-H-E-W-S. Okay, Mr. Matthews, can I have the best number you can be reached at and also your apartment number again? Sure, it's 610-265-1714, and I'm in apartment 104 on the first floor. Okay, I have 610-265-1714 and apartment 104. Yes. All right, Mr. Matthews, I'm going to 

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [10]:

# Example Usage
ground_truth_file4 = "Property-Management-Office.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score4 = calculate_wer_Ori(ground_truth_file4, predicted_transcription4)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score4)



Word Error Rate (WER) - Original: 0.018957345971563982


In [11]:

# Example Usage
ground_truth_file4 = "Property-Management-Office.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score4 = calculate_wer_Norm(ground_truth_file4, predicted_transcription4)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score4)



Word Error Rate (WER) - Normalized: 0.014218009478672985


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [18]:

# Run the Audit on predicted_transcription4
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json4 = audit_chain.invoke({"transcription": predicted_transcription4})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json4)


4. Export the report - print the result

In [19]:

#import json

# Raw output from the LLM
raw_output4 = audit_result_json4

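# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json4 = raw_output4.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()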

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json4)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json4)  # Print the raw output for debugging


{
    "audit_results": [
        {
            "criteria": "Introduction",
            "audit_reason": "The agent introduced themselves at the start by giving their name and the company they represent.",
            "result": "Pass"
        },
        {
            "criteria": "Acquire Customer Information",
            "audit_reason": "The agent gathered the customer's name, phone number, and apartment number before addressing the complaint.",
            "result": "Pass"
        },
        {
            "criteria": "Politeness and Respect",
            "audit_reason": "The agent maintained a polite and respectful tone throughout the interaction.",
            "result": "Pass"
        },
        {
            "criteria": "Empathy and Understanding",
            "audit_reason": "The agent apologized for the inconvenience and demonstrated understanding of the customer's situation.",
            "result": "Pass"
        },
        {
            "criteria": "Gratitude",
            "audit

## 5. Real-State-Lead-Gen-1.mp3

1. Input audio file to generate the transcript - print the transcript

In [9]:

# Example Usage
audio_file5 = "Real-State-Lead-Gen-1.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription5 = transcribe_audio(audio_file5)

# Print Result
print("Predicted Transcription:", predicted_transcription5)


Predicted Transcription:  Hello, Eloise speaking. Hi, Eloise. This is Sophia. Good day. I'm phoning from the realestateleadgeneration.com.au. We just want to... Yes? Our custom lead generation packages are now live. We're getting real estate agent leads as we speak. Now, I know you're busy, Eloise, But if your team is already set up with a stable stream of incoming leads, property appraisals, property management, and general inquiries, that's fine. We'll leave you to it. But we've been working with real estate agencies for over 10 years, and there's always a common theme. Where is your next lead going to come from? So my call today is just to book a time with one of our real estate lead specialists. We're based in both Sydney and Melbourne. They'll quickly introduce the package to you. Just run through a few of the finer details, Eloise. It'll only take 10 to 20 minutes to give you the background of the package. to let you ask any questions you may have. Are you free next week, Wednesd

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [10]:

# Example Usage
ground_truth_file5 = "Real-State-Lead-Gen-1.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score5 = calculate_wer_Ori(ground_truth_file5, predicted_transcription5)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score5)



Word Error Rate (WER) - Original: 0.16121495327102803


In [12]:

# Example Usage
ground_truth_file5 = "Real-State-Lead-Gen-1.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score5 = calculate_wer_Norm(ground_truth_file5, predicted_transcription5)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score5)



Word Error Rate (WER) - Normalized: 0.05128205128205128


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [19]:

# Run the Audit on predicted_transcription5
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json5 = audit_chain.invoke({"transcription": predicted_transcription5})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json5)


4. Export the report - print the result

In [20]:

#import json

# Raw output from the LLM
raw_output5 = audit_result_json5

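# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json5 = raw_output5.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()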

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json5)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json5)  # Print the raw output for debugging


{
    "audit_results": [
        {
            "criteria": "Introduction",
            "audit_reason": "The agent introduced themselves at the start by saying 'Hello, Eloise speaking.'",
            "result": "Pass"
        },
        {
            "criteria": "Acquire Customer Information",
            "audit_reason": "The agent did not gather any customer details before engaging in the conversation. The customer initiated the identification.",
            "result": "Fail"
        },
        {
            "criteria": "Politeness and Respect",
            "audit_reason": "The agent was polite and addressed the customer by name. They also closed the conversation with a polite note.",
            "result": "Pass"
        },
        {
            "criteria": "Empathy and Understanding",
            "audit_reason": "The agent acknowledged the customer's busy schedule and showed understanding by offering alternative appointment times.",
            "result": "Pass"
        },
        {
    

## 6. Travel-Reservation.mp3

1. Input audio file to generate the transcript - print the transcript

In [9]:

# Example Usage
audio_file6 = "Travel-Reservation.mp3"  # Path to audio file

# Step 1: Get transcribed text
predicted_transcription6 = transcribe_audio(audio_file6)

# Print Result
print("Predicted Transcription:", predicted_transcription6)


Predicted Transcription:  Hi, thank you for calling Hotel California. This is Candice. How may I help you? Hi, Candice. This is Stephen. I'd like to book for four people, please. And that would be for August 18th. Sure, Stephen. I'd be happy to help you with that. So, which room do you have in mind? Actually, I got to tell you first that I am super busy. So I haven't been able to browse through your website. And I don't know, I don't really know the rooms you have right now. But I could give you my requirements and then you suggest the best rooms for me. Could we do that? Absolutely. What do you need? So there are four of us, including me. and we need something with a wi-fi and maybe a swimming pool got it will you four be staying in one room or separate rooms oh sorry i forgot to tell you um that would be separate because you see we two couples so we need two rooms I see So in that case I recommend two deluxe kings Each room has a king-size bed, which fits two people, a Wi-Fi, and acc

2. Word Error Rate (WER)

Use the jiwer module to compute the WER - print the WER result

In [10]:

# Example Usage
ground_truth_file6 = "Travel-Reservation.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Original
wer_score6 = calculate_wer_Ori(ground_truth_file6, predicted_transcription6)

# Print Result
print("\nWord Error Rate (WER) - Original:", wer_score6)



Word Error Rate (WER) - Original: 0.11403508771929824


In [11]:

# Example Usage
ground_truth_file6 = "Travel-Reservation.txt"  # Path to ground truth transcript

# Step 2: Compute WER - Normalized
wer_score6 = calculate_wer_Norm(ground_truth_file6, predicted_transcription6)

# Print Result
print("\nWord Error Rate (WER) - Normalized:", wer_score6)



Word Error Rate (WER) - Normalized: 0.029239766081871343


3. Use the speaker-labeled transcript to perform the audit against the criteria, then generate the report - print the result

In [18]:

# Run the Audit on predicted_transcription6
import json

# Use the predicted transcription from your speech-to-text model
audit_result_json6 = audit_chain.invoke({"transcription": predicted_transcription6})

#Check if output is empty (for debugging)
#print("Audit Result JSON Output:", audit_result_json6)


4. Export the report - print the result

In [19]:

#import json

# Raw output from the LLM
raw_output6 = audit_result_json6

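# Clean the output by removing a surrounding Markdown code fence, if present
# (removeprefix/removesuffix drop exact substrings; str.strip would treat its argument as a character set)
cleaned_json6 = raw_output6.strip().removeprefix("```json").removeprefix("```").removesuffix("```").strip()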

# Try parsing the cleaned JSON safely
try:
    audit_result = json.loads(cleaned_json6)
    print(json.dumps(audit_result, indent=4))  # Pretty print the output
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Raw Output:", cleaned_json6)  # Print the raw output for debugging


{
    "audit_results": [
        {
            "criteria": "Introduction",
            "audit_reason": "The agent introduced themselves at the start.",
            "result": "Pass"
        },
        {
            "criteria": "Acquire Customer Information",
            "audit_reason": "The agent gathered the customer's name and requirements before proceeding with the booking.",
            "result": "Pass"
        },
        {
            "criteria": "Politeness and Respect",
            "audit_reason": "The agent was polite and used professional language throughout the conversation.",
            "result": "Pass"
        },
        {
            "criteria": "Empathy and Understanding",
            "audit_reason": "The agent demonstrated understanding and was accommodating to the customer's busy schedule and specific needs.",
            "result": "Pass"
        },
        {
            "criteria": "Gratitude",
            "audit_reason": "The agent expressed gratitude at the start and