<a href="https://colab.research.google.com/github/HarshitPanday/Call-Quality-Analyze/blob/main/Untitled3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [22]:
# 1️⃣ Install Required Libraries
!pip install -q yt-dlp
!pip install -q git+https://github.com/openai/whisper.git
!pip install -q textblob
!pip install -q pydub
!pip install -q noisereduce
!pip install -q librosa soundfile  # imported in the noise-reduction step below
!apt install -y ffmpeg

print("All libraries installed successfully!")


  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.
All libraries installed successfully!


In [23]:
# 2️⃣ Download Audio from YouTube using yt-dlp
# Quote the URL so the shell does not try to interpret the '?' in it
!yt-dlp -x --audio-format wav -o "call_audio.%(ext)s" "https://www.youtube.com/watch?v=4ostqJD3Psc"
print("Audio downloaded as call_audio.wav")


[youtube] Extracting URL: https://www.youtube.com/watch?v=4ostqJD3Psc
[youtube] 4ostqJD3Psc: Downloading webpage
[youtube] 4ostqJD3Psc: Downloading tv simply player API JSON
[youtube] 4ostqJD3Psc: Downloading tv client config
[youtube] 4ostqJD3Psc: Downloading tv player API JSON
[info] 4ostqJD3Psc: Downloading 1 format(s): 251
[download] call_audio.wav has already been downloaded
[ExtractAudio] Destination: call_audio.wav
Deleting original file call_audio.orig.wav (pass -k to keep)
Audio downloaded as call_audio.wav


In [24]:
# 3️⃣ Optional: Preprocess Audio (Noise Reduction)
import librosa
import noisereduce as nr
import soundfile as sf

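# sr=None keeps the file's native sample rate; librosa returns mono audio by default
y, sr = librosa.load("call_audio.wav", sr=None)
# Estimate and suppress background noise with noisereduce's default settings
reduced_noise = nr.reduce_noise(y=y, sr=sr)
# Write the cleaned signal at the same sample rate
sf.write("call_audio_denoised.wav", reduced_noise, sr)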
print("Noise reduction completed, file saved as call_audio_denoised.wav")


Noise reduction completed, file saved as call_audio_denoised.wav


In [25]:
# 4️⃣ Speech-to-Text using Whisper
import whisper

model = whisper.load_model("base")
result = model.transcribe("call_audio_denoised.wav")
transcript = result['text']
print("Transcription:\n", transcript)


100%|███████████████████████████████████████| 139M/139M [00:02<00:00, 51.8MiB/s]


Transcription:
  Thank you for calling Nissan. My name is Lauren. Can I have your name? How many is John Smith? Thank you John. How can I help you? I was just calling about to see how much you would cost to update the map in my car. I'd have to help you with that today. Did you receive a mail from us? I did. Do you need the customer number? Yes please. Okay. It's 152430. Thank you and the year making model of your vehicle. Yeah I have a 2009 Nissan Altima. Oh nice car. Yeah thank you. We really enjoy it. Okay I think I've got your profile here. Can I have to verify your address and phone number please? Yes. It's 1255 North Research Way that's an ORM Utah 84097. And my phone number is A-01-431-1000. Thanks John. I located your information. The newest person we have available for your vehicle is version 7.7 which was released in March 2012. The price of the new map is $99.00. Push the tax. Let's go ahead and set up the order for you. Well, in the waitress a second I'm not really sure if 

In [26]:
# 5️⃣ Estimate Talk-Time Ratio (Approx using Whisper segments)
# Whisper returns timestamped segments but does not identify speakers.
# As a crude heuristic, assume the speakers alternate on every segment:
# even-indexed segments go to Speaker 0, odd-indexed segments to Speaker 1.

segments = result.get('segments', [])
durations = [seg['end'] - seg['start'] for seg in segments]
speaker0_duration = sum(durations[::2])
speaker1_duration = sum(durations[1::2])

# Fall back to 1 to avoid dividing by zero when no segments were returned
total_duration = sum(durations) or 1

print("Approx Talk-time Ratio:")
print(f"Speaker 0: {speaker0_duration/total_duration*100:.2f}%")
print(f"Speaker 1: {speaker1_duration/total_duration*100:.2f}%")


Approx Talk-time Ratio:
Speaker 0: 61.98%
Speaker 1: 38.02%
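
The alternating-segment heuristic above can be wrapped in a small function so that it is easy to check on known input. This is a minimal sketch: the name `talk_time_ratio` and the example segments are hypothetical and not taken from the call, and the speaker split is still an assumption rather than real speaker detection.

```python
def talk_time_ratio(segments):
    """Return (speaker0_pct, speaker1_pct), assuming speakers alternate per segment."""
    durations = [seg["end"] - seg["start"] for seg in segments]
    total = sum(durations)
    if total == 0:
        # No speech detected: report 0% for both speakers
        return 0.0, 0.0
    speaker0 = sum(durations[::2])   # even-indexed segments
    speaker1 = sum(durations[1::2])  # odd-indexed segments
    return 100 * speaker0 / total, 100 * speaker1 / total


# Hypothetical segments, for illustration only
example = [
    {"start": 0.0, "end": 4.0},
    {"start": 4.0, "end": 6.0},
    {"start": 6.0, "end": 10.0},
]
print(talk_time_ratio(example))
```

Because Whisper segments often contain more than one speaker turn, these percentages should be read as a rough indication only.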


In [27]:
# 6️⃣ Count Questions
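# Caveat: splitting on '.' only catches a question when '?' is the last
# character of a '.'-delimited chunk, so questions followed by another
# sentence in the same chunk are missed and the count is an underestimate.
questions = [s for s in transcript.split('.') if s.strip().endswith('?')]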
num_questions = len(questions)
print(f"Number of questions asked: {num_questions}")


Number of questions asked: 1
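
A count of 1 looks low for this transcript, which contains several sentences ending in a question mark. One possible alternative, sketched here under the assumption that sentences are separated by whitespace after `.`, `?` or `!`, splits on all three punctuation marks before checking for `?`. The function name `count_questions` and the sample text are hypothetical.

```python
import re


def count_questions(text):
    """Count sentences that end with a question mark."""
    # Split on whitespace that follows '.', '?' or '!', keeping the punctuation
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    return sum(1 for sentence in sentences if sentence.endswith("?"))


# Hypothetical sample, for illustration only
sample = "Thank you for calling. Can I have your name? It's John. How can I help you?"
print(count_questions(sample))
```

Decimal values such as `7.7` are not split, because the period there is not followed by whitespace.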


In [28]:
# 7️⃣ Longest Monologue (Approx using longest segment)
longest_monologue = max([seg['end'] - seg['start'] for seg in segments], default=0)
print(f"Longest monologue duration: {longest_monologue:.2f} seconds")


Longest monologue duration: 9.36 seconds


In [29]:
# 8️⃣ Sentiment Analysis
from textblob import TextBlob

blob = TextBlob(transcript)
sentiment_score = blob.sentiment.polarity

if sentiment_score > 0:
    sentiment_label = "Positive"
elif sentiment_score < 0:
    sentiment_label = "Negative"
else:
    sentiment_label = "Neutral"

print(f"Call Sentiment: {sentiment_label}")


Call Sentiment: Positive
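
TextBlob polarity is a float between -1 and 1, so a score of exactly 0 is rare and the "Neutral" branch above is seldom reached. A small dead band around zero is one way to handle near-zero scores. This is a sketch only: the function name `label_sentiment` and the threshold of 0.05 are assumptions, not values from the original analysis.

```python
def label_sentiment(polarity, neutral_band=0.05):
    """Map a polarity score in [-1, 1] to a label, treating small values as neutral."""
    if polarity > neutral_band:
        return "Positive"
    if polarity < -neutral_band:
        return "Negative"
    return "Neutral"


# Example scores, for illustration only
for score in (0.30, 0.02, -0.40):
    print(score, label_sentiment(score))
```

The band width is a tuning choice; a wider band labels more calls as neutral.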


In [30]:
# 9️⃣ Actionable Insight
if sentiment_label == "Negative":
    insight = "Improve call tone and engagement."
elif num_questions < 5:
    insight = "Ask more questions to engage the customer."
else:
    insight = "Call is well-balanced and engaging."

print(f"Actionable Insight: {insight}")


Actionable Insight: Ask more questions to engage the customer.
