# Introduction to Spoken Language Processing with Python

## Introduction to audio data in Python

In [1]:
import wave

In [2]:
# Import audio file as wave object
good_morning = wave.open("good-morning.wav", "r")

In [3]:
# Convert wave object to bytes
soundwave_gm = good_morning.readframes(-1)

In [4]:
# View the wav file in byte form
print(soundwave_gm[:10])

b'\xfd\xff\xfb\xff\xf8\xff\xf8\xff\xf7\xff'


## Converting sound wave bytes to integers

In [5]:
import numpy as np

# Convert soundwave_gm from bytes to integers
signal_gm = np.frombuffer(soundwave_gm, dtype='int16')

# Show the first 10 items
signal_gm[:10]

array([ -3,  -5,  -8,  -8,  -9, -13,  -8, -10,  -9, -11], dtype=int16)

**frombuffer** interprets a buffer, here the raw bytes, as a 1-dimensional array of a specified data type. With **int16**, every two bytes become one signed 16-bit integer.
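To make the byte-to-integer step concrete, the sketch below decodes the first ten bytes printed earlier. It assumes the samples are little-endian signed 16-bit values (the byte order WAV files use) and cross-checks numpy against the standard-library **struct** module; the variable names are only for illustration.

```python
import struct

import numpy as np

# First ten bytes of the sound wave, copied from the printed output above
raw_bytes = b"\xfd\xff\xfb\xff\xf8\xff\xf8\xff\xf7\xff"

# "<i2" spells out little-endian signed 16-bit integers
samples = np.frombuffer(raw_bytes, dtype="<i2")

# Cross-check with the standard library: five little-endian shorts
expected = struct.unpack("<5h", raw_bytes)

print(samples.tolist())
print(list(expected))
```

Ten bytes yield five integers, which is why the integer array is half the length of the byte string.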

In [6]:
# Get the frame rate
framerate_gm = good_morning.getframerate()

# Show the frame rate
framerate_gm

48000

- **Frame rate (Hz)** = number of frames / duration of audio file (seconds)
- **Duration of audio file (seconds)** = number of frames / frame rate (Hz)

In [7]:
# Return evenly spaced values between start and stop
np.linspace(start=1, stop=10, num=10)

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [8]:
# Get the timestamps of the good morning sound wave
time_gm = np.linspace(start=0,
                      stop=len(soundwave_gm)/framerate_gm,
                      num=len(soundwave_gm))

In [9]:
# View first 10 time stamps of the good morning sound wave
time_gm[:10]

array([0.00000000e+00, 2.08333750e-05, 4.16667500e-05, 6.25001250e-05,
       8.33335000e-05, 1.04166875e-04, 1.25000250e-04, 1.45833625e-04,
       1.66667000e-04, 1.87500375e-04])

We can use numpy's **linspace** function to figure out the timestamp at which each sound wave value occurs. It takes _start_, _stop_, and _num_ as parameters and returns _num_ evenly spaced values between _start_ and _stop_, inclusive.
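One detail worth checking is that the time axis has one entry per frame, not one per byte, because a 16-bit stereo frame occupies four bytes. The sketch below is a hedged illustration of that idea. It writes its own short test tone so it needs no external file, assumes 16-bit samples, and uses helper names (`write_test_tone`, `load_signal_and_time`) that are made up for this example.

```python
import math
import os
import struct
import tempfile
import wave

import numpy as np


def write_test_tone(path, framerate=8000, seconds=1.0, n_channels=2):
    """Write a 16-bit sine tone so the example needs no external audio file."""
    n_frames = int(framerate * seconds)
    frames = bytearray()
    for i in range(n_frames):
        sample = int(10000 * math.sin(2 * math.pi * 440 * i / framerate))
        # One frame holds one sample per channel
        frames += struct.pack("<h", sample) * n_channels
    with wave.open(path, "wb") as wav_out:
        wav_out.setnchannels(n_channels)
        wav_out.setsampwidth(2)
        wav_out.setframerate(framerate)
        wav_out.writeframes(bytes(frames))


def load_signal_and_time(path):
    """Return (signal, time, framerate); signal has one row per frame."""
    with wave.open(path, "rb") as wav_in:
        framerate = wav_in.getframerate()
        n_channels = wav_in.getnchannels()
        n_frames = wav_in.getnframes()
        raw = wav_in.readframes(n_frames)
    # Assumes 16-bit samples; each row is one frame, each column one channel
    signal = np.frombuffer(raw, dtype="<i2").reshape(-1, n_channels)
    duration = n_frames / framerate
    # endpoint=False makes the spacing exactly 1 / framerate
    time = np.linspace(start=0, stop=duration, num=n_frames, endpoint=False)
    return signal, time, framerate


tone_path = os.path.join(tempfile.mkdtemp(), "tone.wav")
write_test_tone(tone_path)
signal, time, framerate = load_signal_and_time(tone_path)
print(signal.shape, time.shape, framerate)
```

Because `time` and each column of `signal` have the same length, they can be passed straight to a plotting call.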

## Visualizing sound waves

In [10]:
good_afternoon = wave.open("good-afternoon.wav", "r")
soundwave_ga = good_afternoon.readframes(-1)
signal_ga = np.frombuffer(soundwave_ga, dtype='int16')
framerate_ga = good_afternoon.getframerate()
time_ga = np.linspace(start=0,
                      stop=len(soundwave_ga)/framerate_ga,
                      num=len(soundwave_ga))

In [11]:
#import matplotlib.pyplot as plt
#%matplotlib inline
#
## Initialize figure and setup title
#plt.title("Good Afternoon vs. Good Morning")
#
## x and y axis labels
#plt.xlabel("Time (seconds)")
#plt.ylabel("Amplitude")
#
## Add good morning and good afternoon values
#plt.plot(time_ga, soundwave_ga, label="Good Afternoon")
#plt.plot(time_gm, soundwave_gm, label="Good Morning",
#         alpha=0.5)
#
## Create a legend and show our plot
#plt.legend()
#plt.show()

<img src="Good Afternoon vs. Good Morning.png">

# Using the Python SpeechRecognition library

## SpeechRecognition Python library

In [12]:
# Import the SpeechRecognition library
import speech_recognition as sr

# Create an instance of Recognizer
recognizer = sr.Recognizer()

# Set the energy threshold
recognizer.energy_threshold = 300

The **energy_threshold** can be thought of as the loudness of audio which is considered speech.

Values below the threshold are considered silence, values above are considered speech.

A silent room is typically between 0 and 100. SpeechRecognition's documentation recommends 300 as a starting value which covers most speech files.

The **energy_threshold** value will adjust automatically as the recognizer listens to an audio file.
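One way to picture the threshold is as a comparison against the root-mean-square (RMS) level of a chunk of samples. The sketch below assumes RMS as the energy measure and uses made-up helper names (`rms_energy`, `is_speech`) and a made-up loud chunk; it is an illustration of the idea, not SpeechRecognition's own code. The quiet chunk reuses the first ten integers of the good morning signal.

```python
import math


def rms_energy(samples):
    """Root-mean-square amplitude of a sequence of integer samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples, energy_threshold=300):
    """Treat a chunk as speech when its energy exceeds the threshold."""
    return rms_energy(samples) > energy_threshold


# First ten values of signal_gm from earlier in the notebook
quiet_chunk = [-3, -5, -8, -8, -9, -13, -8, -10, -9, -11]
# Hypothetical louder samples, invented for this example
loud_chunk = [1200, -900, 1500, -1100, 800, -1300]

print(is_speech(quiet_chunk), is_speech(loud_chunk))
```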

In [13]:
clean_support_call = sr.AudioFile("clean-support-call.wav")
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source)

### Using the Recognizer class to recognize speech

The **Recognizer** class of **SpeechRecognition** has built-in methods for working with many of the best speech APIs:
- _recognize_bing()_: accesses **Microsoft**'s cognitive services
- _recognize_google()_: uses **Google**'s free web speech API
- _recognize_google_cloud()_: accesses **Google**'s Cloud Speech API
- _recognize_wit()_: uses the **wit.ai** platform

They all accept an audio file and return text, which is hopefully the transcribed speech from the audio file.

In [14]:
# Import the SpeechRecognition library
import speech_recognition as sr

# Instantiate an instance of Recognizer class
recognizer = sr.Recognizer()

# Transcribe speech using Google web API
recognizer.recognize_google(audio_data=clean_support_call_audio,
                            language="en-US")

"hello I'd like to get some help setting up my account please"

## Reading audio files with SpeechRecognition

### The AudioFile class

In [15]:
import speech_recognition as sr

# Setup recognizer instance
recognizer = sr.Recognizer()

# Read in audio file
clean_support_call = sr.AudioFile("clean-support-call.wav")

# Check type of clean_support_call
type(clean_support_call)

speech_recognition.AudioFile

### From AudioFile to AudioData

In [16]:
# Convert from AudioFile to AudioData
with clean_support_call as source:
    # Record the audio
    clean_support_call_audio = recognizer.record(source)
# Check the type
type(clean_support_call_audio)

speech_recognition.AudioData

### Transcribing our AudioData

In [17]:
# Transcribe clean support call
recognizer.recognize_google(audio_data=clean_support_call_audio)

"hello I'd like to get some help setting up my account please"

### Duration and offset

There are two parameters of the **record** method you should know about: **duration** and **offset**.

The **record** method records up to **duration** seconds of audio from the source, starting at **offset** seconds. Both default to _None_, which means the whole file is recorded from the beginning.
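To show what the two parameters mean in terms of frames, here is a small sketch built on the standard-library **wave** module. It is not how SpeechRecognition implements **record**; the `read_window` helper and the three-second silent file are invented for the example.

```python
import os
import tempfile
import wave


def read_window(path, offset=0.0, duration=None):
    """Return how many seconds of audio a given offset/duration window covers."""
    with wave.open(path, "rb") as wav_in:
        framerate = wav_in.getframerate()
        frame_width = wav_in.getnchannels() * wav_in.getsampwidth()
        total_frames = wav_in.getnframes()
        # offset is converted from seconds to a frame position
        start_frame = min(int(offset * framerate), total_frames)
        wav_in.setpos(start_frame)
        remaining = total_frames - start_frame
        # duration=None means "read to the end of the file"
        if duration is None:
            n_frames = remaining
        else:
            n_frames = min(int(duration * framerate), remaining)
        raw = wav_in.readframes(n_frames)
    return len(raw) / frame_width / framerate


# Three seconds of 16-bit mono silence at 8000 Hz
silent_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(silent_path, "wb") as wav_out:
    wav_out.setnchannels(1)
    wav_out.setsampwidth(2)
    wav_out.setframerate(8000)
    wav_out.writeframes(b"\x00\x00" * 8000 * 3)

print(read_window(silent_path))
print(read_window(silent_path, duration=2.0))
print(read_window(silent_path, offset=2.5, duration=2.0))
```

The last call asks for two seconds but only half a second remains after the offset, which mirrors the "up to" in the description of **duration**.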

In [18]:
# Leave duration and offset as default
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source,
                                                 duration=None,
                                                 offset=None)
    
# Get first 2-seconds of clean support call
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source,
                                                 duration=2.0)
    
recognizer.recognize_google(audio_data=clean_support_call_audio)

"hello I'd like to"

## Dealing with different kinds of audio

### What language?

The **SpeechRecognition** library doesn't automatically detect languages. So, you will have to set the **language** parameter manually and make sure the API you're using can transcribe the language your audio files are in.

In [19]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
japanese = sr.AudioFile("good-morning-japanense.wav")

# Convert from AudioFile to AudioData
with japanese as source:
    # Record the audio
    japanese_audio = recognizer.record(source)

In [20]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Pass the Japanese audio to recognize_google
text = recognizer.recognize_google(japanese_audio, language="en-US")

# Print the text
print(text)

ohayo gozaimasu


In [21]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Pass the Japanese audio to recognize_google
text = recognizer.recognize_google(japanese_audio, language="ja")

# Print the text
print(text)

おはようございます


### Non-speech audio

When there is no speech in the audio, **recognize_google()** raises an **UnknownValueError**.

We can avoid the error by using the **show_all** parameter, which returns all the potential transcriptions the **recognize_google()** function came up with; when nothing is recognized, the result is simply empty.

In [22]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
leopard = sr.AudioFile("leopard.wav")

# Convert from AudioFile to AudioData
with leopard as source:
    # Record the audio
    leopard_audio = recognizer.record(source)

In [23]:
# Pass the leopard roar audio to recognize_google
text = recognizer.recognize_google(leopard_audio, 
                                   language="en-US", 
                                   show_all=True)

# Print the text
print(text)

[]


In [24]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
charlie = sr.AudioFile("charlie-bit-me-5.wav")

# Convert from AudioFile to AudioData
with charlie as source:
    # Record the audio
    charlie_audio = recognizer.record(source)

In [25]:
# Pass charlie_audio to recognize_google
text = recognizer.recognize_google(charlie_audio, 
                                   language="en-US")

# Print the text
print(text)

charlie bit me


It's worth noting that **recognize_google()** only returns words: it didn't return the baby saying 'ahhh!' because it doesn't recognize that as a word. Speech recognition has come a long way, but it's far from perfect.

### Multiple speakers

The free Google Web API transcribes speech and returns it as a single block of text no matter how many speakers there are.

A single returned text block can still be useful. However, if your problem requires knowing who said what, you may want to treat the free API we're using as a proof of concept and then use one of the paid versions for complex tasks.

The process of splitting more than one speaker from a single audio file is called **speaker diarization**. It is beyond the scope of this course.

To get around the multiple speakers problem, you could ensure your audio files are recorded separately for each speaker, then transcribe each speaker's audio individually.

In [26]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
multiple = sr.AudioFile("multiple-speakers-16k.wav")

# Convert from AudioFile to AudioData
with multiple as source:
    # Record the audio
    multiple_speakers = recognizer.record(source)

In [27]:
# Recognize the multiple speaker AudioData
text = recognizer.recognize_google(multiple_speakers,
                                language="en-US")

# Print the text
print(text)

is that it doesn't recognize different speakers invoices it will just return it all as one block of text


### Noisy audio

If you have trouble hearing the speech, so will the APIs. To try to accommodate background noise, the **Recognizer** class has a built-in method, **adjust_for_ambient_noise**, which takes a parameter, **duration**.

The **Recognizer** then listens for **duration** seconds at the start of the audio file and adjusts the **energy_threshold**, the level above which audio is treated as speech, to suit the background noise.

How much space you have at the start of your audio file will dictate what you can set the duration value to. The **SpeechRecognition** documentation recommends somewhere between 0.5 and 1 second as a good starting point.

In [28]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
clean_support_call = sr.AudioFile("clean-support-call.wav")

# Record the audio from the clean support call
with clean_support_call as source:
    clean_support_call_audio = recognizer.record(source)

# Transcribe the speech from the clean support call
text = recognizer.recognize_google(clean_support_call_audio,
                                   language="en-US")

print(text)

hello I'd like to get some help setting up my account please


In [29]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Read in audio file
noisy_support_call = sr.AudioFile("2-noisy-support-call.wav")

# Record the audio from the noisy support call
with noisy_support_call as source:
    noisy_support_call_audio = recognizer.record(source)

# Transcribe the speech from the noisy support call
text = recognizer.recognize_google(noisy_support_call_audio,
                         language="en-US",
                         show_all=True)

print(text)

{'alternative': [{'transcript': "hello I'd like to get to help setting up my account", 'confidence': 0.93967402}, {'transcript': "hello I'd like to get to help finding out my account"}, {'transcript': "hello I'd like to get to help setting up my calendar"}, {'transcript': "hello I'd like to get to help sending out my account"}, {'transcript': "hello I'd like to get to help setting up my account."}], 'final': True}


Set the **duration** parameter of **adjust_for_ambient_noise()** to 1 (second) so **recognizer** adjusts for background noise.

In [30]:
# Create a recognizer class
recognizer = sr.Recognizer()

# Record the audio from the noisy support call
with noisy_support_call as source:
    # Adjust the recognizer energy threshold for ambient noise
    recognizer.adjust_for_ambient_noise(source, duration=1)
    noisy_support_call_audio = recognizer.record(source)

# Transcribe the speech from the noisy support call
text = recognizer.recognize_google(noisy_support_call_audio,
                                   language="en-US",
                                   show_all=True)

print(text)

{'alternative': [{'transcript': "I'd like to get to help setting up my account", 'confidence': 0.87440616}, {'transcript': "I'd like to get to help setting up my calendar"}, {'transcript': 'I like to get to help setting up my account'}, {'transcript': "I'd like to get to help setting up my Kelly"}, {'transcript': "I'd like to get to help setting up my account."}], 'final': True}


A **duration** of 1 was too long: the audio read during the adjustment is not recorded afterwards, so the opening "hello" was cut off. Try setting **duration** to 0.5.

In [31]:
recognizer = sr.Recognizer()

# Record the audio from the noisy support call
with noisy_support_call as source:
    # Adjust the recognizer energy threshold for ambient noise
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    noisy_support_call_audio = recognizer.record(source)

# Transcribe the speech from the noisy support call
text = recognizer.recognize_google(noisy_support_call_audio,
                                   language="en-US",
                                   show_all=True)

print(text)

{'alternative': [{'transcript': "hello I'd like to get to help setting up my account", 'confidence': 0.93739474}, {'transcript': "hello I'd like to get to help setting up my calendar"}, {'transcript': "hello I'd like to get to help setting up my Kelly"}, {'transcript': "hello I'd like to get to help setting up my account."}, {'transcript': "hello I'd like to get to help setting up my account please"}], 'final': True}
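With **show_all=True** the result is either empty or a dictionary of alternatives, so downstream code has to handle both shapes. The sketch below picks the most confident alternative from a response shaped like the ones printed above. The `best_transcript` helper is made up for this example, and the sample response is a shortened copy of the last output.

```python
def best_transcript(response):
    """Return (transcript, confidence) from a show_all=True response, or None."""
    # An empty result (for example []) means nothing was recognized
    if not response:
        return None
    alternatives = response.get("alternative", [])
    if not alternatives:
        return None
    # Only some alternatives carry a confidence; rank missing ones as 0.0
    top = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return top["transcript"], top.get("confidence")


# Shortened copy of the response printed above
noisy_response = {
    "alternative": [
        {"transcript": "hello I'd like to get to help setting up my account",
         "confidence": 0.93739474},
        {"transcript": "hello I'd like to get to help setting up my calendar"},
    ],
    "final": True,
}

print(best_transcript(noisy_response))
print(best_transcript([]))
```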


# Manipulating Audio Files with PyDub

### PyDub's main class, AudioSegment

In [32]:
# Import PyDub main class
from pydub import AudioSegment

# Import an audio file
wav_file = AudioSegment.from_file(file="wav_file.wav", format="wav")

# The format parameter is optional and mainly aids readability
wav_file = AudioSegment.from_file(file="wav_file.wav")

type(wav_file)



pydub.audio_segment.AudioSegment

### Playing an audio file

In [33]:
# Import play function
from pydub.playback import play

# Import audio file
wav_file = AudioSegment.from_file(file="wav_file.wav")

# Play audio file
play(wav_file)

### Audio parameters

In [34]:
# Import audio files
wav_file = AudioSegment.from_file(file="wav_file.wav")
two_speakers = AudioSegment.from_file(file="good-morning.wav")

# Check number of channels
wav_file.channels, two_speakers.channels

(2, 2)

In [35]:
# Find the sample rate
wav_file.frame_rate

48000

In [36]:
# Find the number of bytes per sample
wav_file.sample_width

2

In [37]:
# Find the max amplitude
wav_file.max

8484

In [38]:
# Duration of audio file in milliseconds
len(wav_file)

3284

When you import a file with **from_file**, PyDub automatically infers a number of parameters about the file. These are stored as attributes in the **AudioSegment** instance.

- The **channels** attribute of an AudioSegment shows the number of channels: 1 for mono, 2 for stereo audio.
- The **frame_rate** attribute gives the sample rate of your AudioSegment in Hertz.
- **sample_width** tells you the number of bytes per sample, 1 means 8-bit, 2 means 16-bit.
- **max** will tell you the max amplitude of your audio file, which can be considered loudness and is useful for normalizing sound levels.
- Calling **len** on any AudioSegment will tell you the duration of the audio file in milliseconds.
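As a rough illustration of how **sample_width** and **max** relate, the sketch below converts the sample width to a bit depth and expresses the peak value shown above (8484) relative to full scale in decibels. The helper names are made up for this example, and full scale is assumed to be half the signed integer range.

```python
import math


def bits_per_sample(sample_width):
    """Convert a sample width in bytes to a bit depth."""
    return sample_width * 8


def full_scale(sample_width):
    """Largest magnitude a signed sample of this width can represent."""
    return 2 ** bits_per_sample(sample_width) // 2


def peak_dbfs(peak, sample_width):
    """Peak level in decibels relative to full scale (0 dB is the loudest)."""
    return 20 * math.log10(peak / full_scale(sample_width))


# Values taken from the attribute outputs above
print(bits_per_sample(2), full_scale(2))
print(round(peak_dbfs(8484, 2), 2))
```

A negative result means the peak sits below full scale, which is the gap that a volume boost or normalization can use.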

### Changing audio parameters

#### Change ATTRIBUTENAME of AudioSegment to x
changed_audio_segment = audio_segment.set_ATTRIBUTENAME(x)

In [39]:
# Change sample width to 1
wav_file_width_1 = wav_file.set_sample_width(1)
wav_file_width_1.sample_width

1

In [40]:
# Change sample rate
wav_file_16k = wav_file.set_frame_rate(16000)
wav_file_16k.frame_rate

16000

In [41]:
# Change number of channels
wav_file_1_channel = wav_file.set_channels(1)
wav_file_1_channel.channels

1

## Manipulating audio files with PyDub

### Turning it down to 11

In [42]:
## Import audio file
wav_file = AudioSegment.from_file("volume_adjusted.wav")

# Minus 60 dB
quiet_wav_file = wav_file - 60

If you tried to transcribe audio this quiet with **recognize_google()**, it would raise an error.

In practice, you're more likely to want to increase the volume of your **AudioSegment**s.

In [43]:
# Increase the volume by 10 dB
louder_wav_file = wav_file + 10

If your audio files are too quiet or too loud, they may produce transcription errors. As you might imagine, speech transcription works best on clear, audible speech.
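Decibels are a logarithmic scale, so adding or subtracting them multiplies the sample amplitudes. The sketch below shows that arithmetic on plain integers, assuming 16-bit samples. The `db_to_ratio` and `apply_gain` helpers are illustrative names, not part of PyDub.

```python
def db_to_ratio(db):
    """Convert a gain in decibels to an amplitude multiplier."""
    return 10 ** (db / 20)


def apply_gain(samples, db, sample_width=2):
    """Scale integer samples by a dB gain, clipping to the valid range."""
    limit = 2 ** (8 * sample_width - 1)
    ratio = db_to_ratio(db)
    # Clip so that loud samples do not overflow the signed range
    return [max(-limit, min(limit - 1, round(s * ratio))) for s in samples]


print(db_to_ratio(-60))
print(apply_gain([1000, -1000], -60))
print(apply_gain([30000], 10))
```

Minus 60 dB leaves one thousandth of the original amplitude, which is why such a file is effectively silent, while a large boost can push samples into clipping.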

### This all sounds the same

In [44]:
# Import AudioSegment and normalize
from pydub import AudioSegment
from pydub.effects import normalize
from pydub.playback import play

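The **normalize** function finds the peak level in an **AudioSegment** and then applies a single gain to the whole segment so that the peak sits just below the maximum possible level. The quieter parts are raised by the same amount, so the file as a whole becomes louder.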
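Peak normalization can be sketched on a plain list of samples. The version below assumes 16-bit audio and a small headroom of 0.1 dB; `normalize_peak` and the sample values are invented for illustration and are not PyDub's code.

```python
def normalize_peak(samples, sample_width=2, headroom_db=0.1):
    """Scale all samples by one gain so the peak sits just under full scale."""
    max_amplitude = 2 ** (8 * sample_width - 1)
    peak = max(abs(s) for s in samples)
    if peak == 0:
        # Pure silence has nothing to scale
        return list(samples)
    target_peak = max_amplitude * 10 ** (-headroom_db / 20)
    gain = target_peak / peak
    return [int(s * gain) for s in samples]


# Hypothetical samples: a loud passage followed by a quiet one
loud_then_quiet = [8000, -8000, 6000, 80, -80, 60]
normalized = normalize_peak(loud_then_quiet)
print(normalized)
```

Every sample is multiplied by the same factor, so the ratio between the loud and quiet passages is preserved.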

In [45]:
## Import uneven sound audio file
#loud_quiet = AudioSegment.from_file("ex3_datacamp_loud_then_quiet.wav")
#
## Normalize the sound levels
#normalized_loud_quiet = normalize(loud_quiet)
#
## Check the sound
#play(normalized_loud_quiet)

### Remixing your audio files

In [46]:
# Import audio with static at start
static_at_start = AudioSegment.from_file("volume_adjusted.wav")

# Remove the static via slicing
no_static_at_start = static_at_start[5000:]

# Check the new sound
play(no_static_at_start)

In [47]:
# Import two audio files
wav_file_1 = AudioSegment.from_file("wav_file.wav")
wav_file_2 = AudioSegment.from_file("volume_adjusted.wav")

# Combine the two audio files
wav_file_3 = wav_file_1 + wav_file_2

# Check the sound
play(wav_file_3)

### Exercise: Chopping and changing audio files

In [48]:
from pydub import AudioSegment

# Import part 1 and part 2 audio files
part_1 = AudioSegment.from_file("ex3_slicing_part_1.wav")
part_2 = AudioSegment.from_file("ex3_slicing_part_2.wav")

# Remove the first four seconds of part 1
part_1_removed = part_1[4000:]

# Add the remainder of part 1 and part 2 together
part_3 = part_1_removed + part_2

Operators on AudioSegments are evaluated from left to right. So, "*wav_file_1* + *wav_file_2* + *10*" will combine *wav_file_1* and *wav_file_2* and then increase the combination by *10 decibels*.

In [49]:
# Combine two files and make the combination louder
louder_wav_file_3 = wav_file_1 + wav_file_2 + 10

If your audio files have different characteristics, combining them like this automatically scales parameters such as frame rate to be equal to the higher quality audio file.

### Splitting your audio

In [50]:
# Import phone call audio
phone_call = AudioSegment.from_file("ex3_stereo_call.wav")

# Find number of channels
phone_call.channels

2

In [51]:
# Split stereo to mono
phone_call_channels = phone_call.split_to_mono()
phone_call_channels

[<pydub.audio_segment.AudioSegment at 0x216b8928190>,
 <pydub.audio_segment.AudioSegment at 0x216b89281c0>]

In [52]:
# Find number of channels of first list item
phone_call_channels[0].channels

1
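Stereo audio is commonly stored with the channels interleaved, one sample per channel in each frame, so splitting to mono amounts to taking every second sample. The sketch below shows that on a plain list; `split_interleaved` and the sample values are made up for the example and are not PyDub's implementation.

```python
def split_interleaved(samples, n_channels=2):
    """Split interleaved samples into one list per channel."""
    return [samples[channel::n_channels] for channel in range(n_channels)]


# Hypothetical frames: (left, right) pairs flattened into one sequence
stereo = [1, 100, 2, 200, 3, 300]
left, right = split_interleaved(stereo)
print(left, right)
```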

### Exercise: Exporting and reformatting audio files

In [53]:
# Import AudioSegment
from pydub import AudioSegment

# Import stereo audio file and check channels
stereo_phone_call = AudioSegment.from_file("ex3_stereo_call.wav")
print(f"Stereo number channels: {stereo_phone_call.channels}")

# Split stereo phone call and check channels
channels = stereo_phone_call.split_to_mono()
print(f"Split number channels: {channels[0].channels}, {channels[1].channels}")

# Save new channels separately
phone_call_channel_1 = channels[0]
phone_call_channel_2 = channels[1]

Stereo number channels: 2
Split number channels: 1, 1


## Converting and saving audio files with PyDub

### Exporting audio files

In [54]:
from pydub import AudioSegment

# Import audio file
wav_file = AudioSegment.from_file("wav_file.wav")

# Increase by 10 decibels
louder_wav_file = wav_file + 10

# Export louder audio file
louder_wav_file.export(out_f="louder_wav_file.wav", format="wav")

<_io.BufferedRandom name='louder_wav_file.wav'>

### Reformatting and exporting multiple audio files

In [55]:
import os

def make_wav(wrong_folder_path, right_folder_path):
    # Loop through wrongly formatted files
    for file in os.scandir(wrong_folder_path):
        # Only work with files with audio extensions we're fixing
        if file.path.endswith((".mp3", ".flac")):
            # Create the new .wav filename inside the output folder
            out_file = os.path.join(right_folder_path,
                                    os.path.splitext(file.name)[0] + ".wav")
            # Read in the audio file and export it in wav format
            AudioSegment.from_file(file.path).export(out_file,
                                                     format="wav")
            print(f"Creating {out_file}")

In [56]:
## Call our new function
#make_wav("data/wrong_formats", "data/right_format")
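The filename handling is the easiest part of a batch conversion to get wrong, so it can help to test it separately from the audio work. The sketch below isolates that step; `wav_output_path` and `AUDIO_EXTENSIONS` are hypothetical names, and the example paths reuse the folders from the call above.

```python
import os

# Extensions we want to convert (an assumption matching the function above)
AUDIO_EXTENSIONS = (".mp3", ".flac")


def wav_output_path(input_path, output_folder):
    """Return the .wav path for a convertible file, or None to skip it."""
    stem, extension = os.path.splitext(os.path.basename(input_path))
    if extension.lower() not in AUDIO_EXTENSIONS:
        return None
    # os.path.join adds the separator whether or not the folder ends with one
    return os.path.join(output_folder, stem + ".wav")


print(wav_output_path("data/wrong_formats/call_1.mp3", "data/right_format"))
print(wav_output_path("data/wrong_formats/notes.txt", "data/right_format"))
```

Using `os.path.join` avoids depending on a trailing slash in the output folder argument.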

### Manipulating and exporting

In [57]:
import os

def make_no_static_louder(static_quiet, louder_no_static):
    # Loop through files with static and quiet (already in wav format)
    for file in os.scandir(static_quiet):
        # Create new file path
        out_file = os.path.join(louder_no_static,
                                os.path.splitext(file.name)[0] + ".wav")
        # Read the audio file
        audio_file = AudioSegment.from_file(file.path)
        # Remove the first 3.1 seconds (3100 ms), add 10 decibels and export
        (audio_file[3100:] + 10).export(out_file, format="wav")

        print(f"Creating {out_file}")

In [58]:
## Remove static and make louder
#make_no_static_louder("data/static_quiet/", "data_louder_no_static/")

# Processing text transcribed from spoken language

## Creating transcription helper functions

### Preparing for the proof of concept

### File format conversion function

In [59]:
import os

# Create a function to convert audio file to wav
def convert_to_wav(filename):
    "Takes an audio file of non .wav format and converts it to .wav"
    # Import audio file
    audio = AudioSegment.from_file(filename)
    # Create new filename by swapping only the final extension
    new_filename = os.path.splitext(filename)[0] + ".wav"
    # Export file as .wav
    print(f"Converting {filename} to {new_filename}...")
    audio.export(new_filename, format="wav")

### Attribute showing function

In [60]:
def show_pydub_stats(filename):
    "Prints different audio attributes related to an audio file."
    # Create AudioSegment instance
    audio_segment = AudioSegment.from_file(filename)
    # Print attributes
    print(f"Channels: {audio_segment.channels}")
    print(f"Sample width: {audio_segment.sample_width}")
    print(f"Frame rate (sample rate): {audio_segment.frame_rate}")
    print(f"Frame width: {audio_segment.frame_width}")
    print(f"Length (ms): {len(audio_segment)}")
    print(f"Frame count: {audio_segment.frame_count()}")

In [61]:
show_pydub_stats("ex4_call_1_stereo_formatted.wav")

Channels: 2
Sample width: 2
Frame rate (sample rate): 32000
Frame width: 4
Length (ms): 54888
Frame count: 1756416.0
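The printed attributes are related by simple arithmetic, which makes a handy sanity check. The sketch below recomputes the frame width and duration from the values shown above; the size of the raw audio data is derived from them and is not taken from the file itself.

```python
# Values copied from the show_pydub_stats output above
channels = 2
sample_width = 2
frame_rate = 32000
frame_count = 1756416

# A frame holds one sample per channel
frame_width = channels * sample_width

# Frames divided by frames-per-second gives seconds; scale to milliseconds
duration_ms = frame_count * 1000 / frame_rate

# Raw audio size in bytes, excluding any file header
size_bytes = frame_count * frame_width

print(frame_width, duration_ms, size_bytes)
```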


### Transcribe function

In [62]:
# Create a function to transcribe audio
def transcribe_audio(filename):
    "Takes a .wav format audio file and transcribes it to text."
    # setup a recognizer instance
    recognizer = sr.Recognizer()
    
    # Import the audio file and convert to audio data
    audio_file = sr.AudioFile(filename)
    with audio_file as source:
        audio_data = recognizer.record(source)
        
    # Return the transcribed text
    return recognizer.recognize_google(audio_data)

In [63]:
transcribe_audio("ex4_call_1_stereo_formatted.wav")

'hello welcome to Acme Studio support lawn mower name is Daniel how can I best help you this is John'

## Sentiment analysis on spoken language text

In [64]:
# Download required NLTK packages
import nltk
nltk.download("punkt")
nltk.download("vader_lexicon")

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Melih\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\Melih\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

We'll use NLTK's **VADER** (**Valence Aware Dictionary and sEntiment analyzeR**) as it has a pre-trained sentiment analysis model in it.

**VADER** works by analyzing each word in a piece of text and giving it a sentiment score. It was pre-trained on social media text passages but lends itself well to our proof of concept.

### Sentiment analysis with VADER  

In [65]:
# Import sentiment analysis class
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Create sentiment analysis instance
sid = SentimentIntensityAnalyzer()

# Test sentiment analysis on negative text
print(sid.polarity_scores("This customer service is terrible."))

{'neg': 0.437, 'neu': 0.563, 'pos': 0.0, 'compound': -0.4767}


The **compound** value can be thought of as the overall score, with _-1_ being most _negative_ and _1_ being most _positive_.

In [66]:
# Import phone call audio
call_2 = AudioSegment.from_file("ex4_call_2_stereo_native.wav")
# Split stereo to mono
call_channels = call_2.split_to_mono()
call_channels[0].export(out_f="call_2_channel_1.wav", format="wav")
call_channels[1].export(out_f="call_2_channel_2.wav", format="wav")

<_io.BufferedRandom name='call_2_channel_2.wav'>

In [67]:
# Transcribe customer service representative channel of call_2
call_2_channel_1_text = transcribe_audio("call_2_channel_1.wav")
print(call_2_channel_1_text)

hello this is Daniel from Acme Studios how can I best help you yeah sure thing what's your name and what's wrong with the device okay nice to meet you Josh what's the CR number of your device so I can track it down


In [68]:
# Sentiment analysis on customer service representative channel of call_2
sid.polarity_scores(call_2_channel_1_text)

{'neg': 0.057, 'neu': 0.624, 'pos': 0.319, 'compound': 0.9062}

In [69]:
# Transcribe customer channel of call_2
call_2_channel_2_text = transcribe_audio("call_2_channel_2.wav")
print(call_2_channel_2_text)

I was just wondering if I could get some support my name is Josh and my device seems to not want to learn some of my my number is 176-4588


In [70]:
# Sentiment analysis on customer channel of call_2
sid.polarity_scores(call_2_channel_2_text)

{'neg': 0.04, 'neu': 0.827, 'pos': 0.132, 'compound': 0.4172}

### Sentence by sentence

In [71]:
call_3_paid_api_text = "Okay. Yeah. Hi, Diane. This is paid on this call and obviously the status of my orders at three weeks ago, and that service is terrible. Is this any better? Yes..."

# Import sent tokenizer
from nltk.tokenize import sent_tokenize

# Find sentiment on each sentence
for sentence in sent_tokenize(call_3_paid_api_text):
    print(sentence)
    print(sid.polarity_scores(sentence))

Okay.
{'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.2263}
Yeah.
{'neg': 0.0, 'neu': 0.0, 'pos': 1.0, 'compound': 0.296}
Hi, Diane.
{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
This is paid on this call and obviously the status of my orders at three weeks ago, and that service is terrible.
{'neg': 0.129, 'neu': 0.871, 'pos': 0.0, 'compound': -0.4767}
Is this any better?
{'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.4404}
Yes...
{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
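For reporting, it is often convenient to turn each compound score into a label. The sketch below uses cut-offs of 0.05 and -0.05, a commonly used convention for VADER that should be treated as an assumption to tune for your data. The `label_sentiment` helper is made up, and the scores are copied from the output above.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores copied from the sentence-by-sentence output above
sentence_scores = {
    "Okay.": 0.2263,
    "Yeah.": 0.296,
    "Hi, Diane.": 0.0,
    "Is this any better?": 0.4404,
    "Yes...": 0.0,
}

for sentence, compound in sentence_scores.items():
    print(sentence, label_sentiment(compound))

# The sentence containing "terrible" scored -0.4767
print(label_sentiment(-0.4767))
```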


## Named entity recognition on transcribed text

spaCy works by turning blocks of text into **docs**. **Docs** are made up of **tokens** and **spans**. You can think of **tokens** as individual words and groups of tokens or sentences as **spans**.

In [72]:
import spacy

# Load spaCy language model
nlp = spacy.load("en_core_web_sm")

In [73]:
# Create a spaCy doc
doc = nlp("I'd like to talk about a smartphone I ordered on July 31st from your Sydney store, my order number is 40939440. I spoke to Georgia about it last week.")

You can see what tokens a doc contains and the index where they start using **.text** and **.idx** on objects in your doc.

The number returned by **.idx** is the character offset of the token's first character within the original text.

In [74]:
# Show different tokens and positions
for token in doc:
    print(token.text, token.idx)

I 0
'd 1
like 4
to 9
talk 12
about 17
a 23
smartphone 25
I 36
ordered 38
on 46
July 49
31st 54
from 59
your 64
Sydney 69
store 76
, 81
my 83
order 86
number 92
is 99
40939440 102
. 110
I 112
spoke 114
to 120
Georgia 123
about 131
it 137
last 140
week 145
. 149
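Because these positions are character offsets, slicing the original string at an offset should give back the token. The sketch below checks a handful of the (token, offset) pairs printed above using plain string slicing, so it runs without spaCy; `slice_at` is a made-up helper.

```python
text = ("I'd like to talk about a smartphone I ordered on July 31st from "
        "your Sydney store, my order number is 40939440. I spoke to Georgia "
        "about it last week.")

# A few (token, offset) pairs copied from the output above
token_offsets = [("I", 0), ("'d", 1), ("smartphone", 25), ("Sydney", 69),
                 ("40939440", 102), ("Georgia", 123), (".", 149)]


def slice_at(text, token_text, idx):
    """Return the characters of text that start at idx and span the token."""
    return text[idx:idx + len(token_text)]


matches = [slice_at(text, token_text, idx) == token_text
           for token_text, idx in token_offsets]
print(all(matches))
```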


You can see where the sentences are with **.sents**.

In [75]:
# Show sentences in docs
for sentence in doc.sents:
    print(sentence)

I'd like to talk about a smartphone I ordered on July 31st from your Sydney store, my order number is 40939440.
I spoke to Georgia about it last week.


### spaCy named entities

A **named entity** is an object which is given a name, such as a _person_, _product_, _location_ or _date_.

Some of spaCy's built-in named entities:
- **PERSON** People, including fictional.
- **ORG** Companies, agencies, institutions, etc.
- **GPE** Countries, cities, states.
- **PRODUCT** Objects, vehicles, foods, etc. (Not services.)
- **DATE** Absolute or relative dates or periods.
- **TIME** Times smaller than a day.
- **MONEY** Monetary values, including unit.
- **CARDINAL** Numerals that do not fall under another type.

You can access the named entities in a doc using **doc.ents**.

In [76]:
# Find named entities in doc
for entity in doc.ents:
    print(entity.text, entity.label_)

July 31st DATE
Sydney GPE
40939440 DATE
Georgia GPE
last week DATE


### Custom named entities

To create a custom entity recognizer, you can use spaCy's **EntityRuler** pipeline component.

A **pipeline** is what spaCy uses to parse text into a **doc**. You can see the current pipeline by accessing the **pipeline** attribute of **nlp**.

In [77]:
# Import EntityRuler class
from spacy.pipeline import EntityRuler

# Check spaCy pipeline
nlp.pipeline

[('tok2vec', <spacy.pipeline.tok2vec.Tok2Vec at 0x216c816ddb0>),
 ('tagger', <spacy.pipeline.tagger.Tagger at 0x216c819ef90>),
 ('parser', <spacy.pipeline.dep_parser.DependencyParser at 0x216c813a940>),
 ('ner', <spacy.pipeline.ner.EntityRecognizer at 0x216c813ab80>),
 ('attribute_ruler',
  <spacy.pipeline.attributeruler.AttributeRuler at 0x216c8225580>),
 ('lemmatizer', <spacy.lang.en.lemmatizer.EnglishLemmatizer at 0x216c820e5c0>)]

### Changing the pipeline

The **EntityRuler** class allows us to create another step in the **pipeline**.

In [78]:
# Create EntityRuler instance
ruler = EntityRuler(nlp)

# Add token pattern to ruler
ruler.add_patterns([{"label": "PRODUCT", "pattern": "smartphone"}])

# Add new rule to pipeline before ner
#nlp.add_pipe(ruler, before="ner")

# Check updated pipeline
nlp.pipeline

[('tok2vec', <spacy.pipeline.tok2vec.Tok2Vec at 0x216c816ddb0>),
 ('tagger', <spacy.pipeline.tagger.Tagger at 0x216c819ef90>),
 ('parser', <spacy.pipeline.dep_parser.DependencyParser at 0x216c813a940>),
 ('ner', <spacy.pipeline.ner.EntityRecognizer at 0x216c813ab80>),
 ('attribute_ruler',
  <spacy.pipeline.attributeruler.AttributeRuler at 0x216c8225580>),
 ('lemmatizer', <spacy.lang.en.lemmatizer.EnglishLemmatizer at 0x216c820e5c0>)]

In [79]:
# Test new entity rule
for entity in doc.ents:
    print(entity.text, entity.label_)

July 31st DATE
Sydney GPE
40939440 DATE
Georgia GPE
last week DATE


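We start by making an instance of **EntityRuler** called **ruler**, passing it **nlp**. Then, we use **add_patterns** to add the token pattern we'd like spaCy to consider an entity. Adding this rule to the pipeline before **ner** ensures it gets used. In spaCy v3, **nlp.add_pipe** takes a component's registered name, such as **"entity_ruler"**, rather than an instance, which is presumably why the **add_pipe** call above is commented out and the pipeline and entities shown are unchanged. A new rule also only affects docs created after it is added, so the text has to be passed through **nlp** again.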
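For spaCy v3, a minimal sketch of the same idea looks like the following. It assumes spaCy v3 is installed and uses a blank English pipeline so that no model download is needed; with **en_core_web_sm** loaded you would also pass `before="ner"` to **add_pipe**. The shortened sentence is only for illustration.

```python
import spacy

# A blank English pipeline keeps the sketch free of model downloads
nlp = spacy.blank("en")

# spaCy v3 adds components by their registered name and returns the instance
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "PRODUCT", "pattern": "smartphone"}])

# Entities are assigned when the doc is created, so build it after adding the rule
doc = nlp("I'd like to talk about a smartphone I ordered on July 31st.")
entities = [(entity.text, entity.label_) for entity in doc.ents]

print(nlp.pipe_names)
print(entities)
```

With only the ruler in the pipeline, the custom pattern is the sole source of entities, which makes it easy to confirm that the rule fires.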

### Exercise: Named entity recognition in spaCy

In [80]:
import spacy

# Transcribe call 4 channel 2
call_4_channel_2_text = transcribe_audio("ex4_call_4_channel_2_formatted.wav")

# Create a spaCy language model instance
nlp = spacy.load("en_core_web_sm")

# Create a spaCy doc with call 4 channel 2 text
doc = nlp(call_4_channel_2_text)

# Check the type of doc
print(type(doc))

<class 'spacy.tokens.doc.Doc'>


In [81]:
# Show tokens in doc
for token in doc:
    print(token.text, token.idx)

hi 0
Daniel 3
my 10
name 13
is 18
Ann 21
and 25
I 29
've 30
recently 34
just 43
purchased 48
a 58
smart 60
front 66
buying 72
from 79
you 84
and 88
I 92
'm 93
very 96
happy 101
with 107
the 112
product 116
I 124
'd 125
like 128
to 133
order 136
another 142
one 150
from 154
my 159
friend 162
believes 169
in 178
Sydney 181
and 188
have 192
it 197
delivered 200
I 210
'm 211
pretty 214
sure 221
it 226
's 228
model 231
315 237
I 241
can 243
check 247
that 253
for 258
you 262
and 266
I 270
'll 271
give 275
you 280
more 284
details 289
if 297
you 300
'd 303
like 306
to 311
take 314
my 319
details 322
and 330
I 334
I 336
will 338
also 343
give 348
you 353
the 357
address 361
thank 369
you 375
excellent 379


In [82]:
# Show sentences in doc
for sentence in doc.sents:
    print(sentence)

hi Daniel my name is Ann and I've recently just purchased a smart front buying from you and I'm very happy with the product I'd like to order another one from my friend believes in Sydney and have it delivered I'm pretty sure it's model 315
I can check that for you and I'll give you more details if you'd like to take my details and I I will also give you the address thank you excellent


In [83]:
# Show named entities and their labels
for entity in doc.ents:
    print(entity.text, entity.label_)

Ann PERSON
Sydney GPE
315 CARDINAL


## Classifying transcribed speech with scikit-learn

### Transcribing all phone call excerpts

In [84]:
# Transcribe text from wav files
def create_text_list(folder):
    text_list = []
    # Loop through folder (an iterable of file names)
    for file in folder:
        # Check for .wav extension
        if file.endswith(".wav"):
            # Transcribe audio
            text = transcribe_audio(file)
            # Add transcribed text to list
            text_list.append(text)
    return text_list
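
Because **create_text_list** iterates over **folder** directly, it expects an iterable of file names, not a directory path string. One possible way to build such a list is sketched below. The helper `list_wav_files` is hypothetical, and the demonstration uses a temporary directory with empty placeholder files, not real recordings.

```python
import os
import tempfile


def list_wav_files(directory):
    """Return the sorted full paths of the .wav files in directory."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".wav")
    )


# Demonstration on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ["call_2.wav", "notes.txt", "call_1.wav"]:
        open(os.path.join(tmp_dir, name), "w").close()
    wav_names = [os.path.basename(path) for path in list_wav_files(tmp_dir)]

print(wav_names)
```

The resulting list of paths could then be passed to **create_text_list** in place of **folder**.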

### Building a text classifier

In [85]:
# Import text classification packages
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split

In [86]:
pre_purchase_text = ['yeah hi John just calling in regards to a recent order I just placed I found a cheaper product online and I was wondering if I could cancel that',
                     "I was looking online it says that you're only size is available a large and small I was wondering if you'll have any mediums in soon",
                     'hi I was just wondering if you have the extra large tea and blue',
                     'yeah hey Steve just calling in regards to a recent order I just placed I was wondering if I could cancel that order',
                     'hi I just ordered a new phone and I was just wondering if I could cancel out order and organise a refund',
                     'hi I just ordered a new t-shirt and I was wondering if I could cancel an order and organise a refund',
                     'accidentally made some errors and order I recently just placed I was wondering if you could help me',
                     "I just placed an order online and I was just wondering when I'll get my confirmation email",
                     "hey mate I just finished paying for my order and I was just wondering when I'm going to get that email to confirm it",
                     'hey I was wondering if you know where my new phone is that I just recently ordered',
                     'do you currently offer any new promotions at the moment',
                     "hi I just pre-ordered the nudity and this is my order number but doctor I was just wondering if you know where abouts it isn't shipment",
                     'your hi Jacob looking to make an order but just have a few questions regarding some products that you have online',
                     'hi I just recently placed an order with your company I was just wondering if you know the status of my shipment',
                     "Archie thank god I'm free been on hold for the last 30 minutes yeah got a couple of complaints made about this order I just posted",
                     "hi just calling in regards to my order on November the 3rd I was just wondering when that's going to leave your office",
                     "just looking to get some more information on the current promotions you're offering right now before I place my order",
                     "hi I recently ordered a new phone and I'm just wondering where I could find my reference number for the delivery",
                     'hey mate just looking to make some alterations to my order I just placed',
                     'hey just looking to place this order but I see that you have a promotion still running can you give me some more details behind this promotion',
                     "hi I placed an order a couple days ago and I was just wondering why my tracking number isn't working",
                     'hi I just realised I ordered the wrong computer I was wondering if I could just cancel that and organise a refund',
                     "yeah I just placed an all this you guys and I was wondering if I could change a few things before it's shift out",
                     "how's it going after I just placed an order with you guys and I accidentally sent it to the wrong address can you please help me change this",
                     "hey Polly just looking to place an order but before I proceed I'm just wondering if this offer still stands",
                     'yeah hi Tommy I just placed an order with you guys but I use the wrong payment processing method I was wondering if I could change that',
                     'hi Michael just looking to enquire about a few things before I placed an order I was wondering if you could help me',
                     'hi I saw your new phone on your website I was wondering if you have any setup tips for',
                     "I just ordered the new remote control car off you website I was just didn't see how many horsepower it has can you tell me",
                     'hi just about to order these shoes online I was just wondering if you have any different sizes in store',
                     'I just placed an order and I was wondering if I could change my shipping time from standard business days to rush if possible',
                     'hey I just ordered the new phone and I was wondering if I could get airpods put into that order just before you guys send it',
                     'hi Jacob I just placed an order with you guys but I found the same product online it and other store for a cheaper price I was wondering if you could price match it or could I cancel this order',
                     'it says here you have the iPhone x l and X I was wondering if you still stock the iPhone 10',
                     'hey I was just looking online at your shoes and I was wondering if you have this brand in Pink',
                     'I just placed an order I was wondering how long shipping time would be expected to be',
                     "hey mate just have a few questions regarding the recent order I just posted it shows that it's coming from overseas however when I looked at the Australian soccer shop online it says that there's current stock in store for the Australian store",
                     'hi I just ordered some shoes and I was just wondering if I could cancel that order and make a refund',
                     'hey I just ordered the blue and yellow shoes off your website and I was wondering if I could cancel that order and organise a refund',
                     'hey so I just placed an order with your company and I was just wondering where I can find my reference number',
                     'hey I was just wondering about the sizing on your shirts it says us as how does that relate to AUD',
                     "hi Tony I just placed an order I'm currently having a few problems I was wondering if you could help me",
                     'yeah hi David I just placed an order online and I was wondering if I could make an alteration to that order before you send it off',
                     'hi I was just looking at finding a new phone I was wondering if you could recommend anything to me',
                     'I I just ordered the green and blue shoes off your website and I was wondering if I could add a shirt to my order before you send it']

post_purchase_text = ['hey man I just bought a product from you guys and I think is amazing but I leave a little help setting it up',
                      'these clothes I just bought from you guys too small is there any way I can change the size',
                      "I recently got these pair of shoes but they're too big can I change the size",
                      "I bought a pair of pants from you guys but they're way too small",
                      "I bought a pair of pants and they're the wrong colour is there any chance I can change that",
                      "hey mate how you doing I'm just calling in regards the product that god it's faulty and doesn't work",
                      "just wondering if there's any tutorials on how to set up my device I just received",
                      "hey I'm just not happy with the product that you guys send me there any chance I can swap it out for another one",
                      'I bought a pair of pants from you guys and they are just a bit too long do you guys do Hemi',
                      'is there anybody that can help me set up this product or any how to use',
                      "hey mate I just bought a product from you guys and I'm just unhappy with the pop the product can I return it",
                      "just received the product from you guys and it didn't meet my expectations can I please get a refund",
                      "what's the process I have to go through to send my product back for a swap",
                      "hey mate how are you doing just wanting to know if there's any support I can get on this device how to set it up",
                      "what's your refund policy on items that I've purchased from you guys",
                      "hey how we doing I just put a cat from you guys and it's just the Wrong Colours is there any chance I can change that",
                      "call me on to talk about a package I got yesterday it's I got it but I need to do I need some help with setting it up",
                      "I got my order yesterday and the order number is 1863 3845 I'm just calling up to to check some more details on that",
                      'I would have a couple of things from you guys the other day and two it two of them two of them and great and I love them but the other one is is not the right thing',
                      "yeah hello I'm just wondering if I can speak to someone about an order I received yesterday",
                      'wrong package delivered',
                      "hey I ordered something yesterday and it arrived it arrived this morning but it seems like there's a few a few extra things in there that I didn't really order is there someone that I can talk to you to fix this up",
                      "hey I bought something from your website the other day and it arrived but it's it's not the thing that I ordered is there someone I can talk to her to fix this up",
                      "hello someone from your team delivered my package today but it's it's got a problem with it",
                      "my shipment arrived this afternoon but it's wrong size is there anyone I can talk to you to change it",
                      'I just bought a item from you guys and ID want to know if I can swap it for a different colour',
                      "hey I received my order but it's the wrong size can I get a refund please",
                      "hey my order arrived today but it's it's there's a it's I don't think it's the one that I ordered I check the receipt and it doesnt match what what a right",
                      "hey I'm calling up to to see if I can talk to someone to help with her a shipment that I received yesterday",
                      "I just received this device and I'd love some supported to be able to set it up",
                      "I just bought a product from you guys and I wouldn't want to know if I can send it back to get a colour change",
                      "I purchase something from your online store yesterday but the receipt didn't come through can can I get another receipt emailed please",
                      'the product arrived and there was a few things in the box but two of them the wrong is there someone I can talk to about fixing up my order',
                      "I'm just happy with the colour that I got from you guys so is there any chance I can change it for a different one",
                      "a couple of days ago I got a message saying that my package have been delivered it wasn't delivered that day but it still hasn't arrived there someone I can talk to about my order",
                      "my shipment arrived yesterday but it's not the right thing is there someone I can talk to you to fix it up",
                      "my shipment arrived yesterday but it's not the right thing is there someone I can talk to you to fix it up",
                      "my package was supposed to be delivered yesterday but it it didn't arrive is there someone I can talk to about my order",
                      "my package was supposed to be delivered yesterday but it it didn't arrive is there someone I can talk to about my order",
                      "I bought a hat from you guys and it's just too big is there anyway I can get it down size and what's your policies on that",
                      'calling in regards to the order I just got would love some support',
                      "my order a 64321 arrived this morning but it's something wrong with it is there someone I can talk to to fix it",
                      "yeah hello someone this morning delivered a package but I think it's I think it's not the right one that I ordered is there someone I can talk to you too to change it",
                      "on the box that you sent me yesterday arrived but it's damaged the someone I can talk to her about replacement",
                      "I've just bought a product can you guys and I want to know what your return keys and Caesar",
                      "my order a 64321 arrived this morning but it's something wrong with it is there someone I can talk to to fix it",
                      "hey my name is Daniel I received my shipment yesterday but it's wrong can I change it",
                      "all the things I received the my order yesterday would damaged I'm not sure what happened to delivery is there someone that can give me a hand",
                      'the shipment I received is wrong',
                      "yeah hey I need I need some help with her with an order that I ordered the other day it it came and it wasn't it wasn't correct",
                      "yeah hello someone this morning delivered a package but I think it's I think it's not the right one that I ordered is there someone I can talk to you too to change it",
                      'the shipment I received is wrong',
                      "yeah hello I'm just wondering if I can speak to someone about an order I received yesterday",
                      "my shipment arrived this afternoon but it's wrong size is there anyone I can talk to you to change it",
                      "all the things I received the my order yesterday would damaged I'm not sure what happened to delivery is there someone that can give me a hand",
                      'hey mate the must have been a problem with the shipping because the product I just received from you is damaged',
                      "hey mate how you doing just calling in regards to the phone I just purchased from you guys faulty not working and now he's damaged on the way here"]

In [87]:
import pandas as pd

# Make dataframes with the text
post_purchase_df = pd.DataFrame({"label": "post_purchase",
                                 "text": post_purchase_text})
pre_purchase_df = pd.DataFrame({"label": "pre_purchase",
                                "text": pre_purchase_text})

# Combine DataFrames
df = pd.concat([post_purchase_df, pre_purchase_df])

# Show the first rows of the combined DataFrame
df.head()

| | label | text |
|---|---|---|
| 0 | post_purchase | hey man I just bought a product from you guys ... |
| 1 | post_purchase | these clothes I just bought from you guys too ... |
| 2 | post_purchase | I recently got these pair of shoes but they're... |
| 3 | post_purchase | I bought a pair of pants from you guys but the... |
| 4 | post_purchase | I bought a pair of pants and they're the wrong... |


In [88]:
# Split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(df["text"], df["label"], test_size=0.3)
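
The split above is random, so the train and test sets (and any accuracy measured on them) can differ between runs. A minimal sketch of a reproducible, label-balanced split follows, assuming a small made-up DataFrame in place of the call transcripts. The `toy_` names and the `random_state` value are illustrative choices.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data: 8 rows per label, standing in for the transcribed calls
toy_df = pd.DataFrame({
    "label": ["pre_purchase"] * 8 + ["post_purchase"] * 8,
    "text": [f"example call {i}" for i in range(16)],
})

# random_state fixes the shuffle; stratify keeps the label proportions
# the same in the train and test sets
toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(
    toy_df["text"],
    toy_df["label"],
    test_size=0.25,
    random_state=42,
    stratify=toy_df["label"],
)

print(len(toy_X_train), len(toy_X_test))
print(toy_y_test.value_counts())
```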

### Naive Bayes Pipeline

In [89]:
# Create text classifier pipeline
text_classifier = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("classifier", MultinomialNB())
])

In [90]:
# Fit the classifier pipeline on the training data
text_classifier.fit(X_train, y_train)

Pipeline(steps=[('vectorizer', CountVectorizer()),
                ('tfidf', TfidfTransformer()),
                ('classifier', MultinomialNB())])

### Not so Naive

In [91]:
# Make predictions and compare them to test labels
predictions = text_classifier.predict(X_test)
accuracy = 100 * np.mean(predictions == y_test)
print(f"The model is {accuracy:.2f}% accurate.")

The model is 100.00% accurate.
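
The expression `np.mean(predictions == y_test)` is the fraction of test examples whose predicted label matches the true label. The sketch below compares it with scikit-learn's `accuracy_score` on a tiny made-up dataset. The example sentences and the `toy_` names are assumptions for illustration, not the notebook's data.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Made-up examples standing in for the transcribed calls
train_text = [
    "I was wondering if I could cancel my order",
    "do you have this shirt in a medium",
    "my package arrived but it is damaged",
    "the shoes I received are the wrong size",
]
train_labels = ["pre_purchase", "pre_purchase", "post_purchase", "post_purchase"]
test_text = ["I was wondering if you have this in blue", "my order arrived damaged"]
test_labels = np.array(["pre_purchase", "post_purchase"])

# Same pipeline structure as in the notebook
toy_classifier = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("classifier", MultinomialNB()),
])
toy_classifier.fit(train_text, train_labels)
toy_predictions = toy_classifier.predict(test_text)

# Fraction of matching labels, computed two ways
manual_accuracy = 100 * np.mean(toy_predictions == test_labels)
sklearn_accuracy = 100 * accuracy_score(test_labels, toy_predictions)
print(f"{manual_accuracy:.2f}% (manual) vs {sklearn_accuracy:.2f}% (accuracy_score)")
```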
