What are DTMF signals? 
When we press a key on a telephone keypad, the phone transmits a dual-tone multi-frequency (DTMF) signal. Each key corresponds to a unique pair of frequencies, one low and one high, and the two sinusoids are summed to form the transmitted waveform.
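
As a sketch of what this means, a key whose low and high frequencies are f_low and f_high produces the signal below, where A is the peak amplitude. The A/2 scaling is an assumption chosen to match the generator defined later in this notebook; it keeps the sum within [-A, A].

```latex
x(t) = \frac{A}{2}\left[\sin(2\pi f_{\text{low}}\, t) + \sin(2\pi f_{\text{high}}\, t)\right]
```

For example, the digit 1 uses 697 Hz and 1209 Hz.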

What is our goal?
Our goal is to compare two methods for decoding the digit from a DTMF waveform.
The first is a traditional spectral approach based on Fourier transforms, called Welch’s method.
The second is a deep learning model that uses a neural network to classify the digit.

Before that, however, we must write code that generates these signals. 
Furthermore, to evaluate how accurate these two methods are in the presence of noise, we will add random noise to the generated signals.

We begin by importing all necessary libraries.

In [1]:
import numpy as np
import scipy.signal as signal
import matplotlib.pyplot as plt
import pandas as pd
import os
from scipy.io.wavfile import write

Define important variables

We define global variables that are used by the subsequent functions. PHONE_NUMBER is the sequence of digits used to generate the waveform. dtmf_table_freq is a dictionary that maps each (high frequency, low frequency) pair to its digit and is used when decoding the signals: once we’ve identified the two frequencies in a signal, looking up the digit is immediate. Lastly, indexes maps each digit to its row and column position in the confusion matrix that we build later.

In [2]:
PHONE_NUMBER = "123A456B789C*0#D"


dtmf_table_freq = {
    (1209, 697): "1",
    (1336, 697): "2",
    (1477, 697): "3",
    (1633, 697): "A",
    (1209, 770): "4",
    (1336, 770): "5",
    (1477, 770): "6",
    (1633, 770): "B",
    (1209, 852): "7",
    (1336, 852): "8",
    (1477, 852): "9",
    (1633, 852): "C",
    (1209, 941): "*",
    (1336, 941): "0",
    (1477, 941): "#",
    (1633, 941): "D",
}

indexes = {
    '1' : 0,
    '2' : 1,
    '3' : 2,
    '4' : 3,
    '5' : 4,
    '6' : 5,
    '7' : 6,
    '8' : 7,
    '9' : 8,
    '0' : 9,
    'A' : 10,
    'B' : 11,
    'C' : 12,
    'D' : 13,
    '#' : 14,
    '*' : 15,
}
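
The decoding table can also be inverted to go from a digit to its frequency pair, which is the direction the signal generator needs. The following is a minimal sketch; the name digit_to_freqs is hypothetical and is not used elsewhere in this notebook.

```python
# Decoding table: (high frequency, low frequency) in Hz -> digit.
dtmf_table_freq = {
    (1209, 697): "1", (1336, 697): "2", (1477, 697): "3", (1633, 697): "A",
    (1209, 770): "4", (1336, 770): "5", (1477, 770): "6", (1633, 770): "B",
    (1209, 852): "7", (1336, 852): "8", (1477, 852): "9", (1633, 852): "C",
    (1209, 941): "*", (1336, 941): "0", (1477, 941): "#", (1633, 941): "D",
}

# Hypothetical helper: invert the table to map digit -> (high, low).
digit_to_freqs = {digit: freqs for freqs, digit in dtmf_table_freq.items()}

high, low = digit_to_freqs["5"]
print(high, low)  # 1336 770
```

Deriving one table from the other keeps the generator and the decoder from drifting apart if a frequency is ever edited.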

Our first function generates a DTMF signal from a string of digits and returns a numpy array containing the values of the corresponding waveform. duration is the length of each digit’s tone in milliseconds, gap is the silence between consecutive digits in milliseconds, sampling_rate is the number of samples per second, amplitude is the peak amplitude of the signal, and mean and standard_deviation parameterize the random noise. After looking up the two frequencies for a digit in dtmf_table, we compute how many samples the tone needs and evaluate the sum of the two sinusoids at those sample times. Finally, we concatenate the waveforms of all digits, separated by gaps, and add noise drawn from a normal distribution with the given mean and standard deviation.

In [3]:
def waveform(phone_num, duration=100, gap=50, sampling_rate=8000, amplitude=1, graphic=False, mean=0, standard_deviation=0):
    '''Returns numpy array for the values of the corresponding waveform.
    There's also a 'graphic' option to include a graph.'''
    dtmf_table = {
        "1": [1209, 697],
        "2": [1336, 697],
        "3": [1477, 697],
        "A": [1633, 697],
        "4": [1209, 770],
        "5": [1336, 770],
        "6": [1477, 770],
        "B": [1633, 770],
        "7": [1209, 852],
        "8": [1336, 852],
        "9": [1477, 852],
        "C": [1633, 852],
        "*": [1209, 941],
        "0": [1336, 941],
        "#": [1477, 941],
        "D": [1633, 941],
    }
    phone_num = phone_num.upper()
    for digit in phone_num:
        if digit not in dtmf_table:
            raise ValueError('The only allowed digits are 0-9, A-D, #, and *')

    gap_amount_of_samples = gap * sampling_rate // 1000
    gap_values = np.zeros(gap_amount_of_samples)

    first = True
    ans = np.array([])
    # Use an integer sample count so the time axis always has exactly the expected length.
    amount_of_samples = duration * sampling_rate // 1000
    samples = np.arange(amount_of_samples) / sampling_rate
    for digit in phone_num:
        if not first:
            ans = np.append(ans, gap_values)
        else:
            first = False
        values = (amplitude/2) * (
                np.sin(samples * 2 * np.pi * dtmf_table[digit][0]) +
                np.sin(samples * 2 * np.pi * dtmf_table[digit][1])
        )
        ans = np.append(ans, values)
    # plotting
    if graphic:
        # Note: this plots the clean signal, before noise is added below.
        plt.title("DTMF waveform")
        plt.xlabel("Sample index")
        plt.ylabel("Amplitude")
        plt.plot(np.arange(len(ans)), ans, color="green")
        plt.show()
    ans = ans + np.random.normal(mean, standard_deviation, len(ans))

    return ans
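
To make the sample arithmetic concrete, here is a minimal standalone sketch for a single key. The helper single_tone is hypothetical and is not the notebook’s waveform function; it only mirrors the same formula. The last lines estimate the signal-to-noise ratio for one of the noise levels used later, assuming zero-mean noise.

```python
import numpy as np

def single_tone(low, high, duration_ms=100, fs=8000, amplitude=1.0):
    """Clean DTMF tone for one key (hypothetical helper for illustration)."""
    n = duration_ms * fs // 1000           # integer number of samples
    t = np.arange(n) / fs                  # sample times in seconds
    return (amplitude / 2) * (np.sin(2 * np.pi * low * t) + np.sin(2 * np.pi * high * t))

tone = single_tone(697, 1209)              # digit "1"
print(len(tone))                           # 800 samples = 100 ms at 8000 Hz

# Length of a full sequence: n tones plus (n - 1) silent gaps of 50 ms (400 samples each).
n_digits = 16
total = n_digits * 800 + (n_digits - 1) * 400
print(total)                               # 18800

# The tone's power is close to amplitude**2 / 4; compare it with the noise variance.
signal_power = np.mean(tone ** 2)
sigma = 1.8                                # noise standard deviation
snr_db = 10 * np.log10(signal_power / sigma ** 2)
print(round(snr_db, 1))                    # roughly -11 dB: the noise is much stronger than the tone
```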

Here we define a function that wraps Welch’s method. Welch’s method splits the signal into overlapping segments, computes a periodogram for each segment, and averages them, which reduces the variance of the spectral estimate. The two largest peaks in the averaged periodogram correspond to the two frequencies used to generate the signal. Each periodogram is computed with a discrete Fourier transform, which decomposes the signal into its sinusoidal frequency components.

In [4]:
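def welch(waveform, fs):
    '''Returns [frequencies, power spectrum] estimated with Welch's method.'''
    f, Pwelch_spec = signal.welch(waveform, fs, scaling='spectrum')
    # Uncomment to plot the spectrum on a log scale.
    #plt.semilogy(f, Pwelch_spec)
    #plt.xlabel('frequency [Hz]')
    #plt.ylabel('Power spectrum')
    #plt.grid()
    #plt.show()
    return [f, Pwelch_spec]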
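
As a quick illustration of what the spectrum looks like numerically, the sketch below runs scipy.signal.welch on a clean tone for the digit 5 (770 Hz and 1336 Hz) and reads off the strongest bin in each band. The 1050 Hz band boundary follows the decoding function later in this notebook; the variable names are illustrative.

```python
import numpy as np
import scipy.signal as signal

fs = 8000
t = np.arange(800) / fs                                    # 100 ms of samples
tone = 0.5 * (np.sin(2 * np.pi * 770 * t) + np.sin(2 * np.pi * 1336 * t))  # digit "5"

# With the default segment length of 256 samples, bins are fs / 256 apart.
f, spec = signal.welch(tone, fs, scaling='spectrum')
print(f[1] - f[0])                                         # 31.25

# Strongest bin below 1050 Hz and strongest bin between 1050 Hz and 1800 Hz.
low_band = f < 1050
high_band = (f >= 1050) & (f < 1800)
low_peak = f[low_band][np.argmax(spec[low_band])]
high_peak = f[high_band][np.argmax(spec[high_band])]
print(low_peak, high_peak)                                 # each within one bin of 770 and 1336
```

The peaks land on bin centers rather than on the exact DTMF frequencies, which is why a nearest-frequency step is needed afterwards.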

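Function to find the closest frequency: given a list of candidate frequencies and a measured value K, it returns the candidate nearest to K.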

In [5]:
def closest(lst, K):
    # Return the element of lst with the smallest absolute distance to K.
    return min(lst, key=lambda x: abs(x - K))
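
A short usage example, redefining the function in an equivalent one-line form so the snippet runs on its own. The measured values 781.25 and 900 are made up for illustration.

```python
def closest(lst, K):
    # Return the element of lst nearest to K.
    return min(lst, key=lambda x: abs(x - K))

lows = [697, 770, 852, 941]
print(closest(lows, 781.25))   # 770
print(closest(lows, 900))      # 941
```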

Each periodogram has two peaks, one in the low-frequency group and one in the high-frequency group, and the locations of these peaks give the two frequencies, from which we can look up the digit. Because the frequencies read from the periodogram fall on discrete bins and therefore carry some error, we use the closest function to snap each peak to the nearest DTMF frequency. In this example, we assume a clean periodogram with no noise, which is why the decoding is 100% accurate.

In [6]:
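def frequencies(f, pwelch_spec):
    '''Returns the digit whose DTMF frequency pair is closest to the two spectral peaks.'''
    f = np.asarray(f)
    pwelch_spec = np.asarray(pwelch_spec)
    # Split the spectrum into the low group (below 1050 Hz) and the high group (1050 to 1800 Hz).
    low_mask = f < 1050
    high_mask = (f >= 1050) & (f < 1800)
    # The strongest bin in each band is taken as that band's tone. Indexing by position
    # avoids losing bins that happen to share the same power value.
    low_freq = f[low_mask][np.argmax(pwelch_spec[low_mask])]
    high_freq = f[high_mask][np.argmax(pwelch_spec[high_mask])]
    lows = [697, 770, 852, 941]
    highs = [1209, 1336, 1477, 1633]
    return dtmf_table_freq[(closest(highs, high_freq), closest(lows, low_freq))]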
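
The nearest-frequency step works on clean signals because the spectral bins are much finer than the spacing between DTMF tones. The arithmetic below is a rough sketch that assumes the default Welch segment length of 256 samples and ignores noise and spectral leakage.

```python
lows = [697, 770, 852, 941]
highs = [1209, 1336, 1477, 1633]

fs = 8000
nperseg = 256                        # assumed: default segment length of scipy.signal.welch
bin_spacing = fs / nperseg           # 31.25 Hz between adjacent bins
max_bin_error = bin_spacing / 2      # a peak is at most 15.625 Hz from its nearest bin

def min_gap(freqs):
    # Smallest difference between neighbouring frequencies in a sorted list.
    return min(b - a for a, b in zip(freqs, freqs[1:]))

smallest_gap = min(min_gap(lows), min_gap(highs))
print(smallest_gap)                          # 73 (between 697 Hz and 770 Hz)
print(max_bin_error < smallest_gap / 2)      # True: rounding to a bin cannot cross into a neighbour
```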

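This cell adds noise to the waveform of each digit, saves the noisy signal as a WAV file, and attempts to decode the digit from its periodogram. For each noise standard deviation, it builds a confusion matrix that records how often each true digit is decoded as each possible digit, and writes the matrix to a CSV file. (Work in progress)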

In [7]:
PHONE_NUMBER = "1234567890ABCD#*"
for standard_deviation in range(180,504,54):
    standard_deviation = standard_deviation/100
    confusionMatrix = np.zeros((16, 16))
    names = list('1234567890ABCD#*')
    # Create the output directory (and its parent) once per noise level.
    out_dir = os.path.join("Training data", str(standard_deviation))
    os.makedirs(out_dir, exist_ok=True)
    for digit in PHONE_NUMBER:
        for i in range(1, 11):
            # Scale by 2**10 (1024) before converting to 16-bit audio; note that ^ is XOR in Python.
            temp = (2**10)*waveform(digit, standard_deviation=standard_deviation)
            # Prefix the file name with the digit's index so that different digits do not overwrite each other.
            write(os.path.join(out_dir, str(indexes[digit]) + "_" + str(i) + ".wav"), 8000, temp.astype(np.int16))
            temp1 = welch(temp, 8000)
            guess = frequencies(temp1[0], temp1[1])
            # Rows are the true digit, columns are the decoded digit.
            confusionMatrix[indexes[digit]][indexes[guess]] += 1
    df = pd.DataFrame(confusionMatrix, index=names, columns=names)
    df.to_csv('std_dev' + str(standard_deviation) + '.csv', index=True, header=True, sep=',')
    print(standard_deviation)

1.8
2.34
2.88
3.42
3.96
4.5
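
Once a confusion matrix is available, accuracy can be read off its diagonal. The sketch below uses a made-up 3x3 matrix purely for illustration (it is not a result from this notebook); rows are true digits and columns are decoded digits, as in the cell above.

```python
import numpy as np

# Made-up confusion matrix for three digits, 10 trials each (illustration only).
cm = np.array([[10, 0, 0],
               [1, 9, 0],
               [0, 2, 8]])

# Overall accuracy: correctly decoded trials divided by all trials.
overall_accuracy = np.trace(cm) / cm.sum()
# Per-digit accuracy: diagonal entry divided by the number of trials in that row.
per_digit_accuracy = np.diag(cm) / cm.sum(axis=1)

print(overall_accuracy)      # 0.9
print(per_digit_accuracy)    # 1.0, 0.9 and 0.8 for the three rows
```

The same two lines apply unchanged to the 16x16 matrices saved as CSV files, for example after loading one with pandas and converting it to a numpy array.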
