# Dynamic Time Warping

Dynamic time warping (DTW) is an algorithm for comparing two sequences, which may have different lengths. It reports a small distance when one sequence can be matched to the other by locally stretching or compressing it in time, for example when both contain the same values in the same order but some values are repeated a different number of times.

I will implement a simple version of DTW that performs $n \times m$ distance calculations, where $n$ and $m$ are the lengths of the compared sequences (let's call them $N$ and $M$, with elements $N_i$ and $M_j$ indexed from $0$). The algorithm uses dynamic programming. I will keep the whole array to visualize the results better, but you can keep only the last two rows and compute the distances row by row, which keeps memory usage linear in $m$ while the algorithm remains simple. The final distance between the sequences is in the bottom-right corner.

The algorithm requires choosing a metric $d$ for measuring the distance between single elements.

The algorithm looks like this: we create a distance matrix $D$ of size $(n + 1) \times (m + 1)$ and set the first column and the first row to $\infty$, except the corner $D_{0, 0}$, which is set to $0$.
Then we fill the remaining $n \times m$ cells row by row according to this rule:

$$D_{i+1, j+1} = \min(D_{i+1, j}, D_{i, j+1}, D_{i, j}) + d(N_i, M_j)$$

* Taking $D_{i+1, j}$ means that the current result contains both $d(N_i, M_{j-1})$ and $d(N_i, M_j)$, so the same element of sequence $N$ is compared with two elements of pattern $M$. E.g. pattern $[1, 20, 20, 20, 20]$ and sequence $[1, 1, 1, 10]$ will probably take this case a few times.
* Taking $D_{i, j+1}$ means that the result contains both $d(N_{i-1}, M_j)$ and $d(N_i, M_j)$, so two elements of sequence $N$ are matched with the same element of pattern $M$. E.g. sequence $[1, 1, 1, 1, 1, 20]$ and pattern $[1, 10]$ will use this case to minimize the distance.
* Taking $D_{i, j}$ means that the step advanced to the next element in both sequence $N$ and pattern $M$. E.g. $[1, 2, 3, 4]$ and $[1, 2, 3, 4]$.
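
The recurrence only reads the current row and the row directly above it, so the memory-saving variant mentioned earlier can be sketched roughly as follows. This is a minimal pure-Python sketch under my own assumptions, not the implementation used in the rest of this notebook: the names `dtw_distance_two_rows` and `absolute_difference` are made up for illustration, and the argument order `(metric, pattern, sequence)` is chosen only to match the full-matrix version.

```python
import math


def dtw_distance_two_rows(metric, pattern, sequence):
    """Return only the final DTW distance, keeping two rows in memory."""
    # Row 0 of the full matrix: 0 in the corner, infinity elsewhere.
    previous = [0.0] + [math.inf] * len(pattern)
    for sequence_item in sequence:
        # Column 0 is infinity in every row after the first one.
        current = [math.inf] * (len(pattern) + 1)
        for j, pattern_item in enumerate(pattern):
            best_previous = min(previous[j], previous[j + 1], current[j])
            current[j + 1] = best_previous + metric(pattern_item, sequence_item)
        previous = current
    # The last cell of the last row is the bottom-right corner.
    return previous[-1]


def absolute_difference(x, y):
    """Hypothetical element metric for scalar sequences."""
    return abs(x - y)


print(dtw_distance_two_rows(absolute_difference, [1, 20, 20, 20, 20], [1, 1, 1, 10]))  # 40.0
print(dtw_distance_two_rows(absolute_difference, [1, 2, 3, 4], [1, 2, 3, 4]))  # 0.0
```

Only the final distance survives in this variant, so it cannot be used to draw the whole matrix or to recover which elements were matched.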

In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

import concurrent.futures as cf
import functools as ft
import itertools as it
import json
import math
import operator as op
import os
import pickle

import fastdtw
from IPython.display import display
from ipywidgets import interact, interact_manual, widgets
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import interpolate, misc, optimize, spatial, stats
from sklearn import metrics

from paprotka.dataset import reddots
from paprotka.feature import cepstral

In [3]:
%autoreload 0

In [4]:
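def calculate_dtw_full(metric, pattern, sequence):
    """Return the full (len(sequence) + 1) x (len(pattern) + 1) DTW matrix.

    Rows follow `sequence`, columns follow `pattern`; the final distance is
    in the bottom-right cell.
    """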
    pattern_size = len(pattern)
    sequence_size = len(sequence)

    distances = np.zeros((sequence_size + 1, pattern_size + 1), dtype=np.float64)
    distances[0, :] = math.inf
    distances[:, 0] = math.inf
    distances[0, 0] = 0

    for i, sequence_window in enumerate(sequence):
        for j, pattern_window in enumerate(pattern):
            distance = metric(pattern_window, sequence_window)
            prev_distance = min(distances[i, j], distances[i, j + 1], distances[i + 1, j])
            distances[i + 1, j + 1] = prev_distance + distance

    return distances

dtw_full_norm = lambda pat, seq: calculate_dtw_full(lambda x, y: np.linalg.norm(x - y), pat, seq)

In [5]:
print('Matches same element in sequence with multiple elements in pattern')
print(dtw_full_norm([1,20,20,20,20], [1,1,1,10]))

Matches same element in sequence with multiple elements in pattern
[[  0.  inf  inf  inf  inf  inf]
 [ inf   0.  19.  38.  57.  76.]
 [ inf   0.  19.  38.  57.  76.]
 [ inf   0.  19.  38.  57.  76.]
 [ inf   9.  10.  20.  30.  40.]]


In [8]:
print('Matches multiple elements from sequence with the same element in pattern')
print(dtw_full_norm([1,10], [1,1,1,1,1,20]))

Matches multiple elements from sequence with the same element in pattern
[[  0.  inf  inf]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf  19.  10.]]


In [9]:
print('Matches elements one to one')
print(dtw_full_norm([1, 2, 3, 4], [1, 2, 3, 4]))

Matches elements one to one
[[  0.  inf  inf  inf  inf]
 [ inf   0.   1.   3.   6.]
 [ inf   1.   0.   1.   3.]
 [ inf   3.   1.   0.   1.]
 [ inf   6.   3.   1.   0.]]


In [6]:
print('If the sequence equals the pattern except that some elements are repeated a different number of times, the distance is still 0')
print(dtw_full_norm([1, 1, 2, 3, 3, 3, 4, 4], [1, 2, 2, 2, 3, 4]))

If the sequence equals the pattern except that some elements are repeated a different number of times, the distance is still 0
[[  0.  inf  inf  inf  inf  inf  inf  inf  inf]
 [ inf   0.   0.   1.   3.   5.   7.  10.  13.]
 [ inf   1.   1.   0.   1.   2.   3.   5.   7.]
 [ inf   2.   2.   0.   1.   2.   3.   5.   7.]
 [ inf   3.   3.   0.   1.   2.   3.   5.   7.]
 [ inf   5.   5.   1.   0.   0.   0.   1.   2.]
 [ inf   8.   8.   3.   1.   1.   1.   0.   0.]]
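
The printed matrices show accumulated distances, but not which elements ended up matched with each other. One way to see that is to walk back from the bottom-right corner, always moving to the cheapest of the three predecessor cells. The sketch below is an illustration under my own assumptions, not part of the original notebook: `dtw_matrix`, `warping_path` and `absolute_difference` are hypothetical names, the matrix is rebuilt with plain lists so the example is self-contained, and ties are broken in favour of the diagonal step, which is an arbitrary choice.

```python
import math


def dtw_matrix(metric, pattern, sequence):
    """Fill the full (len(sequence) + 1) x (len(pattern) + 1) DTW matrix."""
    rows, cols = len(sequence) + 1, len(pattern) + 1
    distances = [[math.inf] * cols for _ in range(rows)]
    distances[0][0] = 0.0
    for i, sequence_item in enumerate(sequence):
        for j, pattern_item in enumerate(pattern):
            best_previous = min(distances[i][j], distances[i][j + 1], distances[i + 1][j])
            distances[i + 1][j + 1] = best_previous + metric(pattern_item, sequence_item)
    return distances


def warping_path(distances):
    """Walk back from the bottom-right corner to (0, 0).

    Returns (sequence_index, pattern_index) pairs in forward order.
    """
    i, j = len(distances) - 1, len(distances[0]) - 1
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (distances[i - 1][j - 1], i - 1, j - 1),  # both advanced
            (distances[i - 1][j], i - 1, j),          # same pattern element reused
            (distances[i][j - 1], i, j - 1),          # same sequence element reused
        ]
        # min() keeps the first minimal candidate, so the diagonal wins ties.
        _, i, j = min(candidates, key=lambda candidate: candidate[0])
    return path[::-1]


def absolute_difference(x, y):
    """Hypothetical element metric for scalar sequences."""
    return abs(x - y)


matrix = dtw_matrix(absolute_difference, [1, 10], [1, 1, 1, 1, 1, 20])
print(warping_path(matrix))
# [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 1)]
```

In this example the first five sequence elements are all matched with the first pattern element, and only the last one is matched with the second.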


In [11]:
print(dtw_full_norm([1, 10], [1, 1, 1, 1, 1]))
print(dtw_full_norm([1, 10], [10, 10, 10, 10, 10]))

[[  0.  inf  inf]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]
 [ inf   0.   9.]]
[[  0.  inf  inf]
 [ inf   9.   9.]
 [ inf  18.   9.]
 [ inf  27.   9.]
 [ inf  36.   9.]
 [ inf  45.   9.]]


In [12]:
print(dtw_full_norm([1, 2, 2, 2, 3, 4], [2, 3, 4, 4, 4, 5]))

[[  0.  inf  inf  inf  inf  inf  inf]
 [ inf   1.   1.   1.   1.   2.   4.]
 [ inf   3.   2.   2.   2.   1.   2.]
 [ inf   6.   4.   4.   4.   2.   1.]
 [ inf   9.   6.   6.   6.   3.   1.]
 [ inf  12.   8.   8.   8.   4.   1.]
 [ inf  16.  11.  11.  11.   6.   2.]]


# Using DTW on RedDots

To detect impostors we can't just return the closest pattern: we need a probability-like score that we can put a threshold on. I will try [this](https://stackoverflow.com/questions/4934203/probability-of-a-k-nearest-neighbor-like-classification) idea.
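
One possible reading of that idea, and roughly what the classifier further down does, is to weight every enrolled pattern by $e^{-\text{distance}}$, add up the weights per label and normalize. The sketch below is a standalone illustration, not the notebook's code: `label_probabilities` is a hypothetical name, the distances and speaker names are made up, and subtracting the smallest distance before exponentiating is my own addition to avoid underflow (it does not change the normalized result).

```python
import math
from collections import defaultdict


def label_probabilities(distances, labels):
    """Turn per-pattern distances into per-label scores that sum to 1."""
    # Shifting by the smallest distance leaves the normalized result
    # unchanged but keeps exp() from underflowing to zero everywhere.
    smallest = min(distances)
    weights = [math.exp(-(distance - smallest)) for distance in distances]

    # Sum the weights of all patterns that share a label.
    totals = defaultdict(float)
    for weight, label in zip(weights, labels):
        totals[label] += weight

    normalizer = sum(totals.values())
    return {label: total / normalizer for label, total in totals.items()}


# Made-up distances to four enrolled patterns of two hypothetical speakers.
scores = label_probabilities([0.5, 0.7, 3.0, 4.0], ['anna', 'anna', 'ben', 'ben'])
print(scores)
```

The scores are not calibrated probabilities; they only give a number between 0 and 1 per label that a threshold can be applied to.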

In [13]:
root = reddots.get_root()
load_pcm = ft.partial(reddots.load_pcm, root)
load_mfcc = ft.partial(reddots.load_npy, root, 'mfcc_default')

def save_results(label, results):
    path = os.path.join(root, 'result', label)
    # pickle writes bytes, so the file must be opened in binary mode
    with open(path, 'wb') as opened:
        pickle.dump(results, opened)

def load_results(label):
    path = os.path.join(root, 'result', label)
    with open(path, 'rb') as opened:
        return pickle.load(opened)

In [14]:
enrollments_1 = reddots.load_enrollments(root + '/ndx/f_part_01.trn', root + '/ndx/m_part_01.trn')
print('Enrollments', enrollments_1.dtypes, sep='\n')

trials_1 = reddots.load_trials(root + '/ndx/f_part_01.ndx', root + '/ndx/m_part_01.ndx')
print('Trials', trials_1.dtypes, sep='\n')

display(enrollments_1.groupby(['is_male', 'speaker_id']).size())

Enrollments
is_male                  bool
pcm_path               object
sentence_id             int16
speaker_id              int16
timestamp      datetime64[ns]
dtype: object
Trials
correct_sentence                  bool
expected_is_male                  bool
expected_sentence_id             int16
expected_speaker_id              int16
pcm_path                        object
target_person                     bool
trial_is_male                     bool
trial_sentence_id                int16
trial_speaker_id                 int16
trial_timestamp         datetime64[ns]
dtype: object


is_male  speaker_id
False    2             30
         4             30
         5             30
         6             30
         8             24
         12            30
True     1             30
         2             30
         4             30
         5             30
         6             30
         7             30
         8              6
         9             30
         13            30
         14            24
         15            30
         16            30
         17            30
         18            30
         19            30
         20            30
         21            24
         22            30
         23            30
         26            30
         28            30
         29            30
         32            30
         38            24
         40            30
         41            30
         43            30
         47            30
         48             6
         51            30
         52             6
         53       

In [15]:
class DynamicTimeWarpingClassifier:
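    def __init__(self):
        self.patterns = None
        self.labels = None
        self.unique_labels = None

    def fit(self, features, labels):
        self.patterns = features
        labels = np.asarray(labels)
        # A DataFrame mixing bool and int columns yields an object array,
        # which np.unique(..., axis=0) rejects, so cast it to integers.
        if labels.dtype == object:
            labels = labels.astype(np.int64)
        self.labels = labels
        # axis=0 keeps each multi-column label (one row) intact
        self.unique_labels = np.unique(labels, axis=0)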

    def predict(self, features, metric=spatial.distance.cosine):
        sequence_label_proba = self.predict_proba(features, metric)
        max_proba_index = sequence_label_proba.argmax(axis=1)
        return self.unique_labels[max_proba_index]
    
    def predict_proba(self, features, metric=spatial.distance.cosine):
        sequence_n = len(features)
        label_n = len(self.unique_labels)

        # one row per input sequence, one probability per unique label
        sequence_label_proba = np.zeros((sequence_n, label_n), dtype=np.float64)
        for i, sequence in enumerate(features):
            sequence_label_proba[i, :] = self.predict_single_proba(sequence, metric)
            
        return sequence_label_proba
    
    def predict_single_proba(self, sequence, metric=spatial.distance.cosine):
        pattern_dists = np.zeros(len(self.patterns), dtype=np.float64)
        for i, pattern in enumerate(self.patterns):
            distance, _ = fastdtw.fastdtw(pattern, sequence, dist=metric)
            pattern_dists[i] = distance
            
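        # Shift by the smallest distance so that exp() cannot underflow to
        # zero for every pattern; the normalized result is unchanged.
        pattern_proba = np.exp(-(pattern_dists - pattern_dists.min()))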
        
        label_proba = np.zeros(len(self.unique_labels), dtype=np.float64)
        all_dim = tuple(range(1, self.labels.ndim))
        for i, label in enumerate(self.unique_labels):
            relevant = (self.labels == label).all(axis=all_dim)
            total_proba = pattern_proba[relevant].sum()
            label_proba[i] = total_proba
        
        return label_proba / label_proba.sum()

In [16]:
def perform_enrollment(classifier, enrollments):
    labels = enrollments[['is_male', 'speaker_id', 'sentence_id']].values
    features = [load_mfcc(path) for path in enrollments['pcm_path']]
    classifier.fit(features, labels)
    
def perform_trial(classifier, path):
    features = load_mfcc(path)
    return classifier.predict_single_proba(features)

def perform_trials(classifier, trials):
    paths = trials['pcm_path'].unique()
    results = {}
    for path in paths:
        results[path] = perform_trial(classifier, path)
#     with cf.ThreadPoolExecutor(max_workers=10) as executor:
#         future_to_path = {executor.submit(perform_trial, classifier, path): path for path in paths}
#         for future in cf.as_completed(future_to_path):
#             path = future_to_path[future]
#             result = future.result()
#             results[path] = result
    return results

In [17]:
classifier = DynamicTimeWarpingClassifier()

In [18]:
perform_enrollment(classifier, enrollments_1)

In [None]:
results_1 = perform_trials(classifier, trials_1)



In [None]:
save_results('dtw_1', results_1)

In [None]:
print('done')

In [None]:
for path in trials_1['pcm_path']:
    proba_per_label = results_1[path]
    # each result is a 1-D array with one probability per unique label
    max_index = proba_per_label.argmax()
    is_male, speaker_id, sentence_id = classifier.unique_labels[max_index]

def calculate_proba(classifier, label_checker, results, row):
    relevant_result = results[row.pcm_path]
    current_row_checker = ft.partial(label_checker, row)
    relevant_indexes = np.array(list(map(current_row_checker, classifier.unique_labels)))
    return relevant_result[relevant_indexes].sum()
    
check_target_right = lambda row, label: label[0] == row.expected_is_male and label[1] == row.expected_speaker_id
check_sentence_right = lambda row, label: label[2] == row.expected_sentence_id
check_both_right = lambda row, label: check_target_right(row, label) and check_sentence_right(row, label)

# argument order must match calculate_proba(classifier, label_checker, results, row)
calculate_proba_target_right = ft.partial(calculate_proba, classifier, check_target_right, results_1)
calculate_proba_sentence_right = ft.partial(calculate_proba, classifier, check_sentence_right, results_1)
calculate_proba_both_right = ft.partial(calculate_proba, classifier, check_both_right, results_1)

def equal_error_rate(fpr, tpr, thresholds):
    eer = optimize.brentq(lambda x : 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
    threshold = interpolate.interp1d(fpr, thresholds)(eer)
    return eer, threshold

def plot_roc(fpr, tpr, auc, eer):
    plt.figure()
    plt.plot(fpr, tpr, color='darkorange',
             lw=2, label='ROC curve (area = %0.2f, EER = %0.2f)' % (auc, eer))
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver operating characteristic example')
    plt.legend(loc="lower right")
    plt.show()
    
def visualize_roc(fpr, tpr, thresholds):
    roc_auc = metrics.auc(fpr, tpr)
    # equal_error_rate returns (eer, threshold); only the rate is plotted
    eer, _ = equal_error_rate(fpr, tpr, thresholds)
    plot_roc(fpr, tpr, roc_auc, eer)
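
The equal error rate is the operating point where the false positive rate equals the false negative rate, which is what the root of $1 - x - \mathrm{tpr}(x)$ expresses. As a rough cross-check that needs neither SciPy nor scikit-learn, the sketch below approximates it with a plain threshold sweep. It is a different, simpler technique than the interpolation used in `equal_error_rate`, and everything in it is an assumption made for illustration: the name `equal_error_rate_sweep`, the rule that a trial is accepted when its score is at least the threshold, and the toy scores. It also assumes that at least one target and one non-target trial are present.

```python
def equal_error_rate_sweep(is_target, scores):
    """Approximate the EER by trying every observed score as a threshold.

    A trial is accepted when its score is >= threshold. Returns
    (eer, threshold) for the threshold where the false positive rate and
    the false negative rate are closest to each other.
    """
    positives = sum(1 for target in is_target if target)
    negatives = len(is_target) - positives
    best = None
    for threshold in sorted(set(scores)):
        false_positives = sum(1 for target, score in zip(is_target, scores)
                              if not target and score >= threshold)
        false_negatives = sum(1 for target, score in zip(is_target, scores)
                              if target and score < threshold)
        fpr = false_positives / negatives
        fnr = false_negatives / positives
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            # report the midpoint of the two rates at the closest point
            best = (gap, (fpr + fnr) / 2, threshold)
    _, eer, threshold = best
    return eer, threshold


# Made-up trial outcomes: four target trials followed by four impostors.
toy_is_target = [True, True, True, True, False, False, False, False]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
print(equal_error_rate_sweep(toy_is_target, toy_scores))  # (0.25, 0.6)
```

With few trials the sweep can only land on observed scores, so its result may differ slightly from the interpolated value.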

In [135]:
# is target right, disregarding sentence
proba_target_right = trials_1.apply(calculate_proba_target_right, axis=1)
roc_target_right = metrics.roc_curve(trials_1.target_person, proba_target_right)
visualize_roc(*roc_target_right)

# is sentence correct or wrong, disregarding target
proba_sentence_right = trials_1.apply(calculate_proba_sentence_right, axis=1)
roc_sentence_right = metrics.roc_curve(trials_1.correct_sentence, proba_sentence_right)
visualize_roc(*roc_sentence_right)

# is target right, when the sentence is correct
proba_both_right = trials_1.apply(calculate_proba_both_right, axis=1)
roc_both_right = metrics.roc_curve(trials_1.target_person & trials_1.correct_sentence, proba_both_right)
visualize_roc(*roc_both_right)

array(['f0001/20150302171021680_f0001_31.pcm',
       'f0001/20150302171023147_f0001_34.pcm',
       'f0001/20150302171024582_f0001_39.pcm', ...,
       'm0067/20150701174119663_m0067_40.pcm',
       'm0067/20150701174124592_m0067_40.pcm',
       'm0067/20150701174134717_m0067_40.pcm'], dtype=object)