# Assignment 2: Costs and States

_See [Assignment 2: Costs and States](https://sikoried.github.io/sequence-learning/02/cost-and-states/)._


## Keyboard Aware Auto-Correct

In the previous assignment, we applied a uniform cost to all substitutions.
This is not realistic for typed text: on a QWERTY keyboard, substitutions between neighboring keys (e.g. _a_ and _s_) are far more likely than substitutions between distant keys (e.g. _a_ and _k_).

- Implement a distance metric that computes a weight for a given character substitution.
	Hint: Think of the keyboard as a checkerboard on which the keys are uniformly spaced; a simple heuristic is to map each key to its offset on the board and compute the Euclidean distance between the two offsets (compare e.g. [here](https://github.com/wsong/Typo-Distance/blob/master/typodistance.py)).
- Integrate this distance into the basic edit distance from last week.
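
One way to approach both steps is sketched below. This is a minimal sketch, not a reference solution: the three-row layout, the names `QWERTY_ROWS`, `substitution_cost` and `weighted_edit`, the normalisation to the range 0 to 1, and the fallback cost of 1 for characters that are not on the grid are all assumptions made for illustration.

```python
import math

# Assumed layout: the three letter rows of a US QWERTY keyboard, treated as a
# regular grid (the row stagger of a real keyboard is ignored).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_POS = {
    ch: (r, c)
    for r, row in enumerate(QWERTY_ROWS)
    for c, ch in enumerate(row)
}

# Largest distance between any two keys on the grid, used to scale costs.
MAX_DIST = max(math.dist(p, q) for p in KEY_POS.values() for q in KEY_POS.values())


def substitution_cost(a, b):
    """Return 0 for equal characters, otherwise a cost in (0, 1] that grows
    with the Euclidean distance between the two keys."""
    if a == b:
        return 0.0
    if a not in KEY_POS or b not in KEY_POS:
        return 1.0  # assumption: unknown characters keep the uniform cost
    return math.dist(KEY_POS[a], KEY_POS[b]) / MAX_DIST


def weighted_edit(x, y, sub_cost=substitution_cost, indel_cost=1.0):
    """Edit distance with a pluggable substitution cost."""
    D = [[0.0] * (len(y) + 1) for _ in range(len(x) + 1)]
    # against the empty word, only insertions or deletions are possible
    for i in range(1, len(x) + 1):
        D[i][0] = i * indel_cost
    for j in range(1, len(y) + 1):
        D[0][j] = j * indel_cost

    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            D[i][j] = min(
                D[i - 1][j] + indel_cost,                          # deletion
                D[i][j - 1] + indel_cost,                          # insertion
                D[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),    # substitution
            )
    return D[len(x)][len(y)]
```

With this scaling, a substitution never costs more than an insertion or deletion, so `weighted_edit("cat", "cst")` is just the cost of replacing _a_ by _s_, which is smaller than the cost of replacing _a_ by _k_. Uppercase letters, umlauts and punctuation fall back to the uniform cost here, which ties in with the discussion questions.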


## Discuss your Implementation

- What could be better heuristics for the cost of a substitution than the one above?
- What about swipe-to-type?
- What about capitalization, punctuation and special characters?

In [4]:
gem_doppel = [
    ("GCGTATGAGGCTAACGC", "GCTATGCGGCTATACGC"),
    ("kühler schrank", "schüler krank"),
    ("the longest", "longest day"),
    ("nicht ausgeloggt", "licht ausgenockt"),
    ("gurken schaben", "schurkengaben")
]

import numpy as np

def edit(x, y):
    D = np.zeros((len(x) + 1, len(y) + 1), dtype=int)

    # for the empty word, costs match the length of the other string
    D[0, 1:] = range(1, len(y) + 1)
    D[1:, 0] = range(1, len(x) + 1)
    
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            delta = 0 if x[i-1] == y[j-1] else 1
            D[i, j] = min(
                D[i-1, j] + 1,
                D[i, j-1] + 1,
                D[i-1, j-1] + delta
            )

    return D[len(x), len(y)]

def edit5(x, y):
    # stub: always returns 0 until the weighted edit distance is implemented
    return 0


from IPython.display import display
from tabletext import to_text

print(to_text([(a, b, edit(a, b), edit5(a, b)) for (a, b) in gem_doppel]))


┌───────────────────┬───────────────────┬───┬───┐
│ GCGTATGAGGCTAACGC │ GCTATGCGGCTATACGC │ 3 │ 0 │
├───────────────────┼───────────────────┼───┼───┤
│ kühler schrank    │ schüler krank     │ 6 │ 0 │
├───────────────────┼───────────────────┼───┼───┤
│ the longest       │ longest day       │ 8 │ 0 │
├───────────────────┼───────────────────┼───┼───┤
│ nicht ausgeloggt  │ licht ausgenockt  │ 4 │ 0 │
├───────────────────┼───────────────────┼───┼───┤
│ gurken schaben    │ schurkengaben     │ 7 │ 0 │
└───────────────────┴───────────────────┴───┴───┘


## Isolated Word Recognition using Dynamic Time Warping

Actually, using [dynamic time warping](https://en.wikipedia.org/wiki/Dynamic_time_warping) (i.e. edit distance with uniform cost) for isolated word recognition is a [really old](https://ieeexplore.ieee.org/document/1171695/) technique, but still worth the exercise!

_Note: This assignment was originally implemented in Java using JSTK._


### Extracting Features

For the audio i/o and feature extraction, we'll use [librosa](https://librosa.github.io/librosa/index.html).
Due to the relatively high sampling rate (e.g. 8 kHz, which means 8000 samples per second), performing DTW on the raw audio signal is not advised (feel free to try!).
A better solution is to compute a sequence of feature vectors; here we will [extract mel-frequency cepstral coefficients](https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html) over windows of 25 ms length, shifted by 10 ms.

You can record your own samples, or use (a fraction of) this dataset: http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz
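
The windowing described above could be set up roughly as follows. This is a sketch under assumptions: the helper names `frame_params` and `extract_mfcc`, the default sampling rate of 16 kHz, the choice of 13 coefficients and the power-of-two FFT size are illustrative and not prescribed by the assignment. The window and hop arguments are forwarded by `librosa.feature.mfcc` to the underlying mel spectrogram.

```python
def frame_params(sr, win_ms=25.0, hop_ms=10.0):
    """Convert window length and shift from milliseconds to samples."""
    win_length = int(round(sr * win_ms / 1000))
    hop_length = int(round(sr * hop_ms / 1000))
    # FFT size: the smallest power of two that holds a whole window
    n_fft = 1 << (win_length - 1).bit_length()
    return win_length, hop_length, n_fft


def extract_mfcc(path, sr=16000, n_mfcc=13):
    """Load an audio file and return its MFCCs, one row per frame."""
    import librosa  # imported here so frame_params is usable without librosa

    y, sr = librosa.load(path, sr=sr)  # mono, resampled to `sr`
    win_length, hop_length, n_fft = frame_params(sr)
    mfcc = librosa.feature.mfcc(
        y=y, sr=sr, n_mfcc=n_mfcc,
        n_fft=n_fft, win_length=win_length, hop_length=hop_length,
    )
    # librosa returns (n_mfcc, frames); transpose to (frames, n_mfcc)
    return mfcc.T
```

At 16 kHz, a 25 ms window corresponds to 400 samples and a 10 ms shift to 160 samples; at 8 kHz the values are 200 and 80.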

### Algorithm

- Compute the features for the sample words and store them in an array
- For any new word, compute the features and the DTW distances to all sample words
- Pick the sample word with the smallest distance
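
The three steps above can be sketched as follows. This is a minimal, unoptimised sketch: the function names `dtw` and `classify`, the Euclidean frame distance and the `(label, features)` template format are assumptions for illustration, and the total cost is not normalised by path length.

```python
import math


def dtw(a, b, dist=math.dist):
    """Accumulated DTW cost between two sequences of feature vectors
    (one vector per frame), using the three standard step directions."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # both sequences must start aligned

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])  # local frame-to-frame distance
            D[i][j] = cost + min(
                D[i - 1][j],      # frame of `a` repeated against b[j-1]
                D[i][j - 1],      # frame of `b` repeated against a[i-1]
                D[i - 1][j - 1],  # both sequences advance
            )
    return D[n][m]


def classify(features, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a list of (label, feature_sequence) pairs.
    """
    return min(templates, key=lambda t: dtw(features, t[1]))[0]
```

Unlike the edit distance, there is no separate insertion or deletion cost here: every step pays the local frame distance, so a stretched version of a template still aligns at zero cost. Rows of the `(frames, n_mfcc)` arrays produced by the feature extraction can be passed in directly, although a vectorised distance matrix would be considerably faster for real recordings.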


### Dynamic Programming and States: DTMF Decoding

_Still todo._

https://fairyonice.github.io/decode-the-dial-up-sounds-using-spectrogram.html

In [5]:
x --------------------------------------------------
sil   /
1    -->
2    |\
3    | \
4
5
.
.
.




