Exercise-1
---

Create a class `Trellis` that
- takes four arguments: `match_weight`, `delete_weight`, `add_weight`, and `scoring_func`.
    - `scoring_func` is a function that computes the distance (or score) between two `values`.
    - `match_weight`, `delete_weight`, and `add_weight` are floats that weight diagonal, horizontal, and vertical transitions, respectively.
- contains a method `match(X, Y)`, where `X` and `Y` are arrays of `values` (characters, scalars, or even vectors). It returns the minimum edit distance (matching score) between `X` and `Y` together with the shortest path, as an array of 2-tuples.
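To make the role of `scoring_func` concrete, here is a small sketch of three possible scoring functions, one for each kind of `value` mentioned above. The names `char_score`, `scalar_score`, and `vector_score` are illustrative assumptions, not part of the exercise.

```python
import math


def char_score(a, b):
    """0/1 mismatch cost for characters (or anything comparable with ==)."""
    return 0.0 if a == b else 1.0


def scalar_score(a, b):
    """Absolute difference between two numbers."""
    return abs(a - b)


def vector_score(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


print(char_score("a", "b"))          # 1.0
print(scalar_score(3.0, 1.5))        # 1.5
print(vector_score((0, 0), (3, 4)))  # 5.0
```

Any of these could be passed as `scoring_func`, as long as it accepts two values and returns a number where smaller means "more similar".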

In [56]:
class Trellis:
    
    def __init__(self, scoring_func, match_weight=1.0, delete_weight=2.0, add_weight=2.0):
        self.scoring_func = scoring_func
        self.match_weight = match_weight
        self.delete_weight = delete_weight
        self.add_weight = add_weight
    
    def match(self, X, Y, normalize_score=True):
        scoring_func = self.scoring_func
        match_weight, delete_weight, add_weight = self.match_weight, self.delete_weight, self.add_weight
    
        score = 0
        back_pointer = []  # visited (ii, jj) cells, in order
        path_length = 0    # number of transitions taken
        
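        # Greedy walk: from each cell, take the cheapest of the three transitions.
        ii = jj = 0
        while ii < len(X) and jj < len(Y):
            if ii == jj and ii == 0:
                # Starting cell: score the first pair (unweighted).
                score += scoring_func(X[ii], Y[jj])
                back_pointer.append((ii, jj))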
            match_score = delete_score = add_score = 1e9
            if ii + 1 < len(X):
                if jj + 1 < len(Y):
                    match_score = match_weight*scoring_func(X[ii+1], Y[jj+1])
                delete_score = delete_weight*scoring_func(X[ii+1], Y[jj])                
            if jj + 1 < len(Y):
                add_score = add_weight*scoring_func(X[ii], Y[jj+1])
            min_score = min(match_score, delete_score, add_score)
            if min_score == 1e9:
                break
            if match_score == min_score:
                ii += 1
                jj += 1
            elif delete_score == min_score:
                ii += 1                
            elif add_score == min_score:
                jj += 1
            score += min_score
            path_length += 1
            back_pointer.append((ii, jj))

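        # Defensive tail handling. For non-empty inputs the loop above only
        # stops once both indices are at their last positions, so these
        # loops normally do not run.
        if ii == len(X) - 1 and jj < len(Y) - 1:
            while jj < len(Y) - 1:
                jj += 1
                score += add_weight*scoring_func(X[ii], Y[jj])
                path_length += 1
                back_pointer.append((ii, jj))
        if jj == len(Y) - 1 and ii < len(X) - 1:
            while ii < len(X) - 1:
                ii += 1
                score += delete_weight*scoring_func(X[ii], Y[jj])
                path_length += 1
                back_pointer.append((ii, jj))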

        # Skip normalization when no transition was taken (avoids dividing by zero).
        if normalize_score and path_length > 0:
            score = score/path_length
        return score, back_pointer
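The `match` method above advances greedily: at each cell it commits to the locally cheapest transition, so it is not guaranteed to find the globally cheapest path. For comparison, here is a minimal dynamic-programming sketch that fills the whole trellis. It assumes the same cost model as the class (entering cell `(i, j)` costs the transition weight times `scoring_func(X[i], Y[j])`, and the starting cell is scored unweighted). The function name `dp_match` is hypothetical.

```python
def dp_match(X, Y, scoring_func, match_weight=1.0, delete_weight=2.0, add_weight=2.0):
    """Return (total_cost, path) for the cheapest path from (0, 0)
    to (len(X) - 1, len(Y) - 1). Assumes both sequences are non-empty."""
    if not X or not Y:
        raise ValueError("both sequences must be non-empty")
    n, m = len(X), len(Y)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]   # best cost to reach each cell
    back = [[None] * m for _ in range(n)]  # predecessor of each cell
    cost[0][0] = scoring_func(X[0], Y[0])

    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            local = scoring_func(X[i], Y[j])
            candidates = []
            if i > 0 and j > 0:  # diagonal transition (match)
                candidates.append((cost[i - 1][j - 1] + match_weight * local, (i - 1, j - 1)))
            if i > 0:            # advance in X only (delete)
                candidates.append((cost[i - 1][j] + delete_weight * local, (i - 1, j)))
            if j > 0:            # advance in Y only (add)
                candidates.append((cost[i][j - 1] + add_weight * local, (i, j - 1)))
            cost[i][j], back[i][j] = min(candidates)

    # Follow the back-pointers from the last cell to recover the path.
    path = []
    cell = (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return cost[n - 1][m - 1], path


print(dp_match("TEST", "TES", lambda x, y: 0.0 if x == y else 1.0))
# (2.0, [(0, 0), (1, 1), (2, 2), (3, 2)])
```

The sketch returns the unnormalized cost; dividing by `len(path) - 1` would mirror the `normalize_score` option. It uses O(len(X) * len(Y)) time and memory, which is the price of exploring every cell instead of a single greedy path.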

In [None]:
if __name__ == "__main__":
    trellis = Trellis(lambda x, y: 0.0 if x == y else 1.0)

    test_cases = [
        ['TEST', 'TES'],
        ['geek', 'gesek'],
        ['ISLANDER', 'SLANDER'],
        ['MART', 'KARMA']
    ]

    for case in test_cases:
        print(case, trellis.match(case[0], case[1], normalize_score=True)[0])

['TEST', 'TES'] 0.6666666666666666
['geek', 'gesek'] 0.6
['ISLANDER', 'SLANDER'] 0.14285714285714285
['MART', 'KARMA'] 1.0

Exercise-2
---

Use the matching algorithm written above to correct the spelling of words input through the keyboard.
I.e., create your own spell checker (albeit a quite inefficient one).

A list of English words has been given to you in `words.txt`.

In [59]:
if __name__ == "__main__":
    # Read the word list, skipping empty lines; the file is closed automatically.
    with open("words2.txt", "r") as f:
        dictionary = list(filter(None, f.read().split("\n")))

    trellis = Trellis(lambda x, y: 0.0 if x == y else 1.0, delete_weight=4.0)
    print("Enter /quit to quit")
    while True:
        x = input("word>> ")
        x = x.lower()
        if x == "/quit":
            print("Goodbye!")
            break
        if x in dictionary:
            print("word found")
            continue
        min_sc = 1e9
        match = x
        for el in dictionary:
            sc = trellis.match(el, x, normalize_score=False)[0]
            if sc < min_sc:
                min_sc = sc
                match = el
        print("closest match: ", match)

Enter /quit to quit
word>> waer
closest match:  water
word>> /quit
Goodbye!
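The loop above scores the input against every dictionary word. A rough sketch of one way to reduce that work is shown below: keep the vocabulary in a set for fast exact lookups, and only score candidates whose length is close to the input. Note that this sketch swaps in a plain unit-cost Levenshtein distance instead of the `Trellis` class, and that `levenshtein`, `closest_word`, and `max_len_diff` are hypothetical names chosen for illustration.

```python
def levenshtein(a, b):
    """Classic unit-cost edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute or match
        prev = curr
    return prev[-1]


def closest_word(word, vocabulary, max_len_diff=2):
    """Return `word` if it is known, else the nearest word in `vocabulary`.

    `vocabulary` is assumed to be a non-empty set of lowercase words.
    """
    word = word.lower()
    if word in vocabulary:  # O(1) lookup in a set
        return word
    # Skip words whose length alone already implies a large distance.
    candidates = [w for w in vocabulary if abs(len(w) - len(word)) <= max_len_diff]
    # Break ties alphabetically so the result does not depend on set ordering.
    return min(candidates or vocabulary, key=lambda w: (levenshtein(w, word), w))


vocabulary = {"water", "winter", "tree", "later"}
print(closest_word("waer", vocabulary))  # water
```

The length filter is a heuristic: a larger `max_len_diff` considers more candidates at the cost of more distance computations.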
