# Coding Exercise

You must correctly implement the function described in the prompt below.

Feel free to test out pieces of code to help you write the solution.

Please thoroughly test that the final code implements the function correctly.

## Prompt

**Function signature:** `decompose(s: str, A: List[str]) -> int`

    Little Bonnie and her friends were dismayed to learn that their parents were reading all of their private communications.  They decided to invent a new language that would allow them to talk freely.  What they finally came up with was a language where sentences are built using a special method. 

    All the valid words that can be used in the new language are given in the String[] validWords.  A sentence is a concatenation (with no spaces) of a sequence of valid words.  Each valid word can appear 0 or more times in the sentence.  What makes the language special is that each word can be transformed by rearranging its letters before being used.  The cost to transform a word is defined as the number of character positions where the original word and the transformed word differ.  For example, "abc" can be transformed to "abc" with a cost of 0, to "acb", "cba" or "bac" with a cost of 2, and to "bca" or "cab" with a cost of 3. 

    Although several different sequences of valid words can produce the same sentence in this language, only the sequence with the least total transformation cost carries the meaning of the sentence.  The advantage of the new language is that the parents can no longer understand what the kids are saying.  The disadvantage is that the kids themselves also do not understand.  They need your help. 

    Given a String sentence, return the total cost of transformation of the sequence of valid words which carries the meaning of the sentence, or -1 if no such sequence exists.
    Notes
    -If a word is used multiple times in a sentence, each occurrence can be transformed differently.
 
    Constraints
    -sentence will contain between 1 and 50 lowercase letters ('a'-'z'), inclusive.
    -validWords will contain between 1 and 50 elements, inclusive.
    -Each element of validWords will contain between 1 and 50 lowercase letters ('a'-'z'), inclusive.
 
    Examples
    0)
        "neotowheret"
    {"one", "two", "three", "there"}

    Returns: 8
    The following transformations can be made:

    "one" -> "neo" with a cost of 3 
    "two" -> "tow" with a cost of 2 
    "three" -> "heret" with a cost of 3 
    "there" -> "heret" with a cost of 5 

    So the sequence {"one", "two", "three"} is the one carrying the meaning of "neotowheret". Its total transformation cost is 3 + 2 + 3 = 8.

    1)
        "abba"
    {"ab", "ac", "ad"}

    Returns: 2
    The word "ab" is used twice, and each time, it is transformed differently.

    2)
        "thisismeaningless"
    {"this", "is", "meaningful"}

    Returns: -1

    3)
        "ommwreehisymkiml"
    {"we", "were", "here", "my", "is", "mom", "here", "si", "milk", "where", "si"}

    Returns: 10

    4)
        "ogodtsneeencs"
    {"go", "good", "do", "sentences", "tense", "scen"}

    Returns: 8

    5)
        "sepawaterords"
    {"separate","words"}

    Returns: -1
    You are only allowed to rearrange letters within words, and not in the entire sentence.

    

 After spending some time thinking, I believe the best approach is to solve this problem recursively by splitting the string $s$ into a left part and a right part.
 For each valid word <code>A[i]</code>, the left part is the first <code> len(A[i]) </code> characters of $s$, and the right part is the remainder.
 
 First I will implement a function that returns the cost between two strings s1 and s2. The cost will be -1 if the two strings are not rearrangements of each other.

In [1]:
def cost(s1, s2):
    s1 = list(s1)
    s2 = list(s2)
    
    c = sum([ a == b for a, b in zip(s1, s2) ])
    
    if s1.sort() == s2.sort():
        return c
    else:
        return -1
    
cost('aaa', 'aaa')

3

The cost should be 0. We should count the positions where a != b, not the positions where a == b.

In [2]:
def cost(s1, s2):
    s1 = list(s1)
    s2 = list(s2)
    
    c = sum([ a != b for a, b in zip(s1, s2) ])
    
    if s1.sort() == s2.sort():
        return c
    else:
        return -1
    
cost('aaa', 'aaa')

0
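The position-wise count can also be checked on its own against the costs listed in the prompt for "abc". This is a minimal sketch; the helper name `mismatches` is hypothetical and is not part of the solution.

```python
def mismatches(original: str, transformed: str) -> int:
    """Count the positions where two equal-length strings differ."""
    # Each comparison yields a bool; sum() treats True as 1 and False as 0.
    return sum(a != b for a, b in zip(original, transformed))

# Costs quoted in the prompt for rearrangements of "abc".
print(mismatches("abc", "abc"))  # 0
print(mismatches("abc", "acb"))  # 2
print(mismatches("abc", "cab"))  # 3
```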

In [3]:
cost('aaa', 'aab')

1

In [4]:
s1 = list('aaa').sort()
s2 = list('aab').sort()
print(s1)
print(s2)

None
None


`sort` sorts the list in place and returns None, so `s1.sort() == s2.sort()` is always True. We must therefore sort first and then compare s1 and s2, or use `sorted`.

In [5]:
def cost(s1, s2):
    s1 = list(s1)
    s2 = list(s2)
    
    if sorted(s1) != sorted(s2):
        return -1
    
    c = sum([ a != b for a, b in zip(s1, s2) ])
    return c
    
cost('aaa', 'aab')

-1
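As an aside, the rearrangement check does not have to sort. The sketch below is an assumed alternative, not the version used in the rest of this notebook: it compares letter counts with `collections.Counter`. The name `cost_counter` is hypothetical.

```python
from collections import Counter

def cost_counter(segment: str, word: str) -> int:
    """Return -1 if segment is not a rearrangement of word, else the mismatch count."""
    # Counter equality compares letter multiplicities, so strings of
    # different lengths are rejected here as well.
    if Counter(segment) != Counter(word):
        return -1
    return sum(a != b for a, b in zip(segment, word))

print(cost_counter("neo", "one"))      # 3
print(cost_counter("heret", "three"))  # 3
print(cost_counter("aaa", "aab"))      # -1
```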

Now let's build the recursive algorithm.

The idea is to split the message, compute the cost of the left part against one valid word, and call the function recursively on the right part of the string.

In [6]:
from typing import List

def decompose(s: str, A: List[str]) -> int:
    cost_list = []
    for w in A:
        sl = s[:len(w)]
        cl = cost(sl, w)
        if cl == -1:
            continue
            
        sr = s[len(w):]
        cr = decompose(sr, A)
        if cr == -1:
            continue
        cost_list.append( cl + cr )
    
    return min(cost_list, default=-1)
        
s = "neotowheret"
A = ["one", "two", "three", "there"]
decompose(s, A)

-1

The result above should be 8. Let us add a few prints to debug.

In [7]:
def decompose(s: str, A: List[str]) -> int:
    cost_list = []
    for w in A:
        sl = s[:len(w)]
        cl = cost(sl, w)
        print(sl, w, cl)
        
        if cl == -1:
            continue
            
        sr = s[len(w):]
        cr = decompose(sr, A)
        print(sl, w, cr)
        if cr == -1:
            continue
        cost_list.append( cl + cr )
    
    return min(cost_list, default=-1)
        
s = "neotowheret"
A = ["one", "two", "three", "there"]
decompose(s, A)

neo one 3
tow one -1
tow two 2
her one -1
her two -1
heret three 3
 one -1
 two -1
 three -1
 there -1
heret three -1
heret there 5
 one -1
 two -1
 three -1
 there -1
heret there -1
tow two -1
towhe three -1
towhe there -1
neo one -1
neo two -1
neoto three -1
neoto there -1


-1

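The problem seems to be the empty string: once the whole sentence has been consumed, no word matches and the function returns -1. Let us return 0 if the string is '' (the constraints say that the initial string cannot have size 0, so this case only occurs inside the recursion).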
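With an empty-string base case, the recursion can be summarized as a recurrence. The notation here is my own shorthand and does not come from the prompt: $f(s)$ is the minimum total cost for $s$, $\varepsilon$ is the empty string, $c(u, w)$ is the number of positions where $u$ and $w$ differ, and $u \sim w$ means that $u$ is a rearrangement of $w$.

$$
f(\varepsilon) = 0, \qquad
f(s) = \min_{\substack{w \in A,\ |w| \le |s| \\ s_{1..|w|} \sim w}}
\Bigl( c\bigl(s_{1..|w|},\, w\bigr) + f\bigl(s_{|w|+1..|s|}\bigr) \Bigr)
$$

When no word fits, the minimum is taken over an empty set; I treat that as $f(s) = \infty$, which the code reports as -1.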

In [8]:
def decompose(s: str, A: List[str]) -> int:
    if s == '':
        return 0
    
    cost_list = []
    for w in A:
        sl = s[:len(w)]
        cl = cost(sl, w)
        print(sl, w, cl)
        
        if cl == -1:
            continue
            
        sr = s[len(w):]
        cr = decompose(sr, A)
        print(sl, w, cr)
        if cr == -1:
            continue
        cost_list.append( cl + cr )
    
    return min(cost_list, default=-1)
        
s = "neotowheret"
A = ["one", "two", "three", "there"]
decompose(s, A)

neo one 3
tow one -1
tow two 2
her one -1
her two -1
heret three 3
heret three 0
heret there 5
heret there 0
tow two 3
towhe three -1
towhe there -1
neo one 5
neo two -1
neoto three -1
neoto there -1


8

Now it is working. Let me add more tests. 

In [9]:
def decompose(s: str, A: List[str]) -> int:
    if s == '':
        return 0
    
    cost_list = []
    for w in A:
        # solve the left side
        sl = s[:len(w)]
        cl = cost(sl, w)
        if cl == -1:
            continue
            
        # solve the right side
        sr = s[len(w):]
        cr = decompose(sr, A)
        if cr == -1:
            continue
        
        cost_list.append( cl + cr )
    
    # we are only interested in the minimum cost
    return min(cost_list, default=-1)

s = "abba"
A = ["ab", "ac", "ad"]
assert decompose(s, A) == 2

s = "thisismeaningless"
A = ["this", "is", "meaningful"]
assert decompose(s, A) == -1

s = "ommwreehisymkiml"
A = ["we", "were", "here", "my", "is", "mom", "here", "si", "milk", "where", "si"]
assert decompose(s, A) == 10

s = "ogodtsneeencs"
A = ["go", "good", "do", "sentences", "tense", "scen"]
assert decompose(s, A) == 8

s = "sepawaterords"
A = ["separate","words"]
assert decompose(s, A) == -1


print('Everything ok')

Everything ok


In [10]:
s = 'a'*50
A = ['z']*49+['a']
assert decompose(s, A) == 0
print('correct!')

correct!


In [11]:
s = 'a'
A = ['z']*49+['a']
assert decompose(s, A) == 0
print('correct!')

correct!


In [12]:
s = 'a'
A = ['z']*50
assert decompose(s, A) == -1
print('correct!')

correct!


In [13]:
s = 'qwertyuioplkjhgfdsazxcvbnm'
A = list(s[:10])
assert decompose(s, A) == -1
print('correct!')

correct!


In [14]:
s = 'qwertyuioplkjhgfdsazxcvbnm'
A = list(s)[3:]+['weq']
assert decompose(s, A) == 3
print('correct!')

correct!


In [15]:
s = 'qwertyuioplkjhgfdsazxcvbnm'
A = list(s)[:-5]+['vbnmc']
assert decompose(s, A) == 5
print('correct!')

correct!


In [16]:
assert (a:=decompose("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", ["a", "b", "aaa", "ab"])) == 0, a

KeyboardInterrupt: 

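Our implementation does not finish in a reasonable time in the case above. The same suffix of the string is solved again and again, so the number of recursive calls grows exponentially with the length of the sentence.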
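To get a feel for the blow-up, the sketch below counts how many calls a plain recursion of this shape makes on `'a' * n`. It assumes that words of length 1 and 3 match at every position (as "a" and "aaa" do in the failing test) and that the other words never match. The function `count_calls` is hypothetical and exists only for this illustration.

```python
def count_calls(n: int, word_lengths=(1, 3)) -> int:
    """Number of calls a plain, unmemoized recursion makes on 'a' * n.

    Assumes every word with a length in word_lengths matches wherever it fits.
    """
    calls = 1  # the current call
    for k in word_lengths:
        if k <= n:
            # one recursive call on the remaining n - k characters
            calls += count_calls(n - k, word_lengths)
    return calls

print(count_calls(5))   # 12
print(count_calls(10))  # 87
print(count_calls(20))  # 4022
```

Under these assumptions the count grows by a factor of roughly 1.47 per extra character, which is why a 50-character sentence is out of reach without caching.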

To fix it, I will add memoization (top-down dynamic programming) to decompose.
Since the solution that I'm planning needs to carry the memory of previous results as an input, I will implement it as another function.

In [17]:
def decomp(s: str, A: List[str], prev_sol=None) -> int:
    if s == '':
        return 0
    
    if prev_sol is None:
        prev_sol = dict()
    
    cost_list = []
    for w in A:
        # solve the left side
        sl = s[:len(w)]
        cl = cost(sl, w)
        if cl == -1:
            continue
            
        # solve the right side
        sr = s[len(w):]
        
        # reuse the stored result if this suffix was already solved
        if sr not in prev_sol:
            cr = decomp(sr, A, prev_sol=prev_sol)
            prev_sol[sr] = cr
        else:
            cr = prev_sol[sr]
        
        if cr == -1:
            continue
        
        cost_list.append( cl + cr )
    
    # we are only interested in the minimum cost
    return min(cost_list, default=-1)

In [18]:
assert (a:=decomp("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", ["a", "b", "aaa", "ab"])) == 0, a
assert (a:=decomp("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxy", ["a", "aa", "yx"])) == 2, a
print('done')

done
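The same caching idea can be sketched with `functools.lru_cache`, keyed on the start index of the suffix instead of the suffix string. This is an assumed alternative, not the version used below; `decompose_cached` and `best` are hypothetical names.

```python
from functools import lru_cache
from typing import List

def decompose_cached(s: str, A: List[str]) -> int:
    """Memoized variant: best(i) is the minimum cost to build s[i:]."""
    # Drop duplicate words and sort each word's letters only once.
    words = [(w, sorted(w)) for w in set(A)]

    @lru_cache(maxsize=None)
    def best(i: int) -> int:
        if i == len(s):
            return 0  # the empty suffix costs nothing
        costs = []
        for w, sorted_w in words:
            seg = s[i:i + len(w)]
            if sorted(seg) != sorted_w:
                continue  # not a rearrangement of w (or too short)
            rest = best(i + len(w))
            if rest != -1:
                costs.append(sum(a != b for a, b in zip(seg, w)) + rest)
        return min(costs, default=-1)

    return best(0)

print(decompose_cached("neotowheret", ["one", "two", "three", "there"]))  # 8
```

There are at most `len(s) + 1` distinct indices, so each suffix is solved once.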


Let us redefine decompose as a thin wrapper around decomp, so that it keeps the required signature (without the prev_sol argument), and test everything again.

In [21]:
def decompose(s: str, A: List[str]) -> int:
    return decomp(s, A)

s = "abba"
A = ["ab", "ac", "ad"]
assert decompose(s, A) == 2

In [22]:
s = "thisismeaningless"
A = ["this", "is", "meaningful"]
assert decompose(s, A) == -1

In [23]:
s = "ommwreehisymkiml"
A = ["we", "were", "here", "my", "is", "mom", "here", "si", "milk", "where", "si"]
assert decompose(s, A) == 10

In [24]:
s = "ogodtsneeencs"
A = ["go", "good", "do", "sentences", "tense", "scen"]
assert decompose(s, A) == 8

In [25]:
s = "sepawaterords"
A = ["separate","words"]
assert decompose(s, A) == -1

In [26]:
s = 'a'*50
A = ['z']*49+['a']
assert decompose(s, A) == 0
print('correct!')

correct!


In [27]:
s = 'a'
A = ['z']*49+['a']
assert decompose(s, A) == 0
print('correct!')

correct!


In [28]:
s = 'a'
A = ['z' for i in range(50)]
assert (a:=decompose(s, A)) == -1,a
print('correct!')

correct!


In [29]:
A = ['q','w','e','r','t','y','u','i','o','p']
assert decompose('qwertyuioplkjhgfdsazxcvbnm', A) == -1
print('correct!')

correct!


In [30]:
A = list('qwertyuioplkjhgfdsazxcvbnm')[3:]+['weq']
assert decompose('qwertyuioplkjhgfdsazxcvbnm', A) == 3
print('correct!')

correct!


In [31]:
s = 'qwertyuioplkjhgfdsazxcvbnm'
A = list(s)[:-5]+['vbnmc']
assert decompose(s, A) == 5
print('correct!')

assert (a:=decomp("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", ["a", "b", "aaa", "ab"])) == 0, a
assert (a:=decomp("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxy", ["a", "aa", "yx"])) == 2, a
print('done')

correct!
done
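For comparison, the same problem can be solved iteratively, without recursion. This is a minimal sketch under my own naming (`decompose_bottom_up`, `best`); it fills a table of prefix costs from left to right and is not the solution developed above.

```python
from typing import List

def decompose_bottom_up(s: str, A: List[str]) -> int:
    """Bottom-up variant: best[i] is the minimum cost to build s[:i]."""
    n = len(s)
    INF = float("inf")
    best = [INF] * (n + 1)
    best[0] = 0  # the empty prefix costs nothing
    for i in range(n):
        if best[i] == INF:
            continue  # s[:i] cannot be built, so nothing extends it
        for w in A:
            seg = s[i:i + len(w)]
            if sorted(seg) != sorted(w):
                continue  # not a rearrangement of w (or too short)
            candidate = best[i] + sum(a != b for a, b in zip(seg, w))
            j = i + len(w)
            if candidate < best[j]:
                best[j] = candidate
    return -1 if best[n] == INF else best[n]

print(decompose_bottom_up("neotowheret", ["one", "two", "three", "there"]))  # 8
```

The table has `len(s) + 1` entries and each entry tries every word once, so the work is bounded by the sentence length times the number of words times the cost of one comparison.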
