There are **two approaches** one can take.

1) Generate every possible substring of the name and check whether it's in the list of words.

2) For every word in the list, check whether it's a substring of the name.

**APPROACH #1**

We want every possible substring with gaps.

For example, given 'abc': [a,ab,abc,ac,b,bc,c]. *Note that 'ac' is allowed, hence the 'substring with gaps' (formally, a subsequence)*

**How many of these exist?** Each letter has the option of being present or absent in the substring, so if the length of the string is N, we would have **2^N substrings**, counting the empty one where every letter is absent.
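As a quick sanity check of that count, here is a minimal sketch that enumerates the non-empty gapped substrings by choosing which positions to keep. The helper name `gapped_substrings` is made up for this illustration and is not used elsewhere in the notebook.

```python
from itertools import combinations

def gapped_substrings(text):
    """Return every non-empty gapped substring of text, shortest first."""
    result = []
    for k in range(1, len(text) + 1):
        # Each k-sized set of positions, kept in order, gives one substring.
        for idx in combinations(range(len(text)), k):
            result.append(''.join(text[i] for i in idx))
    return result

subs = gapped_substrings('abc')
print(subs)       # ['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']
print(len(subs))  # 7, i.e. 2**3 - 1 once the empty substring is excluded
```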

If we need N elements, we can run a single for loop. If we need N^2 elements, we nest two for loops, and so on. Nesting can't get us to 2^N, because the number of loops would have to depend on N. But we can do a **single for loop going up to 2^N elements**.

So let's do 'for i in range(2**n):'

Now we need to generate a substring for each i. Remember, **a substring is generated by using a 0 or 1 flag on each character**. So we can **convert i to binary**, pad it with leading zeros to length N, and use the digits as the 0/1 flags.
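To make the flag idea concrete, here is a small sketch for 'abc'. The helper name `substring_from_mask` is hypothetical; it only illustrates the mapping from i to a substring.

```python
def substring_from_mask(text, i):
    """Keep the characters of text whose bit in i (zero-padded) is 1."""
    bits = format(i, 'b').zfill(len(text))
    return ''.join(ch for ch, bit in zip(text, bits) if bit == '1')

for i in range(1, 2**3):
    print(format(i, 'b').zfill(3), substring_from_mask('abc', i))
# 001 c
# 010 b
# 011 bc
# 100 a
# 101 ac
# 110 ab
# 111 abc
```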

Once we have the substrings, it's just a matter of validating them against the dictionary and choosing the longest ones.
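The validation step could look roughly like the sketch below. The function name `longest_matches` and the toy dictionary are made up for the example; the real word list comes from the file loaded in the next cells. One design note: the sketch takes the dictionary as a `set`, where a membership test takes constant time on average, whereas testing membership in a list scans it element by element.

```python
def longest_matches(candidates, dictionary):
    """Return the longest candidates that appear in dictionary (a set)."""
    valid = {c for c in candidates if c in dictionary}
    if not valid:
        return set()
    best = max(len(w) for w in valid)
    return {w for w in valid if len(w) == best}

# Toy data, purely illustrative.
toy_dictionary = {'a', 'ab', 'ac', 'xyz'}
toy_candidates = ['a', 'ab', 'abc', 'ac', 'b', 'bc', 'c']
print(sorted(longest_matches(toy_candidates, toy_dictionary)))  # ['ab', 'ac']
```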


In [1]:
import re

In [2]:
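# Load the word list, one word per line. strip() removes the line ending
# whether it is '\n' or '\r\n', so no letters are cut off by accident.
with open('enable1.txt') as f:
    words = [line.strip() for line in f]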

In [3]:
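def dank(text):
    # Normalise: lowercase and keep only the letters a-z.
    text = text.lower()
    text = re.sub("[^a-z]", "", text)
    n = len(text)
    mylist = []
    for i in range(1,2**n):
        # Binary representation of i, left-padded with zeros to n digits.
        b = "{0:b}".format(i)
        b = '0'*(n-len(b))+b
        # Keep the characters whose flag is 1.
        w = ''.join([text[j] for j in range(n) if b[j]=='1'])
        if w in words:
            mylist.append((len(w),w))
    # No dictionary word found: avoid calling max() on an empty list.
    if not mylist:
        return set()
    m = max(mylist)[0]
    return set(v[1] for v in mylist if v[0]==m)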

In [4]:
dank('Alan Turing')

{'alanin', 'anting', 'luring'}

**APPROACH #2**

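This is more straightforward. For every word in the list, we check whether it's a substring (with gaps) of the name. We do it character by character: we walk through the name once and move on to the next letter of the word each time the current one matches. If we reach the end of the word, it fits.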
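The same check can also be written compactly with an iterator, shown here only as an alternative sketch (the name `is_gapped_substring` is hypothetical). Each `ch in it` test consumes the iterator up to and including the first match, so later letters of the word can only match later positions of the name.

```python
def is_gapped_substring(word, text):
    """True if the letters of word appear in text in order, gaps allowed."""
    it = iter(text)
    return all(ch in it for ch in word)

print(is_gapped_substring('luring', 'alanturing'))  # True
print(is_gapped_substring('ca', 'abc'))             # False, order matters
```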

In [5]:
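def dank2(text):
    # Normalise: lowercase and keep only the letters a-z.
    text = text.lower()
    text = re.sub("[^a-z]", "", text)
    mylist = []
    for w in words:
        # i walks the word, j walks the name.
        i=0;j=0
        while i<len(w) and j<len(text):
            if w[i]==text[j]:
                i+=1
            j+=1
        # Every letter of w was matched in order.
        if i==len(w):
            mylist.append((len(w),w))
    # No dictionary word found: avoid calling max() on an empty list.
    if not mylist:
        return []
    m = max(mylist)[0]
    return [v[1] for v in mylist if v[0]==m]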

In [6]:
dank2('Alan Turing')

['alanin', 'anting', 'luring']

In [7]:
%time dank('Alan Turing')

CPU times: user 1.94 s, sys: 12.9 ms, total: 1.95 s
Wall time: 1.96 s


{'alanin', 'anting', 'luring'}

In [8]:
%time dank2('Alan Turing')

CPU times: user 418 ms, sys: 5.04 ms, total: 423 ms
Wall time: 426 ms


['alanin', 'anting', 'luring']

In [9]:
%time dank2('Jean Claude Van Damme')

CPU times: user 751 ms, sys: 15.4 ms, total: 767 ms
Wall time: 814 ms


['academe', 'enclave']

In [10]:
%time dank2('Alan Turing Jean Claude Vandame')

CPU times: user 1.04 s, sys: 12.6 ms, total: 1.06 s
Wall time: 1.1 s


['annealed',
 'antecede',
 'antennae',
 'antigene',
 'auricled',
 'included',
 'laureled']