# Find longest word in dictionary that is a subsequence of a given string

## Problem

*From [Google](https://techdevguide.withgoogle.com/paths/foundational/find-longest-word-in-dictionary-that-subsequence-of-given-string/#code-challenge)*

Given a string ``S`` and a set of words ``D``, find the longest word in ``D`` that is a subsequence of ``S``.

Word ``W`` is a subsequence of ``S`` if some number of characters, possibly zero, can be deleted from ``S`` to form ``W``, without reordering the remaining characters.

Note: ``D`` can appear in any format (list, hash table, prefix tree, etc.)

For example, given the input ``S = "abppplee"`` and ``D = {"able", "ale", "apple", "bale", "kangaroo"}``, the correct output would be ``"apple"``.

- The words "able" and "ale" are both subsequences of S, but they are shorter than "apple".
- The word "bale" is not a subsequence of S because even though S has all the right letters, they are not in the right order.
- The word "kangaroo" is the longest word in D, but it isn't a subsequence of S.
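
The three observations above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper named ``is_subsequence`` that is not part of the original problem statement. It walks through the string once and advances in the word only when the characters match.

```python
def is_subsequence(word, string):
    """Return True if `word` can be formed by deleting characters from `string`."""
    # Index of the next character of `word` still waiting for a match
    k = 0
    for char in string:
        if k < len(word) and char == word[k]:
            k += 1
    # True only if every character of `word` was matched, in order
    return k == len(word)


S = "abppplee"
for word in ["able", "ale", "apple", "bale", "kangaroo"]:
    print(word, is_subsequence(word, S))
```

This prints ``True`` for "able", "ale" and "apple", and ``False`` for "bale" and "kangaroo".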

**Learning objectives**

This question gives you the chance to practice with algorithms and data structures. It’s also a good example of why careful analysis for Big-O performance is often worthwhile, as is careful exploration of common and worst-case input conditions.

## Solution

In [1]:
import numpy as np

def find_longest_subsequence(S, D):
    
    # Length of the string (number of characters)
    LS = len(S)
    
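    # Accept any collection of words (set, list, ...) by copying it into a list,
    # since the words are accessed by index below
    D = list(D)
    
    # Number of words in the word set
    ND = len(D)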
    
    # Initialize an array with zeros for every word 
    # to find the one which is the longest subsequence
    A = np.zeros(ND)
    
    # Loop over the words
    for i in range(0, ND):
        
        # Current word and its length
        W = D[i]
        LW = len(W)
        
        # Index of the next character to match in the current word
        k = 0
        
        # Loop over the characters of the string
        for j in range(0, LS):
            
            # If the end of the word has not been reached yet
            if k < LW:
            
                # If the string has the same character as the word
                if S[j] == W[k]:
                    
                    # Update the length in the array for the current word
                    A[i] = A[i]+1
                    
                    # Update the index for the word
                    k = k+1
            
            # If the end of the word has been reached
            else:
                
                # Break the loop
                break
        
        # If the length in the array for the current word is shorter than the word
        if A[i] < LW:
            
            # It is not a subsequence so make it zero
            A[i] = 0
                
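    # If the word set is empty or no word is a subsequence of the string,
    # there is nothing to return
    if ND == 0 or A.max() == 0:
        return None
    
    # Return the longest word which is a subsequence of the string
    return D[np.argmax(A)]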

In [2]:
S = "abppplee"
D = ["able", "ale", "apple", "bale", "kangaroo"]
find_longest_subsequence(S,D)

'apple'

## Explanation

The idea is to loop over each word in the set and, for each word, scan the characters of the string from left to right. Whenever the current character of the string matches the current character of the word, we count the match and move forward by one character in the word. The scan stops at the end of the string or, through the ``break``, as soon as the end of the word has been reached.

Once every word has its count of matched characters, we set to zero any count that is smaller than the length of the corresponding word: the word was only partially matched, so it is not a subsequence of the string. Finally, we return the word with the largest count, which is the longest word that is a subsequence of the string. Each word needs at most one pass over the string, so the running time is proportional to the number of words multiplied by the length of the string.
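
When the string is long and the word set is large, rescanning the whole string for every word can become costly. One possible alternative, sketched below under assumptions, indexes the string once and then uses binary search to find the next usable occurrence of each character. This is a different technique from the solution above, not a rewrite of it, and the function name ``find_longest_subsequence_indexed`` is hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def find_longest_subsequence_indexed(S, D):
    """Return the longest word in D that is a subsequence of S, or None."""
    # Map each character of S to the sorted list of positions where it occurs
    positions = defaultdict(list)
    for index, char in enumerate(S):
        positions[char].append(index)

    def is_subsequence(word):
        # Smallest position in S that the next matched character may use
        start = 0
        for char in word:
            occurrences = positions.get(char, [])
            # Locate the first occurrence of char at or after `start`
            hit = bisect_left(occurrences, start)
            if hit == len(occurrences):
                return False
            start = occurrences[hit] + 1
        return True

    # Try longer words first so that the first match is the answer
    for word in sorted(D, key=len, reverse=True):
        if is_subsequence(word):
            return word
    return None


S = "abppplee"
D = {"able", "ale", "apple", "bale", "kangaroo"}
print(find_longest_subsequence_indexed(S, D))  # apple
```

Building the index takes one pass over the string. After that, each character of each word costs one binary search, so the work per word depends on the length of the word rather than on a full scan of the string.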