# LeetCode 792
![lc-792](./assets/question.jpg)
![lc-792](./assets/constraints.jpg)

> Observations:
> - Similar to LeetCode 392, words[i] is a subsequence of s only if every letter of words[i] appears in s in the same relative order
> - There is always at least 1 word in the words array, and each word is at least 1 letter long
> - Each word in the words array and s consist of only lowercase English letters
> - The words array can hold up to 5000 words, so rescanning s from the start for every word may be too slow and a more efficient check is worth considering
> - We have to return the number of words that are subsequences of s

![lc-792-ex1](./assets/ex1.jpg)
![lc-792-ex2](./assets/ex2.jpg)

> Notes:
> - In example 1, 3 words are subsequences of s: "a", "acd", and "ace" use only letters from s and keep them in the same order as s; "bb" does not qualify because s contains only one "b"
> - The same reasoning applies to example 2, where "ahjpjau" and "ja" are the 2 subsequences
> - Since there can be 1 to 5000 words per test case, we need a faster way of checking whether a word is a subsequence of s
> - Although counting the appearances of each letter in a hashmap is a good first idea, it ignores the order of the letters, which the subsequence condition requires
> - Since order matters, we can instead use a hashmap to store, for each letter, all the indices at which it occurs in s (in increasing order)
> - Then, for each letter of each word, if that letter is present in the hashmap, we look for its smallest index that comes after the index matched for the previous letter
> - If no such index exists for a particular letter, then the word is not a subsequence
>   - In addition, if a letter in words[i] is not in the hashmap, then the word is also not a subsequence of s
> - Otherwise, if every letter can be matched in s, in order, then the word is a subsequence
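
The index lookup described in the notes can be sketched in isolation. This is a minimal illustration, not the final solution: the helper names `build_index_map` and `next_occurrence` and the sample string "abcab" are assumptions made up for this example.

```python
from bisect import bisect_left

def build_index_map(s):
    """Map each letter of s to the ascending list of indices where it occurs."""
    positions = {}
    for i, letter in enumerate(s):
        positions.setdefault(letter, []).append(i)
    return positions

def next_occurrence(positions, letter, start):
    """Return the smallest index >= start at which letter occurs, or None."""
    indices = positions.get(letter)
    if indices is None:
        return None  # the letter never appears in s
    k = bisect_left(indices, start)
    # k == len(indices) means every occurrence lies before start
    return indices[k] if k < len(indices) else None

positions = build_index_map("abcab")  # hypothetical sample string
print(positions)
print(next_occurrence(positions, "a", 1))  # first "a" at or after index 1
print(next_occurrence(positions, "c", 3))  # no "c" at or after index 3
```

Because each index list is built in a single left-to-right pass, it is already sorted, which is what makes the binary search valid.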

> ### Algorithm
> - We need a variable to store the number of valid subsequences
> - We need a hashmap whose keys are the letters of string s and whose values are arrays storing the indices of each letter's occurrences in s, in increasing order
> - Then we loop through each word and check whether it is a valid subsequence using a helper
>   - To check, we keep a previous-index variable that marks where the search for the next letter may start, so that order is kept (this is initially -1, which allows any index)
>   - If the letter is present in the hashmap, binary-search its index array for the first index at or after the previous-index variable, then set that variable to one past the index found
>   - If the search runs past the end of the array or the letter is not in the hashmap, then return False
> - Once all the letters of the word have passed, we return True, and the caller increments the "valid_subsequences" variable

In [40]:
from bisect import bisect_left

class Solution:
    def numMatchingSubseq(self, s, words):
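        valid_subsequences = 0

        # Map each letter of s to the ascending list of indices where it occurs.
        s_indices = {}
        for i, letter in enumerate(s):
            s_indices.setdefault(letter, []).append(i)

        def is_valid_subsequence(word):
            # prev_index is the smallest index of s that the next letter may use.
            # Starting at -1 allows a match anywhere in s.
            prev_index = -1
            for letter in word:
                if letter not in s_indices:
                    return False
                # First occurrence of letter at or after prev_index.
                index = bisect_left(s_indices[letter], prev_index)
                if index == len(s_indices[letter]):
                    return False  # no occurrence left, so order cannot be kept
                # Move just past the matched position so it is not reused.
                prev_index = s_indices[letter][index] + 1
            return True

        for word in words:
            if is_valid_subsequence(word):
                valid_subsequences += 1

        return valid_subsequences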

In [41]:
sol = Solution()
print('Ex 1:')
print(' Result:', sol.numMatchingSubseq(s = "abcde", words = ["a","bb","acd","ace"]))
print(' Desire: 3')
print('Ex 2:')
print(' Result:', sol.numMatchingSubseq(s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]))
print(' Desire: 2')

Ex 1:
 Result: 3
 Desire: 3
Ex 2:
 Result: 2
 Desire: 2


> ### Final Verdict
> - Building the hashmap takes one pass over s. We then move through each word in words and each character in the word, using bisect_left to find the index, which gives a time complexity of O(a + m * n * log(a)), where m is the number of words, n is the length of the longest word, and a is the length of string s
> - In terms of space complexity, the hashmap stores every index of s exactly once, so the memory allocated is proportional to the length of s, giving a space complexity of O(a)
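
For comparison, a plain left-to-right scan of s per word (the LeetCode 392 style check) costs O(a) per word, or O(m * a) overall, instead of O(n * log(a)) per word. The sketch below is only a reference implementation for cross-checking results; the names `is_subsequence_scan` and `count_matching_scan` are hypothetical and not part of the solution above.

```python
def is_subsequence_scan(word, s):
    """Check whether word is a subsequence of s with a single pass over s."""
    remaining = iter(s)
    # "letter in remaining" advances the iterator until it finds letter,
    # so each later letter is searched only in the unread part of s.
    return all(letter in remaining for letter in word)

def count_matching_scan(s, words):
    """Count the words that are subsequences of s, scanning s once per word."""
    return sum(1 for word in words if is_subsequence_scan(word, s))

print(count_matching_scan("abcde", ["a", "bb", "acd", "ace"]))
print(count_matching_scan("dsahjpjauf", ["ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax"]))
```

Running both approaches on the same inputs is a cheap way to confirm that the binary-search version returns the same counts.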