## Trie

### Introduction to Trie

#### What is Trie?

* A Trie is a special form of an N-ary tree. Typically, a trie is used to store strings. 
* Each Trie node represents a string (a prefix). 
* Each node may have several child nodes, and the edge to each child is labeled with a different character. 
* The string a child node represents is the string of its parent plus the character on the edge between them.
* The structure of a Trie is shown below
![image.png](attachment:image.png)
  + the root node represents the empty string
  + all the descendants of a node share the string associated with that node as a common prefix. That is why a Trie is also called a prefix tree
* Tries are widely used in applications such as autocomplete and spell checkers

#### How to represent a Trie?
* use Array
  + if we know all the chars in the string are lower case letters, we can use a 26 element array to define each node
* use Hashmap
  + if the alphabet of the strings is not well defined, we use a hashmap in each node. This may be slightly slower than an array, but it saves space because we only allocate entries for characters that actually appear
* more
  + If we store strings in trie, we can declare a boolean in each node as a flag to indicate if the string represented by this node is a word or not
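
The two node layouts above can be sketched as follows. This is only a minimal illustration, assuming lowercase ASCII letters for the array version; the class names `ArrayTrieNode` and `MapTrieNode` are hypothetical and used only in this example.

```python
class ArrayTrieNode:
    """Node with one fixed slot per lowercase letter 'a'..'z'."""

    def __init__(self):
        self.child = [None] * 26   # slot i holds the child for chr(ord('a') + i)
        self.is_word = False       # True if the path from the root spells a stored word


class MapTrieNode:
    """Node that allocates a child entry only for characters actually seen."""

    def __init__(self):
        self.child = {}            # maps a character to its child node
        self.is_word = False


array_node = ArrayTrieNode()
array_node.child[ord("c") - ord("a")] = ArrayTrieNode()

map_node = MapTrieNode()
map_node.child["c"] = MapTrieNode()

print(len(array_node.child))  # 26 slots, even though only one is used
print(len(map_node.child))    # 1 entry
```

The array node pays for all 26 slots up front but indexes them directly, while the hashmap node grows only with the characters that occur.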


### Basic Operations

#### Insertion in Trie
* To insert a string S into a Trie, we start at the root node. We choose an existing child or add a new child node depending on S\[0\], the first character in S. Then we move down to that node and make the same choice according to S\[1\], and so on. After traversing all characters in S in order, we reach the end node, which is the node that represents the string S.
* pseudocode

```
Initialize: cur = root
for each char c in target string S:
    if cur does not have a child c:
        cur.children[c] = new Trie node
    cur = cur.children[c]
cur is the node which represents the string S
```
* Building a trie amounts to calling the insertion function once per string. Remember to initialize a root node before inserting the strings.
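
As a rough sketch of the insertion loop above, the following builds a trie from nested dictionaries and counts its nodes to show how shared prefixes are stored once. The helper names `build_trie` and `count_nodes` and the `"$"` end marker are assumptions made for this example, and it assumes `"$"` never occurs in the words.

```python
def build_trie(words):
    """Insert every word into a fresh trie made of nested dicts."""
    root = {}                      # the root represents the empty string
    for word in words:
        cur = root
        for c in word:
            # reuse the child for c if it exists, otherwise create it
            cur = cur.setdefault(c, {})
        cur["$"] = True            # hypothetical end-of-word marker
    return root


def count_nodes(node):
    """Count character nodes below `node`, ignoring the end markers."""
    return sum(1 + count_nodes(child) for key, child in node.items() if key != "$")


trie = build_trie(["tea", "ten", "to"])
print(count_nodes(trie))  # 5: t, e, a, n, o
```

The three words contain eight characters in total, but the trie needs only five nodes because `t` and `te` are shared.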

#### Search in Trie

* pseudo code for searching a trie

```
Initialize: cur = root
for each char c in target string S:
    if cur does not have a child c:
        search fails
    cur = cur.children[c]
search succeeds
```

* search word
  + searching for a word consists of two steps
    + If the prefix search fails, no stored word starts with the target word, so the target word is definitely not in the Trie.
    + If the prefix search succeeds, we still need to check whether the target word is only a prefix of stored words or is itself a stored word. This requires a small change to the node structure.
      + we usually use a boolean flag in each node to indicate if it is a word node

#### Leetcode 208. Implement Trie (Prefix Tree)
* Overview
  + A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
  + Implement the Trie class:
    + Trie() Initializes the trie object.
    + void insert(String word) Inserts the string word into the trie.
    + boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
    + boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.
    
* Algorithm
  + define TrieNode class with
    + child attribute as an array of 26 elements, each initialized to None
    + or define child attribute as an empty dictionary
    + is_word attribute initialized as False
  + in Trie class
    + define self.root as a TrieNode
    + for insert, search and startsWith, apply the Trie traversal template
      + first set node = self.root
      + for each character, get its index and check if that child slot is None (array), or check if the character is a dictionary key (hashmap)
        + if the child is missing, insert creates a TrieNode for that index or key, while search and startsWith return False
      + assign node = child[index] or child[key] and continue the for loop
      + after the for loop, set the word flag (insert), check it (search), or return True (startsWith)
* Time complexity
  + O(N) for insert, search and startsWith, where N is the length of the input string. This holds for both the array and the hashmap implementation
* Space complexity
  + O(N) per insert in the worst case, since up to N new nodes are created; search and startsWith need O(1) extra space. In the array implementation each node also reserves 26 slots, while a hashmap node only stores the letters actually used

In [1]:
# array implementation of Trie
class TrieNode:
    def __init__(self):
        self.child = [None] * 26
        self.is_word = False

class Trie:

    def __init__(self):
        self.root = TrieNode()        

    def insert(self, word: str) -> None:
        node = self.root
        for c in word:
            index = ord(c) - ord('a')
            if node.child[index] is None:
                node.child[index] = TrieNode()
            node = node.child[index]
        node.is_word = True    
            
        

    def search(self, word: str) -> bool:
        node = self.root
        for c in word:
            index = ord(c) - ord('a')
            if node.child[index] is None:
                return False
            node = node.child[index]
        return node.is_word       

    def startsWith(self, prefix: str) -> bool:
        node = self.root
        for c in prefix:
            index = ord(c) - ord('a')
            if node.child[index] is None:
                return False
            node = node.child[index]
        return True    
        


# Your Trie object will be instantiated and called as such:
# obj = Trie()
# obj.insert(word)
# param_2 = obj.search(word)
# param_3 = obj.startsWith(prefix)

# hashmap implementation of Trie
class TrieNode:
    def __init__(self):
        self.child = {}
        self.is_word = False

class Trie:

    def __init__(self):
        self.root = TrieNode()        

    def insert(self, word: str) -> None:
        node = self.root
        for c in word:
            if c not in node.child:
                node.child[c] = TrieNode()
            node = node.child[c]
        node.is_word = True    
            
        

    def search(self, word: str) -> bool:
        node = self.root
        for c in word:            
            if c not in node.child:
                return False
            node = node.child[c]
        return node.is_word       

    def startsWith(self, prefix: str) -> bool:
        node = self.root
        for c in prefix:
            if c not in node.child:
                return False
            node = node.child[c]
        return True  

#### Leetcode 677. Map Sum Pairs
* Overview
  + Design a map that allows you to do the following:
    + Maps a string key to a given value.
    + Returns the sum of the values that have a key with a prefix equal to a given string.
  + Implement the MapSum class:
    + MapSum() Initializes the MapSum object.
    + void insert(String key, int val) Inserts the key-val pair into the map. If the key already existed, the original key-value pair will be overridden to the new one.
    + int sum(string prefix) Returns the sum of all the pairs' value whose key starts with the prefix.
    ![image.png](attachment:image.png)
    
* Algorithm
  + using a hashmap to track the value of each key 
    + the hashmap only holds complete keys passed to insert, not the intermediate prefixes on the path
    + find the difference between the inserted value and the value stored in the hashmap (default 0)
    + store the new value in the hashmap under its key
    + add the difference to every node along the path to the key
    + since every node on the path is updated by the difference whenever a key's value changes, each node stores the total of all keys that pass through it. For example, after inserting ap and apx, the node p has received the values of both keys and therefore returns their sum
    + the hashmap matters when an existing key is overwritten; for a new key the difference is just the value itself
    + time complexity
      + O(K) for each insert or sum operation, where K is the length of the key or prefix
    + space complexity
      + Linear in the total size of the input
  + using dfs and trie
    + in this implementation, we only store the value in the end node of the given key
    + when computing the sum, we walk to the node of the prefix and add up the values of all nodes in its subtree. Nodes that are not the end of any key are included too, but their values are 0, so they do not affect the result
    + time complexity and space complexity
      + insert is O(K); sum is O(K) to reach the prefix node plus the size of the subtree below it. Space is linear in the total length of the inserted keys
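
To trace the override case of the difference-based approach, here is a small sketch. It deliberately swaps the trie for a plain dictionary keyed by prefix strings, so it is not the trie implementation itself; the class name `PrefixSums` and its attributes are hypothetical. Each prefix entry receives exactly the update that the corresponding trie node on the path would receive.

```python
from collections import defaultdict


class PrefixSums:
    """Hypothetical stand-in for MapSum that keeps one running total per prefix."""

    def __init__(self):
        self.values = {}                  # key -> most recent value
        self.totals = defaultdict(int)    # prefix -> sum of values of keys under it

    def insert(self, key, val):
        diff = val - self.values.get(key, 0)   # change relative to the stored value
        self.values[key] = val
        for end in range(1, len(key) + 1):
            self.totals[key[:end]] += diff     # same update a trie path would receive

    def sum(self, prefix):
        return self.totals.get(prefix, 0)


m = PrefixSums()
m.insert("apple", 3)
print(m.sum("ap"))    # 3
m.insert("app", 2)
print(m.sum("ap"))    # 5
m.insert("apple", 5)  # overrides the old value 3, so diff is 2
print(m.sum("ap"))    # 7
```

Without the stored value per key, the last insert would add 5 on top of the old 3 and report 10 instead of 7.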

In [2]:
from collections import defaultdict

# use Trie and Hashmap. Use Hashmap to store the most recent value of the end node of the key
class TrieNode:
    def __init__(self):
        self.child = {}
        self.val = 0

class MapSum:

    def __init__(self):
        self.root = TrieNode()  
        self.map = defaultdict(int)

    # get the difference between val and the current
    # record in map (default as 0). Each time when
    # a value is set, all the node along the path
    # of the key will be updated, and thus corresponding
    # to the sum of all the keys set along them
    def insert(self, key: str, val: int) -> None:
        diff = val - self.map[key]
        self.map[key] = val
        node = self.root
        
        for c in key:
            if c not in node.child:
                node.child[c] = TrieNode()
            node = node.child[c]  
            node.val += diff        

    # return the end node value as the sum
    def sum(self, prefix: str) -> int:
        node = self.root
        
        for c in prefix:
            if c not in node.child:
                return 0
            node = node.child[c]
        return node.val    
        


# Your MapSum object will be instantiated and called as such:
# obj = MapSum()
# obj.insert(key,val)
# param_2 = obj.sum(prefix)

# use Trie only, and only store the value in the end node of trie
class TrieNode:
    def __init__(self):
        self.child = {}
        self.val = 0

class MapSum:

    def __init__(self):
        self.root = TrieNode()        

    def insert(self, key: str, val: int) -> None:
        node = self.root
        
        for c in key:
            if c not in node.child:
                node.child[c] = TrieNode()
            node = node.child[c]
        node.val = val
        
    # get the sum of all the child node of the query key
    def dfs(self, node) -> int:
        if node is None:
            return 0
        
        rs = node.val
        for child in node.child.values():
            rs += self.dfs(child)
            
        return rs    
        

    def sum(self, prefix: str) -> int:
        rs = 0
        node = self.root
        
        # traverse to the node of prefix
        # if prefix node doesn't exist, return 0
        for c in prefix:
            if c not in node.child:
                return 0
            node = node.child[c]
         
        # recursively call dfs to return the sum of all 
        # child node values
        return self.dfs(node)    
        
        

#### Leetcode 648. Replace Words
* Overview
  + In English, we have a concept called root, which can be followed by some other word to form another longer word - let's call this word successor. For example, when the root "an" is followed by the successor word "other", we can form a new word "another".
  + Given a dictionary consisting of many roots and a sentence consisting of words separated by spaces, replace all the successors in the sentence with the root forming it. If a successor can be replaced by more than one root, replace it with the root that has the shortest length.
  + Return the sentence after the replacement.
  
* Algorithm
  + build a Trie to store all the roots in dictionary and record each root in the word attribute of its end node
  + for each word in sentence, search the trie character by character. If the path breaks before any root is found, keep the word itself; otherwise return the word attribute of the first end node reached, which is the shortest matching root

In [None]:
from typing import List

class TrieNode:
    def __init__(self):
        self.child = {}
        self.word = ""
        
class Solution:
    def replaceWords(self, dictionary: List[str], sentence: str) -> str:
        if not dictionary or not sentence:
            return sentence
        
        # define the Trie includes all words from dictionary
        root = TrieNode()
        
        for word in dictionary:
            node = root
            for l in word:
                if l not in node.child:
                    node.child[l] = TrieNode()
                node = node.child[l]  
            # add word in the end node
            node.word = word
            
        
        rs = []
        
        # walk the trie along the characters of word. If a char
        # is missing from the trie before any root has been
        # reached, return the word itself; otherwise return the
        # first root found on the path, which is the shortest one
        def process(word: str) -> str:
            
            node = root
            for l in word:
                if l not in node.child:
                    break
                node = node.child[l]
                if node.word:
                    return node.word
            return word
        
        # split sentence and process each word
        for word in sentence.split():
            rs.append(process(word))
            
        # convert the string list to a concatenated string
        return " ".join(rs)              

#### Leetcode 642. Design Search Autocomplete System
* Overview
  + Design a search autocomplete system for a search engine. Users may input a sentence (at least one word and end with a special character '#').
  + You are given a string array sentences and an integer array times both of length n where sentences\[i\] is a previously typed sentence and times\[i\] is the corresponding number of times the sentence was typed. For each input character except '#', return the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed.
  + Here are the specific rules:
    + The hot degree for a sentence is defined as the number of times a user typed the exactly same sentence before.
    + The returned top 3 hot sentences should be sorted by hot degree (The first is the hottest one). If several sentences have the same hot degree, use ASCII-code order (smaller one appears first).
    + If less than 3 hot sentences exist, return as many as you can.
    + When the input is a special character, it means the sentence ends, and in this case, you need to return an empty list.
  + Implement the AutocompleteSystem class:
    + AutocompleteSystem(String\[\] sentences, int\[\] times) Initializes the object with the sentences and times arrays.
    + List<String> input(char c) This indicates that the user typed the character c.
      + Returns an empty array [] if c == '#' and stores the inputted sentence in the system.
      + Returns the top 3 historical hot sentences that have the same prefix as the part of the sentence already typed. If there are fewer than 3 matches, return them all.
  ![image.png](attachment:image.png)

* Algorithm
  + build TrieNode class to include 
    + attributes in `__init__()`
      + child as a dictionary
      + time to count how many times the sentence was typed
      + s to store the sentence content
      + hot to store the hot list of nodes
    + `__lt__()`
      + compare based on time in descending order
      + if times are equal, compare s in ascending order
    + update(self, node)
      + add the node to the current node's hot list if it is not already there
      + sort the hot list and keep its length <= 3
  + Build AutocompleteSystem
    + attributes in `__init__()`
      + self.root = TrieNode()
      + self.query = ""
        + this is necessary since the query characters accumulate across multiple invocations of input() until '#' is typed
      + self.curr = self.root
        + for the same reason as self.query, we need to keep track of the node position across multiple input calls. It is reset to self.root when '#' is typed
      + add each sentence and its time to the trie
    + add(sentence, time)
      + traverse the trie and add the characters of the sentence
      + increment the end node's time by the input time
      + store the sentence content in the end node's s
      + add each TrieNode visited during the traversal to a node list
      + update each node in the node list with the end node, so that every node on the path (each one represents a prefix of the sentence) has the end node in its hot list if it ranks in the top 3
    + input(c: str)
      + if c == '#'
        + add the current self.query to the trie with time == 1
        + reset self.curr and self.query, and return an empty list
      + append c to self.query (even if there is no match and self.curr is None)
      + if self.curr is not None and c is one of its children, move self.curr to that child and return the sentences from its hot list
      + otherwise, set self.curr = None 
      + return an empty list if self.curr is None or c is not in the Trie
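
The ranking rule that `__lt__()` encodes can be checked in isolation with a sort key. The sketch below is a brute-force scan over a plain dictionary, not the cached hot list of the trie; the function name `top3` and the sample counts are assumptions made for this example.

```python
def top3(counts, prefix):
    """Return up to three sentences starting with `prefix`, hottest first."""
    matches = [s for s in counts if s.startswith(prefix)]
    # higher count first; ties broken by ASCII order of the sentence
    matches.sort(key=lambda s: (-counts[s], s))
    return matches[:3]


counts = {"i love you": 5, "island": 3, "iroman": 2, "i love leetcode": 2}
print(top3(counts, "i"))    # ['i love you', 'island', 'i love leetcode']
print(top3(counts, "i "))   # ['i love you', 'i love leetcode']
```

"i love leetcode" beats "iroman" at equal counts because the space character sorts before 'r' in ASCII order. The trie version avoids rescanning all sentences by keeping these top three entries on every prefix node.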

In [4]:
from typing import List
class TrieNode:
    def __init__(self):
        self.child = {}
        self.time = 0
        self.s = ""
        self.hot = []
        
    def __lt__(self, other: 'TrieNode') -> bool:
        if self.time != other.time:
            return other.time < self.time
        return self.s < other.s
    
    def update(self, node: 'TrieNode') -> None:
        if node not in self.hot:
            self.hot.append(node)
            
        self.hot.sort()
        if len(self.hot) > 3:
            self.hot.pop()
            
class AutocompleteSystem:

    def __init__(self, sentences: List[str], times: List[int]):
        self.root = TrieNode()
        self.query = ""
        self.curr = self.root
        
        for sentence, time in zip(sentences, times):
            self.add(sentence, time)
        
    def add(self, sentence: str, time: int) -> None:
        node = self.root
        node_list = []
        
        for c in sentence:
            if c not in node.child:
                node.child[c] = TrieNode()
            node = node.child[c]
            node_list.append(node)
        node.time += time
        node.s = sentence
        
        for n in node_list:
            n.update(node)

    def input(self, c: str) -> List[str]:
       
        if c == "#":
            self.add(self.query, 1)
            self.query = ""
            self.curr = self.root
            return []
        
        # we must add the query character c to the query string
        # even if there is no record in the trie, since we will
        # add the entire query until hash tag to trie
        self.query += c
        
        if self.curr:                    
            if c in self.curr.child:
                self.curr = self.curr.child[c]
                return [node.s for node in self.curr.hot] 
            else:
                self.curr = None
        
        # if self.curr is None, one of the previous
        # characters was not in the trie, so return []
        # for all the following characters. We still
        # update self.query so the entire query can be
        # added to the trie when '#' is typed. 
        return []            
        


# Your AutocompleteSystem object will be instantiated and called as such:
# obj = AutocompleteSystem(sentences, times)
# param_1 = obj.input(c)

#### Leetcode 211. Design Add and Search Words Data Structure
* Overview
  + Design a data structure that supports adding new words and finding if a string matches any previously added string.
  + Implement the WordDictionary class:
    + WordDictionary() Initializes the object.
    + void addWord(word) Adds word to the data structure, it can be matched later.
    + bool search(word) Returns true if there is any string in the data structure that matches word or false otherwise. word may contain dots '.' where dots can be matched with any letter.
* Algorithm
  + This is a Trie problem. TrieNode contains the following elements:
    + child, which can be an array of 26 elements or a dictionary. Each child is another TrieNode.
    + is_word, which tells if this node is the end of a word, and word, which stores the word when is_word is True.
  + The WordDictionary class contains the root as a TrieNode. 
    + When adding a word, start with node = root. For each letter of the word, if node has no child for the letter, create a TrieNode and assign it to that child. Then move node to the child, so node always points to the TrieNode of the current letter.       
    + After the for loop, node points to the TrieNode of the last letter, so set its is_word to True and its word to the word string. 
  + we use a dfs algorithm for recursive searching, where node is the parent of the current letter. If the current letter is the last one: for a regular letter, check that node has that child and that the child is a word; for ".", check whether any child of node is a word. Otherwise: for a regular letter, recurse into the matching child if it exists; for ".", recurse into every child and succeed if any of them succeeds
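* Time and space complexity
  + O(N) for adding a word and for searching a word without dots, where N is the length of the word. Each "." makes the search branch into every child, so the worst case for a pattern with dots can grow up to O(K^N)
  + space complexity is O(KMN), where M is the number of words added, N is the max length of a word, and K is the number of characters, such as 26 if only lower case letters are used.
    + the idea is that we need at most MN nodes, and each node maintains up to K child slots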
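
The branching caused by "." can also be seen in an iterative variant that tracks the set of trie nodes still alive after each character. This is an alternative to the recursive dfs described above, not a restatement of it. The helper names `add_word` and `search` and the `"$"` end marker are hypothetical, and the sketch assumes words contain only lowercase letters.

```python
def add_word(root, word):
    """Insert word into a nested-dict trie."""
    node = root
    for c in word:
        node = node.setdefault(c, {})
    node["$"] = True   # hypothetical end-of-word marker


def search(root, pattern):
    """Match pattern against the trie, where '.' matches any single letter."""
    frontier = [root]                  # all nodes reachable by the pattern so far
    for c in pattern:
        next_frontier = []
        for node in frontier:
            if c == ".":
                # a dot follows every child except the end marker
                next_frontier.extend(child for key, child in node.items() if key != "$")
            elif c in node:
                next_frontier.append(node[c])
        if not next_frontier:
            return False               # no path survives this character
        frontier = next_frontier
    return any("$" in node for node in frontier)


root = {}
for w in ["bad", "dad", "mad"]:
    add_word(root, w)

print(search(root, "pad"))  # False
print(search(root, ".ad"))  # True
print(search(root, "b.."))  # True
```

A pattern without dots keeps the frontier at a single node, which is the O(N) case; every dot can multiply the frontier by the number of children.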

In [None]:
class TrieNode:
    def __init__(self):
        self.child = {}
        self.is_word = False
        self.word = ""

class WordDictionary:

    def __init__(self):
        self.root = TrieNode()        

    def addWord(self, word: str) -> None:
        node = self.root
        
        for c in word:
            if c not in node.child:
                node.child[c] = TrieNode()
            node = node.child[c]
        node.is_word = True
        node.word = word        

    def search(self, word: str) -> bool:
                
        n = len(word)
        
        # index is the index of the current character
        # node is its parent node
        def dfs(index: int, node: TrieNode) -> bool:
            
            c = word[index]
            if index == n-1:
                
                # if c != ".", check if c is in the node's child, if not, return False
                # otherwise check if the child node is an end node
                if c != ".":
                    if c not in node.child:
                        return False
                    return node.child[c].is_word
                
                # otherwise check if any child node is an end node, if so, return True
                # if no child nodes are end nodes, return False
                for child in node.child.values():
                    if child.is_word:
                        return True
                return False            
            
            # if the current char is not the last char
            # check if c is in node's child nodes, if not, return False
            # if so, recursively call the next index
            if c != ".":
                if c not in node.child:
                    return False
                return dfs(index+1, node.child[c])            
           
            # if the char is ".", if any of its child node
            # returns True for the next index, return True
            # any of these nodes can be the parent node of the next index
            # otherwise return False
            for child in node.child.values():
                if dfs(index+1, child):
                    return True
            return False
        
        return dfs(0, self.root)     
        


# Your WordDictionary object will be instantiated and called as such:
# obj = WordDictionary()
# obj.addWord(word)
# param_2 = obj.search(word)

#### Leetcode 421. Maximum XOR of Two Numbers in an Array
* Overview
  + Given an integer array nums, return the maximum result of nums\[i\] XOR nums\[j\], where 0 <= i <= j < n.
* Algorithm
  + Hashmap
    + this is based on the fact that if a^b = c then a = b^c and b = a^c (XOR both sides with b or a, using x^x = 0 and x^0 = x)
    + initialize max\_xor = 0, and L = len(bin(max(nums))) - 2, the number of bits of the largest number
    + for i in range(L-1, -1, -1)
      + let max\_xor <<= 1 so that we focus on the next position to the right
      + initialize curr\_xor = max\_xor | 1 as a candidate for max\_xor with its right most bit set to 1
      + build a prefix set containing num >> i for every num and test if any two of them can generate curr\_xor by xor, using any(curr\_xor ^ p in prefix for p in prefix). If such a pair exists, we set the right most bit of max\_xor to 1, otherwise we leave it as 0
      + repeat this step for all L positions and return max\_xor
    + this greedy choice is safe because a 1 at a higher bit position outweighs all lower bits combined, so we fix the bits from the most significant one down and only keep a 1 when some pair of numbers can actually produce the prefix built so far
    + time and space complexity
      + O(N) for both time and space complexity if we treat L as a constant
      
  + Trie
    + convert each number to a list of its L bits, most significant bit first
    
    + insert the numbers into a trie one by one, with the more significant positions closer to the root
      + for each num, initialize node = opp\_node = root, and curr\_xor = 0
      + we use curr\_xor to track the max possible xor of the current number with the numbers already in the trie. At each bit position we check if the opposite bit (a 1 if the current bit is 0, a 0 otherwise) is available in opp\_node. opp\_node follows the path of a number in the trie that best complements the current num, starting from the most significant bit
      + update curr\_xor and opp\_node depending on the availability of the opposite bit at each position
    + update rs = max(rs, curr\_xor) for each num 
    + return rs
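
Both approaches rest on the XOR identity above, which can be checked directly. The sketch below also includes a quadratic brute-force reference that can be used to sanity-check either implementation on small inputs. The function name `max_xor_bruteforce` is hypothetical and the sample array is chosen only for illustration.

```python
from itertools import combinations


def max_xor_bruteforce(nums):
    """O(n^2) reference: try every pair of numbers."""
    best = 0  # a number XORed with itself is 0, which covers the i == j case
    for a, b in combinations(nums, 2):
        best = max(best, a ^ b)
    return best


a, b = 5, 25
c = a ^ b
print(c)                        # 28
print(c ^ a == b, c ^ b == a)   # True True
print(max_xor_bruteforce([3, 10, 5, 25, 2, 8]))  # 28
```

In binary, 5 is 00101 and 25 is 11001, so their XOR is 11100, which is 28.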

In [5]:
class Solution:
    def findMaximumXOR(self, nums: List[int]) -> int:
        if not nums:
            return 0
        
        L = len(bin(max(nums))) -2
        max_xor = 0
        
        # check the combination of any two numbers to see if we 
        # can keep 1 on each specific position with the existing
        # max_xor obtained from the last iteration
        for i in range(L-1, -1, -1):
            max_xor <<= 1
            prefix = set()
            
            curr_xor = max_xor | 1
            
            # taking advantage of the fact that if a^b = c then
            # c^a = b and c^b = a. Here we use c to test if there
            # is any a, b in nums to satisfy c^a = b, if so, we
            # assign c (curr_xor) to max_xor
            for num in nums:
                prefix.add(num >> i)
            max_xor |= any(p^curr_xor in prefix for p in prefix)
            
            
        return max_xor    

# implemented by Trie
class Solution:
    def findMaximumXOR(self, nums: List[int]) -> int:
        if not nums:
            return 0
        
        L = len(bin(max(nums))) -2
        rs = 0
        
        nums = [[num >> i & 1 for i in range(L)][::-1] for num in nums]
        
        root = {}
        
        for num in nums:
            curr_xor = 0
            
            # both opp_node and node point to root trie
            opp_node = node = root
            
            for bit in num:
                curr_xor <<= 1
                
                # insert the node to trie
                if bit not in node:
                    node[bit] = {}
                node = node[bit] 
                
                # check if the opposite bit exists in opp_node's child nodes
                # if so, it can match node and we get a 1 in the curr_xor bit
                # note opp_node is still one level above the current node
                opp_bit = 1- bit
                if opp_bit in opp_node:
                    opp_node = opp_node[opp_bit]
                    curr_xor |= 1
                else:
                    opp_node = opp_node[bit]
            rs = max(rs, curr_xor) 
            
        return rs                     

#### Leetcode 212. Word Search II
* Overview
  + Given an m x n board of characters and a list of strings words, return all words on the board.
  + Each word must be constructed from letters of sequentially adjacent cells, where adjacent cells are horizontally or vertically neighboring. The same letter cell may not be used more than once in a word.
* Algorithm
  + build a trie with root as its root node and insert all the words into it
  + define a dfs(i, j, parent) function where i, j are the coordinates of the current character on the board, and parent is the TrieNode that is the parent of the node for that character
    + get the char value from i, j
    + get the node for char from parent.child
    + if node corresponds to a word, add the word to the result list, and change is_word to False
    + set board(i, j) = "#" to avoid revisiting the cell on the same path
    + for each of the four directions, check the boundary conditions and whether board(x, y) is in node.child; if so, call dfs(x, y, node)
    + change board(i, j) back to char
    + if node.child is empty
      + parent.child.pop(char)
  + by setting `is_word` to False after finding a TrieNode corresponding to a word, we avoid reporting the same word multiple times. In addition, popping a node from the Trie once it has no children left prunes paths that have already been fully explored, so later searches do not walk them again    
  + traverse the board, and if board(i, j) is in root.child, call dfs(i, j, root)
  + return rs
  + time complexity
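    + O(M * 4 * 3^(L-1)), where M is the number of cells in the board and L is the maximum length of the words. The first step has up to 4 neighbors to explore, and every later step has at most 3 because the cell we came from is excluded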
    ![image-3.png](attachment:image-3.png)
  + space complexity
    + O(N), where N is the total number of letters in the dictionary
 ![image-4.png](attachment:image-4.png)

In [None]:
class TrieNode:
    def __init__(self):
        self.child = {}
        self.is_word = False
        self.word = ""

class Solution:
    def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
        if not board or not words:
            return []
        
        # get the dimensions of board
        m, n = len(board), len(board[0])
        
        # construct root as a TrieNode object
        root = TrieNode()
        rs = []
        
        # insert words to root
        for word in words:
            node = root
            
            for l in word:
                if l not in node.child:
                    node.child[l] = TrieNode()
                node = node.child[l]
            node.is_word = True
            node.word = word
            
        # define dfs function to traverse the board and TrieNode
        # and add words found to rs list
        def dfs(i: int, j: int, parent: TrieNode) -> None:
            
            # find char value and assign the node object
            ch = board[i][j]
            node = parent.child[ch]
            
            # if the node corresponds to a word
            # add the word to rs list, and set is_word to False
            if node.is_word:
                rs.append(node.word)
                node.is_word = False
                
            # set board[i][j] to # sign to avoid infinite loop back
            # and explore the 4 directions on board
            moves = [(-1, 0), (0, 1), (1, 0), (0, -1)]
            board[i][j] = "#"
            
            # check if x and y are within the boundary and the value is in the TrieNode,
            # recursively call dfs
            for move in moves:
                x, y = move[0] + i, move[1] + j
                if 0 <= x < m and 0 <= y < n and board[x][y] in node.child:
                    dfs(x, y, node)
                    
            # set board[i][j] back to ch for other search paths on board
            board[i][j] = ch
            
            # reduce the space consumption by popping the node itself
            # once the current path has been searched to the end
            # and the node has no children left
            # (node.child is a dict, so test for emptiness, not None)
            if not node.child:
                parent.child.pop(ch)
                
        # traverse the board and explore the words by dfs if
        # the char is a starting char of any of the words in root TrieNode
        for i in range(m):
            for j in range(n):
                if board[i][j] in root.child:
                    dfs(i, j, root)
                    
        return rs                     

#### Leetcode 425. Word Squares
* Overview
  + Given an array of unique strings words, return all the word squares you can build from words. The same word from words can be used multiple times. You can return the answer in any order.
  + A sequence of strings forms a valid word square if the kth row and column read the same string, where 0 <= k < max(numRows, numColumns).
    + For example, the word sequence \["ball","area","lead","lady"\] forms a word square because each word reads the same both horizontally and vertically.
* Algorithm (Trie + backtracking)
  + every word has the same length n, so a word square is an ordered sequence of n words. The goal is to find all such sequences that form a word square
  + the basic idea is that a word square is symmetric about its main diagonal: the char at (r, c) must equal the char at (c, r). Diagonal chars mirror themselves, so only the off-diagonal chars need to be compared
  + to do this, we store each word in a Trie. At a given index, the chars at position index of the words chosen so far form a prefix, and by symmetry the next word must start with this prefix. Searching the Trie with the prefix returns all the words that match it, and each of them is tried as the next word in the next dfs recursion
  + we start from index == 1, since with only the first word placed, index == 0 is on the diagonal and there is nothing to check. So each word is tried as the first word, and from index 1 on we use the Trie to look up the words that can match the prefix
  + when index == n, we have obtained n words in a square, so we just add the word list to the result list and return. This is the base case of dfs
  ![image.png](attachment:image.png)
  + time complexity
    + O(N · 26^L · L), where N is the number of input words and L is the length of a single word. We need to traverse up to L steps to get the prefix index list, and each node has up to 26 options
  + space complexity
    + O(NL + NL/2) = O(NL)
      + the first part is used for Trie to store all the words
      + the second part is used to store prefixes. Since prefixes are the horizontal mirror part in the lower left triangle of the word matrix, we only need L/2 as the average length, and we have N of them for each recursive call
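
To make the prefix step concrete, here is a small sketch that rebuilds the example square row by row. It is an illustration under stated assumptions, not the solution itself: `next_prefix` and `candidates` are hypothetical helper names, `candidates` uses a plain linear scan standing in for the Trie lookup, and the loop greedily takes the first match instead of backtracking over all of them.

```python
from typing import List


def next_prefix(rows: List[str], index: int) -> str:
    """Chars at position `index` of the rows chosen so far, read top to bottom."""
    return "".join(row[index] for row in rows)


def candidates(words: List[str], prefix: str) -> List[int]:
    """Indices of the words that start with `prefix` (linear scan, no Trie)."""
    return [i for i, word in enumerate(words) if word.startswith(prefix)]


words = ["ball", "area", "lead", "lady"]
square = ["ball"]  # try "ball" as the first row
for index in range(1, len(words[0])):
    prefix = next_prefix(square, index)
    matches = candidates(words, prefix)
    print(index, prefix, [words[i] for i in matches])
    square.append(words[matches[0]])  # greedy choice, for illustration only

print(square)  # ['ball', 'area', 'lead', 'lady']
```

The prefixes produced are "a", "le", and "lad", each one a character longer than the last, which is why a Trie that stores word indices at every node suits the lookup.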

In [1]:
from typing import List
class TrieNode:
    def __init__(self):
        self.child = {}
        self.index_list = []

class Solution:
    def wordSquares(self, words: List[str]) -> List[List[str]]:
        if not words or not words[0]:
            return []
        
        n = len(words[0])
        root = TrieNode()
        rs = []
        
        # traverse words list and add each word to
        # root. In addition, for each TrieNode, add
        # the index of the word passing through it  
        # to its index_list. This is useful, since
        # we will try prefixes with different lengths
        # and get the index list of words matching prefix
        # note that a TrieNode holds multiple indices only when
        # several words share the same prefix
        for i, word in enumerate(words):
            node = root
            for l in word:
                if l not in node.child:
                    node.child[l] = TrieNode()
                node = node.child[l]
                node.index_list.append(i)
                
        # search from the root and get the candidate word index list
        # this is the vertical part for matching the mirror image
        def get_prefix_word_indices(prefix: str) -> List[int]:
            node = root
            for c in prefix:
                if c not in node.child:
                    return []
                node = node.child[c]
            return node.index_list
        
        # the dfs function provides the horizontal part of the mirror
        # image as prefixes, and send the prefix to get_prefix_word_indices
        # for indices of words that provide vertical matches 
        def dfs(index: int, index_list: List[int]) -> None:
            # base case: when index == n we have already collected n
            # words, and every one of them matched its prefix, so the
            # square is complete
            if index == n:
                rs.append([words[i] for i in index_list])
                return
            
            prefix = "".join([words[i][index] for i in index_list])
            for j in get_prefix_word_indices(prefix):
                dfs(index+1, index_list+[j])
                
        for i in range(len(words)):
            dfs(1, [i])
            
        return rs                
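
A quick way to check results is to test the definition directly: the k-th row must read the same as the k-th column. The helper below is a sketch for that purpose only; `is_word_square` is a hypothetical name, and it assumes the candidate is a list of strings.

```python
from typing import List


def is_word_square(rows: List[str]) -> bool:
    """True if row k and column k read the same string for every k."""
    n = len(rows)
    # A word square needs n rows of length n.
    if any(len(row) != n for row in rows):
        return False
    # Only the off-diagonal chars have to be compared with their mirror.
    return all(rows[r][c] == rows[c][r] for r in range(n) for c in range(r + 1, n))


print(is_word_square(["ball", "area", "lead", "lady"]))  # True
print(is_word_square(["ball", "area", "lady", "lead"]))  # False
```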

#### Leetcode 336. Palindrome Pairs
* Overview
  + You are given a 0-indexed array of unique strings words.
  + A palindrome pair is a pair of integers (i, j) such that:
    + 0 <= i, j < words.length,
    + i != j, and
    + words[i] + words[j] (the concatenation of the two strings) is a palindrome
  + Return an array of all the palindrome pairs of words.
* Algorithm
  + There are three cases that satisfy the requirement. To handle these cases, we first build a dictionary with the word as key and its index as the value.
  + Then we handle the three cases
    1. the reverse of the whole word is in the word list. In addition, we make sure that the reverse is not the word itself (which would give i == j) by skipping words that are palindromes
    2. the word splits into a prefix whose reverse is in the word list and a remaining suffix that is a palindrome by itself. The matching word goes after the current word
    3. the word splits into a palindromic prefix and a remaining suffix whose reverse is in the word list. The matching word goes before the current word
  + notes
    + to handle cases 2 and 3, we traverse each word letter by letter and extract every prefix and suffix whose remaining part is a palindrome. We create two functions that generate all such prefixes and suffixes (reversed) and return them in a list. The split index runs from 0 to length - 1 for prefixes, and from 1 to the string length for suffixes
      + for prefixes, we can have anything from the empty string up to the reverse of the word before the last char
      + for suffixes, we can have anything from the reverse of the substring starting at index 1 down to the empty string
      + both prefixes and suffixes therefore cover from the empty string up to the reverse of a substring of length n-1, and never the whole word
    + the test for the reverse of the entire word (case 1) is done separately in the main traversal
    + a Trie-based lookup gets a time limit error. We need to use a dictionary to speed up the process
  + procedure
    + establish the word dictionary using word as key and its index as value
    + define functions to return the list of prefixes and suffixes for a given word
    + traverse the word list, 
      + first test if the word itself is a palindrome. If it is not, check whether its reversed version is in the word dictionary. If so, append (i, index of reverse) to rs
      + get all the prefixes and check if any of them exists in the word list. If so, append the index pair (i, index) to rs. Note that the prefix is the non-palindrome part at the beginning of the word
      + get all the suffixes and check if any of them exists in the word list. If so, append the index pair (index, i) to rs. Note that the suffix is the non-palindrome part at the end of the word
  + time complexity
    + O(nk^2), where n is the number of words and k is the length of the longest word
      + building the word dictionary takes O(nk), since each word takes O(k) time to insert
      + building the prefixes and suffixes takes O(nk^2), since for each word we traverse k indices and for each index we compare up to k chars
  + space complexity
    + building the hash table takes O(nk)
    + looking up the prefixes or suffixes of one word (up to k of them, each up to k chars long) takes O(k^2)
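
For small inputs, a brute-force reference is handy for checking the faster dictionary approach. The sketch below simply tries every ordered pair, which costs O(n^2 · k) and is not meant to pass the time limit; `palindrome_pairs_brute` is a hypothetical name used only here.

```python
from typing import List


def palindrome_pairs_brute(words: List[str]) -> List[List[int]]:
    """Try every ordered pair (i, j) with i != j and keep the palindromic ones."""
    pairs = []
    for i, first in enumerate(words):
        for j, second in enumerate(words):
            if i != j:
                joined = first + second
                if joined == joined[::-1]:
                    pairs.append([i, j])
    return pairs


words = ["abcd", "dcba", "lls", "s", "sssll"]
print(palindrome_pairs_brute(words))  # [[0, 1], [1, 0], [2, 4], [3, 2]]
```

In this example, [0, 1] and [1, 0] come from case 1 (whole-word reverse), [3, 2] from "s" + "lls", and [2, 4] from "lls" + "sssll", where one part of the longer word is a palindrome and the rest mirrors the shorter word.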

In [2]:
from typing import List, Optional
class Solution:
    def palindromePairs(self, words: List[str]) -> List[List[int]]:
        
        if not words:
            return []
        
        word_list = {word: i for i, word in enumerate(words)}
        
        # return the list of reversed prefixes of word whose remaining
        # suffix is a palindrome. The prefixes range from the empty
        # string up to the substring before the last char (the reverse
        # of the whole string is not included)
        def get_prefix(word: str) -> List[str]:
            rs = []
            for i in range(len(word)):
                if word[i:] == word[i:][::-1]:
                    rs.append(word[:i][::-1])
            return rs
        
        # return the list of reversed suffixes of word whose preceding
        # prefix is a palindrome. The suffixes range from the substring
        # starting at the second char down to the empty string (the
        # reverse of the whole string is not included)
        def get_suffix(word: str) -> List[str]:
            rs = []
            for i in range(1, len(word) + 1):
                if word[:i] == word[:i][::-1]:
                    rs.append(word[i:][::-1])
            return rs
        
        rs = []
        # if the word itself is a palindrome, skip the whole-word reverse check
        for i, word in enumerate(words):
            if word != word[::-1]:
                # if the reverse is in list, add the pair
                index = word_list.get(word[::-1], -1)
                if index != -1:
                    rs.append([i, index])
                
            # if the reverse of the prefix part is in the word list,
            # place that word after the current word
            for prefix in get_prefix(word):
                index = word_list.get(prefix, -1)
                if index != -1:
                    rs.append([i, index])
                    
            # if the reverse of the suffix part is in the word list,
            # place that word before the current word
            for suffix in get_suffix(word):
                index = word_list.get(suffix, -1)
                if index != -1:
                    rs.append([index, i])
                    
        return rs            
        
        
# implemented with a Trie instead of a word dictionary for lookup (Time Limit Exceeded)
class TrieNode:
    def __init__(self) -> None:
        self.children = {}
        self.word_index = -1
        self.is_word = False
        self.word = ""
        
class Solution:
    def palindromePairs(self, words: List[str]) -> List[List[int]]:
        rs = []
        if not words:
            return rs
        
        root = TrieNode()
        
        def buildTrie():            
            for i, word in enumerate(words):
                node = root
                if word =="":
                    node.children[""] = TrieNode()
                    node = node.children[""]
                for c in word:
                    if c not in node.children:
                        node.children[c] = TrieNode()
                    node = node.children[c]
                node.is_word = True
                node.word_index = i
                node.word = word
                
        def searchTrie(word: str) -> int:
            node = root
            if word =="":
                node = node.children.get(word, None)
                if not node:
                    return -1                
            for c in word:
                if c not in node.children:
                    return -1
                node = node.children[c]
            return node.word_index if node.is_word else -1    
            
        def getPrefix(word: str) -> List[str]:
            rs = []
            for i in range(len(word)):
                if word[i:] == word[i:][::-1]:
                    rs.append(word[:i][::-1])
            return rs
        
        def getSuffix(word: str) -> List[str]:
            rs = []
            for i in range(1, len(word) +1):
                if word[:i] == word[:i][::-1]:
                    rs.append(word[i:][::-1])
            return rs
        
        buildTrie()
        
        for i, word in enumerate(words):
            
            if word != word[::-1]:
                index = searchTrie(word[::-1])
                
                if index != -1:
                    rs.append([i, index])
                    
            for prefix in getPrefix(word):
                index = searchTrie(prefix)
                if index != -1:
                    rs.append([i, index])
                    
            for suffix in getSuffix(word):
                index = searchTrie(suffix)
                if index != -1:
                    rs.append([index, i])
                    
        return rs            
                