# Tries

Preprocessing the text, as a trie does, is suitable for applications where a series of queries is performed on a fixed text, so that the initial cost of preprocessing the text is compensated by a speedup in each subsequent query (for example, a Web site that offers pattern matching in Shakespeare’s Hamlet or a search engine that offers Web pages on the Hamlet topic).

A **trie** (pronounced “try”) is a tree-based data structure for storing strings in order to support fast pattern matching. The main application for tries is in information retrieval. 

Indeed, the name “trie” comes from the word “retrieval.”

---

## Trie Data Structure - Prefix Tree:

Tries (prefix trees) are used to efficiently store and retrieve strings, making them suitable for tasks like autocomplete and spell check.

A trie is another kind of tree in which each node usually represents a single character.

When the keys are lowercase English words, each node can have up to 26 children, one per letter.
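
With a fixed alphabet, the children of a node can be kept in a fixed-size list with one slot per letter instead of a dictionary. The following is a minimal sketch under that assumption; `ArrayTrieNode`, `insert_word` and `contains_word` are hypothetical names used only for illustration, and the input is assumed to contain only the letters a-z.

```python
class ArrayTrieNode:
    """Trie node with one child slot per lowercase English letter."""

    def __init__(self):
        # Assumption: keys use only 'a'..'z', so index 0 is 'a' and 25 is 'z'.
        self.children = [None] * 26
        self.is_end_of_word = False


def insert_word(root, word):
    node = root
    for ch in word:
        i = ord(ch) - ord("a")
        if node.children[i] is None:
            node.children[i] = ArrayTrieNode()
        node = node.children[i]
    node.is_end_of_word = True


def contains_word(root, word):
    node = root
    for ch in word:
        node = node.children[ord(ch) - ord("a")]
        if node is None:
            return False
    return node.is_end_of_word


root = ArrayTrieNode()
insert_word(root, "ape")
insert_word(root, "ate")
print(contains_word(root, "ape"))  # True
print(contains_word(root, "ap"))   # False
```

The list wastes space when most slots stay empty, which is one reason the solutions later in this notebook use dictionaries for the children.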

A new word reuses any prefix that already exists in the tree, so you only add nodes for the characters that are missing.

In [None]:
#           A
#         / | \
#        p  n  t
#      /        \
#     e          e

# when we want to add "ANT"

# All we have to do is to add a T

#           A
#         / | \
#        p  n  t
#      /    |   \
#     e     T    e

We can also search for a word in $O(N)$ time, where $N$ is the length of the word.
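
The prefix sharing and the one-step-per-character lookup can be sketched with nested dictionaries. This is an illustrative sketch, not the implementation used in the questions below; the function names and the `"$"` end-of-word marker are assumptions made for this example.

```python
def insert(trie, word):
    """Insert word into a nested-dict trie and return how many nodes were created."""
    created = 0
    node = trie
    for ch in word:
        if ch not in node:
            node[ch] = {}
            created += 1
        node = node[ch]
    node["$"] = True  # hypothetical end-of-word marker
    return created


def search(trie, word):
    """Follow one child per character, so at most len(word) steps."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "$" in node


trie = {}
print(insert(trie, "ape"))  # 3
print(insert(trie, "ate"))  # 2
print(insert(trie, "an"))   # 1
print(insert(trie, "ant"))  # 1
print(search(trie, "ant"))  # True
print(search(trie, "at"))   # False
```

Adding "ant" creates a single node because "an" is already in the tree, which matches the diagram above.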

Use cases of a suffix trie (a trie that stores every suffix of a string):

1. **Efficient Pattern Matching:**
   - One of the main applications of suffix tries is in pattern matching. Given a pattern, the trie allows for quick determination of whether the pattern exists in the original string.
2. **Longest Common Substring:**
   - Suffix tries can be used to find the longest common substring between two or more strings efficiently.
3. **Substring Retrieval:**
   - A suffix trie allows for fast retrieval of all occurrences of a substring within the original string.
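
The pattern matching use case can be sketched as follows: insert every suffix of the text into a trie, then check whether the pattern is a path from the root. This is a naive sketch that takes quadratic time and space to build, and `build_suffix_trie` and `contains_substring` are hypothetical names chosen for this example.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie (naive build)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains_substring(suffix_trie, pattern):
    """A pattern occurs in the text exactly when it is a prefix of some suffix."""
    node = suffix_trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


suffix_trie = build_suffix_trie("banana")
print(contains_substring(suffix_trie, "nan"))  # True
print(contains_substring(suffix_trie, "nab"))  # False
```

Once the trie is built, each query costs time proportional to the pattern length, independent of the text length.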

# Questions! 🎃

In [3]:
r"""
Trie - Prefix Tree

A trie (pronounced as "try") or prefix tree is 
a tree data structure used to efficiently store 
and retrieve keys in a dataset of strings. 

There are various applications of this data 
structure, such as autocomplete and spellchecker.

Implement the Trie class:

    Trie(): Initializes the trie object.

    void insert(String word): 
        
        Inserts the string word into the trie.

    boolean search(String word): Returns true if the string 
        word is in the trie (i.e., was inserted before), and 
        false otherwise. 

    boolean startsWith(String prefix): Returns true if there 
        is a previously inserted string word that has the 
        prefix prefix, and false otherwise.
 
Example 1:

    Input:

        ["Trie", "insert", "search", "search", 
            "startsWith", "insert", "search"]
    
        [[], ["apple"], ["apple"], ["app"], 
            ["app"], ["app"], ["app"]]

    Output:
    
        [null, null, true, false, true, null, true]

    Explanation:

        Trie trie = new Trie();
        trie.insert("apple");
        trie.search("apple");   // return True
        trie.search("app");     // return False
        trie.startsWith("app"); // return True
        trie.insert("app");
        trie.search("app");     // return True

Constraints:

    1 <= word.length, prefix.length <= 2000
    
    Word and prefix consist only of 
        lowercase English letters.
    
    At most 3 * 10^4 calls in total will be 
    made to insert, search, and startsWith.

Takeaway:

    We can start with making a TrieNode for the class.
    Each node will have child nodes, and at 
    each word ending we will have a Flag 
    indicating the end.
    
    Using a Trie data structure, we have efficient 
    string search and prefix matching.
    
    The time complexity for inserting, searching, and 
    checking prefixes in a Trie is O(L), where L is 
    the length of the word or prefix, making it 
    an efficient choice for string-related tasks.
    
    For different words we will be using 
    a lot of the same nodes
    
    .
     \ 
      a
       \ 
        p 
         \
          p 
         / \ 
        e   l 
       /      \ 
      n        e

"""

class TrieNode:
    def __init__(self):
        # A dictionary to store child nodes.
        self.children = {}  
        # A flag to indicate the end of a word.
        self.is_end_of_word = False 

class Trie:

    def __init__(self):
        # root is a TrieNode
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Insert a word in the Trie
        
        Start at the root node.
        For each character in the word:
        Check if the character is a child of the current node.
        If not, create a new node for the character.
        Move to the child node corresponding to the character.
        After processing all characters, mark the 
        last node as the end of a word.
        """
        current = self.root
        for char in word:
            if char not in current.children:
                current.children[char] = TrieNode()
            # to the next node
            current = current.children[char]
        # All characters are processed, so the node
        # we ended on gets its "is_end_of_word" 
        # attribute set to True
        current.is_end_of_word = True
        
    def search(self, word: str) -> bool:
        """
        Search for a word

        Start at the root node.
        For each character in the word:
        Check if the character is a child of the current node. 
        If not, the word is not in the Trie.
        Move to the child node corresponding to the character.
        After processing all characters, check if the last node 
        is marked as the end of a word.
        """
        current = self.root
        for char in word:
            if char not in current.children:
                # if no complete match, return False
                return False
            # to the next node
            current = current.children[char]
        # True only if an inserted word ends at this node
        return current.is_end_of_word

    def starts_with(self, prefix: str) -> bool:
        """
        Check if any inserted word starts with a given prefix

        Start at the root node.
        For each character in the prefix:
        Check if the character is a child of the current node. 
        If not, there are no words with the given prefix.
        Move to the child node corresponding to the character.
        After processing all characters, the prefix exists, 
        whether or not the last node is marked as the end of a word.
        """
        current = self.root
        for char in prefix:
            if char not in current.children:
                return False
            # on to the next node
            current = current.children[char]
        # if we made it this far, some inserted word 
        # starts with the given prefix
        return True

my_trie = Trie()

my_trie.insert("apple")
my_trie.insert("banana")
my_trie.insert("app")

print(my_trie.search("apple"))  # True
print(my_trie.search("banana"))  # True
print(my_trie.search("app"))  # True
print(my_trie.search("ap"))  # False

print(my_trie.starts_with("app"))  # True
print(my_trie.starts_with("ban"))  # True
print(my_trie.starts_with("or"))  # False

True
True
True
False
True
True
False
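
The same traversal used by `starts_with` can be extended into a small autocomplete: walk down to the node for the prefix, then collect every word below it. The following is a hedged sketch on a nested-dict trie; `build_trie`, `words_with_prefix` and the `"#"` marker are assumptions for this example and are not part of the `Trie` class above.

```python
def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["#"] = True  # end-of-word marker (assumes '#' never appears in a word)
    return root


def words_with_prefix(root, prefix):
    """Return all stored words that start with prefix, in sorted order."""
    node = root
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]

    results = []

    def collect(current, path):
        # '#' sorts before lowercase letters, so shorter words come first.
        for key in sorted(current):
            if key == "#":
                results.append(prefix + "".join(path))
            else:
                path.append(key)
                collect(current[key], path)
                path.pop()

    collect(node, [])
    return results


trie = build_trie(["apple", "app", "apply", "banana"])
print(words_with_prefix(trie, "app"))  # ['app', 'apple', 'apply']
print(words_with_prefix(trie, "c"))    # []
```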


In [5]:
"""
Design a data structure that supports adding 
new words and finding if a string matches any 
previously added string.

Implement the WordDictionary class:

    WordDictionary(): 
        Initializes the object.

    void addWord(word): 
        Adds word to the data structure, it 
            can be matched later.

    bool search(word): 
        Returns true if there is any string in the 
        data structure that matches word or false otherwise. 
        word may contain dots '.' where dots can be matched
        with any letter.

Example:

    Input:
        ["WordDictionary","addWord","addWord","addWord",
            "search","search","search","search"]
        
        [[],["bad"],["dad"],["mad"],["pad"],
            ["bad"],[".ad"],["b.."]]

    Output:
        [null,null,null,null,false,true,true,true]

    Explanation:
        
        WordDictionary wordDictionary = new WordDictionary();
        wordDictionary.addWord("bad");
        wordDictionary.addWord("dad");
        wordDictionary.addWord("mad");
        wordDictionary.search("pad"); // return False
        wordDictionary.search("bad"); // return True
        wordDictionary.search(".ad"); // return True
        wordDictionary.search("b.."); // return True

Constraints:

    1 <= word.length <= 25
    
    word in addWord consists of lowercase English letters.
    
    word in search consist of '.' or lowercase English letters.
    
    There will be at most 2 dots in word for search queries.
    
    At most 10^4 calls will be made to addWord and search.

Takeaway:

    This is obviously a Trie (Prefix Tree) Question
    
    Because we are looking for all words starting 
    with some characters "ab." or "b.."
    
    A root and 26 children in the Trie
    
    The "." character is a wildcard. It can be 
    used instead of any character.
    
    We should use an end-of-word flag to show where a word ends.
    
    The class-based Trie solution gave us time limit exceeded,
    so a second solution that stores the trie as nested 
    dictionaries (hashmaps) is added.

"""

class TrieNode:
    def __init__(self):
        self.children = {}
        self.word_ends = False

class WordDictionary:

    def __init__(self):
        self.root = TrieNode()

    def addWord(self, word: str) -> None:
        
        current = self.root

        for char in word:
            if char not in current.children:
                current.children[char] = TrieNode()
            # go on to the next character
            current = current.children[char]
        # last character saying the word ended
        current.word_ends = True
        
    def search(self, word: str) -> bool:
        
        def dfs(j, root):
            # continue matching word[j:] from the given node
            current = root

            for i in range(j, len(word)):
                c = word[i]

                if c == ".":
                    # wildcard: try every child for the rest of the word
                    for child in current.children.values():
                        if dfs(i + 1, child):
                            # we found a match
                            return True
                    return False

                else:
                    if c not in current.children:
                        return False
                    # onto next
                    current = current.children[c]
            # all characters matched; check that a word ends here
            return current.word_ends
        return dfs(0, self.root)


class WordDictionary:

    def __init__(self):
        # Initialize the WordDictionary with an 
        # empty root node,
        # which is represented as a dictionary.
        self.root = {}

    def addWord(self, word: str) -> None:
        # Add a word to the WordDictionary.
        # Traverse the WordDictionary's tree 
        # structure, creating nodes for each 
        # character in the word.
        
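        # Mark the end of a word with 
        # the '*' key in the dictionary.
        # The value is False on purpose: when a '.' in search 
        # iterates over node.values(), the marker is rejected 
        # by the `if not node` check instead of being 
        # treated as a child node.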
        curr_node = self.root
        for ch in word:
            if ch not in curr_node:
                curr_node[ch] = {}
            curr_node = curr_node[ch]
        curr_node['*'] = False

    def search(self, word: str) -> bool:
        # Search for a word in the WordDictionary.
        # Use a depth-first search (DFS) 
        # approach to traverse
        # the tree and match characters in the word.
        def dfs(node, index):
            if not node:
                # An empty node, or the False stored under '*', 
                # cannot match the rest of the word.
                return False
            if index == len(word):
                # If we have reached the end of 
                # the word, check if the '*' key 
                # exists to indicate a complete word.
                return '*' in node
            if word[index] != '.':
                if word[index] not in node:
                    # If the character in the word 
                    # is not in the tree,
                    # the word cannot be found.
                    return False
                # Continue searching for the 
                # next character in the word.
                return dfs(node[word[index]], index + 1)
            for n in node.values():
                # If the character is a dot ('.'), 
                # explore all possible
                # branches and check if any path 
                # leads to a valid word.
                if dfs(n, index + 1):
                    return True
            return False

        return dfs(self.root, 0)

# Your WordDictionary object will be 
# instantiated and called as such:

# obj = WordDictionary()
# obj.addWord(word)
# param_2 = obj.search(word)

In [6]:
"""
Given an m x n board of characters and a 
list of strings words, return all words 
on the board.

Each word must be constructed from letters 
of sequentially adjacent cells, where adjacent 
cells are horizontally or vertically neighboring. 

The same letter cell may not be used 
more than once in a word.

Example 1:

    Input: board = [["o","a","a","n"],
                    ["e","t","a","e"],
                    ["i","h","k","r"],
                    ["i","f","l","v"]], 

        words = ["oath","pea","eat","rain"]
    
    Output: ["eat","oath"]

Example 2:

    Input: board = [["a","b"],
                    ["c","d"]], 
                    
            words = ["abcb"]
    
    Output: []

Constraints:

    m == board.length
    
    n == board[i].length
    
    1 <= m, n <= 12
    
    board[i][j] is a lowercase English letter.
    
    1 <= words.length <= 3 * 10^4
    
    1 <= words[i].length <= 10
    
    words[i] consists of lowercase English letters.
    
    All the strings of words are unique.

Takeaway:

    we need to make a data structure where 
    we can see every possible word in the board
    we can use a Trie
            
    Brute force would be,
    starting from each tile, run a depth first search
    and check if you can make the words
            
    we can check every word at the same time
    because our main condition is based on prefix
    So a Trie is great!
    
    Let's make a Trie for our words
    in order to not check the words list every time 
    we go down in our dfs.
    Also, we do not have to check tiles that none of 
    our words start with.
    
    Base case for the DFS is pretty big:
    
    out of bounds
    already visited position
    maybe the character we are working 
    on is not in our Trie

"""

class TrieNode:
    # streamline memory usage
    __slots__ = "children", "end_of_word"
    def __init__(self):
        self.children = {}
        self.end_of_word = False 

    def add_word(self, word):
        # root node
        cur = self

        for c in word:
            # If that character does not exist
            if c not in cur.children:
                # make a new Node
                cur.children[c] = TrieNode()
            # go on forward
            cur = cur.children[c]
        # word ended
        cur.end_of_word = True

class Solution:

    def findWords_(self, board: list[list[str]], 
                words: list[str]) -> list[str]:
        # This is my first try
        
        # we need to make a data structure where 
        # we can see every possible word in the board
        # we can use a Trie
        
        # Brute force would be,
        # starting from each tile, run a depth first search
        # and check if you can make the words
        
        # we can check every word at the same time
        # because our main condition is based on prefix
        # So a Trie is great!

        # let's make a Trie for our words
        # in order to not check the words list every time 
        # we go down in our dfs
        # also we do not have to check tiles that none of 
        # our words start with 
        
        root = TrieNode()

        # add all words to our Trie
        for w in words:
            root.add_word(w)
        
        ROWS, COLS = len(board), len(board[0])
        
        # result - we want the result to be unique
        # visit - we do not want to repeat same character
        result, visit = set(), set()

        def dfs(r, c, node, word):
            # out of bounds
            # already visited position
            # maybe the character we are working 
            # on is not in our Trie
            if (r < 0 or c < 0 or 
                r == ROWS or c == COLS or
                (r, c) in visit or 
                board[r][c] not in node.children):
                return
            
            # add the position to visited
            visit.add((r,c))

            # onto the next tile
            node = node.children[board[r][c]]
            
            # add the character to our current word
            word += board[r][c]

            # is the current result a Word?
            if node.end_of_word:
                result.add(word)

            # go to every position
            dfs(r - 1, c, node, word)
            dfs(r + 1, c, node, word)
            dfs(r, c - 1, node, word)
            dfs(r, c + 1, node, word)

            # after we are done, we can remove 
            # the position from being visited
            visit.remove((r,c))

        for r in range(ROWS):
            for c in range(COLS):
                dfs(r, c, root, "")

        return list(result)
    

    def findWords(self, board: list[list[str]], 
                words: list[str]) -> list[str]:
        # we can solve the question 
        # in a single method too
        
        def dfs(x, y, root):
            temp = board[x][y]
            curr = root[temp]
            
            # Check if a word ends at this node, add 
            # it to the result
            word = curr.pop('#', False)
            if word:
                res.append(word)
            board[x][y] = '.'

            # Define the possible directions to move
            dirs = [(-1, 0), (1, 0), (0, 1), (0, -1)]
            for dx, dy in dirs:
                new_x = x + dx
                new_y = y + dy

                # Check if the new position is valid and the 
                # character is in the trie
                if (
                    0 <= new_x < m
                    and 0 <= new_y < n
                    and board[new_x][new_y] in curr
                ):
                    dfs(new_x, new_y, curr)
            
            # Restore the original cell value and remove 
            # the current node if it's a leaf
            board[x][y] = temp
            if not curr:
                root.pop(temp)
        
        trie = {}
        for word in words:
            curr = trie
            # for each character of the word
            for ch in word:
                # If the character is not a child yet, 
                # setdefault adds it with an empty 
                # dictionary as its value. Either way, 
                # move the current node pointer to the 
                # child node for the current character.
                curr = curr.setdefault(ch, {})
            # end of word
            curr['#'] = word
        
        m, n = len(board), len(board[0])
        res = []

        # Start DFS from each cell on the board
        for i in range(m):
            for j in range(n):
                if board[i][j] in trie:
                    dfs(i, j, trie)
        return res