### Tries / Tree

A `trie` is a kind of tree that is well suited to implementing autocomplete.  
The name trie is derived from the word `retrieval`.  
Most people pronounce trie as `try`.  

### Nodes / Hash Table

A trie is not a binary tree: each node can have `any number` of child nodes.  
In this implementation, each trie node contains a `hash` table.  
The `keys` are English characters and the values are other nodes of the trie.  

### Insert Node

We `populate` our trie by inserting words one character at a time.

In [9]:
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root

        for c in word:

            # If the current node has a child keyed by the current character
            if currentNode.children.get(c):

                # Follow the child node:
                currentNode = currentNode.children[c]

            else:

                # Add the character as a new child node
                newNode = TrieNode()
                currentNode.children[c] = newNode

                # Follow this new node
                currentNode = newNode

T = Trie()
T.insert('ace')
T.insert('bad')
T.insert('act')

print(T.root.children)
print("a", T.root.children['a'].children)
print("a:c", T.root.children['a'].children['c'].children)

{'a': <__main__.TrieNode object at 0x7f80ec1acdf0>, 'b': <__main__.TrieNode object at 0x7f80ec1ace80>}
a {'c': <__main__.TrieNode object at 0x7f80ec1acee0>}
a:c {'e': <__main__.TrieNode object at 0x7f80ec1f5e50>, 't': <__main__.TrieNode object at 0x7f80ec1f5670>}
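
The printed output above shows only memory addresses, which makes the shape of the trie hard to read. As a minimal sketch, the helper `to_dict` below (a hypothetical name, not part of the original implementation) converts a node into nested plain dictionaries so the whole trie can be printed at once. The `insert` method is a condensed version of the one above with the same behavior.

```python
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root
        for c in word:
            # Create the child node only if it does not exist yet
            if c not in currentNode.children:
                currentNode.children[c] = TrieNode()
            currentNode = currentNode.children[c]

def to_dict(node):
    # Hypothetical helper: recursively replace each TrieNode with a plain dict
    return {key: to_dict(child) for key, child in node.children.items()}

T = Trie()
for word in ['ace', 'bad', 'act']:
    T.insert(word)

nested = to_dict(T.root)
print(nested)
# {'a': {'c': {'e': {}, 't': {}}}, 'b': {'a': {'d': {}}}}
```

The shared prefix `ac` is stored once, and `e` and `t` branch off from the same `c` node.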


### Asterisk

We need to indicate when a prefix of a `word` is also a word itself, such as `bat` inside `batter`.  
To do so, we add a `*` key to the node where each word ends.


In [25]:
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root

        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode

        # After inserting the entire word in the trie, add * key at the end
        currentNode.children["*"] = None

T = Trie()
T.insert('bat')
T.insert('batter')

print(T.root.children)
print('b', T.root.children['b'].children)
print('b:a', T.root.children['b'].children['a'].children)
print('b:a:t', T.root.children['b'].children['a'].children['t'].children)

{'b': <__main__.TrieNode object at 0x7f80ec1b0880>}
b {'a': <__main__.TrieNode object at 0x7f80ec1b03d0>}
b:a {'t': <__main__.TrieNode object at 0x7f80ec1b0a30>}
b:a:t {'*': None, 't': <__main__.TrieNode object at 0x7f80ec1b06a0>}


### Search

We check if the current node has a child with the `current` character as the key.  
If there is such a child, we `update` the current node to be the child node; otherwise we return `None`.  
At the end we return the node rather than just `True`, which will help us with the `autocomplete` feature.

In [24]:
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root

        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode

        currentNode.children["*"] = None

    def search(self, word):
        currentNode = self.root

        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children.get(c)
            else:
                return None

        return currentNode

T = Trie()
T.insert('ace')
T.insert('bad')
T.insert('act')
T.insert('cat')
T.insert('bat')
T.insert('batter')

print(T.search('cat').children.items())
print(T.search('bat').children.items())

dict_items([('*', None)])
dict_items([('*', None), ('t', <__main__.TrieNode object at 0x7f80ec1aceb0>)])
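
Because `search` returns a node for any prefix that exists in the trie, it does not by itself say whether the prefix is a complete word. As a minimal sketch, the function `is_word` below (a hypothetical name, not part of the original implementation) combines `search` with a check for the `*` key.

```python
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root
        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode
        # Mark the end of a complete word
        currentNode.children["*"] = None

    def search(self, word):
        currentNode = self.root
        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children.get(c)
            else:
                return None
        return currentNode

def is_word(trie, word):
    # Hypothetical helper: a word is complete only if its last node has a "*" key
    node = trie.search(word)
    return node is not None and "*" in node.children

T = Trie()
T.insert('bat')
T.insert('batter')

print(is_word(T, 'bat'))     # True: 'bat' was inserted as a word
print(is_word(T, 'batt'))    # False: only a prefix of 'batter'
print(is_word(T, 'batter'))  # True
print(is_word(T, 'cat'))     # False: not in the trie at all
```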


### Efficiency / O(K)

The trie `search` is very efficient: O(K), where K is the number of `characters` in our search term.  
It takes one step per character, regardless of how many words the trie holds.  
The `insert` takes K+1 steps because of the extra `*` key added at the end; dropping the constant, it is also O(K).  
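
To make the O(K) claim concrete, the sketch below counts the characters examined during a search. The function `search_with_steps` and the two sample tries are assumptions made for illustration only; the point is that the count depends on the length of the search term, not on the number of words stored.

```python
from itertools import product

class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root
        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode
        currentNode.children["*"] = None

def search_with_steps(trie, word):
    # Hypothetical variant of search that also counts the characters examined
    currentNode = trie.root
    steps = 0
    for c in word:
        steps += 1
        child = currentNode.children.get(c)
        if child is None:
            return None, steps
        currentNode = child
    return currentNode, steps

# A small trie with two words
small = Trie()
for word in ['bat', 'batter']:
    small.insert(word)

# A larger trie: 8**4 = 4096 four-letter strings plus 'batter'
large = Trie()
for letters in product('abcdefgh', repeat=4):
    large.insert(''.join(letters))
large.insert('batter')

_, small_steps = search_with_steps(small, 'batter')
_, large_steps = search_with_steps(large, 'batter')
print(small_steps, large_steps)  # 6 6
```

Both searches take six steps, one per character of `batter`, even though the second trie holds thousands of additional words.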


### Autocomplete

The `collectWords` method returns all the trie's words `starting` from a particular node.

In [56]:
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root

        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode

        currentNode.children["*"] = None

    def search(self, word):
        currentNode = self.root

        for char in word:
            if currentNode.children.get(char):
                currentNode = currentNode.children.get(char)
            else:
                return None

        return currentNode

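    def collectWords(self, node=None, word="", words=None):
        # A mutable default ([]) would be shared across calls, so create the list here
        if words is None:
            words = []
        currentNode = node or self.root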
        
        for key, node in currentNode.children.items():
            if key == '*':
                words.append(word)
            else: 
                # If we're still in the middle of the word
                self.collectWords(node, word + key, words)

        return words

T = Trie()

for word in ['ace', 'bad', 'act', 'cat', 'bat', 'batter']:
    T.insert(word)

words1 = T.collectWords(T.root.children['b'], 'b', words=[])
words2 = T.collectWords(T.search('bat'), 'bat', words=[])

print("Words for 'b' = ", words1)
print("Words for 'bat' = ", words2)

Words for 'b' =  ['bad', 'bat', 'batter']
Words for 'bat' =  ['bat', 'batter']
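
The two calls above repeat the same pattern: find the node for a prefix, then collect the words below it. As a minimal sketch, the method `autocomplete` below (a hypothetical name, not part of the original implementation) wraps that pattern and returns an empty list when the prefix is not in the trie. This version of `collectWords` uses `None` as the default for `words` so that separate calls never share a list.

```python
class TrieNode:
    def __init__(self):
        self.children = {}

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        currentNode = self.root
        for c in word:
            if currentNode.children.get(c):
                currentNode = currentNode.children[c]
            else:
                newNode = TrieNode()
                currentNode.children[c] = newNode
                currentNode = newNode
        currentNode.children["*"] = None

    def search(self, word):
        currentNode = self.root
        for char in word:
            if currentNode.children.get(char):
                currentNode = currentNode.children.get(char)
            else:
                return None
        return currentNode

    def collectWords(self, node=None, word="", words=None):
        if words is None:
            words = []
        currentNode = node or self.root
        for key, child in currentNode.children.items():
            if key == '*':
                # A complete word ends at this node
                words.append(word)
            else:
                # Still in the middle of a word: keep descending
                self.collectWords(child, word + key, words)
        return words

    def autocomplete(self, prefix):
        # Hypothetical wrapper: locate the prefix node, then collect below it
        node = self.search(prefix)
        if node is None:
            return []
        return self.collectWords(node, prefix)

T = Trie()
for word in ['ace', 'bad', 'act', 'cat', 'bat', 'batter']:
    T.insert(word)

print(T.autocomplete('ba'))   # ['bad', 'bat', 'batter']
print(T.autocomplete('bat'))  # ['bat', 'batter']
print(T.autocomplete('x'))    # []
```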
