## TRIE 
A trie is an efficient information re*Trie*val data structure. With it, search complexity can be brought down to the optimal limit (the key length).

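Using a trie, we can search for a key in O(M) time, where M is the number of characters in the key, regardless of how many keys are stored. The penalty is the trie's storage requirements, since every character of every key may need its own node.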

![TRIE](https://media.geeksforgeeks.org/wp-content/cdn-uploads/Trie.png)
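
To make the O(M) claim concrete, here is a minimal sketch of a trie built from nested dictionaries. The names `END`, `build_trie` and `search`, and the sample words, are assumptions made for illustration; they are not part of the notebook's classes further down. `search` also reports how many characters it examined, which never exceeds the key length.

```python
END = "<end>"  # hypothetical sentinel key marking the end of a stored word


def build_trie(words):
    """Build a trie as nested dicts, one dict level per character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def search(root, key):
    """Return (found, steps); steps is the number of characters examined."""
    node = root
    steps = 0
    for ch in key:
        steps += 1
        if ch not in node:
            return False, steps
        node = node[ch]
    # The key is a stored word only if a word ends exactly here.
    return END in node, steps


root = build_trie(["the", "there", "their", "answer", "any", "by", "bye"])
print(search(root, "there"))  # (True, 5)
print(search(root, "th"))     # (False, 2): a prefix, not a stored word
```

The lookup cost depends only on the key, not on the number of stored words, assuming each per-node dictionary lookup takes constant time on average.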

In [3]:
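# sent_tokenize relies on NLTK's Punkt sentence tokenizer data, which may need
# a one-time nltk.download('punkt') (or 'punkt_tab' on newer NLTK releases).
from nltk.tokenize import sent_tokenize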

In [4]:
paragraph = """Republican Sen. Rand Paul objected Wednesday to an attempt to pass the bill funding 9/11 first responders' health care unanimously, arguing that passing such a long-term bill without offsetting the cost would contribute to the national debt.  The delay presents another hurdle in the dramatic fight to secure funding for the September 11th Victim Compensation Fund, despite Senate Majority Leader Mitch McConnell's continued reassurances that the fund would be fully funded. After comedian and fund advocate Jon Stewart gave emotional testimony last month accusing lawmakers of failing to support the bill, the measure was swiftly approved for to a floor vote in the House and passed the lower chamber last week on an overwhelmingly bipartisan 402-12 vote.
When presidential candidate and Democratic Sen. Kirsten Gillibrand of New york requested unanimous consent -- a procedural move that allows a bill to skip several steps to pass unanimously, without senators casting an individual vote -- on the bill on the Senate floor Wednesday so that it be accelerated to a vote without debate, Paul objected.
"It has long been my feeling that we need to address our massive debt in this country -- we have a $22 trillion debt, we're adding debt at about a trillion dollars a year," he said. "And therefore any new spending that we are approaching, any new program that's going to have the longevity of 70, 80 years, should be offset by cutting spending that's less valuable."
"We need to at the very least have this debate. I will be offering up an amendment if this bill should come to the floor, but until then I will object," added Paul, who voted in favor of President Donald Trump's $1.5 trillion tax cut. That tax cut is helping drive a deficit increase.
Paul was not the only senator who objected to the attempt to pass the bill by unanimous consent on Wednesday. Sen. Mike Lee of Utah "alerted the cloakroom that he objected to the bill passing without a vote," Lee's communications director Conn Carroll told CNN.
Though he did not object on the Senate floor in response to Gillibrand's proposal, "he is seeking a vote to ensure the fund has the proper oversight in place to prevent fraud and abuse," Carroll added."""

In [127]:
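class TrieNode:
    """A single character in the trie, linked to its parent and children."""

    def __init__(self, character, prev=None):
        self.character = character
        self.parent_node = prev   # None for a first-level node
        self.finished = False     # True when a word ends at this node
        self.childrens = {}       # maps next character -> TrieNode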

    def __repr__(self):
        return f"{self.parent_node.character if self.parent_node else 'root'} ==> {self.character} ==> {' '.join(self.childrens)}"
    
    def insert(self, word):
        if word:
            # Consume the first character and recurse on the rest.
            first_letter = word[0]
            if first_letter not in self.childrens:
                self.childrens[first_letter] = TrieNode(first_letter, self)
            self.childrens[first_letter].insert(word[1:])
        else:
            # Nothing left to insert: a word ends at this node.
            self.finished = True
    
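    def get_node(self, key):
        # An empty key means the whole prefix was matched.
        if not key:
            return self
        child = self.childrens.get(key[0])
        # Return None (not a partial match) when the prefix is absent.
        return child.get_node(key[1:]) if child is not None else None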
    
    @property
    def word(self):
        # Walk up to the first-level node, then reverse to read the prefix.
        ptr = self.parent_node
        word = [self.character]
        while ptr is not None:
            word.append(ptr.character)
            ptr = ptr.parent_node
        return ''.join(word[::-1])
    
    def get_words(self):
        # Collect every complete word stored at or below this node.
        words = []
        for item in self.childrens.values():
            words += item.get_words()

        if self.finished:
            words.append(self.word)
        return words

In [128]:
class Trie:
    """Container for the first-level nodes; it stores no character itself."""

    def __init__(self):
        self.childrens = {}

    def __repr__(self):
        return ' '.join(self.childrens)
    
    def insert(self, word):
        if word:
            first_letter = word[0]
            if first_letter not in self.childrens:
                # First-level nodes have no parent node.
                self.childrens[first_letter] = TrieNode(first_letter)
            self.childrens[first_letter].insert(word[1:])
        
    def get_node(self, key):
        # Delegate to the first-level node for key[0], if there is one.
        if key and key[0] in self.childrens:
            return self.childrens[key[0]].get_node(key[1:])
        return None
    
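    def add_paragraph(self, paragraph):
        # Split into sentences, then into whitespace-separated words.
        # Punctuation stays attached to the words (e.g. "responders'").
        for sentence in sent_tokenize(paragraph):
            for w in sentence.split():
                self.insert(w.lower())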

In [129]:
t = Trie()
t.add_paragraph(paragraph)
t

r s p o w t a b f 9 h c u l n d i 1 v m j g e 4 k y - " $ 7 8

In [132]:
item = t.get_node('re')
item

r ==> e ==> p s a q

In [133]:
item.get_words()

['republican', "responders'", 'response', 'reassurances', 'requested']
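
The trailing apostrophe in `"responders'"` comes from `add_paragraph` splitting on whitespace only, so punctuation stays attached to the words (which is also why `"`, `$` and `-` show up as first-level keys). One possible cleanup is sketched here, under the assumption that stripping leading and trailing ASCII punctuation is acceptable for the use case; `clean_tokens` is a hypothetical helper, not part of the notebook.

```python
import string


def clean_tokens(text):
    """Lowercase text and return its words without surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        # Strip punctuation from both ends; inner characters are kept.
        token = raw.strip(string.punctuation)
        if token:  # skip tokens that were punctuation only, such as "--"
            tokens.append(token)
    return tokens


sample = "first responders' health care -- \"It has\" $22 trillion"
print(clean_tokens(sample))
# ['first', 'responders', 'health', 'care', 'it', 'has', '22', 'trillion']
```

This also turns `$22` into `22` and drops the bare `--` token, while inner punctuation such as the hyphen in `402-12` is preserved. Whether that trade-off is desirable depends on what the trie is used for.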