## Trie

A trie is a tree-like data structure in which every node can have any number of children.

In a binary tree, every node has left, right and data fields.
Here, every node has a `children` collection and an `end` boolean flag in its structure.

class Node:
    def __init__(self):
        self.children = {}    # Hash map from character to child node, filled in as words are inserted
        self.end = False      # False by default; set to True on the node where a word ends

(or)

class Node:
    def __init__(self):
        self.children = [None] * 26     # Fixed-size array: 'a' maps to index 0 and 'z' to index 25
        self.end = False 

### 374. Construct a trie from scratch

A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

Implement the Trie class:

Trie() Initializes the trie object.
void insert(String word) Inserts the string word into the trie.
boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.
 

Example 1:

Input
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output
[null, null, true, false, true, null, true]

Explanation
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // return True
trie.search("app");     // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app");     // return True

In [2]:
# Here we took children as hash_map

class Node:
    def __init__(self):
        self.children = {}
        self.end = False

class Trie:
    def __init__(self):
        self.root = Node()

    # If curr already has a child for the current character, move to that child.
    # Otherwise create a new node, attach it as a child of curr, and move to it.
    # Mark end = True on the last node.
    def insert(self, word):
        
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
        curr.end = True
        return None

    # Starting from root, keep checking all the char,
    # If at the end we are pointing to end of word node then return True else return False
    def search(self, word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return False
        
        if curr.end == True:
            return True
        return False

    # Same as search, just no need to check end of last node
    def startsWith(self, prefix):
        
        curr = self.root
        for i in prefix:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return False
        
        return True

# Insert: (For one word)
# Time comp: O(key_size)
# Space comp: O(key_size) new nodes in the worst case (O(26 * key_size) slots with the array-based node)

# Search: (For one word)
# Time comp: O(key_size)
# Space comp:O(1)

# Starts With: (For one word)
# Time comp: O(key_size)
# Space comp:O(1)

In [4]:
trie = Trie()
print(trie.insert("apple"))
print(trie.search("apple"))
print(trie.search("app"))
print(trie.startsWith("app"))
print(trie.insert("app"))
print(trie.search("app"))

None
True
False
True
None
True
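
Since the problem statement mentions autocomplete, here is a minimal sketch of how a hash-map trie could list every stored word under a prefix. The names `TrieNode`, `build_trie` and `words_with_prefix` are illustrative assumptions and are not part of the solution above.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.end = False     # True if a word ends at this node


def build_trie(words):
    root = TrieNode()
    for word in words:
        curr = root
        for ch in word:
            curr = curr.children.setdefault(ch, TrieNode())
        curr.end = True
    return root


def words_with_prefix(root, prefix):
    # Walk down to the node that represents the prefix.
    curr = root
    for ch in prefix:
        if ch not in curr.children:
            return []
        curr = curr.children[ch]

    # Pre-order traversal below that node, visiting children in sorted order.
    found = []
    stack = [(curr, prefix)]
    while stack:
        node, text = stack.pop()
        if node.end:
            found.append(text)
        for ch in sorted(node.children, reverse=True):
            stack.append((node.children[ch], text + ch))
    return found


root = build_trie(["apple", "app", "apply", "bat"])
print(words_with_prefix(root, "app"))   # ['app', 'apple', 'apply']
```

The prefix walk is the same loop as `startsWith`; only the traversal below the prefix node is new.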


In [None]:
# Here children are stored as an array (lowercase 'a'-'z' keys only)

class TrieNode:
    def __init__(self):
        self.children = [None] * 26
        self.isEndOfWord = False

def insert(root,key):
    curr = root
    for i in key:
        index = ord(i)-ord('a')            # Map 'a'..'z' to index 0..25
        if curr.children[index] is not None:
            curr = curr.children[index]
        else:
            new = TrieNode()
            curr.children[index] = new
            curr = curr.children[index]
    curr.isEndOfWord = True
    return None

def search(root, key):
    curr = root
    for i in key:
        index = ord(i)-ord('a')
        if curr.children[index] is not None:
            curr = curr.children[index]
        else:
            return False
    
    # The key is present only if a word ends exactly at the last node
    return curr.isEndOfWord
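
The array-based cell has no prefix check, so here is a minimal sketch of one, assuming keys contain only lowercase `a`-`z`. The function names `insert_key` and `starts_with` are hypothetical; `TrieNode` and `isEndOfWord` follow the names used in the cell above.

```python
class TrieNode:
    def __init__(self):
        self.children = [None] * 26   # slot 0 is 'a', slot 25 is 'z'
        self.isEndOfWord = False


def insert_key(root, key):
    curr = root
    for ch in key:
        index = ord(ch) - ord('a')
        if curr.children[index] is None:
            curr.children[index] = TrieNode()
        curr = curr.children[index]
    curr.isEndOfWord = True


def starts_with(root, prefix):
    curr = root
    for ch in prefix:
        index = ord(ch) - ord('a')
        # Assumption: a character outside a-z simply cannot match.
        if not 0 <= index < 26 or curr.children[index] is None:
            return False
        curr = curr.children[index]
    return True


root = TrieNode()
insert_key(root, "apple")
print(starts_with(root, "app"))   # True
print(starts_with(root, "apl"))   # False
```

The array trades memory (26 slots per node) for index lookups without hashing.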

A variation of the above implementation (the trie also counts words and supports erase):


1) Trie(): Ninja has to initialize the object of this “TRIE” data structure.

2) insert(“WORD”): Ninja has to insert the string “WORD”  into this “TRIE” data structure.

3) countWordsEqualTo(“WORD”): Ninja has to return how many times this “WORD” is present in this “TRIE”.

4) countWordsStartingWith(“PREFIX”): Ninjas have to return how many words are there in this “TRIE” that have the string “PREFIX” as a prefix.

5) erase(“WORD”): Ninja has to delete one occurrence of the string “WORD” from the “TRIE”.

Note:
1. If erase(“WORD”) function is called then it is guaranteed that the “WORD” is present in the “TRIE”.


Sample Input 1:
1
5
insert coding
insert ninja
countWordsEqualTo coding
countWordsStartingWith nin
erase coding

Sample Output 1:
1
1   

In [None]:
class Node:
    def __init__(self):
        self.children = {}
        self.end_count = 0
        self.prefix_count = 0

class Trie:
    def __init__(self):
        self.root = Node()

    def insert(self, word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
                curr.prefix_count += 1
            else:
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
                curr.prefix_count += 1
            
        curr.end_count += 1

    def countWordsEqualTo(self, word):
        curr = self.root
        
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return 0
            
        return curr.end_count

    def countWordsStartingWith(self, word):
        curr = self.root
        
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return 0
        return curr.prefix_count

    def erase(self, word):
        # The word is guaranteed to be present, so every character has a child node
        curr = self.root
        for i in word:
            curr = curr.children[i]
            curr.prefix_count -= 1
        curr.end_count -= 1
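
The `erase` above only decrements counters, so the nodes of an erased word stay allocated. A minimal sketch of a variant that detaches a branch once its `prefix_count` drops to zero follows. The names `CountNode`, `CountingTrie` and `_walk` are illustrative assumptions, and `erase` still relies on the problem's guarantee that the word is present.

```python
class CountNode:
    def __init__(self):
        self.children = {}
        self.end_count = 0      # words ending exactly here
        self.prefix_count = 0   # words passing through this node


class CountingTrie:
    def __init__(self):
        self.root = CountNode()

    def insert(self, word):
        curr = self.root
        for ch in word:
            curr = curr.children.setdefault(ch, CountNode())
            curr.prefix_count += 1
        curr.end_count += 1

    def _walk(self, text):
        # Return the node reached by following text, or None if the path breaks.
        curr = self.root
        for ch in text:
            curr = curr.children.get(ch)
            if curr is None:
                return None
        return curr

    def countWordsEqualTo(self, word):
        node = self._walk(word)
        return node.end_count if node else 0

    def countWordsStartingWith(self, prefix):
        node = self._walk(prefix)
        return node.prefix_count if node else 0

    def erase(self, word):
        # Assumes the word is present, as the problem guarantees.
        curr = self.root
        for ch in word:
            child = curr.children[ch]
            child.prefix_count -= 1
            if child.prefix_count == 0:
                # No remaining word uses this branch, so detach it entirely.
                del curr.children[ch]
                return
            curr = child
        curr.end_count -= 1


trie = CountingTrie()
trie.insert("coding")
trie.insert("ninja")
print(trie.countWordsEqualTo("coding"))     # 1
print(trie.countWordsStartingWith("nin"))   # 1
trie.erase("coding")
print(trie.countWordsEqualTo("coding"))     # 0
print("c" in trie.root.children)            # False
```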

### 375. Find shortest unique prefix for every word in a given list

Given an array of words, find all shortest unique prefixes to represent each word in the given array. Assume that no word is prefix of another.

Example 1:

Input: 
N = 4
arr[] = {"zebra", "dog", "duck", "dove"}
Output: z dog du dov
Explanation: 
z => zebra 
dog => dog 
duck => du 
dove => dov 
Example 2:

Input: 
N = 3
arr[] =  {"geeksgeeks", "geeksquiz",
                       "geeksforgeeks"};
Output: geeksg geeksq geeksf
Explanation: 
geeksgeeks => geeksg 
geeksquiz => geeksq 
geeksforgeeks => geeksf

In [8]:
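"""
In each trie node, store the children as well as a counter (children_count) of how many inserted words continue past this node.
Build the trie by inserting every word.
To find the prefix of a word, traverse the trie and append each character to the answer string.
As soon as the counter of the current node is <= 1, the prefix is unique, so return the answer string.
"""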

class Node:
    def __init__(self):
        self.children = {}
        self.children_count = 0

class Solution:
    def __init__(self):
        self.root = Node()
        
    def insert(self,word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr.children_count += 1
                curr = curr.children[i]
            else:
                curr.children_count += 1
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
        
    def findPrefix(self,word):
        prefix = ""          # avoid shadowing the built-in str
        curr = self.root
        
        for i in word:
            if curr.children_count <= 1:
                return prefix
            else:
                prefix = prefix + i
                curr = curr.children[i]
        return prefix
    
    def findPrefixes(self, arr, N):
        if N == 1:
            return []
        
        for i in arr:
            self.insert(i)
        
        ans = []
        for i in arr:
            ans.append(self.findPrefix(i))
        
        return ans
    
# Time comp: O(N * avg len of words)
# Space comp: O(N * avg len of words)

In [9]:
s = Solution()
s.findPrefixes(["zebra", "dog", "duck", "dove"],4)

['z', 'dog', 'du', 'dov']
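
For comparison, a common alternative bookkeeping stores on each node the number of words that pass through that node, and stops at the first node whose count is 1. The following is a minimal sketch under the problem's assumption that no word is a prefix of another; `PrefixNode` and `shortest_unique_prefixes` are hypothetical names.

```python
class PrefixNode:
    def __init__(self):
        self.children = {}
        self.count = 0   # number of inserted words that pass through this node


def shortest_unique_prefixes(words):
    root = PrefixNode()
    for word in words:
        curr = root
        for ch in word:
            curr = curr.children.setdefault(ch, PrefixNode())
            curr.count += 1

    result = []
    for word in words:
        curr = root
        prefix = ""
        for ch in word:
            curr = curr.children[ch]
            prefix += ch
            if curr.count == 1:   # only this word reaches the node
                break
        result.append(prefix)
    return result


print(shortest_unique_prefixes(["zebra", "dog", "duck", "dove"]))
# ['z', 'dog', 'du', 'dov']
```

With this bookkeeping a single-word list yields the word's first character, which differs from the `N == 1` special case in the cell above.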

### 376. Word Break Problem | (Trie solution)

Given a string A and a dictionary of n words B, find out if A can be segmented into a space-separated sequence of dictionary words. 

Example 1:

Input:
n = 12
B = { "i", "like", "sam", "sung", "samsung","mobile","ice","cream", "icecream", "man","go", "mango" }, 
A = "ilike"
Output: 1
Explanation: The string can be segmented as "i like".

Example 2:

Input: 
n = 12 
B = { "i", "like", "sam", "sung", "samsung","mobile","ice","cream", "icecream", "man", "go", "mango" }, 
A = "ilikesamsung" 
Output: 1
Explanation: The string can be segmented as "i like samsung" or "i like sam sung".

In [12]:
class Node:
    def __init__(self):
        self.children = {}
        self.end = False

class Solution:
    def __init__(self):
        self.root = Node()
    
    # Insert every word
    def insert(self,word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
        
        curr.end = True

    # Check whether the given word is stored in the trie as a complete word (its last node has end = True)
    def isLeaf(self,word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return False
        
        # After reading the entire word, curr.end is True only if that exact word was inserted.
        return curr.end

    def check(self,word):
        n = len(word)
        if n == 0:
            return True
        
        for i in range(1,n+1):
            # Check every partition: word[:i] must be a dictionary word and the remaining string must also be valid
            if self.isLeaf(word[:i]) and self.check(word[i:]):
                return True
        return False

    def wordBreak(self, A, B):
        # Insert all the words in trie
        for i in B:
            self.insert(i)
        
        if self.check(A):
            return True
        return False
    
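# Time comp: O(k) to build the trie, where k = sum of lengths of all strings present in B.
# check() has no memoization, so it is exponential in |A| in the worst case; caching the result per suffix makes it polynomial.
# Space comp: O(|A| + k)    # |A| due to recursion, k for the trie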

In [14]:
s = Solution()
s.wordBreak("ilike",["i", "like", "sam", "sung", "samsung","mobile","ice","cream", "icecream", "man","go", "mango"])

True
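
The recursion in `check` re-solves the same suffix many times. A minimal sketch of a memoized variant follows, which also walks the trie character by character instead of re-checking each slice from the root. The names `WordNode`, `word_break` and `can_split` are assumptions for illustration, and `functools.lru_cache` is used here as one convenient way to cache results per start index.

```python
from functools import lru_cache


class WordNode:
    def __init__(self):
        self.children = {}
        self.end = False


def word_break(text, dictionary):
    root = WordNode()
    for word in dictionary:
        curr = root
        for ch in word:
            curr = curr.children.setdefault(ch, WordNode())
        curr.end = True

    @lru_cache(maxsize=None)
    def can_split(start):
        # True if text[start:] can be segmented into dictionary words.
        if start == len(text):
            return True
        curr = root
        for pos in range(start, len(text)):
            curr = curr.children.get(text[pos])
            if curr is None:
                return False          # no dictionary word continues this way
            if curr.end and can_split(pos + 1):
                return True
        return False

    return can_split(0)


words = ["i", "like", "sam", "sung", "samsung", "mobile",
         "ice", "cream", "icecream", "man", "go", "mango"]
print(word_break("ilikesamsung", words))   # True
print(word_break("ilikex", words))         # False
```

Each start index is solved once and each solve walks at most |A| characters, so this sketch runs in O(|A|^2) time after the trie is built.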

### 377. Given a sequence of words, print all anagrams together

Given an array of strings, return all groups of strings that are anagrams. The groups must be created in order of their appearance in the original array. Look at the sample case for clarification.
Note: The final output will be in lexicographic order.

Example 1:

Input: N = 5, words[] = {act,god,cat,dog,tac}
Output:
act cat tac 
god dog
Explanation: There are 2 groups of anagrams: "god", "dog" make group 1 and "act", "cat", "tac" make group 2.

In [None]:
"""
Sort the characters of each word and insert the sorted word into the trie.
At the end of the word, store the word's index in the original array.

Then sort every word again, one by one, and look it up in the trie; the last node gives the list of indices.
Group the words at those indices, and keep a visited array so that no group is repeated.
"""

class Node:
    def __init__(self):
        self.children = {}
        self.array = []

class Solution:
    def __init__(self):
        self.root = Node()
        
    def insert(self,word,index):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
        
        curr.array.append(index)
        
    def find(self,word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return []
        
        return curr.array
        
    def Anagrams(self, words, n):
        for i in range(len(words)):
            word = list(words[i])
            word.sort()
            word = "".join(word)
            
            self.insert(word,i)
        
        visited = [0] * len(words)
        ans = []
        for i in range(len(words)):
            if visited[i] == 0:
                word = list(words[i])
                word.sort()
                word = "".join(word)
                l = self.find(word)
                temp = []
                for j in l:
                    temp.append(words[j])
                    visited[j] = 1
                
                if len(temp):
                    ans.append(list(temp))
        
        return ans
    
# Time comp: O(N * S * logS)    S = avg length of each string
# Space comp:O(N * S)
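
The same idea can be done in a single pass if the last trie node holds the group itself, which removes the need for the visited array. This is a minimal sketch with hypothetical names (`AnagramNode`, `group_anagrams`); groups come out in order of first appearance.

```python
class AnagramNode:
    def __init__(self):
        self.children = {}
        self.group = None   # list of words whose sorted letters end here


def group_anagrams(words):
    root = AnagramNode()
    groups = []
    for word in words:
        curr = root
        for ch in sorted(word):
            curr = curr.children.setdefault(ch, AnagramNode())
        if curr.group is None:
            # First word with this multiset of letters: open a new group.
            curr.group = []
            groups.append(curr.group)
        curr.group.append(word)
    return groups


print(group_anagrams(["act", "god", "cat", "dog", "tac"]))
# [['act', 'cat', 'tac'], ['god', 'dog']]
```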

### 378. Phone directory

Given a list of contacts contact[] of length n, where each contact is a string that exists in a phone directory, and a query string s, implement a search query for the phone directory. Run a search query for each prefix p of the query string s (i.e. the prefixes of length 1 to |s|) and print all the distinct contacts that have p as a prefix, in lexicographically increasing order. Refer to the explanation below for a better understanding.
Note: If there is no match between query and contacts, print "0".

Example 1:

Input: 
n = 3
contact[] = {"geeikistest", "geeksforgeeks", "geeksfortest"}
s = "geeips"

Output:
geeikistest geeksforgeeks geeksfortest
geeikistest geeksforgeeks geeksfortest
geeikistest geeksforgeeks geeksfortest
geeikistest
0
0

Explanation: 
By running the search query on contact list for "g" we get: "geeikistest", "geeksforgeeks" and "geeksfortest".
By running the search query on contact list for "ge" we get: "geeikistest" "geeksforgeeks" and "geeksfortest".
By running the search query on contact list for "gee" we get: "geeikistest" "geeksforgeeks" and "geeksfortest".
By running the search query on contact list for "geei" we get: "geeikistest".
No results found for "geeip", so print "0". 
No results found for "geeips", so print "0".

In [21]:
class Node:
    def __init__(self):
        self.children = {}
        self.list = []
        self.end = False

class Solution:
    def __init__(self):
        self.root = Node()
    
    def insert(self,word,index):
        curr = self.root
        
        # At every char, keep inserting index of original array as well.
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
                curr.list.append(index)
            else:
                new = Node()
                curr.children[i] = new
                curr = curr.children[i]
                curr.list.append(index)
        curr.end = True
    
    def check(self,word):
        curr = self.root
        for i in word:
            if i in curr.children:
                curr = curr.children[i]
            else:
                return False
        
        return curr.list
    
    def displayContacts(self, n, contact, s):
        # insert all words
        for i in range(n):
            self.insert(contact[i],i)
        
        ans = []
        
        # For every prefix, get the list of indices and keep making ans array
        for i in range(1,len(s)+1):
            l = self.check(s[:i])
            if l == False:
                ans.append(list([0]))
                continue
                
            temp = []
            for j in l:
                temp.append(contact[j])
            
            temp = list(set(temp))     # The problem asks for distinct contacts in sorted order, hence set + sort
            temp.sort()
            ans.append(list(temp))
        
        self.root = None
        return ans
    
# Time comp: O(|s| * n * max|contact[i]|)
# space comp: O(n * max|contact[i]|)

In [22]:
s = Solution()
s.displayContacts(3,["geeikistest", "geeksforgeeks", "geeksfortest"],"geeips")

[['geeikistest', 'geeksforgeeks', 'geeksfortest'],
 ['geeikistest', 'geeksforgeeks', 'geeksfortest'],
 ['geeikistest', 'geeksforgeeks', 'geeksfortest'],
 ['geeikistest'],
 [0],
 [0]]
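
Because each prefix of `s` extends the previous one, the trie can be walked once for the whole query instead of restarting from the root for every prefix. A minimal sketch of that idea follows; `ContactNode` and `display_contacts` are hypothetical names, and the sketch appends the string `"0"` as the problem statement words it, whereas the cell above appends the integer `0`.

```python
class ContactNode:
    def __init__(self):
        self.children = {}
        self.indices = []   # indices of contacts that pass through this node


def display_contacts(contacts, query):
    root = ContactNode()
    for index, name in enumerate(contacts):
        curr = root
        for ch in name:
            curr = curr.children.setdefault(ch, ContactNode())
            curr.indices.append(index)

    answer = []
    curr = root
    for ch in query:
        # Advance one level per query character; None means the prefix is dead.
        curr = curr.children.get(ch) if curr is not None else None
        if curr is None:
            answer.append(["0"])
        else:
            answer.append(sorted({contacts[i] for i in curr.indices}))
    return answer


result = display_contacts(["geeikistest", "geeksforgeeks", "geeksfortest"], "geeips")
for row in result:
    print(" ".join(row))
```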

### 379. Unique rows in boolean matrix

Given a binary matrix your task is to find all unique rows of the given matrix.

Example 1:

Input:
row = 3, col = 4 
M[][] = {{1 1 0 1},{1 0 0 1},{1 1 0 1}}
Output: 1 1 0 1 $1 0 0 1 $

Explanation: The above matrix of size 3x4 looks like
1 1 0 1
1 0 0 1
1 1 0 1

The two unique rows are 1 1 0 1 and 1 0 0 1 .

In [16]:
class Node:
    def __init__(self):
        self.children = {}
        self.end = 0

# Insert every row and at the end increment end counter of last trie node
def insert(root,arr):
    curr = root
    for i in arr:
        if i in curr.children:
            curr = curr.children[i]
        else:
            new = Node()
            curr.children[i] = new
            curr = curr.children[i]
    curr.end += 1

# Traverse entire row and at the end if end counter is greater than 0 then return True and make it 0 
# so that same row doesn't get involved next time
def checkUnique(root,arr):
    curr = root
    for i in arr:
        if i in curr.children:
            curr = curr.children[i]
        else:
            return False
    
    if curr.end > 0:
        curr.end = 0
        return True
    else:
        return False

def uniqueRow(row, col, arr):
    root = Node()
    
    # Prepare a matrix from array
    matrix = []
    k = 0
    for i in range(row):
        temp = []
        for j in range(col):
            temp.append(arr[k])
            k += 1
        matrix.append(list(temp))
    
    # Insert every row
    for i in range(row):
        insert(root,matrix[i])
    
    # Check every row and insert unique row in ans array
    ans = []
    for i in range(row):
        if checkUnique(root,matrix[i]):
            ans.append(list(matrix[i]))
    
    return ans

# Time comp:O(Row * Col)
# Space comp:O(Row * Col)

In [17]:
uniqueRow(3,4,[1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1])

[[1, 1, 0, 1], [1, 0, 0, 1]]
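
The insert and check passes can be merged if insertion itself reports whether the row was seen before. A minimal sketch follows, assuming the input is already a 2D list rather than the flattened array used above; `RowNode`, `insert_if_new` and `unique_rows` are hypothetical names.

```python
class RowNode:
    def __init__(self):
        self.children = {}
        self.end = False


def insert_if_new(root, row):
    # Returns True only the first time this exact row is inserted.
    curr = root
    for value in row:
        curr = curr.children.setdefault(value, RowNode())
    if curr.end:
        return False
    curr.end = True
    return True


def unique_rows(matrix):
    root = RowNode()
    return [list(row) for row in matrix if insert_if_new(root, row)]


print(unique_rows([[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 1]]))
# [[1, 1, 0, 1], [1, 0, 0, 1]]
```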

### 502. Longest Word With All Prefixes || Complete String

A string is called a complete string if every prefix of this string is also present in the array ‘A’. Ninja is challenged to find the longest complete string in the array ‘A’. If there are multiple strings with the same length, return the lexicographically smallest one, and if no such string exists, return "None".

N = 4
A = [ “ab” , “abc” , “a” , “bp” ] 

Explanation : 

Only prefix of the string “a” is “a” which is present in array ‘A’. So, it is one of the possible strings.

Prefixes of the string “ab” are “a” and “ab” both of which are present in array ‘A’. So, it is one of the possible strings.

Prefixes of the string “bp” are “b” and “bp”. “b” is not present in array ‘A’. So, it cannot be a valid string.

Prefixes of the string “abc” are “a”,“ab” and “abc” all of which are present in array ‘A’. So, it is one of the possible strings.

We need to find the maximum length string, so “abc” is the required string.

In [None]:
"""
1. Make a trie of all the given words.
2. Then check whether each word is complete or not.
3. Keep the longest of them in ans.
"""

from typing import List

class Node:
    def __init__(self):
        self.children = {}
        self.end = False

def insert(root,word):
    curr = root
    for i in word:
        if i in curr.children:
            curr = curr.children[i]
        else:
            new = Node()
            curr.children[i] = new
            curr = curr.children[i]
    curr.end = True
    return root

# Go through each char of the word: the char must be present in the trie,
# and it must also be the end of some inserted string, i.e. its end flag must be True
def isComplete(root,word):
    curr = root
    for i in word:
        if i in curr.children:
            curr = curr.children[i]
            if curr.end == False:
                return False
        else:
             return False
    return True
    
def completeString(n: int, a: List[str])-> str:
    root = Node()
    
    # Insert all the words in trie
    for i in range(n):
        root = insert(root,a[i])
    
    # Initially take ans as None
    ans = None
    
    # Go through every word again and check whether all its prefixes are present in the array.
    for i in range(n):
        if isComplete(root,a[i]):
            # If it is a complete string, keep the longest one seen so far
            if ans is None or len(ans) < len(a[i]):
                ans = a[i]
            # If more than one string has the same length, keep the lexicographically smallest of them
            elif len(ans) == len(a[i]):
                ans = min(ans,a[i])
    return ans

# Time comp: O(N) * O(avg len of all words)
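
An alternative to checking each word separately is a single depth-first walk that only descends through nodes where a word ends. The following is a minimal sketch with hypothetical names (`CNode`, `complete_string`); like the cell above it returns Python's `None`, not the string "None", when no complete string exists.

```python
class CNode:
    def __init__(self):
        self.children = {}
        self.end = False


def complete_string(words):
    root = CNode()
    for word in words:
        curr = root
        for ch in word:
            curr = curr.children.setdefault(ch, CNode())
        curr.end = True

    best = ""

    def dfs(node, text):
        nonlocal best
        if len(text) > len(best):
            best = text
        # Only descend through nodes where a word ends, in alphabetical order,
        # so the first string found at each length is the smallest one.
        for ch in sorted(node.children):
            child = node.children[ch]
            if child.end:
                dfs(child, text + ch)

    dfs(root, "")
    return best if best else None


print(complete_string(["ab", "abc", "a", "bp"]))   # abc
```

The recursion depth equals the length of the longest complete string, so very long words would need an explicit stack instead.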

### 503. Number of Distinct Substrings in a String

Given a string of length N of lowercase alphabet characters. The task is to complete the function countDistinctSubstring(), which returns the count of total number of distinct substrings of this string.

Input:
ab
ababa

Output:
4
10

In [5]:
# Without trie:

def countDistinctSubstring(s):
    ans=set()
    n=len(s)
    for window in range(n):
        i=0
        for j in range(window,n):
            ans.add(s[i:j+1])
            i+=1
            
    return len(ans)+1

# Time comp: O(N^3)
# Space comp: O(N^2) substrings in the set, i.e. O(N^3) characters in the worst case

In [None]:
# Using Trie:

class Node:
    def __init__(self):
        self.children = {}

def insert(root,word):
    curr = root
    count = 0
    for i in word:
        if i in curr.children:
            curr = curr.children[i]
        else:
            new = Node()
            curr.children[i] = new
            curr = curr.children[i]
            count += 1                  # whenever new node inserted increase the count value
    
    return count

def countDistinctSubstring(s):
    root = Node()
    count = 0
    
    # Insert every suffix s[i:] into the trie and add up the number of newly created nodes
    for i in range(len(s)):
        count += insert(root,s[i:])
    return count + 1            # plus 1 for considering an empty string as substring as well

# Time comp:O(N^2)
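
As a quick sanity check, the two approaches can be compared on a few inputs. This is a minimal sketch that swaps the `Node` class for plain nested dicts to stay short; `count_with_set` and `count_with_trie` are hypothetical names.

```python
def count_with_set(s):
    # Brute force: collect every substring in a set, plus one for the empty string.
    seen = {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
    return len(seen) + 1


def count_with_trie(s):
    # Each trie node is a plain dict; every newly created node is one new substring.
    root = {}
    count = 0
    for start in range(len(s)):
        curr = root
        for ch in s[start:]:
            if ch not in curr:
                curr[ch] = {}
                count += 1
            curr = curr[ch]
    return count + 1


for text in ["ab", "ababa", "aaaa", ""]:
    assert count_with_set(text) == count_with_trie(text)
print(count_with_trie("ab"), count_with_trie("ababa"))   # 4 10
```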