Commit 1cab791

Trie Book analysis added
1 parent 6b93788 commit 1cab791

2 files changed: +38 -5 lines changed

Trie_Data_Structure.md

+31 -1
@@ -1,8 +1,38 @@
 # Trie Data Structure
 
-Trie is an efficient information reTrieval data structure. Using Trie, search complexities can be brought to optimal limit (key length). If we store keys in binary search tree, a well balanced BST will need time proportional to M * log N, where M is maximum string length and N is number of keys in tree. Using Trie, we can search the key in O(M) time. However the penalty is on Trie storage requirements.
+A trie is an efficient information reTrieval data structure. Also called a digital tree or prefix tree (a radix tree is its space-optimized variant), a trie is a kind of search tree: an ordered tree data structure used to store a dynamic set or associative array whose keys are usually strings. Using a trie, search complexity can be brought down to the optimal limit (the key length). If we store keys in a binary search tree, a well-balanced BST needs time proportional to M * log N, where M is the maximum string length and N is the number of keys in the tree. Using a trie, we can search for a key in O(M) time. The penalty, however, is the trie's storage requirements.
 <br>
 
+## Efficiency of Trie
+
+The complexity of creating a trie is O(W*L), where W is the number of words and L is the average word length: on average, you perform L lookups for each of the W words in the set.
+<br>
+In the associated [code](Trie_Data_Structure.py), I used a text file with more than 7 million words (created by concatenating text from books such as [Gone with the wind](http://biblioteka.kijowski.pl/mitchell%20margaret/gone%20with%20the%20wind.pdf) and a few others). Creating the trie takes O(W*L) time.
+Searching, however, takes only O(L) time, where L is the length of the word being searched for.
+<br>
+
+### Analysis
+
+If you run the associated [code](Trie_Data_Structure.py), you should see output similar to:
+
+```python
+
+Reading the File
+Calculating the number of words
+No of words: 7171744
+>>TIME: 1.359375
+
+Creating Trie Data Structure
+Trie Successfully Created
+>>TIME: 24.203125
+
+enter the word to search: the
+{'No of times word occurs': 407360, 'The word exists': True}
+>>TIME: 0.0
+
+```
+where the time is in **fractional seconds**, measured using **time.process_time()** in Python.
+
 ## Advantages of Tries
 
 <br>
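
The build and search costs described in the added section can be illustrated with a minimal sketch. This is not the repository's `Trie_Data_Structure.py`, which this commit shows only in part: the `TrieNode` class, its `children` dictionary, and the two-argument `insert` signature are assumptions made for illustration, while the `endofword` and `count` fields and the result keys mirror the ones visible in the diff of `search`.

```python
class TrieNode:
    """One trie node; `children` maps a character to the next node (assumed layout)."""

    def __init__(self):
        self.children = {}
        self.endofword = False
        self.count = 0


def insert(word, root):
    """Walk or create one node per character: O(L) for a word of length L."""
    current = root
    for ch in word:
        if ch not in current.children:
            current.children[ch] = TrieNode()
        current = current.children[ch]
    current.endofword = True
    current.count += 1


def search(word, root):
    """Follow one edge per character: O(L), regardless of how many words are stored."""
    current = root
    for ch in word:
        node = current.children.get(ch)
        if node is None:
            return {"The word exists": False, "No of times word occurs": 0}
        current = node
    return {"The word exists": current.endofword,
            "No of times word occurs": current.count}


root = TrieNode()
for w in "the cat saw the other cat".split():  # W inserts of about L steps each
    insert(w, root)

print(search("the", root))  # stored twice
print(search("th", root))   # a prefix only, not a stored word
```

Because each call touches at most one node per character, the lookup cost depends on the word length and not on the 7 million words already inserted, which is the point the section makes.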

Trie_Data_Structure.py

+7 -4
@@ -41,23 +41,26 @@ def search(word,root):
         current = node
     return {"The word exists" : current.endofword,"No of times word occurs" : current.count}
 
-
+start = time.process_time()
 print("Reading the File")
 store = open("store_for_trie.txt","r",encoding="utf8")
 text = store.read()
 words = text.split()
 print("Calculating the number of words")
 print("No of words: " + str(len(words)))
-
+print(">>TIME: "+str(time.process_time() - start))
+
+starts = time.process_time()
 print("Creating Trie Data Structure")
 for word in words:
     insert(word)
 print("Trie Successfully Created")
+print(">>TIME: "+str(time.process_time() - starts))
 
+startss = time.process_time()
 tosearch = input("enter the word to search: ")
 print(search(tosearch,root))
-
-# to simply try the trie data structure uncomment below lines
+print(">>TIME: "+str(time.process_time() - startss))
 
 # insert("HelloBhavin")
 # insert("Hell")
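
A note on the `>>TIME: 0.0` reading for the search step: `time.process_time()` reports CPU time for the process and excludes time spent sleeping, so waiting at the `input()` prompt is not counted. The three reported values are also whole multiples of 0.015625 s, which hints at a coarse clock tick on the machine that produced them; that is an inference from the numbers, not something the commit states. One way to get a non-zero figure for a single lookup is to average many calls with `time.perf_counter()`, sketched below. The `time_call` helper and the dictionary stand-in for the trie are hypothetical and not part of the repository.

```python
import time


def time_call(func, *args, repeats=1000):
    """Return the average wall-clock seconds per call over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


# Stand-in for a trie lookup so the sketch runs on its own (assumption).
lookup = {"the": 2}
per_call = time_call(lookup.get, "the")
print(">>TIME: " + format(per_call, ".9f"))
```

In the script itself, the same helper could wrap `search(tosearch, root)` after the prompt has returned, so that only the lookup is measured.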
