GitHub - jpeak5/prefixMatching

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 50 Commits
ArrayStorage.java		ArrayStorage.java
PrefixMatcher.java		PrefixMatcher.java
README		README
Trie.java		Trie.java
dictionary.txt		dictionary.txt

Repository files navigation

Anthony Legendre
Jason Peak
Michael Taboada

Submission Contents:
PrefixMatcher.java
ArrayStorage.java
Trie.java
dictionary.txt

Essentially, this program takes a prefix input and returns the results using an Array-based storage class and a
Trie-based storage class. After receiving the input, the program runs these two implementations in sequence so that
we may compare their speed and space efficiency.

ArrayStorage.java utilizes an array of strings, which can expand to fit all the strings needed by any given list of
words. The structure uses binary search in an alphabetically sorted text file to find the range of words that match the
prefix input. This range of words is added to a linked list and the list is displayed as output.

Trie.java is a tree of node objects in which each node contains an ArrayList containing references to all the children
of the node in addition to a single character key and an indication of the node's status as a leaf. (Whether or not it
is a leaf.) At the root of the Trie, the structure searches for matches by starting with the first character in the
prefix. If a match is found in the ArrayList containing the node's childeren, the structure uses binary search in the
ArrayLists of children for the next characters in the search query until we fail. If the structure succeeds in matching
all of the characters of input, the last node object encountered in the search is used as the root of the subtree of
word matches. Then, the structure recursively traverses the subtree (subTrie) and reverse the results to produce the
word match.

Results: While the array-based storage takes up slightly less storage space than the trie; the Trie is significantly
faster than the array in returning the search for a match.

Instructions:
To compile this program, type

javac *.java

With that done, run the program by typing java prefixMatcher <dictionary>, where the single
argument is the name of a dictionary file consiting of n words, one word per line. We have
included the file Dictionary.txt for this purpose. This dictionary file contains ~250k words
randomized from the linux file /usr/share/dict/words, and it may take up to a minute to load completely.

The simplest command line, then, is

java PrefixMatcher dictionary.txt