Instructions
Performance
Implementation Details
Screenshots
Run directly using jar:
- Git clone the repository.
- mvn clean install - to generate target resources.
- run using : java -jar target/Follow-The-White-Rabbit-0.0.1-SNAPSHOT.jar.
- enter the required inputs Eg . difficulty level , min word length , max words per anangram.
Run the Docker file:
Docker Image is attached , use the following commands to build and run image
docker build -t {dockerimagename}
docker run -i -t {dockerimagename}
Easy Key - 1 second , Difficult key - 38 seconds , Hard Key - about 9 minutes ( screenshots attached below )
Solution can be divided in 3 parts / 3 levels of optimization.
- Building a Minified Dictionary ( optimizing by discarding all the impossible words )
-
At this level , I've assumed that the following words are useless
- words whose length > length of the given phrase
- words whose length > min_word_length entered by the user
- words which contain characters other than those present in the given phrase
- words which contain characters that occur more than the number of times they've occurred in given phrase
-
with the above assumption we can easily narrow down to a few thousands of words.
For example if min_word_length is 5 , we can narrow down to about 998 words vs 99175 words. -
I've used Trie ( with hashmap ) to implement my dictionary. Ideally I would use DAWG ( directed acyclic word graphs ) if space is a concern because DAWGs reduce the number of nodes. Since we are aiming for lookup performance , I stuck to a trie , as it is easy to implement )
- Creating a service that generates possible anagrams using the Minified Dictionary ( optimizing by using the dictionary to generate phrases )
- At this level , instead of blindly going through all permutations using the letters of the phrase , we just use the minified dictionary to generate possible phrases.
- This approach improves the performance drastically
For example if min_word_length is 5 (998 words) and number of anangrams per phrase = 3 , we generate atmost 998 * 997 * 996 phrases vs 16! ( 16 is the no of letters in 'poultry outwits ants')
- Parallel processing using one thread for each letter in the phrase.
- The key is to return from all other threads if any one of them is successful at finding the key.
- Performance again depends on host machine's specifications. Mine was 4-core , 4GB RAM and it took about 9 minutes to find the hard key.