Skip to content

sidmontu/anagram-prime

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 

Repository files navigation

ANAGRAM CHECKER

Inspired from the Reddit thread: https://www.reddit.com/r/math/comments/6hb0xk/clever_algorithm_to_determine_whether_or_not_two/

IDEA

Proposed Algorithm to find if two input words are anagrams: Map each of the 26 english characters to a unique prime number. Then calculate the product of the string. Two strings are anagrams if and only if their products are the same.

Current best ways of doing the above, in order of popularity/performance:

  1. Count frequency of each letter in each word, and then simply compare. If the strings are anagrams, the frequency bins will be identical.
  2. Sort letters in each word. The resultant strings of each word will be exact. [NOT Implemented in this project]

The proposed algorithm is arguably slower, and also susceptible to integer overflow. Will require 128-bit integer arithmetic at least to guarantee all words can be supported Source.

But the following optimization might help practical use-case: Map primes efficiently to each letter by frequency of occurrence in the english language.. example, 'e' is the most frequently found alphabet in text, so map 'e' to the smallest prime number, i.e. 'e' = 2.

RUNNING/COMPILING

You need PAPI library installed for performance profiling. The Makefile is very hacky, so you might have to edit that too, depending on your unix environment. Tested on Ubuntu 16.04 with PAPI 5.5.1.

Type make, then run ./anagram_checker <word1> <word2> e.g. ./anagram_checker hello olleh

PRELIMINARY RESULTS

$> ./anagram_checker hello olleh

[FREQ] 0.035254 microseconds/evaluation

[PRIME] 0.008961 microseconds/evaluation

[PRIME-ORD] 0.009217 microseconds/evaluation

hello and olleh are anagrams


$> ./anagram_checker hydroxydeoxycorticosterones hydroxydesoxycorticosterone

[FREQ] 0.062869 microseconds/evaluation

[PRIME] 0.048360 microseconds/evaluation

[PRIME-ORD] 0.048361 microseconds/evaluation

hydroxydeoxycorticosterones and hydroxydesoxycorticosterone are anagrams

TO-DO

  1. Some sort of dictionary run to profile performance with increasing/decreasing word lengths?
  2. Sorting algorithm described above.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published