Put words in README.

divegeek · May 14, 2012 · 70c5947 · 70c5947
1 parent 857e784
commit 70c5947
Showing 1 changed file with 69 additions and 0 deletions.
diff --git a/README b/README
@@ -0,0 +1,69 @@
+First, credit where credit is due: This little project was sparked by
+the XKCD commit on password security (http://xkcd.com/936/).
+
+The comic is about the amount of entropy contained in a passphrase of
+a few short, common, memorabl words.  In the comic, the idea is to use
+these words in place of a shorter, but more complicated and less
+memorable password.  It occurred to me, though, that I manage lots of
+long numbers, and that it's hard to do.
+
+At work, there are bug ID numbers, changelist numbers, user ID
+numbers, etc.  These are all just long enough that they're not only
+not memorizable, they don't even stick in short-term memory well
+enough that I can glance at one, switch to a different tab or window,
+and retype it.  So I copy and paste a lot, which is fine, but
+sometimes the numbers are (helpfully!) hyperlinks, which makes
+selecting them for copying without clicking on them hard.
+
+In the rest of my life, there are lots of phone numbers.  In practice
+I keep them all in my phone so I never have to know what they
+are... but if I don't have my phone, I'm sunk, outside of a very small
+group of numbers that I keep memorized.
+
+So, what if... we actually defined a mapping from a dictionary of
+common words onto numbers, and then used that to build a nice number
+to word-sequence translation tool?  If the translation were built into
+all the appropriate places, we could just use either numbers or words,
+whichever is convenient.
+
+For example, where I work we have a web interface to the change
+management system.  If that web interface displayed a sequence of
+three simple words next to each change number, it would be really easy
+to shout a "number" over the cubicle wall and have it understood and
+easily typed.  In experimenting with it, I even found that although
+it's more keystrokes, I can type a sequence of simple words faster
+than a number.
+
+Then I started playing with phone numbers... and I found that I can
+map phone numbers onto a three-word sequence, making them really easy
+to remember.  For example, my old home phone number translates as
+"calm restore utterly".  I find that far easier to remember than
+801-479-0406 (note that I don't know if someone has that number; don't
+call and bother them, please, you won't get me!).  Of course, if you
+live in Utah you don't really have to "remember" 801 -- because nearly
+all of the numbers are 801... but the same holds true here.  Pretty
+much all 801 numbers translate to strings with "calm" as the first
+word.  So someone would really only have to remember "restore
+utterly" to know my number.
+
+A mobile phone interface using this could use the standard type-ahead
+features, but restricted to the known dictionary, so you could
+probably type my number with "ca re ut", tapping each full word when
+it pops up.
+
+So, this is a Java implementation of the concept, complete with two
+dictionaries, "large" and "small".  Large has 4096 words so each word
+represents 12 bits.  This allows all US phone numbers to be
+represented with three words, but to get 4096 common words, I had to
+reach into slightly longer (up to 8 letters) and less well-known
+words.
+
+The small dictionary contains 1626 words.  This number was chosen so
+that three words can represent the full range of a 32-bit integer.
+Some phone numbers require four words with this, but if you're working
+with slightly smaller numbers, it's great because the words are
+shorter and more common.
+
+In both cases I tried to weed out homonyms and words which are often
+hard to spell.  I also excluded offensive words.  I'm sure more can be
+done to improve these dictionaries, but they're pretty usable.