Pronounceable random string library? #488

Closed
christianlundkvist opened this issue Oct 31, 2015 · 11 comments

Comments

@christianlundkvist

commented Oct 31, 2015

Hi all! I wasn't able to find any reference to the library or algorithm that's used to create the pronounceable random strings like ~satnet-rinsyr-silsec-navhut--bacnec-todmeb-sarseb-pagmul for a 128-bit number. Is this an external library or something developed in-house?

Best,
Christian

@Xe

commented Oct 31, 2015

https://github.com/urbit/urbit/blob/master/urb/zod/arvo/hoon.hoon#L1100

names = "dozmarbinwansamlitsighidfidlissogdirwacsabwissibrigsoldopmodfoglidhopdardorlorhodfolrintogsilmirholpaslacrovlivdalsatlibtabhanticpidtorbolfosdotlosdilforpilramtirwintadbicdifrocwidbisdasmidloprilnardapmolsanlocnovsitnidtipsicropwitnatpanminritpodmottamtolsavposnapnopsomfinfonbanporworsipronnorbotwicsocwatdolmagpicdavbidbaltimtasmalligsivtagpadsaldivdactansidfabtarmonranniswolmispallasdismaprabtobrollatlonnodnavfignomnibpagsopralbilhaddocridmocpacravripfaltodtiltinhapmicfanpattaclabmogsimsonpinlomrictapfirhasbosbatpochactidhavsaplindibhosdabbitbarracparloddosbortochilmactomdigfilfasmithobharmighinradmashalraglagfadtopmophabnilnosmilfopfamdatnoldinhatnacrisfotribhocnimlarfitwalrapsarnalmoslandondanladdovrivbacpollaptalpitnambonrostonfodponsovnocsorlavmatmipfap"
endings = "zodnecbudwessevpersutletfulpensytdurwepserwylsunrypsyxdyrnuphebpeglupdepdysputlughecryttyvsydnexlunmeplutseppesdelsulpedtemledtulmetwenbynhexfebpyldulhetmevruttylwydtepbesdexsefwycburderneppurrysrebdennutsubpetrulsynregtydsupsemwynrecmegnetsecmulnymtevwebsummutnyxrextebfushepbenmuswyxsymselrucdecwexsyrwetdylmynmesdetbetbeltuxtugmyrpelsyptermebsetdutdegtexsurfeltudnuxruxrenwytnubmedlytdusnebrumtynseglyxpunresredfunrevrefmectedrusbexlebduxrynnumpyxrygryxfeptyrtustyclegnemfermertenlusnussyltecmexpubrymtucfyllepdebbermughuttunbylsudpemdevlurdefbusbeprunmelpexdytbyttyplevmylwedducfurfexnulluclennerlexrupnedlecrydlydfenwelnydhusrelrudneshesfetdesretdunlernyrsebhulrylludremlysfynwerrycsugnysnyllyndyndemluxfedsedbecmunlyrtesmudnytbyrsenwegfyrmurtelreptegpecnelnevfes"

def split_len(seq, length):
    return [seq[i:i+length] for i in range(0, len(seq), length)]

prefix = split_len(names, 3)
suffix = split_len(endings, 3)

def ipv4tourbit(ip):
    # Build a list so the octets can be indexed (map() is lazy in Python 3).
    ip = [int(x) for x in ip.split(".")]
    return "~%s%s-%s%s" % (prefix[ip[0]], suffix[ip[1]], prefix[ip[2]], suffix[ip[3]])

$ python -i names.py
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'endings', 'ipv4tourbit', 'names', 'prefix', 'split_len', 'suffix']
>>> ipv4tourbit("8.8.8.8")
'~fidful-fidful'
>>> ipv4tourbit("127.0.0.1")
'~palzod-doznec'
@christianlundkvist

Author

commented Oct 31, 2015

Cool, thanks! Would you mind if I use the names + algo in a project of mine?

@juped

Contributor

commented Oct 31, 2015

This is unique to urbit, though the Python above will probably be easier to understand than the Hoon if you're new. It's basically just base-256 numbers with some extra fanciness.

See also https://groups.google.com/d/msg/urbit-dev/yTsrEGe9gso/FgLczcg0ocQJ

Like the rest of Urbit, it's MIT-licensed. Note that your strings won't necessarily be compatible with ours unless you scramble the 17 through 32 bit range the same way we do, and the scrambling algorithm will be changed soon (at which point it will be frozen forever).
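To make the "base-256 numbers" description concrete, here is a minimal sketch that extends the IPv4 snippet above to any even-length byte string and adds the inverse lookup. It covers the unscrambled form only. The names `encode` and `decode` are made up for illustration, and grouping words in fours from the left with `--` is an assumption based on the 128-bit example in the opening post, not a confirmed Urbit rule.

```python
# Syllable tables copied from the snippet earlier in this thread
# (the uncensored lists), 16 syllables per line.
NAMES = (
    "dozmarbinwansamlitsighidfidlissogdirwacsabwissib"
    "rigsoldopmodfoglidhopdardorlorhodfolrintogsilmir"
    "holpaslacrovlivdalsatlibtabhanticpidtorbolfosdot"
    "losdilforpilramtirwintadbicdifrocwidbisdasmidlop"
    "rilnardapmolsanlocnovsitnidtipsicropwitnatpanmin"
    "ritpodmottamtolsavposnapnopsomfinfonbanporworsip"
    "ronnorbotwicsocwatdolmagpicdavbidbaltimtasmallig"
    "sivtagpadsaldivdactansidfabtarmonranniswolmispal"
    "lasdismaprabtobrollatlonnodnavfignomnibpagsopral"
    "bilhaddocridmocpacravripfaltodtiltinhapmicfanpat"
    "taclabmogsimsonpinlomrictapfirhasbosbatpochactid"
    "havsaplindibhosdabbitbarracparloddosbortochilmac"
    "tomdigfilfasmithobharmighinradmashalraglagfadtop"
    "mophabnilnosmilfopfamdatnoldinhatnacrisfotribhoc"
    "nimlarfitwalrapsarnalmoslandondanladdovrivbacpol"
    "laptalpitnambonrostonfodponsovnocsorlavmatmipfap"
)
ENDINGS = (
    "zodnecbudwessevpersutletfulpensytdurwepserwylsun"
    "rypsyxdyrnuphebpeglupdepdysputlughecryttyvsydnex"
    "lunmeplutseppesdelsulpedtemledtulmetwenbynhexfeb"
    "pyldulhetmevruttylwydtepbesdexsefwycburderneppur"
    "rysrebdennutsubpetrulsynregtydsupsemwynrecmegnet"
    "secmulnymtevwebsummutnyxrextebfushepbenmuswyxsym"
    "selrucdecwexsyrwetdylmynmesdetbetbeltuxtugmyrpel"
    "syptermebsetdutdegtexsurfeltudnuxruxrenwytnubmed"
    "lytdusnebrumtynseglyxpunresredfunrevrefmectedrus"
    "bexlebduxrynnumpyxrygryxfeptyrtustyclegnemfermer"
    "tenlusnussyltecmexpubrymtucfyllepdebbermughuttun"
    "bylsudpemdevlurdefbusbeprunmelpexdytbyttyplevmyl"
    "wedducfurfexnulluclennerlexrupnedlecrydlydfenwel"
    "nydhusrelrudneshesfetdesretdunlernyrsebhulryllud"
    "remlysfynwerrycsugnysnyllyndyndemluxfedsedbecmun"
    "lyrtesmudnytbyrsenwegfyrmurtelreptegpecnelnevfes"
)

# One three-letter syllable per byte value, 256 entries in each table.
PREFIX = [NAMES[i:i + 3] for i in range(0, len(NAMES), 3)]
SUFFIX = [ENDINGS[i:i + 3] for i in range(0, len(ENDINGS), 3)]


def encode(data: bytes) -> str:
    """Render an even-length byte string as unscrambled syllable words."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes (two per word)")
    # Each pair of bytes becomes one word: prefix syllable + suffix syllable.
    words = [PREFIX[data[i]] + SUFFIX[data[i + 1]] for i in range(0, len(data), 2)]
    # Assumption: "-" between words, "--" between groups of four words.
    groups = ["-".join(words[i:i + 4]) for i in range(0, len(words), 4)]
    return "~" + "--".join(groups)


def decode(text: str) -> bytes:
    """Invert encode() by looking each syllable up in its table."""
    words = text.lstrip("~").replace("--", "-").split("-")
    out = bytearray()
    for word in words:
        out.append(PREFIX.index(word[:3]))
        out.append(SUFFIX.index(word[3:]))
    return bytes(out)
```

With these tables, `encode(bytes([127, 0, 0, 1]))` gives `~palzod-doznec`, matching the IPv4 output earlier in the thread.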

@Xe

commented Nov 1, 2015

I have been planning to use these name patterns in another project of mine, which is why I wrote the Python above: to help translate the lists into JSON.

@christianlundkvist

Author

commented Nov 1, 2015

Thanks for the help @juped and @Xe! Thanks also for the Google Groups writeup, that made it clearer. Looks like @Xe's Python above is "Unscrambled @p" then? I'm asking because I was looking for a way to encode Ethereum addresses and other related identity hashes in an easier-to-remember form (right now they are just hex strings). I might use @Xe's Python to play around with for now and then check back later for the finalized version of the scrambling.

the Python above will probably be easier to understand than the Hoon if you're new

Hehe, this is definitely true 😁

@chc4

Contributor

commented Nov 1, 2015

Since you are just encoding hashes, you shouldn't have to worry about scrambling. We only do it because ship @p addresses leak hierarchy information.

You might want to substitute 'dor' for 'por' and 'fyp' for 'fap' in @Xe's snippet, though, to avoid having your own ~porned-fapped problem :p

@juped

Contributor

commented Nov 1, 2015

Indeed, there is also phoneme censorship coming soon!

If they're more than 64 bits, the thing with scrambling won't even cross your radar. Even if not, it's only necessary if your numbers need to be compatible with ours. (Which isn't impossible, but I wouldn't worry too much.) I find that four "words", or 64 bits, is about the right size for memorability, though.

@Xe

commented Nov 1, 2015

This might also make for a good password generator.
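A generator along those lines might look like the sketch below. It is an illustration under stated assumptions, not anyone's actual tool: `random_phrase` is a hypothetical name, the tables are the uncensored lists from the snippet above, and the default of four words (64 bits) follows the memorability remark earlier in the thread. Randomness comes from Python's `secrets` module.

```python
import secrets

# Syllable tables copied from the snippet earlier in this thread
# (the uncensored lists), 16 syllables per line.
NAMES = (
    "dozmarbinwansamlitsighidfidlissogdirwacsabwissib"
    "rigsoldopmodfoglidhopdardorlorhodfolrintogsilmir"
    "holpaslacrovlivdalsatlibtabhanticpidtorbolfosdot"
    "losdilforpilramtirwintadbicdifrocwidbisdasmidlop"
    "rilnardapmolsanlocnovsitnidtipsicropwitnatpanmin"
    "ritpodmottamtolsavposnapnopsomfinfonbanporworsip"
    "ronnorbotwicsocwatdolmagpicdavbidbaltimtasmallig"
    "sivtagpadsaldivdactansidfabtarmonranniswolmispal"
    "lasdismaprabtobrollatlonnodnavfignomnibpagsopral"
    "bilhaddocridmocpacravripfaltodtiltinhapmicfanpat"
    "taclabmogsimsonpinlomrictapfirhasbosbatpochactid"
    "havsaplindibhosdabbitbarracparloddosbortochilmac"
    "tomdigfilfasmithobharmighinradmashalraglagfadtop"
    "mophabnilnosmilfopfamdatnoldinhatnacrisfotribhoc"
    "nimlarfitwalrapsarnalmoslandondanladdovrivbacpol"
    "laptalpitnambonrostonfodponsovnocsorlavmatmipfap"
)
ENDINGS = (
    "zodnecbudwessevpersutletfulpensytdurwepserwylsun"
    "rypsyxdyrnuphebpeglupdepdysputlughecryttyvsydnex"
    "lunmeplutseppesdelsulpedtemledtulmetwenbynhexfeb"
    "pyldulhetmevruttylwydtepbesdexsefwycburderneppur"
    "rysrebdennutsubpetrulsynregtydsupsemwynrecmegnet"
    "secmulnymtevwebsummutnyxrextebfushepbenmuswyxsym"
    "selrucdecwexsyrwetdylmynmesdetbetbeltuxtugmyrpel"
    "syptermebsetdutdegtexsurfeltudnuxruxrenwytnubmed"
    "lytdusnebrumtynseglyxpunresredfunrevrefmectedrus"
    "bexlebduxrynnumpyxrygryxfeptyrtustyclegnemfermer"
    "tenlusnussyltecmexpubrymtucfyllepdebbermughuttun"
    "bylsudpemdevlurdefbusbeprunmelpexdytbyttyplevmyl"
    "wedducfurfexnulluclennerlexrupnedlecrydlydfenwel"
    "nydhusrelrudneshesfetdesretdunlernyrsebhulryllud"
    "remlysfynwerrycsugnysnyllyndyndemluxfedsedbecmun"
    "lyrtesmudnytbyrsenwegfyrmurtelreptegpecnelnevfes"
)

PREFIX = [NAMES[i:i + 3] for i in range(0, len(NAMES), 3)]
SUFFIX = [ENDINGS[i:i + 3] for i in range(0, len(ENDINGS), 3)]


def random_phrase(n_words: int = 4) -> str:
    """Draw 16 random bits per word and render them as syllable words."""
    raw = secrets.token_bytes(2 * n_words)  # two bytes per word
    words = [PREFIX[raw[i]] + SUFFIX[raw[i + 1]] for i in range(0, len(raw), 2)]
    return "-".join(words)


if __name__ == "__main__":
    print(random_phrase())   # four words, 64 bits of entropy
    print(random_phrase(8))  # eight words, 128 bits of entropy
```

Each word carries 16 bits, so the phrase strength is simply 16 times the word count.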

@christianlundkvist

Author

commented Nov 1, 2015

Great points, thanks! Yes, I'll make sure to use the "clean" version of the algorithm so I don't have to put up the "Parental Advisory: Explicit phonemes" disclaimer, haha 😄

Yeah, scrambling doesn't seem to be necessary here (the hashes are normally 16-32 bytes in length), but I'm thinking about perhaps adding an extra word as a checksum to protect against typos (or permutations) in case someone wants to type the words in manually on an air-gapped system or similar. These hashes are often the recipient addresses of cryptocurrency payments, so typos can mean 💸 .
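One way the checksum word could work, sketched at the byte level so it is independent of the syllable tables: since one word encodes 16 bits, append two bytes derived from the payload before encoding. Using a truncated SHA-256 digest is an assumption made here for illustration (the thread does not pick a scheme), and `add_checksum` and `verify` are hypothetical names.

```python
import hashlib

WORD_BYTES = 2  # one syllable word encodes 16 bits


def add_checksum(payload: bytes) -> bytes:
    """Append one word's worth of the payload's SHA-256 digest."""
    return payload + hashlib.sha256(payload).digest()[:WORD_BYTES]


def verify(tagged: bytes) -> bool:
    """Recompute the checksum over the payload and compare with the last word."""
    payload, check = tagged[:-WORD_BYTES], tagged[-WORD_BYTES:]
    return hashlib.sha256(payload).digest()[:WORD_BYTES] == check


if __name__ == "__main__":
    address = bytes(range(20))               # stand-in for a 20-byte hash
    tagged = add_checksum(address)
    print(verify(tagged))                    # checksum matches the payload
    swapped = tagged[2:4] + tagged[0:2] + tagged[4:]
    print(verify(swapped))                   # first two words transposed
```

Because the digest depends on byte order, a transposed pair of words changes the expected checksum just as a mistyped word does. A 16-bit check still lets a random error through about once in 65,536 cases, so it guards against slips rather than deliberate tampering.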

This might also make for a good password generator.

This is a very good point. I'm wondering if these short nonsense words are more memorable than normal words chosen randomly (Diceware-style). In cryptocurrency/blockchain land it's common to encode, say, a 128-bit seed hash using 12 words from a chosen wordlist. It could be that the 8 nonsense words above are easier to remember than the 12 "normal words".
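The 8-versus-12 comparison follows from how many bits each word carries. The arithmetic can be sketched as below; the 2048-word list size is an assumption (the thread does not name the wordlist), 7776 is the standard Diceware list size, and `words_needed` is a hypothetical helper.

```python
import math


def words_needed(bits: int, wordlist_size: int) -> int:
    """Smallest number of words whose combined entropy covers `bits`."""
    bits_per_word = math.log2(wordlist_size)
    return math.ceil(bits / bits_per_word)


if __name__ == "__main__":
    # Prefix + suffix gives 256 * 256 possible words, i.e. 16 bits per word.
    print(words_needed(128, 256 * 256))
    # Assumed 2048-word list: 11 bits per word.
    print(words_needed(128, 2048))
    # Diceware list of 6**5 = 7776 words: about 12.9 bits per word.
    print(words_needed(128, 7776))
```

So the syllable words are shorter in count because each one packs more bits than a dictionary word; whether they are easier to remember is a separate question the arithmetic cannot answer.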

@oblivia-simplex

commented Nov 1, 2015

I've already stolen the idea as my own, personal password generator:

https://bpaste.net/show/91409ffd187b

@juped juped added the question label Nov 2, 2015

@galenwp

Contributor

commented Mar 18, 2016

This is a great thread, but I think we can close it.

@galenwp galenwp closed this Feb 8, 2017
