-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make the EFF word list the default? #33
Comments
|
The other lists were included before the EFF list. The current default, But in the long run the EFF list would certainly make a better default. I will talk to the securedrop people if they can live with it. In fact I just waited for someone else to ask for EFF as default, so, thanks for this! |
|
Oh hey @ulif. I'm get guy who made that list for SecureDrop. SD moved away from the list I made to the EFF's list with this pr: freedomofpress/securedrop#1452. On those grounds (among others), I'd say it's probably worth it to use the EFF's list over the one I made. |
|
Hey @heartsucker , great to hear from you! Thank you for the feedback! If you can live with it, I would say we switch to the EFF list as default as well. FWIW I started another little project to check existing wordlists for diceware related flaws. Yet it is not really usable: https://github.com/ulif/diceware-list This Python project should generate two CLI scripts: a wordlist creator ( |
|
@ulif Yeah, by all means. Make the switch. |
|
Ooh, just spotted: the EFF list is len 7776, not 8192. Nice for dice, bad for computers. That changes the game. IIRC there are no 2^n list available from EFF. Hm, have to reconsider. |
|
Well either way, don't let any indebtedness influence your decision. :) |
|
Thank you very much, @heartsucker , well, I feel indebted to you anyway. |
|
hmm... this seems like a math problem we should be able to solve now... ie. how to map a X bit entropy source to a Y bit space... why can't we just mod(7776) the entropy source? |
|
@anarcat I think just using mod would change the distribution (because the lower numbers would get the ones above 7776 also, changing the probabilities, right?) @ulif You could make the EFF list default when using real dice, and keep the previous one for urandom. That would barely increase complexity and be much safer, right? |
|
Just an FYI, there is less than 0.5 bits of entropy difference between a list of size 7776 and 8192 at 6 words and less than 1 at 8 words. While @ulif is right that 8192 is better, it is only just barely better for human use. The actual SecureDrop list is even shorter that 7776 because we removed "scary" words from the list in this PR: freedomofpress/securedrop#1546. Even this was considered to be ok since the EFF list was considered more memorizable and still secure. |
well, when i mean modulo, i really mean to discard the reminder of a modulo division, or whatever... of course if you shuffle those discarded numbers back at the top you screw up the distribution, but if you discard the upper (say) 9 bits, you end up with a subset of the list that is 7680 words long and should be uniform. but i'm not a certified cryptographer either (and certainly not a stats geek :p) so take that with a grain of salt. |
|
@anarcat, I think @clawoflight is right. Oh, and you are right too, I think, if you discard bits. In fact that means you should simply discard values from random generators which are out of your scope (here: 1..7776). As random generators will normally give you 1 or 2 or N bits, you can create numbers in the scope 0..2^N-1. But I think you cannot "map" or "move" "unused" bits to the next number, "reuse" them or similar (without making the result "unfair" or less equally distributed). I must confess, @anarcat, I did not get how you got the number 7680 (but I am neither a certified cryptographer). The current implementation, however, copes with shorter wordlists already. So, you can already use the EFF list with "system" random. I should only check, whether this is done correctly (i.e. no mod(), instead discard out-of-scope values). Also @heartsucker is right: the entropy difference is not too much, about 0.0752 bits per word. It is just: I simply hate wasting entropy ;-) Thank you very much @anarcat, @clawoflight, and @heartsucker ! That was really helpful :D I actually tend to make EFF the default list and will close this issue with a respective commit later today. |
|
here are my numbers:
Stripping 8 bits still gives us too many combinations. Stripping 9 bits, not enough. But we just need to reduce the length of the list to Hopefully someone will confirm my math :) |
|
In the source code here, the call is delegated to |
|
right, that was my impression as well - i would also assume this is secure, but i'd be curious to see if the distribution is actually uniform... |
|
The docs here say that |
|
The question is not whether |
|
I think it's safe to say that the devs of the python core library have ensured that |
|
well, here's the source in any case... random.choice calls random._randbelow and, if i read this right, this ends up calling SystemRandom.getrandbits() which does the right thing: def getrandbits(self, k):
"""getrandbits(k) -> x. Generates an int with k random bits."""
if k <= 0:
raise ValueError('number of bits must be greater than zero')
if k != int(k):
raise TypeError('number of bits should be an integer')
numbytes = (k + 7) // 8 # bits / 8 and rounded up
x = int.from_bytes(_urandom(numbytes), 'big')
return x >> (numbytes * 8 - k) # trim excess bitsI'm not absolutely certain def _randbelow(self, n, int=int, maxsize=1<<BPF, type=type,
Method=_MethodType, BuiltinMethod=_BuiltinMethodType):
"Return a random int in the range [0,n). Raises ValueError if n==0."
random = self.random
getrandbits = self.getrandbits
# Only call self.getrandbits if the original random() builtin method
# has not been overridden or if a new getrandbits() was supplied.
if type(random) is BuiltinMethod or type(getrandbits) is Method:
k = n.bit_length() # don't use (n-1) here because n can be 1
r = getrandbits(k) # 0 <= r < 2**k
while r >= n:
r = getrandbits(k)
return ris that retry loop ( then i'll let you wonder if this 3.6 implementation was the same in Python 2.7 (or earlier!!). ;) |
|
Wow, this is escalating... in a nice way :) Overall it looks like Thank you, @anarcat for looking into the sources! One should really do that from time to time. The py2.7 sources BTW, have a different def choice(self, seq):
"""Choose a random element from a non-empty sequence."""
return seq[int(self.random() * len(seq))] # raises IndexError if seq is emptyHere At least I could not find critical flaws when generating respective numbers for some time tonight with the py27 Overall, I am glad they look okay for our purposes. Yes, @heartsucker, rolling own crypto would be really dangerous. Thank you @heartsucker, @clawoflight, and @anarcat for your valuable input! I will make the EFF list now the default. |
|
i wouldn't bet my life on "self.random()", it sure looks like it's taking stuff from a (python) internal RNG, and not from the system 'getrandom' CRNG.... but i have invested too much time in this already... i do strongly encourage you to test |
|
Thanks @anarcat for your time! I learned new tricks, we all got a new default wordlist and I looked into the C-code of Python random implementations for the first time :) I was wrong with the C implementation of def random(self):
"""Get the next random number in the range [0.0, 1.0)."""
return (long(_hexlify(_urandom(7)), 16) >> 3) * RECIP_BPFThis one apparently gets random bits from |
at first glance, it sure looks like the EFF word list is superior to the other alternatives. is there any reason why it's not the default list?
The text was updated successfully, but these errors were encountered: