avoid using PRNG? #21
I would not consider adding this to the site. Using a long string is essentially a brainwallet, and we all know how many problems there have been with that. Thanks for the input and the reply on reddit. |
This came up again on reddit: https://www.reddit.com/r/Bitcoin/comments/5abf1f/mentor_monday_october_31_2016_ask_all_your/d9fbjq8/ Again I referred the poster here. Using a long string isn't essentially a brain wallet unless you attempt to memorize the long string, which I don't think these people are intending to do. The use case is the following: "I want to generate a secure 12 word phrase. I don't trust the pseudo-random number generator in Android, or in Chrome, so I'm going to roll a 6-sided die 100 times and type the rolls in to make a long string. I want to generate my 12 word seed from that 100 digit number, then throw the number away. Then I know my 12 word seed is properly random and not limited to the range produced by a potentially broken PRNG." I personally did exactly that to generate my 24 word seed, except that I used playing cards rather than dice to generate my "long string". I believe the resulting seed is safer than if I had generated it in a web browser, since I trust the randomness of my shuffling more than I trust whatever Google Chrome uses. Anyway, I just thought I'd comment again since you seem open to suggestions and may have simply misunderstood the use case the first time I brought this up. I'm not suggesting that people should use it as a "brain wallet". You could require the "long string" to be at least 100 characters or something to make sure it's not something they are going to try to remember. That allows for 100 dice rolls or 52 two-character playing cards. |
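For readers following along, here is a minimal Python sketch of the dice use case described above: a string of rolls is hashed with sha256 to give 256 bits of seed material. The function names are made up for illustration, this is not the site's code, and the conversion to mnemonic words is left out because it needs the BIP39 word list.

```python
import hashlib
import math


def entropy_from_rolls(rolls: str) -> bytes:
    """Hash a string of six-sided die rolls into 32 bytes of seed material."""
    if not rolls or any(ch not in "123456" for ch in rolls):
        raise ValueError("rolls must be a non-empty string of digits 1-6")
    return hashlib.sha256(rolls.encode("ascii")).digest()


def max_entropy_bits(num_rolls: int, sides: int = 6) -> float:
    """Upper bound on the entropy, assuming fair and independent rolls."""
    return num_rolls * math.log2(sides)


if __name__ == "__main__":
    # Placeholder input only; real rolls must come from physical dice.
    rolls = "3516242135" * 10
    print(entropy_from_rolls(rolls).hex())
    print(round(max_entropy_bits(len(rolls)), 1))  # 258.5
```

The hash always returns 256 bits, but the result is only as unpredictable as the rolls that went in. 100 fair rolls carry about 258.5 bits, which is why 100 is a comfortable number for a 24 word seed.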
Since this has been requested multiple times, I will add it. I think combined with zxcvbn or some other measure of strength it should allow users to appreciate if they're shooting themselves in the foot. Using .toMnemonic() as per your implementation should work well. Is sha256 of the entropy standard? For example, bip32jp uses the raw entropy. I would prefer to do that rather than a 'magical' step of taking the sha256 of the phrase. Using a sha256 means all entropy will be 12 words, perhaps giving the wrong impression of how strong the phrase is with low entropy. What are your thoughts? Thanks for persisting with this and helping make this tool great. |
I think each set of 3 words gives 32 bits of entropy (and 1 bit of checksum). So 12 words is 128 bits. Wouldn't that mean that sha256 (256 bits) gives enough entropy to generate 24 words (assuming the input to the hash has enough entropy)? Edit: another redditor looking for a secure way of generating offline keys: https://www.reddit.com/r/Bitcoin/comments/5am3vn/what_is_more_secure_bitaddress_bulk_wallet_or/ -- I pointed him your way. |
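The arithmetic in that comment can be checked with a few lines of Python. This is a sketch of the standard BIP39 length rule (one checksum bit per 32 bits of entropy, 11 bits per word); the function name is hypothetical.

```python
def words_for_entropy(entropy_bits: int) -> int:
    """Number of BIP39 words for an entropy length that is a multiple of 32 bits."""
    if entropy_bits <= 0 or entropy_bits % 32 != 0:
        raise ValueError("entropy length must be a positive multiple of 32 bits")
    checksum_bits = entropy_bits // 32  # 1 checksum bit per 32 bits of entropy
    return (entropy_bits + checksum_bits) // 11  # each word encodes 11 bits


if __name__ == "__main__":
    for bits in (128, 160, 192, 224, 256):
        print(bits, "bits ->", words_for_entropy(bits), "words")
```

So 128 bits gives 12 words and a full sha256 output of 256 bits gives 24 words, as suggested above.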
This feature is now live. See c6624d5. It's mostly compatible with bip32jp entropy; see the test for compatibility and caveats. There are a lot of tests specifying how this feature works; see tests.js lines 1963-2568. Thanks for the suggestions and persistence with pointing out feature requests from users. |
That's great, thanks. Now I can point people at your tool and not have to worry about the entropy generation. A few initial thoughts for what they're worth:
|
Great insights.
So, changes resulting from this are:
|
Why do you want this 1:1 mapping? Nobody ever wants to convert their mnemonic back to the entropy that generated it.
Take a string with 50 bits of entropy. Put it through sha256. Use the result to generate a mnemonic. So long as the mnemonic is 6 words or longer it has 50 bits of entropy, no?
I don't know that it is. I don't care that the 24 word mnemonic that I generated from a shuffled deck of cards doesn't have the full 256 bits of entropy. It has enough bits for my purposes, and that's all I care about. The (hypothetical) wallet I was importing it into only accepted 12 or 24 word mnemonics, and 12 wasn't enough to capture all the entropy in the shuffled deck. |
No. A mnemonic generated from the sha256 of a single bit does not contain 50 bits of entropy. So even though a mnemonic with six words or more should imply at least 50 bits of entropy, it may not. This ambiguity can be removed if raw entropy is used, and not hashed entropy. Users should be able to trust the relationship between entropy length, mnemonic length and mnemonic strength. Using hashed entropy breaks that trust. From the BIP39 Motivation: "It's not a way to process user-created sentences (also known as brainwallets) into a wallet seed." Is it fair to replace 'user-created sentences' with 'weak entropy'? As always I am open to suggestions here but I am very wary of introducing weak entropy. |
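The single-bit case is easy to demonstrate. The Python sketch below (helper name made up for illustration) hashes every possible input of a one-bit source: each digest is 256 bits long, but an attacker who knows how the input was produced only has two candidates to try.

```python
import hashlib
import math


def candidate_digests(possible_inputs):
    """Set of sha256 digests an attacker must try if they know the input space."""
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in possible_inputs}


if __name__ == "__main__":
    coin_flip = candidate_digests(["0", "1"])
    digest_bits = len(next(iter(coin_flip))) * 4  # 64 hex characters = 256 bits
    search_bits = math.log2(len(coin_flip))       # only 1 bit of real entropy
    print(digest_bits, search_bits)
```

Hashing changes the length of the output, not the size of the space an attacker has to search.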
I was referring to the mnemonic generated by this recipe:
So long as that specific mnemonic is 6 words or longer it has 50 bits of entropy, no?
Some user-created sentences have weak entropy. Some have strong entropy. They are generally weaker than the creator would guess though. Allowing me to create a wallet from 100 sixes is worse than letting me create a wallet from a long enough nonsense sentence (see 'correct horse battery staple'). The important thing is to estimate the amount of entropy in the input and disallow the use of weak entropy that way. |
Your points have caused me to realize the mistake in my logic. I have been thinking mnemonic strength is directly related to the entropy length and thus the number of words. But the number of words only sets an upper bound on the strength; the mnemonic may be weaker. You can't know just from the number of words in the mnemonic how strong it is. Assuming a 12 word mnemonic always has 128 bits of underlying entropy can be dangerous. So I will also add to the todo list
Thanks for working through it with me. I'll be working on these changes over the next few days. |
Thanks for taking my feedback as it was intended rather than getting offended by it. My goal is only to have your tool be as great as it can be. Thanks also for all your hard work on this very useful project. |
That's looking pretty good now. For what it's worth:
That's all for now. When I can only find 2 things to complain about you're doing it right! ;) Edit:
|
Good suggestions, these changes have been made and are live. I also changed the way entropy is truncated, using modulo instead of truncating from the right. This was the result of conversations with /u/kinoshitajona and improves compatibility with bip32jp.github.io. Changes made:
|
That's better, thanks. I notice when I enter a shuffled deck of 52 cards it tells me I have 296 bits of entropy, but I calculate it as 225.5 bits:
I guess you are assuming I am picking 52 cards out of the deck with replacement, but I'm not. The first card is 1 in 52, but the 2nd is only 1 in 51, etc. I'm thinking it would be useful if your tool could detect typos in the deck. I expect to type each card exactly once. If I typo one of the cards it would be easy to detect, because you'd end up with a duplicate. Maybe tag "(full deck)" on the end of the "filtered entropy" report if it's a full deck, or "duplicate 2D" or "missing 6S" if there's a duplicate or missing card. Not that it's necessarily an error to have a missing or duplicate card, but for people using a full shuffled deck it is, and having the tool check these things is quite an effective safeguard against typos. It wouldn't detect the transposition of two cards of course, but that's unlikely to happen when typing a deck in. In general you have the same problem as my "6666666" example earlier. Your tool tells me that "6666666" has 18 bits of entropy when it doesn't really. Without knowing how I generated those 7 digits you really can't say how much entropy they have (though you can guess and be pretty sure that I just hit the same key seven times). |
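Both points in that comment can be sketched in Python. The first two functions compare the entropy of 52 draws with replacement against a real shuffle (log2 of 52 factorial). The last one is a hypothetical typo check of the kind suggested; it assumes cards are typed as rank then suit, separated by spaces, with T for ten, which is my assumption and not necessarily the tool's input format.

```python
import math
from collections import Counter

RANKS = "A23456789TJQK"  # assumed notation: T stands for ten
SUITS = "CDHS"
FULL_DECK = [rank + suit for suit in SUITS for rank in RANKS]


def with_replacement_bits(num_cards: int = 52) -> float:
    """Entropy if every card were drawn independently from a fresh deck."""
    return num_cards * math.log2(52)


def shuffled_deck_bits(num_cards: int = 52) -> float:
    """Entropy of a uniformly shuffled deck: log2(52 * 51 * ... * 1)."""
    return math.log2(math.factorial(num_cards))


def check_deck(text: str):
    """Return (duplicates, missing, unknown) for a space-separated card list."""
    counts = Counter(text.upper().split())
    duplicates = sorted(c for c, n in counts.items() if n > 1 and c in FULL_DECK)
    missing = [c for c in FULL_DECK if c not in counts]
    unknown = sorted(c for c in counts if c not in FULL_DECK)
    return duplicates, missing, unknown


if __name__ == "__main__":
    print(round(with_replacement_bits(), 1))  # about 296.4
    print(round(shuffled_deck_bits(), 1))     # about 225.6
    typo = " ".join(FULL_DECK).replace("6S", "2D")
    print(check_deck(typo))  # (['2D'], ['6S'], [])
```

A full deck with no typos returns three empty lists, which is the case that could be reported as "(full deck)".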
Great suggestions, I'm enjoying the subtlety of this entropy issue.
zxcvbn has limits, for example a perfectly sorted deck isn't detected as weak. Nor is |
I agree. It's a never-ending job to try to detect all 'obvious' patterns. Better not to start down that road.
As an aside, I have been using the following one-line shell script to generate shuffled decks for testing:
Obviously for generating proper entropy I'd use a physical deck, but I find that command useful for testing your new code. |
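The shell one-liner itself is not reproduced here. As a stand-in, a rough Python equivalent for producing test decks might look like the sketch below. It is not the command referred to above, the card notation (rank then suit, T for ten) is an assumption, and Python's `random` module is fine for test data only, never for real entropy.

```python
import random

RANKS = "A23456789TJQK"  # assumed notation: T stands for ten
SUITS = "CDHS"


def shuffled_test_deck(seed=None) -> str:
    """Return a space-separated shuffled deck. For testing only, not for real seeds."""
    deck = [rank + suit for suit in SUITS for rank in RANKS]
    random.Random(seed).shuffle(deck)  # a fixed seed gives a reproducible test deck
    return " ".join(deck)


if __name__ == "__main__":
    print(shuffled_test_deck())
```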
I just missed the 'comment' button and hit 'close and comment' by mistake. I'll reopen it now. Sorry! |
Actually it's worse than that. By default it is generating a 27 word mnemonic from a full deck. But 226 bits should only give 21 words. Looks like it's using the full 52*5.7 = 296 bits to decide how many words to create. |
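The 27 versus 21 word discrepancy follows from simple arithmetic. The Python sketch below assumes the word count is derived by rounding the entropy estimate down to a whole number of 32-bit blocks, at three words per block; that matches the numbers quoted in this thread but is an assumption about the tool's internals, and the function name is made up.

```python
def mnemonic_words(entropy_bits: float) -> int:
    """Words available from an entropy estimate: 3 words per full 32 bits."""
    return int(entropy_bits // 32) * 3


if __name__ == "__main__":
    print(mnemonic_words(296))  # 27: 52 cards counted as if drawn with replacement
    print(mnemonic_words(226))  # 21: a real shuffled deck
```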
I should have done it 'right' from the start by putting it in the entropy library rather than screw around mutating it in the ui logic...
And the correction to converting cards to binary is in
I'm not sure about the calculation of the binary value for cards. It scales the 296 bit value down to 226 bits - source code and illustration:
|
To be continued using separate issues as they arise. See #33. |
I recently saw a reddit post in which the author was expressing concern about entropy collection in your bip39 code.
I replied here https://www.reddit.com/r/Bitcoin/comments/4fomsb/how_secure_are_the_hd_wallets_generated_by_bip39/d2bid2j with a modified version of your site in which the user can enter a long string which is hashed to provide the 'randomness' rather than using the PRNG. This allows the user to generate an HD wallet by rolling dice, flipping coins, or just hammering on the keyboard for ten minutes.
Is this something that you would consider adding to the site?
I can see it could be misused. If you use a short memorable phrase then it is just as insecure as a traditional single key 'brain wallet' - but for people who know what they are doing it allows them to use true physical randomness, avoiding any worry that their PRNG is somehow weak.
My change is here: dooglus@bed8b774