Wishlist for BIP32/39/43/44/49 and SLIP44 replacement #103

Closed
Sjors opened this issue Aug 27, 2017 · 94 comments

Comments

@Sjors

Sjors commented Aug 27, 2017

It would be useful to have a list of requirements for a future replacement of the standards around hierarchical deterministic wallets and other uses of deriving keys from a mnemonic. Let me know if this is not the right place.

In order to turn it into a BIP / SLIP it needs more feedback and I'll need to make it a bit less opinionated :-)

A good place to start is the mnemonic word list. @Arachnid suggested several improvements a while ago (see below). Even though an updated word list will have overlap with the BIP39 list, new mnemonics can be generated in such a way to guarantee incompatibility with existing wallets. This intentional incompatibility provides an opportunity to change other rules.

I do not believe this is urgent, so there's time to do this thoroughly and develop a standard that lasts for a while.

Word list and incompatibility rule

Criteria @Arachnid used in his word list generator draft:

  • all words are 4-8 characters
  • all 4-character prefixes are unique (very useful for hardware wallets)
  • no two words have edit distance < 2
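The three criteria above can be checked mechanically. The following is a minimal sketch, not @Arachnid's actual generator; the function names `edit_distance` and `check_wordlist` are made up for illustration, and Levenshtein distance is assumed as the meaning of "edit distance".

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def check_wordlist(words):
    """Return a list of human-readable violations (empty if the list passes)."""
    problems = []
    # Criterion 1: every word is 4-8 characters long.
    for w in words:
        if not 4 <= len(w) <= 8:
            problems.append(f"bad length: {w}")
    # Criterion 2: all 4-character prefixes are unique.
    seen = {}
    for w in words:
        prefix = w[:4]
        if prefix in seen:
            problems.append(f"shared prefix: {seen[prefix]} / {w}")
        else:
            seen[prefix] = w
    # Criterion 3: no two words have edit distance < 2.
    for a, b in combinations(words, 2):
        if edit_distance(a, b) < 2:
            problems.append(f"too similar: {a} / {b}")
    return problems
```

The pairwise comparison is quadratic in the list size, which is acceptable for a list of a few thousand words checked once.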

Wallets need to be able to distinguish between the old and new standard, so un-upgraded BIP 39 wallets should consider all new mnemonics invalid. At the same time, some new wallets may not wish to support BIP39. They shouldn't be burdened with storing the old word list.

A solution is to sort the new word list such that reused words appear first. When generating a mnemonic, at least one new word must be present. A wallet only needs to know the index of the last BIP39 overlapping word. They reject a proposed mnemonic if none of the elements use a word with a higher index.
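A rough sketch of that rule, working on word indices rather than strings. The names `is_new_style` and `generate_indices` and the list sizes in the usage note are hypothetical, and any checksum is ignored here.

```python
import secrets


def is_new_style(indices, last_overlap_index):
    """Accept only if at least one word lies beyond the BIP39-overlapping part."""
    return any(i > last_overlap_index for i in indices)


def generate_indices(n_words, list_size, last_overlap_index):
    """Rejection sampling: redraw until the incompatibility rule is satisfied."""
    if last_overlap_index >= list_size - 1:
        raise ValueError("the list contains no new words")
    while True:
        indices = [secrets.randbelow(list_size) for _ in range(n_words)]
        if is_new_style(indices, last_overlap_index):
            return indices
```

For example, with an assumed list of 2048 words of which the first 1500 overlap with BIP39, `generate_indices(12, 2048, 1499)` almost never needs a second draw, so the entropy lost to the rule is small.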

Other coins and versioning

BIP 44 is too detailed. E.g. it doesn't make much sense for non-UTXO coins such as Ethereum. It's also not very flexible, leading to the creation of BIP49 to add SegWit support. BIP43 on the other hand is too permissive. This makes it difficult for wallets to properly advertise their compatibility.

The community for each coin is probably most suited to figure out their own derivation scheme below the coin type level. I propose the following rules for coins to be accepted into the standard and for wallets to be able to claim compatibility.

  1. the coin needs to have a BIP-like process and their derivation must be specified in such a "BIP" (which in turn could be as simple as "same as Bitcoin but with different coin type")
  2. a wallet which claims to support this new SLIP-[X] standard can choose which coins to support
  3. if a wallet claims to support SLIP-[X], then for each coin it supports, it must support this standard or specify otherwise
  4. SLIP-[X] starts at version 1.0 and wallets should communicate this version
  5. New coin types can be added without a new version
  6. Once a coin is accepted, it must wait for a new SLIP-[X] before allowing coins to be deposited on an address where existing wallets would not look
  7. wallets which claim to be compatible with version N are assumed to be compatible for all coins they support, unless otherwise specified
  8. coins may not change their standard in such a way that new wallets don't look in places where old wallets may have left coins

See also SLIP-0010 for non-secp256k1 coins.

GAP limit, etc

The rules for which addresses to scan should be coin specific.

Bitcoin

This discussion might be more appropriate for a BIP proposal, but I'm just putting it out there.

In my own experience the current limit of 20 has downsides. It may be a reasonable performance trade-off, but this should be evaluated.

There is often a delay between when a wallet user sends an address and when they receive payment. Sometimes they never receive payment. There are services such as exchanges which require you to give them an address, but you may end up never using it. For privacy reasons a wallet should not reuse such an address anywhere else.
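To make the trade-off concrete, here is a minimal sketch of a gap-limit scan. `is_used` is a hypothetical callback standing in for a blockchain lookup; it is not part of any wallet API.

```python
def scan_chain(is_used, gap_limit=20):
    """Return the indices of used addresses on one chain.

    Scanning stops after `gap_limit` consecutive unused addresses.
    """
    used = []
    index = 0
    consecutive_unused = 0
    while consecutive_unused < gap_limit:
        if is_used(index):
            used.append(index)
            consecutive_unused = 0
        else:
            consecutive_unused += 1
        index += 1
    return used
```

If a user handed out addresses 4 through 29 that were never paid and then received coins on address 30, a scan with a limit of 20 stops before reaching it, while a larger limit finds it at the cost of more lookups.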

As with BIP 44, change addresses don't need a GAP limit. Unless someone objects.

It would be a better user experience if empty accounts were allowed, e.g. max 3 (again, assuming there's no unacceptable performance issue).

Perhaps by the time this standard goes live, all wallets default to SegWit. But if not, I suggest that when a wallet scans:

  • the receive chain: for each index, check the P2PKH address first. If nothing is found, check the P2SH-P2WPKH address. Once it finds coins on a P2SH-P2WPKH address, it should only check the P2SH-P2WPKH address moving forward
  • a separate bech32 receive chain. Since a wallet user needs to interact with older wallets, having a separate chain might be more practical than checking P2SH-P2WPKH and bech32 variants for each index
  • one change chain: for each index, check the P2PKH first, then the P2SH-P2WPKH address, then the bech32 address. Stop checking P2PKH once you find a P2SH-P2WPKH address, stop checking that once you find a bech32 address

Ethereum

Again, more of an EIP discussion, but just one thought: consider hardened derivation for each independent "account". Private keys can be exported and this is often useful when different wallets have strongly differentiated features and development is in flux.

Mnemonic to a seed derivation

Can this be improved? @sipa might have some ideas regarding error correction. Representing the words as integer values rather than literal strings might add more flexibility. I like how bech32 allows a wallet to pinpoint the location of a typo. Similarly it would be nice if it can pinpoint which word is wrong and suggest the right one. For 12 word mnemonics it's surprisingly easy to type a wrong word and still get a valid mnemonic (but an empty wallet).

The minimum number of words could also be reconsidered, but there is a trade-off regarding the likelihood that someone actually writes it down.

Encryption

12-24 word mnemonics are great for new users, but they're not great if someone gets their hands on your piece of paper. It would be nice if the seed can also be exported in a BIP38-like encrypted fashion, perhaps printed as a QR code. More generally, it should be possible to take advantage of hierarchical deterministic wallets without having to use the mnemonic.

Account / address (hardened) derivation

Can this be improved?

I vaguely remember some Bitcoin Core developers having doubts. @luke-jr do you remember who / why?

Duress passphrase

Personally I'm skeptical about this feature and I think it just confuses people. For duress, wouldn't it be better for software to suggest a slight variant of the mnemonic that's easy enough to remember?

Removing that feature would allow more flexibility in the derivation algorithm.

Other languages

I don't think it's a good idea to map word lists in other languages directly to the seed. This could create accidental vendor lock-in if only one wallet supports a certain language. I suggest mapping each word to English or directly to an integer value. It doesn't have to be the same meaning.

If a foreign language mnemonic supporting wallet ever becomes abandoned, the community can create a printable sheet with the mapping of each foreign word to the corresponding English word (again, meanings don't have to match at all).

In addition to a list of universal criteria, it may be useful to have an approval process for each new language. For example some sort of testimony from a linguist, or a native speaker with significant experience in bitcoin. Every language has its quirks, which lead to things to avoid (e.g. tons of homonyms in Mandarin) and things to embrace (e.g. many 2 character words in Mandarin).

Other purposes

E.g.:

  • password generation
  • pointing to data on a distributed filesystem (hash of public key points to a resource, private key decrypts it)
  • etc

There should be a way to plug these new applications in. Perhaps through redefining "coin type" as "coin or application type"?

Name

I would suggest giving this standard a name that's as easy to recognize as USB. BIP44 caught on a little bit within the bitcoin tech savvy community, but it's not great to have a name tied to a specific BIP/SLIP number, even with versioning.

Other issues?

What else should be considered?

@prusnak
Member

prusnak commented Aug 27, 2017

BIP39: I am already thinking about creating a standard that will supersede BIP39. I want to support Shamir Secret Sharing Scheme (M out of N), where old mnemonic is just special case when M=1 and N=1.
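As a rough illustration of the M-of-N idea, and of why M=1, N=1 degenerates to "the share is the secret", here is a textbook Shamir sketch over a prime field. The field choice (`PRIME = 2**127 - 1`) and the function names are assumptions for illustration only; this is not the scheme of any published standard.

```python
import secrets

# Illustrative field: the Mersenne prime 2^127 - 1. An assumption, not a spec value.
PRIME = 2**127 - 1


def split(secret, m, n):
    """Split `secret` into n shares, any m of which recover it."""
    if not (1 <= m <= n and 0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover(shares):
    """Lagrange interpolation at x = 0 (requires Python 3.8+ for pow(x, -1, p))."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `m = 1` the polynomial is constant, so `split(s, 1, 1)` returns the single share `(1, s)`, which corresponds to a plain, unsplit mnemonic.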

BIP 44 is too detailed.

That is the main feature of it. We wanted BIP44-compatible to actually mean something. If you had one standard that covered normal addresses, segwit addresses, etc., you would need to distinguish between various variants of BIP44, and I think it is better to say "this is a BIP44+BIP49 compatible wallet" than to say BIP44-normal and BIP44-segwit.

In my own experience the current limit of 20 has downsides.

Trust me: increasing this limit or removing it completely is suicide.

But if not, I suggest that when a wallet scans ...

I don't agree. I think we should treat P2PKH, P2SH-P2WPKH and Bech32 as separate chains, because they ARE separate address chains. You are introducing a lot of logic and hopefully most of this won't be necessary in the future once we migrate all coins to Bech32 addresses. If we did your way, we'd need to keep this logic forever.

Account / address (hardened) derivation: Can this be improved?

I don't think so. You would not be able to use XPUB for account.

Duress passphrase

Agree that this is bad. Plausible deniable passphrase is much better and already implemented.

Other languages

I was against foreign language wordlists in BIP39 and still am. The new mnemonic standard should have English-only words.

@saleemrashid

saleemrashid commented Aug 27, 2017

Haven't read this all yet but I wouldn't mind something where we could encode further information. e.g. A SSSS scheme where the mnemonics have the first word as shamir and the second word encoding the m for the m-of-n.

EDIT: Didn't even read the start of @prusnak's response, ignore me.

@Sjors
Author

Sjors commented Aug 28, 2017

@prusnak regarding separate receive chains: makes sense. Question: can P2SH-P2WPKH be derived from bech32?

Shamir Secret Sharing Scheme (M out of N) sounds really useful. That can be done on the existing BIP39 word set as well as a new set I assume? Does that get easier if mnemonic words are mapped to integer values instead of the literal strings they are now?

@Arachnid

Agreed on pretty much everything in the initial post. I'd also suggest that the new wordlist should ensure all words have unique metaphone codes.

I personally quite like the idea of deriving the seed from the sequence numbers of the words rather than their text; this makes it possible to express the same mnemonic in different languages/dictionaries.

A while ago I wrote up a spec on 'extended mnemonics' that can encode additional data; an adaptation of this may be useful for a future BIP39 replacement.

@saleemrashid

/cc @ecdsa

@saleemrashid

@Sjors A BIP39 mnemonic is the encoding of (number of words × 11 ÷ 8) bytes (which includes a 4 to 8 bit checksum) which would be what you want to encode for SSSS (as having (m - 1) parts would be dangerous if you encode the mnemonic itself)
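For reference, the BIP39 layout can be sketched as follows. Per BIP39, the checksum is the first ENT/32 bits of SHA-256(entropy), so it is 4 to 8 bits long, and each word encodes 11 bits. The function name `entropy_to_indices` is made up, and the sketch returns word indices instead of words so that no word list is needed.

```python
import hashlib


def entropy_to_indices(entropy: bytes):
    """Map BIP39 entropy to 11-bit word indices (entropy bits + checksum bits)."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 entropy must be 128-256 bits in steps of 32")
    cs_bits = ent_bits // 32                      # 4..8 checksum bits
    first_byte = hashlib.sha256(entropy).digest()[0]
    checksum = first_byte >> (8 - cs_bits)        # leading cs_bits of the hash
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11          # 12, 15, 18, 21 or 24
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]
```

For 16 zero bytes this yields eleven 0s followed by 3, which corresponds to the well-known test mnemonic "abandon" × 11 followed by "about".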

@saleemrashid

saleemrashid commented Aug 30, 2017

I think we should definitely version the mnemonic and make it incompatible with BIP39 and implementations should check that they support that version, else refuse to import it. Then we can add newer features with reduced risk of people not being able to import it in, e.g. 5 years, because they can't remember which piece of broken software they used.

Also, Electrum doesn't check the checksum for BIP39 mnemonics which is very problematic (because even if we use a new word list, Electrum would accept it as BIP39)

@ecdsa

ecdsa commented Aug 30, 2017

@saleemrashid Electrum does check the bip39 checksum (in git master).

@Sjors Electrum has versioned mnemonic seeds. I tried to propose this idea to the trezor team years ago, but they rejected it because it was going to slow down the commercialization of their product. If you want versioned seeds, you should use the Electrum standard instead of creating a new one.

@prusnak
Member

prusnak commented Aug 30, 2017

@ecdsa Could you point me to the BIP standard where your seed is documented?

@ecdsa

ecdsa commented Aug 30, 2017

@prusnak http://docs.electrum.org/en/latest/seedphrase.html
There is no BIP at this point, but we can create one

@prusnak
Member

prusnak commented Aug 30, 2017

Thank you for the link. No need to create a BIP, though, because I do think we still need to create a new standard, which takes the best of both worlds and also adds M-of-N SSSS into the mix.

Also I don't like the fact the version stored in mnemonic defines the derivation scheme. IMO mnemonic should only encode entropy (or entropies) and a way how to generate master private key from it, not the derivation scheme.

I think this is the main philosophical difference between your and our approach. I see the benefits of your solution (you don't have to try several schemes), but at the same time I really like that our seed is "upgradable" and I think this feature is much better than having to try several schemes (and you only do this once - during restore procedure).

Let's take the current SegWit situation. If you created Electrum seed 2 years ago and buried it in the garden 500 km from your house, you would need to go to that place today again to store the new SegWit-enabled seed. How about next year when native SegWit and Bech32 is widely used? What if you want to use the seed to generate U2F or SSH keys? My example is far-fetched, maybe your garden is just 25 km from your house, but I think that illustrates my point.

I'd like to invite you to draft a new standard with me, if we find a way to make the seed upgradable (or if you decide that not storing a derivation scheme in the seed is a good idea).

@saleemrashid

Also I don't like the fact the version stored in mnemonic defines the derivation scheme.

What do you mean by derivation scheme? Are you talking about BIP-0032 or BIP-0044/49?

@prusnak
Member

prusnak commented Aug 30, 2017

I'm talking about this: http://docs.electrum.org/en/latest/seedphrase.html#list-of-reserved-numbers

Not sure if Electrum follows BIP-32/BIP-44/BIP-49 for these particular cases.

@saleemrashid

saleemrashid commented Aug 30, 2017

Totally agree, there's no reason why you need to encode anything more than the entropy in the mnemonic.

@prusnak
Member

prusnak commented Aug 30, 2017

I am talking about the fact that once mnemonic encodes something more than the entropy (because it defines WHAT to do next with that entropy in order to derive keys to be used), you need to generate a new one every time you want to use it for something new.

@Arachnid

Arachnid commented Aug 30, 2017

also adds M-of-N SSSS into the mix

Please let's not hardcode just that, though - personally I'd much prefer a system like the one I linked, that permits different types of mnemonic, with different means of deriving the secret data.

I am talking about the fact that once mnemonic encodes something more than the entropy (because it defines WHAT to do next with that entropy in order to derive keys to be used), you need to generate a new one every time you want to use it for something new.

I agree, but I would like to be super clear that every (mnemonic type, network) tuple should have a single canonical derivation defined by an extension proposal. We're currently suffering from a glut of different derivation paths in Ethereum, and I wouldn't wish it on anyone.

@saleemrashid

saleemrashid commented Aug 30, 2017

@Arachnid The way I see it, Bitcoin can be an "application" for this new derivation scheme and Ethereum can be another "application". Then we can have something like m/bitcoin/<coin_type>/<...> for Bitcoin, Testnet, Litecoin, etc. and m/ethereum/<chain_id>/<...> for Ethereum. (Of course, we won't use strings for it). Sounds good?
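On the "we won't use strings" point: a BIP32 path is ultimately a list of 32-bit child indices, with hardened children offset by 2^31. A small sketch of that encoding, using today's BIP44-style notation (the helper name `parse_path` is made up):

```python
HARDENED = 0x80000000  # BIP32: indices >= 2^31 denote hardened derivation


def parse_path(path: str):
    """Convert e.g. "m/44'/0'/0'/0/0" into a list of 32-bit child indices."""
    parts = path.split("/")
    if parts[0] != "m":
        raise ValueError("path must start with 'm'")
    indices = []
    for part in parts[1:]:
        hardened = part.endswith("'")
        number = int(part[:-1] if hardened else part)
        if not 0 <= number < HARDENED:
            raise ValueError(f"index out of range: {part}")
        indices.append(number + HARDENED if hardened else number)
    return indices
```

Under BIP44 and SLIP-0044 the second component is the coin type (0 for Bitcoin, 60 for Ethereum), so `parse_path("m/44'/60'/0'/0/0")` has `0x8000003C` in that position. An "application" level as proposed here would simply be another integer assigned in the same way.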

@Arachnid

Yes, I agree - I'm just saying that we should do everything we can to make sure the derivation path(s) can be known based on the code and the context it's used in; any ambiguity there will lead to multiple competing derivation paths.

@saleemrashid

@Arachnid Agreed. I've always thought it was a mistake to use BIP-0044 for Ethereum (and other non Bitcoin-like coins).

@Sjors
Author

Sjors commented Sep 4, 2017

@saleemrashid wrote:

A BIP39 mnemonic is the encoding of (number of words × 11 ÷ 8) bytes (which includes a 4 to 8 bit checksum)

Just to clarify, my concern is with this sequence: N × 11 ÷ 8 bytes -> N words -> pbkdf2(words + passphrase) -> key space. This makes the key space depend on the specific language, which creates a compatibility risk for non-English languages (and I find it inelegant, but that's not a strong argument).

I would like to see the following sequence instead: N words -> M bytes -> key space, where I don't have any preference as to whether M = N × 11 ÷ 8 or some other scheme. This allows for different ways to create the M bytes, e.g. words in another language, SSSS, something like BIP38 or a combination.

If people really want a passphrase, that could go either in the step N words -> M bytes or in the step M bytes ->key space.
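The difference between the two sequences can be sketched like this. `bip39_seed` follows BIP39 as published (PBKDF2-HMAC-SHA512, 2048 iterations, salt "mnemonic" + passphrase, NFKD normalisation); `indices_to_bytes` is a purely hypothetical "N words -> M bytes" step, not a proposal from this thread.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP39: the seed is derived from the literal words, so it is language dependent."""
    def norm(s):
        return unicodedata.normalize("NFKD", s).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", norm(mnemonic),
                               b"mnemonic" + norm(passphrase), 2048)


def indices_to_bytes(indices, bits_per_word=11) -> bytes:
    """Hypothetical alternative: pack word indices into bytes, ignoring spelling."""
    value = 0
    for i in indices:
        if not 0 <= i < (1 << bits_per_word):
            raise ValueError("index out of range")
        value = (value << bits_per_word) | i
    total_bits = bits_per_word * len(indices)
    return value.to_bytes((total_bits + 7) // 8, "big")
```

With the second approach, two word lists in different languages that assign the same indices produce the same M bytes, and any key stretching or passphrase handling can be applied to those bytes afterwards.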

@Arachnid

Arachnid commented Sep 5, 2017

@Sjors I agree entirely.

@Sjors
Author

Sjors commented Nov 16, 2017

@prusnak wrote:

In my own experience the current limit of 20 has downsides.

Trust me: increasing this limit or removing it completely is suicide.

The ValueShuffle proposal seems to require never using an address if it was revealed in a failed mixing attempt.

There are probably more use cases like this.

@roconnor-blockstream
Contributor

I would like a BIP-39 replacement to have the property that it be plausible (although not necessarily easy) to derive a master seed by hand, by rolling dice, or flipping coins, etc. For BIP-39 there are many reasonable ways of transforming dice rolls or coin flips into a uniform selection of words, and I don't think any particular method need be prescribed. However, the SHA-256 based checksum in BIP-39 is what kills the ability to generate a master seed by hand.

A BCH code, such as the one used for Bech32 in BIP-173 provides a checksum that I believe can be plausibly computed by hand, though I suppose I need to validate this by trying it.

For Bech32, the pencil and paper algorithm for generating a checksum works by operating on a string of 6 characters from the bech32 alphabet, starting with the expanded HRP data (which is a fixed value and can be safely precomputed and published). The checksum computation works by appending one data character to the end of this 6 character buffer and removing a character from the other end of the 6 character buffer. Using a precomputed printed lookup table of 32 entries (Table 1), one for each Bech32 character, one finds the corresponding entry for the removed character and it will have an associated 6 character (from the bech32 alphabet) entry. One needs to combine this value, with the current 6 character buffer using a second table (Table 2). Table 2 contains a 32x32 "addition" table for Bech32 characters. Working character-wise one "adds" each character from the current buffer to the corresponding character from the entry picked out from Table 1. This "sum" becomes the new 6 character buffer.

This process repeats until all the data characters have been processed. After that, there is a bit of post-processing work where you process 6 more 'q' characters in the same way and then you need to change the last character according to a third table, and you are done. The resulting 6 characters are the Bech32 checksum.
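For comparison, the same computation in code, adapted from the BIP-173 reference implementation. The 30-bit `chk` value plays the role of the 6-character buffer, the XOR of `GEN` entries selected by the removed character corresponds to the Table 1 lookup, and XOR itself is the "addition" of Table 2. The six zero values appended at the end are the six 'q' characters, and the final `^ 1` is the last-character adjustment.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


def bech32_polymod(values):
    chk = 1
    for v in values:
        removed = chk >> 25                    # character leaving the buffer
        chk = ((chk & 0x1ffffff) << 5) ^ v     # shift in the next character
        for i in range(5):
            if (removed >> i) & 1:
                chk ^= GEN[i]                  # "Table 1" entry, added via XOR
    return chk


def bech32_hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_create_checksum(hrp, data):
    """`data` is a list of 5-bit values; returns six 5-bit checksum values."""
    values = bech32_hrp_expand(hrp) + list(data)
    polymod = bech32_polymod(values + [0] * 6) ^ 1
    return [(polymod >> (5 * (5 - i))) & 31 for i in range(6)]


def bech32_verify(hrp, data_with_checksum):
    return bech32_polymod(bech32_hrp_expand(hrp) + list(data_with_checksum)) == 1
```

As a sanity check, the BIP-173 test vector `a12uel5l` has HRP `a`, no data, and checksum `2uel5l`, which is what `bech32_create_checksum("a", [])` maps to through `CHARSET`.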

My point of writing out the above is to illustrate that it is plausible to do such computation by hand and still get a powerful error-correcting checksum like what Bech32 uses. Creating a checksum by hand isn't too dangerous. The failure scenario is that the checksum computed is incorrect. One would be expected to test one's master seed in the hardware device one is using before committing to it, or otherwise repeat the computation independently 2 or 3 times. In any case, no one would be forcing one to generate master seeds this way; I only want it to be possible for sophisticated users to be able to do this sort of calculation. The process doesn't even need to be documented as far as I'm concerned.

From what you have written above, it sounds like you were leaning towards using such a checksum anyways, which is great!

@saleemrashid

@roconnor-blockstream Take a look at https://github.com/satoshilabs/slips/blob/master/slip-0039.md for the current status of this.

@roconnor-blockstream
Contributor

Thanks. Where is an appropriate place for me to comment on Slip-39?

Given the 10-bit word list, I think it would be nice to simply replace GF(256) with GF(32) and make everything a multiple of 5 bits except for the result of the key derivation function. (We would want to round the master seed size to 130-bit / 255-bit (or maybe 230-bit).) Then you could drop in Bech32's 30-bit checksum, since it too works over GF(32) (though I do feel like having a checksum on the master seed itself is a bit overkill). Everything could plausibly be computed by hand (while I wouldn't expect anyone to do SSSS by hand for any threshold other than M=1, I do think SSSS could be done by hand this way without more effort than a typical World War 2 spy would expend on a hand-cipher of that era).

With 10-bit words, there is a canonical mapping between words and pairs of bech32 characters, allowing one to have a choice between compressed encoding (bech32) and word-list encodings of the share data, which is a nice property of your word list.
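A sketch of that mapping: a 10-bit word index splits into two 5-bit halves, each of which is one character of the Bech32 alphabet. The helper names are made up for illustration.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # Bech32 alphabet from BIP-173


def index_to_pair(index: int) -> str:
    """Encode a 10-bit word index as two Bech32 characters (high half first)."""
    if not 0 <= index < 1024:
        raise ValueError("index must fit in 10 bits")
    return CHARSET[index >> 5] + CHARSET[index & 31]


def pair_to_index(pair: str) -> int:
    """Inverse of index_to_pair."""
    if len(pair) != 2:
        raise ValueError("expected exactly two characters")
    high, low = CHARSET.index(pair[0]), CHARSET.index(pair[1])
    return (high << 5) | low
```

A share of N words therefore has an equivalent compact form of 2N Bech32 characters, and the two encodings can be converted into each other without any knowledge of the secret.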

I think having the option of constructing master seeds without the use of modern digital computer is important, but not everyone may agree on this point.

@saleemrashid

saleemrashid commented Jan 5, 2018

@roconnor-blockstream Bear in mind that it can be used for secret data that isn't a BIP32 master secret, so we don't want to put unnecessary restrictions on the length of the secret.

I agree with your last point, but I don't think you need to be able to generate the recovery seed.
Being able to provide the entropy would be sufficient and you can always verify that the entropy gives the expected recovery seed. If you cannot do that for whatever reason, generating the recovery seed without a computer is equally useless because you cannot verify that private keys are being derived correctly.

@roconnor-blockstream
Contributor

I don't think anything I suggested would put any unnecessary restrictions on the length of secret data (though I'm unclear if by "secret" you mean the master secret or the resulting seed).

I agree with your last paragraph. But just to emphasize, there is a huge difference between trusting a device such as a Trezor with correctly computing deterministic private keys and deterministic signatures versus trusting such a device to produce genuine random data.

But yes, a colleague suggested it would be sufficient to write an app for whatever hardware device will be holding the secret anyways that computes checksums from data. As far as I'm aware there are no such apps for the Trezor, etc. Perhaps the onus is on me to create these apps if I think it is so important. There are other schemes where you could enter random data and get a certificate that proves that the given data was incorporated into the device's generation of the random seed, which would also work. Although ideally those certificates would need to be such that one could validate them by hand, and I don't know if that is the case.

Still I'm a big fan of Bech32. I think error correcting codes are a great way of storing share data and why not use an ECC that you'll already be needing anyways?

@saleemrashid

computing deterministic private keys and deterministic signatures versus trusting such a device to produce genuine random data

I think you misunderstood what I was saying.

To be clear, you can currently verify that the entropy the connected computer provided (either from the CSPRNG or your own entropy) was combined with the TREZOR's internal entropy, because the device displays the internal entropy it used.

But I'm talking about a new feature that would allow you to securely provide all of the entropy. The TREZOR would compute the checksum and output a recovery seed as usual, but you could verify that the entropy you provided was used by the TREZOR.

As far as I'm aware there are no such apps for the Trezor, etc.

There are a huge number of BIP39 implementations and any of those should allow you to generate (the same) mnemonic from the entropy.

Still I'm a big fan of Bech32. I think error correcting codes are a great way of storing share data and why not use an ECC that you'll already be needing anyways?

Totally agree but, even if we switch to ECC, the usage of PBKDF2 and AES would still cause an issue for you.

@jb55

jb55 commented Jan 5, 2018 via email

@prusnak
Member

prusnak commented Jan 5, 2018

I think it would be nice to simply replace GF(256) with GF(32)

There are already nice implementations of GF(256), so I would rather stick to these, not go with GF(32).

Bech32's 30-bit checksum

This is just too much, effectively adding 3 extra words to the share.

I think having the option of constructing master seeds without the use of modern digital computer is important, but not everyone may agree on this point.

You can create a seed using dice or whatever other means. It will not be a valid BIP39 seed, but you can still use it. I don't think you want to create SLIP39 mnemonics by hand.

why not use an ECC?

For mnemonics you don't want to use ECC. Mnemonic generated from a wordlist is itself already an error correcting code (if you see a word acadrrnic - it's most probably the word academic). Also you don't want to use ECC, because it would help an attacker to reconstruct the seed if they don't have the full information, which is something you don't want.

Also as Saleem indicated above, in case of TREZOR - two sources of entropy are mixed, so it is very hard for an attacker to meaningfully exploit both processors at the same time.

@prusnak
Member

prusnak commented Jul 25, 2018

Don't reinvent the wheel.

Why? I am the author of BIP39 and in the next iteration of the mnemonic standard (SLIP39) I want to change some minor stuff in the old design. Your simple code change is a very good proof that it should be trivial for any software which implements BIP39 to also implement SLIP39. The only reason why NOT to change the PBKDF2 parameters is that someone already has a working cracking rig for BIP39 and they want to keep using that to crack SLIP39 too. So unless you give a VERY good reason why we should stick to old params, I'll keep them changed.

@Steve132

Steve132 commented Jul 25, 2018 via email

@clarkmoody
Contributor

Requirements on engineers discourage adoption

But user demand encourages adoption. The ability to split the HD seed into Shamir Shares will be so compelling that I see SLIP-0039 getting adopted very quickly.

@Steve132

Steve132 commented Jul 25, 2018 via email

@prusnak
Member

prusnak commented Jul 26, 2018

So why reinvent the wheel?

The reason why I wanted the change from PBKDF2-HMAC-SHA512 to PBKDF2-HMAC-SHA256 is that SHA512 is MUCH faster on 64-bit processors than on 32-bit processors, making it easier to bruteforce on desktop computers (as opposed to a computation that takes place on embedded hardware). SHA256 does not have that big a difference.

@prusnak
Member

prusnak commented Jul 26, 2018

@roconnor-blockstream what do you think about 947f4b9? It has the slight advantage that it is possible to convert an existing BIP39_seed + one of its passphrases into a new SLIP39_split_seed that produces the same seed using the same passphrase (it will produce different seeds with different passphrases though ...)

@roconnor-blockstream
Contributor

Looks good to me. I like having the option to migrate to an SSS key without necessarily having to re-key everything and/or move all my coins.

@roconnor-blockstream
Contributor

human-readable part (hrp) of Bech32 is "SLIP0039"

The hrp for Bech32 should be lowercase otherwise you are going to have a bad time. When you make this change, you might want to also lowercase the same string in the salt to keep them matching.

@clarkmoody
Contributor

clarkmoody commented Jul 26, 2018 via email

@roconnor-blockstream
Contributor

roconnor-blockstream commented Jul 26, 2018

Yes but mixed case is not allowed in Bech32 and all characters are converted to lowercase, including the HRP, prior to computing the checksum. Effectively, the uppercase and lowercase forms of an HRP are equivalent and the lowercase form is canonical. I think casually putting the uppercase form into the spec is liable to cause implementation errors where programmers use the uppercase form in the checksum computation, especially in this case where the Bech32 alphabet isn't being used.

I mean, maybe I'm an idiot, but that's what I started to do before I caught myself.

@prusnak
Member

prusnak commented Jul 26, 2018

Looks good to me. I like having the option to migrate to an SSS key without necessarily having to re-key everything and/or move all my coins.

I still might revert this change. The reason is that AES is a block cipher and you can recover the first half of the master seed only by having the first half of the first share (in 1-of-2 scenario). Stream ciphers are even worse in that regard. Also there are a couple of other reasons why it's not a good idea, but it would take a lot of time to explain.

@prusnak
Member

prusnak commented Jul 26, 2018

lowercase form is canonical

Fixed in 52bb7c6

@roconnor-blockstream
Contributor

roconnor-blockstream commented Jul 26, 2018

I still might revert this change. The reason is that AES is a block cipher and you can recover the first half of the master seed only by having the first half of the first share (in 1-of-2 scenario).

I think the AES is fine. Here is my argument (for the 256-bit master secret case). Barring a change of replacing GF(256) with GF(2^256), if an attacker manages to get the first half of M shares, they can recover the first half of the master secret no matter what. Whether you use PBKDF2 or you use AES, in both cases, the attacker needs to do 2^128 work to recover the full master seed, so the two procedures offer effectively the same security.

At most you need to recommend the master seed be hashed as the first step of any application (like bip-32 does).

@prusnak
Member

prusnak commented Jul 26, 2018

My argument for not using this method is the following:

  1. I am using BIP39 seed with passphrase A, I don't think I use any other passphrases
  2. I migrate to SLIP39 using the passphrase A
  3. I destroy BIP39 backups, because I don't need them anymore
  4. after a few years I realize that I also used a passphrase B with the BIP39 seed for storing something special
  5. I am screwed

This is going to happen, and this scenario is a very strong reason why we should be very explicit about moving stuff to a new mnemonic, although it might be less comfortable.
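
The scenario above follows from how BIP39 turns a mnemonic plus passphrase into a seed (PBKDF2-HMAC-SHA512, 2048 iterations, salt "mnemonic" + passphrase). A small sketch; the mnemonic is a well-known test phrase and the passphrase values are placeholders:

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte BIP39 seed from a mnemonic and optional passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


mnemonic = ("abandon abandon abandon abandon abandon abandon "
            "abandon abandon abandon abandon abandon about")
seed_a = bip39_seed(mnemonic, "passphrase A")
seed_b = bip39_seed(mnemonic, "passphrase B")
```

`seed_a` and `seed_b` are unrelated outputs. Someone who migrated only `seed_a` and then destroyed the mnemonic has no way to recompute `seed_b`.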

@roconnor-blockstream
Contributor

I'm confused by your scenario. Where does passphrase B fit into the story?

@prusnak
Member

prusnak commented Jul 26, 2018

point 4. is BIP39 seed + passphrase B

@roconnor-blockstream
Contributor

So you are arguing for using AES because it allows a user to keep their master seed when migrating to SLIP39?

@roconnor-blockstream
Contributor

Okay, I think I get what you are saying. You meant "4. after a few years I realize that I also used a passphrase B with the BIP39 mnemonic from 1 for storing something special."

@prusnak
Member

prusnak commented Jul 26, 2018 via email

@roconnor-blockstream
Contributor

roconnor-blockstream commented Jul 26, 2018

But how does using PBKDF2 over AES help?

In your scenario you are going to create a brand new SLIP39 derived seed, spend a lot of time, effort and money to rekey all your infrastructure, move all your funds, probably eroding your financial privacy in doing so. Then you will toss out your old BIP39 mnemonic and still end up realizing later that you had something else using passphrase B with the discarded mnemonic.

@Steve132

Steve132 commented Jul 27, 2018 via email

@jhoenicke
Contributor

@Steve132 the point is that sha512 is much slower than sha256 on 32-bit ARM and similar architectures, because the architecture is not suited to this hash. sha512 works on eight 64-bit values, which can't all be kept in registers, making the operations much slower. This means that a single iteration is about ten times slower for sha512 on such hardware, while on other hardware sha512 is sometimes faster than sha256. If we want to keep the iteration count low enough for hardware wallets, we gain about 10 times more security by switching to sha256 with 10 times more iterations.
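
A quick way to compare the two on a given machine is to time PBKDF2 with each hash. This is only a measurement sketch; the passphrase, salt and iteration counts are placeholders, and results on a desktop CPU will not resemble those on a 32-bit microcontroller.

```python
import hashlib
import time


def time_pbkdf2(hash_name: str, iterations: int,
                passphrase: bytes = b"passphrase", salt: bytes = b"salt"):
    """Return the derived key and the wall-clock seconds it took."""
    start = time.perf_counter()
    key = hashlib.pbkdf2_hmac(hash_name, passphrase, salt, iterations)
    return key, time.perf_counter() - start


# Same time budget idea: if sha512 is ~10x slower per iteration on the
# target device, sha256 can afford ~10x the iterations (illustrative counts).
key512, seconds512 = time_pbkdf2("sha512", 2_000)
key256, seconds256 = time_pbkdf2("sha256", 20_000)
```

Comparing `seconds512` and `seconds256` on the actual target hardware shows whether the tenfold trade-off holds there.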

Slip39 and BIP39 are already incompatible because of other changes, e.g., the master secret is not reinterpreted as a BIP39 mnemonic with a bip39 english word list.

AES vs. PBKDF only:

Both allow multiple wallets with passphrases. In addition, AES allows reusing one existing wallet with some arbitrary passphrase. However then you obviously can't choose what wallets are generated for the other passphrases.

@prusnak I like the possibility of encoding a previous wallet with slip39 with an additional passphrase protection. Whether or not a wallet should allow you to export your existing wallets as slip39 is a different issue. Some wallets that don't have multiple accounts with plausible deniability may want to use this feature.

@prusnak
Member

prusnak commented Jul 27, 2018

However then you obviously can't choose what wallets are generated for the other passphrases.

@roconnor-blockstream There is another hidden problem. There are some applications (namely U2F, but a password manager is another one) which do not use a passphrase by design (there were technical reasons for this), so people use two passphrases even if they don't know they do. The passphrase A, passphrase B scenario from above applies to this as well (just replace 4 with "I want to use a new SLIP39 seed with U2F/password manager"). But I agree this is an implementation issue, not an issue of the standard.

@roconnor-blockstream
Contributor

Those people using a passwordless BIP39 HD-wallet are using the same master seed as their U2F application? So they could port both their master seed and U2F service to AES-SLIP39? Just making sure I understand.

@prusnak
Member

prusnak commented Jul 27, 2018

Those people using a passwordless BIP39 HD-wallet are using the same master seed as their U2F application?

Yes, the same BIP39 seed + passphrase = HD-wallet, the same BIP39 seed + no passphrase = U2F, password manager, etc.

So they could port both their master seed and U2F service to AES-SLIP39?

Yes, but it's not so trivial to understand that you have to do it twice. Personally, I think that it's easier to just unlink the U2F device. The password manager is a bigger problem though.

@roconnor-blockstream
Contributor

roconnor-blockstream commented Jul 27, 2018

I see. I can't help but see enormous benefits in allowing the possibility of porting one master seed to a SLIP39 encoding. I'm not sure how broadly you are hoping SLIP39 to be supported throughout the ecosystem, but allowing for the possibility of porting one master seed doesn't necessarily obligate implementations to support this porting for all applications. For example, an application may just choose only to allow porting of passwordless BIP39 seeds to avoid dragging users into confusing technical issues.

AES-SLIP39 would allow some other valuable features. For example, users who are using a single password AES-SLIP39 could change their password (in case they think it has been compromised) by generating a new set of shares, without having to rekey their whole infrastructure or move their funds. (The same issue for users who use multiple passphrases applies here though; not all applications can be moved this way.)

Perhaps you guys want to write a public key encrypted export/import function for your password manager for migration by securely sweeping that data to a new device? I don't know; perhaps this isn't so easy to do.

@Steve132

Steve132 commented Jul 30, 2018 via email

@prusnak
Member

prusnak commented Jul 30, 2018 via email

@roconnor-blockstream
Contributor

roconnor-blockstream commented Aug 2, 2018

I'm not a cryptographer but Wikipedia suggests you can use a (3 or) 4 round Luby–Rackoff construction to construct a pseudo-random permutation from a pseudo-random function (like SHA256 is assumed to be): https://en.wikipedia.org/wiki/Feistel_cipher#Theoretical_work

This would be like your AES idea, but, if I understand correctly, get you a full permutation over all 256-bits.

I suppose you'd use the PBKDF to generate the four rounds of sub keys, then use HMAC-SHA256 for the F function in each round.
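
The suggestion above could look roughly like the following. This is only a sketch under stated assumptions, not a vetted construction: the iteration count, the salt, the truncation of HMAC output to half the secret length, and all function names are choices made for the example.

```python
import hashlib
import hmac


def round_keys(passphrase: bytes, salt: bytes,
               iterations: int = 10_000, rounds: int = 4) -> list:
    """Derive one 32-byte subkey per Feistel round with PBKDF2-HMAC-SHA256."""
    material = hashlib.pbkdf2_hmac("sha256", passphrase, salt,
                                   iterations, dklen=32 * rounds)
    return [material[32 * i:32 * (i + 1)] for i in range(rounds)]


def _f(key: bytes, half: bytes) -> bytes:
    """Round function: HMAC-SHA256 truncated to the half-block length."""
    return hmac.new(key, half, hashlib.sha256).digest()[:len(half)]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(secret: bytes, keys: list) -> bytes:
    """Feistel network: (L, R) -> (R, L xor F(R)) for each round key."""
    n = len(secret) // 2  # assumes an even-length secret of at most 64 bytes
    left, right = secret[:n], secret[n:]
    for key in keys:
        left, right = right, _xor(left, _f(key, right))
    return left + right


def decrypt(ciphertext: bytes, keys: list) -> bytes:
    """Undo the rounds in reverse order."""
    n = len(ciphertext) // 2
    left, right = ciphertext[:n], ciphertext[n:]
    for key in reversed(keys):
        left, right = _xor(right, _f(key, left)), left
    return left + right


keys = round_keys(b"passphrase A", b"example-salt")
master_secret = bytes(range(32))
encrypted = encrypt(master_secret, keys)
```

Every passphrase decrypts `encrypted` to some 32-byte value, and the mapping is a permutation over the whole 256-bit block rather than over two independent 128-bit halves.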

@prusnak
Member

prusnak commented Aug 3, 2018

This would be like your AES idea, but, if I understand correctly, get you a full permutation over all 256-bits.

Thanks! We'll investigate this. AES giving a permutation over just 128 bits was one of the main reasons why I was hesitant to use it.
