rework seedphrase
ecdsa committed Oct 2, 2016
1 parent a2023ef commit f94eca3
Showing 1 changed file with 89 additions and 55 deletions.
144 changes: 89 additions & 55 deletions seedphrase.rst
@@ -1,81 +1,123 @@
Electrum Seed Version System
============================

Electrum has been the first Bitcoin wallet to derive private keys from
a seed phrase made of English words. Early versions of Electrum used a
bidirectional encoding between seed phrase and entropy, requiring a
fixed wordlist.

Starting with version 2.0, Electrum derives its master key from a hash
of the UTF8 normalized seed phrase, in a way that does not depend on
the wordlist. This means that the wordlist can be updated without
breaking existing seeds, and that future wallet implementations will
not need to carry today's wordlists in order to be able to decode seeds
created today. The rationale is to minimize the cost of forward
comptibility.
This document describes the Seed Version System used in Electrum
(version 2.0 and higher).

Motivation
----------

Electrum was the first Bitcoin wallet to derive private keys from a
seed phrase made of English words. Early versions of Electrum (before
2.0) used a bidirectional encoding between seed phrase and
entropy. This type of encoding requires a fixed wordlist. This means
that future versions of Electrum must ship with the exact same
wordlist, in order to be able to read old seed phrases.

BIP39 was introduced two years after Electrum, in a way that broke
compatibility with existing Electrum seeds, because nothing was done
to prevent collisions. BIP39 seeds include a checksum, in order to
help users figure out typing errors. However, BIP39 suffers the same
shortcomings as early Electrum seed phrases:

- A fixed wordlist is still required. Following our recommendation,
BIP39 authors agreed to derive keys and addresses in a way that
does not depend on the wordlist. However, BIP39 still requires the
wordlist in order to compute its checksum, which is plainly
inconsistent, and defeats the purpose of our recommendation. This
problem is exacerbated by the fact that BIP39 proposes to create
one wordlist per language. This threatens the portability of BIP39
seed phrases.

- BIP39 seed phrases do not include a version number. This means that
software should always know how to generate keys and
addresses. BIP43 suggests that wallet software will try various
existing derivation schemes, within the BIP32 framework. This is
vastly inefficient, and it rests on the assumption that future
wallets will support all previously accepted derivation
methods. If, in the future, a wallet developer decides not to
implement a particular derivation method because it is deprecated,
then the software will not be able to detect that the corresponding
seed phrases are not supported, and it will return an empty wallet
instead. This threatens users' funds.

For these reasons, Electrum does not generate BIP39 seeds. Starting
with version 2.0, Electrum uses the following Seed Version System,
which addresses these issues.


Description
-----------

Electrum 2.0 derives keys and addresses from a hash of the UTF8
normalized seed phrase, in a way that does not depend on the
wordlist. This means that the wordlist can be updated without breaking
existing seeds, and that future wallet implementations will not need
to carry today's wordlists in order to be able to decode the seeds
created today. This minimizes the cost of forward compatibility.

In addition, Electrum 2.0 seed phrases include a version number. The
purpose of the version number is to indicate how addresses and keys
are derived from the seed. As with key derivation, the version
number is obtained from a hash of the UTF8 normalized seed phrase.

The version number is also used to check seed integrity; to be
correct, a seed phrase must produce a registered version number.
The version number is also used to check seed integrity; in order to
be correct, a seed phrase must produce a registered version number.


Seed phrase normalization
-------------------------

.. code-block:: python
normalized_seedphrase = mnemonic.prepare_seed(seed_phrase)
Note that the normalization function removes diacritics and
also spaces between asian CJK characters (this differs from
bip39).


Version number
--------------

The following hash is computed from the seed phrase:
The version number is a prefix of a hash derived from the seed
phrase. The length of the prefix is a multiple of 4 bits. The prefix
is computed as follows:

.. code-block:: python
s = hmac_sha_512("Seed version", normalized_seedphrase)
   def version_number(seed_phrase):
       # normalize seed
       normalized = prepare_seed(seed_phrase)
       # compute hash
       h = hmac_sha_512("Seed version", normalized)
       # use hex encoding, because prefix length is a multiple of 4 bits
       s = h.encode('hex')
       # the length of the prefix is written on the first 4 bits
       # for example, the prefix '101' is of length 4*3 bits = 4*(1+2)
       length = int(s[0], 16) + 2
       # read the prefix
       prefix = s[0:length]
       # return version number
       return hex(int(prefix, 16))
The version number is a prefix of s. The length of the prefix is a
multiple of 4 bits:
The normalization function (prepare_seed) removes all but one space
between words. It also removes diacritics, and it removes spaces
between Asian CJK characters.
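
The three normalization steps named above can be sketched as follows.
This is only an illustration, not Electrum's ``prepare_seed``: the
function name ``normalize_seed``, the use of NFKD decomposition to
expose diacritics, and the short ``CJK_RANGES`` list are all
assumptions made for the example.

.. code-block:: python

   import unicodedata

   # Illustrative, incomplete list of CJK code point ranges (assumption).
   CJK_RANGES = [
       (0x4E00, 0x9FFF),  # CJK Unified Ideographs
       (0x3040, 0x30FF),  # Hiragana and Katakana
   ]


   def is_cjk(char):
       """Return True if char falls in one of the ranges listed above."""
       code = ord(char)
       return any(low <= code <= high for low, high in CJK_RANGES)


   def normalize_seed(seed_phrase):
       # Decompose characters so that diacritics become combining marks.
       text = unicodedata.normalize("NFKD", seed_phrase)
       # Remove the combining marks, i.e. the diacritics.
       text = "".join(c for c in text if not unicodedata.combining(c))
       # Remove all but one space between words.
       text = " ".join(text.split())
       # Remove spaces that sit between two CJK characters.
       kept = []
       for i, c in enumerate(text):
           if (c == " " and 0 < i < len(text) - 1
                   and is_cjk(text[i - 1]) and is_cjk(text[i + 1])):
               continue
           kept.append(c)
       return "".join(kept)

With this sketch, a phrase typed with extra spaces or accents maps to
the same string as its plain form, so both produce the same hash.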

.. code-block:: python

length = 4*(n+2)

where n is encoded on the first four bits of s.
For example, the prefix '0x101' is of length 12 bits = 4*(1+2)
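
For readers on Python 3, the prefix computation can be sketched with
the standard library alone. This sketch assumes that the first
argument of ``hmac_sha_512`` is the HMAC key, and that the input has
already been normalized. The name ``seed_version_prefix`` is
hypothetical. Unlike ``hex(int(prefix, 16))``, it keeps leading
zeros, so the result matches the notation used in the table (for
example ``0x01``).

.. code-block:: python

   import hashlib
   import hmac


   def seed_version_prefix(normalized_seed):
       """Return the version prefix of an already-normalized seed phrase."""
       # HMAC-SHA512 keyed with the constant string "Seed version"
       # (assumption: the key is the first argument of hmac_sha_512).
       digest = hmac.new(b"Seed version",
                         normalized_seed.encode("utf8"),
                         hashlib.sha512).hexdigest()
       # The first hex digit n gives the prefix length: n + 2 hex digits.
       length = int(digest[0], 16) + 2
       # Keep leading zeros so that the prefix reads like the table entries.
       return "0x" + digest[:length]

A seed phrase would then be accepted only if the returned string is
one of the registered numbers.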


List of reserved prefixes
-------------------------
List of reserved numbers
------------------------

The following seed types are used in Electrum.

======== ========= =============================
Prefix Type Description
======== ========= =============================
0x01 Standard P2PKH, single account
======== ========= =====================================
Number Type Description
======== ========= =====================================
0x01 Standard P2PKH and Multisig P2SH wallets
0x02 Segwit Reserved for Segwit
0x101 2FA Two-factor authenticated
======== ========= =============================
0x101 2FA Two-factor authenticated wallets
======== ========= =====================================


Seed generation
---------------

Seed generation requires to find a phrase whose hash has the desired
prefix. This can only be achieved by enumeration, so the existence of
that constraint does not decrease the security of the seed (up to the
cost of key stretching required to generate the private keys).
Seed generation requires finding a seed phrase whose hash has
the desired prefix. This can only be achieved by enumeration. Thus,
the existence of that constraint does not decrease the security of the
seed (up to the cost of key stretching that might be required to
generate the private keys).
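
The enumeration can be sketched as a simple retry loop: draw a random
phrase, hash it, and keep it only if the hash starts with the desired
prefix. The names ``DEMO_WORDS``, ``hash_prefix`` and
``generate_seed`` are hypothetical, and the 16-word list is far too
small for real use. A real wallet would draw from its full wordlist
and normalize the phrase first.

.. code-block:: python

   import hashlib
   import hmac
   import secrets

   # Tiny illustrative wordlist (assumption), not suitable for real seeds.
   DEMO_WORDS = [
       "apple", "bridge", "cloud", "dance", "eagle", "forest", "garden",
       "harbor", "island", "jungle", "kettle", "lemon", "mirror", "needle",
       "ocean", "pencil",
   ]


   def hash_prefix(phrase):
       """Return the hex prefix selected by the first digit of the hash."""
       digest = hmac.new(b"Seed version", phrase.encode("utf8"),
                         hashlib.sha512).hexdigest()
       return digest[:int(digest[0], 16) + 2]


   def generate_seed(target_prefix="01", num_words=12):
       """Draw random phrases until one hashes to the target prefix."""
       while True:
           phrase = " ".join(secrets.choice(DEMO_WORDS)
                             for _ in range(num_words))
           if hash_prefix(phrase) == target_prefix:
               return phrase

For the two-digit prefix ``01`` about one candidate in 256 is
accepted, so the loop finishes after a few hundred attempts on
average.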


Wordlist
@@ -84,11 +126,3 @@ Wordlist
Electrum currently uses the same wordlist as BIP39 (2048 words).
A typical seed has 12 words and 132 bits of entropy.


Comparison to BIP39
-------------------

This system is not compatible with BIP39. BIP39 requires a
predetermined wordlist in order to compute its checksum.
BIP39 also lacks a version number.
