# 3. Wallets

At a high level, a wallet is an application that serves as the primary user interface. It controls access to a user's money by managing keys and addresses, tracking the balance, and creating and signing transactions.
A common misconception about bitcoin is that bitcoin wallets contain bitcoin. Bitcoin wallets contain keys, not coins.

Each user has a wallet containing keys. Users sign transactions with the keys, thereby proving they own the transaction outputs (the coins). The coins are stored on the blockchain in the form of transaction outputs.

There are two primary types of wallets, distinguished by whether the keys they contain are related to each other or not. 

The first type is a *nondeterministic wallet*, where each key is independently generated from a random number, as shown in the previous sections. These keys are not related to each other.

The second type is a *deterministic wallet*, where all the keys are derived from a single master key, known as the *seed*. All the keys in this type of wallet are related to each other and can be generated again if one has the original seed. There are a number of different key derivation methods used in deterministic wallets. The most commonly used derivation method uses a tree-like structure and is known as a *hierarchical deterministic* or *HD* wallet.

Deterministic wallets are initialized from a seed. To make them easier to use, seeds are encoded as English words.

## Nondeterministic (Random) Wallets

In the first Bitcoin wallet (now called Bitcoin Core), wallets were collections of randomly generated private keys. Such wallets are being replaced with deterministic wallets because they are cumbersome to manage, back up, and import.
The disadvantage of random keys is that every newly generated key must be copied into the backup, so the wallet must be backed up frequently.
Each key must be backed up, or the funds it controls are irrevocably lost if the wallet becomes inaccessible.

## Deterministic (Seeded) Wallets

Deterministic or seeded wallets are wallets that contain private keys that are all derived from a common seed, through the use of a one-way hash function.
The seed is a randomly generated number that is combined with other data, such as an index number or "chain code", to derive the private keys.
The seed is sufficient to recover all the derived keys, and therefore a single backup at creation time is sufficient. The seed is also sufficient for a wallet export and import, allowing an easy migration of all the user's keys between wallet implementations.
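The idea of deriving keys from a seed and an index can be sketched in a few lines. This is an illustration only, not a standardized scheme: the function name `derive_key` and the choice of hashing the seed concatenated with a 4-byte index are assumptions made for this example. Real wallets use more elaborate derivation functions.

```python
import hashlib
import secrets

def derive_key(seed: bytes, index: int) -> bytes:
    """Hypothetical derivation: SHA256 over the seed followed by a 4-byte index."""
    return hashlib.sha256(seed + index.to_bytes(4, "big")).digest()

seed = secrets.token_bytes(32)  # the only value that needs to be backed up
keys = [derive_key(seed, i) for i in range(3)]

# Restoring from the same seed regenerates exactly the same keys
restored = [derive_key(seed, i) for i in range(3)]
print(restored == keys)
```

Because the hash function is one-way, knowing one derived key does not reveal the seed, while anyone holding the seed can regenerate every key.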

### HD Wallets (BIP-32/BIP-44)

The most advanced form of deterministic wallet is the HD wallet defined by the BIP-32 standard. HD wallets contain keys derived in a tree structure, such that a parent key can derive a sequence of child keys, each of which can derive a sequence of grandchild keys, and so on.
HD wallets offer two major advantages over random (nondeterministic) keys.
1. The tree structure can be used to express additional organizational meaning, such as when one branch of subkeys is used to receive incoming payments and a different branch is used to receive change from outgoing payments.
2. Users can create a sequence of public keys without having access to the corresponding private keys. This allows HD wallets to be used on an insecure server or in a receive-only capacity, issuing a different public key for each transaction.

### BIP-39 Mnemonic Codes

HD wallets are a powerful mechanism for managing many keys and addresses. They are even more useful when combined with a standardized way of creating seeds from a sequence of words that is easy to transcribe, export, and import across wallets. Such a sequence is known as a *mnemonic*, and the standard is defined by BIP-39.

Mnemonic words are generated automatically by the wallet using the standardized process. The wallet starts from a source of entropy, adds a checksum, and then maps the result to a word list:
1. Create a random sequence (entropy) of 128 to 256 bits.
2. Create a checksum of the random sequence by taking the first (entropy length / 32) bits of its SHA256 hash, that is, 4 to 8 bits.
3. Add the checksum to the end of the random sequence.
4. Split the result into 11-bit segments.
5. Map each 11-bit value to a word from the predefined dictionary of 2048 words.
6. The mnemonic code is the sequence of words.
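The steps above can be sketched without the word list by stopping at the 11-bit indexes. The function name `entropy_to_indices` is an assumption made for this example. It relies on the checksum having one bit per 32 bits of entropy, so 128 bits of entropy give 4 checksum bits and 12 words, and 256 bits give 8 checksum bits and 24 words.

```python
import hashlib

def entropy_to_indices(entropy: bytes) -> list:
    """Return the word-list indexes (0..2047) that encode the given entropy."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128, 160, 192, 224, or 256 bits")
    cs_bits = ent_bits // 32  # one checksum bit per 32 bits of entropy
    digest = hashlib.sha256(entropy).digest()
    # Append the first cs_bits bits of the hash to the end of the entropy
    value = (int.from_bytes(entropy, "big") << cs_bits) | (digest[0] >> (8 - cs_bits))
    n_words = (ent_bits + cs_bits) // 11
    # Read the combined value as n_words groups of 11 bits, left to right
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]

# All-zero entropy is used here for illustration only; never use it for a real wallet
indices = entropy_to_indices(bytes(16))
print(len(indices), indices)
# → 12 [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]
```

Only the last index is nonzero because it carries the 4 checksum bits. Looking up each index in the 2048-word list gives the mnemonic words.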

The mnemonic is used to derive a longer (512-bit) seed through the use of the key-stretching function PBKDF2. The resulting seed is then used to build a deterministic wallet and derive its keys:

7. The first parameter of the PBKDF2 function is the *mnemonic* from the previous step.
8. The second parameter to the PBKDF2 function is a *salt*. The salt is composed of the string constant "mnemonic" concatenated with an optional user-supplied passphrase string.
9. PBKDF2 stretches the mnemonic and salt parameters using 2048 rounds of hashing with the HMAC-SHA512 algorithm, producing a 512-bit value as its final output. This is the seed.
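Steps 7 to 9 map directly onto `hashlib.pbkdf2_hmac` from the Python standard library. The following is a minimal sketch. The function name `to_seed` is an assumption made for this example, and the NFKD normalization follows what BIP-39 specifies for both inputs.

```python
import hashlib
import unicodedata

def to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic sentence into a 64-byte (512-bit) seed."""
    # BIP-39 specifies NFKD normalization for the mnemonic and the passphrase
    mnemonic = unicodedata.normalize("NFKD", mnemonic)
    passphrase = unicodedata.normalize("NFKD", passphrase)
    # The salt is the constant "mnemonic" followed by the optional passphrase
    salt = "mnemonic" + passphrase
    return hashlib.pbkdf2_hmac(
        "sha512", mnemonic.encode("utf-8"), salt.encode("utf-8"), 2048
    )

words = "hurt middle gorilla good mystery short degree swap school surround raven junk"
seed = to_seed(words)
print(len(seed))                             # seed length in bytes
print(to_seed(words, "passphrase") != seed)  # a passphrase yields a different seed
```

The same mnemonic always produces the same seed, while changing the passphrase produces an unrelated one.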

The BIP-39 standard allows the use of an optional passphrase in the derivation of the seed.
The passphrase creates two features:
1. A second factor that makes a mnemonic useless on its own, protecting backups from compromise by a thief.
2. A form of plausible deniability, where a chosen passphrase leads to a wallet with a small amount of funds used to distract an attacker from the real wallet that contains the majority of the funds.



In [130]:
# Import libraries
import hashlib
import hmac
import itertools
import os
import secrets
import unicodedata
from typing import AnyStr, List, TypeVar, Union
import sys

In [131]:
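# Base58 encoding
def b58encode(v: bytes) -> str:
    alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

    # Interpret the bytes as one big-endian integer
    acc = int.from_bytes(v, byteorder="big")

    # Repeatedly divide by 58, prepending the digit for each remainder
    string = ""
    while acc:
        acc, idx = divmod(acc, 58)
        string = alphabet[idx] + string

    # Each leading zero byte is encoded as a leading "1"
    n_pad = len(v) - len(v.lstrip(b"\x00"))
    return alphabet[0] * n_pad + string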

# String normalization
def normalize_string(txt: AnyStr) -> str:
    if isinstance(txt, bytes):
        utxt = txt.decode("utf8")
    elif isinstance(txt, str):
        utxt = txt
    else:
        raise TypeError("String value expected")
    return unicodedata.normalize("NFKD", utxt)

In [132]:
# The 2048-word list
wordlist = []

# Expects the word list, one word per line, in the working directory
path = os.path.join(os.getcwd(), 'wordlist.txt')
if os.path.isfile(path):
    with open(path, "r", encoding="utf-8") as f:
        wordlist = [w.strip() for w in f.readlines()]
    if len(wordlist) != 2048:
        print("Error: word list must contain exactly 2048 words")
else:
    print("File not detected")


print(wordlist)


['abandon', 'ability', 'able', 'about', 'above', 'absent', 'absorb', 'abstract', 'absurd', 'abuse', 'access', 'accident', 'account', 'accuse', 'achieve', 'acid', 'acoustic', 'acquire', 'across', 'act', 'action', 'actor', 'actress', 'actual', 'adapt', 'add', 'addict', 'address', 'adjust', 'admit', 'adult', 'advance', 'advice', 'aerobic', 'affair', 'afford', 'afraid', 'again', 'age', 'agent', 'agree', 'ahead', 'aim', 'air', 'airport', 'aisle', 'alarm', 'album', 'alcohol', 'alert', 'alien', 'all', 'alley', 'allow', 'almost', 'alone', 'alpha', 'already', 'also', 'alter', 'always', 'amateur', 'amazing', 'among', 'amount', 'amused', 'analyst', 'anchor', 'ancient', 'anger', 'angle', 'angry', 'animal', 'ankle', 'announce', 'annual', 'another', 'answer', 'antenna', 'antique', 'anxiety', 'any', 'apart', 'apology', 'appear', 'apple', 'approve', 'april', 'arch', 'arctic', 'area', 'arena', 'argue', 'arm', 'armed', 'armor', 'army', 'around', 'arrange', 'arrest', 'arrive', 'arrow', 'art', 'artefact',

In [133]:
delimiter = " "

def to_entropy(words: Union[List[str], str]) -> bytearray:
    if not isinstance(words, list):
        words = words.split(" ")
    # Valid mnemonics have 12, 15, 18, 21, or 24 words
    if len(words) not in (12, 15, 18, 21, 24):
        raise ValueError("Number of words not correct")
    # Concatenate the 11-bit index of every word into one bit array
    concat_len_bits = len(words) * 11
    concat_bits = [False] * concat_len_bits
    for wordindex, word in enumerate(words):
        try:
            ndx = wordlist.index(word)
        except ValueError:
            raise LookupError(f'Unable to find {word} in word list') from None
        for ii in range(11):
            concat_bits[(wordindex * 11) + ii] = ndx & (1 << (10 - ii)) != 0
    # One checksum bit for every 32 bits of entropy
    checksum_length_bits = concat_len_bits // 33
    entropy_length_bits = concat_len_bits - checksum_length_bits
    # Pack the entropy bits back into bytes
    entropy = bytearray(entropy_length_bits // 8)
    for ii in range(len(entropy)):
        for jj in range(8):
            if concat_bits[(ii*8)+jj]:
                entropy[ii] |= 1 << (7 - jj)
    # Recompute the checksum and compare it with the bits in the mnemonic
    hash_bytes = hashlib.sha256(entropy).digest()
    hash_bits = list(itertools.chain.from_iterable(
        [c & (1 << (7 - i)) != 0 for i in range(8)] for c in hash_bytes
    ))
    for i in range(checksum_length_bits):
        if concat_bits[entropy_length_bits + i] != hash_bits[i]:
            raise ValueError("Failed checksum")
    return entropy

def to_mnemonic(data: bytes) -> str:
    if len(data) not in [16, 20, 24, 28, 32]:
        raise ValueError("Data length not correct")
    h = hashlib.sha256(data).hexdigest()
    # Entropy bits followed by the first (entropy length / 32) bits of the hash
    b = (
        bin(int.from_bytes(data, byteorder="big"))[2:].zfill(len(data)*8)
        + bin(int(h, 16))[2:].zfill(256)[: len(data)*8//32]
    )
    # Each 11-bit group selects one word from the list
    result = []
    for i in range(len(b)//11):
        idx = int(b[i*11:(i+1)*11], 2)
        result.append(wordlist[idx])
    return delimiter.join(result)


In [134]:
def generate(strength: int = 128) -> str:
    # strength is the entropy size in bits
    if strength not in [128, 160, 192, 224, 256]:
        raise ValueError("Invalid strength value")
    return to_mnemonic(secrets.token_bytes(strength // 8))


In [135]:
# BIP-39 mnemonic code (the word sequence, not yet the seed)
mnemonic = generate()

print(f"Mnemonic code: {mnemonic}")

Mnemonic code: hurt middle gorilla good mystery short degree swap school surround raven junk


### From the seed to the HD Wallet

HD wallets are created from a single root seed, which is a 128-, 256-, or 512-bit random number.
Most commonly, the seed is generated from a *mnemonic*, as described in the previous section.

Every key in the HD wallet is deterministically derived from this root seed, which makes it possible to recreate the entire HD wallet from that seed in any compatible HD wallet.

The process to create the master keys and master chain code for an HD wallet is composed of the following steps:

1. The root seed is input into the HMAC-SHA512 algorithm, and the resulting hash is used to create a *master private key* (the left 256 bits) and a *master chain code* (the right 256 bits).
2. The master private key then generates a corresponding master public key using the normal elliptic curve multiplication process, as described in the introduction section.
3. The chain code is used to introduce entropy in the function that creates the child keys from parent keys.

In [None]:
def to_hd_master_key(seed: bytes) -> str:
    if len(seed) != 64:
        raise ValueError("Provided seed should have length of 64")
    # HMAC-SHA512 of the seed, keyed with the constant "Bitcoin seed"
    digest = hmac.new(b"Bitcoin seed", seed, digestmod=hashlib.sha512).digest()
    xprv = b"\x04\x88\xad\xe4"  # Version bytes for a mainnet extended private key
    xprv += b"\x00" * 9  # Depth (1 byte), parent fingerprint (4), child number (4)
    xprv += digest[32:]  # Chain code: right 256 bits
    xprv += b"\x00" + digest[:32]  # Master private key: left 256 bits
    # Append the first 4 bytes of the double SHA256 as a checksum
    checksum = hashlib.sha256(hashlib.sha256(xprv).digest()).digest()[:4]
    return b58encode(xprv + checksum)

HD wallets use a *child key derivation* (CKD) function to derive child keys from parent keys.
The function combines:
1. A parent private or public key (ECDSA compressed key)
2. A seed called a chain code (256 bits)
3. An index number (32 bits)

The chain code is used to introduce deterministic random data to the process, so that knowing the index and a child key is not sufficient to derive other child keys. Knowing a child key does not make it possible to find its siblings, unless the chain code is also available.
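The following sketch shows how the three inputs are combined, and why a child public key can be derived from the parent public key alone. It is a simplified illustration under stated assumptions: the names `point_add`, `point_mul`, `ser_p`, `ckd_priv`, and `ckd_pub` are chosen for this example, only indexes below 2**31 are handled, and the pure-Python secp256k1 arithmetic is slow and not constant-time, so it is unsuitable for real keys. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import hashlib
import hmac

# secp256k1 parameters: field prime, group order, generator point
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def point_mul(k, point):
    """Multiply a point by the scalar k using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def ser_p(point):
    """Compressed public key: a parity byte (02 or 03) followed by x."""
    return bytes([2 + (point[1] & 1)]) + point[0].to_bytes(32, "big")

def _tweak(K_par, c_par, index):
    # HMAC-SHA512 keyed with the chain code over (public key || index)
    if not 0 <= index < 2**31:
        raise ValueError("this sketch only handles indexes below 2**31")
    I = hmac.new(c_par, ser_p(K_par) + index.to_bytes(4, "big"), hashlib.sha512).digest()
    il = int.from_bytes(I[:32], "big")
    if il >= N:
        raise ValueError("invalid child key; use the next index")
    return il, I[32:]  # left half tweaks the key, right half is the child chain code

def ckd_priv(k_par, c_par, index):
    """Parent private key -> (child private key, child chain code)."""
    il, c_child = _tweak(point_mul(k_par, G), c_par, index)
    return (il + k_par) % N, c_child

def ckd_pub(K_par, c_par, index):
    """Parent public key -> (child public key, child chain code)."""
    il, c_child = _tweak(K_par, c_par, index)
    return point_add(point_mul(il, G), K_par), c_child

# Demo values only: fixed, publicly known inputs must never protect real funds
k_par = int.from_bytes(hashlib.sha256(b"example parent key").digest(), "big") % N
c_par = hashlib.sha256(b"example chain code").digest()
K_par = point_mul(k_par, G)

k_child, c_child = ckd_priv(k_par, c_par, 0)
K_child, c_child_pub = ckd_pub(K_par, c_par, 0)
print(point_mul(k_child, G) == K_child and c_child == c_child_pub)
```

Both paths arrive at the same child public key, which is what lets a receive-only server generate addresses without holding any private key. Without `c_par`, the HMAC output cannot be computed, so a child key alone does not reveal its siblings.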
