# Create Mnemonic from Entropy
## 1. Create Entropy
### What is Entropy?
Entropy is random data that serves as the foundation of a secure mnemonic phrase. In BIP-39 it is a bit string whose length must be 128, 160, 192, 224, or 256 bits (always a multiple of 32). More entropy bits give an attacker a larger space to search and produce a longer mnemonic, from 12 words at 128 bits up to 24 words at 256 bits.

In [1]:
import secrets

# Define strength (in bits) for the entropy
strength = 128  # You can set this to 128, 160, 192, 224, or 256
strength_bytes = strength // 8

# Generate entropy using the secrets module
entropy = secrets.token_bytes(strength_bytes)

# Print the generated entropy in hexadecimal format
print(f'Created entropy: {entropy.hex()} ({len(entropy)} bytes, len(hex): {len(entropy.hex())})')

Created entropy: 8e49ef46fed1e261ce2e7e2ea5e216b2 (16 bytes, len(hex): 32)
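
The relationship between entropy length, checksum length, and word count can be tabulated with a short sketch. The names `ent`, `cs`, `words`, and `word_counts` are illustrative and not part of the notebook; the arithmetic assumes the BIP-39 rule of one checksum bit per 32 bits of entropy and 11 bits per word.

```python
# Sketch: map each allowed entropy size to its checksum length and word count.
word_counts = {}
for ent in (128, 160, 192, 224, 256):
    cs = ent // 32            # checksum length in bits
    words = (ent + cs) // 11  # each mnemonic word encodes 11 bits
    word_counts[ent] = words
    print(f"entropy: {ent} bits, checksum: {cs} bits, mnemonic: {words} words")
```

For the 128-bit strength used in this notebook, this gives a 4-bit checksum and a 12-word mnemonic.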


## 2. Create Mnemonic
### 2-1. Load wordlist
#### What is a Wordlist?
The wordlist is a predefined list of 2048 unique words specified by the BIP-39 standard. It is used to convert binary data (entropy + checksum) into a human-readable mnemonic phrase. Because 2^11 = 2048, each word corresponds to exactly one 11-bit number, which is the word's index in the list.

In [2]:
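# Load the BIP-39 English wordlist (one word per line) from a local file,
# skipping any blank lines so that list positions match the word indexes.
with open("english.txt", "r", encoding="utf-8") as f:
    wordlist = [w.strip() for w in f if w.strip()]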

print('Successfully loaded wordlist')
wordlist

Successfully loaded wordlist


['abandon',
 'ability',
 'able',
 'about',
 'above',
 'absent',
 'absorb',
 'abstract',
 'absurd',
 'abuse',
 'access',
 'accident',
 'account',
 'accuse',
 'achieve',
 'acid',
 'acoustic',
 'acquire',
 'across',
 'act',
 'action',
 'actor',
 'actress',
 'actual',
 'adapt',
 'add',
 'addict',
 'address',
 'adjust',
 'admit',
 'adult',
 'advance',
 'advice',
 'aerobic',
 'affair',
 'afford',
 'afraid',
 'again',
 'age',
 'agent',
 'agree',
 'ahead',
 'aim',
 'air',
 'airport',
 'aisle',
 'alarm',
 'album',
 'alcohol',
 'alert',
 'alien',
 'all',
 'alley',
 'allow',
 'almost',
 'alone',
 'alpha',
 'already',
 'also',
 'alter',
 'always',
 'amateur',
 'amazing',
 'among',
 'amount',
 'amused',
 'analyst',
 'anchor',
 'ancient',
 'anger',
 'angle',
 'angry',
 'animal',
 'ankle',
 'announce',
 'annual',
 'another',
 'answer',
 'antenna',
 'antique',
 'anxiety',
 'any',
 'apart',
 'apology',
 'appear',
 'apple',
 'approve',
 'april',
 'arch',
 'arctic',
 'area',
 'arena',
 'argue',
 'arm',
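
Before relying on a wordlist file, it can help to confirm that it has the shape the mapping expects. The following is a minimal sketch; `check_wordlist` is a hypothetical helper that is not part of the notebook, and the `sample` list is synthetic stand-in data used only so the block runs on its own.

```python
def check_wordlist(words):
    """Return a list of problems found in a candidate wordlist (hypothetical helper)."""
    problems = []
    if len(words) != 2048:  # 2**11 entries, one per 11-bit index
        problems.append(f"expected 2048 words, found {len(words)}")
    if len(set(words)) != len(words):
        problems.append("wordlist contains duplicate entries")
    if any(not w or w != w.strip() for w in words):
        problems.append("wordlist contains empty or whitespace-padded entries")
    return problems

# Synthetic stand-in for a real wordlist: 2048 distinct placeholder tokens.
sample = [f"w{i:04d}" for i in range(2048)]
print(check_wordlist(sample))       # []
print(check_wordlist(sample[:10]))  # ['expected 2048 words, found 10']
```

In the notebook, calling `check_wordlist(wordlist)` after loading `english.txt` should return an empty list if the file is intact.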
 

### 2-2. Calculate hash of entropy
#### Why Calculate a Hash?
The SHA-256 hash of the entropy supplies the checksum: its leading bits are appended to the entropy before the result is split into words. When the phrase is entered again later, software can recompute the checksum from the entropy bits and detect most mistyped or misplaced words.

In [3]:
import hashlib

entropy_hash = hashlib.sha256(entropy).hexdigest()
print(f'Calculated sha256 hash of entropy: {entropy_hash}')

Calculated sha256 hash of entropy: 01e0f0b2e7e5281ef80259039f724bd1ddff8c69cb587dac896565753aba8a9e
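
A short checksum catches most errors but not all of them: with 4 checksum bits, a random change has roughly a 1 in 16 chance of producing the same checksum. The sketch below illustrates this by flipping each bit of a freshly generated sample entropy in turn. The `checksum` helper and the variable names are assumptions made for this example, and the count it prints varies from run to run.

```python
import hashlib
import secrets

def checksum(entropy: bytes) -> int:
    """Leading len(entropy) * 8 // 32 bits of SHA-256(entropy), as an integer."""
    cs_len = len(entropy) * 8 // 32
    digest = hashlib.sha256(entropy).digest()
    return int.from_bytes(digest, "big") >> (256 - cs_len)

sample = secrets.token_bytes(16)  # 128 bits of throwaway entropy
original = checksum(sample)

undetected = 0
for bit in range(len(sample) * 8):
    flipped = bytearray(sample)
    flipped[bit // 8] ^= 1 << (bit % 8)  # flip exactly one bit
    if checksum(bytes(flipped)) == original:
        undetected += 1

print(f"{undetected} of {len(sample) * 8} single-bit flips kept the same 4-bit checksum")
```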


### 2-3. Create Entropy bits + Checksum bits for Creating Mnemonic
#### What are Checksum Bits?
Checksum bits are the leading bits of the SHA-256 hash of the entropy. Their count is the entropy length in bits divided by 32, so 128 bits of entropy yield a 4-bit checksum and 256 bits yield an 8-bit checksum. These bits are appended to the entropy bits to form the full bit string used for mnemonic generation.

In [4]:
# Convert the hexadecimal SHA-256 digest from the previous cell into a 256-bit binary string.
# zfill(256) restores any leading zero bits that int() would otherwise drop.
entropy_hash_bits = bin(int(entropy_hash,16))[2:].zfill(256)
print(f'Entropy hash bits: {entropy_hash_bits} ({len(entropy_hash_bits)} bits)')

# Extract the first (entropy length / 32) bits of the hash to use as the checksum.
# This ensures the checksum length is proportional to the entropy length.
first_bits = len(entropy) * 8 // 32
print(f'Calculate checksum bits... | len_entropy(bits): {len(entropy) * 8}, len_entropy // 32: {first_bits}')
checksum_bits = entropy_hash_bits[:first_bits]
print(f'Checksum bits: {checksum_bits} (len: {len(checksum_bits)})')

# Convert the entropy to binary bits.
# The entropy is converted from bytes to an integer, then to a binary string.
# The length is padded to match the number of bits in the entropy.
entropy_bits = bin(int.from_bytes(entropy, byteorder="big"))[2:].zfill(len(entropy) * 8)
print(f'Entropy bits: {entropy_bits} (len: {len(entropy_bits)})')

# Concatenate the entropy bits with the checksum bits.
# This forms the final bit sequence that will be used to generate the mnemonic words.
bits = entropy_bits + checksum_bits
print(f'[entropy_bits|checksum_bits]: {bits} (len: {len(bits)})')

Entropy hash bits: 0000000111100000111100001011001011100111111001010010100000011110111110000000001001011001000000111001111101110010010010111101000111011101111111111000110001101001110010110101100001111101101011001000100101100101011001010111010100111010101110101000101010011110 (256 bits)
Calculate checksum bits... | len_entropy(bits): 128, len_entropy // 32: 4
Checksum bits: 0000 (len: 4)
Entropy bits: 10001110010010011110111101000110111111101101000111100010011000011100111000101110011111100010111010100101111000100001011010110010 (len: 128)
[entropy_bits|checksum_bits]: 100011100100100111101111010001101111111011010001111000100110000111001110001011100111111000101110101001011110001000010110101100100000 (len: 132)
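
The same bit string can also be built with integer arithmetic instead of string slicing. This is an alternative sketch, not the notebook's method; `entropy_with_checksum` is a hypothetical helper. It is exercised on 16 zero bytes, whose BIP-39 mnemonic is the well-known "abandon ... about" phrase, so the trailing checksum bits are `0011` (index 3, "about").

```python
import hashlib

def entropy_with_checksum(entropy: bytes) -> str:
    """Return the entropy bits followed by the checksum bits as a '0'/'1' string."""
    ent_len = len(entropy) * 8
    cs_len = ent_len // 32
    digest = hashlib.sha256(entropy).digest()
    # Keep only the top cs_len bits of the 256-bit digest.
    cs_value = int.from_bytes(digest, "big") >> (256 - cs_len)
    # Shift the entropy left to make room, then OR in the checksum.
    combined = (int.from_bytes(entropy, "big") << cs_len) | cs_value
    # Zero-pad to the full width so leading zero bits are preserved.
    return format(combined, f"0{ent_len + cs_len}b")

bits_zero = entropy_with_checksum(bytes(16))
print(bits_zero[-4:], len(bits_zero))  # 0011 132
```

Working on integers avoids converting the digest to a hex string and back, at the cost of being slightly less easy to inspect step by step.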


### 2-4. Create mnemonic from entropy + checksum bits 
#### How is the Mnemonic Phrase Created?
- **Split the combined bitstring**: The combined entropy and checksum bitstring is divided into fixed-length 11-bit segments.
- **Convert binary to integer**: Each 11-bit segment is interpreted as a binary number and converted to an integer using `int(segment, 2)`. This integer serves as an index for the BIP-39 wordlist.
- **Map integers to words**: The integers derived from the 11-bit segments are used to select words from the wordlist. These words collectively form the mnemonic phrase.
- **Construct the mnemonic phrase**: The selected words, kept in their original order, form the human-readable mnemonic phrase.

In [5]:
result = []
for i in range(len(bits) // 11):
    idx = int(bits[i * 11 : (i + 1) * 11], 2)
    print(f'Calculated idx: {idx}')
    word = wordlist[idx]
    print(f'Word at {idx}: {word}')
    result.append(word)

print(f'Successfully created mnemonic: {result} ({len(result)} words, {len(result)*11} bits)')

Calculated idx: 1138
Word at 1138: mixture
Calculated idx: 635
Word at 635: exhaust
Calculated idx: 1677
Word at 1677: spider
Calculated idx: 2029
Word at 2029: world
Calculated idx: 241
Word at 241: bullet
Calculated idx: 391
Word at 391: couch
Calculated idx: 453
Word at 453: december
Calculated idx: 1662
Word at 1662: sound
Calculated idx: 373
Word at 373: concert
Calculated idx: 376
Word at 376: congress
Calculated idx: 1069
Word at 1069: mad
Calculated idx: 800
Word at 800: goat
Successfully created mnemonic: ['mixture', 'exhaust', 'spider', 'world', 'bullet', 'couch', 'december', 'sound', 'concert', 'congress', 'mad', 'goat'] (12 words, 132 bits)
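
The splitting and lookup steps can be sketched on a known case. `bits_to_indices` is a hypothetical helper, and `first_words` holds only the first four entries of the wordlist shown earlier, which is all this example needs. The input is the bit string for 128 zero bits of entropy plus its checksum `0011`.

```python
def bits_to_indices(bits: str) -> list:
    """Split a '0'/'1' string into 11-bit groups and return them as integers."""
    if len(bits) % 11 != 0:
        raise ValueError("bit string length must be a multiple of 11")
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]

# 128 zero entropy bits followed by the 4-bit checksum 0011.
zero_bits = "0" * 128 + "0011"
indices = bits_to_indices(zero_bits)
print(indices)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3]

# Only the first four wordlist entries are needed for this input.
first_words = ["abandon", "ability", "able", "about"]
phrase = " ".join(first_words[i] for i in indices)
print(phrase)
```

The printed phrase is "abandon" eleven times followed by "about".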


## 3. Validate Mnemonic
### What is Mnemonic Validation?
Mnemonic validation ensures that a given mnemonic phrase is valid according to the BIP-39 standard. This involves checking:
1. The mnemonic length is appropriate (12, 15, 18, 21, or 24 words).
2. Each word exists in the BIP-39 wordlist.
3. The checksum bits embedded at the end of the mnemonic match the checksum recomputed from its entropy bits.

In [6]:
def validate_mnemonic(mnemonic_words: list, wordlist: list) -> bool:
    """
    Validates a BIP-39 mnemonic phrase by checking:
    1. The length of the mnemonic.
    2. Whether all words exist in the valid wordlist.
    3. Whether the checksum is correct.

    :param mnemonic_words: List of mnemonic words to validate.
    :param wordlist: List of valid words from the BIP-39 wordlist.
    :return: True if the mnemonic is valid, False otherwise.
    """

    # Check if the mnemonic length is valid (must be 12, 15, 18, 21, or 24 words)
    if len(mnemonic_words) not in [12, 15, 18, 21, 24]:
        print("Invalid mnemonic length. Expected 12, 15, 18, 21, or 24 words.")
        return False

    # Convert words to binary indexes
    try:
        binary_indexes = [format(wordlist.index(word), "011b") for word in mnemonic_words]
        full_binary_string = "".join(binary_indexes)  # Concatenate all binary values
    except ValueError:
        print("Mnemonic contains invalid words that are not in the BIP-39 wordlist.")
        return False

    # Determine bit lengths
    total_bits = len(full_binary_string)
    entropy_bits_length = (total_bits * 32) // 33  # Extract entropy length
    checksum_bits_length = total_bits - entropy_bits_length  # Extract checksum length
    print(f'total_bits: {total_bits}, entropy_bits_length: {entropy_bits_length}, checksum_bits_length: {checksum_bits_length}')

    # Split into entropy bits and checksum bits
    entropy_bits = full_binary_string[:entropy_bits_length]
    checksum_bits = full_binary_string[entropy_bits_length:]

    # Recalculate checksum from entropy
    entropy_bytes = int(entropy_bits, 2).to_bytes(entropy_bits_length // 8, byteorder="big")
    calculated_checksum = bin(int(hashlib.sha256(entropy_bytes).hexdigest(), 16))[2:].zfill(256)[:checksum_bits_length]

    # Validate the checksum
    if checksum_bits == calculated_checksum:
        print("Mnemonic is valid")
        return True
    else:
        print("Invalid checksum")
        return False

In [7]:
# Example Usage
print(f'Validate mnemonic we created before: {result}')
is_valid = validate_mnemonic(result, wordlist)
if is_valid:
    print('The mnemonic we created is valid')

Validate mnemonic we created before: ['mixture', 'exhaust', 'spider', 'world', 'bullet', 'couch', 'december', 'sound', 'concert', 'congress', 'mad', 'goat']
total_bits: 132, entropy_bits_length: 128, checksum_bits_length: 4
Mnemonic is valid
The mnemonic we created is valid
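
Validation can also be used to recover the original entropy, since a valid mnemonic encodes it directly. The sketch below works on word indexes rather than words so that it runs without a wordlist file. `indices_to_entropy` is a hypothetical helper, and it assumes the number of indexes is already one of the valid lengths (12, 15, 18, 21, or 24).

```python
import hashlib

def indices_to_entropy(indices: list) -> bytes:
    """Rebuild the entropy bytes from word indexes, raising on a bad checksum."""
    bits = "".join(format(i, "011b") for i in indices)
    ent_len = len(bits) * 32 // 33
    cs_len = len(bits) - ent_len
    entropy = int(bits[:ent_len], 2).to_bytes(ent_len // 8, "big")
    # Recompute the checksum from the recovered entropy and compare.
    digest = hashlib.sha256(entropy).digest()
    expected = format(int.from_bytes(digest, "big") >> (256 - cs_len), f"0{cs_len}b")
    if bits[ent_len:] != expected:
        raise ValueError("checksum mismatch")
    return entropy

# Indexes of "abandon" x 11 + "about": decodes to 16 zero bytes.
print(indices_to_entropy([0] * 11 + [3]).hex())

# Twelve copies of "abandon" carry the wrong checksum bits and are rejected.
try:
    indices_to_entropy([0] * 12)
except ValueError as err:
    print(f"rejected: {err}")
```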
