In [1]:
# Useful functions (covered in previous sections of the course)
import hashlib

def hash256(data: bytes):
    '''Two rounds of SHA256 (aka Hash256)'''
    hash_1 = hashlib.sha256(data).digest()
    hash_2 = hashlib.sha256(hash_1).digest()
    return hash_2

def hash160(data: bytes):
    '''sha256 followed by ripemd160'''
    hash_1 = hashlib.sha256(data).digest()
    hash_2 = hashlib.new('ripemd160', hash_1).digest()
    return hash_2

# Bitcoin scriptPubKey formats and addresses

Here we will cover the different scriptPubKey formats as well as how they can be encoded and decoded.

## Recommended reading
- TODO _(suggestions/PRs welcome)_

## Introduction
When Alice sends Bob bitcoin, Alice does so by creating a new transaction where one (or more) of the outputs has a scriptPubKey (aka 'locking script') specified by Bob. What makes the output effectively belong to Bob is that only he knows how to create a scriptSig that will unlock the locking script.

If Bob were to send Alice the scriptPubKey as raw bytes, any error in communication could result in Alice sending the bitcoin to the wrong scriptPubKey, making the bitcoin impossible to recover.

To help prevent this problem, there are common address formats for encoding scriptPubKeys. These addresses are designed to be easier to read and contain a checksum to help with error detection.
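The error-detection idea can be sketched independently of any particular address encoding. The example below appends the first four bytes of a double SHA256 to a payload and recomputes it on receipt, which is the same checksum construction used for base58 addresses later in this chapter. The helper names `add_checksum` and `verify_checksum` are illustrative and are not part of the course code.

```python
import hashlib

def hash256(data: bytes) -> bytes:
    '''Two rounds of SHA256'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def add_checksum(payload: bytes) -> bytes:
    '''Append the first 4 bytes of hash256(payload) as a checksum'''
    return payload + hash256(payload)[:4]

def verify_checksum(data: bytes) -> bool:
    '''Split off the last 4 bytes and compare them to a freshly computed checksum'''
    payload, checksum = data[:-4], data[-4:]
    return hash256(payload)[:4] == checksum

payload = bytes.fromhex("6f531260aa2a199e228c537dfa42c82bea2c7c1f4d")
good = add_checksum(payload)

# Simulate a transmission error by flipping one bit of the first byte
corrupted = bytes([good[0] ^ 0x01]) + good[1:]

print(verify_checksum(good))
print(verify_checksum(corrupted))
```

A 4 byte checksum does not make errors impossible to miss, but a random corruption has only about a 1 in 2^32 chance of still passing the check.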

Bitcoin uses three address types (base58, bech32, bech32m) that cover the standard scriptPubKey formats:
- Base58
    - p2pkh
    - p2sh
    - p2sh-p2wpkh
- Bech32
    - p2wpkh
    - p2wsh
- Bech32m
    - p2tr

### Address prefixes
These address formats not only encode the scriptPubKey for the output, but they also encode a prefix that specifies which network (mainnet/testnet) the output is intended for. Forks of bitcoin (e.g. litecoin or zcash) use the same mechanism with their own prefixes to indicate which cryptocurrency the output is intended for. If a wallet implementation doesn't check that the prefix matches the type of transaction being created, the user may end up creating a transaction for a different cryptocurrency than the one intended.

Here are some commonly used bitcoin address prefixes:
- Mainnet
    - P2PKH - `0x00`
    - P2SH  - `0x05`
- Testnet
    - P2PKH - `0x6f`
    - P2SH  - `0xc4`



Bech32 and bech32m addresses use a human-readable prefix instead of a version byte:

    if network == "testnet":
        prefix = 'tb'
    elif network == "regtest":
        prefix = 'bcrt'
    elif network == "mainnet":
        prefix = 'bc'

A full list of bitcoin address prefixes can be found here: https://en.bitcoin.it/wiki/List_of_address_prefixes
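As a rough sketch, the prefixes used in this chapter can be collected into lookup tables, which makes it easy to check a decoded prefix against the network a wallet expects. The table and function names here are hypothetical. Regtest shares its base58 version bytes with testnet, so a version byte alone cannot distinguish the two.

```python
# Base58 version bytes per network (regtest reuses the testnet values)
BASE58_VERSION_BYTES = {
    "mainnet": {"p2pkh": 0x00, "p2sh": 0x05},
    "testnet": {"p2pkh": 0x6f, "p2sh": 0xc4},
    "regtest": {"p2pkh": 0x6f, "p2sh": 0xc4},
}

# Human-readable parts used by bech32 and bech32m addresses
BECH32_HRP = {"mainnet": "bc", "testnet": "tb", "regtest": "bcrt"}

def identify_version_byte(version: int) -> list:
    '''Return every (network, script_type) pair that uses this base58 version byte'''
    return [
        (network, script_type)
        for network, prefixes in BASE58_VERSION_BYTES.items()
        for script_type, value in prefixes.items()
        if value == version
    ]

print(identify_version_byte(0x6f))  # [('testnet', 'p2pkh'), ('regtest', 'p2pkh')]
print(identify_version_byte(0x05))  # [('mainnet', 'p2sh')]
```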

## Base58

TODO - cover base58 encoding
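Until this section is written, here is a minimal pure-Python sketch of base58 encoding and decoding, assuming the usual bitcoin alphabet (digits and letters without `0`, `O`, `I` and `l`). The functions `b58encode` and `b58decode` below are illustrative stand-ins and are not the implementation of the `base58` package used in the rest of this notebook.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    # Treat the bytes as one big-endian integer and repeatedly divide by 58
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Leading zero bytes carry no numeric value, so each one becomes a leading '1'
    n_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * n_zeros + encoded

def b58decode(s: str) -> bytes:
    # Rebuild the integer from base 58 digits (raises ValueError on invalid characters)
    num = 0
    for ch in s:
        num = num * 58 + B58_ALPHABET.index(ch)
    n_zeros = len(s) - len(s.lstrip("1"))
    return b"\x00" * n_zeros + num.to_bytes((num.bit_length() + 7) // 8, "big")

raw = b58decode("mo6CPsdW8EsnWdmSSCrQ6225VVDtpMBTug")
payload, checksum = raw[:-4], raw[-4:]
expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
print(hex(payload[0]), checksum == expected)
```

The leading `1` handling is why mainnet P2PKH addresses, whose version byte is `0x00`, always start with `1`.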

In [2]:
import base58

def encode_base58_checksum(b: bytes):
    return base58.b58encode(b + hash256(b)[:4]).decode()

def decode_base58(s: str):
    return base58.b58decode(s)


### Creating a base58 P2PKH address from a pubkey
Given the pubkey `02466d7fcae563e5cb09a0d1870bb580344804617879a14949cf22285f1bae3f27`, create a p2pkh address for regtest.

In [3]:
pubkey = bytes.fromhex("02466d7fcae563e5cb09a0d1870bb580344804617879a14949cf22285f1bae3f27")

# Take the hash (hash160) of the pubkey
pk_hash = hash160(pubkey)

# Set the address prefix. For regtest p2pkh we use 0x6f
# a list of prefixes can be found at https://en.bitcoin.it/wiki/List_of_address_prefixes
# In bitcoin core it is defined in chainparams.cpp
# https://github.com/bitcoin/bitcoin/blob/767d825e27b452d6e846280256e5932e906da44d/src/chainparams.cpp#L241
prefix = bytes.fromhex("6f")

# Prepend the prefix
payload = prefix + pk_hash

# Apply base58 encoding 
p2pkh_address = encode_base58_checksum(payload)

print(p2pkh_address)

mo6CPsdW8EsnWdmSSCrQ6225VVDtpMBTug


For the rest of the notebooks we'll use the following function to convert pubkeys to base58 p2pkh addresses:

In [4]:
def pk_to_p2pkh(compressed: bytes, network: str):
    '''Creates a p2pkh address from a compressed pubkey'''
    pk_hash = hash160(compressed)
    if network == "regtest" or network == "testnet":
        prefix = bytes.fromhex("6f")
    elif network == "mainnet":
        prefix = bytes.fromhex("00")
    else:
        return "Enter the network: testnet/regtest/mainnet"
    return encode_base58_checksum(prefix + pk_hash)

### Creating a base58 P2SH address from a multisig script

Here we'll create a 2-of-3 multisig script from 3 pubkeys and use that to generate a base58 P2SH address.

Creating a P2SH base58 address is much like creating a P2PKH address. However, we use the _redeemScript_ hash instead of a pubkey hash, and a different prefix.

The opcodes `OP_2` and `OP_3` are represented by the bytes `0x52` and `0x53`. For more on the multisig script, refer to the 'Bitcoin Script' chapter.

In [5]:
pubkey1 = bytes.fromhex("034f355bdcb7cc0af728ef3cceb9615d90684bb5b2ca5f859ab0f0b704075871aa")
pubkey2 = bytes.fromhex("02466d7fcae563e5cb09a0d1870bb580344804617879a14949cf22285f1bae3f27")
pubkey3 = bytes.fromhex("023c72addb4fdf09af94f0c94d7fe92a386a7e70cf8a1d85916386bb2535c7b1b1")

redeemScript = bytes.fromhex(
    "52"            # OP_2
    + "21"          # OP_PUSHBYTES_33 ("21" is the length of a 33 byte (compressed) pubkey in hex notation)
    + pubkey1.hex() # pubkey1
    + "21"          # OP_PUSHBYTES_33
    + pubkey2.hex() # pubkey2
    + "21"          # OP_PUSHBYTES_33
    + pubkey3.hex() # pubkey3
    + "53"          # OP_3
    + "ae"          # OP_CHECKMULTISIG
)
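The same construction can be generalized to any m-of-n multisig. The sketch below is one possible helper (the name `multisig_redeem_script` is hypothetical) that relies on the small-integer opcodes `OP_1` to `OP_16` being encoded as `0x50 + n`. It caps n at 15 because a P2SH redeemScript may be at most 520 bytes, which leaves room for 15 compressed pubkeys.

```python
def multisig_redeem_script(m: int, pubkeys: list) -> bytes:
    '''Build an m-of-n OP_CHECKMULTISIG script from compressed pubkeys'''
    n = len(pubkeys)
    if not 1 <= m <= n <= 15:
        raise ValueError("need 1 <= m <= n <= 15")
    script = bytes([0x50 + m])                # OP_m
    for pk in pubkeys:
        if len(pk) != 33:
            raise ValueError("expected 33 byte compressed pubkeys")
        script += bytes([len(pk)]) + pk       # OP_PUSHBYTES_33 <pubkey>
    script += bytes([0x50 + n])               # OP_n
    script += bytes([0xae])                   # OP_CHECKMULTISIG
    return script

keys = [
    bytes.fromhex("034f355bdcb7cc0af728ef3cceb9615d90684bb5b2ca5f859ab0f0b704075871aa"),
    bytes.fromhex("02466d7fcae563e5cb09a0d1870bb580344804617879a14949cf22285f1bae3f27"),
    bytes.fromhex("023c72addb4fdf09af94f0c94d7fe92a386a7e70cf8a1d85916386bb2535c7b1b1"),
]
script = multisig_redeem_script(2, keys)
print(len(script), script[:1].hex(), script[-2:].hex())  # 105 52 53ae
```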


Now that we have our redeemScript, we can convert it to a base58 regtest P2SH address:

In [6]:
# Take the hash (hash160) of the redeemScript
script_hash = hash160(redeemScript)

# Set the address prefix. For regtest p2sh we use 0xc4
prefix = bytes.fromhex("c4")

# Prepend the prefix
payload = prefix + script_hash

# Apply base58 encoding 
p2sh_address = encode_base58_checksum(payload)

print(p2sh_address)

2MuXogRGTh7uADB2wKBqFcsPTprVKnChJe6


For the rest of the notebooks we'll use the following function for converting a P2SH redeemScript to a base58 address:

In [7]:
def script_to_p2sh(redeemScript, network):
    rs_hash = hash160(redeemScript)
    if network == "regtest" or network == "testnet":
        prefix = bytes.fromhex("c4")
    elif network == "mainnet":
        prefix = bytes.fromhex("05")
    else:
        return "Enter the network: testnet/regtest/mainnet"
    return encode_base58_checksum(prefix + rs_hash)

### Decoding a base58 address
Now let's do the reverse. Given a base58 address, decode it to get the prefix and scriptPubkey.

In [8]:
address = 'mo6CPsdW8EsnWdmSSCrQ6225VVDtpMBTug'
address_decoded = decode_base58(address)

# Check the checksum is valid
decoded = address_decoded[:-4] # everything before the last 4 bytes is the message
checksum = address_decoded[-4:] # last 4 bytes are the checksum

# Check that the first four bytes of the hash are equal to the checksum
print("Does the checksum match: ", hash256(decoded)[:4] == checksum)

print("prefix: ", hex(decoded[0]))

pk_hash = decoded[1:]
print("pubkey hash: ", pk_hash.hex())

Does the checksum match:  True
prefix:  0x6f
pubkey hash:  531260aa2a199e228c537dfa42c82bea2c7c1f4d


#### Pubkey hash to scriptPubkey
- The checksum was valid, so it is safe to assume the data was received and read accurately. 
- The prefix `0x6f` tells us we are creating a scriptPubKey for a P2PKH output on bitcoin regtest.
- The last part of the data therefore encodes the pubkey hash, and we can create a P2PKH script with it.

To turn the pubkey hash into a P2PKH scriptPubkey, we insert it into the standard P2PKH script format:

`OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG`

We can look up the corresponding opcode bytes at https://en.bitcoin.it/wiki/Script.

Note that in front of `<pubKeyHash>` we need to add a push opcode giving the length of the hash. Since the pubkey hash comes from hash160, it is 20 bytes long, which is `0x14` in hex notation.

In [9]:
scriptPubkey = bytes.fromhex("76a914" + pk_hash.hex() + "88ac")
print("scriptPubkey: ", scriptPubkey.hex())

scriptPubkey:  76a914531260aa2a199e228c537dfa42c82bea2c7c1f4d88ac
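The same step can be wrapped in a small helper that also handles a decoded P2SH address, whose standard script format is `OP_HASH160 <scriptHash> OP_EQUAL`. This is a sketch only, and the function name `hash_to_script_pubkey` is hypothetical.

```python
def hash_to_script_pubkey(script_type: str, hash_160: bytes) -> bytes:
    '''Build a P2PKH or P2SH scriptPubKey from a 20 byte hash160'''
    if len(hash_160) != 20:
        raise ValueError("expected a 20 byte hash160")
    push = bytes([len(hash_160)])  # 0x14, pushes the next 20 bytes
    if script_type == "p2pkh":
        # OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
        return bytes([0x76, 0xa9]) + push + hash_160 + bytes([0x88, 0xac])
    if script_type == "p2sh":
        # OP_HASH160 <scriptHash> OP_EQUAL
        return bytes([0xa9]) + push + hash_160 + bytes([0x87])
    raise ValueError("script_type must be 'p2pkh' or 'p2sh'")

pk_hash = bytes.fromhex("531260aa2a199e228c537dfa42c82bea2c7c1f4d")
print(hash_to_script_pubkey("p2pkh", pk_hash).hex())
# 76a914531260aa2a199e228c537dfa42c82bea2c7c1f4d88ac
```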


## Bech32

TODO
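As a partial sketch ahead of the full write-up, the checksum step of bech32 can be shown on its own. The generator constants and character set below are taken from BIP173, and the bech32m constant from BIP350. The function name `checksum_type` is hypothetical, and the sketch only classifies the checksum; it does not extract the witness version or program.

```python
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1
BECH32M_CONST = 0x2bc830a3

def bech32_polymod(values):
    '''Compute the BCH checksum over a list of 5-bit values'''
    generator = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= generator[i]
    return chk

def bech32_hrp_expand(hrp):
    '''Expand the human-readable part into values for the checksum'''
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def checksum_type(address: str):
    '''Return "bech32", "bech32m", or None if the checksum matches neither'''
    if address.lower() != address and address.upper() != address:
        return None  # mixed case is not allowed
    address = address.lower()
    pos = address.rfind("1")  # the last '1' separates the prefix from the data
    if pos < 1 or pos + 7 > len(address) or len(address) > 90:
        return None
    hrp, data_part = address[:pos], address[pos + 1:]
    if any(c not in BECH32_CHARSET for c in data_part):
        return None
    values = [BECH32_CHARSET.find(c) for c in data_part]
    const = bech32_polymod(bech32_hrp_expand(hrp) + values)
    if const == BECH32_CONST:
        return "bech32"
    if const == BECH32M_CONST:
        return "bech32m"
    return None

print(checksum_type("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"))  # bech32
print(checksum_type("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t5"))  # None
```

The first address is the P2WPKH example from BIP173; the second differs only in its last character, so the checksum no longer matches.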

## Segwit

In [10]:
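# Segwit
# Assumes a bech32 module that provides encode(hrp, witver, witprog) and
# decode(hrp, addr), as in the BIP173 reference implementation
import bech32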
def pk_to_p2wpkh(compressed, network):
    '''generates a p2wpkh bech32 address corresponding to a compressed pubkey'''
    pk_hash = hash160(compressed)
    spk = bytes.fromhex("0014") + pk_hash
    version = spk[0] - 0x50 if spk[0] else 0
    program = spk[2:]
    if network == "testnet":
        prefix = 'tb'
    elif network == "regtest":
        prefix = 'bcrt'
    elif network == "mainnet":
        prefix = 'bc'
    else:
        return "Enter the network: testnet/regtest/mainnet"
    return bech32.encode(prefix, version, program)


def script_to_p2wsh(redeemScript, network):
    '''Creates a p2wsh bech32 address corresponding to a redeemScript'''
    script_hash = hashlib.sha256(redeemScript).digest()
    spk = bytes.fromhex("0020") + script_hash
    version = spk[0] - 0x50 if spk[0] else 0
    program = spk[2:]
    if network == "testnet":
        prefix = 'tb'
    elif network == "regtest":
        prefix = 'bcrt'
    elif network == "mainnet":
        prefix = 'bc'
    else:
        return "Enter the network: testnet/regtest/mainnet"
    return bech32.encode(prefix, version, program)


def pk_to_p2sh_p2wpkh(compressed, network):
    '''Creates a base58 p2sh address that wraps a p2wpkh script for a compressed pubkey'''
    pk_hash = hash160(compressed)
    # The redeemScript is the p2wpkh scriptPubKey: OP_0 <20 byte pubkey hash>
    redeemScript = bytes.fromhex("0014") + pk_hash
    rs_hash = hash160(redeemScript)
    if network == "regtest" or network == "testnet":
        prefix = b"\xc4"
    elif network == "mainnet":
        prefix = b"\x05"
    else:
        return "Enter the network: testnet/regtest/mainnet"
    return encode_base58_checksum(prefix + rs_hash)

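def bech32_to_spk(hrp, address):
    '''Converts a bech32 address to its scriptPubKey'''
    witver, witprog = bech32.decode(hrp, address)
    if witver is None:
        raise ValueError("Invalid bech32 address")
    # Version 0 is OP_0 (0x00); versions 1-16 are OP_1..OP_16 (0x51..0x60)
    version_opcode = witver + 0x50 if witver else 0
    program = bytes(witprog)
    # Witness programs are 2-40 bytes, so the length fits in a single push opcode
    return bytes([version_opcode, len(program)]) + program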
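Going the other way, the witness version and program can be read back out of a segwit scriptPubKey. The sketch below mirrors the `spk[0] - 0x50 if spk[0] else 0` conversion used in the functions above and adds the structural checks from BIP141 (a version opcode followed by a single push of 2 to 40 bytes). The function name `parse_witness_spk` is hypothetical.

```python
def parse_witness_spk(spk: bytes):
    '''Return (witness_version, witness_program), or None if spk is not a witness program'''
    # 1 version opcode + 1 push opcode + 2..40 bytes of program
    if not 4 <= len(spk) <= 42:
        return None
    version_opcode, push_len = spk[0], spk[1]
    # Version must be OP_0 (0x00) or OP_1..OP_16 (0x51..0x60)
    if version_opcode != 0x00 and not 0x51 <= version_opcode <= 0x60:
        return None
    # The push must cover exactly the rest of the script
    if push_len != len(spk) - 2:
        return None
    version = 0 if version_opcode == 0x00 else version_opcode - 0x50
    return version, spk[2:]

p2wpkh_spk = bytes.fromhex("0014531260aa2a199e228c537dfa42c82bea2c7c1f4d")
version, program = parse_witness_spk(p2wpkh_spk)
print(version, len(program))  # 0 20
```

A version 0 program of 20 bytes is P2WPKH and one of 32 bytes is P2WSH, while a version 1 program of 32 bytes is P2TR.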

## Bech32m

TODO

In [11]:
# Taproot
# def pk_to_p2tr()

## Quiz

What is the scriptPubkey encoded by this address `<address>`? What network is it intended for (mainnet/testnet)?

