# Bitcoin from scratch

Bitcoin is a software program that runs on a decentralized network of computers. The software defines the rules of Bitcoin, including how transactions are verified and how new bitcoins are created.
Bitcoin uses cryptography to secure transactions and to prevent fraud. Cryptography is the science of encrypting and decrypting data. In Bitcoin, cryptography is used to create digital signatures, which are used to verify the authenticity of transactions.

Bitcoin is a decentralized currency, meaning that it is not controlled by any central authority. The Bitcoin network is run by a network of volunteers around the world.
All Bitcoin transactions are recorded on the blockchain, which is a public ledger. This means that anyone can view the history of Bitcoin transactions.

The purpose of this notebook is to discover and analyze as deeply as possible the various components of Bitcoin. The final goal is to create, digitally sign, and broadcast a Bitcoin transaction.

By the end of this notebook, you will have a better understanding of how a Bitcoin transaction works.
You will be able to create, digitally sign, and broadcast your own Bitcoin transactions. You will also be able to understand the security and privacy implications of Bitcoin.

Here is a roadmap of the notebook:

1) Introduction to the cryptography behind Bitcoin
- Concept of Finite field
- Modular arithmetic
- Elliptic Curve Cryptography (ECC)
2) Generation of the Crypto identity
- Generation of the Private key
- Generation of the Public key from the Private key
- Generation of the Bitcoin address


##  1. Introduction to the cryptography behind Bitcoin

### Finite fields

Finite fields, also known as Galois fields, are fundamental mathematical structures used in various fields, including algebra, coding theory, cryptography, and computer science.

A field is a mathematical structure that consists of a set of elements along with two binary operations: addition and multiplication, which satisfy certain properties.
Specifically, a field must satisfy the following conditions:
- Closure: The result of addition or multiplication of any two elements in the field is also an element in the field.
- Associativity: Addition and multiplication operations are associative.
- Commutativity: Both addition and multiplication are commutative; the order of operands does not matter.
- Identity Elements: There exist unique elements called the additive and multiplicative identities (denoted as 0 and 1, respectively) such that adding or multiplying any element by these identities leaves the element unchanged.
- Inverse Elements: Every element in the field has a unique additive inverse, and every nonzero element has a unique multiplicative inverse.
- Distributive Property: Multiplication distributes over addition.

A finite field is a field that has a finite number of elements. The order of a finite field is the number of elements it contains. For any finite field, the order is a prime power, represented as "q," where "q" is a prime number raised to a positive integer power.

The characteristic of a finite field is the smallest positive integer "p" such that summing "p" copies of the multiplicative identity (1) yields the additive identity (0). For a finite field the characteristic is always a prime number, and the order of the field is a power of that prime. Characteristic zero occurs only in infinite fields such as the rational numbers.

How a finite field is constructed depends on its order. When "q" is itself a prime "p", the field consists of the integers 0, 1, 2, ..., p-1 with addition and multiplication performed modulo p; this is the case used in Bitcoin. When "q" is a higher power of a prime, the field is built from polynomials over the prime field, reduced modulo an irreducible polynomial. For each prime power "q," there exists a unique (up to isomorphism) finite field of order "q," denoted as GF(q) or Fq.
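As a quick sanity check of the definitions above, the following sketch brute-forces whether the integers modulo m form a field by testing that every nonzero element has a multiplicative inverse. The helper name `is_field_mod` is hypothetical and is not part of the notebook's classes.

```python
def is_field_mod(m):
    """Return True if every nonzero residue modulo m has a multiplicative inverse."""
    if m < 2:
        return False
    for a in range(1, m):
        # a is invertible iff some b in 1..m-1 satisfies a * b = 1 (mod m)
        if not any((a * b) % m == 1 for b in range(1, m)):
            return False
    return True


for m in (2, 3, 4, 5, 7, 8, 9, 11, 13):
    print(m, is_field_mod(m))
```

The check succeeds exactly for the prime moduli. Note that 4, 8 and 9 fail even though they are prime powers: fields of those orders exist, but they are not the integers modulo q.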

Finite fields can have subfields, which are smaller fields contained within the original finite field. These subfields can also be finite fields themselves. Additionally, one can create extensions of finite fields by defining new finite fields with larger orders.

For each prime power "q," there is a unique finite field of order "q" up to isomorphism. This means that any two finite fields of the same order are essentially the same in terms of their algebraic properties, even though they might be represented with different sets of elements.

The study of finite fields is closely related to Galois theory, a branch of abstract algebra. Galois theory explores the relationship between field extensions and symmetries of polynomial equations. Finite fields are an important example of finite field extensions in Galois theory.

In [359]:
# Some imports
from __future__ import annotations # https://peps.python.org/pep-0563/ PEP 563 – Postponed Evaluation of Annotations
# from dataclasses import dataclass # https://docs.python.org/3/library/dataclasses.html This module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.


In [360]:
# Constructing a finite field
class FieldElement:

    def __init__(self, num, prime):
        if num >= prime or num < 0:
            error = f'Num {num} not in field range 0 to {prime-1}'
            raise ValueError(error)
        self.num = num
        self.prime = prime

    def __repr__(self):
        return f'FieldElement_{self.prime}({self.num})'

    # Equal
    def __eq__(self, other):
        if other is None:
            return False
        return self.num == other.num and self.prime == other.prime

    # Not equal
    def __ne__(self, other):
        return not (self == other)

    # Addition
    def __add__(self, other):
        if self.prime != other.prime:
            raise TypeError('Cannot add two numbers in different fields')
        num = (self.num + other.num) % self.prime
        return self.__class__(num, self.prime)

    # Subtraction
    def __sub__(self, other):
        if self.prime != other.prime:
            raise TypeError('Cannot subtract two numbers in different fields')
        num = (self.num - other.num) % self.prime
        return self.__class__(num, self.prime)

    # Multiplication
    def __mul__(self, other):
        if self.prime != other.prime:
            raise TypeError('Cannot multiply two numbers in different fields')
        num = (self.num * other.num) % self.prime
        return self.__class__(num, self.prime)

    # Pow function
    def __pow__(self, exponent):
        n = exponent % (self.prime-1)
        num = pow(self.num, n, self.prime)
        return self.__class__(num, self.prime)

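    # Division
    # Uses Fermat's little theorem:
    # n**(p-1) % p == 1 for any n not divisible by the prime p,
    # which means 1/n == pow(n, p-2, p)
    def __truediv__(self, other):
        if self.prime != other.prime:
            raise TypeError('Cannot divide two numbers in different Fields')
        if other.num == 0:
            # 0 has no multiplicative inverse; pow(0, p-2, p) would silently return 0
            raise ZeroDivisionError('Cannot divide by zero in a finite field')
        num = (self.num * pow(other.num, self.prime - 2, self.prime)) % self.prime
        return self.__class__(num, self.prime)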


    # Scalar multiplication function
    def __rmul__(self, coefficient):
        num = (self.num * coefficient) % self.prime
        return self.__class__(num, self.prime)


In [361]:
a = FieldElement(7, 13)
b = FieldElement(6, 13)

print(a == b)   # False


False


### Modulo arithmetic

Modulo arithmetic, also known as clock arithmetic or modular arithmetic, is a fundamental branch of arithmetic that deals with numbers' remainders after division. It is a way of performing arithmetic operations on integers that are constrained to a fixed positive integer, known as the modulus.

In the context of modulo arithmetic, let's consider two integers: a (the dividend) and m (the modulus), where m is a positive integer greater than 1. The symbol used to denote the modulo operation is "mod." The result of the modulo operation is the remainder obtained when dividing 'a' by 'm'.

Mathematically, the modulo operation is represented as:

a mod m = r

Here, 'r' represents the remainder obtained when 'a' is divided by 'm'.

Key properties of modulo arithmetic:

Range constraint: The result of the modulo operation is always within the range of 0 to (m - 1). In other words, 0 ≤ (a mod m) ≤ (m - 1).

Congruence: If two integers 'a' and 'b' have the same remainder when divided by 'm', they are said to be congruent modulo 'm', represented as a ≡ b (mod m). This means (a mod m) = (b mod m).
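The range constraint and the congruence relation can be tried directly in Python. This is a small sketch; the helper name `congruent` is an assumption introduced for illustration.

```python
m = 12

print(17 % m)         # 5
print(-7 % m)         # 5, Python's % always returns a value in 0..m-1 for positive m
print(divmod(17, m))  # (1, 5): quotient and remainder


def congruent(a, b, m):
    """a and b are congruent modulo m when m divides their difference."""
    return (a - b) % m == 0


print(congruent(17, 5, m))   # True
print(congruent(17, -7, m))  # True
print(congruent(17, 6, m))   # False
```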

Common arithmetic operations in modulo arithmetic:

Addition: To perform addition in modulo arithmetic, simply add the two numbers and take the result modulo 'm'.

(a + b) mod m = ((a mod m) + (b mod m)) mod m

Subtraction: To perform subtraction, similarly subtract the two numbers and take the result modulo 'm'.

(a - b) mod m = ((a mod m) - (b mod m)) mod m

Multiplication: For multiplication, multiply the two numbers and then take the result modulo 'm'.

(a * b) mod m = ((a mod m) * (b mod m)) mod m

Exponentiation: For exponentiation, raise 'a' to the power of 'b' and take the result modulo 'm'.

(a^b^) mod m = ((a mod m)^b^) mod m
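The four identities above can be checked numerically. The sketch below tests them on random integers; the function name `identities_hold` and the choice of modulus 19 are assumptions made for the example.

```python
import random


def identities_hold(m, trials=1000, seed=0):
    """Check the addition, subtraction, multiplication and power identities mod m."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(-10**6, 10**6)
        b = rng.randint(-10**6, 10**6)
        e = rng.randint(0, 40)  # non-negative exponent
        if (a + b) % m != ((a % m) + (b % m)) % m:
            return False
        if (a - b) % m != ((a % m) - (b % m)) % m:
            return False
        if (a * b) % m != ((a % m) * (b % m)) % m:
            return False
        # pow(a, e, m) is Python's built-in modular exponentiation
        if pow(a, e, m) != ((a % m) ** e) % m:
            return False
    return True


print(identities_hold(19))  # True
```

Only the base is reduced modulo m in the exponentiation identity; the exponent is left as it is.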

Modulo arithmetic finds numerous applications in various fields, including computer science, cryptography, number theory, and digital signal processing. In computer science, it is used in hashing functions, random number generation, and cyclic data structures. In cryptography, it forms the basis of some encryption and decryption algorithms.

Modulo arithmetic has a cyclical nature, akin to a clock that repeats itself after completing a full circle. This property makes it a valuable tool in solving problems involving periodicity and repetitions.

#### Mathematical Groups

To generate keys in public key cryptography, we need finite cyclic groups.

Starting from a generator point on an elliptic curve over a finite field, it is possible to generate a finite cyclic group by repeatedly adding the point to itself. Unlike fields, groups have only a single operation, here point addition. A group satisfies closure, associativity, identity, and invertibility; elliptic curve groups are also commutative.
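To make the idea of a cyclic group generated by a single element concrete before moving to elliptic curves, here is a sketch that uses a simpler stand-in: multiplication of nonzero integers modulo a prime. The function name `generated_subgroup` is hypothetical, and this group is not the one Bitcoin uses.

```python
def generated_subgroup(g, p):
    """List g^1, g^2, ... modulo the prime p until the identity 1 is reached.

    Assumes p is prime and g is not a multiple of p, otherwise 1 is never reached.
    """
    elements = []
    x = g % p
    while True:
        elements.append(x)
        if x == 1:
            break
        x = (x * g) % p
    return elements


print(generated_subgroup(3, 7))  # [3, 2, 6, 4, 5, 1]
print(generated_subgroup(2, 7))  # [2, 4, 1]
```

Here 3 generates all six nonzero residues modulo 7, while 2 generates only a subgroup of order 3. On an elliptic curve the same pattern appears with point addition in place of multiplication.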



In [362]:
# Addition on a finite field

a = FieldElement(7, 19)
b = FieldElement(8, 19)
c = FieldElement(15, 19)

# (7 + 8) % 19 = 15 mod 19
print(a+b==c) # True

# Subtraction on a finite field

a = FieldElement(7, 19)
b = FieldElement(8, 19)
c = FieldElement(1, 19)

# (8 - 7) % 19 = 1 mod 19
print(b-a==c) # True

# Multiplication on a finite field
a = FieldElement(3, 13)
b = FieldElement(12, 13)
c = FieldElement(10, 13)

# 3 mod 13 * 12 mod 13 = 10 mod 13
print(a*b==c) # True

# Exponentiation on a finite field
a = FieldElement(3, 13)
b = FieldElement(1, 13)

# 3^3 mod 13 = 1 mod 13
print(a**3==b) # True

# Division on a finite field
"""
In ordinary arithmetic, division is the inverse of multiplication:
7 x 8 = 56 implies that 56 / 8 = 7
12 x 2 = 24 implies that 24 / 12 = 2
In a finite field, dividing any two elements where the denominator is not 0 results in another field element.
Fermat's little theorem, a result from number theory, states that for every prime p and every n not divisible by p:
n^(p-1) % p = 1
Multiplying both sides by 1/n gives 1/n = n^(p-2) % p, so division reduces to exponentiation.
The theorem follows because multiplying every nonzero element by n only permutes them:
{1 x 2 x 3 x ... x (p-2) x (p-1)} % p = {n x 2n x 3n x ... x (p-2)n x (p-1)n} % p
"""
a = FieldElement(3, 31)
b = FieldElement(24, 31)
c = FieldElement(4, 31)

print(a/b==c) # True


True
True
True
True
True
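The division used in the cell above rests on Fermat's little theorem. The following sketch checks the theorem for every nonzero element of F31 with plain integers and recomputes 3 / 24; it does not use the `FieldElement` class.

```python
p = 31

for n in range(1, p):
    # Fermat's little theorem: n^(p-1) = 1 (mod p) for n not divisible by p
    assert pow(n, p - 1, p) == 1
    # hence n^(p-2) is the multiplicative inverse of n
    inverse = pow(n, p - 2, p)
    assert (n * inverse) % p == 1

print(pow(24, p - 2, p))            # 22, the inverse of 24 in F31
print((3 * pow(24, p - 2, p)) % p)  # 4, i.e. 3 / 24 in F31
print(pow(24, -1, p))               # 22, same inverse (negative exponents need Python 3.8+)
```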


### Elliptic Curve Cryptography

Elliptic Curve Cryptography (ECC) is a type of asymmetric or public key cryptography based on the algebraic structures of the elliptic curves over finite fields and on the discrete logarithm problem as expressed by addition and multiplication on the points of the curve.

The foundation of ECC lies in the mathematical properties of elliptic curves. An elliptic curve is a smooth curve defined by an equation of the form:

y^2^ = x^3^ + ax + b

'a' and 'b' are constants that define the specific curve's shape, and the coordinates 'x' and 'y' represent points on the curve. For the curve to be smooth (non-singular), the constants must satisfy 4a^3^ + 27b^2^ ≠ 0. One key property of elliptic curves is that they have an additive group structure, allowing for point addition and scalar multiplication operations.
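A short sketch of what "being on the curve" means, using ordinary integers rather than a finite field. The helper names `on_curve` and `is_nonsingular` are hypothetical; the second one tests the standard smoothness condition 4a^3 + 27b^2 ≠ 0.

```python
def on_curve(x, y, a, b):
    """True if (x, y) satisfies y^2 = x^3 + a*x + b."""
    return y**2 == x**3 + a * x + b


def is_nonsingular(a, b):
    """True if the curve y^2 = x^3 + a*x + b has no cusp or self-intersection."""
    return 4 * a**3 + 27 * b**2 != 0


print(on_curve(-1, -1, 5, 7))  # True:  1 == -1 - 5 + 7
print(on_curve(-1, -2, 5, 7))  # False: 4 != 1
print(is_nonsingular(0, 7))    # True:  the shape used by Bitcoin's curve
print(is_nonsingular(-3, 2))   # False: 4*(-27) + 27*4 == 0
```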

ECC uses this group structure to create cryptographic key pairs: a private key and a corresponding public key. A private key (usually denoted as 'd') is an integer in the range 1 to n - 1, where n is the order of the generator point; for the curve used by Bitcoin this is a 256-bit integer.
Key generation in ECC is as simple as securely generating a random integer in that range, so any number within the range is a valid ECC private key.

The public key is a point on the elliptic curve obtained by multiplying the generator point by the private key:

Public key (P) = Private key (d) * Generator point (G)
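The relation P = d * G can be tried on a toy curve small enough to follow by hand. The sketch below assumes the curve y^2 = x^3 + 7 over F223 with the point (47, 71) as generator; these parameters, the tuple representation of points, and the helper names are illustrative assumptions, not Bitcoin's real parameters.

```python
# Toy parameters (assumed for illustration): y^2 = x^3 + 7 over F_223
P_MOD, A, B = 223, 0, 7
G = (47, 71)  # a point on the toy curve, used as generator


def on_curve(point):
    x, y = point
    return (y * y - (x**3 + A * x + B)) % P_MOD == 0


def point_add(p1, p2):
    """Add two points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the inverse of p1 (also covers doubling when y == 0)
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, P_MOD - 2, P_MOD)  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, P_MOD - 2, P_MOD)  # chord
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(d, point):
    """Compute d * point with the double-and-add algorithm."""
    result = None
    while d:
        if d & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        d >>= 1
    return result


private_key = 5  # fixed toy value; a real key is a random 256-bit integer
public_key = scalar_mult(private_key, G)
print(public_key, on_curve(public_key))
print(scalar_mult(2, G))  # (36, 111)
```

Going from d to P takes a handful of additions, while recovering d from P requires trying candidates one by one; on a 256-bit curve that search is infeasible.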

ECC crypto algorithms can use different underlying elliptic curves. Different curves provide different levels of security, performance and key length.

The curve also includes a special point at infinity, which acts as the identity element of point addition and can be denoted as 0 (zero).

ECC uses elliptic curves over the finite field Fp, where p is prime and p > 3.
This means that the points on the curve are limited to pairs of integer coordinates (x, y) with 0 ≤ x, y < p, so they can be pictured as dots on a p x p grid rather than as a continuous line.
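Because the coordinates are restricted to integers below p, the points of a small curve can simply be enumerated. The sketch below does this by brute force for y^2 = x^3 + 7; the function name `curve_points` and the small primes are assumptions for illustration, and the approach is only practical for tiny fields.

```python
def curve_points(p, a=0, b=7):
    """Return every (x, y) with 0 <= x, y < p satisfying y^2 = x^3 + a*x + b (mod p)."""
    points = []
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        for y in range(p):
            if (y * y) % p == rhs:
                points.append((x, y))
    return points


small = curve_points(17)
print(len(small), small[:4])
print((47, 71) in curve_points(223))  # True
```

The point at infinity is not in the list because it has no coordinates; it has to be counted separately when computing the size of the group.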

ECC is employed in two primary cryptographic functions: key exchange and digital signatures.

1) Key Exchange:
ECC allows two parties to agree on a shared secret key over an insecure channel securely. The parties exchange their public keys, perform point multiplication using their private keys and the received public keys, and arrive at the same shared secret point. This point is then used to derive a symmetric encryption key or to initialize a secure communication session.

2) Digital Signatures:
In ECC-based digital signatures, the private keyholder signs a message using their private key to produce a unique signature. The signature can be publicly verified using the corresponding public key. Verifying the signature involves performing point operations to ensure the authenticity and integrity of the message.
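The key exchange described in item 1 can be sketched on a toy curve. The example assumes y^2 = x^3 + 7 over F223 with generator (47, 71) and fixed small private keys; all of these values and helper names are illustrative assumptions, and a real exchange would use a standardized curve and random keys.

```python
# Toy curve (assumed for illustration): y^2 = x^3 + 7 over F_223
P_MOD, A = 223, 0
G = (47, 71)


def point_add(p1, p2):
    """Add two points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, P_MOD - 2, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, P_MOD - 2, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    return (x3, (slope * (x1 - x3) - y1) % P_MOD)


def scalar_mult(d, point):
    """Double-and-add scalar multiplication."""
    result = None
    while d:
        if d & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        d >>= 1
    return result


# Each party keeps a private integer and publishes private * G
alice_private, bob_private = 5, 11
alice_public = scalar_mult(alice_private, G)
bob_public = scalar_mult(bob_private, G)

# Each side multiplies the other's public point by its own private key
alice_shared = scalar_mult(alice_private, bob_public)
bob_shared = scalar_mult(bob_private, alice_public)
print(alice_shared == bob_shared)  # True
```

Both sides arrive at the same point because a * (b * G) and b * (a * G) are the same multiple of G. An eavesdropper sees only the two public points.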

Overall, Elliptic Curve Cryptography offers strong security with relatively smaller key sizes compared to traditional cryptographic algorithms, making it a popular choice for various applications in the modern digital landscape, including secure communications, digital signatures, and secure key exchange protocols.


#### Bitcoin and secp256k1

As said before an elliptic curve for public key cryptography is defined with the following parameters:
- The *a* and *b* of the curve y^2^ = x^3^ + ax + b are specified.
- The prime of the finite field Fp is specified.
- The coordinates (x, y) of the generator point G are specified.
- The order n of the group generated by G is specified.

These numbers are known publicly and are used to generate the public key of the elliptic curve.
Bitcoin uses a specific elliptic curve and set of mathematical constants as defined in the standard secp256k1.
Specified by the Standards for Efficient Cryptography Group (SECG), the secp256k1 curve is defined by the following equation:

y^2^ mod p = (x^3^ + 7) mod p

The mod p (modulo prime number p) indicates that the curve is over a finite field of prime order p, where p = 2^256^ - 2^32^ - 2^9^ - 2^8^ - 2^7^ - 2^6^ - 2^4^ - 1 which is a very large prime number.
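The expression for p can be evaluated directly, since Python integers have arbitrary precision. This small check shows its hexadecimal form and its size in bits.

```python
p = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1

print(hex(p))                       # 0xffff...fffefffffc2f
print(p == 2**256 - 2**32 - 977)    # True, the small powers of two sum to 977
print(p.bit_length())               # 256
```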


In [363]:
# Point class

class Point:

    def __init__(self, x, y, a, b):
        self.a = a
        self.b = b
        self.x = x
        self.y = y
        if self.x is None and self.y is None:
            return
        if self.y**2 != self.x**3 + a*x + b:
            raise ValueError(f'({x}, {y}) is not on the curve')

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y and self.a == other.a and self.b == other.b

    def __ne__(self, other):
        return not (self == other)

    def __repr__(self):
        if self.x is None:
            return 'Point(infinity)'
        else:
            return f'Point({self.x}, {self.y})_{self.a}_{self.b}'

    def __add__(self, other):
        if self.a != other.a or self.b != other.b:
            raise TypeError('Points {}, {} are not on the same curve'.format(self, other))
        if self.x is None:
            return other
        if other.x is None:
            return self
        if self.x == other.x and self.y != other.y:
            return self.__class__(None, None, self.a, self.b)
        if self.x != other.x:
            s = (other.y - self.y) / (other.x - self.x)
            x = s**2 - self.x - other.x
            y = s * (self.x - x) - self.y
            return self.__class__(x, y, self.a, self.b)
        if self == other and self.y == 0 * self.x:
            return self.__class__(None, None, self.a, self.b)
        if self == other:
            s = (3*self.x**2 + self.a) / (2*self.y)
            x = s**2 - 2*self.x
            y = s * (self.x -x) - self.y
            return self.__class__(x, y, self.a, self.b)

    # Double-and-add Algorithm
    def __rmul__(self, coefficient):
        coef = coefficient
        current = self
        result = self.__class__(None, None, self.a, self.b)
        while coef:
            if coef & 1:
                result += current
            current += current
            coef >>= 1
        return result


In [364]:
# Bitcoin elliptic curve secp256k1 parameters
# y^2 mod p = (x^3 + 7) mod p
A = 0x0000000000000000000000000000000000000000000000000000000000000000  # A = 0
B = 0x0000000000000000000000000000000000000000000000000000000000000007  # B = 7

# Prime number P = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1
P = 0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f

# Generator point coordinates
gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# The order n of G
n = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141


# Verify that the generator point (G) is on the elliptic curve
print(f"The generator point is on the elliptic curve: {gy**2 % P == (gx**3 + 7) % P}")

# It is also possible to verify that the generator point (G) has order n: n * G must be the point at infinity
x = FieldElement(gx, P)
y = FieldElement(gy, P)
seven = FieldElement(7, P)
zero = FieldElement(0, P)
G = Point(x, y, zero, seven)

print(n * G)

The generator point is on the elliptic curve: True
Point(infinity)


### Serialization

#### SHA256

The SHA256 algorithm can be summarized in 6 macro steps:

1. Padding: The input data is padded so that its length is a multiple of 512 bits (64 bytes). Padding appends a single 1 bit, then as many 0 bits as needed, and finally the original length of the input in bits as a 64-bit big-endian integer.
2. Initialization: The algorithm initializes eight 32-bit words, the initial hash value, with specific constants (the first 32 bits of the fractional parts of the square roots of the first eight primes). It also uses a 64-entry "message schedule" array in the computation.
3. Message Processing: The padded input data is divided into 512-bit blocks, and each block undergoes a series of computations.
4. Message Schedule: Each block is divided into sixteen 32-bit words, which form the first sixteen entries of the message schedule array. From the 17th word onwards, additional words are generated using a specific formula based on the previously generated words.
5. Round Operations: Eight working variables are initialized from the current hash value and, together with the message schedule words, go through 64 rounds. Each round involves bitwise operations such as XOR, AND, NOT, rotations and shifts, addition modulo 2^32^, and the logical functions Ch (choose), Maj (majority), and Sigma.
6. Final Hash Value: After each block, the working variables are added to the current hash value. After the last block, the eight 32-bit hash words are concatenated in order to form the 256-bit hash of the input data.
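The padding rule in step 1 fixes the padded length for any message size. A minimal sketch of that arithmetic follows; the helper name `padded_length` is an assumption and works in whole bytes, as the messages in this notebook do.

```python
def padded_length(n_bytes):
    """Length in bytes of an n_bytes message after SHA-256 padding.

    One byte holds the mandatory 1 bit (0x80), eight bytes hold the length,
    and zero bytes fill the rest up to the next multiple of 64.
    """
    return ((n_bytes + 1 + 8 + 63) // 64) * 64


for n in (0, 55, 56, 64, 119, 120):
    print(n, padded_length(n), padded_length(n) // 64)
```

A 55-byte message still fits in one 64-byte block, while a 56-byte message needs two, because the 0x80 byte and the 8-byte length no longer fit.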

The resulting SHA256 hash is typically represented as a sequence of 64 hexadecimal digits, providing a compact fingerprint of the input data.
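For reference, the standard library's `hashlib` module produces the same kind of digest, which is useful for checking a hand-written implementation against a known-good one.

```python
import hashlib

digest = hashlib.sha256(b"").digest()

print(digest.hex())                    # e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
print(len(digest), len(digest.hex()))  # 32 64
```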

SHA256 plays a crucial role in Bitcoin. Bitcoin utilizes SHA256 as part of its underlying cryptographic algorithms for several key operations:

1. Mining: Bitcoin mining is the process of adding new transactions to the blockchain by creating new blocks. Miners compete to solve a computationally intensive puzzle, the proof-of-work.
The proof-of-work algorithm relies on SHA256. The 80-byte block header commits to the block's transactions through a Merkle root and also contains the previous block's hash, a timestamp, the encoded target, and a 32-bit field called the nonce. Miners repeatedly change the nonce and hash the header with double SHA-256 until they find a hash that is below the target set by the current difficulty. This process requires significant computational power, and the miner who successfully finds a valid hash is rewarded with newly minted bitcoins.
2. Block Verification: Each block in the blockchain is verified by nodes in the network to ensure its validity. As part of the verification process, the block header is hashed using double SHA-256; the resulting hash must be below the target, and the previous-block hash stored in the header must match the hash of the preceding block. This chaining ensures the integrity and continuity of the blockchain.
3. Transaction Verification: SHA256 is used to verify the integrity of individual transactions within a block. Each transaction has a transaction ID, which is calculated by hashing the serialized transaction data, including inputs, outputs, and metadata, with double SHA256.
Nodes in the network can independently verify that the transaction ID matches the transaction data, ensuring that the transaction has not been tampered with. Additionally, the transaction ID is used in the construction of Merkle trees, which provide an efficient way to confirm the inclusion of transactions in a block.
4. Address Generation: Bitcoin addresses are derived from public keys using a series of cryptographic operations, including SHA256. The serialized public key is hashed first with SHA256 and then with RIPEMD160, resulting in a 160-bit public key hash that is then encoded as the address. This ties each Bitcoin address to a specific public key without revealing the key itself, providing a level of privacy for users.
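The mining loop from item 1 can be imitated at a very low difficulty. The sketch below uses `hashlib` and a made-up byte string in place of a real block header; the names `sha256d` and `mine`, the header contents, and the 12-bit difficulty are all assumptions chosen so the search finishes quickly.

```python
import hashlib


def sha256d(data):
    """Double SHA-256, as used for Bitcoin block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix, difficulty_bits):
    """Find a 4-byte nonce such that the double hash is below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        candidate = header_prefix + nonce.to_bytes(4, "little")
        digest = sha256d(candidate)
        # Bitcoin compares the hash to the target as a little-endian integer
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
        nonce += 1


# Hypothetical stand-in for the first 76 bytes of a header, not a real block
toy_header = b"toy block header"
nonce, digest = mine(toy_header, difficulty_bits=12)

print(nonce)
print(digest[::-1].hex())  # shown byte-reversed, so the leading zeros are visible
```

Each extra difficulty bit doubles the expected number of attempts, and checking a claimed nonce takes a single double hash.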

In [365]:
def gen_sha256_with_variable_scope_protector_to_not_pollute_global_namespace():

    """
    SHA256 implementation.

    Follows the FIPS PUB 180-4 description for calculating SHA-256 hash function
    https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf

    No one in their right mind should use this for any serious reason. This was written
    purely for educational purposes.
    """

    import math
    from itertools import count, islice

    # -----------------------------------------------------------------------------
    # SHA-256 Functions, defined in Section 4

    def rotr(x, n, size=32):
        return (x >> n) | (x << size - n) & (2**size - 1)

    def shr(x, n):
        return x >> n

    def sig0(x):
        return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3)

    def sig1(x):
        return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10)

    def capsig0(x):
        return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)

    def capsig1(x):
        return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)

    def ch(x, y, z):
        return (x & y)^ (~x & z)

    def maj(x, y, z):
        return (x & y) ^ (x & z) ^ (y & z)

    def b2i(b):
        return int.from_bytes(b, 'big')

    def i2b(i):
        return i.to_bytes(4, 'big')

    # -----------------------------------------------------------------------------
    # SHA-256 Constants

    def is_prime(n):
        return not any(f for f in range(2,int(math.sqrt(n))+1) if n%f == 0)

    def first_n_primes(n):
        return islice(filter(is_prime, count(start=2)), n)

    def frac_bin(f, n=32):
        """ return the first n bits of fractional part of float f """
        f -= math.floor(f) # get only the fractional part
        f *= 2**n # shift left
        f = int(f) # truncate the rest of the fractional content
        return f

    def genK():
        """
        Follows Section 4.2.2 to generate K

        The first 32 bits of the fractional parts of the cube roots of the first
        64 prime numbers:

        428a2f98 71374491 b5c0fbcf e9b5dba5 3956c25b 59f111f1 923f82a4 ab1c5ed5
        d807aa98 12835b01 243185be 550c7dc3 72be5d74 80deb1fe 9bdc06a7 c19bf174
        e49b69c1 efbe4786 0fc19dc6 240ca1cc 2de92c6f 4a7484aa 5cb0a9dc 76f988da
        983e5152 a831c66d b00327c8 bf597fc7 c6e00bf3 d5a79147 06ca6351 14292967
        27b70a85 2e1b2138 4d2c6dfc 53380d13 650a7354 766a0abb 81c2c92e 92722c85
        a2bfe8a1 a81a664b c24b8b70 c76c51a3 d192e819 d6990624 f40e3585 106aa070
        19a4c116 1e376c08 2748774c 34b0bcb5 391c0cb3 4ed8aa4a 5b9cca4f 682e6ff3
        748f82ee 78a5636f 84c87814 8cc70208 90befffa a4506ceb bef9a3f7 c67178f2
        """
        return [frac_bin(p ** (1/3.0)) for p in first_n_primes(64)]

    def genH():
        """
        Follows Section 5.3.3 to generate the initial hash value H^0

        The first 32 bits of the fractional parts of the square roots of
        the first 8 prime numbers.

        6a09e667 bb67ae85 3c6ef372 a54ff53a 9b05688c 510e527f 1f83d9ab 5be0cd19
        """
        return [frac_bin(p ** (1/2.0)) for p in first_n_primes(8)]

    # -----------------------------------------------------------------------------

    def pad(b):
        """ Follows Section 5.1: Padding the message """
        b = bytearray(b) # convert to a mutable equivalent
        l = len(b) * 8 # note: len returns number of bytes not bits

        # append the bit "1" to the end of the message
        b.append(0b10000000) # appending 10000000 in binary (=128 in decimal)

        # follow by k zero bits, where k is the smallest non-negative solution to
        # l + 1 + k = 448 mod 512
        # i.e. pad with zeros until we reach 448 (mod 512)
        while (len(b)*8) % 512 != 448:
            b.append(0x00)

        # the last 64-bit block is the length l of the original message
        # expressed in binary (big endian)
        b.extend(l.to_bytes(8, 'big'))

        return b

    def sha256(b: bytes) -> bytes:

        # Section 4.2
        K = genK()

        # Section 5: Preprocessing
        # Section 5.1: Pad the message
        b = pad(b)
        # Section 5.2: Separate the message into blocks of 512 bits (64 bytes)
        blocks = [b[i:i+64] for i in range(0, len(b), 64)]

        # for each message block M^1 ... M^N
        H = genH() # Section 5.3

        # Section 6
        for M in blocks: # each block is a 64-entry array of 8-bit bytes

            # 1. Prepare the message schedule, a 64-entry array of 32-bit words
            W = []
            for t in range(64):
                if t <= 15:
                    # the first 16 words are just a copy of the block
                    W.append(bytes(M[t*4:t*4+4]))
                else:
                    term1 = sig1(b2i(W[t-2]))
                    term2 = b2i(W[t-7])
                    term3 = sig0(b2i(W[t-15]))
                    term4 = b2i(W[t-16])
                    total = (term1 + term2 + term3 + term4) % 2**32
                    W.append(i2b(total))

            # 2. Initialize the 8 working variables a,b,c,d,e,f,g,h with prev hash value
            a, b, c, d, e, f, g, h = H

            # 3.
            for t in range(64):
                T1 = (h + capsig1(e) + ch(e, f, g) + K[t] + b2i(W[t])) % 2**32
                T2 = (capsig0(a) + maj(a, b, c)) % 2**32
                h = g
                g = f
                f = e
                e = (d + T1) % 2**32
                d = c
                c = b
                b = a
                a = (T1 + T2) % 2**32

            # 4. Compute the i-th intermediate hash value H^i
            delta = [a, b, c, d, e, f, g, h]
            H = [(i1 + i2) % 2**32 for i1, i2 in zip(H, delta)]

        return b''.join(i2b(i) for i in H)

    return sha256


sha256 = gen_sha256_with_variable_scope_protector_to_not_pollute_global_namespace()

print("Verify empty hash:", sha256(b'').hex()) # should be e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
print(f"A random bytes message hash: {sha256(b'Here is a random bytes message, cool right?').hex()}")
print("Number of bytes in a sha256 digest: ", len(sha256(b'')))

Verify empty hash: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
A random bytes message hash: bf1e756ee0047e5c541d0fdfa97bcd5b1323939854cafe914304f58a1e14f51a
Number of bytes in a sha256 digest:  32


#### RIPEMD160

The RIPEMD-160 (RACE Integrity Primitives Evaluation Message Digest 160) algorithm is a cryptographic hash function that was developed as an improved version of the original RIPEMD algorithm.
The key features and characteristics of the RIPEMD-160 algorithm are:

1. One-Way Function: RIPEMD-160 is designed so that it is computationally infeasible to reverse the process and obtain the original input message from the hash value.
2. Collision Resistance: RIPEMD-160 aims to provide a high level of collision resistance, which means that it should be extremely difficult to find two different input messages that produce the same hash value.
3. Fixed Output Size: The output of RIPEMD-160 is always a 160-bit hash value, regardless of the size of the input message.
4. Security Strength: With a 160-bit output, a generic birthday attack needs on the order of 2^80^ hash evaluations to find a collision, a level of security that is considered sufficient for many practical purposes.

In [366]:
def gen_ripemd160_with_variable_scope_protector_to_not_pollute_global_namespace():

    import sys
    import struct

    # -----------------------------------------------------------------------------
    # public interface

    def ripemd160(b: bytes) -> bytes:
        """ simple wrapper for a simpler API to this hash function, just bytes to bytes """
        ctx = RMDContext()
        RMD160Update(ctx, b, len(b))
        digest = RMD160Final(ctx)
        return digest

    # -----------------------------------------------------------------------------

    class RMDContext:
        def __init__(self):
            self.state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0] # uint32
            self.count = 0 # uint64
            self.buffer = [0]*64 # uchar

    def RMD160Update(ctx, inp, inplen):
        have = int((ctx.count // 8) % 64)
        inplen = int(inplen)
        need = 64 - have
        ctx.count += 8 * inplen
        off = 0
        if inplen >= need:
            if have:
                for i in range(need):
                    ctx.buffer[have+i] = inp[i]
                RMD160Transform(ctx.state, ctx.buffer)
                off = need
                have = 0
            while off + 64 <= inplen:
                RMD160Transform(ctx.state, inp[off:])
                off += 64
        if off < inplen:
            for i in range(inplen - off):
                ctx.buffer[have+i] = inp[off+i]

    def RMD160Final(ctx):
        size = struct.pack("<Q", ctx.count)
        padlen = 64 - ((ctx.count // 8) % 64)
        if padlen < 1 + 8:
            padlen += 64
        RMD160Update(ctx, PADDING, padlen-8)
        RMD160Update(ctx, size, 8)
        return struct.pack("<5L", *ctx.state)

    # -----------------------------------------------------------------------------

    K0 = 0x00000000
    K1 = 0x5A827999
    K2 = 0x6ED9EBA1
    K3 = 0x8F1BBCDC
    K4 = 0xA953FD4E
    KK0 = 0x50A28BE6
    KK1 = 0x5C4DD124
    KK2 = 0x6D703EF3
    KK3 = 0x7A6D76E9
    KK4 = 0x00000000

    PADDING = [0x80] + [0]*63

    def ROL(n, x):
        return ((x << n) & 0xffffffff) | (x >> (32 - n))

    def F0(x, y, z):
        return x ^ y ^ z

    def F1(x, y, z):
        return (x & y) | (((~x) % 0x100000000) & z)

    def F2(x, y, z):
        return (x | ((~y) % 0x100000000)) ^ z

    def F3(x, y, z):
        return (x & z) | (((~z) % 0x100000000) & y)

    def F4(x, y, z):
        return x ^ (y | ((~z) % 0x100000000))

    def R(a, b, c, d, e, Fj, Kj, sj, rj, X):
        a = ROL(sj, (a + Fj(b, c, d) + X[rj] + Kj) % 0x100000000) + e
        c = ROL(10, c)
        return a % 0x100000000, c

    def RMD160Transform(state, block): #uint32 state[5], uchar block[64]

        x = [0]*16
        assert sys.byteorder == 'little', "Only little endian is supported atm for RIPEMD160"
        x = struct.unpack('<16L', bytes(block[0:64]))

        a = state[0]
        b = state[1]
        c = state[2]
        d = state[3]
        e = state[4]

        #/* Round 1 */
        a, c = R(a, b, c, d, e, F0, K0, 11,  0, x)
        e, b = R(e, a, b, c, d, F0, K0, 14,  1, x)
        d, a = R(d, e, a, b, c, F0, K0, 15,  2, x)
        c, e = R(c, d, e, a, b, F0, K0, 12,  3, x)
        b, d = R(b, c, d, e, a, F0, K0,  5,  4, x)
        a, c = R(a, b, c, d, e, F0, K0,  8,  5, x)
        e, b = R(e, a, b, c, d, F0, K0,  7,  6, x)
        d, a = R(d, e, a, b, c, F0, K0,  9,  7, x)
        c, e = R(c, d, e, a, b, F0, K0, 11,  8, x)
        b, d = R(b, c, d, e, a, F0, K0, 13,  9, x)
        a, c = R(a, b, c, d, e, F0, K0, 14, 10, x)
        e, b = R(e, a, b, c, d, F0, K0, 15, 11, x)
        d, a = R(d, e, a, b, c, F0, K0,  6, 12, x)
        c, e = R(c, d, e, a, b, F0, K0,  7, 13, x)
        b, d = R(b, c, d, e, a, F0, K0,  9, 14, x)
        a, c = R(a, b, c, d, e, F0, K0,  8, 15, x) #/* #15 */
        #/* Round 2 */
        e, b = R(e, a, b, c, d, F1, K1,  7,  7, x)
        d, a = R(d, e, a, b, c, F1, K1,  6,  4, x)
        c, e = R(c, d, e, a, b, F1, K1,  8, 13, x)
        b, d = R(b, c, d, e, a, F1, K1, 13,  1, x)
        a, c = R(a, b, c, d, e, F1, K1, 11, 10, x)
        e, b = R(e, a, b, c, d, F1, K1,  9,  6, x)
        d, a = R(d, e, a, b, c, F1, K1,  7, 15, x)
        c, e = R(c, d, e, a, b, F1, K1, 15,  3, x)
        b, d = R(b, c, d, e, a, F1, K1,  7, 12, x)
        a, c = R(a, b, c, d, e, F1, K1, 12,  0, x)
        e, b = R(e, a, b, c, d, F1, K1, 15,  9, x)
        d, a = R(d, e, a, b, c, F1, K1,  9,  5, x)
        c, e = R(c, d, e, a, b, F1, K1, 11,  2, x)
        b, d = R(b, c, d, e, a, F1, K1,  7, 14, x)
        a, c = R(a, b, c, d, e, F1, K1, 13, 11, x)
        e, b = R(e, a, b, c, d, F1, K1, 12,  8, x) #/* #31 */
        #/* Round 3 */
        d, a = R(d, e, a, b, c, F2, K2, 11,  3, x)
        c, e = R(c, d, e, a, b, F2, K2, 13, 10, x)
        b, d = R(b, c, d, e, a, F2, K2,  6, 14, x)
        a, c = R(a, b, c, d, e, F2, K2,  7,  4, x)
        e, b = R(e, a, b, c, d, F2, K2, 14,  9, x)
        d, a = R(d, e, a, b, c, F2, K2,  9, 15, x)
        c, e = R(c, d, e, a, b, F2, K2, 13,  8, x)
        b, d = R(b, c, d, e, a, F2, K2, 15,  1, x)
        a, c = R(a, b, c, d, e, F2, K2, 14,  2, x)
        e, b = R(e, a, b, c, d, F2, K2,  8,  7, x)
        d, a = R(d, e, a, b, c, F2, K2, 13,  0, x)
        c, e = R(c, d, e, a, b, F2, K2,  6,  6, x)
        b, d = R(b, c, d, e, a, F2, K2,  5, 13, x)
        a, c = R(a, b, c, d, e, F2, K2, 12, 11, x)
        e, b = R(e, a, b, c, d, F2, K2,  7,  5, x)
        d, a = R(d, e, a, b, c, F2, K2,  5, 12, x) #/* #47 */
        #/* Round 4 */
        c, e = R(c, d, e, a, b, F3, K3, 11,  1, x)
        b, d = R(b, c, d, e, a, F3, K3, 12,  9, x)
        a, c = R(a, b, c, d, e, F3, K3, 14, 11, x)
        e, b = R(e, a, b, c, d, F3, K3, 15, 10, x)
        d, a = R(d, e, a, b, c, F3, K3, 14,  0, x)
        c, e = R(c, d, e, a, b, F3, K3, 15,  8, x)
        b, d = R(b, c, d, e, a, F3, K3,  9, 12, x)
        a, c = R(a, b, c, d, e, F3, K3,  8,  4, x)
        e, b = R(e, a, b, c, d, F3, K3,  9, 13, x)
        d, a = R(d, e, a, b, c, F3, K3, 14,  3, x)
        c, e = R(c, d, e, a, b, F3, K3,  5,  7, x)
        b, d = R(b, c, d, e, a, F3, K3,  6, 15, x)
        a, c = R(a, b, c, d, e, F3, K3,  8, 14, x)
        e, b = R(e, a, b, c, d, F3, K3,  6,  5, x)
        d, a = R(d, e, a, b, c, F3, K3,  5,  6, x)
        c, e = R(c, d, e, a, b, F3, K3, 12,  2, x) #/* #63 */
        #/* Round 5 */
        b, d = R(b, c, d, e, a, F4, K4,  9,  4, x)
        a, c = R(a, b, c, d, e, F4, K4, 15,  0, x)
        e, b = R(e, a, b, c, d, F4, K4,  5,  5, x)
        d, a = R(d, e, a, b, c, F4, K4, 11,  9, x)
        c, e = R(c, d, e, a, b, F4, K4,  6,  7, x)
        b, d = R(b, c, d, e, a, F4, K4,  8, 12, x)
        a, c = R(a, b, c, d, e, F4, K4, 13,  2, x)
        e, b = R(e, a, b, c, d, F4, K4, 12, 10, x)
        d, a = R(d, e, a, b, c, F4, K4,  5, 14, x)
        c, e = R(c, d, e, a, b, F4, K4, 12,  1, x)
        b, d = R(b, c, d, e, a, F4, K4, 13,  3, x)
        a, c = R(a, b, c, d, e, F4, K4, 14,  8, x)
        e, b = R(e, a, b, c, d, F4, K4, 11, 11, x)
        d, a = R(d, e, a, b, c, F4, K4,  8,  6, x)
        c, e = R(c, d, e, a, b, F4, K4,  5, 15, x)
        b, d = R(b, c, d, e, a, F4, K4,  6, 13, x) #/* #79 */

        aa = a
        bb = b
        cc = c
        dd = d
        ee = e

        a = state[0]
        b = state[1]
        c = state[2]
        d = state[3]
        e = state[4]

        #/* Parallel round 1 */
        a, c = R(a, b, c, d, e, F4, KK0,  8,  5, x)
        e, b = R(e, a, b, c, d, F4, KK0,  9, 14, x)
        d, a = R(d, e, a, b, c, F4, KK0,  9,  7, x)
        c, e = R(c, d, e, a, b, F4, KK0, 11,  0, x)
        b, d = R(b, c, d, e, a, F4, KK0, 13,  9, x)
        a, c = R(a, b, c, d, e, F4, KK0, 15,  2, x)
        e, b = R(e, a, b, c, d, F4, KK0, 15, 11, x)
        d, a = R(d, e, a, b, c, F4, KK0,  5,  4, x)
        c, e = R(c, d, e, a, b, F4, KK0,  7, 13, x)
        b, d = R(b, c, d, e, a, F4, KK0,  7,  6, x)
        a, c = R(a, b, c, d, e, F4, KK0,  8, 15, x)
        e, b = R(e, a, b, c, d, F4, KK0, 11,  8, x)
        d, a = R(d, e, a, b, c, F4, KK0, 14,  1, x)
        c, e = R(c, d, e, a, b, F4, KK0, 14, 10, x)
        b, d = R(b, c, d, e, a, F4, KK0, 12,  3, x)
        a, c = R(a, b, c, d, e, F4, KK0,  6, 12, x) #/* #15 */
        #/* Parallel round 2 */
        e, b = R(e, a, b, c, d, F3, KK1,  9,  6, x)
        d, a = R(d, e, a, b, c, F3, KK1, 13, 11, x)
        c, e = R(c, d, e, a, b, F3, KK1, 15,  3, x)
        b, d = R(b, c, d, e, a, F3, KK1,  7,  7, x)
        a, c = R(a, b, c, d, e, F3, KK1, 12,  0, x)
        e, b = R(e, a, b, c, d, F3, KK1,  8, 13, x)
        d, a = R(d, e, a, b, c, F3, KK1,  9,  5, x)
        c, e = R(c, d, e, a, b, F3, KK1, 11, 10, x)
        b, d = R(b, c, d, e, a, F3, KK1,  7, 14, x)
        a, c = R(a, b, c, d, e, F3, KK1,  7, 15, x)
        e, b = R(e, a, b, c, d, F3, KK1, 12,  8, x)
        d, a = R(d, e, a, b, c, F3, KK1,  7, 12, x)
        c, e = R(c, d, e, a, b, F3, KK1,  6,  4, x)
        b, d = R(b, c, d, e, a, F3, KK1, 15,  9, x)
        a, c = R(a, b, c, d, e, F3, KK1, 13,  1, x)
        e, b = R(e, a, b, c, d, F3, KK1, 11,  2, x) #/* #31 */
        #/* Parallel round 3 */
        d, a = R(d, e, a, b, c, F2, KK2,  9, 15, x)
        c, e = R(c, d, e, a, b, F2, KK2,  7,  5, x)
        b, d = R(b, c, d, e, a, F2, KK2, 15,  1, x)
        a, c = R(a, b, c, d, e, F2, KK2, 11,  3, x)
        e, b = R(e, a, b, c, d, F2, KK2,  8,  7, x)
        d, a = R(d, e, a, b, c, F2, KK2,  6, 14, x)
        c, e = R(c, d, e, a, b, F2, KK2,  6,  6, x)
        b, d = R(b, c, d, e, a, F2, KK2, 14,  9, x)
        a, c = R(a, b, c, d, e, F2, KK2, 12, 11, x)
        e, b = R(e, a, b, c, d, F2, KK2, 13,  8, x)
        d, a = R(d, e, a, b, c, F2, KK2,  5, 12, x)
        c, e = R(c, d, e, a, b, F2, KK2, 14,  2, x)
        b, d = R(b, c, d, e, a, F2, KK2, 13, 10, x)
        a, c = R(a, b, c, d, e, F2, KK2, 13,  0, x)
        e, b = R(e, a, b, c, d, F2, KK2,  7,  4, x)
        d, a = R(d, e, a, b, c, F2, KK2,  5, 13, x) #/* #47 */
        #/* Parallel round 4 */
        c, e = R(c, d, e, a, b, F1, KK3, 15,  8, x)
        b, d = R(b, c, d, e, a, F1, KK3,  5,  6, x)
        a, c = R(a, b, c, d, e, F1, KK3,  8,  4, x)
        e, b = R(e, a, b, c, d, F1, KK3, 11,  1, x)
        d, a = R(d, e, a, b, c, F1, KK3, 14,  3, x)
        c, e = R(c, d, e, a, b, F1, KK3, 14, 11, x)
        b, d = R(b, c, d, e, a, F1, KK3,  6, 15, x)
        a, c = R(a, b, c, d, e, F1, KK3, 14,  0, x)
        e, b = R(e, a, b, c, d, F1, KK3,  6,  5, x)
        d, a = R(d, e, a, b, c, F1, KK3,  9, 12, x)
        c, e = R(c, d, e, a, b, F1, KK3, 12,  2, x)
        b, d = R(b, c, d, e, a, F1, KK3,  9, 13, x)
        a, c = R(a, b, c, d, e, F1, KK3, 12,  9, x)
        e, b = R(e, a, b, c, d, F1, KK3,  5,  7, x)
        d, a = R(d, e, a, b, c, F1, KK3, 15, 10, x)
        c, e = R(c, d, e, a, b, F1, KK3,  8, 14, x) #/* #63 */
        #/* Parallel round 5 */
        b, d = R(b, c, d, e, a, F0, KK4,  8, 12, x)
        a, c = R(a, b, c, d, e, F0, KK4,  5, 15, x)
        e, b = R(e, a, b, c, d, F0, KK4, 12, 10, x)
        d, a = R(d, e, a, b, c, F0, KK4,  9,  4, x)
        c, e = R(c, d, e, a, b, F0, KK4, 12,  1, x)
        b, d = R(b, c, d, e, a, F0, KK4,  5,  5, x)
        a, c = R(a, b, c, d, e, F0, KK4, 14,  8, x)
        e, b = R(e, a, b, c, d, F0, KK4,  6,  7, x)
        d, a = R(d, e, a, b, c, F0, KK4,  8,  6, x)
        c, e = R(c, d, e, a, b, F0, KK4, 13,  2, x)
        b, d = R(b, c, d, e, a, F0, KK4,  6, 13, x)
        a, c = R(a, b, c, d, e, F0, KK4,  5, 14, x)
        e, b = R(e, a, b, c, d, F0, KK4, 15,  0, x)
        d, a = R(d, e, a, b, c, F0, KK4, 13,  3, x)
        c, e = R(c, d, e, a, b, F0, KK4, 11,  9, x)
        b, d = R(b, c, d, e, a, F0, KK4, 11, 11, x) #/* #79 */

        t = (state[1] + cc + d) % 0x100000000
        state[1] = (state[2] + dd + e) % 0x100000000
        state[2] = (state[3] + ee + a) % 0x100000000
        state[3] = (state[4] + aa + b) % 0x100000000
        state[4] = (state[0] + bb + c) % 0x100000000
        state[0] = t % 0x100000000

    return ripemd160

ripemd160 = gen_ripemd160_with_variable_scope_protector_to_not_pollute_global_namespace()

print(f"RIPEMD-160 hash test: {ripemd160(b'hello this is a test').hex()}")
print("Number of bytes in a RIPEMD-160 digest: ", len(ripemd160(b'')))



RIPEMD-160 hash test: f51960af7dd4813a587ab26388ddab3b28d1f7b4
Number of bytes in a RIPEMD-160 digest:  20


#### Base58 and Base58Check Encoding

To represent long numbers compactly, computer systems use mixed alphanumeric representations with a base higher than 10. For example, the hexadecimal system uses base 16, with the letters A through F as the six additional symbols, so a number written in hexadecimal is shorter than its decimal equivalent. Base64 uses 26 lowercase letters, 26 uppercase letters, 10 numerals, and 2 more characters such as + and / to transmit binary data over text-based media such as email.

Base58 is a text-based binary-encoding format developed for use in Bitcoin. Its alphabet is a subset of the Base64 alphabet, omitting a few characters that are frequently mistaken for one another. Specifically, the omitted symbols are 0 (the number zero), O (capital o), l (lowercase L), I (capital i), and the symbols + and /.

To add extra protection against typos and transcription errors, Bitcoin uses Base58Check, a Base58 encoding format with a built-in error-checking code.

The checksum is an additional four bytes appended to the data being encoded, and is derived from the hash of that data.

Here follows a step-by-step breakdown of how Base58Check encoding works:
1. Start with the raw data, such as the 20-byte hash of a public key or a private key.
2. Take the raw data and add a version byte in front of it. The version byte is usually used to indicate the network type (mainnet, testnet, etc.) or the type of data being encoded.
3. Calculate the double-SHA256 hash of the version byte and the raw data.
4. Take the first four bytes of the double-SHA256 hash calculated in the previous step. These four bytes will serve as the checksum.
5. Append the four-byte checksum to the end of the version byte and raw data.
6. Convert the resulting binary data (version byte + raw data + checksum) into a large integer.
7. Encode the large integer from step 6 using the Base58 encoding scheme. This involves representing the integer in Base58 using a set of characters that excludes confusing or similar-looking characters.
8. Finally, prepend one '1' character to the Base58-encoded data for each leading zero byte in the original data.

When the raw data is the hash of a public key, the resulting Base58Check-encoded string is the final Bitcoin address.
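
The eight steps above can be sketched in a few lines. This is a minimal illustration that uses `hashlib` for SHA-256 rather than the notebook's own hash function; the names `base58check_encode`, `base58check_decode`, and `double_sha256` are chosen for this example only.

```python
import hashlib

B58_ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def base58check_encode(version: bytes, payload: bytes) -> str:
    # Steps 2-5: version byte + raw data + first 4 bytes of the double hash
    data = version + payload
    data += double_sha256(data)[:4]
    # Steps 6-7: read the bytes as a big-endian integer and convert to base 58
    num = int.from_bytes(data, 'big')
    chars = []
    while num:
        num, rem = divmod(num, 58)
        chars.append(B58_ALPHABET[rem])
    # Step 8: one '1' for every leading zero byte
    zeros = len(data) - len(data.lstrip(b'\x00'))
    return '1' * zeros + ''.join(reversed(chars))

def base58check_decode(text: str):
    """Return (version, payload) or raise ValueError on a bad checksum."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)
    zeros = len(text) - len(text.lstrip('1'))
    data = b'\x00' * zeros + num.to_bytes((num.bit_length() + 7) // 8, 'big')
    body, checksum = data[:-4], data[-4:]
    if double_sha256(body)[:4] != checksum:
        raise ValueError('checksum mismatch')
    return body[:1], body[1:]

# Example: mainnet version byte 0x00 and a 20-byte payload of zeros
address = base58check_encode(b'\x00', bytes(20))
print(address)
print(base58check_decode(address))
```

Because the checksum is recomputed on decoding, changing a single character of the encoded string is detected with very high probability.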

It's possible to create a subclass of both the field element and the point to work exclusively with the parameters of *secp256k1*.


### Public Key Cryptography

Public key cryptography involves a pair of keys known as a public key and a private key (a public key pair), which are associated with an entity that needs to authenticate its identity electronically or to sign or encrypt data.
Security of public-key cryptography depends on keeping the private key secret; the public key can be openly distributed without compromising security.

In a public-key encryption system, anyone with a public key can encrypt a message, yielding a ciphertext, but only those who know the corresponding private key can decrypt the ciphertext to obtain the original message.

In a digital signature system, a sender can use a private key together with a message to create a signature. Anyone with the corresponding public key can verify whether the signature matches the message, but a forger who does not know the private key cannot find any message/signature pair that will pass verification with the public key.
Public key algorithms are fundamental security primitives in modern cryptosystems, including applications and protocols which offer assurance of the confidentiality, authenticity and non-repudiability of electronic communications and data storage.

The key operation needed in our case is P = eG, which is an asymmetric equation. It's easy to compute P when e and G are known, but we cannot easily compute e when we know P and G.
e is the private key and P the public key. The private key is a single 256-bit number and the public key is a coordinate (x,y), where x and y are each 256-bit numbers.
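
To illustrate why computing P = eG is cheap, here is a minimal, self-contained sketch of double-and-add scalar multiplication on secp256k1. It is not the notebook's `Point` class: points are plain `(x, y)` tuples, `None` stands for the point at infinity, and the names `FIELD_PRIME`, `GROUP_ORDER`, `GENERATOR`, `point_add`, and `scalar_mult` are chosen for this example. It assumes Python 3.8 or later for `pow(x, -1, m)`.

```python
# secp256k1 domain parameters (public constants); names are local to this sketch
FIELD_PRIME = 2**256 - 2**32 - 977
GROUP_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
GENERATOR = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)

def point_add(p1, p2):
    """Add two points on y^2 = x^3 + 7; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % FIELD_PRIME == 0:
        return None  # vertical line: the points are inverses of each other
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, FIELD_PRIME)  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % FIELD_PRIME, -1, FIELD_PRIME)  # chord
    x3 = (slope * slope - x1 - x2) % FIELD_PRIME
    y3 = (slope * (x1 - x3) - y1) % FIELD_PRIME
    return (x3, y3)

def scalar_mult(e, point):
    """Compute e * point with double-and-add: one doubling per bit of e."""
    result, addend = None, point
    while e:
        if e & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        e >>= 1
    return result

secret = int.from_bytes(b'Bitcoin is cool!', 'big')  # toy secret, not random
public_point = scalar_mult(secret, GENERATOR)
print(f'x = {public_point[0]:064x}')
print(f'y = {public_point[1]:064x}')
```

The forward direction takes at most 256 doublings and 256 additions. No comparably fast method is known for the reverse direction, recovering e from P and G.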


#### ECDSA

ECDSA stands for Elliptic Curve Digital Signature Algorithm.
ECDSA is based on the mathematics of elliptic curves, which offer a high level of security with shorter key lengths compared to traditional algorithms like RSA.

Digital Signatures are used to ensure the authenticity, integrity, and non-repudiation of data. With a digital signature a sender can sign a message to prove its origin. Recipients can verify the signature to ensure the message has not been tampered with and indeed came from the claimed sender.

In ECDSA, each participant generates a pair of cryptographic keys: a private key and a corresponding public key. The private key is a randomly generated secret number, while the public key is derived from the private key using elliptic curve operations.

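To create a digital signature using ECDSA, the signer performs the following steps:
- Hashing: the message is hashed to obtain a 256-bit number z.
- Random number generation: a secret number k is chosen for this signature; it must never be reused for another signature.
- Signature calculation: r is the x coordinate of the point kG, and s = (z + r*e) / k mod n, where e is the private key and n is the order of G. The signature is the pair (r, s).

To verify a received signature, the recipient performs the following steps:
- Hashing: the message is hashed to obtain the same number z.
- Point computation: using the signer's public key P, the recipient computes u = z / s mod n, v = r / s mod n, and the point uG + vP.
- Signature verification: the signature is valid if the x coordinate of uG + vP equals r.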

The security of the ECDSA depends on the size of the elliptic curve used and the length of the private key. The larger the curve and key length, the more secure the signature is.
In Bitcoin the length of the key is 256 bits.
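
The signing and verification steps can also be written as equations. The block below is a sketch in this notebook's notation (private key $e$, public key $P = eG$, group order $n$); it assumes that $z$ is the message hash read as an integer, that $k$ is the per-signature random number, and that $x(\cdot)$ denotes the x coordinate of a point.

$$
\begin{aligned}
\text{sign:}\quad & r = x(kG) \bmod n, \qquad s = k^{-1}(z + r e) \bmod n \\
\text{verify:}\quad & u = z\, s^{-1} \bmod n, \qquad v = r\, s^{-1} \bmod n, \qquad \text{accept iff } x(uG + vP) \equiv r \pmod n
\end{aligned}
$$

Verification succeeds for an honest signature because $uG + vP = s^{-1}(z + r e)\,G = kG$, whose x coordinate is $r$ by construction.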

#### SEC Format
The SEC format, defined in the Standards for Efficient Cryptography (SEC), is the standard method for encoding a Bitcoin public key. A Bitcoin public key is a point on the elliptic curve secp256k1, and thus has an x and a y coordinate.

However, each x value has only two possible y values, y and p - y, and because the field prime p is odd, one of them is even and the other is odd. Thus, the x value and the parity of the y value are sufficient to identify the public key. The parity of the y value is indicated by a prefix byte, 0x02 for even or 0x03 for odd. This is followed by the x value, which is a 32-byte number.

This format is called compressed, because it only takes up 33 bytes, compared with the uncompressed SEC format, which begins with an 0x04 prefix followed by the full x and y values and takes up 65 bytes.
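
The following sketch shows how a compressed key can be produced and how the full point can be recovered from it. It is independent of the notebook's classes: the functions `sec_compress` and `sec_decompress` and the constant names are chosen for this example. Recovering y relies on the fact that the secp256k1 prime p satisfies p % 4 == 3, so a square root of w is w^((p+1)/4) mod p.

```python
FIELD_PRIME = 2**256 - 2**32 - 977

# Generator point of secp256k1, used here only as a known point on the curve
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

def sec_compress(x: int, y: int) -> bytes:
    # 0x02 for even y, 0x03 for odd y, then x as 32 big-endian bytes
    prefix = b'\x02' if y % 2 == 0 else b'\x03'
    return prefix + x.to_bytes(32, 'big')

def sec_decompress(sec: bytes):
    if len(sec) != 33 or sec[0] not in (2, 3):
        raise ValueError('not a compressed SEC public key')
    x = int.from_bytes(sec[1:], 'big')
    # Solve y^2 = x^3 + 7 for y
    y_squared = (pow(x, 3, FIELD_PRIME) + 7) % FIELD_PRIME
    y = pow(y_squared, (FIELD_PRIME + 1) // 4, FIELD_PRIME)
    if y * y % FIELD_PRIME != y_squared:
        raise ValueError('x is not the coordinate of a curve point')
    # Pick the root whose parity matches the prefix byte
    if (y % 2 == 0) != (sec[0] == 2):
        y = FIELD_PRIME - y
    return x, y

compressed = sec_compress(GX, GY)
print(compressed.hex())
print(sec_decompress(compressed) == (GX, GY))
```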

#### DER Signatures
DER (Distinguished Encoding Rules) is a binary encoding format for data structures that is widely used in public key cryptography. DER gives every value exactly one valid byte representation, so all software serializes the same signature to the same bytes. Bitcoin uses DER to encode ECDSA signatures.

A digital signature is a cryptographic technique that provides integrity, authenticity, and non-repudiation to data. It allows the recipient of a message or data to verify that it came from the claimed sender and has not been altered during transmission. Digital signatures are based on asymmetric cryptography, which involves the use of a public-private key pair.

The process of creating a digital signature involves the following steps:

1. Generating the Message Digest: The original data (message) is first passed through a one-way cryptographic hash function (e.g., SHA-256). This produces a fixed-size output called the message digest or hash value.
2. Signing the Digest: The signer's private key is combined with the message digest to compute the digital signature, using an algorithm such as RSA or ECDSA. With ECDSA this is a computation on the digest, not an encryption of it.
3. Encoding the Signature: The resulting digital signature is then encoded in DER format before attaching it to the original message.

A DER signature consists of two main components:

1. R (Positive Integer): The first component represents part of the signature value. In ECDSA it is a positive integer equal to the x coordinate of the point kG, where k is the per-signature random number.

2. S (Positive Integer): The second component also represents part of the signature value. Like R, it is a positive integer that complements R to create the complete signature.

Both R and S are critical components for verifying the digital signature. They are used in the verification process to ensure the authenticity of the signature and the integrity of the data.
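
To make the byte layout concrete, here is a minimal sketch of how an (r, s) pair is laid out in DER: `0x30`, total length, then `0x02`, length, and value for each integer. The helper names `der_integer` and `der_signature` are illustrative, and the single-byte length fields assume a body shorter than 128 bytes, which holds for 256-bit ECDSA values.

```python
def der_integer(value: int) -> bytes:
    # Minimal big-endian encoding of a non-negative integer
    body = value.to_bytes((value.bit_length() + 7) // 8 or 1, 'big')
    # DER integers are signed: a leading 0x00 keeps a high first bit
    # from being read as a negative number
    if body[0] & 0x80:
        body = b'\x00' + body
    return bytes([0x02, len(body)]) + body

def der_signature(r: int, s: int) -> bytes:
    body = der_integer(r) + der_integer(s)
    # 0x30 marks a compound structure (SEQUENCE), followed by its length
    return bytes([0x30, len(body)]) + body

# Small values chosen so the layout is easy to read:
# 30 07 | 02 02 00 80 | 02 01 7f
print(der_signature(0x80, 0x7f).hex())
```

The extra zero byte in front of `0x80` is why a DER-encoded signature with two 32-byte values can be up to 72 bytes long instead of 70.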

In [367]:
# base58 encoding / decoding utilities
# reference: https://en.bitcoin.it/wiki/Base58Check_encoding

alphabet = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def b58encode(b: bytes) -> str:
    print(len(b))
    assert len(b) == 25 # version is 1 byte, pkb_hash 20 bytes, checksum 4 bytes
    n = int.from_bytes(b, 'big')
    chars = []
    while n:
        n, i = divmod(n, 58)
        chars.append(alphabet[i])
    # special case handle the leading 0 bytes... ¯\_(ツ)_/¯
    num_leading_zeros = len(b) - len(b.lstrip(b'\x00'))
    res = num_leading_zeros * alphabet[0] + ''.join(reversed(chars))
    return res

# secp256k1 Field element
class S256Field(FieldElement):

    def __init__(self, num, prime=None):
        super().__init__(num=num, prime=P)

    def __repr__(self):
        return '{:x}'.format(self.num).zfill(64)

    def sqrt(self):
        return self**((P + 1) // 4)


# secp256k1 Point class
class S256Point(Point):

    def __init__(self, x, y, a=None, b=None):
        a, b = S256Field(A), S256Field(B)
        if type(x) == int:
            super().__init__(x=S256Field(x), y=S256Field(y), a=a, b=b)
        else:
            super().__init__(x=x, y=y, a=a, b=b)

    def __repr__(self):
        if self.x is None:
            return 'S256Point(infinity)'
        else:
            return f'S256Point({self.x}, {self.y})'

    def __rmul__(self, coefficient):
        coef = coefficient % n
        return super().__rmul__(coef)

    def verify(self, z, sig):
        # Fermat's Little Theorem, 1/s = pow(s, N-2, N)
        s_inv = pow(sig.s, n - 2, n)
        # u = z / s
        u = z * s_inv % n
        # v = r / s
        v = sig.r * s_inv % n
        # u*G + v*P should have as the x coordinate, r
        total = u * G + v * self
        return total.x.num == sig.r

    def sec(self, compressed=True):
        """
        Returns the binary version of the SEC format
        """
        # If compressed: prefix b'\x02' when self.y.num is even, b'\x03' when it is odd,
        # followed by self.x.num as 32 big-endian bytes
        # (some_integer.to_bytes(32, 'big')).
        if compressed:
            if self.y.num % 2 == 0:
                return b'\x02' + self.x.num.to_bytes(32, 'big')
            else:
                return b'\x03' + self.x.num.to_bytes(32, 'big')
        else:
            # If uncompressed: prefix b'\x04' followed by self.x and then self.y
            return b'\x04' + self.x.num.to_bytes(32, 'big') + \
                self.y.num.to_bytes(32, 'big')

    # SHA256 followed by RIPEMD160
    def hash160(self, compressed=True):
        return ripemd160(sha256(self.sec(compressed=compressed)))

    # Two rounds of SHA256
    def hash256(self, compressed=True):
        return sha256(sha256(self.sec(compressed=compressed)))

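    def b58encode_checksum(self, b: bytes) -> str:
        # The checksum is the first 4 bytes of SHA256(SHA256(b)).
        # self.hash256 hashes this point's SEC encoding, so it cannot be used here.
        return b58encode(b + sha256(sha256(b))[:4])

    def b58decode(self, s: str) -> bytes:
        num = 0
        for c in s:
            num *= 58
            num += alphabet.index(c)
        combined = num.to_bytes(25, 'big')
        checksum = combined[-4:]
        # Recompute the checksum over version byte + payload
        expected = sha256(sha256(combined[:-4]))[:4]
        if expected != checksum:
            raise ValueError(f'Bad address: {checksum} {expected}')
        return combined[1:-4]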

    def address(self, compressed=True, testnet=True):
        """
        Returns the address string
        """
        h160 = self.hash160(compressed)
        if testnet:
            prefix = b'\x6f'
        else:
            prefix = b'\x00'
        return b58encode(prefix + h160)


In [368]:
from io import BytesIO

class Signature:
    
    def __init__(self, r, s):
        self.r = r
        self.s = s
        
    def __repr__(self):
        return f'Signature(r={self.r}, s={self.s})'
    
    def der(self):
        rbin = self.r.to_bytes(32, 'big')
        rbin = rbin.lstrip(b'\x00')
        if rbin[0] & 0x80:
            rbin = b'\x00' + rbin
        result = bytes([2, len(rbin)]) + rbin
        sbin = self.s.to_bytes(32, 'big')
        sbin = sbin.lstrip(b'\x00')
        if sbin[0] & 0x80:
            sbin = b'\x00' + sbin
        result += bytes([2, len(sbin)]) + sbin
        return bytes([0x30, len(result)]) + result
    
    @classmethod
    def parse(cls, signature_bin):
        s = BytesIO(signature_bin)
        compound = s.read(1)[0]
        if compound != 0x30:
            raise ValueError(f'Bad signature: {signature_bin}')
        length = s.read(1)[0]
        if length + 2 != len(signature_bin):
            raise ValueError(f'Bad signature length: {len(signature_bin)}')
        marker = s.read(1)[0]
        if marker != 0x02:
            raise ValueError(f'Bad signature: {marker}')
        rlength = s.read(1)[0]
        r = int.from_bytes(s.read(rlength), 'big')
        marker = s.read(1)[0]
        if marker!= 0x02:
            raise ValueError(f'Bad signature: {marker}')
        slength = s.read(1)[0]
        s = int.from_bytes(s.read(slength), 'big')
        if len(signature_bin) != 6 + rlength + slength:
            raise ValueError(f'Signature too long: {len(signature_bin)}')
        return cls(r, s)


#### HMAC

HMAC (Hash-based Message Authentication Code) is a specific type of message authentication code (MAC) that provides a way to verify the integrity and authenticity of a message or data. It is widely used in various network protocols and applications to ensure data integrity and prevent tampering.

HMAC uses a cryptographic hash function in combination with a secret key to generate a fixed-size output (the MAC) for a given input (the message). The HMAC algorithm is designed to be resistant to various cryptographic attacks, including collision attacks and length extension attacks.

The HMAC process can be summarized as follows:

1. Input: A message (M) that needs to be protected, and a secret key (K) known only to the sender and receiver.

2. Hash Function: HMAC uses a cryptographic hash function, such as SHA-256 or SHA-512, which takes an arbitrary-length message and produces a fixed-size hash value.

3. Key Padding: If the secret key (K) is shorter than the block size of the hash function (64 bytes for SHA-256), it is padded with zero bytes to match the block size. If it is longer, it is first hashed, and the result is then padded in the same way.

4. Inner Padding: The padded key is XORed with the constant byte 0x36, repeated to fill the block, to create the inner padding.

5. Outer Padding: The padded key is XORed with a different constant byte, 0x5c, repeated in the same way, to create the outer padding.

6. Hash Computation: The inner padding is concatenated with the message (M), and the resulting data is hashed using the hash function. Then, the outer padding is concatenated with the previous hash result, and the resulting data is hashed again.

7. Output: The final hash value is the HMAC, which serves as the message authentication code.

HMAC provides several security properties:

1. Integrity: The receiver can verify the integrity of the message by recomputing the HMAC using the received message and the shared secret key and comparing it to the received HMAC.

2. Authenticity: Since the HMAC is based on the secret key, it ensures that the message was generated by someone possessing the key.

3. Keyed Hashing: HMAC uses a secret key, making it different from standard hash functions, which only provide integrity but not authenticity.

HMAC is widely used in various security protocols, such as IPsec, TLS, and SSH, to ensure the authenticity and integrity of transmitted data, and it is an essential component of secure communication on the internet.
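
The seven steps can be checked against Python's standard library. The sketch below builds HMAC-SHA256 by hand and compares it with the `hmac` module; the function name `hmac_sha256` and the sample key and message are made up for this example.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64  # SHA-256 processes 64-byte blocks
    # Key padding: hash keys that are too long, then pad with zero bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b'\x00')
    # Inner and outer padding: XOR the padded key with fixed constants
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5c for b in key)
    # Hash computation: H(outer_pad + H(inner_pad + message))
    inner_hash = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner_hash).digest()

key = b'shared secret'
message = b'hello this is a test'
manual = hmac_sha256(key, message)
reference = hmac.new(key, message, hashlib.sha256).digest()
print(manual.hex())
print(manual == reference)
```

When comparing a received MAC with a recomputed one, `hmac.compare_digest` is usually preferred over `==` because it runs in constant time.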

In [369]:
import hmac

# Private Key class implementation
class PrivateKey:
    
    def __init__(self, secret):
        self.secret = secret
        self.point = secret * G
        
    def hex(self):
        return '{:x}'.format(self.secret).zfill(64)
    
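    def deterministic_k(self, z):
        # Deterministic nonce in the style of RFC 6979: k is derived from the
        # secret and the message hash z with HMAC-SHA256, so the same
        # (secret, z) pair always yields the same k.
        k = b'\x00' * 32
        v = b'\x01' * 32
        if z > n:
            z -= n
        z_bytes = z.to_bytes(32, 'big')
        secret_bytes = self.secret.to_bytes(32, 'big')
        # hmac.new needs a hashlib digest (selected here by name), not a
        # function that maps bytes to bytes like the notebook's sha256
        s256 = 'sha256'
        k = hmac.new(k, v + b'\x00' + secret_bytes + z_bytes, s256).digest()
        v = hmac.new(k, v, s256).digest()
        k = hmac.new(k, v + b'\x01' + secret_bytes + z_bytes, s256).digest()
        v = hmac.new(k, v, s256).digest()
        while True:
            v = hmac.new(k, v, s256).digest()
            candidate = int.from_bytes(v, 'big')
            # Accept the first candidate that is a valid scalar in [1, n-1]
            if 1 <= candidate < n:
                return candidate
            k = hmac.new(k, v + b'\x00', s256).digest()
            v = hmac.new(k, v, s256).digest()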
            
    def sign(self, z):
        k = self.deterministic_k(z)
        r = (k * G).x.num   # r is the x coordinate of the point k*G
        k_inv = pow(k, n-2, n)  # 1/k = pow(k, N-2, N)
        s = (z + r * self.secret) * k_inv % n
        if s > n // 2:
            s = n - s
        return Signature(r, s)
    
    def wif(self, compressed=True, testnet=True):
        secret_bytes = self.secret.to_bytes(32, 'big')
        if testnet:
            prefix = b'\xef'
        else:
            prefix = b'\x80'
        if compressed:
            suffix = b'\x01'
        else:
            suffix = b''
        return b58encode_checksum(prefix + secret_bytes + suffix)

## 2. Generation of the Crypto identity

In the Bitcoin network, public-key cryptography is used to secure transactions and manage ownership of funds.

A *private key* in Bitcoin is a randomly generated 256-bit number (64 characters in hexadecimal format). It is the most critical piece of information in a Bitcoin wallet as it grants control and ownership over the associated funds.
With the private key, it is possible to create digital signatures that authenticate transactions and spend the bitcoins associated with the corresponding public address.
So it's essential to keep the private key secure and confidential.
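
The private key used later in this notebook is built from a fixed, human-readable string, which is convenient for a demonstration but is not random. As a minimal sketch of what "randomly generated" means in practice, a key can be drawn from the operating system's secure random source with the standard `secrets` module. The function name `generate_private_key` is chosen for this example, and `GROUP_ORDER` is the order n of the secp256k1 generator point.

```python
import secrets

# Order n of the secp256k1 generator point; valid private keys lie in [1, n-1]
GROUP_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def generate_private_key() -> int:
    """Return a uniformly random integer in [1, GROUP_ORDER - 1]."""
    # randbelow(m) returns a value in [0, m - 1], so shift the range by one
    return secrets.randbelow(GROUP_ORDER - 1) + 1

secret = generate_private_key()
print(f'{secret:064x}')  # 64 hexadecimal characters, different on every run
```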

A *public key* is derived from the private key using elliptic curve cryptography. Specifically, the public key is a point on the elliptic curve, and it can be serialized either as an uncompressed 65-byte value or as a compressed 33-byte value.
The public key is used to create a Bitcoin address and verify digital signatures generated by the associated private key.
Unlike the private key, the public key can be shared openly without compromising the security of the funds.

A *Bitcoin address* is a shorter and more user-friendly representation of the public key. It's typically 26-35 characters long, starts with "1", "3", or "bc1", and can be used as a unique identifier for the public key.
The address is used to receive funds. The process of converting the public key to a Bitcoin address is as follows:

1. Hashing the public key with SHA-256.
2. Hashing the result again with RIPEMD-160.
3. Prepending the version byte (0x00 for a mainnet address) to the result and computing a 4-byte checksum over both.
4. Encoding the data using Base58Check to create the final Bitcoin address.

It's important to note that while a public key can be derived from a private key, the reverse is not computationally feasible. That means you cannot obtain a private key from a public key or Bitcoin address, ensuring the security of the system.

Private keys, public keys, and addresses are integral components of Bitcoin's cryptographic system.

To summarize, the private key allows you to control the associated funds, the public key is used to create addresses and verify signatures, and the address is used for receiving funds and representing the public key in a user-friendly way.

In [370]:
# Private Key generation
pk = PrivateKey(int.from_bytes(b'Bitcoin is cool!', 'big'))
print(f'Private key: {pk.hex()}')

Private key: 00000000000000000000000000000000426974636f696e20697320636f6f6c21


In [371]:
# Public Key generation
pbk_point = S256Point(pk.point.x, pk.point.y)
public_key = pbk_point.sec(compressed=True)
print(f'Public key: {public_key.hex()}')

Public key: 031624f69b1b10f449846762615c16aec047b6eedfe5e10a5d12595f125e751a95


In [372]:
# Bitcoin Address generation
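# TODO: address() calls b58encode without appending the 4-byte checksum, so only
# 21 bytes (version + hash160) reach the 25-byte assertion; switching to
# b58encode_checksum is still pending because it currently raises an error.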
bitcoin_address = pbk_point.address(testnet=False)

21


AssertionError: 