I will build a simple blockchain data structure, loosely modeled on the one behind Bitcoin.

Agenda:
* SHA-256
* Hashing properties
* Create genesis block
* Build blockchain
* Check data integrity

Reference: https://medium.com/coinmonks/building-a-simple-blockchain-data-structure-with-python-e7ebd448647a

## SHA-256

In [2]:
import hashlib
hashlib.sha256(b"hello world").hexdigest()

'b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9'

Hashing turns the string "hello world" into a fixed-length 256-bit digest. `hexdigest()` shows it in hexadecimal, where each character encodes 4 bits, so the digest is 64 characters long (64 × 4 = 256).

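We can confirm the 256 bits by printing the digest in binary; the digest length is where the name SHA-256 comes from. Note that `bin()` drops leading zeros, so all 256 digits appear here only because this digest starts with a 1 bit (hex `b` is `1011`).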

In [4]:

bin(0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9)

'0b1011100101001101001001111011100110010011010011010011111000001000101001010010111001010010110101111101101001111101101010111111101011000100100001001110111111100011011110100101001110000000111011101001000010001000111101111010110011100010111011111100110111101001'
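
Counting digits by eye is error-prone, so here is a small sketch that computes the length instead. It uses only `hashlib`; the variable names are my own and not part of the original notebook.

```python
import hashlib

h = hashlib.sha256(b"hello world")
hex_digest = h.hexdigest()

# digest_size is the digest length in bytes (32 for SHA-256)
bits_from_bytes = h.digest_size * 8
# each hexadecimal character encodes 4 bits
bits_from_hex = len(hex_digest) * 4
print(bits_from_bytes, bits_from_hex)  # 256 256

# zero-pad to 256 digits so leading zero bits are kept, unlike bin()
binary = format(int(hex_digest, 16), "0256b")
print(len(binary))  # 256
```

The zero-padded form is the safer way to display the bits, since it has 256 digits for every digest.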

## Hashing properties

* Collision resistance: it is computationally infeasible to find two different inputs x and y such that hash(x) equals hash(y).
* Avalanche effect: a tiny change in the input produces a very different output.

In [6]:
hashlib.sha256(b"1").hexdigest()

'6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b'

In [7]:
hashlib.sha256(b"2").hexdigest()

'd4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35'
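
To put a number on "very different", the sketch below counts how many of the 256 output bits differ between two digests. The helper `bit_difference` is a hypothetical name introduced for illustration. For a well-behaved hash function, roughly half of the bits are expected to differ.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 wherever the two digests disagree
    return bin(da ^ db).count("1")

print(bit_difference(b"1", b"2"), "of 256 bits differ")
print(bit_difference(b"1", b"1"))  # 0: identical inputs give identical digests
```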

## Create genesis block

In [8]:
import hashlib, json

In [9]:
block_genesis = {
    'prev_hash': None,
    'transactions': [1,3,4,2]
}

Each number stands in for a transaction such as "a pays 1 BTC to b".

Serialize the block to bytes so it can be hashed. `sort_keys = True` fixes the key order, so the same block always produces the same hash.

In [12]:
block_genesis_serialized = json.dumps(block_genesis, sort_keys = True).encode('utf-8')
block_genesis_hash = hashlib.sha256(block_genesis_serialized).hexdigest()
# UTF-8 is one of the character encodings for Unicode; encode() converts the JSON string to the bytes that sha256() expects
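
The role of `sort_keys` can be illustrated with two dictionaries that hold the same content in a different key order. This is a sketch; `hash_block` is a hypothetical helper that wraps the two lines from the cell above.

```python
import hashlib, json

def hash_block(block: dict) -> str:
    """Serialize a block with sorted keys and return its SHA-256 hex digest."""
    serialized = json.dumps(block, sort_keys=True).encode('utf-8')
    return hashlib.sha256(serialized).hexdigest()

a = {'prev_hash': None, 'transactions': [1, 3, 4, 2]}
b = {'transactions': [1, 3, 4, 2], 'prev_hash': None}  # same content, different key order

print(hash_block(a) == hash_block(b))  # True

# without sort_keys the JSON text follows insertion order, so the strings differ
print(json.dumps(a) == json.dumps(b))  # False
```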

## Build blockchain

In [17]:
block_2 = {
    'prev_hash': block_genesis_hash,
    'transactions': [3,3,3,8,7, 12]
}

Hash block_2.

In [18]:
block_2_serialized = json.dumps(block_2, sort_keys = True).encode('utf-8')
block_2_hash = hashlib.sha256(block_2_serialized).hexdigest()

Build block_3, which points back to block_2 through its hash.

In [19]:
block_3 = {
    'prev_hash': block_2_hash,
    'transactions': [3,4,4,8,34]
}

Hash block_3.

In [20]:
block_3_serialized = json.dumps(block_3, sort_keys = True).encode('utf-8')
block_3_hash = hashlib.sha256(block_3_serialized).hexdigest()
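
The build-then-hash steps repeat for every block, so they can be folded into a loop. The following is a minimal sketch under that assumption; `hash_block` and `build_chain` are hypothetical helper names, not part of the original notebook.

```python
import hashlib, json

def hash_block(block):
    """Serialize a block with sorted keys and return its SHA-256 hex digest."""
    serialized = json.dumps(block, sort_keys=True).encode('utf-8')
    return hashlib.sha256(serialized).hexdigest()

def build_chain(transaction_batches):
    """Return a list of blocks, each storing the hash of the block before it."""
    chain = []
    prev_hash = None  # the genesis block has no predecessor
    for transactions in transaction_batches:
        block = {'prev_hash': prev_hash, 'transactions': list(transactions)}
        chain.append(block)
        prev_hash = hash_block(block)
    return chain

chain = build_chain([[1, 3, 4, 2], [3, 3, 3, 8, 7, 12], [3, 4, 4, 8, 34]])
print(chain[0]['prev_hash'])                          # None
print(chain[1]['prev_hash'] == hash_block(chain[0]))  # True
```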

## Check data integrity

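To make sure the data has not been tampered with, we recompute the hashes along the chain and compare the last block's hash with the original one. Each block's hash depends on the previous block's hash, so changing any block changes the final hash.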

In [21]:
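def hash_blocks(blocks):
    """Re-link and re-hash the blocks in order; return the last block's hash."""
    prev_hash = None  # the genesis block has no predecessor
    for block in blocks:
        # note: this overwrites the stored prev_hash with the recomputed one
        block['prev_hash'] = prev_hash
        block_serialized = json.dumps(block, sort_keys = True).encode('utf-8')
        block_hash = hashlib.sha256(block_serialized).hexdigest()
        prev_hash = block_hash
    return prev_hash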

In [22]:
print("original hash")
print(hash_blocks([block_genesis, block_2, block_3]))

original hash
45eda4f7a76bf0f92a0acda2ce4752dfbe167473376f766f22d7ec68501cac40


In [23]:
print("tampering the data")
block_genesis['transactions'][0] = 3

tampering the data


In [24]:
print("after being tampered")
print(hash_blocks([block_genesis,block_2,block_3]))

after being tampered
27d68dae05428be6aa244869196a481f431fca6645dd33c3df7a740afa03b7d9
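
The two hashes differ, so the change is detected, but the final hash alone does not say which block was altered. One possible way to locate the break, sketched here as an assumption rather than as part of the original notebook, is to compare each block's stored `prev_hash` with the recomputed hash of the block before it, without overwriting anything. The names `hash_block` and `find_broken_link` are hypothetical.

```python
import hashlib, json

def hash_block(block):
    """Serialize a block with sorted keys and return its SHA-256 hex digest."""
    serialized = json.dumps(block, sort_keys=True).encode('utf-8')
    return hashlib.sha256(serialized).hexdigest()

def find_broken_link(blocks):
    """Return the index of the first block whose stored prev_hash does not match
    the recomputed hash of the block before it, or None if every link holds."""
    for i in range(1, len(blocks)):
        if blocks[i]['prev_hash'] != hash_block(blocks[i - 1]):
            return i
    return None

genesis = {'prev_hash': None, 'transactions': [1, 3, 4, 2]}
second = {'prev_hash': hash_block(genesis), 'transactions': [3, 3, 3, 8, 7, 12]}
third = {'prev_hash': hash_block(second), 'transactions': [3, 4, 4, 8, 34]}
chain = [genesis, second, third]

print(find_broken_link(chain))   # None: the chain is intact
genesis['transactions'][0] = 3   # tamper with the first block
print(find_broken_link(chain))   # 1: block 1 no longer matches the hash of block 0
```

This check cannot catch a change to the last block, because no later block stores its hash. The last block's hash still has to be recorded and compared separately, as in the cells above.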
