<div style="text-align: right">Paul Novaes<br>August 2018</div> 

# Blockchain

This notebook explains the basics of Blockchain by giving a simplified implementation that captures the most important concepts.

A blockchain is a list of blocks, each one containing a piece of data. The blockchain is distributed (replicated) across nodes in a peer-to-peer network. Blocks can be appended to the blockchain, but cannot be removed or tampered with, as long as more than 50% of the computational power belongs to honest parties.

## Block

In Blockchain, a block is a record that holds some data and the hash of the previous block in the chain. This links the blocks together and orders them by time of insertion.

In [1]:
import hashlib

class BasicBlock:
    def __init__(self, data, previous_block_hash):
        self.data = data
        self.previous_block_hash = previous_block_hash

    def get_hash(self):
        record = self.data + self.previous_block_hash
        return hashlib.sha256(record.encode()).hexdigest()
    
    def __str__(self):
        return 'data: ' + self.data \
             + '\nprevious_block_hash: ' + self.previous_block_hash \
             + '\nhash: ' + self.get_hash()
    
    def __eq__(self, other):
        # This is not entirely correct but good enough for our needs.
        return self.get_hash() == other.get_hash()

## Genesis Block

The initial block of the blockchain is called the genesis block and does not have a previous block.

In [2]:
genesis_data = 'The Times 03/Jan/2009 Chancellor on brink of second bailout for banks'
genesis_previous_block_hash = 'n/a'
basic_genesis_block = BasicBlock(genesis_data, genesis_previous_block_hash)

print("Genesis Block:\n")
print(basic_genesis_block)

Genesis Block:

data: The Times 03/Jan/2009 Chancellor on brink of second bailout for banks
previous_block_hash: n/a
hash: 6b6e12db2b97827a703cac00af905fd696aae093089f79a1be1d711e74b06397


## Chain

A chain, or blockchain, is an array of blocks that starts with a genesis block and in which the "previous_block_hash" field of the block at index $i + 1$ matches the hash of the block at index $i$.

In [3]:
class BasicChain:
    def __init__(self, genesis_block):
        self.blocks = [genesis_block]
        
    def add_block(self, block):
        if block.previous_block_hash != self.blocks[-1].get_hash():
            print('Cannot add block:')
            print('hash:', self.blocks[-1].get_hash())
            print('previous_block_hash:', block.previous_block_hash)
            return
        self.blocks.append(block)

    def __str__(self):
        chain_str = 'Blockchain:\n'
        for i, block in enumerate(self.blocks):
            chain_str += '\nblock ' + str(i)
            chain_str += '\n' + str(block) + '\n'
        return chain_str

For example:

In [4]:
basic_chain = BasicChain(basic_genesis_block)

block_0 = basic_chain.blocks[0]
block_1 = BasicBlock('Hello', block_0.get_hash())
block_2 = BasicBlock('Hola', block_1.get_hash())

basic_chain.add_block(block_1)
basic_chain.add_block(block_2)

print(basic_chain)

Blockchain:

block 0
data: The Times 03/Jan/2009 Chancellor on brink of second bailout for banks
previous_block_hash: n/a
hash: 6b6e12db2b97827a703cac00af905fd696aae093089f79a1be1d711e74b06397

block 1
data: Hello
previous_block_hash: 6b6e12db2b97827a703cac00af905fd696aae093089f79a1be1d711e74b06397
hash: a6c7a932f965a1995340021e5bbedcbf21a8ed212784f03deb2bc75368e0de63

block 2
data: Hola
previous_block_hash: a6c7a932f965a1995340021e5bbedcbf21a8ed212784f03deb2bc75368e0de63
hash: fc2c10ae7cfa361ae6515b650a0d264ccadb554e6f4d894fdb6cbf9af6c6122a



## Chain Validation

Whoever has a blockchain can (and should) validate it to make sure it is correct.

In [5]:
def validate_basic_chain(chain, genesis_block):
    block_count = len(chain.blocks)
    if block_count == 0:
        print('Empty chain!')
        return False
    block_0 = chain.blocks[0]
    if block_0 != genesis_block:
        print('Incorrect genesis block!')
        print(block_0)
        print(genesis_block)
        return False
    for i in range(1, len(chain.blocks)):
        if chain.blocks[i - 1].get_hash() != chain.blocks[i].previous_block_hash:
            print('Incorrect chaining!')
            return False
    return True

For example:

In [6]:
validate_basic_chain(basic_chain, basic_genesis_block)

True

But if we change the data of a block, its hash changes, so the hash stored in the next block no longer matches and validation fails:

In [7]:
basic_chain.blocks[1].data = 'Bonjour'
validate_basic_chain(basic_chain, basic_genesis_block)

Incorrect chaining!


False

## Chain tampering

Unfortunately, it is possible to tamper with a chain by recomputing the hashes of the modified block and of every block after it:

In [8]:
basic_chain.blocks[1].data = 'Bonjour'
basic_chain.blocks[1].previous_block_hash = basic_chain.blocks[0].get_hash()
basic_chain.blocks[2].data = 'Hello'
basic_chain.blocks[2].previous_block_hash = basic_chain.blocks[1].get_hash()
print(basic_chain)
validate_basic_chain(basic_chain, basic_genesis_block)

Blockchain:

block 0
data: The Times 03/Jan/2009 Chancellor on brink of second bailout for banks
previous_block_hash: n/a
hash: 6b6e12db2b97827a703cac00af905fd696aae093089f79a1be1d711e74b06397

block 1
data: Bonjour
previous_block_hash: 6b6e12db2b97827a703cac00af905fd696aae093089f79a1be1d711e74b06397
hash: cc60dbae9dec5a899a5cd0abc2c16df3d3b97f846dbfb753a7030a968efb94b3

block 2
data: Hello
previous_block_hash: cc60dbae9dec5a899a5cd0abc2c16df3d3b97f846dbfb753a7030a968efb94b3
hash: 4bf40e3a1d3ef04d0800abf4ca7301e86b450b266682a6caf5d4a9af4e90d954



True

## Block and Chain Revisited

To make it more difficult to tamper with a chain, we want to make producing valid blocks more difficult.

So far, producing a valid block only requires computing a hash, which is straightforward. However, because the hash function used in Blockchain (SHA-256) is a one-way function, finding a block with a given hash is very difficult.
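As a small illustration of that asymmetry, the sketch below hashes two strings that differ in a single character. Computing each digest is one library call, yet the two digests look unrelated, which is why there is no obvious shortcut from a desired hash back to an input. The helper name `sha256_hex` and the two input strings are illustrative assumptions, not part of the notebook's classes.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 digest of text as a 64-character hex string."""
    return hashlib.sha256(text.encode()).hexdigest()

# Two inputs that differ in a single character (illustrative values).
digest_a = sha256_hex('Hello')
digest_b = sha256_hex('Hellp')

# Count how many of the 64 hex positions hold the same character.
matching_positions = sum(a == b for a, b in zip(digest_a, digest_b))

print(digest_a)
print(digest_b)
print('matching hex positions:', matching_positions, 'of', len(digest_a))
```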

Blockchain adds a third field, called 'nonce', to the data and previous_block_hash fields of a block. One can choose any value for the nonce as long as the hash of the block (computed over all 3 fields) starts with a given number of 0's.

To produce a valid block, one has to find such a value of nonce. This is an arbitrary, seemingly vacuous, problem whose only goal is to make producing valid blocks more difficult.
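To get a feel for how hard this problem is, here is a rough estimate, assuming the hash is written in hexadecimal and that SHA-256 output behaves like a uniformly random string. Each hex character is then '0' with probability 1/16, so a given nonce produces $d$ leading zeros with probability $16^{-d}$, and about $16^d$ nonces have to be tried on average. The function names below are hypothetical helpers used only for this estimate.

```python
def success_probability(leading_zeros):
    """Chance that one random hex digest starts with the given number of zeros."""
    return 16.0 ** -leading_zeros

def expected_attempts(leading_zeros):
    """Average number of nonces to try before one succeeds (1 / probability)."""
    return 16 ** leading_zeros

for leading_zeros in range(1, 7):
    print(leading_zeros, expected_attempts(leading_zeros))
```

For 5 leading zeros this gives roughly a million attempts per block, and each additional zero multiplies the work by 16.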

In [9]:
class Block(BasicBlock):
    def __init__(self, data, previous_block_hash, nonce):
        BasicBlock.__init__(self, data, previous_block_hash)
        self.nonce = nonce

    @staticmethod
    def compute_hash(data, previous_block_hash, nonce):
        record = data + previous_block_hash + str(nonce)
        return hashlib.sha256(record.encode()).hexdigest()
    
    def get_hash(self):
        return Block.compute_hash(self.data, self.previous_block_hash, self.nonce)

    def __str__(self):
        block_str = BasicBlock.__str__(self)
        return block_str + '\nnonce: ' + str(self.nonce)
        
class Chain(BasicChain):
    def __init__(self, genesis_block, difficulty=5):
        # BasicChain.__init__ already sets self.blocks = [genesis_block].
        BasicChain.__init__(self, genesis_block)
        self.difficulty = difficulty
        
def validate_chain(chain, genesis_block):
    validated = validate_basic_chain(chain, genesis_block)
    if not validated:
        return False
    zeroes = '0' * chain.difficulty  # required hash prefix
    for block in chain.blocks:
        if block.get_hash()[:chain.difficulty] != zeroes:
            print('Incorrect nonce!')
            return False
    return True        

## Mining

Anybody with enough computing power and/or luck can solve the hashing problem and produce a valid block. This is called __mining__ because it typically requires a great deal of computational power and energy.

In [10]:
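def compute_nonce(data, previous_block_hash, difficulty=5):
    # Brute-force search: try nonces 0, 1, 2, ... until the hash has the required prefix.
    nonce = 0
    zeroes = '0' * difficulty
    while True:
        # Named block_hash to avoid shadowing the built-in hash().
        block_hash = Block.compute_hash(data, previous_block_hash, nonce)
        if block_hash[:difficulty] == zeroes:
            return nonce
        nonce += 1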
        
def mineBlock(chain, data):
    previous_block_hash = chain.blocks[-1].get_hash()
    # Mine at the chain's own difficulty rather than compute_nonce's default.
    nonce = compute_nonce(data, previous_block_hash, chain.difficulty)
    return Block(data, previous_block_hash, nonce)

nonce = compute_nonce(genesis_data, genesis_previous_block_hash)
genesis_block = Block(genesis_data, genesis_previous_block_hash, nonce)
chain = Chain(genesis_block)

block_0 = chain.blocks[0]

chain.add_block(mineBlock(chain, 'Hello'))
chain.add_block(mineBlock(chain, 'Hola'))

print(chain)

validate_chain(chain, genesis_block)

Blockchain:

block 0
data: The Times 03/Jan/2009 Chancellor on brink of second bailout for banks
previous_block_hash: n/a
hash: 0000081aa36c9517b5676ff853ec4ebeca9dd19dd8499785a5589593950c8a90
nonce: 361742

block 1
data: Hello
previous_block_hash: 0000081aa36c9517b5676ff853ec4ebeca9dd19dd8499785a5589593950c8a90
hash: 00000f20e356a372d6ca7c498c472437b16e12b612dc803f25055dd29a54a96e
nonce: 1904167

block 2
data: Hola
previous_block_hash: 00000f20e356a372d6ca7c498c472437b16e12b612dc803f25055dd29a54a96e
hash: 000008887b16f14d555708b9151735d053d671439af4f036a471dc045534dbb9
nonce: 237019



True
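With the nonce in place, the recomputation trick from the Chain tampering section is no longer free: changing a block invalidates its nonce, and repairing the chain means mining that block and every block after it again. The sketch below illustrates this. It is self-contained and uses hypothetical helpers (`block_hash`, `mine`, `is_valid`) with blocks stored as plain tuples instead of the notebook's classes, and it assumes a difficulty of 3 rather than 5 so that it runs quickly.

```python
import hashlib

DIFFICULTY = 3  # assumption: lower than the notebook's 5 so the demo is fast

def block_hash(data, previous_block_hash, nonce):
    record = data + previous_block_hash + str(nonce)
    return hashlib.sha256(record.encode()).hexdigest()

def mine(data, previous_block_hash):
    """Return (block, attempts); block is (data, previous_block_hash, nonce)."""
    nonce = 0
    while not block_hash(data, previous_block_hash, nonce).startswith('0' * DIFFICULTY):
        nonce += 1
    return (data, previous_block_hash, nonce), nonce + 1

def is_valid(blocks):
    for i, block in enumerate(blocks):
        if not block_hash(*block).startswith('0' * DIFFICULTY):
            return False  # nonce does not satisfy the difficulty
        if i > 0 and block[1] != block_hash(*blocks[i - 1]):
            return False  # broken link to the previous block
    return True

# Build a three-block chain.
blocks = []
previous = 'n/a'
for data in ['Genesis', 'Hello', 'Hola']:
    block, _ = mine(data, previous)
    blocks.append(block)
    previous = block_hash(*block)

# Tamper with block 1 but keep its old nonce: validation fails.
tampered = list(blocks)
tampered[1] = ('Bonjour', blocks[1][1], blocks[1][2])
valid_before_remining = is_valid(tampered)
print('tampered chain valid:', valid_before_remining)

# To repair the chain, block 1 and every later block must be mined again.
total_attempts = 0
previous = block_hash(*tampered[0])
for i in range(1, len(tampered)):
    tampered[i], attempts = mine(tampered[i][0], previous)
    total_attempts += attempts
    previous = block_hash(*tampered[i])
print('re-mined chain valid:', is_valid(tampered))
print('hash attempts needed:', total_attempts)
```

The number of attempts grows with the number of blocks that follow the modified one, which is why older blocks are harder to rewrite than recent ones.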

## Distributed Chain

In Blockchain, the chain is typically distributed (replicated) across a __peer-to-peer network__. Anybody in the network can read and validate the chain. Anybody can participate in the mining process too, with a chance of success proportional to their computational power.

Data is added to the chain by the first miner to solve the hashing problem for that piece of data. Once a miner has mined a block, they distribute the block to the other nodes in the network. These nodes verify that the block is valid and they add it to their own chain.
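The receiving side of this exchange can be sketched as follows. This is a simplified illustration, not a network protocol: each node is just a Python list, blocks are dictionaries, and the names `accept_block`, `mine`, `node_a` and `node_b` are assumptions made for the example. A difficulty of 2 is assumed so that mining is instantaneous.

```python
import hashlib

DIFFICULTY = 2  # assumption: kept low so the example runs instantly

def block_hash(block):
    record = block['data'] + block['previous_block_hash'] + str(block['nonce'])
    return hashlib.sha256(record.encode()).hexdigest()

def mine(data, previous_block_hash, difficulty):
    """Try nonces 0, 1, 2, ... until the block hash has the required prefix."""
    nonce = 0
    while True:
        block = {'data': data,
                 'previous_block_hash': previous_block_hash,
                 'nonce': nonce}
        if block_hash(block).startswith('0' * difficulty):
            return block
        nonce += 1

def accept_block(local_blocks, block, difficulty):
    """Append block to local_blocks if it extends the tip and has valid work."""
    links_to_tip = block['previous_block_hash'] == block_hash(local_blocks[-1])
    has_work = block_hash(block).startswith('0' * difficulty)
    if links_to_tip and has_work:
        local_blocks.append(block)
        return True
    return False

# Two nodes start from the same genesis block.
genesis = mine('Genesis', 'n/a', DIFFICULTY)
node_a = [genesis]
node_b = [genesis]

# Node A mines a block and sends it to node B, which verifies and appends it.
new_block = mine('Hello', block_hash(node_a[-1]), DIFFICULTY)
node_a.append(new_block)
accepted = accept_block(node_b, new_block, DIFFICULTY)

# A modified copy is rejected by node B: it does not extend B's current tip,
# and its old nonce most likely no longer satisfies the difficulty either.
forged = dict(new_block, data='Bonjour')
rejected = not accept_block(node_b, forged, DIFFICULTY)

print('accepted:', accepted, 'rejected forged:', rejected)
```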

At any moment, chains across the network are nearly identical. The last block, or sometimes the last few blocks, may differ, because nodes may receive valid blocks in different orders. When this happens and a node discovers a longer chain in the network, it switches to that longer chain. The longest chain, which by construction is the most expensive to produce, is considered the valid one. This is why Blockchain is said to be based on __proof of work__. Eventually shorter chains die out and consensus is achieved.
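The switching rule described here can be sketched in a few lines. The function name `choose_chain` is hypothetical, and validation is passed in as a callback so that the sketch stays independent of how blocks are represented. In the usage example, blocks are stand-in strings and the validator is a placeholder that accepts any chain without the label 'invalid'.

```python
def choose_chain(local_blocks, candidate_blocks, is_valid):
    """Return the chain a node should keep under the longest-chain rule."""
    if len(candidate_blocks) > len(local_blocks) and is_valid(candidate_blocks):
        return candidate_blocks
    return local_blocks

# Placeholder validator for the example only.
def toy_is_valid(blocks):
    return 'invalid' not in blocks

local = ['genesis', 'A', 'B1']
longer = ['genesis', 'A', 'B2', 'C']
shorter = ['genesis', 'A']
longer_but_invalid = ['genesis', 'A', 'invalid', 'C', 'D']

print(choose_chain(local, longer, toy_is_valid))              # switches to the longer chain
print(choose_chain(local, shorter, toy_is_valid))             # keeps the local chain
print(choose_chain(local, longer_but_invalid, toy_is_valid))  # keeps the local chain
```

A real node would pass a validator that checks the chaining and the nonce of every block.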

This means that dishonest nodes cannot tamper with the chain unless they control at least 50% of the computational power. Honest nodes can therefore trust the data in the blockchain, especially data that is not right at the end of the chain.
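One way to quantify why older data is safer is the simplified random-walk (gambler's ruin) model used in the analysis of Bitcoin: if an attacker holds a fraction $q$ of the computational power and the honest nodes hold $p = 1 - q$, the probability that the attacker ever catches up from $z$ blocks behind is $(q/p)^z$ when $q < p$, and 1 otherwise. The sketch below evaluates that expression. The function name is hypothetical, and the model ignores network delays and other practical effects.

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Probability that an attacker ever catches up from blocks_behind blocks back."""
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0  # with at least half the power, the attacker eventually wins
    return (attacker_share / honest_share) ** blocks_behind

for blocks_behind in (1, 2, 6):
    print(blocks_behind, catch_up_probability(0.1, blocks_behind))
```

Under this model the probability shrinks exponentially with the number of blocks built on top of the data.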