References
- https://www.geeksforgeeks.org/create-simple-blockchain-using-python/

In [1]:
import datetime
import hashlib

# Demonstration of use of a Hash Puzzle for Mining

Blocks are mined by solving cryptographic puzzles. The method used in Bitcoin, hashcash, was originally proposed as a way to limit how many emails or messages a user could send, in order to reduce spam. You can read the original paper at http://www.hashcash.org/. Hashcash was invented by Adam Back in 1997. Another team was also working on similar ideas around the same time.

# Hashlib Explanation


Hashlib is a Python library for performing different hashing operations. Technically, it is a 'common interface' to many different secure hash and message digest functions. When you run a message through a hash function, the output is called a 'message digest' or 'hash digest'.

Hashlib requires you to 'encode' strings into bytes before hashing them. Otherwise it raises the error "Unicode-objects must be encoded before hashing" (see more on Unicode in the section below).

In [24]:
# Encoding (in UTF-8) to prepare for using hashlib - See next section for a longer explanation of encoding.

first_input = 'read this'  # we are going to hash this string, but first we need to encode it in UTF-8
x = first_input.encode()   # str -> bytes; x is what we pass to hashlib in the next cell

print(x)

# The b in front of the output indicates a bytes object, i.e. the encoded string

b'read this'


In [25]:
# Now we can use hashlib on the encoded string to compute the hash. This returns a hashlib hash object.
y = hashlib.sha256(x)
print(y)

<sha256 _hashlib.HASH object @ 0x7f8148b235b0>


In [26]:
# The hash object has many attributes and methods. Besides hexdigest, they include block_size, copy, digest, digest_size, name and update
a = dir(y)
print(a)

['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'block_size', 'copy', 'digest', 'digest_size', 'hexdigest', 'name', 'update']


In [42]:
print(y.digest_size) # the size of the digest in bytes

32


In [44]:
# The attribute we are often interested in is 'hexdigest' (the output). It is the 32-byte digest written as a string of 64 hexadecimal characters
y.hexdigest()

'f23321d181a75c36f90b64eeddb424beefaaa052a5c4a911aeccb56d77d6df7a'

In [45]:
# We can also convert this number into its binary representation and print it
bin(int(y.hexdigest(), 16))

'0b1111001000110011001000011101000110000001101001110101110000110110111110010000101101100100111011101101110110110100001001001011111011101111101010101010000001010010101001011100010010101001000100011010111011001100101101010110110101110111110101101101111101111010'
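One caveat with `bin(int(..., 16))`: converting to an integer drops any leading zeros, so the binary string can come out shorter than 256 digits. Since the puzzle below is all about leading zeros, here is a minimal sketch that keeps them. The helper name `to_bits` is made up for this illustration and is not part of hashlib.

```python
import hashlib

def to_bits(hex_digest):
    """Return the digest as a binary string, keeping leading zeros."""
    width = len(hex_digest) * 4  # each hex character encodes 4 bits
    return format(int(hex_digest, 16), "0{}b".format(width))

digest = hashlib.sha256("1".encode()).hexdigest()
bits = to_bits(digest)

print(len(bits))  # always 256 for a sha256 digest
print(bits[:8])   # the first byte of the digest, written as bits
```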

## Unicode and Encodings
The .encode() method is a string method in Python.
With no arguments, it returns a UTF-8 encoding of the input string.    

Here is a link to [a longer tutorial on the Python string encode method](https://www.programiz.com/python-programming/methods/string/encode)

A discussion of Unicode is out of scope, but here is a link to a [longer explanation of Unicode if you have never encountered it before]()
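To make the role of encoding concrete, here is a small sketch using an example string chosen for this illustration. It shows that `.encode()` defaults to UTF-8, that a non-ASCII character can take more than one byte, and that a different encoding of the same text gives a different digest, because the hash function only ever sees bytes.

```python
import hashlib

text = "na\u00efve caf\u00e9"  # "naive cafe" with two accented characters

utf8_bytes = text.encode()               # default encoding is UTF-8
latin1_bytes = text.encode("latin-1")    # a different encoding of the same text

# 10 characters, but the two accented characters take 2 bytes each in UTF-8
print(len(text), len(utf8_bytes), len(latin1_bytes))

# Different bytes in, different digest out
same = hashlib.sha256(utf8_bytes).hexdigest() == hashlib.sha256(latin1_bytes).hexdigest()
print(same)
```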

In [46]:
# Running this cell raises an error because .encode() is missing.
# A unicode object must be encoded before it can be run through the hash function.
hashlib.sha256('strings need to be encoded before hashing them').hexdigest()

TypeError: Unicode-objects must be encoded before hashing
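If you would rather see the error without stopping the notebook, a sketch like the following catches it and then retries with the encoded string. The variable names are chosen for this example only, and the exact wording of the error message may differ between Python versions.

```python
import hashlib

message = "strings need to be encoded before hashing them"

try:
    hashlib.sha256(message)  # passing a str (not bytes) raises TypeError
    caught = None
except TypeError as err:
    caught = err
    print("TypeError:", err)

# Encoding first gives hashlib the bytes it expects
encoded_digest = hashlib.sha256(message.encode()).hexdigest()
print(encoded_digest)
```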

In [54]:
#running a hashlib hash function (in this case sha256) on a properly encoded string will return a hashlib object. 
hashlib.sha256('Lets try again but this time we will encode the string'.encode())

<sha256 _hashlib.HASH object @ 0x7f8148b23dd0>

In [55]:
# We can use the notebook to inspect the object docstring, methods and attributes
h = hashlib.sha256('Lets try again but this time we will encode the string'.encode())

# let's print the hash digest as a string of hexadecimal digits
h.hexdigest() 

'9a70de49c3142e7e6f0169c9650fa7290fa748ad27a4085ab70f9e99b41d38fe'

In [56]:
# h holds more than the digest: it is a hashlib hash object. Let's inspect some of its other fields
# In a Jupyter notebook, ?? shows an object's docstring, methods and attributes
??h

Object type: HASH

Methods:

- update() -- updates the current digest with an additional string
- digest() -- return the current digest value
- hexdigest() -- return the current digest as a string of hexadecimal digits
- copy() -- return a copy of the current hash object

Attributes:

- name -- the hash algorithm being used by this object
- digest_size -- number of bytes in this hashes output
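The `update()` and `copy()` methods listed above are not used elsewhere in this notebook, so here is a short sketch of what they do. It uses a separate object called `hasher` (a name picked for this example) so that it does not disturb `h`.

```python
import hashlib

# Hash a message in one call
one_shot = hashlib.sha256("hello world".encode()).hexdigest()

# Hash the same message in two pieces using update()
hasher = hashlib.sha256()
hasher.update("hello ".encode())
snapshot = hasher.copy()           # independent copy of the state so far
hasher.update("world".encode())

print(hasher.hexdigest() == one_shot)    # the pieces add up to the same digest
print(snapshot.hexdigest() == one_shot)  # the copy only saw "hello "
```

Feeding data in pieces is handy when the input is too large to hold in memory at once.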

In [34]:
h.digest() # the current digest value as raw bytes (not text, so no UTF-8 encoding is involved)

b"\x9ap\xdeI\xc3\x14.~o\x01i\xc9e\x0f\xa7)\x0f\xa7H\xad'\xa4\x08Z\xb7\x0f\x9e\x99\xb4\x1d8\xfe"

In [35]:
h.hexdigest() # the current digest as a string of hexadecimal digits

'9a70de49c3142e7e6f0169c9650fa7290fa748ad27a4085ab70f9e99b41d38fe'

In [36]:
h.name # the name of the hash function used to create the hash object

'sha256'

In [37]:
h.digest_size # every sha256 digest is 32 bytes (64 hexadecimal characters)

32
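The relationship between `digest()`, `hexdigest()` and `digest_size` can be checked directly. This is a small sketch with an arbitrary input string. The variable names are ours.

```python
import hashlib

sample = hashlib.sha256("any input at all".encode())

raw = sample.digest()         # bytes object
hex_str = sample.hexdigest()  # str of hexadecimal characters

print(len(raw))               # number of bytes, same as digest_size
print(len(hex_str))           # two hex characters per byte
print(raw.hex() == hex_str)   # hexdigest is just the raw bytes written in hex
```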

In [57]:
# we can run the hash function on an encoded string and get its output in hex format all in one line
hashlib.sha256('strings need to be encoded before hashing them'.encode()).hexdigest()

'ae1de85725c6f82d5ad8e130bf756fe8dd7f1024e837afc73aedf2e373caba92'

# Demonstration of single run through inside the Proof of Work loop:

In [58]:
# The nonce is the value we vary to try to find an answer to the puzzle.
# Note that there is more than one answer;
# we just want to find any answer faster than anyone else.

nonce = 1 

# This stands in for the hash of the previous block. The previous hash is used in bitcoin
# to validate that miners are using a valid block from the chain to find the answer and
# not just any arbitrary data. So the solution to the previous block is part of the input into the new potential solution.

previous_hash = 0

# A flag to indicate whether the puzzle has been solved yet

solved = False

The input to the puzzle here is nonce minus previous_hash (a simplification of "new_proof squared minus previous_proof squared"). We encode that value, run it through the hash function, and hope it solves the puzzle.

In [60]:
print(nonce - previous_hash)

1


In [62]:
hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest() # we just encoded and hashed the input "1"

'6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b'

The solution to the "puzzle" in Bitcoin is a hash (in hexadecimal) that starts with a long run of zeros.    

**This hash does not begin with a run of zeros, so we need to calculate a new one.**

Here, we choose to increment the counter, but miners are allowed to try anything. Most just use a counter. The hashcash puzzle is deliberately designed to be resistant to algorithmic speedups or "trapdoors". That means brute force should be the only way to solve it.

If you want to see more about hashcash and bitcoin in particular, visit https://nakamoto.com/hashcash/
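To illustrate that a counter is only one option, here is a sketch that guesses random nonces instead. The seed, the 32-bit range and the two-zero prefix are assumptions made for this example so that it finishes quickly and gives repeatable results. Real miners vary fields of the block header rather than hashing a bare number.

```python
import hashlib
import random

rng = random.Random(42)  # seeded only so the sketch is repeatable
prefix = "00"            # a small difficulty so this finishes quickly

tries = 0
while True:
    tries += 1
    guess = rng.getrandbits(32)  # any value is allowed as a nonce
    digest_guess = hashlib.sha256(str(guess).encode()).hexdigest()
    if digest_guess.startswith(prefix):
        break

print(tries, guess, digest_guess)
```

Neither strategy is smarter than the other: every guess has the same chance of success, which is what "brute force" means here.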

In [69]:
# adding values to the counter (nonce) changes the resulting hash digest (output)
# note that you can ignore previous_hash here. We have it set to 0. 
# It is normally a way to ensure other miners can validate a correct solution
nonce += 1
hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()

'2c624232cdd221771294dfbb310aca000a0df6ac8b66b696d90ef06fdefb64a3'

In [78]:
# we set the difficulty to 3. This means the hash digest must start with at least 3 zeroes.

difficulty = 3
''.zfill(difficulty)

'000'
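Checking for leading zero characters is a simplification. Bitcoin itself reads the hash as a number and compares it against a target value. For a prefix of hexadecimal zeros the two checks agree, as the following sketch illustrates. The function names `meets_prefix` and `meets_target` are hypothetical, chosen for this example.

```python
import hashlib

def meets_prefix(hex_digest, difficulty):
    """String check: the first `difficulty` hex characters are all '0'."""
    return hex_digest[:difficulty] == "0" * difficulty

def meets_target(hex_digest, difficulty):
    """Numeric check: d leading hex zeros means value < 16 ** (64 - d)."""
    target = 16 ** (len(hex_digest) - difficulty)
    return int(hex_digest, 16) < target

# Compare the two checks on a batch of digests
agree = True
for n in range(1, 2001):
    d = hashlib.sha256(str(n).encode()).hexdigest()
    if meets_prefix(d, 2) != meets_target(d, 2):
        agree = False

print(agree)
```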

In [92]:
# if you don't want to see all the hash results that failed so far then
# comment out the print statement inside the while loop 
while solved is False:
    hash_operation = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
    print(hash_operation)
    if hash_operation[:difficulty] == ''.zfill(difficulty): #zfill pads a string with leading '0's
        solved = True
        print("hash found: ", hash_operation)
    else:
        nonce+=1

The nonce value is the input that solved the puzzle.    
hash_operation is the output of the puzzle. If the output has 3 leading zeros, then the input is a valid solution!

**In the above cell, SCROLL DOWN to the BOTTOM to see the "hash found" value that has 3 or more leading zeros**

In [93]:
print('counter:',nonce ,'result', hash_operation)

counter: 3633 result 00039a15178b11924de22fd1a02f6efb00d8af33c171a6b67614871e8d6012da
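The loop above depends on notebook state (`nonce`, `solved`), which is why it has to be reset by hand between runs. As an alternative, here is a minimal sketch that wraps the same search in a function. The name `mine` and its parameters are our own choices, not part of any library.

```python
import hashlib

def mine(previous_hash, difficulty, start_nonce=1):
    """Try nonces from start_nonce upward until the digest has
    `difficulty` leading zero characters. Returns (nonce, digest)."""
    prefix = "0" * difficulty
    nonce = start_nonce
    while True:
        digest = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

found_nonce, found_digest = mine(previous_hash=0, difficulty=3)
print("counter:", found_nonce, "result", found_digest)
```

To look for the next solution, call `mine(0, 3, start_nonce=found_nonce + 1)`. No flag needs resetting.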


In [94]:
# we validate that this is a solution by checking the first 3 characters of the hash output
print(hash_operation[:difficulty])

000
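Validation is what makes proof of work useful: finding a solution takes many hashes, but checking one takes a single hash. A sketch of that asymmetry, using a hypothetical helper `is_valid` and a small difficulty of 2 so it runs instantly:

```python
import hashlib

def is_valid(nonce, previous_hash, difficulty):
    """Recompute the hash once and check the required prefix."""
    digest = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
    return digest[:difficulty] == "0" * difficulty

# Searching: one hash per candidate until one passes
candidate = 1
attempts = 1
while not is_valid(candidate, 0, 2):
    candidate += 1
    attempts += 1

# Verifying: anyone can confirm the claim with a single hash
print(candidate, attempts, is_valid(candidate, 0, 2))
```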


Note: if you want to run it again, you need to reset the "solved" variable to False.

In [98]:
solved = False
nonce += 1

Now go back up to the loop and run it again :) You will now find a new nonce (solution). Because we did not reset the nonce back to 0, we will find a larger counter that also 'solves' the puzzle.

Remember, there is more than one solution to the puzzle. This means multiple miners could find different solutions that are both correct. So how does the network know which one to accept? The answer is **Consensus Rules**, which we will cover next. 
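To see that solutions are not unique, this sketch collects every nonce in a fixed range that satisfies a small difficulty. The function name `solutions`, the difficulty of 2 and the limit of 5000 are arbitrary choices for this example.

```python
import hashlib

def solutions(previous_hash, difficulty, limit):
    """Yield (nonce, digest) for every nonce in 1..limit with the required prefix."""
    prefix = "0" * difficulty
    for nonce in range(1, limit + 1):
        digest = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
        if digest.startswith(prefix):
            yield nonce, digest

found = list(solutions(previous_hash=0, difficulty=2, limit=5000))
print(len(found), "solutions among the first 5000 nonces")
for nonce_value, digest_value in found[:3]:
    print(nonce_value, digest_value)
```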


## Other settings you can play with
You can also adjust the difficulty of the puzzle by requiring more leading zeros. Bitcoin's real difficulty is far higher (recent block hashes begin with roughly 19 or more hexadecimal zeros), which would take far longer than thousands of years to solve on a laptop. Note that 4 and 5 usually take seconds at most, but 6 can take minutes.
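A rough way to see why each extra zero hurts: if we assume the digest behaves like a random hex string, each character is '0' with probability 1/16, so a difficulty of d needs about 16 ** d attempts on average. The sketch below combines that estimate with a quick measurement of this machine's hash rate. The function names and the sample size are our own assumptions, and the timings are only ballpark figures.

```python
import hashlib
import time

def expected_attempts(difficulty):
    """Average number of hashes needed, assuming uniformly random digests."""
    return 16 ** difficulty

def measure_hash_rate(samples=50_000):
    """Rough hashes-per-second estimate for this machine and this loop."""
    start = time.perf_counter()
    for n in range(samples):
        hashlib.sha256(str(n).encode()).hexdigest()
    elapsed = time.perf_counter() - start
    return samples / elapsed

rate = measure_hash_rate()
for d in range(3, 9):
    attempts = expected_attempts(d)
    print(d, attempts, "about {:.1f} seconds".format(attempts / rate))
```

Each additional zero multiplies the expected work by 16, so the wait grows from seconds to minutes to hours very quickly.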

In [99]:
difficulty=4
while solved is False:
    hash_operation = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
    if hash_operation[:difficulty] == ''.zfill(difficulty):
        solved = True
    else:
        nonce+=1

In [100]:
print(nonce, hash_operation)

172608 0000f727854b50bb95c054b39c1fe5c92e5ebcfa4bcb5dc279f56aa96a365e5a


In [101]:
solved = False
nonce += 1

In [102]:
difficulty=5
while solved is False:
    hash_operation = hashlib.sha256(str(nonce - previous_hash).encode()).hexdigest()
    if hash_operation[:difficulty] == ''.zfill(difficulty):
        solved = True
    else:
        nonce+=1

In [103]:
print(nonce, hash_operation)

596138 00000691457f4f0ce13e187b9ab4fda6d42c8647752909b8f71f9dbd8f6bd4ab


In [104]:
solved = False
nonce += 1

In [105]:
# If you set it to 6 you might end up waiting 10+ minutes ... beyond 6, it might take hours...