# Writing SHA-256 in Python

Designed by the National Security Agency (NSA) and published by NIST, SHA-256 has become an incredibly useful cryptographic hash function. It converts data of any size into a fixed-length 32-byte (256-bit) value called a hash. The real magic comes from the following two properties.

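1. Any change in the incoming data, however small, will produce a completely different 32-byte hash. (Two different pieces of data that produce the same hash are called a collision. Collisions must exist in principle, because there are more possible inputs than outputs, but finding one should be computationally infeasible. A practical collision would be an enormous security problem.)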
2. The original data cannot feasibly be reconstructed from the hash, so the function is one-directional. This means the hash can be shared freely without any fear of the original data being recovered.
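
To see both properties in action before building anything ourselves, here is a small sketch that uses Python's built-in `hashlib` module. The helper `differing_bits` is a made-up name for this example, not part of `hashlib`. It hashes two inputs that differ by a single character and counts how many output bits changed.

```python
import hashlib

# Hash two inputs that differ only in their last character.
digest_abc = hashlib.sha256(b"abc").digest()
digest_abd = hashlib.sha256(b"abd").digest()


def differing_bits(a, b):
    """Count the bit positions in which two equal-length byte strings differ.

    Hypothetical helper for this illustration only.
    """
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


print("Digest length in bytes:", len(digest_abc))
print("SHA-256 of 'abc':", digest_abc.hex())
print("SHA-256 of 'abd':", digest_abd.hex())
print("Bits that differ:", differing_bits(digest_abc, digest_abd))
```

For a well-behaved hash function you would expect roughly half of the 256 output bits to differ, even though the inputs are nearly identical.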

<img class="image" src="{{ site.url }}/assets/images/sha256/diagram.png" alt="SHA-256 Basics">

How exactly does it do this? Well, let's make our own SHA-256 function to find out!

Getting started is pretty easy: we just need a function that takes in any data that can be converted to binary. In this example we'll limit our data to string values for simplicity's sake. With this we can easily convert each character to its Unicode code point by using Python's `ord()` function, and convert that integer to an 8-bit binary string by using Python 3's string `format()` method. Note that eight bits per character is only enough for code points below 256, so this approach assumes ASCII-style input.

```python
def _str_to_bin(data_string):
    """Returns the binary representation of a string
    using unicode representation of the str values

    args:
        data_string (str): Incoming string to be hashed

    returns:
        binary_data (str): Binary representation of the
                           string
    """
    print("String values:", list(data_string))
    unicode_points = [ord(char) for char in data_string]
    print("Unicode points:", unicode_points)
    binary_values = ['{0:08b}'.format(point) for point in unicode_points]
    print("Binary values:", binary_values)
    binary_data = ''.join(binary_values)
    print('Binary data: ', binary_data)
    return binary_data

binary_representation = _str_to_bin('abc')
```

```text
String values: ['a', 'b', 'c']
Unicode points: [97, 98, 99]
Binary values: ['01100001', '01100010', '01100011']
Binary data:  011000010110001001100011
```
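
One caveat with the `ord()` approach: characters outside the ASCII range can have code points that need more than eight bits, which would throw off the bit count. As a hedged alternative, the sketch below encodes the string to bytes first and then formats each byte. The function name `str_to_bin_utf8` and the choice of UTF-8 are assumptions made for this example; UTF-8 is used because standard hashing tools such as `hashlib` operate on bytes, and UTF-8 is the usual way to get them from a string.

```python
def str_to_bin_utf8(data_string, encoding="utf-8"):
    """Return the bits of a string after encoding it to bytes.

    Hypothetical variant for illustration: every byte becomes exactly
    eight bits, so the result is always a multiple of 8 bits long.
    """
    encoded = data_string.encode(encoding)
    return ''.join(format(byte, '08b') for byte in encoded)


# ASCII input gives the same bits as the ord() version.
print(str_to_bin_utf8('abc'))

# The euro sign is code point 8364, which does not fit in eight bits,
# but its UTF-8 encoding is three whole bytes.
print(len('{0:08b}'.format(ord('€'))), "bits via ord()")
print(len(str_to_bin_utf8('€')), "bits via UTF-8")
```

For plain ASCII strings like `'abc'` the two approaches agree, so the rest of the walkthrough is unaffected.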


Now that we have our binary data, we need to preprocess it so that it is in the correct format for the later SHA-256 functions. This is achieved with the following three steps.

1. A single 1 bit is appended to the end of the binary data.
2. The length of the original binary data in bits (3 x 8 = 24 for `'abc'`) is appended to the end of the binary blob as a 64-bit number.
3. The 1 from step 1 and the 64-bit number from step 2 are separated by as many 0s as needed so that the total length of the binary blob is a multiple of 512.
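
The three steps above can be sketched as a single function operating on a string of `'0'` and `'1'` characters, like the one produced earlier. The name `pad_message` is an assumption made for this example, and this is a minimal illustration rather than a definitive implementation.

```python
def pad_message(binary_data):
    """Pad a bit string so its length is a multiple of 512.

    Hypothetical helper: takes a str of '0'/'1' characters and
    returns the padded str.
    """
    original_length = len(binary_data)

    # Step 1: append a single 1 bit.
    padded = binary_data + '1'

    # Step 3: add 0s until only 64 bits are missing from a full
    # 512-bit block, i.e. until the length is 448 modulo 512.
    zero_count = (448 - len(padded)) % 512
    padded += '0' * zero_count

    # Step 2: append the original length as a 64-bit number.
    padded += format(original_length, '064b')
    return padded


padded = pad_message('011000010110001001100011')  # bits of 'abc'
print("Padded length:", len(padded))
print("Bit after the data:", padded[24])
print("Length field as an integer:", int(padded[-64:], 2))
```

For `'abc'` this gives 24 data bits, one 1 bit, 423 zeros, and the 64-bit length field, for 512 bits in total. Longer inputs that leave less than 65 bits of room in their last block spill over into an extra 512-bit block.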

<img class="image" src="{{ site.url }}/assets/images/sha256/diagram.png" alt="SHA-256 Basics">