# Introduction to Hash Functions

## Built-In Hash Function

Python has a [built-in hash function](https://docs.python.org/3/library/functions.html#hash), which is used internally by [sets](https://docs.python.org/3/tutorial/datastructures.html#sets) and [dictionaries](https://docs.python.org/3/tutorial/datastructures.html#dictionaries). However, this is not a secure or cryptographic hash but rather a convenience function that supports the fast lookups of [hash tables](https://en.wikipedia.org/wiki/Hash_table), the underlying data structure of sets and dictionaries.

In [1]:
hash("Hello World!")

-6310723423335719723

The result is an integer. The function is designed so that numeric objects which compare equal have the same hash, even when they are different objects. Two names refer to the same object when they share the same identity, which the built-in `id` function returns (in CPython, the identity is the object's memory address).

This design is intentional: the function is optimized for speed and general-purpose use, not for security.

In [2]:
print(f"Are 1 and 1.0 the same object? {id(1) == id(1.0)}")
print(f"Does 1 have the same hash as 1.0? {hash(1) == hash(1.0)}")

Are 1 and 1.0 the same object? False
Does 1 have the same hash as 1.0? True


This means this function **SHOULD NOT** be used for any cryptographic work. Instead, the [hashlib module](https://docs.python.org/3/library/hashlib.html) should be used.
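
As a small illustration of why the built-in function is unsuitable for security purposes, the sketch below shows how easy it is to predict its output and to find two different inputs with the same hash. The behaviour shown is that of CPython, the reference interpreter; other Python implementations may behave differently.

```python
# Observed on CPython; other implementations may differ.
print(f"hash(42) = {hash(42)}")  # small integers hash to themselves
print(f"hash(-1) = {hash(-1)}, hash(-2) = {hash(-2)}")
print(f"Do -1 and -2 collide? {hash(-1) == hash(-2)}")
```

A cryptographic hash is expected to make such collisions infeasible to find. Note also that string hashes are salted per process by default, so the value shown earlier for `"Hello World!"` will usually differ from one interpreter session to the next.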

## Hashlib Module

The [hashlib module](https://docs.python.org/3/library/hashlib.html) typically uses the [OpenSSL](https://www.openssl.org/) library under the hood and exposes several of its cryptographic hash functions.

One of the particularities of most cryptography-focused modules and libraries is that they work with low-level objects, mainly `bytes` objects, instead of high-level built-in types such as lists, strings, or custom objects. Since some Python developers are not used to working with bytes, [an appendix](91_Bytes.ipynb) is provided as a quick introduction.

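The hashlib module exposes two ways to construct hashes. One is a direct call to a named constructor such as `hashlib.sha256()`. The other is the generic `hashlib.new()` constructor, which takes the algorithm name as a string and is then fed data through `update()`, a more object-oriented style that resembles the [Builder Pattern](https://en.wikipedia.org/wiki/Builder_pattern).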

In [3]:
import hashlib

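### Algorithms Available

The hash algorithms available in the running interpreter can be listed through `hashlib.algorithms_available`. The exact list depends on the Python version and the OpenSSL build it is linked against.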

In [4]:
print(", ".join(hashlib.algorithms_available))

blake2s, sm3, sha512_256, sha3_224, md4, shake_128, sha512, sha3_256, shake_256, sha3_384, sha384, blake2b, md5, mdc2, sha256, sha1, ripemd160, whirlpool, sha224, sha3_512, sha512_224, md5-sha1
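
Since the list above varies between installations, it can help to separate the algorithms that are always present from those that depend on the local build. The sketch below uses `hashlib.algorithms_guaranteed`, which the standard library documents as a subset of `hashlib.algorithms_available`. The variable name `extras` is only illustrative.

```python
import hashlib

# Algorithms guaranteed to exist on every platform for this Python version.
print(", ".join(sorted(hashlib.algorithms_guaranteed)))

# Algorithms that are only present because of the local build (may be empty).
extras = hashlib.algorithms_available - hashlib.algorithms_guaranteed
print(", ".join(sorted(extras)))
```

Code that needs to run on other machines is safer when it sticks to names from the guaranteed set.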


### Function Call

In [5]:
hash_object = hashlib.sha256(b"Hello World!")
print(f"Bytes Digest: {hash_object.digest()}")
print(f"Hex Digest: {hash_object.hexdigest()}")

Bytes Digest: b'\x7f\x83\xb1e\x7f\xf1\xfcS\xb9-\xc1\x81H\xa1\xd6]\xfc-K\x1f\xa3\xd6w(J\xdd\xd2\x00\x12m\x90i'
Hex Digest: 7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069
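
Because hashlib works on `bytes` rather than on strings, a common stumbling block is passing a `str` directly. The following minimal sketch shows the error that results and how encoding the string first (here with UTF-8, chosen as an assumption for the example) gives the same digest as the bytes literal used above.

```python
import hashlib

text = "Hello World!"

# Passing a str directly is rejected; hashlib expects a bytes-like object.
try:
    hashlib.sha256(text)
except TypeError as error:
    print(f"TypeError: {error}")

# Encoding the string first produces the bytes that are actually hashed.
data = text.encode("utf-8")
print(f"Hex Digest: {hashlib.sha256(data).hexdigest()}")
```

The choice of encoding matters: the same text encoded differently yields different bytes and therefore a different digest.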


### Object Instantiation

In [6]:
hasher = hashlib.new('sha256')
hasher.update(b"Hello World!")
print(f"Bytes Digest: {hasher.digest()}")
print(f"Hex Digest: {hasher.hexdigest()}")

Bytes Digest: b'\x7f\x83\xb1e\x7f\xf1\xfcS\xb9-\xc1\x81H\xa1\xd6]\xfc-K\x1f\xa3\xd6w(J\xdd\xd2\x00\x12m\x90i'
Hex Digest: 7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069
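
One practical reason to use the object style is that `update()` can be called repeatedly, and the result is the same as hashing the concatenation of all the pieces. The sketch below splits the message in two arbitrary parts to illustrate this; the same approach applies when reading a large file in chunks.

```python
import hashlib

# Feed the message in two pieces, as one might when reading a file in chunks.
hasher = hashlib.new("sha256")
hasher.update(b"Hello ")
hasher.update(b"World!")

# Hash the same message in a single call for comparison.
one_shot = hashlib.sha256(b"Hello World!")

print(f"Same digest? {hasher.hexdigest() == one_shot.hexdigest()}")
```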
