
# Hashing
A hashing method takes data in the form of a byte array and produces a fixed-length hash value. For MD5, the length of the hash is 128 bits, for SHA-1 it is 160 bits, and for SHA-256, it is 256 bits.

<img src='graphics/g_hash_01.png' width="800px">

With MD5 we get a 128-bit output, which is 32 hex characters, as each hex character represents four bits:

<img src='graphics/g_hash_02.png' width="800px">

SHA-1 has an output of 160 bits, and SHA-256 has an output of 256 bits. MD5 should not be used in production environments, as practical collision attacks exist against it and its 128-bit output is too short. SHA-1, too, has been shown to have weaknesses, and thus we should use SHA-2 methods. These include SHA-224, SHA-256, SHA-384 and SHA-512. A newer standard is known as SHA-3.

<img src='graphics/g_hash_04.png' width="800px">
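
The output lengths quoted above can be checked programmatically. The following is a minimal sketch that uses Python's standard `hashlib` module rather than the Hazmat primitives used later in this section; the helper names `digest_bits` and `hex_chars` are illustrative and are not part of any library:

```python
import hashlib

def digest_bits(name: str) -> int:
    """Return the output length, in bits, of a named hashlib algorithm."""
    return hashlib.new(name).digest_size * 8

def hex_chars(name: str) -> int:
    """Each hex character carries four bits of the digest."""
    return digest_bits(name) // 4

for name in ("md5", "sha1", "sha224", "sha256", "sha384", "sha512", "sha3_256"):
    print(f"{name:>8}: {digest_bits(name):3d} bits, {hex_chars(name):3d} hex characters")
```

The same approach should work for any algorithm name listed in `hashlib.algorithms_available` that has a fixed digest size.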


## OpenSSL hashing
OpenSSL can be used to create hash values for MD5, SHA-1, SHA-256, and other methods. An example for Linux and Windows is [<a href="https://asecuritysite.com/openssl/openssl_full2">here</a>]:

```
Linux command: echo -n "Hello" | openssl dgst -md5
Windows command: echo | set /p = "Hello" | openssl dgst -md5

Message: Hello
Mode: md5
========
MD5d1a7fb5eab1c16cb4f7cf341cf188c3d
```
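
The `-n` flag in the Linux command stops `echo` from appending a newline, and this matters because a hash covers every byte of its input. As a rough illustration, and assuming Python's standard `hashlib` module is available, the sketch below hashes three inputs that differ only by a trailing newline or by case (`md5_hex` is an illustrative helper name):

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()

# The three inputs differ only by a trailing newline or by letter case
for label, data in (("Hello", b"Hello"), ("Hello + newline", b"Hello\n"), ("hello", b"hello")):
    print(f"{label:>15}: {md5_hex(data)}")
```

Each line of output should show a completely different digest, so any stray quote, space or newline passed in by the shell will change the result.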

> Using OpenSSL in the command prompt, or using this site [<a href="https://asecuritysite.com/openssl/openssl_full2">here</a>], determine the SHA-1 and SHA-256 hash values for: "Edinburgh" and "Glasgow".
> Do the hash values change when we use "edinburgh" and "glasgow"?

## MD5 and SHA-1
In the following we will use the hashing methods supported by the Hazmat primitives in the Python `cryptography` library.

```python
# 03_01.py
from cryptography.hazmat.primitives import hashes
import binascii
from cryptography.hazmat.backends import default_backend

st = "Hello"

try:
    data = st.encode()  # Convert to a byte array

    digest = hashes.Hash(hashes.MD5(), backend=default_backend())
    digest.update(data)
    res = digest.finalize()
    hexval = binascii.b2a_hex(res).decode()     # hex format
    b64val = binascii.b2a_base64(res).decode()  # Base64 format (ends with a newline)

    print("Data: ", st)
    print(" Hex: ", binascii.b2a_hex(data).decode())
    print(f"MD5: {hexval} {b64val}")

except Exception as e:
    print(e)
```

```
Data:  Hello
 Hex:  48656c6c6f
MD5: 8b1a9953c4611296a827abf8c47804d7 ixqZU8RhEpaoJ6v4xHgE1w==
```





> Can you determine the hash value for "Hello"?

> Now modify the line that creates the digest with `hashes.MD5()` in the program above to use SHA1() and then SHA256(). What are the values (list the first two hex characters)?

> What is the length of the hash (in bits) for SHA-1?

> What is the length of the hash (in bits) for SHA-256?

Two of the main formats for displaying a hash are hexadecimal and Base64. In this example, we will use the binascii library to convert the raw bytes of the hash value into each of these formats:

<img src='graphics/g_hash_05.png' width="800px">
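
Hex and Base64 are simply two text encodings of the same raw digest bytes. A minimal sketch of this, assuming the standard `hashlib`, `binascii` and `base64` modules and using MD5 of "hello" as the sample input:

```python
import base64
import binascii
import hashlib

digest = hashlib.md5(b"hello").digest()  # 16 raw bytes

hex_form = binascii.b2a_hex(digest).decode()
b64_form = binascii.b2a_base64(digest).decode().strip()  # drop the trailing newline

print(len(digest), "bytes")
print("Hex:   ", hex_form)
print("Base64:", b64_form)

# Both encodings decode back to the same raw bytes
assert binascii.a2b_hex(hex_form) == digest
assert base64.b64decode(b64_form) == digest
```

Hex uses two characters per byte, while Base64 uses four characters for every three bytes, so the Base64 form is the shorter of the two.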

In this case, we will use other hashing methods such as SHA-3, SHA-224, SHA-384 and SHA-512:

```python
# 03_02.py
# https://asecuritysite.com/hazmat/hashnew
from cryptography.hazmat.primitives import hashes
import binascii
from cryptography.hazmat.backends import default_backend

st = "hello"
showhex = "No"  # set to "yes" to treat st as a hex string

def show_hash(name, algorithm, data):
    """Hash data with the given algorithm and print the hex and Base64 forms."""
    digest = hashes.Hash(algorithm, backend=default_backend())
    digest.update(data)
    res = digest.finalize()
    hexval = binascii.b2a_hex(res).decode()
    b64val = binascii.b2a_base64(res).decode()  # ends with a newline
    print(f"{name}: {hexval} {b64val}")

is_hex = (showhex == "yes")

try:
    if is_hex:
        data = binascii.a2b_hex(st)
    else:
        data = st.encode()

    print("Data: ", st)
    print(" Hex: ", binascii.b2a_hex(data).decode())
    print()

    show_hash("MD5", hashes.MD5(), data)
    show_hash("SHA1", hashes.SHA1(), data)
    show_hash("SHA224", hashes.SHA224(), data)
    show_hash("SHA256", hashes.SHA256(), data)
    show_hash("SHA384", hashes.SHA384(), data)
    show_hash("SHA3_224", hashes.SHA3_224(), data)
    show_hash("SHA3_256", hashes.SHA3_256(), data)
    show_hash("SHA3_384", hashes.SHA3_384(), data)
    show_hash("SHA3_512", hashes.SHA3_512(), data)
    show_hash("SHA512", hashes.SHA512(), data)
    show_hash("SHA512_224", hashes.SHA512_224(), data)
    show_hash("SHA512_256", hashes.SHA512_256(), data)

except Exception as e:
    print(e)
```

```
Data:  hello
 Hex:  68656c6c6f

MD5: 5d41402abc4b2a76b9719d911017c592 XUFAKrxLKna5cZ2REBfFkg==

SHA1: aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d qvTGHdzF6KLavt4PO0gs2a6pQ00=

SHA224: ea09ae9cc6768c50fcee903ed054556e5bfc8347907f12598aa24193 6gmunMZ2jFD87pA+0FRVblv8g0eQfxJZiqJBkw==

SHA256: 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 LPJNul+wow4m6DsqxbninhsWHlwfp0JecwQzYpOLmCQ=

SHA384: 59e1748777448c69de6b800d7a33bbfb9ff1b463e44354c3553bcdb9c666fa90125a3c79f90397bdf5f6a13de828684f WeF0h3dEjGnea4ANejO7+5/xtGPkQ1TDVTvNucZm+pASWjx5+QOXvfX2oT3oKGhP

SHA3_224: b87f88c72702fff1748e58b87e9141a42c0dbedc29a78cb0d4a5cd81 uH+IxycC//F0jli4fpFBpCwNvtwpp4yw1KXNgQ==

SHA3_256: 3338be694f50c5f338814986cdf0686453a888b84f424d792af4b9202398f392 Mzi+aU9QxfM4gUmGzfBoZFOoiLhPQk15KvS5ICOY85I=

SHA3_384: 720aea11019ef06440fbf05d87aa24680a2153df3907b23631e7177ce620fa1330ff07c0fddee54699a4c3ee0ee9d887 cgrqEQGe8GRA+/Bdh6okaAohU985B7I2MecXfOYg+hMw/wfA/d7lRpmkw+4O6diH

SHA3_512: 75d527c368f2efe84
```


> In this case the input data is "hello". Run the program again, and this time use the data input of "The quick brown fox jumps over the lazy dog". Prove that:

* MD5 hash value is "9e107d9d372bb6826bd81d3542a419d6"
* SHA-1 hash value is "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
* SHA-256 hash value is "d7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592"


> How many hex characters do the MD5, SHA-1 and SHA-256 hashes have, and how would you determine the number of characters used?

## Adding salt
One problem with hashing methods is that we always get the same hash output for the same input. This can allow an intruder to match a hash against a precomputed list of hashes for common inputs. To overcome this, we can add a salt value to the hashing process, by appending or prepending the salt to the input data before it is hashed. We need to store the salt value alongside the hash value, as it is required to recompute and check the hash.

<img src='graphics/g_hash_08.png' width="800px">
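
The store-and-check flow described above can be sketched as follows. This is an illustrative example only: it uses the standard `hashlib`, `hmac` and `os` modules, a random 16-byte salt, and a single pass of SHA-256 with the salt prepended. The function names `hash_password` and `verify_password` are assumptions made for this sketch, and real password storage would normally use a deliberately slow key derivation function instead of a single hash.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None) -> tuple:
    """Return (salt, digest), generating a random 16-byte salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode()).digest()  # salt is prepended
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

salt, stored = hash_password("hello")        # both values are stored
print("Salt:", salt.hex())
print("Hash:", stored.hex())
print(verify_password("hello", salt, stored))  # correct password
print(verify_password("Hello", salt, stored))  # wrong password
```

Because the salt is random, running the sketch twice should give two different hashes for the same password, while verification still succeeds as long as the stored salt is reused.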

```python
# 03_03.py
from cryptography.hazmat.primitives import hashes
import binascii
from cryptography.hazmat.backends import default_backend

st = "hello"
salt = "N20"
showhex = "No"  # set to "yes" to treat st as a hex string

def show_hash(name, algorithm, data, salt):
    """Hash salt followed by data and print the hex and Base64 forms."""
    digest = hashes.Hash(algorithm, backend=default_backend())
    digest.update(salt)  # the salt is prepended to the data
    digest.update(data)
    res = digest.finalize()
    hexval = binascii.b2a_hex(res).decode()
    b64val = binascii.b2a_base64(res).decode()  # ends with a newline
    print(f"{name}: {hexval} {b64val}")

is_hex = (showhex == "yes")

try:
    if is_hex:
        data = binascii.a2b_hex(st)
    else:
        data = st.encode()

    print("Data: ", st)
    print(" Hex: ", binascii.b2a_hex(data).decode())
    print()

    show_hash("MD5", hashes.MD5(), data, salt.encode())
    show_hash("SHA1", hashes.SHA1(), data, salt.encode())
    show_hash("SHA224", hashes.SHA224(), data, salt.encode())
    show_hash("SHA256", hashes.SHA256(), data, salt.encode())
    show_hash("SHA384", hashes.SHA384(), data, salt.encode())
    show_hash("SHA3_224", hashes.SHA3_224(), data, salt.encode())
    show_hash("SHA3_256", hashes.SHA3_256(), data, salt.encode())
    show_hash("SHA3_384", hashes.SHA3_384(), data, salt.encode())
    show_hash("SHA3_512", hashes.SHA3_512(), data, salt.encode())
    show_hash("SHA512", hashes.SHA512(), data, salt.encode())
    show_hash("SHA512_224", hashes.SHA512_224(), data, salt.encode())
    show_hash("SHA512_256", hashes.SHA512_256(), data, salt.encode())

except Exception as e:
    print(e)
```

```
Data:  hello
 Hex:  68656c6c6f

MD5: bfd9929c0794146bae6dcccb4317e99c v9mSnAeUFGuubczLQxfpnA==

SHA1: 710fd92d6fa82d1851f9691ba48f29124890761d cQ/ZLW+oLRhR+WkbpI8pEkiQdh0=

SHA224: 079bb5febb6d8f6d8cdfdd8e63360aa1803170325ed6738bbf62b189 B5u1/rttj22M392OYzYKoYAxcDJe1nOLv2KxiQ==

SHA256: 5bf84899a6394b4d1ceaf8a28ffe61425a40ddef1a1563321437f8dd388ba0d2 W/hImaY5S00c6viij/5hQlpA3e8aFWMyFDf43TiLoNI=

SHA384: f5b0fd54d84d07531b579030015b699073efec74491cd62d0e78b217465d77a4cdd9ea5974622ceaecb4acd49da2cf0d 9bD9VNhNB1MbV5AwAVtpkHPv7HRJHNYtDniyF0Zdd6TN2epZdGIs6uy0rNSdos8N

SHA3_224: b9473057d8131ff11cf1cffeb5f4bdebe972baa131344a3b86c22030 uUcwV9gTH/Ec8c/+tfS96+lyuqExNEo7hsIgMA==

SHA3_256: 071cdb333657cce85c765a287dbfa3388e430fce8b48b398a08444674965edde BxzbMzZXzOhcdloofb+jOI5DD86LSLOYoIREZ0ll7d4=

SHA3_384: a1f631cb30bdebc5020cfdc0f7fe59dba5702f3c6f0f418bc756c759ae5e645d213ce81810fed5fc3473793d37beff3d ofYxyzC968UCDP3A9/5Z26VwLzxvD0GLx1bHWa5eZF0hPOgYEP7V/DRzeT03vv89

SHA3_512: bf033772d578b5b5f
```

> Verify that the hash value changes for different salt values.

> Rather than using a string for the salt value, can you modify the program so that it uses a random salt value of 16 bytes?

## Variable length hashes (XOF)
There are some hashing methods that support a variable number of bytes in the output hash. These include Blake2b and Blake2s, which take the digest size as a parameter, and SHAKE128 and SHAKE256, which are extendable-output functions (XOFs):

<img src='graphics/g_hash_07.png' width="800px">
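
One way to see the difference between an XOF and a hash with a configurable digest size is to compare a short output against a long one. The sketch below assumes Python's standard `hashlib` module, which accepts a range of digest sizes for BLAKE2b, and is not the Hazmat program used in this section:

```python
import hashlib

data = b"hello"

# SHAKE128 is an XOF: the output length is chosen when the digest is read
shake_short = hashlib.shake_128(data).digest(16)
shake_long = hashlib.shake_128(data).digest(64)
print(len(shake_short), len(shake_long))
print("SHAKE128 short output is a prefix of the long one:",
      shake_long.startswith(shake_short))

# BLAKE2b takes the digest size as a parameter when the hash is created
blake_short = hashlib.blake2b(data, digest_size=16).digest()
blake_long = hashlib.blake2b(data, digest_size=64).digest()
print(len(blake_short), len(blake_long))
print("BLAKE2b short output is a prefix of the long one:",
      blake_long.startswith(blake_short))
```

For SHAKE, asking for more bytes simply extends the same output stream. For BLAKE2b, the digest size is mixed into the hash parameters, so a 16-byte digest is not a truncation of the 64-byte one.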

```python
# 03_04.py
from cryptography.hazmat.primitives import hashes
import binascii
from cryptography.hazmat.backends import default_backend

st = "hello"
showhex = "No"  # set to "yes" to treat st as a hex string

def show_hash(name, algorithm, data):
    """Hash data with the given algorithm and print the hex and Base64 forms."""
    digest = hashes.Hash(algorithm, backend=default_backend())
    digest.update(data)
    res = digest.finalize()
    hexval = binascii.b2a_hex(res).decode()
    b64val = binascii.b2a_base64(res).decode()  # ends with a newline
    print(f"{name}: {hexval} {b64val}")

is_hex = (showhex == "yes")

try:
    if is_hex:
        data = binascii.a2b_hex(st)
    else:
        data = st.encode()

    print("Data: ", st)
    print(" Hex: ", binascii.b2a_hex(data).decode())
    print()

    # The argument to each method is the output size in bytes
    show_hash("Blake2b (64 bytes)", hashes.BLAKE2b(64), data)
    show_hash("Blake2s (32 bytes)", hashes.BLAKE2s(32), data)
    show_hash("SHAKE128 (64 bytes)", hashes.SHAKE128(64), data)
    show_hash("SHAKE256 (64 bytes)", hashes.SHAKE256(64), data)

except Exception as e:
    print(e)
```

```
Data:  hello
 Hex:  68656c6c6f

Blake2b (64 bytes): e4cfa39a3d37be31c59609e807970799caa68a19bfaa15135f165085e01d41a65ba1e1b146aeb6bd0092b49eac214c103ccfa3a365954bbbe52f74a2b3620c94 5M+jmj03vjHFlgnoB5cHmcqmihm/qhUTXxZQheAdQaZboeGxRq62vQCStJ6sIUwQPM+jo2WVS7vlL3Sis2IMlA==

Blake2s (32 bytes): 19213bacc58dee6dbde3ceb9a47cbb330b3d86f8cca8997eb00be456f140ca25 GSE7rMWN7m294865pHy7Mws9hvjMqJl+sAvkVvFAyiU=

SHAKE128 (64 bytes): 8eb4b6a932f280335ee1a279f8c208a349e7bc65daf831d3021c213825292463c59e22d0fe2c767cd7cacc4df42dd5f6147f0c5c512ecb9b933d14b9cc1b2974 jrS2qTLygDNe4aJ5+MIIo0nnvGXa+DHTAhwhOCUpJGPFniLQ/ix2fNfKzE30LdX2FH8MXFEuy5uTPRS5zBspdA==

SHAKE256 (64 bytes): 1234075ae4a1e77316cf2d8000974581a343b9ebbca7e3d1db83394c30f221626f594e4f0de63902349a5ea5781213215813919f92a4d86d127466e3d07e8be3 EjQHWuSh53MWzy2AAJdFgaNDueu8p+PR24M5TDDyIWJvWU5PDeY5AjSaXqV4EhMhWBORn5Kk2G0SdGbj0H6L4w==
```



> Run the program and verify the hashes produced.

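> Modify the program so that we get a hash with 16 bytes, and verify that the length is correct. Note that the `cryptography` library documents BLAKE2b and BLAKE2s as supporting only 64-byte and 32-byte outputs respectively, so change the SHAKE128 and SHAKE256 sizes.

> Modify the program so that we get a hash with 512 bytes, and verify that the length is correct.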

## LM and NTLM Hash
Older Microsoft Windows systems used the LM and NTLM hashes to store user passwords. Both methods are supported in the passlib library:

```python
import passlib.hash

string = "hello"
# hash() replaces the deprecated encrypt() method in passlib 1.7 and later
print("LM Hash:" + passlib.hash.lmhash.hash(string))
print("NT Hash:" + passlib.hash.nthash.hash(string))
```

> Compute the LM and NTLM hash for "edinburgh" and "glasgow". How many bytes are in the hash?
> Observe what happens to the hash when we use an input of "aaaaaa", "aaaaaaa", "aaaaaaaa", and "aaaaaaaaaaaa"?
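
To help interpret the results for the repeated "a" inputs, it can be useful to look at how LM prepares a password before hashing: it is converted to upper case, padded with null bytes (or truncated) to 14 bytes, and split into two 7-byte halves that are processed independently. The sketch below shows only this preparation step and omits the DES stage that produces the final hash; `lm_halves` is a hypothetical helper written for this illustration, and it assumes ASCII passwords:

```python
def lm_halves(password: str) -> tuple:
    """Return the two 7-byte blocks that LM derives from a password.

    Illustrative only: assumes an ASCII password and omits the DES step.
    """
    raw = password.upper().encode("ascii")[:14].ljust(14, b"\x00")
    return raw[:7], raw[7:]

for pw in ("aaaaaa", "aaaaaaa", "aaaaaaaa", "aaaaaaaaaaaa"):
    first, second = lm_halves(pw)
    print(f"{pw:>12}: {first!r} {second!r}")
```

Any password of seven characters or fewer leaves the second block as all null bytes, and two passwords that differ only in case produce identical blocks.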