# CMP 5006 - Information Security 


## Cryptographic Hashes


### Alejandro Proano, PhD. 

In [None]:
## MAP
a = dict()
# the hash-table slot that holds the value 1 is derived from hash('x'), the hash of the key
a['x'] = 1
a['y'] = 2

## SET
b = set()
# the hash-table slot that holds the element 1 is derived from hash(1)
b.add(1)
b.add(2)
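
The built-in `hash()` behind `dict` and `set` is a fast, non-cryptographic hash, which is a different tool from the cryptographic hashes covered in the rest of these notes. The following is a minimal sketch of the difference, assuming CPython's default behavior (string hashes are randomized per process unless `PYTHONHASHSEED` is fixed):

```python
import hashlib

# Keys that compare equal must hash equally, otherwise lookups would miss.
assert hash(1) == hash(1.0)

lookup = {1: "one"}
found = lookup[1.0]  # 1.0 == 1 and hash(1.0) == hash(1), so the same entry is found
print(found)

# hash('x') may change between interpreter runs (hash randomization),
# which is fine for an in-memory table but useless as a fingerprint.
print(hash("x"))

# A cryptographic digest of the same input is identical on every machine and run.
digest = hashlib.sha256(b"x").hexdigest()
print(digest)
```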

## What is a Cryptographic Hash Function?

- A mathematical algorithm that maps data of arbitrary size to a fixed-size output
- Designed to be a one-way function (practically impossible to invert)
- Key properties:
  1. Deterministic
  2. Quick to compute
  3. Infeasible to reverse
  4. Small changes in input cause large, unpredictable changes in output (avalanche effect)
  5. Collision-resistant
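
The first two properties are easy to check directly. The sketch below uses `hashlib.sha256` as a representative hash function; the helper name `sha256_hex` is introduced here for illustration only:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed-size output: 1 byte and 1,000,000 bytes both map to 64 hex digits.
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same input again gives the same digest.
print(sha256_hex(b"a") == short_digest)
```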


## Properties of Cryptographic Hash Functions

### 1. Pre-image Resistance

Given a hash value $h$, it should be computationally infeasible to find any message $m$ such that:

$\text{hash}(m) = h$
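
To get a feel for what "computationally infeasible" means, the sketch below attacks a deliberately weak toy hash: SHA-256 truncated to 16 bits. Both `toy_hash` and `find_preimage` are hypothetical helpers written for this illustration. With only $2^{16}$ possible outputs a brute-force search succeeds quickly; each additional output bit doubles the expected work, which is why the same search against the full 256 bits is out of reach.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Deliberately weak toy hash: the first n_bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_preimage(target: bytes, n_bytes: int = 2):
    """Try candidate messages "0", "1", "2", ... until one hashes to target."""
    for i in count():
        candidate = str(i).encode()
        if toy_hash(candidate, n_bytes) == target:
            return candidate, i + 1  # the message found and the number of tries

target = toy_hash(b"some unknown message")
preimage, tries = find_preimage(target)
print(preimage, tries)
```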

### 2. Second Pre-image Resistance

Given an input $m_1$, it should be computationally infeasible to find a different input $m_2$ such that:

$\text{hash}(m_1) = \text{hash}(m_2)$

### 3. Collision Resistance

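It should be computationally infeasible to find any two different messages $m_1$ and $m_2$ (here the attacker is free to choose both, unlike in second pre-image resistance, where $m_1$ is fixed) such that: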

$\text{hash}(m_1) = \text{hash}(m_2)$


## Common Cryptographic Hash Functions

1. MD5 (deprecated: practical collisions exist)
2. SHA-1 (deprecated: practical collisions exist)
3. SHA-2 family (SHA-256, SHA-512)
4. SHA-3 family
5. BLAKE2
6. BLAKE3
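
Several of these functions ship with Python's standard `hashlib` module. The sketch below compares output sizes for a few of them; BLAKE3 is left out because it is not part of the standard library, and MD5 and SHA-1 are left out because they are deprecated.

```python
import hashlib

names = ("sha256", "sha512", "sha3_256", "blake2b", "blake2s")

# Output size in bits for each algorithm.
sizes = {name: hashlib.new(name).digest_size * 8 for name in names}

for name in names:
    digest = hashlib.new(name, b"Hello, Crypto!").hexdigest()
    # Show only the first 16 hex digits to keep the table readable.
    print(f"{name:10s} {sizes[name]:4d} bits  {digest[:16]}...")
```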


## Example: SHA-256

SHA-256 produces a 256-bit (32-byte) hash value, typically rendered as a 64-digit hexadecimal number.


In [14]:
import hashlib
from scipy.spatial import distance

message = "Hello, Crypto!"
hash_object = hashlib.sha256(message.encode())
hex_dig = hash_object.hexdigest()
print(hex_dig)

# Same message with a single extra character appended
message1 = "Hello, Crypto!!"
hash_object1 = hashlib.sha256(message1.encode())
hex_dig1 = hash_object1.hexdigest()
print(hex_dig1)

# distance.hamming returns the fraction of positions that differ;
# multiplying by the length gives the number of differing hex digits
print(distance.hamming(list(hex_dig), list(hex_dig1)) * len(hex_dig), len(hex_dig))

de0640f1dc17ca1b01fb9eba3019ed07c12e2af4ae990ecb36aa669898a9fd40
b70e01f32faf33c2f7fb33025e7dd3839f54bd70d62b1d046d65359959ad66f6
58.0 64
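
The comparison above counts differing hexadecimal digits. The avalanche effect is usually stated in bits: changing the input should flip about half of the output bits. A minimal standard-library sketch of the bit-level count, where `bit_difference` is a helper introduced here for illustration:

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

d1 = hashlib.sha256(b"Hello, Crypto!").digest()
d2 = hashlib.sha256(b"Hello, Crypto!!").digest()

diff = bit_difference(d1, d2)
total_bits = len(d1) * 8
print(f"{diff} of {total_bits} bits differ")
```

For a well-behaved hash function the count is expected to land close to 128 of 256.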


## Applications of Cryptographic Hashes

1. Password storage
2. Data integrity verification
3. Digital signatures
4. Proof of work in blockchain
5. Hash-based message authentication codes (HMAC)
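
Two of these applications fit in a few lines with the standard library. The sketch below shows integrity verification with a plain digest and message authentication with `hmac`; the file contents and the key are made-up placeholder values.

```python
import hashlib
import hmac

data = b"contents of a downloaded file"  # placeholder data

# Data integrity: compare the digest of what we received with the published one.
published_digest = hashlib.sha256(data).hexdigest()
intact = hashlib.sha256(data).hexdigest() == published_digest
tampered = hashlib.sha256(data + b"!").hexdigest() == published_digest
print(intact, tampered)

# HMAC: only someone who knows the shared key can produce a valid tag.
key = b"shared-secret-key"  # placeholder key
tag = hmac.new(key, data, hashlib.sha256).hexdigest()

valid = hmac.compare_digest(tag, hmac.new(key, data, hashlib.sha256).hexdigest())
forged = hmac.compare_digest(tag, hmac.new(b"wrong-key", data, hashlib.sha256).hexdigest())
print(valid, forged)
```

A plain digest detects accidental corruption, but anyone can recompute it after tampering; the HMAC tag additionally depends on the secret key.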


## Password Storage Example

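Instead of storing plaintext passwords, store their hashes. The version below shows the basic idea only: a single unsalted SHA-256 is fast to brute-force, so real systems use a salted, deliberately slow function such as PBKDF2: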


In [4]:
def hash_password(password):
    return hashlib.sha256(password.encode()).hexdigest()

def verify_password(stored_hash, provided_password):
    return stored_hash == hash_password(provided_password)

In [15]:
# Usage
stored_hash = hash_password("mySecurePassword123")
print(verify_password(stored_hash, "mySecurePassword123"))  # True
print(verify_password(stored_hash, "wrongPassword"))        # False

True
False
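
One way to see the weakness of a fast, unsalted hash is a dictionary attack. The sketch below assumes an attacker holds a leaked hash and a list of common passwords; both are made-up values for illustration.

```python
import hashlib

def hash_password(password: str) -> str:
    """Unsalted SHA-256, as in the example above."""
    return hashlib.sha256(password.encode()).hexdigest()

leaked_hash = hash_password("password123")                  # hypothetical leaked value
wordlist = ["123456", "qwerty", "password123", "letmein"]   # hypothetical guesses

# Hash every guess and compare; each attempt costs a single SHA-256 call.
cracked = next((w for w in wordlist if hash_password(w) == leaked_hash), None)
print(cracked)

# Without a salt, two users with the same password get the same stored hash.
same = hash_password("password123") == leaked_hash
print(same)
```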


## Salt in Password Hashing

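To prevent rainbow table attacks, store a random, per-user salt alongside each hash, so that identical passwords produce different stored values and precomputed tables become useless:

$\text{hash} = H(\text{password} \,\|\, \text{salt})$

Python example, using PBKDF2-HMAC-SHA256 with 100,000 iterations so that each guess is also slow to compute: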


In [6]:
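import hashlib
import hmac
import os

def hash_password(password):
    salt = os.urandom(32)  # fresh random 32-byte salt for every password
    key = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 100000)
    return salt + key      # store the salt next to the derived key

def verify_password(stored_password, provided_password):
    salt = stored_password[:32]  # first 32 bytes: salt
    key = stored_password[32:]   # remaining bytes: derived key
    new_key = hashlib.pbkdf2_hmac('sha256', provided_password.encode(), salt, 100000)
    return hmac.compare_digest(key, new_key)  # constant-time comparison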

In [7]:
stored_hash = hash_password("mySecurePassword123")
print(verify_password(stored_hash, "mySecurePassword123"))  # True
print(verify_password(stored_hash, "wrongPassword"))        # False

True
False
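
The effect of the salt can be checked directly: hashing the same password twice yields two different stored values. The sketch below repeats the salted `hash_password` so that it runs on its own; the `iterations` parameter is added here for illustration.

```python
import hashlib
import os

def hash_password(password: str, iterations: int = 100_000) -> bytes:
    """Return salt || PBKDF2-HMAC-SHA256(password, salt)."""
    salt = os.urandom(32)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt + key

first = hash_password("mySecurePassword123")
second = hash_password("mySecurePassword123")

# Same password, different random salts, so different stored values.
print(first != second)

# 32-byte salt followed by a 32-byte derived key.
print(len(first))
```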


## Hash Collision Example

Birthday paradox: in a room of 23 people, there is a slightly better than 50% chance (about 50.7%) that at least two of them share a birthday.
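
The figure for 23 people can be verified with a short calculation, assuming 365 equally likely birthdays: the probability that all birthdays are distinct is $\prod_{i=0}^{k-1} \frac{365 - i}{365}$, and the chance of a shared birthday is its complement.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

p22 = shared_birthday_probability(22)
p23 = shared_birthday_probability(23)
print(round(p22, 4), round(p23, 4))  # 0.4757 0.5073
```

23 is the smallest group size for which the probability exceeds one half.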

For a hash function with an $n$-bit output, a birthday attack is expected to find a collision after roughly $2^{n/2}$ hash evaluations, far fewer than the roughly $2^n$ needed for a brute-force pre-image search.

For SHA-256 (256-bit output), that is on the order of $2^{128}$ hashes for a 50% chance of finding a collision.


In [16]:
(2**128)

340282366920938463463374607431768211456
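
A search of size $2^{128}$ cannot be run, but the birthday bound can be observed on a shortened hash. The sketch below truncates SHA-256 to 24 bits (an assumption made purely for the demonstration) and stores every value seen until one repeats; the helpers `truncated_sha256` and `find_collision` are hypothetical names.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, n_bits: int = 24) -> int:
    """Top n_bits of the SHA-256 digest, as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def find_collision(n_bits: int = 24):
    """Hash "0", "1", "2", ... and stop at the first repeated truncated value."""
    seen = {}
    for i in count():
        message = str(i).encode()
        h = truncated_sha256(message, n_bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message

m1, m2, attempts = find_collision()
print(m1, m2, attempts)
```

With a 24-bit output the search is expected to finish after a few thousand attempts, on the order of $2^{12}$, rather than the $2^{24}$ possible outputs.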

## Merkle–Damgård Construction

Many hash functions (MD5, SHA-1, SHA-2) use this construction:

1. Pad the message
2. Split into fixed-size blocks
3. Process blocks sequentially with a compression function
4. Output the final state as the hash

```
       m₁           m₂           m₃
       |            |            |
       v            v            v
IV --> f --> h₁ --> f --> h₂ --> f --> Hash
```

Here $f$ is the compression function, IV is the initial value, and $h_1, h_2$ are the intermediate chaining values passed from one block to the next.
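
The four steps can be sketched as a toy hash. This is an illustration of the structure only, not a secure design: the block size, state size, all-zero IV, and the compression function (truncated SHA-256 of the state concatenated with the block) are assumptions chosen to keep the example short.

```python
import hashlib

BLOCK_SIZE = 16          # bytes per message block (toy value)
STATE_SIZE = 8           # bytes of chaining state (toy value)
IV = bytes(STATE_SIZE)   # all-zero initial value (toy value)

def pad(message: bytes) -> bytes:
    """Step 1: append 0x80, zero bytes, then the 8-byte message length in bits."""
    length = (len(message) * 8).to_bytes(8, "big")
    padded = message + b"\x80"
    padded += bytes(-(len(padded) + 8) % BLOCK_SIZE)
    return padded + length

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function f(state, block)."""
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]

def toy_md_hash(message: bytes) -> bytes:
    state = IV
    padded = pad(message)
    # Steps 2 and 3: split into blocks and feed them through f one after another.
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    # Step 4: the final chaining value is the hash.
    return state

print(toy_md_hash(b"Hello, Crypto!").hex())
```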
