# Cryptography (CC4017) -- Week 5

## Exercise 1

Use OpenSSL to calculate the SHA256 value of the pdf slides of this week’s class. Check if it equals:

> d51b15eeed16158b0a2d0d50c92e3b34f62140b7627b88dca62d4a27e8f0f569

Command used:
> openssl dgst -sha256 cryptoSlidesW5.pdf

Result:
> d51b15eeed16158b0a2d0d50c92e3b34f62140b7627b88dca62d4a27e8f0f569


### Exercise 1.1

What does this tell you about the integrity of the file?

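> Since the SHA-256 value matches the expected hash, the file is, with overwhelming probability, bit-for-bit identical to the one the expected hash was computed from: it has not been altered or corrupted, so its integrity is preserved. This guarantee only holds if the expected hash itself was obtained from a trusted source.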
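The same check can be done from Python instead of the OpenSSL command line. The following is a minimal sketch using `hashlib`; the helper names `sha256_of_file` and `matches_expected` are illustrative and not part of the exercise material.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the slides, one would call `matches_expected("cryptoSlidesW5.pdf", "d51b15ee...")` with the full expected value from the exercise statement.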

### Exercise 1.2

Suppose you alter the first 4 bytes of the original pdf file, and recompute the SHA256 value of this altered file.
How many bytes do you expect to be affected by this change?

> In hash functions like SHA-256, even a small change in the input (here, 4 bytes of the PDF) completely changes the resulting hash. This avalanche effect makes a small input change propagate through the entire output, so the new SHA-256 value looks unrelated to the original.
>
> Changing 4 bytes of the PDF therefore affects the whole digest. SHA-256 always produces a 256-bit (32-byte) output regardless of the file size, and on average about half of those 256 bits flip, so essentially all 32 bytes of the hash are expected to change.
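The avalanche effect can be observed directly. The sketch below uses a short byte string as a hypothetical stand-in for the PDF contents (the real slide file is not assumed to be available), alters its first 4 bytes, and counts how many bits and bytes of the two digests differ.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical stand-in for the contents of the PDF file
original = b"%PDF-1.7 hypothetical stand-in for the slide file contents"
# Invert the first 4 bytes, leave everything else untouched
altered = bytes(b ^ 0xFF for b in original[:4]) + original[4:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(altered).digest()

changed_bytes = sum(x != y for x, y in zip(d1, d2))
print(f"{differing_bits(d1, d2)} of 256 bits differ; {changed_bytes} of 32 bytes differ")
```

The bit count should land near 128, and it is normal for an occasional byte to stay the same purely by chance.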



## Exercise 2

Use Python to crack the security of predictable passwords in `crack_hash.py`.

- The file has the twenty most common passwords of 2019. 
- The code produces hash values of passwords (salted and non-salted), then they are shuffled.
- From the shuffled hashes and the list of most common passwords, retrieve the original passwords!
- Is it faster to attack salted or unsalted hashes?
- Include a succinct analysis of how long it takes to do these attacks.

<br>

>When comparing attack speed, unsalted hashes are faster to attack because there is no salt to account for: we hash each candidate password once and look the result up in the list of hashes. Salted hashes additionally require brute-forcing the salt, which multiplies the number of candidates to check, so the attack is slower.
>
>The unsalted attack completes almost instantly, since it needs only 20 hash computations. The salted attack takes longer because the salt is one byte, so up to 256 salt values must be tried for each password: in the worst case 256 × 20 = 5120 hash computations, roughly 256 times the work of the unsalted attack.
>
>In the given code the salt is generated once with `os.urandom(1)` and reused for every password. Reusing the salt removes most of the benefit of salting: once the salt has been recovered for one password, it is known for all the others. Combined with the tiny salt space, this explains why the salted hashes were also cracked quickly, almost as fast as the unsalted ones.
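The two attacks can be sketched as follows. This is not the code of `crack_hash.py`: it assumes the hashes are SHA-256 digests, that the salted variant is computed as SHA-256(salt + password) with one shared one-byte salt, and it uses a short hypothetical password list (`COMMON`) in place of the twenty passwords from the exercise.

```python
import hashlib
import os
import time

# Hypothetical sample; the exercise file contains twenty passwords
COMMON = [b"123456", b"password", b"qwerty", b"iloveyou"]

def crack_unsalted(hashes, candidates):
    """Hash every candidate once, then look each target hash up in the table."""
    table = {hashlib.sha256(p).digest(): p for p in candidates}
    return [table.get(h) for h in hashes]

def crack_salted(hashes, candidates):
    """Try all 256 one-byte salts (assumed layout: SHA-256(salt + password))."""
    targets = set(hashes)
    for s in range(256):
        salt = bytes([s])
        table = {hashlib.sha256(salt + p).digest(): p for p in candidates}
        if not targets.isdisjoint(table):  # at least one hash matched: salt found
            return salt, [table.get(h) for h in hashes]
    return None, [None] * len(hashes)

# Build targets the way the exercise is assumed to: one shared random salt
salt = os.urandom(1)
unsalted = [hashlib.sha256(p).digest() for p in COMMON]
salted = [hashlib.sha256(salt + p).digest() for p in COMMON]

t0 = time.perf_counter()
plain = crack_unsalted(unsalted, COMMON)
t1 = time.perf_counter()
found_salt, recovered = crack_salted(salted, COMMON)
t2 = time.perf_counter()
print(f"unsalted: {t1 - t0:.6f} s, salted: {t2 - t1:.6f} s")
```

Because the salt is shared, the search stops at the first salt value that matches any hash, so the measured slowdown depends on where the random salt falls in the range 0 to 255.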

## Exercise 3

Use the tool available [here](http://alf.nu/SHA1) (or any other tool that works) to construct two PDFs with the same SHA-1 value.
Check out the SHAttered paper and explain how the attack works.

<br>

> The SHAttered attack exploits mathematical weaknesses in SHA-1's compression function. It is an identical-prefix collision attack: both files start with the same prefix, and the attacker computes two different short sequences of blocks that, placed after that prefix, lead to the same hash value. (A chosen-prefix collision, where the two files start with different attacker-chosen prefixes, is a stronger attack that was demonstrated against SHA-1 only later.) This is very different from finding a collision by hashing random inputs, which for a 160-bit hash takes about 2^80 SHA-1 computations; SHAttered needed roughly 2^63.1.
>
> The process involves several steps:
>
> Collision Generation: Starting from the common prefix, the attackers search for two pairs of 64-byte message blocks (near-collision blocks) that differ in only a few bits. After the first pair the internal states of the two computations are almost equal; after the second pair they are identical.
>
> Differential Path: The search is guided by a differential path, a precise description of how the chosen differences between the two message blocks propagate through the 80 steps of the compression function so that they cancel out in the end. Constructing it requires a detailed understanding of the mathematical properties of SHA-1 and of how the algorithm works internally.
>
> Finalization: Because SHA-1 processes its input block by block, once the internal states are equal any identical suffix appended to both files keeps the hashes equal. In the example PDFs the colliding blocks sit inside an embedded JPEG image, so the few differing bytes decide which picture is displayed while the rest of both files is identical. The result is two valid PDFs with different visible content and the same SHA-1 hash.
>
> The SHAttered attack is significant because it exposes the flaws in SHA-1 and emphasizes the need for more robust hashing algorithms, like SHA-256 or SHA-3. The ability to hash two distinct files to the same value can have detrimental effects on certificates, digital signatures, and data integrity in general.
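Whether two files really form a SHA-1 collision can be checked with a few lines of `hashlib`. The sketch below uses the illustrative helper names `compare_digests` and `compare_files`; the paths passed to `compare_files` would be wherever the two generated PDFs were saved.

```python
import hashlib

def compare_digests(data_a: bytes, data_b: bytes) -> dict:
    """Report whether two byte strings are equal and whether their digests agree."""
    return {
        "identical_content": data_a == data_b,
        "same_sha1": hashlib.sha1(data_a).digest() == hashlib.sha1(data_b).digest(),
        "same_sha256": hashlib.sha256(data_a).digest() == hashlib.sha256(data_b).digest(),
    }

def compare_files(path_a, path_b) -> dict:
    """Read both files fully and compare them with compare_digests."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return compare_digests(fa.read(), fb.read())
```

For a genuine colliding pair the expected report is `identical_content` false, `same_sha1` true and `same_sha256` false, which also shows that the collision is specific to SHA-1.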

## Exercise 4

A length extension attack works as follows.
- Application generates secret key k, which is kept hidden
- At some point application computes h = H(k||m) for some message m and publishes (m, h).
- Intuitively, it should be impossible for an attacker to compute H(k||m′) for m′ ≠ m.
- However, for some hash functions, it is possible to compute such a value using only (m, h). This technique has been explained in theoretical classes for the SHA-2 family. Demonstrate the attack by constructing:
    - A Python program that generates k, computes h = SHA2(k||m) for some m and saves k, m and h into different files.
    - Another Python program that reads m and h (but not k!) and generates some m′ and h′ into different files. It must be the case that SHA2(k||m′) = h′ and that m′ ≠ m.

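Program 1, run by the application, which knows `k`:

    import os
    import hashlib

    # Secret 16-byte key, known only to the application
    k = os.urandom(16)
    m = b"Hello, world!"

    # Vulnerable construction: h = SHA-256(k || m)
    h = hashlib.sha256(k + m).hexdigest()

    # Save k, m and h into separate files
    with open('key.txt', 'wb') as key_file:
        key_file.write(k)

    with open('message.txt', 'wb') as message_file:
        message_file.write(m)

    with open('hash.txt', 'w') as hash_file:
        hash_file.write(h)

    print("\n Key, message, and hash written to files.\n")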


 Key, message, and hash written to files.



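Program 2, run by the attacker, who reads `m` and `h` but never `k`. The attack resumes SHA-256 from the internal state encoded in `h`. Since `hashlib` cannot be initialized from a digest, the program carries its own compression function. The key length is assumed to be known or guessed by the attacker.

    import struct

    # SHA-256 round constants (FIPS 180-4)
    K = [
        0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
        0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
        0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
        0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
        0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
        0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
        0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
        0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
    ]

    def rotr(x, n):
        return ((x >> n) | (x << (32 - n))) & 0xffffffff

    def compress(state, block):
        """Apply the SHA-256 compression function to one 64-byte block."""
        w = list(struct.unpack('>16L', block))
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & 0xffffffff)
        a, b, c, d, e, f, g, hh = state
        for i in range(64):
            t1 = (hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) + ((e & f) ^ (~e & g)) + K[i] + w[i]) & 0xffffffff
            t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) + ((a & b) ^ (a & c) ^ (b & c))) & 0xffffffff
            a, b, c, d, e, f, g, hh = (t1 + t2) & 0xffffffff, a, b, c, (d + t1) & 0xffffffff, e, f, g
        return [(x + y) & 0xffffffff for x, y in zip(state, (a, b, c, d, e, f, g, hh))]

    def sha2_padding(total_byte_length):
        """Padding that SHA-256 appends to a message of the given byte length."""
        zeros = (56 - (total_byte_length + 1) % 64) % 64
        return b'\x80' + b'\x00' * zeros + struct.pack('>Q', total_byte_length * 8)

    with open('message.txt', 'rb') as message_file:
        m = message_file.read()

    with open('hash.txt', 'r') as hash_file:
        h = hash_file.read().strip()

    KEY_LENGTH = 16          # assumption: the attacker knows or guesses len(k)
    suffix = b"Bye, world!"  # data the attacker wants to append

    # k || m || glue is exactly the data SHA-256 had absorbed when it produced h
    glue = sha2_padding(KEY_LENGTH + len(m))
    m_prime = m + glue + suffix

    # Resume from the internal state encoded in h and absorb the suffix
    state = list(struct.unpack('>8L', bytes.fromhex(h)))
    hashed_so_far = KEY_LENGTH + len(m) + len(glue)  # a multiple of 64
    tail = suffix + sha2_padding(hashed_so_far + len(suffix))
    for i in range(0, len(tail), 64):
        state = compress(state, tail[i:i + 64])
    h_prime = struct.pack('>8L', *state).hex()

    with open('message_prime.txt', 'wb') as message_prime_file:
        message_prime_file.write(m_prime)

    with open('hash_prime.txt', 'w') as hash_prime_file:
        hash_prime_file.write(h_prime)

    print("\n New message and hash written to files.\n")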


 New message and hash written to files.
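Whether the generated pair satisfies the requirement SHA2(k||m′) = h′ with m′ ≠ m can only be confirmed by someone who holds `k`. The following sketch performs that check; the helper names `is_valid_forgery` and `verify_from_files` are illustrative, and the file names are the ones used by the two programs above.

```python
import hashlib

def is_valid_forgery(k: bytes, m: bytes, m_prime: bytes, h_prime_hex: str) -> bool:
    """True if m' differs from m and SHA-256(k || m') equals the claimed h'."""
    expected = hashlib.sha256(k + m_prime).hexdigest()
    return m_prime != m and expected == h_prime_hex.strip().lower()

def verify_from_files() -> bool:
    """Load key, messages and forged hash from disk and check the forgery."""
    with open("key.txt", "rb") as f:
        k = f.read()
    with open("message.txt", "rb") as f:
        m = f.read()
    with open("message_prime.txt", "rb") as f:
        m_prime = f.read()
    with open("hash_prime.txt", "r") as f:
        h_prime_hex = f.read()
    return is_valid_forgery(k, m, m_prime, h_prime_hex)
```

Calling `verify_from_files()` after running both programs returns `True` only when the attacker's output is a genuine extension of the original message under the secret key.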

