# Lab 02  
Hashes: Implementing a Weak Hashing Algorithm and Experimenting with a Hash Length Extension Attack  

# 2.1 Avalanche Effect

## Task Description
You are required to demonstrate the avalanche effect by using two strings:

1. **3.1_input_string.txt**: Original string
2. **3.1_perturbed_string.txt**: Perturbed string (an exact copy of the original string with one bit flipped)

### Objective
We will generate the SHA-256 hash of both strings and count how many bits are different in the two results (a.k.a. the Hamming distance).
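
As a minimal sketch of this measurement, the snippet below flips one bit of an in-memory message (instead of reading the lab files) and counts the differing bits between the two SHA-256 digests. The helper names `flip_bit` and `digest_hamming_distance` and the sample message are illustrative assumptions, not part of the lab material.

```python
import hashlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (bit 0 = LSB of byte 0)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)

def digest_hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between SHA-256(a) and SHA-256(b)."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 wherever the digests disagree
    return bin(da ^ db).count("1")

original = b"avalanche demo"        # hypothetical sample input
perturbed = flip_bit(original, 0)   # differs from the original in a single bit
distance = digest_hamming_distance(original, perturbed)
print(f"{distance} of 256 digest bits differ")
```

For a hash with a good avalanche effect, the count is expected to land near 128, about half of the 256 output bits.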

### Figure 1
Avalanche effect: When a single bit is changed, the hash sum becomes completely different. (Image source: Wikipedia)

## Steps to Complete
1. **Generate SHA-256 Hashes**:
   - Compute the SHA-256 hashes for both strings.
   - Verify that the hashes are different.
   - You can use the following command:
     ```bash
     openssl dgst -sha256 3.1_input_string.txt 3.1_perturbed_string.txt
     ```

2. **Compute Hamming Distance**:
   - Calculate the number of bits that are different between the two hash outputs.

3. **Submit Result**:
   - Write the Hamming distance as a hex string in the file `solution31.hex`.
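
Steps 2 and 3 can be sketched as follows, assuming the two digests have been copied as hex strings from the `openssl` output. The function names and the short sample values are hypothetical; only the file name `solution31.hex` comes from the task.

```python
def hamming_distance_hex(hex1: str, hex2: str) -> int:
    """Number of differing bits between two equal-length hex strings."""
    if len(hex1) != len(hex2):
        raise ValueError("digests must have the same length")
    return bin(int(hex1, 16) ^ int(hex2, 16)).count("1")

def to_hex_string(value: int) -> str:
    """Format an integer as lowercase hex without the '0x' prefix."""
    return format(value, "x")

# Tiny sample values (not real SHA-256 digests): 0xff ^ 0x0f = 0xf0, which has 4 set bits
sample_distance = hamming_distance_hex("ff", "0f")
print(sample_distance, to_hex_string(sample_distance))

# A distance of 135 bits is written as the hex string "87"
print(to_hex_string(135))
```

The string returned by `to_hex_string` is what would go into `solution31.hex`.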


In [7]:
import hashlib

def read_string_from_file(filename):
    # Note: strip() drops leading/trailing whitespace, so a file that ends with a
    # newline hashes differently here than with `openssl dgst` on the raw file.
    with open(filename, 'r') as file:
        return file.read().strip()

def compute_sha256_hash(string):
    # Hash the UTF-8 encoding of the string and return the digest as 64 hex characters
    return hashlib.sha256(string.encode()).hexdigest()

def hamming_distance(hash1, hash2):
    # XOR the two 256-bit digests; every 1 bit in the result marks a differing position
    return bin(int(hash1, 16) ^ int(hash2, 16)).count("1")

def main():
    # Read the original and perturbed strings from files
    original_string = read_string_from_file("./Avalence/3.1_input_string.txt")
    perturbed_string = read_string_from_file("./Avalence/3.1_perturbed_string.txt")

    # Compute SHA-256 hashes
    original_hash = compute_sha256_hash(original_string)
    perturbed_hash = compute_sha256_hash(perturbed_string)

    # Print the hashes
    print("Original String SHA-256 Hash:", original_hash)
    print("Perturbed String SHA-256 Hash:", perturbed_hash)

    # Compute Hamming distance
    distance = hamming_distance(original_hash, perturbed_hash)
    print("Hamming Distance (number of differing bits):", distance)

    # Write the Hamming distance as a hex string to solution31.hex
    with open("./Avalence/solution31.hex", "w") as solution_file:
        solution_file.write(hex(distance)[2:])  # Convert to hex and strip '0x'

if __name__ == "__main__":
    main()


Original String SHA-256 Hash: de4960933bf2bac6dd5ab2b55543d20a2043e51b35adab3528cb7fd0454d55f2
Perturbed String SHA-256 Hash: 78ef06a64e25bc760134d5360a48bf038dafc5a7eab2390ada4d8bd6db19c49d
Hamming Distance (number of differing bits): 135


# 2.2 Weak Hashing Algorithm (5 points)

## Task Description

Files:
1. **3.2_input_string.txt**: input string

Below you’ll find the pseudocode for a weak hashing algorithm we’re calling WHA. It operates on bytes (a block size of 8 bits) and outputs a 32-bit hash.

### WHA Pseudocode


In [8]:
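def wha(input_string):
    # Mask keeps the low 30 bits of a value
    Mask = 0x3FFFFFFF
    outHash = 0

    for byte in input_string.encode('utf-8'):
        # Spread the byte over four positions, each XORed with a different constant
        intermediate_value = (
            ((byte ^ 0xCC) << 24) |
            ((byte ^ 0x33) << 16) |
            ((byte ^ 0xAA) << 8) |
            (byte ^ 0x55)
        )
        # Add the masked values; the sum of two 30-bit numbers fits in 31 bits
        outHash = (outHash & Mask) + (intermediate_value & Mask)

    return outHash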

def main():
    # Read the input string from the file
    with open("./weak_hash/3.2_input_string.txt", "r") as file:
        input_string = file.read().strip()

    # Calculate the WHA hash of the input string
    target_hash = wha(input_string)
    print(f"Target Hash for input string '{input_string}': 0x{target_hash:08x}")

    # Optionally, write the hash to a file
    with open("./weak_hash/solution32.txt", "w") as output_file:
        output_file.write(f"Hash: 0x{target_hash:08x}\n")

if __name__ == "__main__":
    main()


Target Hash for input string 'THE 1910S ARE REMEMBERED ON THE STAMP SEEN HERE FOR THE 1914 OPRNING OF THIS CANAL': 0x3cd0c2f9
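
One way to see why WHA is weak: each byte contributes its masked value through plain addition, so the result depends only on the running sum (modulo 2^30) of all bytes before the last one, plus the value of the last byte. Reordering any bytes other than the final one therefore leaves the hash unchanged. The sketch below restates `wha` on raw bytes and checks this with made-up sample messages; it illustrates the weakness and is not presented as the required solution.

```python
MASK = 0x3FFFFFFF

def wha(data: bytes) -> int:
    """WHA as implemented above, taking bytes instead of a str."""
    out_hash = 0
    for byte in data:
        intermediate_value = (
            ((byte ^ 0xCC) << 24) | ((byte ^ 0x33) << 16) |
            ((byte ^ 0xAA) << 8) | (byte ^ 0x55)
        )
        out_hash = (out_hash & MASK) + (intermediate_value & MASK)
    return out_hash

# Hypothetical sample messages: same bytes, first two swapped, last byte untouched
first = b"HELLO WORLD"
second = b"EHLLO WORLD"
print(hex(wha(first)), hex(wha(second)))
print("collision:", first != second and wha(first) == wha(second))
```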


# 2.3 Length Extension Attack (10 points)

Please read the corresponding documentation from the resources section of the course webpage.

### Files
1. **3.3_query.txt**: query 
2. **3.3_command3.txt**: command3 
3. **pymd5.py**: A Python file containing useful functions 

### Background
One example of when length extension causes a serious vulnerability is when people mistakenly try to construct something like an HMAC by using `Hash(secret || message)`, where `||` indicates concatenation.

For example, consider a web application with an API that allows client-side programs to perform an action on behalf of a user by loading URLs of the form:


where `token` is `MD5(user’s 8-character password || user=....` [the rest of the URL starting from `user=` and ending with the last command]`)`.
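
The mechanism behind the attack can be sketched without any helper library. An MD5 digest is simply the internal state after the last block, so anyone holding the token can resume hashing from that state, provided they also reproduce the padding the server's hash added. The snippet below is a self-contained illustration written from the MD5 specification (RFC 1321), not the course's `pymd5` code; the names `compress`, `md5_padding`, and `extend`, and the password `hunter22`, are assumptions made for the demo.

```python
import hashlib
import math
import struct

# Per-round shift amounts and sine-derived constants from RFC 1321
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
CONSTS = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def compress(state, block):
    """Apply the MD5 compression function to one 64-byte block."""
    a, b, c, d = state
    words = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + CONSTS[i] + words[g]) & 0xFFFFFFFF
        a, d, c = d, c, b
        b = (b + rotl(f, SHIFTS[i])) & 0xFFFFFFFF
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d)))

def md5_padding(length_bytes):
    """Padding MD5 appends to a message of the given length: 0x80, zeros, 64-bit bit count."""
    zeros = (55 - length_bytes) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", length_bytes * 8)

def extend(token_hex, hashed_len, suffix):
    """Return (new_token, glue) for secret || message || glue || suffix.

    hashed_len is len(secret) + len(message); the secret itself is never needed.
    """
    state = struct.unpack("<4I", bytes.fromhex(token_hex))
    glue = md5_padding(hashed_len)
    total_len = hashed_len + len(glue) + len(suffix)
    tail = suffix + md5_padding(total_len)  # always a multiple of 64 bytes
    for offset in range(0, len(tail), 64):
        state = compress(state, tail[offset:offset + 64])
    return struct.pack("<4I", *state).hex(), glue

# Demo: the "server" knows the secret, the "attacker" only sees token, message and length
secret = b"hunter22"  # hypothetical 8-character password
message = b"user=admin&command1=ListFiles&command2=NoOp"
suffix = b"&command3=DeleteAllFiles"
token = hashlib.md5(secret + message).hexdigest()

new_token, glue = extend(token, len(secret) + len(message), suffix)
server_view = hashlib.md5(secret + message + glue + suffix).hexdigest()
print("forged token accepted:", new_token == server_view)
```

Note that `extend` receives only the token, the total hashed length, and the suffix. This is why knowing the password length (8 characters) is enough to forge a valid token.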

### Task
The query portion of the URL (**3.3_query.txt**) and the command to append (**3.3_command3.txt**) are provided as text files. Using the techniques that you learned in the lectures, and without guessing the password, apply length extension to create a new query that ends with the command specified in the file, `&command3=DeleteAllFiles`, and that the server treats as valid.

**Historical fact**: In 2009, security researchers found that the API used by the photo-sharing site Flickr suffered from a length-extension vulnerability almost exactly like the one in this exercise.

### Submission Requirements
- Submit a Python script named **len_ext_attack.py**.
- Submit a text file named **solution33.txt** that should contain the updated query.

### Script Requirements
Your Python script should perform the following:
1. Modify the query so that it will execute the `DeleteAllFiles` command as the user. 
2. Verify that your length extension works.


In [2]:
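import hashlib
from urllib.parse import quote_from_bytes

# Assumption: the course-provided pymd5.py exposes md5(state=..., count=...) and
# padding(message_length_in_bits); adjust the calls if your copy differs.
from pymd5 import md5, padding

PASSWORD_LEN = 8  # the task states that the password has 8 characters

def read_file(file_path):
    with open(file_path, 'r') as file:
        return file.read().strip()

def length_extension_attack(original_query, new_command, secret_len=PASSWORD_LEN):
    # Split "token=<hex>&user=..." into the token and the message the server hashed
    token_part, message = original_query.split("&", 1)
    original_token = token_part.split("=", 1)[1]

    # The server hashed secret || message || glue, where glue is the MD5 padding
    hashed_len = secret_len + len(message)
    glue = padding(hashed_len * 8)

    # Resume MD5 from the state encoded in the token; count is the number of bits
    # already processed (secret, message and glue padding)
    h = md5(state=bytes.fromhex(original_token), count=(hashed_len + len(glue)) * 8)
    h.update(new_command.encode())
    new_token = h.hexdigest()

    # Percent-encode the glue bytes so that they survive inside the URL
    new_message = message + quote_from_bytes(glue) + new_command
    return new_token, f"token={new_token}&{new_message}", glue

def verify(message, new_command, test_secret="testpass"):
    # Local check with a made-up 8-character secret: the extended token must equal
    # MD5(secret || message || glue || command) computed directly
    token = hashlib.md5((test_secret + message).encode()).hexdigest()
    new_token, _, glue = length_extension_attack(f"token={token}&{message}", new_command)
    expected = hashlib.md5((test_secret + message).encode() + glue + new_command.encode())
    return new_token == expected.hexdigest()

def main():
    # Read original query and new command (the command is expected to start with '&')
    original_query = read_file('3.3_query.txt')
    new_command = read_file('3.3_command3.txt')

    # Perform the length extension attack without knowing the password
    new_token, modified_query, _ = length_extension_attack(original_query, new_command)

    # Write the modified query to solution33.txt
    with open('solution33.txt', 'w') as solution_file:
        solution_file.write(modified_query)

    # Verification output
    print("Modified Query:", modified_query)
    print("New Token:", new_token)
    print("Local verification passed:", verify(original_query.split("&", 1)[1], new_command))

if __name__ == "__main__":
    main()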
