# Computational Theory Assignment

In [40]:
# Necessary imports

import numpy as np

## Problem 1

### SHA-1 Parity Function

The `Parity` function, as defined in **Section 4.1.1** of the Secure Hash Standard (FIPS 180-4), implements a fundamental SHA-1 bitwise operation. It performs a bitwise XOR (exclusive OR) across three 32-bit input values, expressed as `x ^ y ^ z`.

**Mathematical Properties:**
- XOR is **associative**: `(x ^ y) ^ z = x ^ (y ^ z)`
- XOR is **commutative**: `x ^ y = y ^ x`
- These properties mean the order of operations doesn't affect the result
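
As a quick sanity check of these two properties, the sketch below compares different groupings and orderings on random 32-bit values. The helper name `xor3`, the constant `MASK32`, and the fixed random seed are illustrative assumptions and are not part of the assignment code.

```python
import random

MASK32 = 0xFFFFFFFF  # keep values within 32 bits


def xor3(x, y, z):
    """Bitwise XOR of three 32-bit values (hypothetical helper for this check)."""
    return (x ^ y ^ z) & MASK32


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    x, y, z = (random.getrandbits(32) for _ in range(3))
    # Associativity: the grouping does not matter
    assert (x ^ y) ^ z == x ^ (y ^ z)
    # Commutativity: the operand order does not matter
    assert xor3(x, y, z) == xor3(z, x, y) == xor3(y, z, x)
```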

**Cryptographic Purpose:**  
The Parity function provides bit mixing in rounds 20-39 and 60-79 of the SHA-1 compression function. It ensures each output bit depends on the corresponding input bits from all three operands, creating diffusion in the hash computation.

**Technical Details:**
- **Input**: Three 32-bit integers (masked with `0xFFFFFFFF` to ensure 32-bit operation)
- **Output**: A 32-bit unsigned integer
- **Operation**: Bitwise XOR - returns 1 if an odd number of input bits are 1, otherwise 0
- **Python Implementation**: Uses `&` for masking and `^` for XOR

**How It Works - Truth Table Example:**
```
x | y | z | Parity(x,y,z)
--|---|---|------------
0 | 0 | 0 |     0      (zero 1s)
0 | 0 | 1 |     1      (one 1)
0 | 1 | 0 |     1      (one 1)
0 | 1 | 1 |     0      (two 1s)
1 | 0 | 0 |     1      (one 1)
1 | 0 | 1 |     0      (two 1s)
1 | 1 | 0 |     0      (two 1s)
1 | 1 | 1 |     1      (three 1s)
```

#### Example Usage

```python
result = Parity(0b1111, 0b1010, 0b0101)
print(result)  # Output: 0
# Explanation: 1111 ^ 1010 = 0101, then 0101 ^ 0101 = 0000 (0 in decimal)
```

In this example, the function computes the XOR of three binary numbers and returns the integer result.

#### Code

In [41]:
def Parity(x, y, z):
    """
    SHA-1 Parity function.

    Parameters:
        x (int): first value (interpreted as binary).
        y (int): second value (interpreted as binary).
        z (int): third value (interpreted as binary).

    Returns:
        int: The result of the parity as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    y = y & 0xFFFFFFFF
    z = z & 0xFFFFFFFF
    
    # SHA-1 Parity function = bitwise XOR of each bit position
    result = x ^ y ^ z   # ^ is the bitwise XOR operator in Python

    return int(result)

#### Testing

In [42]:
print(Parity(0x00000000, 0x00000000, 0x00000000))
def test_parity():
    assert Parity(0x00000000, 0x00000000, 0x00000000) == 0, "All zeros should return 0"
    assert Parity(0x00000001, 0x00000000, 0x00000000) == 1, "Single one should return 1"
    assert Parity(0x00000001, 0x00000001, 0x00000000) == 0, "Two ones should return 0"
    assert Parity(0x00000001, 0x00000001, 0x00000001) == 1, "Three ones should return 1"
    assert Parity(0x00000005, 0x00000002, 0x00000001) == 6, "Mixed bits test"
    assert Parity(0x0000000F, 0x0000000A, 0x00000005) == 0, "4-bit test"
    print('All test cases passed.')

0


In [43]:
# Run all tests
test_parity()

All test cases passed.


### SHA-1 Choose (Ch) Function

The `Ch` function, defined in **Section 4.1.1** of the Secure Hash Standard, implements the SHA-1 "choose" operation. This function conditionally selects bits: for each bit position, if the corresponding bit in `x` is 1, the result takes the bit from `y`; otherwise, it takes the bit from `z`.

**Formula:** `Ch(x, y, z) = (x & y) ^ (~x & z)`  
**Alternative expression:** `(x AND y) XOR (NOT x AND z)`

**How It Works:**
- Think of `x` as a selector or mask
- Where `x` has a 1-bit: choose the corresponding bit from `y`
- Where `x` has a 0-bit: choose the corresponding bit from `z`
- This creates a conditional bit selection mechanism
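
The selector behaviour can be illustrated by building the result one bit at a time and comparing it with the formula. This is a minimal sketch; the names `ch_formula` and `ch_bit_by_bit` are hypothetical and exist only for this comparison.

```python
MASK32 = 0xFFFFFFFF


def ch_formula(x, y, z):
    """Ch via the bitwise formula (x & y) ^ (~x & z), masked to 32 bits."""
    return ((x & y) ^ (~x & z)) & MASK32


def ch_bit_by_bit(x, y, z):
    """Ch built one bit at a time: x selects the bit from y (1) or z (0)."""
    result = 0
    for i in range(32):
        selector = (x >> i) & 1
        chosen = (y >> i) & 1 if selector else (z >> i) & 1
        result |= chosen << i
    return result


# x = 1100 takes the top two bits from y = 1010 and the bottom two from z = 0101
print(bin(ch_formula(0b1100, 0b1010, 0b0101)))     # 0b1001
print(bin(ch_bit_by_bit(0b1100, 0b1010, 0b0101)))  # 0b1001
```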

**Cryptographic Purpose:**  
The Ch function provides **non-linearity** in rounds 0-19 of SHA-1's compression function. Non-linear functions are crucial for cryptographic security because they prevent attackers from using linear algebra to reverse or predict hash values. The conditional selection based on `x` creates complex dependencies between input bits.

**Technical Details:**
- **Input**: Three 32-bit integers
- **Output**: A 32-bit unsigned integer  
- **Important**: The bitwise NOT (`~x`) in Python produces negative numbers, so we must mask with `0xFFFFFFFF` to get proper 32-bit behavior: `(~x & 0xFFFFFFFF)`

**Truth Table**
```
x | y | z | (x & y) | (~x & z) | Ch(x,y,z)
--|---|---|---------|----------|----------
0 | 0 | 0 |    0    |    0     |     0    
0 | 0 | 1 |    0    |    1     |     1    
0 | 1 | 0 |    0    |    0     |     0    
0 | 1 | 1 |    0    |    1     |     1    
1 | 0 | 0 |    0    |    0     |     0    
1 | 0 | 1 |    0    |    0     |     0    
1 | 1 | 0 |    1    |    0     |     1    
1 | 1 | 1 |    1    |    0     |     1    
```

#### Example Usage

```python
result = Ch(0b1111, 0b1010, 0b0101)
print(result)  # Output: 10 (binary: 0b1010)
# Since all bits in x are 1, we select all bits from y (0b1010)
```

In this example, `x=0b1111` acts as a selector that chooses all bits from `y`.

#### Code

In [44]:
def Ch(x, y, z):
    """
    SHA-1 choose (Ch) function.

    Parameters:
        x (int): first value (interpreted as binary).
        y (int): second value (interpreted as binary).
        z (int): third value (interpreted as binary).

    Returns:
        int: The result of the choose function as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    y = y & 0xFFFFFFFF
    z = z & 0xFFFFFFFF

    # SHA-1 choose function: (x & y) ^ (~x & z)
    result = (x & y) ^ ((~x & 0xFFFFFFFF) & z)

    return result

#### Testing

In [45]:
Ch(0b101, 0b010, 0b001)

def test_Ch():
    assert Ch(0b0, 0b0, 0b0) == 0, "All zeros should return 0"
    assert Ch(0b1, 0b0, 0b0) == 0, "x=1, y=0, z=0 should return 0"
    assert Ch(0b1, 0b1, 0b0) == 1, "x=1, y=1, z=0 should return 1"
    assert Ch(0b0, 0b1, 0b1) == 1, "x=0, y=1, z=1 should return 1"
    assert Ch(0b101, 0b010, 0b001) == 0, "Mixed bits test, testing all false returns"
    assert Ch(0b1111, 0b1010, 0b0101) == 10, "4-bit test"
    print('All test cases passed.')

In [46]:
# Run all tests
test_Ch()

All test cases passed.


### SHA-1 Majority (Maj) Function

The `Maj` function, defined in **Section 4.1.1** of the Secure Hash Standard, implements the SHA-1 "majority" operation. For each bit position, it returns the majority value among three inputs: if at least two of the three inputs have a 1-bit, the result is 1; otherwise, the result is 0.

**Formula:** `Maj(x, y, z) = (x & y) ^ (x & z) ^ (y & z)`  
**Alternative formula:** `(x & y) | (x & z) | (y & z)` (equivalent but different form)
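
Because both forms act on each bit position independently, their equivalence can be checked exhaustively over the eight single-bit input combinations. The sketch below does this; the names `maj_xor` and `maj_or` are assumptions introduced for the comparison.

```python
from itertools import product


def maj_xor(x, y, z):
    """Majority using the XOR form from the standard."""
    return (x & y) ^ (x & z) ^ (y & z)


def maj_or(x, y, z):
    """Majority using the OR form."""
    return (x & y) | (x & z) | (y & z)


# Checking all eight single-bit combinations covers every possible case,
# since each bit position is computed independently of the others.
for x, y, z in product((0, 1), repeat=3):
    assert maj_xor(x, y, z) == maj_or(x, y, z) == int(x + y + z >= 2)
```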

**Truth Table**
```
x | y | z | Maj(x,y,z)
--|---|---|------------
0 | 0 | 0 |     0      
0 | 0 | 1 |     0      
0 | 1 | 0 |     0      
0 | 1 | 1 |     1      
1 | 0 | 0 |     0      
1 | 0 | 1 |     1      
1 | 1 | 0 |     1      
1 | 1 | 1 |     1      
```

**Cryptographic Purpose:**  
The Maj function is used in rounds 40-59 of SHA-1's compression function. It provides another form of **non-linearity** and ensures that the output depends on agreement between multiple input values. This creates more complex mixing of bits compared to simple XOR operations, enhancing resistance to cryptanalysis.

**Technical Details:**
- **Input**: Three 32-bit integers (each masked with `0xFFFFFFFF`)
- **Output**: A 32-bit unsigned integer
- **Operation**: Applied independently to each of the 32 bit positions
- **Property**: More "democratic" than Ch - all three inputs have equal influence

#### Example Usage

```python
result = Maj(0b1111, 0b1010, 0b0101)
print(result)  # Output: 15 (binary: 0b1111)
# For each bit position, at least two inputs have a 1, so result is all 1s
```

This example demonstrates how Maj returns 1 for each bit position where the majority of inputs are 1.

#### Code

In [47]:
def Maj(x, y, z):
    """
    SHA-1 majority (Maj) function.

    Parameters:
        x (int): first value (interpreted as binary).
        y (int): second value (interpreted as binary).
        z (int): third value (interpreted as binary).

    Returns:
        int: The result of the majority function as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    y = y & 0xFFFFFFFF
    z = z & 0xFFFFFFFF

    result = (x & y) ^ (x & z) ^ (y & z)

    return result

#### Testing

In [48]:
def test_Maj():
    assert Maj(0b0, 0b0, 0b0) == 0, "All zeros should return 0"
    assert Maj(0b1, 0b0, 0b0) == 0, "Only one bit set should return 0"
    assert Maj(0b1, 0b1, 0b0) == 1, "Two bits set should return 1"
    assert Maj(0b1, 0b1, 0b1) == 1, "All bits set should return 1"
    assert Maj(0b101, 0b010, 0b001) == 1, "Mixed bits test"
    assert Maj(0b1111, 0b1010, 0b0101) == 15, "4-bit test"
    assert Maj(0b111, 0b110, 0b111) == 7, "Example of full bits"
    print('All test cases passed.')

In [49]:
# Run all tests
test_Maj()

All test cases passed.


### SHA-256 Sigma0 Function

The `Sigma0` function, as defined in Section 4.1.2 of the Secure Hash Standard, is a bitwise operation used in SHA-224 and SHA-256 rather than in SHA-1. It applies three right rotations to the input value and XORs the results.

XOR is associative and commutative, so the order in which the three terms are combined does not matter. For instance, `(a ^ b) ^ c == (b ^ c) ^ a`.

The standard defines Sigma0 as:

`Sigma0(x) = ROTR^2(x) ^ ROTR^13(x) ^ ROTR^22(x)`

where `ROTR^n(x)` means rotate the bits of `x` right by `n` positions. This is not the same as a right bitshift: the `n` rightmost bits wrap around, in order, into the `n` leftmost positions, and the remaining bits move `n` places to the right.
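
To make the difference between a rotation and a shift concrete, the sketch below applies both to the same value. The names `rotr32` and `shr32` are hypothetical stand-ins written in plain Python integers; they are not the notebook's own helpers.

```python
MASK32 = 0xFFFFFFFF


def rotr32(x, n):
    """Rotate a 32-bit value right by n bits (0 < n < 32)."""
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def shr32(x, n):
    """Shift a 32-bit value right by n bits, filling with zeros."""
    return (x & MASK32) >> n


x = 0x0000000F  # four low bits set
print(f"{rotr32(x, 2):08x}")  # c0000003: the two low bits wrap to the top
print(f"{shr32(x, 2):08x}")   # 00000003: the two low bits are discarded
```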

Each input should be a 32-bit integer. The function returns the result as a binary string.

#### Example Usage

```python
result = Sigma0(0b11110000111100001111000011110000)
print(result)  # Output: binary string
```

In this example, the function applies the three rotations and XORs the results.

#### Code

In [50]:
def ROTR(x, n):
    """Right rotate a 32-bit integer x by n bits."""
    x = np.uint32(x)
    return np.uint32((x >> n) | (x << (32 - n)))  # wrap-around

def Sigma0(x):
    """
    SHA-256 Sigma0 function (Secure Hash Standard, Section 4.1.2).

    Parameters:
        x (int): 32-bit integer input.

    Returns:
        str: The result of Sigma0 as a binary string.
    """
    x = np.uint32(x)
    result = ROTR(x, 2) ^ ROTR(x, 13) ^ ROTR(x, 22)
    return bin(int(result))

#### Testing

In [51]:
def test_Sigma0():
    # Test with all zeros
    assert Sigma0(0b0) == '0b0', "All zeros should return 0b0"
    # Test with all ones (32 bits)
    assert Sigma0(0xFFFFFFFF) == '0b11111111111111111111111111111111', "All ones should return all ones (since all rotations are the same)"
    # Test with a pattern
    assert isinstance(Sigma0(0b11110000111100001111000011110000), str), "Should return a binary string"
    print('All test cases passed.')

In [52]:
# Run all tests
test_Sigma0()

All test cases passed.


### SHA-256 Sigma1 Function

The `Sigma1` function, as defined in Section 4.1.2 of the Secure Hash Standard, is another SHA-224/SHA-256 bitwise operation. It works in exactly the same way as `Sigma0`, but with different rotation amounts: it applies three right rotations to the input value and XORs the results.

XOR is associative and commutative, so the order in which the three terms are combined does not matter. For instance, `(a ^ b) ^ c == (b ^ c) ^ a`.

The standard defines Sigma1 as:

`Sigma1(x) = ROTR^6(x) ^ ROTR^11(x) ^ ROTR^25(x)`

where `ROTR^n(x)` means rotate the bits of `x` right by `n` positions.

The input should be a 32-bit integer. The function returns the result as a 32-bit unsigned integer.

#### Example Usage

```python
result = Sigma1(0b11110000111100001111000011110000)
print(result)  # Output: 32-bit unsigned integer
```

In this example, the function applies the three rotations and XORs the results.

#### Code

In [53]:
def Sigma1(x):
    """
    SHA-256 Sigma1 function (Secure Hash Standard, Section 4.1.2).

    Parameters:
        x (int): 32-bit integer input.

    Returns:
        int: The result of Sigma1 as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    result = ROTR(x, 6) ^ ROTR(x, 11) ^ ROTR(x, 25)
    return result

#### Testing

In [54]:
def test_Sigma1():
    # Test with all zeros
    assert Sigma1(0b0) == 0, "All zeros should return 0"
    # Test with all ones (32 bits)
    assert Sigma1(0xFFFFFFFF) == 0xFFFFFFFF, "All ones should return all ones (since all rotations are the same)"
    # Test with a pattern
    assert isinstance(Sigma1(0b11110000111100001111000011110000), (int, np.uint32)), "Should return an int or uint32"
    print('All test cases passed.')

In [55]:
# Run all tests
test_Sigma1()

All test cases passed.


### SHA-256 Lower-case sigma0 Function

The `sigma0` function, as defined in Section 4.1.2 of the Secure Hash Standard, is a SHA-224/SHA-256 bitwise operation that combines two right rotations and one right shift, and XORs the results.

A right shift by `n` on a 32-bit integer `x` (`SHR^n(x)`) moves every bit `n` positions to the right. Unlike a rotation, the `n` rightmost bits are discarded and the `n` leftmost bits are filled with zeros.

XOR is associative and commutative, so the order in which the three terms are combined does not matter. For instance, `(a ^ b) ^ c == (b ^ c) ^ a`.

The standard defines sigma0 as:

`sigma0(x) = ROTR^7(x) ^ ROTR^18(x) ^ SHR^3(x)`

where `ROTR^n(x)` means rotate the bits of `x` right by `n` positions,

and `SHR^n(x)` means shift the bits of `x` right by `n` positions.

The input should be a 32-bit integer. The function returns the result as a 32-bit unsigned integer.

#### Example Usage

```python
result = sigma0(0b11110000111100001111000011110000)
print(result)  # Output: 32-bit unsigned integer
```

In this example, the function applies the two rotations and one shift, and XORs the results.

#### Code

In [56]:
def ROTR(x, n):
    """Rotate a 32-bit integer x right by n bits."""
    x = x & 0xFFFFFFFF
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def SHR(x, n):
    """Right shift a 32-bit integer x by n bits."""
    x = x & 0xFFFFFFFF
    return (x >> n) & 0xFFFFFFFF

In [57]:
def sigma0(x):
    """
    SHA-256 sigma0 function (Secure Hash Standard, Section 4.1.2).

    Parameters:
        x (int): 32-bit integer input.

    Returns:
        int: The result of sigma0 as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    result = ROTR(x, 7) ^ ROTR(x, 18) ^ SHR(x, 3)
    return result

#### Testing

In [58]:
def test_sigma0():
    # Test with all zeros
    assert sigma0(0b0) == 0, "All zeros should return 0"
    # Test with all ones (32 bits)
    assert sigma0(0xFFFFFFFF) == 0x1FFFFFFF, "All ones should return 29 ones, as 3 bits have been shifted out"
    # Test with a pattern
    assert isinstance(sigma0(0b11110000111100001111000011110000), (int, np.uint32)), "Should return an int or uint32"
    print('All test cases passed.')

In [59]:
# Run all tests
test_sigma0()

All test cases passed.


### SHA-256 Lower-case sigma1 Function

The `sigma1` function, as defined in Section 4.1.2 of the Secure Hash Standard, is a SHA-224/SHA-256 bitwise operation that combines two right rotations and one right shift, and XORs the results. It has the same structure as `sigma0`, but with different rotation and shift amounts. The standard defines sigma1 as:

`sigma1(x) = ROTR^17(x) ^ ROTR^19(x) ^ SHR^10(x)`

where `ROTR^n(x)` means rotate the bits of `x` right by `n` positions,

and `SHR^n(x)` means shift the bits of `x` right by `n` positions.

XOR is associative and commutative, so the order in which the three terms are combined does not matter. For instance, `(a ^ b) ^ c == (b ^ c) ^ a`.

The input should be a 32-bit integer. The function returns the result as a 32-bit unsigned integer.

#### Example Usage

```python
result = sigma1(0b11110000111100001111000011110000)
print(result)  # Output: 32-bit unsigned integer
```

In this example, the function applies the two rotations and one shift, and XORs the results.

#### Code

In [60]:
def sigma1(x):
    """
    SHA-256 sigma1 function (Secure Hash Standard, Section 4.1.2).

    Parameters:
        x (int): 32-bit integer input.

    Returns:
        int: The result of sigma1 as a 32-bit unsigned int.
    """
    x = x & 0xFFFFFFFF
    result = ROTR(x, 17) ^ ROTR(x, 19) ^ SHR(x, 10)
    return result

#### Testing

In [61]:
def test_sigma1():
    # Test with all zeros
    assert sigma1(0b0) == 0, "All zeros should return 0"
    # Test with all ones (32 bits)
    assert sigma1(0xFFFFFFFF) == 0x003FFFFF, "All ones should return 22 ones (the two rotations cancel, and the shift drops 10 of the 32 bits)"
    # Test with a pattern
    assert isinstance(sigma1(0b11110000111100001111000011110000), (int, np.uint32)), "Should return an int or uint32"
    print('All test cases passed.')

In [62]:
# Run all tests
test_sigma1()

All test cases passed.


## Problem 2

#### Problem 2: Fractional Parts of Cube Roots

#### Description

The goal of Problem 2 is to calculate the 64 words (constant 32-bit values, each expressed as 8 hexadecimal characters) defined on page 11 of the Secure Hash Standard. These are the SHA-224/SHA-256 constants of Section 4.2.2; the code in this section keeps the name `get_sha1_constants`.

These constants are the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers.

In order to calculate these constants, one must:
1) Calculate the first 64 prime numbers.
2) Calculate the cube root of each prime.
3) Extract the first 32 bits of the fractional portion of each cube root.
4) Represent the 32-bit words in hexadecimal.

A more in-depth analysis of each portion of the problem is given below:

##### Calculating Prime Numbers

A prime number is a positive integer `n` greater than 1 whose only factors are 1 and `n` itself.

The problem requires the calculation of the first `n` primes.

Rules can be extrapolated to guide the solution based on this definition:

- Any prime number greater than `2` must be odd, as any other even number is divisible by `2`. This halves our search space and saves on memory.
- If `n = 0`, return `[]`.
- If `n = 1`, return `[2]`, as `2` is the first prime number.
- Every multiple of a prime, other than the prime itself, is not prime (e.g. for `7`, the numbers `[14, 21, 28...]` are not primes). Even multiples (e.g. `[14, 28, 42...]`) do not need to be crossed out, as even numbers are already excluded from the search.
- If `x` is our limit, we only need to cross out multiples of primes up to and including `SQRT(x)`, as every composite number up to `x` has a prime factor no larger than `SQRT(x)`.

##### Additional Refinements

To save on memory, we can represent the search space (i.e. the potential prime numbers) as a boolean array `is_prime`. Since positive even numbers other than `2` are not prime, the array `is_prime` only needs to represent the odd numbers greater than or equal to `3` and less than or equal to the search limit.

`is_prime` at each index is initialised to `True`, as each of these numbers is a candidate until it is proven not to be prime. `is_prime[0]` represents the prime state of the number `3`, `is_prime[1]` represents the prime state of the number `5`, and so on. Therefore, the number `candidate` represented at any index `i` of `is_prime` is `candidate = 3 + 2 * i`.
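
The mapping between array indices and odd candidates, and its inverse, can be sketched as follows. The function names `index_to_candidate` and `candidate_to_index` are hypothetical and are used only to illustrate the arithmetic.

```python
def index_to_candidate(i):
    """Odd number represented by position i of the is_prime array."""
    return 3 + 2 * i


def candidate_to_index(candidate):
    """Position of an odd candidate (>= 3) in the is_prime array."""
    return (candidate - 3) // 2


print([index_to_candidate(i) for i in range(5)])  # [3, 5, 7, 9, 11]
print(candidate_to_index(11))                     # 4
```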

Prime numbers are irregularly distributed, so it is difficult to define a search limit, as a function of `n`, that is guaranteed to contain the first `n` primes without being much larger than necessary.

However, a practical estimate of the size of the `n`th prime is `n * ln(n)`. This is a rough estimate that tends to be too small, so the search space is doubled whenever fewer than `n` primes have been found.
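
For `n = 64`, the effect of the estimate can be seen with a small stand-alone sketch. The helpers `estimate` and `count_primes_up_to` are assumptions written for this illustration (a plain Sieve of Eratosthenes), not the notebook's `primes` function.

```python
import math


def estimate(n):
    """Rough size of the n-th prime, n * ln(n), truncated to an integer."""
    return int(n * math.log(n))


def count_primes_up_to(limit):
    """Count primes <= limit with a plain Sieve of Eratosthenes."""
    if limit < 2:
        return 0
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return sum(sieve)


n = 64
bound = estimate(n)
print(bound)                      # 266
print(count_primes_up_to(bound))  # 56, fewer than 64, so the bound must grow
```

Since the 64th prime is 311, a single doubling of the bound to 532 is enough in this case.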

##### Calculating Cube Roots

`numpy` has built-in functionality to calculate the cube roots of each value in a `numpy` array. This function is extremely efficient, making use of vectorised operations. 

`np.cbrt(primes(64))` returns a new `numpy` array with the cube roots of the first 64 prime numbers.

##### Extracting 32-bit Fractionals

Once the cube roots have been calculated, the fractional portion of each cube root must be extracted and converted to a `32-bit` representation.

The fractional portion of a number is the part after the decimal point. For example, if the cube root is `2.7123`, the fractional portion is `0.7123`. This can be calculated by subtracting the floor of the number from the number itself: `fractional_part = cube_root - floor(cube_root)`.

To extract the first 32 bits of this fractional portion, the fractional value must be scaled to the range of a 32-bit unsigned integer. This is achieved by multiplying the fractional portion by `2^32` (or `1 << 32`), which shifts the significant bits into the integer range.

The result is then floored to remove any remaining fractional component and cast to an unsigned 32-bit integer (`np.uint32`). This gives us exactly 32 bits of the fractional portion.

The complete process can be summarized as:
1. Extract fractional parts: `fractional_parts = cube_roots - np.floor(cube_roots)`
2. Scale to `32-bit` range: `bits = np.floor(fractional_parts * (1 << 32)).astype(np.uint32)`
3. Convert to hexadecimal: `hex_values = [f"{b:08x}" for b in bits]`
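
As a cross-check that does not depend on floating-point precision, the same 32 bits can be obtained with integer arithmetic only, using the identity `cbrt(p) * 2**32 == cbrt(p * 2**96)`. This is an alternative technique sketched under that assumption, not the method used in this notebook; `integer_cube_root` and `cube_root_fraction_bits` are hypothetical names.

```python
def integer_cube_root(n):
    """Largest integer r with r**3 <= n, found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def cube_root_fraction_bits(p):
    """First 32 bits of the fractional part of the cube root of p."""
    # Scaling by 2**96 under the cube root scales the root itself by 2**32.
    scaled_root = integer_cube_root(p << 96)
    return scaled_root & 0xFFFFFFFF  # drop the integer part of the root


print(f"{cube_root_fraction_bits(2):08x}")  # 428a2f98
```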

##### Converting to Hexadecimal

Finally, each 32-bit value is converted to its hexadecimal representation using Python's format string `f"{b:08x}"`, which produces an 8-character lowercase hexadecimal string (since 32 bits = 8 hex digits).

#### Description of testing

To test the `primes(n)` function, I gathered a [file](./test/primes.0000) with the first `100,000` prime numbers from a public GitHub repository by user `srmalins` at repository [https://github.com/srmalins/primelists/tree/master](https://github.com/srmalins/primelists/tree/master), and wrote a test to compare the values to those returned by the function.

To test the `get_sha1_constants()` function, I extracted the 64 hex values defined in the Secure Hash Standard to a text [file](./test/sha1_constants.txt), and wrote a test to compare them with the values returned by the function.

In [63]:
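def estimate_nth_prime(n):
    """
    Estimate the value of the n-th prime as n * ln(n).

    Parameters:
        n (int): The index of the prime number to estimate (n >= 1).

    Returns:
        int: The estimated value as a positive Python int (at least 3).
    """
    # A plain int (rather than np.uint32) keeps later arithmetic such as
    # `search_size - 3` from wrapping around, and the floor of 3 guarantees
    # that the sieve array always has room for the first odd candidate.
    return max(int(n * np.log(n)), 3)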

In [64]:
def primes(n):
    
    if (n < 0):
        raise ValueError("n must be a non-negative integer.")
    if (n == 0):
        return []
    if (n == 1):
        return [2]
    
    primes_list = [2]
    
    # Set practical search space
    search_size = estimate_nth_prime(n)
    
    # Define an array of booleans that represents the odd numbers from 3 to search_size.
    # is_prime[0] represents whether 3 is prime.
    array_size = (search_size - 3) // 2 + 1  # Number of odd numbers from 3 to search_size
    
    # Assume each number is prime until proven otherwise.
    is_prime = [True] * array_size  # is_prime[0] for 3, is_prime[1] for 5, etc.

    current_i = 0  # Track the current index being processed

    while len(primes_list) < n:
        if current_i >= len(is_prime):
            # Double the search space if we've processed all current indices
            search_size *= 2
            new_array_size = (search_size - 3) // 2 + 1
            old_array_size = len(is_prime)
            
            # Extend the is_prime list to contain spaces for the new odd numbers in the new range (potential primes)
            is_prime.extend([True] * (new_array_size - old_array_size))
            
            # Now cross out multiples of previous primes in the new range
            # Only odd multiples need to be checked, as even multiples are divisible by 2 and therefore not prime
            new_start_index = old_array_size
            for p in primes_list:
                for i in range(new_start_index, len(is_prime)):
                    # Calculate candidate number from index
                    num = 3 + 2 * i
                    if num % p == 0:
                        is_prime[i] = False
        
        if is_prime[current_i]:
            prime_candidate = 3 + 2 * current_i  # Map index to actual odd number
            primes_list.append(prime_candidate)
            # Mark multiples of the found prime as non-prime
            for multiple in range(prime_candidate * prime_candidate, search_size + 1, prime_candidate):
                if multiple % 2 == 1:  # Only cross out odd multiples
                    index = (multiple - 3) // 2
                    if index < len(is_prime):
                        is_prime[index] = False
        
        current_i += 1

    return primes_list

In [65]:
def get_sha1_constants():
    cube_roots = np.cbrt(primes(64))
    fractional_parts = cube_roots - np.floor(cube_roots)
    bits = np.floor(fractional_parts * (1 << 32)).astype(np.uint32)
    hex_values = [f"{b:08x}" for b in bits]
    return hex_values

In [66]:
def test_primes_with_file(n):
    with open('test/primes.0000', 'r') as file:
        expected_primes = [int(line.strip()) for line in file.readlines()]
    
    computed_primes = primes(n)
    print("Computed primes:", computed_primes)
    print("Expected primes:", expected_primes[:n])

    assert computed_primes == expected_primes[:n], f"Expected {expected_primes[:n]}, but got {computed_primes}"

    print("All test cases passed.")

In [67]:
def test_sha1_constants_from_file():
    with open('test/sha1_constants.txt', 'r') as file:
        expected_constants = [line.strip() for line in file.readlines() if line.strip()]
    computed_constants = get_sha1_constants()
    print("Computed constants:", computed_constants)
    print("Expected constants:", expected_constants)
    assert computed_constants == expected_constants, f"Expected {expected_constants}, but got {computed_constants}"
    print("All test cases passed.")

In [68]:
test_primes_with_file(20)
test_primes_with_file(64)
test_primes_with_file(100)
test_primes_with_file(1000)

Computed primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
Expected primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
All test cases passed.
Computed primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311]
Expected primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311]
All test cases passed.
Computed primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131,

In [69]:
test_sha1_constants_from_file()

Computed constants: ['428a2f98', '71374491', 'b5c0fbcf', 'e9b5dba5', '3956c25b', '59f111f1', '923f82a4', 'ab1c5ed5', 'd807aa98', '12835b01', '243185be', '550c7dc3', '72be5d74', '80deb1fe', '9bdc06a7', 'c19bf174', 'e49b69c1', 'efbe4786', '0fc19dc6', '240ca1cc', '2de92c6f', '4a7484aa', '5cb0a9dc', '76f988da', '983e5152', 'a831c66d', 'b00327c8', 'bf597fc7', 'c6e00bf3', 'd5a79147', '06ca6351', '14292967', '27b70a85', '2e1b2138', '4d2c6dfc', '53380d13', '650a7354', '766a0abb', '81c2c92e', '92722c85', 'a2bfe8a1', 'a81a664b', 'c24b8b70', 'c76c51a3', 'd192e819', 'd6990624', 'f40e3585', '106aa070', '19a4c116', '1e376c08', '2748774c', '34b0bcb5', '391c0cb3', '4ed8aa4a', '5b9cca4f', '682e6ff3', '748f82ee', '78a5636f', '84c87814', '8cc70208', '90befffa', 'a4506ceb', 'bef9a3f7', 'c67178f2']
Expected constants: ['428a2f98', '71374491', 'b5c0fbcf', 'e9b5dba5', '3956c25b', '59f111f1', '923f82a4', 'ab1c5ed5', 'd807aa98', '12835b01', '243185be', '550c7dc3', '72be5d74', '80deb1fe', '9bdc06a7', 'c19bf174'

## Problem 3

Write a generator function `block_parse(msg)` that processes messages according to sections 5.1.1 and 5.2.1 of the Secure Hash Standard. The function should accept a bytes object called `msg`. At each iteration, it should yield the next 512-bit block of `msg` as a bytes object. Ensure that the final block (or final two blocks) include the required padding of `msg` as specified in the standard. Test the generator with messages of different lengths to confirm proper padding and block output.

#### Solution Description

The `block_parse` function implements SHA-1 message preprocessing according to sections 5.1.1 and 5.2.1 of the Secure Hash Standard. It takes a message as bytes and yields 512-bit (64-byte) blocks with proper padding to ensure the total message length is a multiple of 512 bits.

##### Key Components

1. **Message Processing**: The function is a generator, that is, a function that hands its results back to the caller one at a time with `yield` instead of returning them all at once. It first yields each complete 512-bit block of the original message without modification.

2. **Padding Logic**: For the remaining bytes (less than 64), it applies SHA-1 padding:
   - Appends a '1' bit (0x80 byte) immediately after the message
   - Pads with zeros until the total length (excluding the 64-bit length field) is congruent to 448 bits modulo 512 bits
   - Appends the original message length in bits as a 64-bit big-endian integer

3. **Two-Block Handling**: If adding the '1' bit would make the current block exceed 448 bits (56 bytes), the function creates two final blocks:
   - First block: remaining message + '1' bit + zeros to fill 512 bits
   - Second block: 448 bits of zeros + 64-bit length field
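
The number of zero bytes and the resulting block count follow directly from the message length. The sketch below works in whole bytes, as `block_parse` does; `padding_layout` is a hypothetical helper introduced only for this calculation.

```python
def padding_layout(msg_len_bytes):
    """Return (zero_bytes, total_blocks) for a message of the given length."""
    # After the 0x80 byte, zeros are added until the length is 56 mod 64,
    # leaving exactly 8 bytes for the 64-bit length field.
    zero_bytes = (55 - msg_len_bytes) % 64
    total_bytes = msg_len_bytes + 1 + zero_bytes + 8
    return zero_bytes, total_bytes // 64


for length in (0, 3, 55, 56, 64):
    print(length, padding_layout(length))
# 0 (55, 1)
# 3 (52, 1)
# 55 (0, 1)
# 56 (63, 2)
# 64 (55, 2)
```

The jump from one block at 55 bytes to two blocks at 56 bytes is the boundary that the two-block case handles.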

##### Implementation Details

- **Generator Design**: Uses `yield` to return blocks, making it memory-efficient for large messages
- **Length Calculation**: Message length is computed as `len(msg) * 8` to get bits (conversion from bytes to bits)
- **Boundary Conditions**: Handles edge cases like empty messages, messages that fit exactly in one block, and messages requiring multiple blocks
- **Testing**: Comprehensive tests verify correct padding for various message lengths including boundary cases

##### Example

For a 3-byte message "abc":
- Original: 24 bits
- Padded: "abc" + 0x80 + 52 zero bytes + 64-bit length (24)
- Result: One 64-byte block

For a 56-byte message:
- First block: 56 bytes of message + 0x80 + 7 zero bytes
- Second block: 56 zero bytes + 64-bit length (448)
- Result: Two 64-byte blocks

This ensures SHA-1 can process messages of any length by standardizing them into 512-bit blocks.
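
The "abc" case can also be written out by hand, which gives a reference value to compare against the generator's output. This is a minimal sketch that hard-codes the 52 zero bytes needed for a 3-byte message; the variable names are illustrative.

```python
msg = b"abc"

padded = (
    msg
    + b"\x80"                            # the single '1' bit, then seven '0' bits
    + b"\x00" * 52                       # zero bytes up to byte 56
    + (len(msg) * 8).to_bytes(8, "big")  # 64-bit big-endian bit length
)

print(len(padded))        # 64
print(padded[:4].hex())   # 61626380
print(padded[-8:].hex())  # 0000000000000018
```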

In [70]:
def block_parse(msg):
    # Get the original message length in bits
    msg_len_bits = len(msg) * 8
    
    # Yield complete 512-bit blocks from the original message
    block_size = 64  # 512 bits = 64 bytes
    offset = 0
    
    while offset + block_size <= len(msg):
        yield msg[offset:offset + block_size]
        offset += block_size
    
    # Process the remaining bytes (less than 64 bytes)
    remaining = msg[offset:]
    
    # Start building the final block(s) with padding
    # Step 1: Append the '1' bit (0x80 = 10000000 in binary)
    padded = remaining + b'\x80'
    
    # Step 2: Calculate how many zero bytes we need
    # We need to leave room for the 8-byte (64-bit) length field
    # Total length should be a multiple of 64 bytes
    
    # If we have room for the length in the current block
    if len(padded) <= 56:  # 56 bytes message + padding + 8 bytes length = 64 bytes
        # Pad with zeros until we have 56 bytes
        padded += b'\x00' * (56 - len(padded))
        # Append the message length as a 64-bit big-endian integer
        padded += msg_len_bits.to_bytes(8, byteorder='big')
        yield padded
    else:
        # First block: pad the current data to 64 bytes
        padded += b'\x00' * (64 - len(padded))
        yield padded
        
        # Second block: 56 bytes of zeros + 8 bytes of length
        final_block = b'\x00' * 56
        final_block += msg_len_bits.to_bytes(8, byteorder='big')
        yield final_block


#### Testing

In [71]:
def test_block_parse():
    """Test the block_parse generator with various message lengths."""
    
    # Test 1: Empty message
    print("Test 1: Empty message")
    blocks = list(block_parse(b''))
    assert len(blocks) == 1, "Empty message should produce 1 block"
    assert len(blocks[0]) == 64, "Block should be 64 bytes"
    assert blocks[0][0] == 0x80, "First byte should be padding bit"
    print(f"  Blocks: {len(blocks)}, Last 8 bytes (length): {blocks[0][-8:].hex()}")
    
    # Test 2: Short message (3 bytes = "abc")
    print("\nTest 2: Short message 'abc'")
    blocks = list(block_parse(b'abc'))
    assert len(blocks) == 1, "Short message should produce 1 block"
    assert len(blocks[0]) == 64, "Block should be 64 bytes"
    assert blocks[0][:3] == b'abc', "Should start with original message"
    assert blocks[0][3] == 0x80, "Fourth byte should be padding bit"
    msg_len = int.from_bytes(blocks[0][-8:], byteorder='big')
    assert msg_len == 24, "Message length should be 24 bits (3 bytes)"
    print(f"  Blocks: {len(blocks)}, Message length in bits: {msg_len}")
    
    # Test 3: Message that fits exactly in one block minus padding (55 bytes)
    print("\nTest 3: 55-byte message")
    msg = b'a' * 55
    blocks = list(block_parse(msg))
    assert len(blocks) == 1, "55-byte message should fit in 1 block with padding"
    assert len(blocks[0]) == 64
    msg_len = int.from_bytes(blocks[0][-8:], byteorder='big')
    assert msg_len == 55 * 8, "Message length should be 440 bits"
    print(f"  Blocks: {len(blocks)}, Message length in bits: {msg_len}")
    
    # Test 4: Message that requires exactly 2 blocks (56 bytes)
    print("\nTest 4: 56-byte message (boundary case)")
    msg = b'a' * 56
    blocks = list(block_parse(msg))
    assert len(blocks) == 2, "56-byte message should require 2 blocks"
    assert len(blocks[0]) == 64 and len(blocks[1]) == 64
    msg_len = int.from_bytes(blocks[1][-8:], byteorder='big')
    assert msg_len == 56 * 8, "Message length should be 448 bits"
    print(f"  Blocks: {len(blocks)}, Message length in bits: {msg_len}")
    
    # Test 5: Message longer than one block (100 bytes)
    print("\nTest 5: 100-byte message")
    msg = b'a' * 100
    blocks = list(block_parse(msg))
    assert len(blocks) == 2, "100-byte message should produce 2 blocks"
    assert all(len(block) == 64 for block in blocks), "All blocks should be 64 bytes"
    msg_len = int.from_bytes(blocks[-1][-8:], byteorder='big')
    assert msg_len == 100 * 8, "Message length should be 800 bits"
    print(f"  Blocks: {len(blocks)}, Message length in bits: {msg_len}")
    
    # Test 6: Large message (200 bytes)
    print("\nTest 6: 200-byte message")
    msg = b'x' * 200
    blocks = list(block_parse(msg))
    expected_blocks = 4  # 3 full blocks + 1 partial with padding
    assert len(blocks) == expected_blocks, f"200-byte message should produce {expected_blocks} blocks"
    assert all(len(block) == 64 for block in blocks), "All blocks should be 64 bytes"
    msg_len = int.from_bytes(blocks[-1][-8:], byteorder='big')
    assert msg_len == 200 * 8, "Message length should be 1600 bits"
    print(f"  Blocks: {len(blocks)}, Message length in bits: {msg_len}")
    
    print("\nAll tests passed.")

In [72]:
test_block_parse()

Test 1: Empty message
  Blocks: 1, Last 8 bytes (length): 0000000000000000

Test 2: Short message 'abc'
  Blocks: 1, Message length in bits: 24

Test 3: 55-byte message
  Blocks: 1, Message length in bits: 440

Test 4: 56-byte message (boundary case)
  Blocks: 2, Message length in bits: 448

Test 5: 100-byte message
  Blocks: 2, Message length in bits: 800

Test 6: 200-byte message
  Blocks: 4, Message length in bits: 1600

All tests passed.


## Problem 4: SHA-1 Hash Implementation

### Description

The SHA-1 algorithm defined in Section 6.1 of the Secure Hash Standard processes the input message through the following steps:

1. **Preprocessing**: Pad the message and parse it into 512-bit blocks, as described in Section 6.1.1, using the `block_parse` function from Problem 3.

2. **Initialise Hash Values**: As described in Section 6.1.1, start with the initial hash values defined in Section 5.3.1 of the Secure Hash Standard:
   - H0 = 0x67452301
   - H1 = 0xEFCDAB89  
   - H2 = 0x98BADCFE
   - H3 = 0x10325476
   - H4 = 0xC3D2E1F0

3. **Process Each Block**: The processing algorithm is defined in Section 6.1.2. For each 512-bit block:
   - Break the block into sixteen words (32-bit integers) W[0..15]
   - Extend to 80 words using: W[t] = ROTL(W[t-3] ^ W[t-8] ^ W[t-14] ^ W[t-16], 1) for t = 16 to 79
   - Initialize working variables: a = H0, b = H1, c = H2, d = H3, e = H4
   - For t = 0 to 79:
     - Select the appropriate function and constant based on t:
       - 0 ≤ t ≤ 19: f = Ch(b, c, d), K = 0x5A827999
       - 20 ≤ t ≤ 39: f = Parity(b, c, d), K = 0x6ED9EBA1
       - 40 ≤ t ≤ 59: f = Maj(b, c, d), K = 0x8F1BBCDC
       - 60 ≤ t ≤ 79: f = Parity(b, c, d), K = 0xCA62C1D6
     - T = (ROTL(a, 5) + f + e + K + W[t]) mod 2^32
     - e = d, d = c, c = ROTL(b, 30), b = a, a = T
   - Update hash values (each addition modulo 2^32): H0 += a, H1 += b, H2 += c, H3 += d, H4 += e

4. **Output**: Concatenate H0, H1, H2, H3, H4 as big-endian 32-bit words and return the 160-bit digest as a 40-character hex string.
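
FIPS 180-4 Section 6.1.2 defines the SHA-1 message schedule as W[t] = ROTL(W[t-3] ^ W[t-8] ^ W[t-14] ^ W[t-16], 1) for t = 16 to 79. The sketch below shows that expansion on its own for a single block. The names `rotl32` and `expand_schedule` are hypothetical helpers introduced for illustration, so that the block runs without the rest of the notebook.

```python
def rotl32(x, n):
    """Rotate a 32-bit value left by n bits (local helper for this sketch)."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def expand_schedule(block):
    """Expand one 64-byte block into the 80-word SHA-1 message schedule."""
    assert len(block) == 64, "SHA-1 blocks are 64 bytes"
    # W[0..15]: the block read as sixteen big-endian 32-bit words
    W = [int.from_bytes(block[i:i + 4], byteorder='big') for i in range(0, 64, 4)]
    # W[16..79]: XOR of four earlier words, rotated left by one bit
    for t in range(16, 80):
        W.append(rotl32(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
    return W


# The single padded block for the message b'abc'
block = b'abc' + b'\x80' + b'\x00' * 52 + (24).to_bytes(8, byteorder='big')
W = expand_schedule(block)
print(f"W[0]  = {W[0]:08x}")
print(f"W[15] = {W[15]:08x}")
print(f"W[16] = {W[16]:08x}")
```

Because the schedule uses only XOR and a rotation, every word stays within 32 bits without any extra masking beyond the one inside the rotation.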

### Code

In [73]:
def ROTL(x, n):
    """Rotate left by n bits (32-bit)."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF
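
A few worked values make the rotation easier to check by hand. The sketch below uses a standalone copy of the function under the hypothetical name `rotl32`, so it can be run in isolation; it assumes the input already fits in 32 bits and that 0 < n < 32.

```python
def rotl32(x, n):
    """Standalone copy of the rotation above, for 32-bit inputs and 0 < n < 32."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


# The top bit wraps around to the bottom instead of being lost
print(hex(rotl32(0x80000000, 1)))
# Rotating by 4 bits moves the leading hex digit to the end
print(hex(rotl32(0x12345678, 4)))
# Rotating by 5 and then by 27 is a full 32-bit turn, so the value is unchanged
print(hex(rotl32(rotl32(0xDEADBEEF, 5), 27)))
```

If the input were wider than 32 bits, its extra high bits would leak into the result through the right-shift term, so callers should mask values to 32 bits before rotating.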

In [74]:
def sha1_hash(message):
    """
    Compute the SHA-1 hash of a message.
    
    Parameters:
        message (bytes): The input message to hash.
        
    Returns:
        str: The 160-bit SHA-1 hash as a 40-character hexadecimal string.
    """
    # Get the padded blocks
    blocks = list(block_parse(message))
    
    # Initial hash values
    H = [
        0x67452301,
        0xEFCDAB89,
        0x98BADCFE,
        0x10325476,
        0xC3D2E1F0
    ]
    
    # Process each block
    for block in blocks:
        # Break block into 16 32-bit big-endian words
        W = []
        for i in range(16):
            word = int.from_bytes(block[i*4:(i+1)*4], byteorder='big')
            W.append(word)
        
        # Extend to 80 words (SHA-1 message schedule, FIPS 180-4 Section 6.1.2)
        for t in range(16, 80):
            W.append(ROTL(W[t-3] ^ W[t-8] ^ W[t-14] ^ W[t-16], 1))
        
        # Initialize working variables
        a, b, c, d, e = H
        
        # Main loop
        for t in range(80):
            if t < 20:
                f = Ch(b, c, d)
                K = 0x5A827999
            elif t < 40:
                f = Parity(b, c, d)
                K = 0x6ED9EBA1
            elif t < 60:
                f = Maj(b, c, d)
                K = 0x8F1BBCDC
            else:
                f = Parity(b, c, d)
                K = 0xCA62C1D6
            
            T = (ROTL(a, 5) + f + e + K + W[t]) & 0xFFFFFFFF
            e = d
            d = c
            c = ROTL(b, 30)
            b = a
            a = T
        
        # Update hash values
        H[0] = (H[0] + a) & 0xFFFFFFFF
        H[1] = (H[1] + b) & 0xFFFFFFFF
        H[2] = (H[2] + c) & 0xFFFFFFFF
        H[3] = (H[3] + d) & 0xFFFFFFFF
        H[4] = (H[4] + e) & 0xFFFFFFFF
    
    # Produce final hash
    hash_bytes = b''
    for h in H:
        hash_bytes += int(h).to_bytes(4, byteorder='big')
    
    return hash_bytes.hex()

### Testing

In [75]:
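import hashlib

def test_sha1_hash():
    """Check sha1_hash against hashlib.sha1 on boundary-length messages."""
    # Lengths chosen around the 55/56-byte and 64-byte padding boundaries
    for msg in (b'', b'abc', b'a' * 55, b'a' * 56, b'a' * 64, b'x' * 200):
        expected = hashlib.sha1(msg).hexdigest()
        assert sha1_hash(msg) == expected, f"Mismatch for {len(msg)}-byte message"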

In [76]:
test_sha1_hash()

## Problem 5

## End