## Computational Theory Assessment

In [141]:
import numpy as np

## Problem 1: Binary Words and Operations

**Brief:** Implement the following functions in Python. Use numpy to ensure that all variables and values are treated as 32-bit integers. These functions are defined in the Secure Hash Standard (see page 10). Document each function with a clear docstring, explain its purpose and behaviour in Markdown, and test it with appropriate examples to verify correctness.

## Problem 1 Introduction:

**Based on the Secure Hash Standard, FIPS PUB 180-4**

The Secure Hash Standard (FIPS 180-4) defines a family of cryptographic hash algorithms, including **SHA-224, SHA-256, SHA-384,** and **SHA-512**. These algorithms use carefully designed bitwise operations to achieve diffusion and non-linearity, which are critical for cryptographic security.
**SHA-256** in particular relies on a fixed set of 32-bit operations such as:
* Bitwise logical functions (Ch, Maj, Parity)
* Fixed right rotations
* Logical right shifts
* Large and small Σ (“Sigma”) functions
* 32-bit unsigned integer arithmetic

These operations together form the backbone of the **SHA-256** compression function and message schedule expansion.
The functions below are helper utilities that directly implement the primitives defined in **Section 4 of the Secure Hash Standard**.


#### **unsigned_32(x)**
Casts a value to a NumPy uint32. Ensures all arithmetic and bitwise operations stay within 32-bit unsigned integer bounds (as required by SHA-256)

SHA-256 uses modulo 2^32 arithmetic for every operation, so consistently forcing values into a 32-bit type ensures:
* Correct overflow behaviour
* Proper masking of intermediate values
* Reproducibility across platforms
This step is essential because Python’s integers are unbounded by default.

In [142]:
def unsigned_32(x):
    return np.uint32(x);

#### **Parity(x, y, z)**
Implements the parity **XOR** function: **x ⊕ y ⊕ z**. 

Parity is a simple nonlinear combination used in some hashing and cryptographic constructions (although SHA-256 does not use Parity, it appears in SHA-1).It serves as a symmetric bitwise mixing function —> each output bit is 1 if an odd number of inputs have that bit set.

In [143]:
def Parity(x, y, z):
    return unsigned_32(x) ^ unsigned_32(y) ^ unsigned_32(z)

#### **Ch(x, y, z)** — Choose Function
SHA-256 "choice" function: **Ch(x,y,z)=(x∧y)⊕(¬x∧z)** -> as defined in FIPS 180-4, Section 4.1.2. 

Selects bits from y when x bit is 1, otherwise from z.Models a bitwise conditional.
This function acts like a **bitwise selector**:
* If a bit in x is 1 → choose the corresponding bit from y
* If the bit in x is 0 → choose the bit from z

**Cryptographic purpose:**
* Adds nonlinearity
* Makes the output depend on specific input bits
* Prevents simple algebraic prediction of internal states

Ch is used heavily in the main SHA-256 compression loop.

In [144]:
def Ch(x, y, z):
    return unsigned_32((unsigned_32(x) & unsigned_32(y)) ^ (~unsigned_32(x) & unsigned_32(z)))

#### **Maj(x, y, z)** — Majority Function
SHA-256 "majority" function: **Maj(x,y,z)=(x∧y)⊕(x∧z)⊕(y∧z)**

Each output bit becomes whichever value (0 or 1) appears at least twice among the input bits -> This makes it a majority vote.

**Cryptographic purpose:**
* Mixes three internal state values
* Propagates structure from multiple inputs
* Resists bit-flipping predictability

**Maj** is also used in every round of the compression function.

In [145]:
def Maj(x, y, z):
    return unsigned_32((unsigned_32(x) & unsigned_32(y)) ^ (unsigned_32(x) & unsigned_32(z)) ^ (unsigned_32(y) & unsigned_32(z)))

#### **rotr(x, n)** — Right Rotation
Performs a 32-bit right rotate: **(x≫n)∣(x≪(32−n))**

Rotates bits instead of shifting in zeros. Core operation in SHA-256’s mixing steps. Different from a logical shift, rotation wraps around the dropped bits.

**Why rotations matter:**
* They are nonlinear with respect to shifts
* They ensure every output bit depends on multiple positions of the input
* They preserve all bits (unlike shifts, which lose information)

Rotations are a core building block of the **Σ** and **σ** functions.

In [146]:
def rotr(x, n):
    x = unsigned_32(x)
    return unsigned_32((x >> n) | (x << unsigned_32(32 - n)))

#### **Sigma0(x)** — Uppercase **Σ₀**
Large rotation-based mixing function: **(x)=ROTR^2(x) ⊕ ROTR^13(x) ⊕ ROTR^22(x)**

**Purpose:** This is a large rotation mix function that produces heavy diffusion across the internal state. It is applied repeatedly inside the compression function.

In [147]:
def Sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)

#### **Sigma1(x)** — Uppercase **Σ₁**
Large rotation-based mixing function: **(x)=ROTR^6(x) ⊕ ROTR^11(x) ⊕ ROTR^25(x)**

**Purpose:** Just like Σ₀, this function spreads bits widely and mixes state words to avoid linear relationships.

In [148]:
def Sigma1(x):
    return unsigned_32(rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25))

#### **sigma0(x)** — Lowercase **σ₀**
Small mixing function using rotations and shifts: **(x)=ROTR^7(x) ⊕ ROTR^18(x) ⊕ (x≫3)**

Used when expanding the message schedule array.

**Purpose:** Mix input words to create additional message schedule values & introduce nonlinearity early, before compression

**Note:** uses a logical right shift in addition to rotations.

In [149]:
def sigma0(x):
    x = unsigned_32(x)
    return unsigned_32(rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3))

#### **sigma1(x)** — Lowercase **σ₁**
Small mixing function using rotations and shifts: **(x)=ROTR^17(x) ⊕ ROTR^19(x) ⊕ (x≫10)**

**Purpose:** Also part of the message schedule expansion. Helps propagate entropy from earlier message words into later ones.

In [150]:
def sigma1(x):
    x = unsigned_32(x)
    return unsigned_32(rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10))

### Problem 1 Tests

In [151]:
print("=== Testing unsigned_32 ===")
value = unsigned_32(np.uint32(0xffffffff) + np.uint32(1))
print("Input: 0xFFFFFFFF + 1")
print("32-bit wrapped output:", unsigned_32(value))
print("Expected: 0x00000000\n")

print("\n=== Testing Parity ===")
print("Input: 0b1010, 0b0101, 0b1100")
print("Output:", bin(Parity(np.uint32(0b1010), np.uint32(0b0101), np.uint32(0b1100))))
print("Expected: 0b1010 ^ 0b0101 ^ 0b1100 =", bin(0b1010 ^ 0b0101 ^ 0b1100))

print("\n=== Testing Ch ===")
print("If x bit = 1 → choose y bit")
print("If x bit = 0 → choose z bit\n")
print("Example:")
print("x = 0xFFFFFFFF, y = 0x12345678, z = 0x87654321")
print("Output:", hex(Ch(np.uint32(0xffffffff), np.uint32(0x12345678), np.uint32(0x87654321))))
print("Expected: y → 0x12345678")

print("\n=== Testing Maj ===")
print("Example bits: x=1, y=1, z=0 → majority is 1")
print("Example bits: x=0, y=0, z=1 → majority is 0\n")
print("Input: 0b1010, 0b1100, 0b1000")
print("Output:", bin(Maj(np.uint32(0b1010), np.uint32(0b1100), np.uint32(0b1000))))
print("Expected majority:", bin(0b1000))

print("\n=== Testing rotr ===")
print("Input: x = 0b0001, n = 1")
print("Output:", bin(rotr(np.uint32(0b0001), 1)))
print("Expected: 0x80000000 bit pattern")
print("\nInput: x = 0x12345678, n = 4")
print("Output:", hex(rotr(np.uint32(0x12345678), 4)))
print("Expected:", hex(0x81234567))

print("\n=== Testing Sigma0 ===")
x = np.uint32(0x6a09e667)
print("Input: x =", hex(x))
print("Output:", hex(Sigma0(x)))
print("This mixes the word using 3 rotations.")

print("\n=== Testing Sigma1 ===")
x = np.uint32(0xbb67ae85)
print("Input: x =", hex(x))
print("Output:", hex(Sigma1(x)))
print("Used every round inside SHA-256 compression.")

print("\n=== Testing sigma0 ===")
x = np.uint32(0x428a2f98)
print("Input:", hex(x))
print("Output:", hex(sigma0(x)))
print("This is used in the message schedule (W[t]).")

print("\n=== Testing sigma1 ===")
x = np.uint32(0x71374491)
print("Input:", hex(x))
print("Output:", hex(sigma1(x)))
print("Also used in message schedule expansion.")

=== Testing unsigned_32 ===
Input: 0xFFFFFFFF + 1
32-bit wrapped output: 0
Expected: 0x00000000


=== Testing Parity ===
Input: 0b1010, 0b0101, 0b1100
Output: 0b11
Expected: 0b1010 ^ 0b0101 ^ 0b1100 = 0b11

=== Testing Ch ===
If x bit = 1 → choose y bit
If x bit = 0 → choose z bit

Example:
x = 0xFFFFFFFF, y = 0x12345678, z = 0x87654321
Output: 0x12345678
Expected: y → 0x12345678

=== Testing Maj ===
Example bits: x=1, y=1, z=0 → majority is 1
Example bits: x=0, y=0, z=1 → majority is 0

Input: 0b1010, 0b1100, 0b1000
Output: 0b1000
Expected majority: 0b1000

=== Testing rotr ===
Input: x = 0b0001, n = 1
Output: 0b10000000000000000000000000000000
Expected: 0x80000000 bit pattern

Input: x = 0x12345678, n = 4
Output: 0x81234567
Expected: 0x81234567

=== Testing Sigma0 ===
Input: x = 0x6a09e667
Output: 0xce20b47e
This mixes the word using 3 rotations.

=== Testing Sigma1 ===
Input: x = 0xbb67ae85
Output: 0x758db092
Used every round inside SHA-256 compression.

=== Testing sigma0 ===
Input

## Problem 2: Fractional Parts of Cube Roots
**Brief:** Use numpy to calculate the constants listed at the bottom of page 11 of the Secure Hash Standard, following the steps below. These are the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers.
1. Write a function called primes(n) that generates the first n prime numbers.
2. Use the function to calculate the cube root of the first 64 primes.
3. For each cube root, extract the first thirty-two bits of the fractional part.
4. Display the result in hexadecimal.
5. Test the results against what is in the Secure Hash Standard.



## Problem 2 Introduction:

**The Secure Hash Standard (FIPS PUB 180-4)** defines **SHA-256** as part of the **SHA-2** family of cryptographic hash functions. **SHA-256** operates on 32-bit words and relies on:
* Initial hash values (H0–H7)
* Round constants (K0–K63)

Both are derived from irrational numbers to ensure unpredictability and uniform bit distribution.
**Specifically:** Initial hash values **(H0–H7)** are the first 32 bits of the fractional part of the square roots of the first 8 prime numbers. Round constants **(K0–K63)** are the first 32 bits of the fractional part of the cube roots of the first 64 prime numbers.

The code below generates the **SHA-256** round constants (K) step by step.

#### **primes(n)** Function:

* Generates the first n prime numbers.
* Starts from 2 and checks divisibility to determine primality.

Used to obtain deterministic, well-known numbers as input to the cube-root and square-root operations in **SHA-256**. Why in SHA-256: Prime numbers are the basis for generating constants to reduce predictability and avoid hidden patterns.

In [152]:
def primes(n):
    primes = []
    num = 2
    while len(primes) < n:
        for i in range(2,num):
            if num % i == 0:
                break
        
            primes.append(num)
        num += 1
    return primes

#### **cube_roots(primes)** Function:
Computes the cube root of each prime. **SHA-256** specifies that round constants **(K0–K63)** are based on the fractional part of the cube roots of the first 64 primes.

In [153]:
def cube_root(primes):
    cube_roots = [] 
    cube_roots = np.cbrt(primes)
    return cube_roots

#### **frac_32()** Function:

* Extracts the fractional part of the cube roots using np.modf().
* Scales the fractional part to 32 bits (frac * 2^32) and converts to 32-bit unsigned integers (np.uint32).
* Produces the K constants used in each SHA-256 round.

**Why in SHA-256:** Using the fractional part ensures a uniform, non-repeating bit pattern for cryptographic mixing.

In [154]:
def frac_32():
    p = primes(64)
    cube_roots = cube_root(p)
    frac, _= np.modf(cube_roots)
    constants = np.floor(frac * (2**32)).astype(np.uint32)
    return constants 

#### **hex_conversion(constants)** Function:
Converts the 32-bit integers into 8-character hexadecimal strings, zero-padded. **SHA-256** specifications and reference implementations often display constants in hexadecimal. Makes constants readable and comparable against the official standard values.

In [155]:
def hex_conversion(constants):
    hex_constants = []
    for c in constants:
        hex_constants.append(hex(c)[2:].zfill(8))
    return hex_constants

In [156]:
K = np.array([
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
], dtype=np.uint32)

In [157]:
## test primes
prime_nums = primes(64)
print(prime_nums)
print(len(prime_nums))

## test cubes
cubes = cube_root(prime_nums)
print(cubes)

## test fracs 
frac, nonfrac= np.modf(cubes)
print("fractional:", frac)
print("non fractional:", nonfrac)

## test constants
constants = np.floor(frac * (2**32)).astype(np.uint32)
print("const:", constants)

## test hex constants
hex_constants = hex_conversion(constants)
print("Hex constants", hex_constants)

def compare_constants(official_constants, hex_constants):
    return np.array_equal(official_constants, hex_constants)

print(len(K))
print(len(K))
print(K)
print(compare_constants(K, hex_constants))

[3, 5, 5, 5, 7, 7, 7, 7, 7, 9, 11, 11, 11, 11, 11, 11, 11, 11, 11, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 15, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 21]
64
[1.44224957 1.70997595 1.70997595 1.70997595 1.91293118 1.91293118
 1.91293118 1.91293118 1.91293118 2.08008382 2.22398009 2.22398009
 2.22398009 2.22398009 2.22398009 2.22398009 2.22398009 2.22398009
 2.22398009 2.35133469 2.35133469 2.35133469 2.35133469 2.35133469
 2.35133469 2.35133469 2.35133469 2.35133469 2.35133469 2.35133469
 2.46621207 2.57128159 2.57128159 2.57128159 2.57128159 2.57128159
 2.57128159 2.57128159 2.57128159 2.57128159 2.57128159 2.57128159
 2.57128159 2.57128159 2.57128159 2.57128159 2.66840165 2.66840165
 2.66840165 2.66840165 2.66840165 2.66840165 2.66840165 2.66840165
 2.66840165 2.66840165 2.66840165 2.66840165 2.66840165 2.66840165
 2.66840165 2.66840165 2.66840165 2.75892418]
fractional: [0.44224957 0.70997595

## Problem 3: Padding

**Brief:** Write a generator function block_parse(msg) that processes messages according to section 5.1.1 and 5.2.1 of the Secure Hash Standard. The function should accept a bytes object called msg. At each iteration, it should yield the next 512-bit block of msg as a bytes object. Ensure that the final block (or final two blocks) include(s) the required padding of msg as specified in the standard. Test the generator with messages of different lengths to confirm proper padding and block output.

###  Problem 3 Introduction

Before hashing can begin, the message needs to be prepared in a specific way - everything needs to be the right size and format. This preparation has three steps, and we're focusing on the first two: padding and parsing.

#### Padding the Message
The goal: make the message length a specific multiple (either 512 bits or 1024 bits, depending on the SHA algorithm used).

**For SHA-1, SHA-224, and SHA-256 (512-bit blocks):**

If the message is ℓ bits long here's what happens:
* Add a "1" bit to the end of your message
* Add k zero bits, where k is chosen so that (ℓ + 1 + k) equals 448 when you divide by 512 and look at the remainder
* Add a 64-bit representation of ℓ (the original message length)

**The result: (ℓ + 1 + k + 64) is always 512 bits.**

#### Parsing the Message

Once padded, you need to break the message into chunks (blocks) for processing.
For **SHA-256** and similar algorithms, you split your padded message into N blocks of 512 bits each. Each 512-bit block is then subdivided into sixteen 32-bit words (since 16 × 32 = 512).

So if you have block i, it contains words M₀⁽ⁱ⁾, M₁⁽ⁱ⁾, M₂⁽ⁱ⁾, ... M₁₅⁽ⁱ⁾.

In [158]:
def file_to_msg(filepath):
    # Reading a file in binary mode
    f = open(filepath, 'rb')
    msg = f.read()
    return msg

In [159]:
def block_parse(msg):
    no_bytes=64
    # Initialize bit counter
    no_bits = len(msg) * 8
    position = 0

    # split msg in 512-bit (64 byte) blocks & yield them
    while position < len(msg):
        block = msg[position:position+no_bytes]
        # check that the block is not a partial block - stops last block from being yielded 
        if len(block) == no_bytes:
            position += no_bytes
            yield block
        else:
            break

    # breaking out of the while loop when a partial block is found means block variable does not contain that partial block
    # get the remaining bytes in the partial block:
    block = msg[position:]  # This will be empty if position == len(msg)
    
    # check that the msg bytes objec is not empty - if it is, assign empty bytes object to last (and only) block
    if no_bits == 0:
        block = bytes()

    # Scenario 1: are there at least 9 bytes available?
    if (64 - len(block)) >= 9:
       yield (block
              + bytes([0x80])
              + bytes([0x00] * (64 - len(block) - 1 - 8))
              + no_bits.to_bytes(8, byteorder='big'))
       
    # Scenario 2: there are between 8 and 1 bytes available, inclusive.
    elif (64 - len(block)) >= 1:
       yield block + bytes([0x80]) + bytes([0x00] * (64 - len(block) - 1))
       yield bytes([0x00] * 56) + no_bits.to_bytes(8, byteorder='big')
       
    # Scenario 3: There were no bytes available in the last block.
    else:
       yield bytes([0x80] + ([0x00] * 55)) + no_bits.to_bytes(8, byteorder='big')

## Problem 4: Hashes

**Brief:** Write a function hash(current, block) that calculates the next hash value given the current hash value and the next message block according to section 6.2.2 SHA-256 Hash Computation on page 22 of the Secure Hash Standard.

### Introduction Problem 4:

In [None]:
import warnings
warnings.filterwarnings('ignore', category=RuntimeWarning)

def hash(current, block):
    # Read block as a array of 32-bit unsigned ints in big endian.
    block = np.frombuffer(block, dtype='>u4') 

    # Assign room for a 64 element 32-bit int array.
    W = np.zeros(64, dtype=np.uint32)

    # The first 16 elements of W come from the message block.
    for t in range(16):
        W[t] = block[t]
    #
    for t in range(16, 64):
        W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16]
        
    # assign current hash values to 8 variables a->h
    a = current[0]
    b = current[1]
    c = current[2]
    d = current[3]
    e = current[4]
    f = current[5]
    g = current[6]
    h = current[7]

    # loop where: T1 & T2 calculated & values of h->a get cascaded down to next next variable
    # this is done to ensure original data (message) is well-mixed with an asymetric hash 
    for t in range(64):
        T1 = h + Sigma1(e) + Ch(e, f, g) + K[t] + W[t]
        T2 = Sigma0(a) + Maj(a, b, c)
        h = g
        g = f
        f = e
        e = d + T1
        d = c
        c = b
        b = a
        a = T1 + T2

    # calculate new hash using hashed variables a->h
    H = np.array([
        a + current[0], b + current[1], c + current[2], d + current[3],
        e + current[4], f + current[5], g + current[6], h + current[7],
    ], dtype=np.uint32)

    return H

# get first 8 hash codes from official hash codes of the secure hash standard (Section 5.3.3)
H = np.array([
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
    0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
], dtype=np.uint32)

# get message from file
msg = file_to_msg('abc.txt')

# get padded blocks from block_parse() & loop through them
for block in block_parse(msg):
    # for each block, apply the hash
    H = hash(H, block) 

# test
msg = b"abc"
for block in block_parse(msg):
    H = hash(H, block) 

# print the result in hex (more readable format)
result = ''.join(f'{x:08x}' for x in H)
print("Result: ", result)
print("Expected:   ", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")
print("Match:", result == "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad")


Result:  1de51655e1a8ca5f570c062c6fef497dd8110762bd8a7a48dd95482f7d22e0a2
Expected:    ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
Match: False


## Problem 5: Passwords