Task 1: Binary Representations

This notebook implements various binary manipulation functions for 32-bit unsigned integers:
1. `rotl`: Rotate bits left
2. `rotr`: Rotate bits right
3. `ch`: Choose bits based on selector
4. `maj`: Majority vote of bits

## Function Implementations

In [38]:
def rotl(x: int, n: int = 1) -> int:
    """
    Rotate a 32-bit unsigned integer to the left by n positions.
    
    Args:
        x: The integer to rotate (must be a 32-bit unsigned integer)
        n: Number of positions to rotate left (default: 1)
    
    Returns:
        The rotated integer
    """
    x = x & 0xFFFFFFFF  # Ensure 32-bit
    n = n % 32  # Normalize rotation amount
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

In [39]:
# Test rotl function
print("Testing rotl function:")
test_num = 0x12345678
print(f"Original number: {hex(test_num)}")
print(f"Rotated left 4 bits: {hex(rotl(test_num, 4))}")
print(f"Rotated left 8 bits: {hex(rotl(test_num, 8))}")
print(f"Rotated left 16 bits: {hex(rotl(test_num, 16))}")

Testing rotl function:
Original number: 0x12345678
Rotated left 4 bits: 0x23456781
Rotated left 8 bits: 0x34567812
Rotated left 16 bits: 0x56781234


In [40]:
def rotr(x: int, n: int = 1) -> int:
    """
    Rotate a 32-bit unsigned integer to the right by n positions.
    
    Args:
        x: The integer to rotate (must be a 32-bit unsigned integer)
        n: Number of positions to rotate right (default: 1)
    
    Returns:
        The rotated integer
    """
    x = x & 0xFFFFFFFF  # Ensure 32-bit
    n = n % 32  # Normalize rotation amount
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

In [41]:
# Test rotr function
print("Testing rotr function:")
test_num = 0x12345678
print(f"Original number: {hex(test_num)}")
print(f"Rotated right 4 bits: {hex(rotr(test_num, 4))}")
print(f"Rotated right 8 bits: {hex(rotr(test_num, 8))}")
print(f"Rotated right 16 bits: {hex(rotr(test_num, 16))}")

Testing rotr function:
Original number: 0x12345678
Rotated right 4 bits: 0x81234567
Rotated right 8 bits: 0x78123456
Rotated right 16 bits: 0x56781234
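
The two rotations undo each other, and a right rotation by `n` is the same as a left rotation by `32 - n`. The sketch below checks both properties; `rotl32` and `rotr32` are hypothetical stand-ins that restate the logic of `rotl` and `rotr` so the snippet runs on its own.

```python
MASK32 = 0xFFFFFFFF  # Keeps results inside 32 bits

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n positions (same logic as rotl above)."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32

def rotr32(x: int, n: int) -> int:
    """Rotate right by expressing it as a left rotation by 32 - n."""
    return rotl32(x, (32 - n) % 32)

value = 0x12345678
for n in (0, 4, 8, 16, 31, 32):
    # Rotating left and then right by the same amount restores the input
    assert rotr32(rotl32(value, n), n) == value

print(hex(rotr32(value, 4)))  # 0x81234567, matching the rotr output above
```

Writing `rotr32` in terms of `rotl32` is only one possible design; it keeps the shift-and-mask logic in a single place.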


In [42]:
def ch(x: int, y: int, z: int) -> int:
    """
    Choose bits from y where x has 1s, and from z where x has 0s.
    
    Args:
        x: The selector integer
        y: First input integer
        z: Second input integer
    
    Returns:
        The resulting integer after bit selection
    """
    # Python's ~x is negative, so mask the result to keep it within 32 bits
    return ((x & y) ^ (~x & z)) & 0xFFFFFFFF

In [43]:
# Test ch function
print("Testing ch function:")
x = 0xFFFFFFFF
y = 0xAAAAAAAA  # Pattern of alternating 1s and 0s
z = 0x55555555  # Inverse pattern of y

print(f"x: {hex(x)}")
print(f"y: {hex(y)}")
print(f"z: {hex(z)}")
print(f"ch(x,y,z): {hex(ch(x,y,z))}")

# Test with x = 0
x = 0x00000000
print(f"\nx: {hex(x)}")
print(f"y: {hex(y)}")
print(f"z: {hex(z)}")
print(f"ch(x,y,z): {hex(ch(x,y,z))}")

Testing ch function:
x: 0xffffffff
y: 0xaaaaaaaa
z: 0x55555555
ch(x,y,z): 0xaaaaaaaa

x: 0x0
y: 0xaaaaaaaa
z: 0x55555555
ch(x,y,z): 0x55555555
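
The two tests above use an all-ones and an all-zeros selector. As a complementary check, the sketch below walks the full single-bit truth table and then tries a mixed selector; `ch32` is a hypothetical name for a masked restatement of `ch`.

```python
MASK32 = 0xFFFFFFFF

def ch32(x: int, y: int, z: int) -> int:
    """Bitwise choose, masked to 32 bits (restates ch for a standalone check)."""
    return ((x & y) ^ (~x & z)) & MASK32

# Per bit: the selector picks y when it is 1 and z when it is 0
for xb in (0, 1):
    for yb in (0, 1):
        for zb in (0, 1):
            assert (ch32(xb, yb, zb) & 1) == (yb if xb else zb)

# Mixed selector: upper half comes from y, lower half from z
print(hex(ch32(0xFFFF0000, 0xAAAAAAAA, 0x55555555)))  # 0xaaaa5555
```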


In [44]:
def maj(x: int, y: int, z: int) -> int:
    """
    Take majority vote of bits in x, y, and z.
    
    Args:
        x: First input integer
        y: Second input integer
        z: Third input integer
    
    Returns:
        Integer with 1s where majority (2 or more) inputs have 1s
    """
    return (x & y) ^ (x & z) ^ (y & z)

In [45]:
# Test maj function
print("Testing maj function:")
x = 0xFFFFFFFF
y = 0xAAAAAAAA
z = 0x55555555

print(f"x: {hex(x)}")
print(f"y: {hex(y)}")
print(f"z: {hex(z)}")
print(f"maj(x,y,z): {hex(maj(x,y,z))}")

# Test with different patterns
x = 0x00000000
print(f"\nx: {hex(x)}")
print(f"y: {hex(y)}")
print(f"z: {hex(z)}")
print(f"maj(x,y,z): {hex(maj(x,y,z))}")

Testing maj function:
x: 0xffffffff
y: 0xaaaaaaaa
z: 0x55555555
maj(x,y,z): 0xffffffff

x: 0x0
y: 0xaaaaaaaa
z: 0x55555555
maj(x,y,z): 0x0
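
To make the majority rule concrete, the sketch below compares the XOR-of-ANDs formula against a direct count of set bits for every single-bit combination. `maj_sketch` is a hypothetical name that restates `maj` so the snippet is standalone.

```python
from itertools import product

def maj_sketch(x: int, y: int, z: int) -> int:
    """Bitwise majority, same formula as maj above."""
    return (x & y) ^ (x & z) ^ (y & z)

# For single bits, the result is 1 exactly when at least two inputs are 1
for bits in product((0, 1), repeat=3):
    assert maj_sketch(*bits) == int(sum(bits) >= 2)

# Each bit position is voted on independently
print(bin(maj_sketch(0b1100, 0b1010, 0b0110)))  # 0b1110
```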


Task 2

In [46]:
def hash_function(s):
    """
    Python implementation of the hash function from 
    The C Programming Language by Brian Kernighan and Dennis Ritchie.
    
    Args:
        s (str): The string to hash
        
    Returns:
        int: The hash value
    """
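    hashval = 0
    for char in s:
        # The C original uses unsigned arithmetic, which wraps around.
        # Python integers do not wrap, so results can differ from the
        # C version for longer strings.
        hashval = ord(char) + 31 * hashval
    return hashval % 101  # 101 is HASHSIZE in the K&R original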

In [47]:
# Test with some sample strings
test_strings = [
    "hello",
    "world",
    "python",
    "hash",
    "function",
    "algorithm",
    "kernighan",
    "ritchie",
    "programming",
    "language"
]

print("String\t\tHash Value")
print("-" * 30)
for s in test_strings:
    print(f"{s:<15}\t{hash_function(s)}")

String		Hash Value
------------------------------
hello          	17
world          	34
python         	91
hash           	15
function       	100
algorithm      	76
kernighan      	37
ritchie        	26
programming    	89
language       	68


In [48]:
# Test for collisions
all_values = {}
for i in range(1000):
    test_str = f"test{i}"
    hash_val = hash_function(test_str)
    if hash_val in all_values:
        all_values[hash_val].append(test_str)
    else:
        all_values[hash_val] = [test_str]

# Count collisions
collisions = sum(len(strings) - 1 for strings in all_values.values() if len(strings) > 1)
print("\nCollision test with 1000 strings:")
print(f"Number of unique hash values: {len(all_values)}")
print(f"Number of collisions: {collisions}")
print(f"Collision rate: {collisions/1000:.2%}")


Collision test with 1000 strings:
Number of unique hash values: 101
Number of collisions: 899
Collision rate: 89.90%
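
The 89.90% figure is mostly a consequence of the table size rather than of the hash quality: with only 101 possible values, 1000 keys must produce at least 1000 - 101 = 899 collisions. The sketch below reproduces the count and also looks at how evenly the buckets are filled, which is arguably the more informative measure. `kr_hash` is a hypothetical restatement of `hash_function` with the bucket count as a parameter.

```python
from collections import Counter

def kr_hash(s: str, buckets: int = 101) -> int:
    """Same recurrence as hash_function above, with a configurable table size."""
    hashval = 0
    for char in s:
        hashval = ord(char) + 31 * hashval
    return hashval % buckets

keys = [f"test{i}" for i in range(1000)]
loads = Counter(kr_hash(k) for k in keys)  # bucket -> number of keys

collisions = len(keys) - len(loads)   # keys that landed in an occupied bucket
floor = max(0, len(keys) - 101)       # pigeonhole lower bound for 101 buckets

print(f"buckets used: {len(loads)}, collisions: {collisions}, minimum possible: {floor}")
print(f"smallest bucket: {min(loads.values())}, largest bucket: {max(loads.values())}")
```

An ideal spread would put about 9.9 keys in each bucket, so the gap between the smallest and largest bucket shows how far this key set is from uniform.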


Task 3: SHA256

In [49]:
def calculate_sha256_padding(file_path: str) -> None:
    """
    Calculate and print the SHA256 padding that would be applied to a file.
    
    The SHA256 padding consists of:
    1. A '1' bit (0x80 byte)
    2. Enough '0' bits to make the total length a multiple of 512 bits
    3. The original message length as a 64-bit big-endian integer
    
    Args:
        file_path: Path to the file to calculate padding for
    """
    # Get the file size in bytes
    file_size = 0
    with open(file_path, 'rb') as f:
        # Read the file in chunks to handle large files
        chunk = f.read(8192)
        while chunk:
            file_size += len(chunk)
            chunk = f.read(8192)
    
    # Calculate the file size in bits
    file_size_bits = file_size * 8
    
    # Calculate padding
    # First byte of padding is always 0x80 (a '1' bit followed by 7 '0' bits)
    padding = [0x80]
    
    # Calculate how many zero bytes are needed.
    # The padded message (file + 0x80 + zeros + 8-byte length field) must be
    # a multiple of 512 bits (64 bytes). Equivalently, file_size + 1 + zeros
    # must be congruent to 56 modulo 64, leaving exactly 8 bytes for the length.
    #
    # If the 0x80 byte already pushes the message past byte 56 of the current
    # block, the padding spills over into an additional 64-byte block.
    remainder = (file_size + 1) % 64
    zero_bytes_needed = 64 - 8 - remainder if remainder <= 56 else 128 - 8 - remainder
    
    # Add the zero bytes
    padding.extend([0x00] * zero_bytes_needed)
    
    # Add the original length as a 64-bit big-endian integer
    # We need to represent the length in bits, not bytes
    for i in range(7, -1, -1):
        # Extract each byte of the 64-bit length
        padding.append((file_size_bits >> (i * 8)) & 0xFF)
    
    # Print the padding in hex format
    padding_hex = ' '.join(f'{byte:02X}' for byte in padding)
    
    # Format the output to match the example (with line breaks every 26 bytes)
    formatted_output = ''
    for i in range(0, len(padding_hex), 78):  # 26 bytes = 26*3 chars (including spaces)
        formatted_output += padding_hex[i:i+78] + '\n'
    
    print(f"SHA256 padding for file '{file_path}':")
    print(formatted_output.strip())

In [50]:
# Test the SHA256 padding function with a simple example
import os

# Create a test file with content "abc"
test_file = "test_abc.txt"
with open(test_file, "wb") as f:
    f.write(b"abc")

# Calculate and print the padding
calculate_sha256_padding(test_file)

# Clean up the test file
os.remove(test_file)

SHA256 padding for file 'test_abc.txt':
80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 18
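
The same padding can be derived from the message length alone, without touching the file system. The sketch below is one possible closed form, not the notebook's own function: `sha256_padding` is a hypothetical helper that takes a length in bytes and returns the padding as a `bytes` object.

```python
def sha256_padding(message_len: int) -> bytes:
    """Return the SHA-256 padding for a message of message_len bytes."""
    # After the 0x80 byte, pad with zeros until 8 bytes remain in the block
    zero_bytes = (55 - message_len) % 64
    length_field = (message_len * 8).to_bytes(8, "big")  # length in bits
    return b"\x80" + b"\x00" * zero_bytes + length_field

# Every padded message should end on a 64-byte boundary
for n in range(200):
    assert (n + len(sha256_padding(n))) % 64 == 0

pad = sha256_padding(3)  # same length as the "abc" example above
print(len(pad), hex(pad[0]), hex(pad[-1]))  # 61 0x80 0x18
```

For a 3-byte message this gives 61 bytes of padding, which matches the 26 + 26 + 9 bytes printed above.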


Task 4: Prime Numbers

This task implements two different algorithms to find prime numbers:
1. Sieve of Eratosthenes - An efficient algorithm that marks non-prime numbers in a range
2. Trial Division - A simple algorithm that checks each number for divisibility

In [None]:
def sieve_of_eratosthenes(n: int) -> list[int]:
    """
    Find all prime numbers up to n using the Sieve of Eratosthenes algorithm.
    
    The algorithm works by:
    1. Creating a boolean array of size n+1, initially all True
    2. Starting from 2, for each prime number:
       - Mark all its multiples as non-prime
    3. Collect all numbers that remain marked as True
    
    Args:
        n: Upper limit to find primes up to
    
    Returns:
        List of prime numbers up to n
    """
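    # No primes exist below 2; returning early also avoids indexing
    # past the end of the array for n < 1.
    if n < 2:
        return []

    # Create a boolean array "is_prime[0..n]" and initialize all
    # entries as True. is_prime[i] ends up False if i is not a
    # prime and stays True otherwise.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False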
    
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Update all multiples of i
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    
    # Collect all prime numbers
    return [i for i in range(n + 1) if is_prime[i]]
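
To see the marking step in action, the sketch below records which numbers each prime crosses out for the first time. `sieve_trace` is a hypothetical helper written for illustration; it follows the same steps as `sieve_of_eratosthenes` but also returns the trace.

```python
def sieve_trace(n: int) -> tuple[list[int], dict[int, list[int]]]:
    """Run the sieve up to n and record what each prime newly crosses out."""
    if n < 2:
        return [], {}
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    crossed: dict[int, list[int]] = {}
    i = 2
    while i * i <= n:  # integer test, avoids floating-point square roots
        if is_prime[i]:
            # Multiples below i*i were already handled by smaller primes
            newly = [j for j in range(i * i, n + 1, i) if is_prime[j]]
            for j in newly:
                is_prime[j] = False
            crossed[i] = newly
        i += 1
    primes = [k for k in range(n + 1) if is_prime[k]]
    return primes, crossed

primes, crossed = sieve_trace(30)
print(primes)      # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(crossed[3])  # [9, 15, 21, 27]
print(crossed[5])  # [25]
```

The trace shows why the outer loop can stop at the square root of `n`: by the time the loop reaches 5, only 25 is left to cross out.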