# Computational Theory - Shane Walsh - G00406694

## Task 1: Binary Representations
Create four functions in Python, demonstrating their use with examples and tests.
1. The function rotl(x, n=1) that rotates the bits in a 32-bit unsigned integer to the left n places.

2. The function rotr(x, n=1) that rotates the bits in a 32-bit unsigned integer to the right n places.

3. The function ch(x, y, z) that chooses the bits from y where x has bits set to 1 and bits in z where x has bits set to 0.

4. The function maj(x, y, z) which takes a majority vote of the bits in x, y, and z.

The output should have a 1 in bit position i where at least two of x, y, and z have 1's in position i.
All other output bit positions should be 0.

## 1. The function rotl(x, n=1)

This function rotates the bits of a 32-bit unsigned integer to the left by a set number of positions (such as 1 or 2). Bits that fall off the left-hand end wrap around to the right-hand end. To implement this, we first need to understand the syntax for bitwise shifting.

- Bitwise left shift is `<<`
- Bitwise right shift is `>>`

In [14]:
# Bitwise shifting of a number
x = 0b1100
print("Base value of X: " + bin(x)) # Base value
print("Shifted to the left by 2: " + bin(x << 2)) # Shift left by 2
print("Shifted to the right by 2: " + bin(x >> 2)) # Shift right by 2


Base value of X: 0b1100
Shifted to the left by 2: 0b110000
Shifted to the right by 2: 0b11


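It's important that the result does not grow beyond 32 bits during the shift. Python integers have no fixed width, so a left shift keeps the bits that a 32-bit register would discard. We handle this with what's known as masking: ANDing the result with 0xFFFFFFFF keeps only the lowest 32 bits, which maintains the 32-bit size of my result.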

In [1]:
# Masking examples here
x = 0b1100
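
A minimal sketch of masking, assuming the 32-bit mask `0xFFFFFFFF` that `rotl` uses later. The name `MASK32` is introduced here only for illustration. It shows a left shift growing past 32 bits and the mask trimming it back, and the same mask turning Python's negative `~x` into its 32-bit value.

```python
MASK32 = 0xFFFFFFFF  # 32 one-bits; illustrative name, not from the original notebook

x = 0x80000001                # top bit and bottom bit set
shifted = x << 1              # Python ints are unbounded, so this needs 33 bits
masked = shifted & MASK32     # keep only the lowest 32 bits

print(hex(shifted))           # 0x100000002
print(hex(masked))            # 0x2

# The same mask gives the 32-bit view of bitwise NOT.
not_x = ~0b1100 & MASK32
print(hex(not_x))             # 0xfffffff3
```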

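The final step to tie this together is bitwise operations: the classic AND, OR, NOT and XOR. These let us combine and select individual bits. Note that Python integers are signed and unbounded, so NOT prints as a negative number (-0b1101 below) unless the result is masked.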

In [2]:
# Examples of bitwise operations
x = 0b1100
y = 0b1010
print(bin(x & y)) # Bitwise AND
print(bin(x | y)) # Bitwise OR
print(bin(x ^ y)) # Bitwise XOR
print(bin(~x)) # Bitwise NOT

0b1000
0b1110
0b110
-0b1101


In the end, our rotation function looks like the following. It shifts the bits left by n positions with (x << n). The n bits that would be lost off the left-hand end are recovered with (x >> (32 - n)), which moves them down to the right-hand end. We put an OR operator between these to combine them. Lastly, we AND with the mask 0xFFFFFFFF to ensure the result stays at 32 bits. This assumes that x fits in 32 bits and that n is between 0 and 32.

In [3]:
import numpy as np
import math

def rotl(x, n=1):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF # << shifts the bits to left, >> shifts the bits to right.

x = 0b1100
print(f'rotl(0b1100, 2): {bin(rotl(x, 2))}')
print(f'rotl(0b1100, 3): {bin(rotl(x, 3))}')

rotl(0b1100, 2): 0b110000
rotl(0b1100, 3): 0b1100000


### Demonstrating the use of rotl(x, n=1)
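
The examples above never push a bit past position 31, so they look the same as a plain left shift. A short sketch with the top bits set shows the wrap-around. `rotl` is repeated so the cell runs on its own, and the input values are chosen only for illustration.

```python
def rotl(x, n=1):
    # Same definition as above: rotate a 32-bit value left by n places.
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

print(hex(rotl(0x80000000)))      # top bit wraps round to bit 0 -> 0x1
print(hex(rotl(0x12345678, 8)))   # top byte wraps round to the bottom -> 0x34567812
print(hex(rotl(0xF000000F, 4)))   # top four bits join the bottom four -> 0xff
```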

## 2. The function rotr(x, n=1)

In [4]:
def rotr(x, n=1):
    return (x >> n) | ((x << (32 - n)) & 0xFFFFFFFF)

x = 0b1100

# Test the function.
print(f'rotr(0b1100, 2): {bin(rotr(x, 2))}')

rotr(0b1100, 2): 0b11
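
`rotr` mirrors `rotl`: `(x >> n)` moves the bits right, `(x << (32 - n))` brings the n bits that fell off the right-hand end back in at the top, and the mask trims that left shift to 32 bits. A small sketch, assuming x fits in 32 bits and n is between 1 and 31, checks the wrap-around and that the two rotations undo each other.

```python
def rotl(x, n=1):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def rotr(x, n=1):
    return (x >> n) | ((x << (32 - n)) & 0xFFFFFFFF)

print(hex(rotr(0b1100, 3)))       # lowest set bit wraps to bit 31 -> 0x80000001
print(hex(rotr(0x12345678, 8)))   # bottom byte wraps to the top -> 0x78123456

# Rotating left and then right by the same amount gives back the original value.
x = 0x12345678
round_trips = all(rotr(rotl(x, n), n) == x for n in range(1, 32))
print(round_trips)                # True
```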


## 3. The function ch(x, y, z)

For this function we need to make use of bit selection logic using bitwise operators on our chosen binary values. In this particular implementation: 

- For each bit position of x being 1, we choose the associated or corresponding bit from y.
- For each bit position of x being 0, we choose the corresponding bit from z.

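Let me further break down how to accomplish this:
- Using (x & y) will take bits from y where x has 1s.
- Using (~x & z) will take bits from z where x has 0s.
- The two results never have a 1 in the same position, so combining them with XOR (^) gives the chosen bits.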

In [5]:
def ch(x, y, z): # Choose between y and z, bit by bit, based on x.
    return (x & y) ^ (~x & z) # & is bitwise AND, ^ is bitwise XOR, ~ is bitwise NOT. So this returns (x AND y) XOR ((NOT x) AND z).

x = 0b1100
y = 0b1010
z = 0b1001

print(f'ch(0b1100, 0b1010, 0b1001): {bin(ch(x, y, z))}')

ch(0b1100, 0b1010, 0b1001): 0b1001
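
As a rough check of the selection rule, the sketch below compares `ch` with a slow bit-by-bit version. `ch_reference` is a hypothetical helper written only for this comparison, and the 32-bit width is an assumption carried over from the rotation functions.

```python
def ch(x, y, z):
    return (x & y) ^ (~x & z)

def ch_reference(x, y, z, width=32):
    # Hypothetical helper: pick each output bit from y or z depending on x.
    result = 0
    for i in range(width):
        source = y if (x >> i) & 1 else z
        result |= ((source >> i) & 1) << i
    return result

x, y, z = 0b1100, 0b1010, 0b1001
print(bin(ch(x, y, z)))             # 0b1001
print(bin(ch_reference(x, y, z)))   # 0b1001

# (x & y) and (~x & z) never share a 1 bit, so OR gives the same result as XOR.
same_with_or = ch(x, y, z) == ((x & y) | (~x & z))
print(same_with_or)                 # True
```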


## 4. The function maj(x, y, z)

In [6]:
def maj(x, y, z): # Takes a majority vote of x, y, z.
    return (x & y) ^ (x & z) ^ (y & z) # ^ means XOR. So this returns x AND y XOR x AND z XOR y AND z.

x = 0b1100
y = 0b1010
z = 0b1001

x1 = 0b1100
y1 = 0b1010
z1 = 0b1010

# Examples
print(f'maj(0b1100, 0b1010, 0b1001): {bin(maj(x, y, z))}')
print(f'maj(0b1100, 0b1010, 0b1010): {bin(maj(x1, y1, z1))}')

maj(0b1100, 0b1010, 0b1001): 0b1000
maj(0b1100, 0b1010, 0b1010): 0b1010
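
The majority rule can be checked against a truth table. With single-bit inputs, `maj` should return 1 exactly when at least two of the three inputs are 1. Because the operators work on every bit position independently, the same then holds for each bit of a 32-bit word. This is only a sketch to confirm the formula above.

```python
from itertools import product

def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)

# Try all eight combinations of single-bit inputs.
truth_table = {(a, b, c): maj(a, b, c) for a, b, c in product((0, 1), repeat=3)}
for bits, vote in truth_table.items():
    print(bits, '->', vote)

# The vote is 1 exactly when two or more inputs are 1.
matches_majority = all(vote == (sum(bits) >= 2) for bits, vote in truth_table.items())
print(matches_majority)   # True
```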


## Task 2: Hash Functions
The following hash function is from The C Programming Language by Brian Kernighan and Dennis Ritchie.
Convert it to Python, test it, and suggest why the values 31 and 101 are used.

unsigned hash(char *s) {
    unsigned hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % 101;
}

I'll now convert this hash function from the C programming language to Python. This requires quite a few changes to the code. We don't need to declare variable types. While the hash function in C uses pointers and checks for the null terminator, Python makes this much simpler, as we can iterate directly over the string. The hash function being converted is from pages 128-129 of [The C Programming Language](https://seriouscomputerist.atariverse.com/media/pdf/book/C%20Programming%20Language%20-%202nd%20Edition%20(OCR).pdf) by Brian Kernighan and Dennis Ritchie.

In [7]:
def Hash_string(s): # No need to declare types; the string is passed in directly, no pointer.
    hashval = 0 # Python has no 'unsigned' type; its integers grow as needed instead of wrapping.
    # Iterate directly over the string.
    for c in s:
        hashval = ord(c) + 31 * hashval # ord() returns the code point of the char (its ASCII value for ASCII text). 31 is a prime multiplier.
    return hashval % 101 # Modulus is unchanged. 101 is a prime number, which helps reduce collisions.

The final conversion is shown above. Rather than dereferencing the character pointer as in C, we can use Python's [ord() function](https://docs.python.org/3.4/library/functions.html#ord) to get the [ASCII value](https://www.ascii-code.com/) of each character. The modulus applied to hashval needs no changes.

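The numbers 31 and 101 are both prime. To my knowledge, using prime numbers in a hash function gives a smoother distribution of hash values. 31 is the multiplier: it is odd, so no character's contribution is simply shifted out of the result, and it is one less than a power of two (32 - 1), so the multiplication can be done cheaply with a shift and a subtraction. 101 sets the hash table size (the number of buckets). A prime table size shares no factors with the multiplier, which helps reduce the collisions that can otherwise come up quite easily.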
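
One way to see the effect of these choices is to swap in a poor pair of constants. The sketch below uses a hypothetical helper `poly_hash` with the multiplier and table size as parameters (these parameter names are not from K&R). With a multiplier of 32 and a table size of 64, everything before the last two characters is multiplied by a multiple of 64 and vanishes under the modulus, so words with the same two-letter ending always collide. With 31 and 101, the same words land in different buckets.

```python
def poly_hash(s, multiplier, modulus):
    # Same loop as Hash_string, with the two constants made into parameters (illustrative only).
    hashval = 0
    for c in s:
        hashval = ord(c) + multiplier * hashval
    return hashval % modulus

words = ["aab", "bab", "cab", "zab"]   # differ only in the first letter

poor = [poly_hash(w, 32, 64) for w in words]
good = [poly_hash(w, 31, 101) for w in words]

print(poor)   # every word gets the same value
print(good)   # every word gets a different value
```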

In [8]:
# Testing Hash function
print(f'Hash_string("hello"): {Hash_string("hello")}')
print(f'Hash_string("world"): {Hash_string("world")}')
print(f'Hash_string("hello world"): {Hash_string("hello world")}')

Hash_string("hello"): 17
Hash_string("world"): 34
Hash_string("hello world"): 13
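
One difference from the C version is worth noting. In C, `unsigned` arithmetic wraps around when it overflows, whereas Python integers keep growing, so for long strings the value before the final `% 101` can differ between the two. The sketch below mimics the wrap-around by masking after each step. It assumes a 32-bit `unsigned` (common, but not guaranteed by the C standard) and ASCII input; `hash_string_c32` is a hypothetical name.

```python
def hash_string_c32(s):
    # Mimic C's unsigned overflow, assuming unsigned is 32 bits wide.
    hashval = 0
    for byte in s.encode('ascii'):
        hashval = (byte + 31 * hashval) & 0xFFFFFFFF
    return hashval % 101

# Short strings never overflow 32 bits, so they match the results above.
print(hash_string_c32("hello"))   # 17
print(hash_string_c32("world"))   # 34
```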


## Task 3: SHA256

Write a Python function that calculates the SHA256 padding for a given file.  
The function should take a file path as input.  
It should print, in hex, the padding that would be applied to it.  
The [specification](https://doi.org/10.6028/NIST.FIPS.180-4) states that the following should be appended to a message:  

- a `1` bit;
- enough `0` bits so the length in bits of padded message is the smallest possible multiple of 512;
- the length in bits of the original input as a big-endian 64-bit unsigned integer.

The example in the specification is a file containing the three bytes `abc`:  

```python
01100001 01100010 01100011
```

The output would be:  

```python
80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 18
```

So for this task, we need to calculate the SHA256 padding for a file whose path is supplied to the function as input. I'll adhere to the NIST FIPS 180-4 [specification](https://doi.org/10.6028/NIST.FIPS.180-4) during this process. In the end I'll be returning the padding as a collection of bytes.

We first need to read in the file as a collection of bytes. This is fairly easy and self-explanatory to implement. Right after that, we calculate the original content length in bits.

In [9]:

# Create a file called file.txt with our three bytes in it for example purposes.
file = 'file.txt'

with open(file, 'w') as f: # For testing purposes, write some text to a file.
    f.write('abc')

# Read in the file
with open(file, 'rb') as f:
    data = f.read()

# Calculate length in bits of the file
length = len(data) * 8

print(f'Length of file in bits: {length}')

Length of file in bits: 24


Following the length calculation, we follow the SHA-256 specification for applying padding. First off, I'll add a single '1' bit. Because we work in whole bytes, this is written as the byte 0x80 (binary 10000000). We then need to add enough '0' bits so that the total length (the original data, the padding and the 64-bit length field) is a multiple of 512 bits.

In [10]:
### Example Code here
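
A minimal sketch of this step for the `abc` example, working on the bytes directly rather than on a file. Padding is built in whole bytes, so the single `1` bit is written as the byte `0x80`, whose seven trailing zeros already count towards the zero padding. The variable names are illustrative.

```python
data = b'abc'

one_bit = bytes([0x80])
print(f'{one_bit[0]:08b}')              # 10000000

# Message, then the 1 bit, then zero bytes, leaving 8 bytes free for the length.
# This subtraction is only valid here because abc fits in one 64-byte block.
zeros = bytes(64 - len(data) - 1 - 8)
block_so_far = data + one_bit + zeros

print(len(zeros))                       # 52
print(len(block_so_far) * 8)            # 448 bits, so 64 bits remain in the 512-bit block
```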


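We use the calculation padding_zeros = 56 - ((len(data) + 1) % 64) to work out how many zero bytes are needed so that exactly 64 bits (8 bytes) are left at the end of the final block for the length. If the result is negative, the length does not fit in the current block, so we add 64 to spill over into one more block.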

In [11]:
### Example Code here
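
To see how the calculation behaves near a block boundary, the sketch below evaluates it for a few message lengths in bytes. `zero_bytes_needed` is a hypothetical helper that wraps the formula together with the adjustment for a negative result. In each case the message, the `0x80` byte, the zeros and the 8-byte length add up to a multiple of 64 bytes.

```python
def zero_bytes_needed(message_length):
    # Hypothetical helper: number of 0x00 bytes between the 0x80 byte and the length field.
    zeros = 56 - ((message_length + 1) % 64)
    if zeros < 0:
        zeros += 64   # the length field does not fit, so spill into another block
    return zeros

for n in (0, 3, 55, 56, 63, 64):
    zeros = zero_bytes_needed(n)
    total = n + 1 + zeros + 8
    print(f'{n:>2} bytes -> {zeros:>2} zero bytes, padded length {total} bytes')
```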

Following this, we append the original length of the file in bits as a 64-bit big-endian unsigned integer.
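
The length field can be produced with Python's built-in `int.to_bytes`. A short sketch for the 3-byte `abc` example, where the length is 24 bits (hex `18`):

```python
length_in_bits = len(b'abc') * 8

# 8 bytes = 64 bits; 'big' puts the most significant byte first.
length_field = length_in_bits.to_bytes(8, byteorder='big')

print(length_field.hex(' '))   # 00 00 00 00 00 00 00 18
```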

In [12]:
### Final Sha256 Method

# Calculate the SHA256 padding for a given file.
def SHA256_padding(file):
    # Read in the file as raw bytes.
    with open(file, 'rb') as f:
        data = f.read()

    # Calculate the length of the file in bits: multiply by 8, as there are 8 bits in a byte.
    original_length = len(data) * 8

    # Append the '1' bit. Followed by seven '0' bits, this is the byte 0x80 (0b10000000).
    padding = bytearray([0x80])

    # Add enough zero bytes so the length so far is 448 mod 512 bits,
    # i.e. (len(data) + 1 + padding_zeros) % 64 == 56, leaving 8 bytes for the length field.
    padding_zeros = 56 - ((len(data) + 1) % 64)
    # A negative count means the length field does not fit in this block, so spill into another 64-byte block.
    if padding_zeros < 0:
        padding_zeros += 64
    padding.extend([0] * padding_zeros)

    # Append the original length of the file in bits as a 64-bit big-endian unsigned integer.
    padding.extend(original_length.to_bytes(8, byteorder='big'))

    # Print the padding in hexadecimal.
    print("SHA256 padding in hexadecimal format: ")
    for i, byte in enumerate(padding):
        print(f'{byte:02x}', end=' ')
        if (i + 1) % 25 == 0: # Print 25 bytes per line for simplicity and readability.
            print()
    if len(padding) % 25 != 0:
        print() # Finish the last line if it holds fewer than 25 bytes.

    return padding # Return the padding at the end.

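### Example explanation

The test below writes the three bytes `abc` to a file and prints the padding for it. The result should contain the same 61 bytes as the example from the specification: `80`, then 52 zero bytes, then the 8-byte length field ending in `18` (24 bits).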

In [13]:
### Example usage with a test file. 
def test_sha256_padding():
    # Create a test file with "abc"
    with open("test_file.txt", "wb") as f:
        f.write(b"abc")
    
    # Calculate and print the padding
    padding = SHA256_padding("test_file.txt")
    return padding

# Test the function
padding = test_sha256_padding()


SHA256 padding in hexadecimal format: 
80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 18 


## Task 4: Prime Numbers

Calculate the first 100 prime numbers using two different algorithms.  
Any algorithms that are well-established and work correctly are okay to use.  
Explain how the algorithms work.

Insert links here for each algorithm

### Division with square roots
The basic process for checking whether a number n is prime is to find out whether it is divisible by any number smaller than it. We can cut this work down by using its square root: we only need to try divisors up to the square root of n because:
- If n = a × b, then at least one of a or b must be less than or equal to the square root of n. If both were larger, their product would be larger than n.

We then check each number in sequence, adding it to our list if it is prime and skipping it if it isn't. We continue this process until we reach 100 prime numbers.
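
The square-root argument can be seen by listing the divisor pairs of a number. In the sketch below, `divisor_pairs` is a hypothetical helper that only tries candidates up to the integer square root, yet still recovers every divisor through its partner.

```python
import math

def divisor_pairs(n):
    # Hypothetical helper: every pair (a, b) with a * b == n and a <= b.
    pairs = []
    for a in range(1, math.isqrt(n) + 1):
        if n % a == 0:
            pairs.append((a, n // a))
    return pairs

print(divisor_pairs(36))   # [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)]
print(divisor_pairs(97))   # [(1, 97)], so 97 has no divisor other than 1 and itself
```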


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import time

def test_prime(n):
    # Test if n is prime by checking whether it is divisible by any number up to its square root.

    # We only need to test divisors of the form "6k±1", since all primes greater than 3 are of this form.
    # To do this, we must handle the small cases and divisibility by 2 and 3 separately. Once we've ruled out 2, we know n is not even.
    if n <= 1: # Check if n is less than or equal to 1, if so, return False.
        return False
    if n <= 3: # Check if n is less than or equal to 3, if so, return True.
        return True
    if n % 2 == 0 or n % 3 == 0: # Check if n is divisible by 2 or 3, if so, return False.
        return False
    
    