# Computational Theory - Shane Walsh - G00406694

## Task 1: Binary Representations
Create four functions in Python, demonstrating their use with examples and tests.
1. The function rotl(x, n=1) that rotates the bits in a 32-bit unsigned integer to the left n places.

2. The function rotr(x, n=1) that rotates the bits in a 32-bit unsigned integer to the right n places.

3. The function ch(x, y, z) that chooses the bits from y where x has bits set to 1 and bits in z where x has bits set to 0.

4. The function maj(x, y, z) which takes a majority vote of the bits in x, y, and z.

The output should have a 1 in bit position i where at least two of x, y, and z have 1's in position i.
All other output bit positions should be 0.

## 1. The function rotl(x, n=1)

This function rotates the bits of a 32-bit unsigned integer to the left by a set number of positions (such as 1 or 2). To implement this, we first need to understand the syntax for bitwise shifting.

- Bitwise left shift is `<<`
- Bitwise right shift is `>>`

In [32]:
# Bitwise shifting of a number
x = 0b1100
print("Base value of X: " + bin(x)) # Base value
print("Shifted to the left by 2: " + bin(x << 2)) # Shift left by 2
print("Shifted to the right by 2: " + bin(x >> 2)) # Shift right by 2


Base value of X: 0b1100
Shifted to the left by 2: 0b110000
Shifted to the right by 2: 0b11


A left shift in Python never drops bits, because Python integers can grow without limit. To keep the result within 32 bits we use what's known as masking: ANDing the result with 0xFFFFFFFF discards everything above bit 31. This ensures that our result doesn't grow larger by mistake and stays at the 32-bit size we need.
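
As a minimal sketch of masking (the name `MASK32` is my own and not part of the task): a value with its top bit set is shifted left, which needs 33 bits in Python, and the mask then trims it back to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # 32 ones: keeps only the lowest 32 bits of a value

x = 0x80000001             # bit 31 and bit 0 are set
shifted = x << 1           # Python ints are unbounded, so this now needs 33 bits
masked = shifted & MASK32  # discard everything above bit 31

print(hex(shifted))  # 0x100000002
print(hex(masked))   # 0x2
```

Note that the bit shifted out of position 31 is simply gone after masking. A rotation has to bring it back in at the other end, which is what the right shift in the rotation function is for.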

In [1]:
# Masking examples here
x = 0b1100

The final step to tie this together is bitwise operations: the classic AND, OR, NOT and XOR. These let us combine and select the individual bits of our values.

In [2]:
# Examples of bitwise operations
x = 0b1100
y = 0b1010
print(bin(x & y)) # Bitwise AND
print(bin(x | y)) # Bitwise OR
print(bin(x ^ y)) # Bitwise XOR
print(bin(~x)) # Bitwise NOT

0b1000
0b1110
0b110
-0b1101


In the end our rotation function looks like the following. It shifts the bits left by n positions with (x << n). The bits that would fall off the top during that shift are recovered with (x >> (32 - n)), which moves them down to the bottom of the word. We put an OR operator between these two to combine them. Lastly, we AND with the mask 0xFFFFFFFF to ensure the result stays at 32 bits.

In [3]:
import numpy as np
import math

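def rotl(x, n=1):
    n %= 32 # Rotating by 32 is a full turn, so only n mod 32 matters (this also avoids a negative shift for n > 32).
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF # << shifts the bits to left, >> brings the overflowed bits back round, & keeps 32 bits.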

x = 0b1100
print(f'rotl(0b1100, 2): {bin(rotl(x, 2))}')
print(f'rotl(0b1100, 3): {bin(rotl(x, 3))}')

rotl(0b1100, 2): 0b110000
rotl(0b1100, 3): 0b1100000


### Demonstrating the use of rotl(x, n=1)
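
The small values used so far never reach bit 31, so they look the same as a plain shift. As a sketch of the wrap-around itself (the function is repeated here in its simplest form so the snippet runs on its own; the test value is my own choice), a value with bit 31 set shows the top bit reappearing at the bottom:

```python
def rotl(x, n=1):
    # Repeated here so this snippet is self-contained.
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

x = 0x80000001  # bit 31 and bit 0 are set

# Bit 31 wraps round to bit 0, and the old bit 0 moves up to bit 1.
print(f'{rotl(x, 1):032b}')  # 00000000000000000000000000000011
print(hex(rotl(x, 4)))       # 0x18
```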

## 2. The function rotr(x, n=1)

In [4]:
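def rotr(x, n=1):
    n %= 32 # Rotating by 32 is a full turn, so only n mod 32 matters (this also avoids a negative shift for n > 32).
    return (x >> n) | ((x << (32 - n)) & 0xFFFFFFFF) # >> shifts right, << moves the bits that fell off the bottom up to the top, & keeps 32 bits.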

x = 0b1100

# Test the function.
print(f'rotr(0b1100, 2): {bin(rotr(x, 2))}')

rotr(0b1100, 2): 0b11
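
The example above shifts out two 0 bits, so nothing visibly wraps. As a sketch with a value whose low bits are set (both functions are repeated in their simplest form so the snippet runs on its own), the bits that fall off the bottom reappear at the top, and rotating left by the same amount restores the original:

```python
def rotl(x, n=1):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def rotr(x, n=1):
    return (x >> n) | ((x << (32 - n)) & 0xFFFFFFFF)

x = 0b1101
r = rotr(x, 2)

# The low bits '01' wrap round to positions 31 and 30.
print(f'{r:032b}')      # 01000000000000000000000000000011
print(rotl(r, 2) == x)  # True: rotl undoes rotr
```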


## 3. The function ch(x, y, z)

For this function we need to make use of bit selection logic using bitwise operators on our chosen binary values. In this particular implementation: 

- For each bit position of x being 1, we choose the associated or corresponding bit from y.
- For each bit position of x being 0, we choose the corresponding bit from z.

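Let me further break down how to accomplish this:
- Using (x & y) will take bits from y where x has 1s.
- Using (~x & z) will take bits from z where x has 0s.

These two results can never have a 1 in the same position, so combining them with XOR (or equally with OR) gives the final value.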

In [5]:
def ch(x, y, z): # Choose bits from y or z, depending on the bits of x.
    return (x & y) ^ (~x & z) # & is bitwise AND, ^ is bitwise XOR, ~ is bitwise NOT. So this returns (x AND y) XOR ((NOT x) AND z).

x = 0b1100
y = 0b1010
z = 0b1001

print(f'ch(0b1100, 0b1010, 0b1001): {bin(ch(x, y, z))}')

ch(0b1100, 0b1010, 0b1001): 0b1001
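
One way to test the function is to compare it against a slower version that literally follows the description, one bit position at a time. This is only a sketch: `ch_reference` and its `width` parameter are hypothetical names of my own.

```python
def ch(x, y, z):
    return (x & y) ^ (~x & z)

def ch_reference(x, y, z, width=32):
    # Hypothetical helper: build the result one bit at a time.
    result = 0
    for i in range(width):
        source = y if (x >> i) & 1 else z  # x picks y when its bit is 1, z otherwise
        result |= ((source >> i) & 1) << i
    return result

x, y, z = 0b1100, 0b1010, 0b1001
print(bin(ch(x, y, z)))                           # 0b1001
print(ch(x, y, z) == ch_reference(x, y, z))       # True
print(ch(x, y, z) == ((x & y) | (~x & z)))        # True: OR gives the same result as XOR here
```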


## 4. The function maj(x, y, z)

In [6]:
def maj(x, y, z): # Takes a majority vote of x, y, z.
    return (x & y) ^ (x & z) ^ (y & z) # ^ means XOR. So this returns x AND y XOR x AND z XOR y AND z.

x = 0b1100
y = 0b1010
z = 0b1001

x1 = 0b1100
y1 = 0b1010
z1 = 0b1010

# Examples
print(f'maj(0b1100, 0b1010, 0b1001): {bin(maj(x, y, z))}')
print(f'maj(0b1100, 0b1010, 0b1010): {bin(maj(x1, y1, z1))}')

maj(0b1100, 0b1010, 0b1001): 0b1000
maj(0b1100, 0b1010, 0b1010): 0b1010
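
The XOR form works because of how many of the three pairs agree: if exactly two inputs have a 1 in a position, exactly one pair ANDs to 1; if all three do, three pairs AND to 1 and 1 ^ 1 ^ 1 is still 1; with one or none, every pair is 0. As a sketch of a test, the function can be compared with a version that counts the 1s in each position (`maj_by_counting` is a hypothetical helper of my own):

```python
def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)

def maj_by_counting(x, y, z, width=32):
    # Hypothetical helper: a bit is set when at least two inputs have it set.
    result = 0
    for i in range(width):
        ones = ((x >> i) & 1) + ((y >> i) & 1) + ((z >> i) & 1)
        if ones >= 2:
            result |= 1 << i
    return result

print(bin(maj_by_counting(0b1100, 0b1010, 0b1001)))  # 0b1000
print(maj(0b1100, 0b1010, 0b1010) == maj_by_counting(0b1100, 0b1010, 0b1010))  # True
```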


## Task 2: Hash Functions
The following hash function is from The C Programming Language by Brian Kernighan and Dennis Ritchie.
Convert it to Python, test it, and suggest why the values 31 and 101 are used.

unsigned hash(char *s) {
    unsigned hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = *s + 31 * hashval;
    return hashval % 101;
}

I'll now convert this hash function from C to Python. This requires a few changes to the code. We don't need to declare variable types, and while the C version uses a pointer and checks for the null terminator, Python makes this much simpler as we can iterate directly over the string. The hash function being converted is from pages 128-129 of [The C Programming Language](https://seriouscomputerist.atariverse.com/media/pdf/book/C%20Programming%20Language%20-%202nd%20Edition%20(OCR).pdf) by Brian Kernighan and Dennis Ritchie.

In [7]:
def Hash_string(s): # No need to declare types or use a pointer, just take the string as input.
    hashval = 0 # No 'unsigned' in Python; integers grow as needed rather than wrapping around.
    # Iterate directly over string.
    for c in s:
        hashval = ord(c) + 31 * hashval # ord() returns the character's code point (its ASCII value for ASCII text). 31 is a prime multiplier.
    return hashval % 101 # Modulus is unchanged, 101 is a prime table size, which helps reduce collisions.

The final conversion looks like this. Rather than dereferencing the character pointer as in C, we can use Python's [ord() function](https://docs.python.org/3.4/library/functions.html#ord) to get the [ASCII value](https://www.ascii-code.com/) of a character. The modulus applied to hashval needs no changes.

The numbers 31 and 101 are both prime. As far as I know, multiplying by an odd prime such as 31 mixes each character into the running value so that similar strings tend to land on different hash values, and 31 * hashval is cheap to compute because it equals (hashval << 5) - hashval. The value 101 is the number of slots in the hash table (HASHSIZE in the book). A prime table size tends to spread the remainders more evenly across the slots, which reduces collisions.

In [8]:
# Testing Hash function
print(f'Hash_string("hello"): {Hash_string("hello")}')
print(f'Hash_string("world"): {Hash_string("world")}')
print(f'Hash_string("hello world"): {Hash_string("hello world")}')

Hash_string("hello"): 17
Hash_string("world"): 34
Hash_string("hello world"): 13
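
One difference the conversion glosses over: in C, `unsigned` arithmetic wraps around once the value no longer fits, whereas Python integers keep growing. For short strings both give the same answer, but once a string is longer than about six characters the running value can pass 2^32 and the two versions can disagree. A minimal sketch, assuming a platform where `unsigned` is 32 bits wide; the name `hash_string_c_like` and the `table_size` parameter are my own:

```python
def hash_string_c_like(s, table_size=101):
    # Hypothetical variant: wrap at 32 bits after every step, the way a
    # 32-bit C unsigned int would.
    hashval = 0
    for c in s:
        hashval = (ord(c) + 31 * hashval) & 0xFFFFFFFF
    return hashval % table_size

print(hash_string_c_like("hello"))  # 17, no overflow occurs for this short string
print(hash_string_c_like("world"))  # 34

# 31 * h can be computed with a shift and a subtraction.
h = 99162322
print(31 * h == (h << 5) - h)       # True
```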


## Task 3: SHA256

Write a Python function that calculates the SHA256 padding for a given file.  
The function should take a file path as input.  
It should print, in hex, the padding that would be applied to it.  
The [specification](https://doi.org/10.6028/NIST.FIPS.180-4) states that the following should be appended to a message:  

- a `1` bit;
- enough `0` bits so the length in bits of padded message is the smallest possible multiple of 512;
- the length in bits of the original input as a big-endian 64-bit unsigned integer.

The example in the specification is a file containing the three bytes `abc`:  

```python
01100001 01100010 01100011
```

The output would be:  

```python
80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 18
```

So for this task, we need to calculate the SHA256 padding for a file whose path is supplied to the function as input, following the [NIST FIPS 180-4 specification](https://doi.org/10.6028/NIST.FIPS.180-4). In the end I'll return the padding as a collection of bytes.

We first need to read in the file as a collection of bytes. This is fairly easy and self-explanatory to implement. Right after that we calculate the original content length in bits.

In [9]:

# Create a file called file.txt with our three bytes in it for example purposes.
file = 'file.txt'

with open(file, 'w') as f: # For testing purposes, write some text to a file.
    f.write('abc')

# Read in the file
with open(file, 'rb') as f:
    data = f.read()

# Calculate length in bits of the file
length = len(data) * 8

print(f'Length of file in bits: {length}')

Length of file in bits: 24


Following the length calculation, we apply the padding as the SHA-256 specification describes. First off, I'll add a single '1' bit. Since we work in whole bytes, this is the byte 0x80 (binary 10000000): the '1' bit followed by the first seven '0' bits of padding. We then need to add enough '0' bits so that the total length (the original content, the padding and the 64-bit length field) is a multiple of 512 bits.

In [10]:
### Example Code here


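We use the calculation padding_zeros = 56 - ((len(data) + 1) % 64), working in bytes, to ensure exactly 64 bits (or 8 bytes) are left at the end of the final 64-byte block for the length field. If the result is negative, the length field no longer fits in the current block, so we add 64 to spill into one more block.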
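
To see how the calculation behaves around the block boundary, here is a small sketch that applies it to a few message lengths in bytes. The helper name `zero_padding_bytes` and the chosen lengths are my own; the formula is the one described above.

```python
def zero_padding_bytes(message_length):
    # Number of 0x00 bytes between the 0x80 byte and the 8-byte length field.
    zeros = 56 - ((message_length + 1) % 64)
    if zeros < 0:
        zeros += 64  # the length field no longer fits, so use one more block
    return zeros

for n in (3, 55, 56, 64):
    total = n + 1 + zero_padding_bytes(n) + 8  # message + 0x80 + zeros + length
    print(n, zero_padding_bytes(n), total)
# 3 52 64
# 55 0 64
# 56 63 128
# 64 55 128
```

A 55-byte message is the largest that still fits in a single block. At 56 bytes the 0x80 byte uses up the last free position before the length field, so a whole extra block is needed.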

In [11]:
### Example Code here

Following this, we append the original length of the file, in bits, as a 64-bit big-endian unsigned integer.

In [12]:
### Final Sha256 Method

# Calculate the SHA256 padding for the contents of a given file.
def SHA256_padding(file):
    # Read in the file
    with open(file, 'rb') as f:
        data = f.read()

    # Calculate length in bits of the file, multiply by 8 to get bits as there are 8 bits in a byte.
    original_length = len(data) * 8

    # Add a '1' bit to the end of the data. With seven '0' bits after it, this is the byte 0x80 (binary 10000000).
    padding = bytearray([0x80])

    # Add enough '0' bits so the total length is 448 mod 512 bits.
    # In bytes: (original length + 1 + zero bytes) % 64 = 56, which leaves 8 bytes for the length field.
    padding_zeros = 56 - ((len(data) + 1) % 64)
    # If we are already past the 56-byte mark, the length field no longer fits, so spill into another block.
    if padding_zeros < 0:
        padding_zeros += 64

    padding.extend([0] * padding_zeros)
    # Add original length of file in bits, this must be a 64-bit big-endian unsigned integer.
    padding.extend(original_length.to_bytes(8, byteorder='big'))


    # Print this padding in hexadecimal. 
    print("SHA256 padding in hexadecimal format: ")
    for i, byte in enumerate(padding):
        print(f'{byte:02x}', end=' ')
        if (i + 1) % 25 == 0: # Print 25 bytes per line for simplicity and readability.
            print()
    if len(padding) % 25 != 0:
        print() # After the loop, end the last line if it is not a full 25 bytes.
    
    return padding # Return the padding at the end. 

## Example explanation

In [13]:
### Example usage with a test file. 
def test_sha256_padding():
    # Create a test file with "abc"
    with open("test_file.txt", "wb") as f:
        f.write(b"abc")
    
    # Calculate and print the padding
    padding = SHA256_padding("test_file.txt")
    return padding

# Test the function
padding = test_sha256_padding()


SHA256 padding in hexadecimal format: 
80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 18 


## Task 4: Prime Numbers

Calculate the first 100 prime numbers using two different algorithms.  
Any algorithms that are well-established and work correctly are okay to use.  
Explain how the algorithms work.

Insert links here for each algorithm

## Division with square roots
The basic process for this algorithm is to check whether a number n is divisible by any smaller number greater than 1. We only need to try divisors up to the square root of n because:
- If n = a * b, then at least one of a or b must be less than or equal to the square root of n, so any composite n has a divisor in that range.
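
The square root bound can be seen by listing factor pairs. In this sketch (`factor_pairs` is a hypothetical helper of my own), every pair has its smaller member at or below the square root, so searching up to the square root finds every pair:

```python
import math

def factor_pairs(n):
    # Hypothetical helper: every pair (a, b) with a * b == n and a <= b.
    return [(a, n // a) for a in range(1, math.isqrt(n) + 1) if n % a == 0]

print(factor_pairs(36))  # [(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)]
print(factor_pairs(37))  # [(1, 37)]
```

A number is prime exactly when its only pair is (1, n), as with 37 here.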




In [14]:
# Insert edge cases here

We then proceed to check each number in sequence, adding it to our list if it's a prime and skipping it if it isn't. We continue this process until we reach 100 prime numbers.

In [15]:
# Insert main while loop here

In [16]:
import numpy as np
import matplotlib.pyplot as plt
import time

def test_prime(n):
    # Test if a number is prime by checking if it is divisible by any number up to its square root. n is the number to test.

    # We only need to test divisors of the form "6k±1" since all primes greater than 3 are of this form. 
    # In order to do this, we must handle the edge cases of 2 and 3 separately. Once we've ruled out multiples of 2 and 3, only 6k±1 candidates remain.
    if n <= 1: # Check if n is less than or equal to 1, if so, return False.
        return False
    if n <= 3: # Check if n is less than or equal to 3, if so, return True.
        return True
    if n % 2 == 0 or n % 3 == 0: # Check if n is divisible by 2 or 3, if so, return False.
        return False
    
    # Check all numbers of the form "6k±1" up to the square root of n.
    i = 5 # Start at 5, since we've already checked 2 and 3.
    while i * i <= n: # Check if i squared is less than or equal to n.
        if n % i == 0 or n % (i + 2) == 0: # Check if n can be divided by i or i + 2, return False if so.
            return False
        i += 6 # Increment i by 6, since we're checking numbers of the "6k±1" form.
    return True

    
    

Here is a simple function that keeps running that test on successive numbers until it has found as many primes as the caller asks for.

In [17]:
def find_primes(count):
    # Find the first x number of primes based on the count given.
    primes = [] # Create an empty list to store our found primes.
    num = 2 # Start at 2, the first prime number.

    # Loop until we have found the desired number of primes.
    while len(primes) < count:
        if test_prime(num):
            primes.append(num)
        num += 1

    return primes

### Testing with Examples

In [18]:
primes = find_primes(100)

print(primes)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541]


## Sieve Algorithm
The previous method works overall, but it's quite a brute force approach to the issue. Instead we could explore using the Sieve of Eratosthenes (reference links here). Instead of testing every individual number as we progress through sequentially, we can eliminate the composite numbers systematically, allowing us to whittle the list down to the 100 primes we need. 

We start with a list of numbers, from a minimum of 2 up to a maximum of 1000. That limit is comfortably large enough here, as the 100th prime is 541.

At first, we'll assume all numbers are primes. Then as we loop through them, we mark all the multiples of each prime as false, meaning not prime. Any numbers left unmarked at the end are primes. We only need to sieve with numbers up to the square root of the limit, because any composite below the limit has a prime factor no larger than that square root and will already have been marked by it. Additionally, each composite is marked at most once per distinct prime factor, rather than being tested against many candidate divisors as in the previous algorithm.

In [19]:
def findPrimesUsingSieve(count):
    # Find the first x number of primes based on the count given.
    limit = 1000 # Set a limit for the sieve.

    # Initialize the sieve list.
    sieve = [True] * (limit + 1) # Create a list of True values, one for each number up to the limit.
    # Set the first two values to False, since they are not prime.
    sieve[0] = False 
    sieve[1] = False

    # Loop through the sieve and set all multiples of each number to False.
    for i in range(2, int(np.sqrt(limit)) + 1): # Loop through up to square root of limit.
        if sieve[i]: # If the number is prime.
            for j in range(i * i, limit + 1, i): # Loop through all multiples of the number.
                sieve[j] = False # Set the multiple to False.

    # Loop through the sieve and add all prime numbers to the list.
    primes = [] # Create empty list to store found primes.
    for i in range(limit + 1):
        if sieve[i]: # If number is prime.
            primes.append(i) # Add number to list.
            if len(primes) == count: # Once we reach count, break loop.
                break

    return primes


print(findPrimesUsingSieve(100))

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541]


In [20]:
# Compare the results to ensure they match
print(f"\nBoth algorithms produce the same results: {find_primes(100) == findPrimesUsingSieve(100)}")


Both algorithms produce the same results: True
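
One limitation of the fixed limit of 1000 is that there are only 168 primes below it, so asking the sieve function for more than that would quietly return a shorter list. A possible way round this, sketched here under my own naming (`first_primes_with_growing_sieve` and its starting `limit` are assumptions, not part of the task), is to double the limit and sieve again until enough primes have been found:

```python
def first_primes_with_growing_sieve(count, limit=100):
    # Hypothetical variant: double the sieve size until it holds enough primes.
    while True:
        sieve = [True] * (limit + 1)
        sieve[0] = sieve[1] = False
        for i in range(2, int(limit ** 0.5) + 1):
            if sieve[i]:
                for j in range(i * i, limit + 1, i):
                    sieve[j] = False  # mark every multiple of the prime i
        primes = [i for i, is_prime in enumerate(sieve) if is_prime]
        if len(primes) >= count:
            return primes[:count]
        limit *= 2  # not enough primes yet, so try a larger range

primes = first_primes_with_growing_sieve(100)
print(len(primes), primes[-1])  # 100 541
```

Re-sieving from scratch repeats some work, but because the limit doubles each time the total cost stays within a small constant factor of the final sieve.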


It's worth noting that the sieve should generally be faster when many primes are needed, although I haven't timed the two here. This is because it eliminates many composite numbers with each pass of its inner loop, instead of testing each number individually. It marks off all the multiples of the known primes, avoiding repeated division. The trade-off is that the sieve works across an entire range of numbers at once rather than one number at a time, so it needs an upper limit chosen in advance and memory proportional to that limit.

## Task 5: Roots
Calculate the first 32 bits of the fractional part of the square roots of the first 100 prime numbers.

Our aim with this task is to take the first 100 prime numbers (which we gathered in the previous task) and calculate the first 32 bits of the fractional part of each of their square roots. This is the same way the SHA-256 [specification](https://doi.org/10.6028/NIST.FIPS.180-4) derives its initial hash values, which come from the square roots of the first eight primes.

### Square Roots
First off, let's get the square root of each prime number. To do this in Python, you raise the prime to the power of 0.5. As shown below, for example purposes I've used 7 as it's a small, familiar prime. Using the exponentiation operator with 0.5 gives us the square root, in this case 2.6457513110645907.

In [21]:
prime = 7
root = prime ** 0.5 # Square root is written as a power of 0.5 in Python.
print(f'root: {root}')

root: 2.6457513110645907


### Fractional Parts
After this we need to extract the fractional part of our square root. The fractional part is (as the name suggests) everything after the decimal point. So for the square root of 7, 2.6457513110645907, it is 0.6457513110645907. To extract it in Python, I subtract math.floor(root) from the root, which leaves only the fractional part.

In [22]:
frac = root - math.floor(root) # Subtract the floor of the root from the root to get the fractional part.
print(f'frac: {frac}')

frac: 0.6457513110645907


### 32 Bits Representation
Next up is where we dive into bits territory. We need to represent our fractional part (such as 0.6457513110645907) as a 32-bit integer (2773480762 in this case). We can do this by multiplying the fractional part by 2 to the power of 32. This effectively shifts the binary point 32 places to the right, so the first 32 bits of the fraction end up in the integer part.

Lastly, I'll use int() to truncate the result, discarding whatever remains after the point.

In [23]:
frac = frac * (2 ** 32) # Multiply the fractional part by 2^32 to get the fractional part in integer form.
print(f'frac: {frac}')

# Convert to integer through truncation so we only have the integer part.
bits = int(frac)
print(f'bits: {bits}')

frac: 2773480762.37154
bits: 2773480762


### Find Square Root - Method
Now that we've worked through this algorithm step by step, I'll combine the steps into a single function.

In [24]:
def findSquareRoot(prime):
    root = prime ** 0.5 # Square root is written as a power of 0.5 in Python.

    frac = root - math.floor(root) # Subtract the floor of the root from the root to get the fractional part.

    frac = frac * (2 ** 32) # Multiply the fractional part by 2^32 to get the fractional part in integer form.
    
    bits = int(frac) # Convert to integer through truncation so we only have the integer part.
    return bits
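
The floating-point approach is accurate enough here because a double carries about 53 significant bits, which leaves plenty for a 32-bit fraction when the roots are this small. As a cross-check that avoids floats entirely, here is a sketch using `math.isqrt` (Python 3.8+). It relies on the fact that the integer square root of n * 2^64 equals the floor of sqrt(n) * 2^32, so its low 32 bits are the fractional bits. The name `frac_sqrt_bits_exact` is my own.

```python
import math

def frac_sqrt_bits_exact(n):
    # Hypothetical integer-only variant: isqrt(n * 2**64) is
    # floor(sqrt(n) * 2**32); masking keeps the 32 fractional bits.
    return math.isqrt(n << 64) & 0xFFFFFFFF

print(frac_sqrt_bits_exact(7))       # 2773480762
print(hex(frac_sqrt_bits_exact(7)))  # 0xa54ff53a
```

The value for 7 matches the result of the step-by-step calculation above.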


### The Final Product
Now I'm going to feed the primes calculated in the previous task (using one of my two algorithms) into the function we've developed and collect the results.

In [25]:
primes = findPrimesUsingSieve(100) # Find the first 100 primes using the sieve method, alternatively I could use the other method.

# calculate fractional parts of the square roots
frac32Results = []
for prime in primes:
    bits = findSquareRoot(prime)
    frac32Results.append(bits)

# Print results
print("First 32 bits of fractional parts of square roots of first 100 primes:")
for prime, bits in zip(primes, frac32Results):
    print(f"{prime:6} → {bits:032b}")

First 32 bits of fractional parts of square roots of first 100 primes:
     2 → 01101010000010011110011001100111
     3 → 10111011011001111010111010000101
     5 → 00111100011011101111001101110010
     7 → 10100101010011111111010100111010
    11 → 01010001000011100101001001111111
    13 → 10011011000001010110100010001100
    17 → 00011111100000111101100110101011
    19 → 01011011111000001100110100011001
    23 → 11001011101110111001110101011101
    29 → 01100010100110100010100100101010
    31 → 10010001010110010000000101011010
    37 → 00010101001011111110110011011000
    41 → 01100111001100110010011001100111
    43 → 10001110101101000100101010000111
    47 → 11011011000011000010111000001101
    53 → 01000111101101010100100000011101
    59 → 10101110010111111001000101010110
    61 → 11001111011011001000010111010011
    67 → 00101111011100110100011101111101
    71 → 01101101000110000010011011001010
    73 → 10001011010000111101010001010111
    79 → 11100011011000001011010110010110
    8

## Task 6: Proof of Work
Find the word(s) in the English language with the greatest number of 0 bits at the beginning of their SHA256 hash digest.
Include proof that any word you list is in at least one English dictionary.

https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt



### Download English words
I decided to acquire my English words from a common [GitHub Repository](https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt). This one compiles words from various dictionaries into one list, so it seemed succinct and sufficiently inclusive for the task at hand. With this list, we should have a broad enough vocabulary of words to check over. The function below downloads the word list from the repository if it isn't already saved locally, then reads it in as a list of lowercase words.

In [26]:
import hashlib
import requests
import os
import time
from collections import defaultdict

def getEnglishDict():
    englishDict = "words_alpha.txt"

    # Download the word list if it doesn't exist
    if not os.path.exists(englishDict):
        print("Downloading English Dict...")
        url = "https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt"
        response = requests.get(url)
        with open(englishDict, "w") as f:
            f.write(response.text)
        print("Download complete.")
    
    # Read the word list
    with open(englishDict, "r") as f:
        words = [line.strip().lower() for line in f if line.strip()]
    
    return words
    
print (getEnglishDict())




### Count Leading Zeros in a Hash
To do this we need to look at the binary version of each hash, going from left to right, and count the zeros until we hit the first 1. There's a little work before this, namely calculating the hash of a word and converting it to binary. It's vital that we strip the '0b' prefix that bin() adds, otherwise the count would stop at 'b', and that we pad the string back out to 256 characters with zfill(), because bin() drops leading zeros.

In [27]:
# Calculate SHA256 hash of word and count leading zeros amount.
def countLeadingZeros(word):

    # Calculate the hash (SHA256)
    hash = hashlib.sha256(word.encode())
    hash = hash.hexdigest()

    # Convert hex to string of binaries (we don't want '0b' on there)
    binary = bin(int(hash, 16))[2:].zfill(256)

    # Count up leading zeros
    leadingZeros = 0
    for bit in binary: 
        if bit == '0':
            leadingZeros += 1 # add up 1 if bit found to be a 0
        else:
            break

    return word, leadingZeros, hash
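
An alternative to building a 256-character string is to treat the digest as one large integer and use `int.bit_length()`, which gives the position of the highest 1 bit. This is only a sketch; the name `leading_zero_bits` is my own.

```python
import hashlib

def leading_zero_bits(word):
    # Hypothetical alternative to the string-based count above.
    digest = hashlib.sha256(word.encode()).digest()  # 32 raw bytes
    value = int.from_bytes(digest, 'big')            # the digest as a 256-bit integer
    return 256 - value.bit_length()                  # bits above the highest 1 are zeros

print(leading_zero_bits("abc"))  # 0, since the SHA256 digest of "abc" starts with 0xba
```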

### Calculate Word With Most Leading Zeros

In [28]:
# Find the words that have the most leading zeros in their SHA256 hash
def findWordsWithLeadingZeros(wordsList, resultsNum =10, maxWords=1000000):
    
    print(f"Searching through {maxWords} words for the top {resultsNum} with the most leading zeros in their SHA256 hash...")

    # Group words by their number of leading zeros.
    result = defaultdict(list)

    # Search words (limited to maxWords for efficiency)
    for word in wordsList[:maxWords]: # Loop through the list of words.
        word, zeros, hash = countLeadingZeros(word) # Count leading zeros for each word.
        result[zeros].append((word, zeros, hash)) # Keep every word; the best ones are picked out below.

    # Get the top resultsNum results, highest zero count first.
    topResults = []
    for zeros in sorted(result.keys(), reverse=True):
        topResults.extend(result[zeros])
        if len(topResults) >= resultsNum:
            break

    return topResults[:resultsNum]


### Execution

In [29]:
words = getEnglishDict() # Get the English dictionary words
print(f"Loaded {len(words)} English words") # Print number of words loaded.

# Find top words with most leading zeros in their SHA256 hash
top_words = findWordsWithLeadingZeros(words)

print("\nTop words with most leading zero bits in SHA256 hash:")
print("=" * 70) # For formatting purposes.
print(f"{'Word':<20} {'Zero Bits':<10} {'SHA256 Hash'}") # Print a header.
print("-" * 70) # For formatting purposes.

for word, zeros, hash in top_words: # Print the top words with leading zeros.
    print(f"{word:<20} {zeros:<10} {hash[:16]}...") # Print the word, zeros, and first 16 characters of the hash.



Loaded 370105 English words
Searching through 1000000 words for the top 10 with the most leading zeros in their SHA256 hash...

Top words with most leading zero bits in SHA256 hash:
Word                 Zero Bits  SHA256 Hash
----------------------------------------------------------------------
goaltenders          18         00002e68c9d3d1fc...
guilefulness         16         0000d79e1c6964e6...
mismatchment         16         0000bb6ede9f29a0...
duppa                15         0001d81b1189c6a3...
mountable            15         00019347bddcfe0c...
palala               15         0001c4dbc2962bc9...
suavely              15         00014bf337341909...
alpestral            14         0002e77383f5798c...
courteously          14         000325bc7eb7fe65...
epizoarian           14         00020028b3d9ada5...


### Proof Word is in a Dictionary

In [30]:
def showProof(word):

    print("\nDictionary proof:")
    print(f"Local: ")
    print(f"The word '{word}' is included in the words_alpha.txt dictionary.")
    print("Source: https://github.com/dwyl/english-words")

    # Online dictionary link as further proof if my txt isn't enough.
    print(f"Online: ")
    print(f"- Merriam-Webster: https://www.merriam-webster.com/dictionary/{word}")

# Verify the top word(s) are in a dictionary
for word, zeros, hash in top_words:
    showProof(word)


Dictionary proof:
Local: 
The word 'goaltenders' is included in the words_alpha.txt dictionary.
Source: https://github.com/dwyl/english-words
Online: 
- Merriam-Webster: https://www.merriam-webster.com/dictionary/goaltenders

Dictionary proof:
Local: 
The word 'guilefulness' is included in the words_alpha.txt dictionary.
Source: https://github.com/dwyl/english-words
Online: 
- Merriam-Webster: https://www.merriam-webster.com/dictionary/guilefulness

Dictionary proof:
Local: 
The word 'mismatchment' is included in the words_alpha.txt dictionary.
Source: https://github.com/dwyl/english-words
Online: 
- Merriam-Webster: https://www.merriam-webster.com/dictionary/mismatchment

Dictionary proof:
Local: 
The word 'duppa' is included in the words_alpha.txt dictionary.
Source: https://github.com/dwyl/english-words
Online: 
- Merriam-Webster: https://www.merriam-webster.com/dictionary/duppa

Dictionary proof:
Local: 
The word 'mountable' is included in the words_alpha.txt dictionary.
Source: h
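
Note that showProof prints its statement without checking anything, and a generated Merriam-Webster URL does not by itself show that the page has an entry. The local claim at least can be tested. A minimal sketch, where `is_in_word_list` is a hypothetical helper and `sample_words` stands in for the list returned by getEnglishDict():

```python
def is_in_word_list(word, words):
    # Hypothetical helper: True only if the exact lowercase word is in the list.
    return word.strip().lower() in set(words)

sample_words = ["goaltenders", "mountable", "suavely"]  # stand-in for the real word list

print(is_in_word_list("Goaltenders", sample_words))  # True
print(is_in_word_list("notaword", sample_words))     # False
```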

## Task 7: Turing Machines

Design a Turing Machine that adds 1 to a binary number on its tape.
The machine should start at the left-most non-blank symbol.
It should treat the right-most symbol as the least significant bit.

For example, suppose the following is on the tape at the start:

100111

Your Turing machine should leave the following on the tape when it completes:
101000
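
One possible design, sketched here as a small simulator rather than a definitive answer, uses two working states. In `scan` the head moves right until it passes the last symbol, then steps back onto the least significant bit. In `carry` it moves left, turning each 1 into 0, until it finds a 0 (or a blank, if the number was all 1s), writes a 1 there and halts. The state names, the `_` blank symbol and the `increment` function are all my own choices.

```python
from collections import defaultdict

BLANK = '_'

# (state, symbol read) -> (symbol to write, head movement, next state)
RULES = {
    ('scan', '0'): ('0', +1, 'scan'),
    ('scan', '1'): ('1', +1, 'scan'),
    ('scan', BLANK): (BLANK, -1, 'carry'),  # passed the end: step back to the last bit
    ('carry', '1'): ('0', -1, 'carry'),     # 1 + 1 = 0, carry continues left
    ('carry', '0'): ('1', 0, 'halt'),       # 0 + 1 = 1, carry absorbed
    ('carry', BLANK): ('1', 0, 'halt'),     # all 1s: the number grows by one digit
}

def increment(tape_input):
    # The tape is blank everywhere except where the input is written.
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    head, state = 0, 'scan'  # start at the left-most non-blank symbol
    while state != 'halt':
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += move
    used = sorted(i for i, symbol in tape.items() if symbol != BLANK)
    return ''.join(tape[i] for i in range(used[0], used[-1] + 1))

print(increment('100111'))  # 101000
print(increment('111'))     # 1000
```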

## Task 8: Computational Complexity
Implement bubble sort in Python, modifying it to count the number of comparisons made during sorting.
Use this function to sort all permutations of the list:
L = [1, 2, 3, 4, 5]

For each permutation, print the permutation itself followed by the number of comparisons required to sort it.

### Background

Bubble sort is a well-known algorithm across programming. For the unfamiliar, it works by stepping through a list repeatedly and comparing adjacent elements, swapping them if their order is incorrect. Its name derives from the fact that the larger elements slowly "bubble" to the end of the list as the passes sort it. Bubble sort isn't an efficient choice for larger datasets, since it needs on the order of n² comparisons in the worst case, but I find it easy to understand and implement. It sorts in place, so its memory overhead is minimal, and it's great for learning how sorting algorithms work.

### Array Traversal
The first step in a general implementation of bubble sort is the traversal process. We always start at the very beginning of our array and compare the first element with its neighbour to the right. If the first is greater than the second, they are swapped.

### Comparisons or "Bubbling" Process

After that is the "bubbling" step, which is a continuation of the above. We progress through each pair of adjacent elements until we reach the end, swapping each pair if necessary as we go. After one entire pass, the largest element of the list should be at the end, or else something has gone horribly wrong.

In [None]:
# Snippet code of bubbling up process
def bubbleUp(arr):
    arrLength = len(arr)
    
    print(f"Initial array: {arr}")
    
    # Single pass demonstration
    for j in range(arrLength-1):
        # Compare adjacent elements
        if arr[j] > arr[j+1]:
            # Swap if they are in the wrong order
            print(f"  Swap {arr[j]} and {arr[j+1]}")
            arr[j], arr[j+1] = arr[j+1], arr[j]
            print(f"  Array after swap: {arr}")
        else:
            print(f"  No swap needed for {arr[j]} and {arr[j+1]}")
    
    print(f"After one pass: {arr}")
    print(f"Notice how the largest element ({max(arr)}) has bubbled up to the end!")
    
    return arr

# Example usage
sample_array = [5, 1, 4, 2, 3]
bubbleUp(sample_array)

Initial array: [5, 1, 4, 2, 3]
  Swap 5 and 1
  Array after swap: [1, 5, 4, 2, 3]
  Swap 5 and 4
  Array after swap: [1, 4, 5, 2, 3]
  Swap 5 and 2
  Array after swap: [1, 4, 2, 5, 3]
  Swap 5 and 3
  Array after swap: [1, 4, 2, 3, 5]
After one pass: [1, 4, 2, 3, 5]
Notice how the largest element (5) has bubbled up to the end!


[1, 4, 2, 3, 5]

In [None]:
# Snippet code of bubbling down process
def bubbleDown(arr):
    arrLength = len(arr)
    
    print(f"Initial array: {arr}")
    
    # Single pass demonstration (starting from the end)
    for j in range(arrLength-1, 0, -1):
        # Compare adjacent elements
        if arr[j] < arr[j-1]:
            # Swap if they are in the wrong order
            print(f"  Swap {arr[j]} and {arr[j-1]}")
            arr[j], arr[j-1] = arr[j-1], arr[j]
            print(f"  Array after swap: {arr}")
        else:
            print(f"  No swap needed for {arr[j]} and {arr[j-1]}")
    
    print(f"After one pass: {arr}")
    print(f"Notice how the smallest element ({min(arr)}) has bubbled down to the beginning!")
    
    return arr

# Example usage
sample_array = [5, 3, 4, 2, 1]
bubbleDown(sample_array)

Initial array: [5, 3, 4, 2, 1]
  Swap 1 and 2
  Array after swap: [5, 3, 4, 1, 2]
  Swap 1 and 4
  Array after swap: [5, 3, 1, 4, 2]
  Swap 1 and 3
  Array after swap: [5, 1, 3, 4, 2]
  Swap 1 and 5
  Array after swap: [1, 5, 3, 4, 2]
After one pass: [1, 5, 3, 4, 2]
Notice how the smallest element (1) has bubbled down to the beginning!


[1, 5, 3, 4, 2]

### Subsequent Passes
From here things are fairly self-explanatory: the algorithm repeats the previous process on the remaining elements until nothing is out of position. Each pass ignores the elements that earlier passes have already moved to their final place at the end of the list.

In [None]:
# Snippet code of subsequent passes
def visualizePasses(arr):
    arrLength = len(arr)
    
    print(f"Initial array: {arr}")
    
    # Traverse through all array elements
    for i in range(arrLength):
        # Track if any swaps were made in this pass
        swapped = False
        
        print(f"\nPass {i+1}:")
        print(f"  Starting: {arr}")
        
        # Last i elements are already in place
        for j in range(0, arrLength-i-1):
            # Compare adjacent elements
            if arr[j] > arr[j+1]:
                # Swap if they are in the wrong order
                print(f"  Swap {arr[j]} and {arr[j+1]}")
                arr[j], arr[j+1] = arr[j+1], arr[j]
                swapped = True
        
        print(f"  After pass {i+1}: {arr}")
        
        # If no swapping occurred in this pass, array is sorted
        if not swapped:
            print(f"  No swaps in this pass - array is sorted!")
            break
    
    return arr

# Example usage
sample_array = [5, 1, 4, 2, 8]
visualizePasses(sample_array)

Initial array: [5, 1, 4, 2, 8]

Pass 1:
  Starting: [5, 1, 4, 2, 8]
  Swap 5 and 1
  Swap 5 and 4
  Swap 5 and 2
  After pass 1: [1, 4, 2, 5, 8]

Pass 2:
  Starting: [1, 4, 2, 5, 8]
  Swap 4 and 2
  After pass 2: [1, 2, 4, 5, 8]

Pass 3:
  Starting: [1, 2, 4, 5, 8]
  After pass 3: [1, 2, 4, 5, 8]
  No swaps in this pass - array is sorted!


[1, 2, 4, 5, 8]
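
Because each pass skips the elements already in place, the inner loop (range(0, arrLength-i-1) above) gets shorter every time. The sketch below is my own illustration rather than part of the original implementation: the hypothetical helper count_comparisons sorts a copy of the list without the early exit and records how many comparisons each pass makes. For a list of length n, the total comes to n(n-1)/2.

```python
def count_comparisons(arr):
    """Bubble sort a copy of arr (no early exit) and count comparisons per pass.

    count_comparisons is a hypothetical helper used only for illustration.
    """
    data = list(arr)  # Work on a copy so the caller's list is untouched
    n = len(data)
    per_pass = []

    for i in range(n - 1):
        comparisons = 0
        # The last i elements are already in their final positions
        for j in range(n - i - 1):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
        per_pass.append(comparisons)

    return data, per_pass


sorted_copy, per_pass = count_comparisons([5, 1, 4, 2, 8])
print(sorted_copy)    # [1, 2, 4, 5, 8]
print(per_pass)       # [4, 3, 2, 1]
print(sum(per_pass))  # 10, which equals 5 * 4 // 2
```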

### Optimization
A minor addition I made here is a swapped boolean, which lets the algorithm stop early if nothing was swapped during an entire pass through the array. A pass with no swaps means the array is already sorted, so we can end the process early.
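
To get a rough feel for what the early exit saves, the sketch below counts passes with and without it. The count_passes helper and its early_exit flag are hypothetical names introduced for this comparison only; the loop structure mirrors the bubble sort used in this notebook.

```python
def count_passes(arr, early_exit=True):
    """Bubble sort a copy of arr and return the number of passes performed.

    count_passes and early_exit are hypothetical, for illustration only.
    """
    data = list(arr)  # Work on a copy so the caller's list is untouched
    n = len(data)
    passes = 0

    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        passes += 1
        # Stop as soon as a full pass makes no swaps
        if early_exit and not swapped:
            break

    return passes


print(count_passes([1, 2, 3, 4, 5], early_exit=True))   # 1
print(count_passes([1, 2, 3, 4, 5], early_exit=False))  # 5
print(count_passes([5, 4, 3, 2, 1], early_exit=True))   # 5
```

An already sorted list finishes after a single pass, while a reverse-sorted list gains nothing from the check, since every pass except the last one performs at least one swap.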

### Final Product

In [5]:
def bubbleSort(arr): # Takes in a list of numbers to sort.
    arrLength = len(arr) # Get the length of array.

    # Print the initial array for debugging purposes.
    print(f"Initial array: {arr}, length: {arrLength} \n")

    # Traverse through the array, for each pass.
    # The outer loop counts the passes through the array; the inner loop performs the comparisons within each pass.
    # The outer loop runs at most arrLength times; the inner loop makes arrLength - i - 1 comparisons, skipping the i elements already in place.
    for i in range(arrLength):
        swapped = False # Set swapped to False at start of each pass.
        for j in range(0, arrLength - i - 1):
            if arr[j] > arr[j + 1]:
                # Swap if element found is greater than next element
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        # Print array state after a pass, for debugging purposes.
        print(f"Pass {i+1}: {arr}")
        
        # If no swapping occurred in pass, array is sorted
        if not swapped:
            break
            
    return arr

# Initial test of our bubble sort function
arr = [64, 34, 25, 12, 22, 11, 90]
print("Unsorted array:", arr)
sorted_arr = bubbleSort(arr)
print("\nSorted array:", sorted_arr)



Unsorted array: [64, 34, 25, 12, 22, 11, 90]
Initial array: [64, 34, 25, 12, 22, 11, 90], length: 7 

Pass 1: [34, 25, 12, 22, 11, 64, 90]
Pass 2: [25, 12, 22, 11, 34, 64, 90]
Pass 3: [12, 22, 11, 25, 34, 64, 90]
Pass 4: [12, 11, 22, 25, 34, 64, 90]
Pass 5: [11, 12, 22, 25, 34, 64, 90]
Pass 6: [11, 12, 22, 25, 34, 64, 90]

Sorted array: [11, 12, 22, 25, 34, 64, 90]


### Various examples
Here are some general examples of the bubble sort in operation, including already sorted and reverse-sorted arrays being passed into the function.

In [9]:
# Test with some examples
test_array1 = [64, 34, 25, 12, 22, 11, 90]
print("Sorting test_array1:")
bubbleSort(test_array1)

print("\nSorting test_array2:")
test_array2 = [5, 1, 4, 2, 8]
bubbleSort(test_array2)

# Test with an already sorted array
print("\nSorting already sorted array:")
test_array3 = [1, 2, 3, 4, 5]
bubbleSort(test_array3)

# Test with a reverse sorted array
print("\nSorting reverse sorted array:")
test_array4 = [5, 4, 3, 2, 1]
bubbleSort(test_array4)

Sorting test_array1:
Initial array: [64, 34, 25, 12, 22, 11, 90], length: 7 

Pass 1: [34, 25, 12, 22, 11, 64, 90]
Pass 2: [25, 12, 22, 11, 34, 64, 90]
Pass 3: [12, 22, 11, 25, 34, 64, 90]
Pass 4: [12, 11, 22, 25, 34, 64, 90]
Pass 5: [11, 12, 22, 25, 34, 64, 90]
Pass 6: [11, 12, 22, 25, 34, 64, 90]

Sorting test_array2:
Initial array: [5, 1, 4, 2, 8], length: 5 

Pass 1: [1, 4, 2, 5, 8]
Pass 2: [1, 2, 4, 5, 8]
Pass 3: [1, 2, 4, 5, 8]

Sorting already sorted array:
Initial array: [1, 2, 3, 4, 5], length: 5 

Pass 1: [1, 2, 3, 4, 5]

Sorting reverse sorted array:
Initial array: [5, 4, 3, 2, 1], length: 5 

Pass 1: [4, 3, 2, 1, 5]
Pass 2: [3, 2, 1, 4, 5]
Pass 3: [2, 1, 3, 4, 5]
Pass 4: [1, 2, 3, 4, 5]
Pass 5: [1, 2, 3, 4, 5]


[1, 2, 3, 4, 5]
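
The examples above are checked by eye. As a complement, here is a minimal sketch of an automated check that compares results against Python's built-in sorted(). It assumes a print-free variant, bubble_sort_quiet, which is a hypothetical name and not the bubbleSort function defined earlier; it uses the same algorithm but sorts a copy so the input is left untouched.

```python
import random


def bubble_sort_quiet(arr):
    """Return a sorted copy of arr using bubble sort, without printing.

    bubble_sort_quiet is a hypothetical, print-free variant for testing.
    """
    data = list(arr)
    n = len(data)
    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:
            break
    return data


random.seed(0)  # Fixed seed so the random cases are reproducible

# Edge cases: empty list, single element, duplicates, reverse order
cases = [[], [7], [2, 2, 1, 1], [5, 4, 3, 2, 1]]
# Plus 100 random lists of length 0 to 20
cases += [
    [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
    for _ in range(100)
]

for case in cases:
    assert bubble_sort_quiet(case) == sorted(case)

print(f"All {len(cases)} cases match sorted()")  # All 104 cases match sorted()
```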