# Classic Computer Science Problems in Python
## Section 1: Fibonacci Sequences

#### Fibonacci sequence recursively: 
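The base case (stopping point) is reached when n is less than 2. For n greater than or equal to 2, every call spawns two more calls, so the total number of calls grows exponentially with n (the depth of the call stack itself only grows linearly). This leads to poor performance at higher values of n.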

In [1]:
def fib(n: int) -> int:
    if n < 2:
        return n
    
    return fib(n - 2) + fib(n - 1)

test_val = 3
print('Starting from 0, Fibonacci to the {} position is {}'.format(test_val + 1, fib(test_val)))

Starting from 0, Fibonacci to the 4 position is 2
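
To make the growth concrete, here is a small sketch that counts how many calls the naive recursion makes. The helper name `fib_calls` is hypothetical and not part of the book's code; it mirrors the structure of `fib` but returns the number of calls instead of the Fibonacci value.

```python
def fib_calls(n: int) -> int:
    """Count the calls made by the naive recursive fib(n), including the first."""
    if n < 2:
        return 1  # a base-case call makes no further calls
    # one call for this frame, plus everything the two recursive calls trigger
    return 1 + fib_calls(n - 2) + fib_calls(n - 1)


for n in (5, 10, 20):
    print(n, fib_calls(n))
# 5 15
# 10 177
# 20 21891
```

Doubling n from 10 to 20 multiplies the work by more than a hundred, which is why the recursive version becomes impractical so quickly.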


#### Fibonacci with memoization: 
Stores each computed value in a dictionary, which starts out holding the base cases 0 and 1. Checking the dictionary and adding a new value both take constant time on average, so every Fibonacci number is computed only once, which leads to much better performance at higher values of n. Effectively 'prunes' the call tree down to values that haven't been seen before.

In [2]:
memo = {0:0, 1:1}

def fib_memo(n: int) -> int:
    # print(f'{n} is on the call stack')
    if n not in memo:
        memo[n] = fib_memo(n - 2) + fib_memo(n - 1)
    
    return memo[n]

memo_test = 50
print('Starting from 0, Fibonacci to the {} position is {}'.format(memo_test + 1, fib_memo(memo_test)))

Starting from 0, Fibonacci to the 51 position is 12586269025


#### Fibonacci with built-in caching decorator
Caches the results of function calls automatically. The decorator takes a maxsize argument that limits how many of the most recent results are kept; setting it to None lets the cache grow without limit.

In [3]:
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cache(n: int) -> int:
    if n < 2:
        return n
    
    return fib_cache(n - 2) + fib_cache(n - 1)

cache_test = 20
print('Starting from 0, Fibonacci to the {} position is {}'.format(cache_test + 1, fib_cache(cache_test)))

Starting from 0, Fibonacci to the 21 position is 6765
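
Functions wrapped by `lru_cache` also expose `cache_info()` and `cache_clear()`, which make it possible to see what the cache is doing. The sketch below redefines the function under the hypothetical name `fib_lru` so that it starts with an empty cache.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_lru(n: int) -> int:  # hypothetical name, same logic as fib_cache
    if n < 2:
        return n
    return fib_lru(n - 2) + fib_lru(n - 1)


result = fib_lru(20)
info = fib_lru.cache_info()
print(result)  # 6765
print(info)    # CacheInfo(hits=18, misses=21, maxsize=None, currsize=21)

# cache_clear() empties the cache, so the next call recomputes everything
fib_lru.cache_clear()
print(fib_lru.cache_info().currsize)  # 0
```

The 21 misses are the values 0 through 20 being computed once each; every other call is answered from the cache.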


#### Iterative approach to Fibonacci
Rather than working back from n to 0, this approach starts at 0 and builds up to the nth Fibonacci number. Time complexity here is O(n), since the loop runs once for each value between 1 and n.

In [4]:
def it_fib(n: int) -> int:
    if n == 0:
        return 0
    
    prev = 0
    proc = 1
    
    for _ in range(1, n):
        prev, proc = proc, prev+proc
        
    return proc

it_test = 20
print('Starting from 0, Fibonacci to the {} position is {}'.format(it_test + 1, it_fib(it_test)))

Starting from 0, Fibonacci to the 21 position is 6765


#### Generator approach to Fibonacci
Similar to the previous version, but uses yield to hand back each value as it is computed, without leaving the loop.

In [5]:
from typing import Iterator

def yield_fib(n: int) -> Iterator[int]:
    yield 0
    if n > 0:
        yield 1
    
    prev = 0
    proc = 1
    
    for _ in range(1, n):
        prev, proc = proc, prev+proc
        yield proc
        
yield_test = 20
yield_bucket = []
for i in yield_fib(yield_test):
    yield_bucket.append(i)
    
print(yield_bucket)

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
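
Because a generator only computes values on demand, it does not need an upper bound at all. As a sketch, the hypothetical `fib_stream` below yields Fibonacci numbers forever and leaves it to the caller to decide how many to take, here with `itertools.islice`.

```python
from itertools import islice
from typing import Iterator


def fib_stream() -> Iterator[int]:
    """Yield Fibonacci numbers indefinitely (hypothetical helper)."""
    prev, curr = 0, 1
    while True:
        yield prev
        prev, curr = curr, prev + curr


# islice stops pulling values after the first ten
first_ten = list(islice(fib_stream(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```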


## Exercise:
1.  Write another function that solves for element n in a Fibonacci sequence. Then, write unit tests that evaluate its correctness and performance relative to the other previous versions.

In [40]:
from typing import TypeVar, Generic, List, Set
T = TypeVar('T')

class CustomStack(Generic[T]):
    def __init__(self):
        self._container: List[T] = []
        self._item_set: Set[T] = set()
    
    def push(self, item: T) -> None:
        self._container.append(item)
        self._item_set.add(item)
#         print(self._container)
        
    @property
    def get_last(self) -> T:
        print(self._container[-1])
        return self._container[-1]
    
    def pop(self) -> T:
        return self._container.pop()
    
    def __repr__(self) -> str:
        return repr(self._container)
    
    def contains(self, item: T) -> bool:
#         print(item in self._item_set)
        return item in self._item_set
    

fib_stack = CustomStack()

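def custom_fib(n: int) -> int:
    # use the module-level fib_stack instance (the name Stack is not defined in this cell)
    # NOTE: contains() checks the values pushed so far, not which n has been solved,
    # so this attempt does not memoize by index and can return wrong results
    if not fib_stack.contains(n):
        if n < 2:
            fib_stack.push(n)
        else:
            fib_stack.push(custom_fib(n-1) + custom_fib(n-2))
        
    return fib_stack.get_last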
    


In [42]:
fib_stack = CustomStack()
custom_fib(5, fib_stack)
print(fib_stack)

TypeError: custom_fib() missing 1 required positional argument: 'Stack'
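
As one possible answer to the exercise, here is a sketch based on the fast-doubling identities F(2k) = F(k) * (2 * F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. The names `fib_pair`, `fast_fib`, and `reference_fib` are hypothetical; the reference function is a plain iterative version used only to check correctness.

```python
from typing import Tuple


def fib_pair(n: int) -> Tuple[int, int]:
    """Return (F(n), F(n + 1)) using the fast-doubling identities."""
    if n == 0:
        return 0, 1
    a, b = fib_pair(n // 2)   # a = F(k), b = F(k + 1) with k = n // 2
    c = a * (2 * b - a)       # F(2k)
    d = a * a + b * b         # F(2k + 1)
    if n % 2 == 0:
        return c, d
    return d, c + d           # shift by one position when n is odd


def fast_fib(n: int) -> int:
    return fib_pair(n)[0]


def reference_fib(n: int) -> int:
    """Plain iterative version used as the source of truth."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev


# correctness check against the reference for the first 200 values
for i in range(200):
    assert fast_fib(i) == reference_fib(i)

print(fast_fib(50))  # 12586269025
```

This version needs only about log2(n) recursive steps, so it could be timed against the other implementations with %timeit in the same way.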

### Results from different implementations
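Most efficient is the lru_cache decorated function, with the memoized version coming in close behind. The sizeable gap between the memoized versions and the iterative approach has a simple explanation: %timeit runs each statement many times, so after the first pass fib_memo and fib_cache only look up results that are already stored, while it_fib recomputes every value from scratch on each run. I wonder if there will be other opportunities to test these through this book.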


For more magic functions, run %lsmagic in a code cell.

In [12]:
fib_stack = CustomStack()
%timeit for i in range(5, 15): fib(i)
%timeit for i in range(5, 15): fib_memo(i)
%timeit for i in range(5, 15): fib_cache(i)
%timeit for i in range(5, 15): it_fib(i)
%timeit for i in range(5, 15): custom_fib(i)

459 µs ± 7.63 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.98 µs ± 24.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.56 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
8.51 µs ± 10.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
4 ms ± 6.05 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [10]:
fib_stack = CustomStack()
%timeit for i in range(50, 100): fib_memo(i)
%timeit for i in range(50, 100): fib_cache(i)
%timeit for i in range(50, 100): it_fib(i)
%timeit for i in range(20, 30): custom_fib(i)

8.36 µs ± 33.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
6.26 µs ± 7.27 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
230 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.5 s ± 14.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
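
Timings of cached functions depend heavily on whether the cache is already warm. The sketch below uses the standard `timeit` module to compare a cold call (cache cleared before each run) with a warm one; `fib_lru`, `cold_call`, and `warm_call` are hypothetical names, and the absolute numbers will differ from machine to machine.

```python
import timeit
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_lru(n: int) -> int:  # hypothetical name, same logic as fib_cache
    if n < 2:
        return n
    return fib_lru(n - 2) + fib_lru(n - 1)


def cold_call() -> int:
    fib_lru.cache_clear()  # forget all stored results so the call recomputes them
    return fib_lru(90)


def warm_call() -> int:
    return fib_lru(90)     # answered by a single cache lookup once populated


warm_call()  # populate the cache before timing the warm path
cold = timeit.timeit(cold_call, number=1000)
warm = timeit.timeit(warm_call, number=1000)
print('cold: {:.4f} s  warm: {:.4f} s'.format(cold, warm))
```

Clearing the cache inside the timed function is one way to measure the cost of actually computing the values rather than the cost of a dictionary lookup.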


## Section 2: Basic Compression

#### Helpful notes:
The sentinel value is a form of in-band data that makes it possible to detect the end of the data when no out-of-band data (such as an explicit size indication) is provided. The value should be selected in such a way that it is guaranteed to be distinct from all legal data values, since otherwise the presence of such values would prematurely signal the end of the data (the semipredicate problem). 


A sentinel value is sometimes known as an "Elephant in Cairo", due to a joke where this is used as a physical sentinel. In safe languages, most uses of sentinel values could be replaced with option types, which enforce explicit handling of the exceptional case. (https://en.wikipedia.org/wiki/Sentinel_value)


Every step shifts the bit string left by two bits, so the initial value of 1 always ends up as the highest-order bit. That leading 1 acts as the sentinel: it marks where the compressed data stops, and it keeps leading 'A' nucleotides (00) from disappearing as leading zeros. Packing each nucleotide into 2 bits instead of the 8 bits it takes in UTF-8 cuts the payload by roughly 75%.
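
A short sketch of the packing step on a four-letter gene makes the layout visible. The `CODES` table is a hypothetical shorthand for the if/elif chain used in the class.

```python
# 2-bit code for each nucleotide (hypothetical lookup table)
CODES = {'A': 0b00, 'C': 0b01, 'G': 0b10, 'T': 0b11}

packed = 1  # sentinel bit
for nucleotide in 'ACGT':
    packed = (packed << 2) | CODES[nucleotide]  # make room, then write 2 bits

print(bin(packed))          # 0b100011011
print(packed.bit_length())  # 9: one sentinel bit plus 4 * 2 data bits
```

Reading the binary string from the left: `1` is the sentinel, followed by `00` (A), `01` (C), `10` (G), and `11` (T).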

In [1]:
import sys

class CompressedGene:
    def __init__(self, gene: str) -> None:
        self._compress(gene)
        
    def _compress(self, gene: str):
        self.bit_string: int = 1    # sentinel
        for nucleotide in gene.upper():
            self.bit_string <<= 2    # shift bits left by 2
            if nucleotide == 'A':
                self.bit_string |= 0b00
            elif nucleotide == 'C':
                self.bit_string |= 0b01
            elif nucleotide == 'G':
                self.bit_string |= 0b10
            elif nucleotide == 'T':
                self.bit_string |= 0b11
            
            else:
                raise ValueError('Invalid Nucleotide: {}'.format(nucleotide))
                
    
    def decompress(self):
        bitstring_size = sys.getsizeof(self.bit_string)
        print('Size of compressed genes: {} bytes or {} bits'.format(bitstring_size, bitstring_size*8))
        
        gene: str = ''
        # Walk the bit string from the low-order end in steps of 2,
        # since each compressed character is 2 bits long;
        # bit_length() - 1 leaves out the sentinel bit
        for i in range(0, self.bit_string.bit_length()-1, 2):
            # shift right by i bits, then mask with 0b11
            # so that only the lowest 2 bits are kept
            bits: int = self.bit_string >> i & 0b11
            if bits == 0b00:
                gene += 'A'
            elif bits == 0b01:
                gene += 'C'
            elif bits == 0b10:
                gene += 'G'
            elif bits == 0b11:
                gene += 'T'
            else:
                raise ValueError('Invalid bits: {}'.format(bits))
        
        gene_size = sys.getsizeof(gene)
        print('Size of original string: {} bytes or {} bits'.format(gene_size, gene_size*8))
        
        size_difference = 1 - bitstring_size / gene_size
        print('Size savings between compressed and uncompressed data: {0:.0%}'.format(size_difference))

        # the loop reads the last nucleotide first, so gene is back to front;
        # reverse it before taking the first 50 characters
        return 'first 50 characters: ' + gene[::-1][:50] + '...'
    
    
    def __str__(self):
        return self.decompress()
                
                

In [2]:
# Savings vary with input length because sys.getsizeof includes fixed per-object overhead
test_string = 'TAGGATACCT' * 100
compressed = CompressedGene(test_string)
print(compressed)

Size of compressed genes: 292 bytes or 2336 bits
Size of original string: 1049 bytes or 8392 bits
Size savings between compressed and uncompressed data: 72%
first 50 characters: TAGGATACCTTAGGATACCTTAGGATACCTTAGGATACCTTAGGATACCT...
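
The measured saving of 72% is a little below the theoretical 75% because `sys.getsizeof` also counts Python's per-object overhead. As a sketch, comparing only the payload bits shows the expected ratio; the variable names here are illustrative and the packing loop is a compact stand-in for `_compress`.

```python
gene = 'TAGGATACCT' * 100

packed = 1  # sentinel bit
for nucleotide in gene:
    # 'ACGT'.index gives 0, 1, 2, 3, matching the 2-bit codes used above
    packed = (packed << 2) | 'ACGT'.index(nucleotide)

payload_bits = packed.bit_length()  # 2 bits per nucleotide plus the sentinel
raw_bits = 8 * len(gene.encode())   # 8 bits per character in UTF-8

print(payload_bits, raw_bits)       # 2001 8000
print('{:.1%}'.format(1 - payload_bits / raw_bits))  # 75.0%
```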


## Section 3: Basic Encryption

#### Helpful notes:
secrets.token_bytes([nbytes=None]):
- Return a random byte string containing nbytes number of bytes. If nbytes is None or not supplied, a reasonable default is used.


To be secure against brute-force attacks, tokens need to have sufficient randomness. Unfortunately, what is considered sufficient will necessarily increase as computers get more powerful and able to make more guesses in a shorter period. As of 2015, it is believed that 32 bytes (256 bits) of randomness is sufficient for the typical use-case expected for the secrets module.


In this example, we generate 1 byte for each unit of length and interpret the resulting bytes as a single int. When length == 1, the result falls in the range 0 to 255; when length == 2, it falls in the range 0 to 65535. In general, the largest possible value is 256**length - 1.
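
The bounds can be checked directly by converting the all-zero and all-one byte strings of each length, as in this small sketch:

```python
for length in (1, 2, 3):
    lowest = int.from_bytes(b'\x00' * length, 'big')   # every bit cleared
    highest = int.from_bytes(b'\xff' * length, 'big')  # every bit set
    print(length, lowest, highest)
    assert highest == 256 ** length - 1
# 1 0 255
# 2 0 65535
# 3 0 16777215
```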


In [11]:
from secrets import token_bytes
from typing import Tuple

def random_key(length:int) -> int:
    # generate random bytes equal to the input length
    tb: bytes = token_bytes(length)
    print('random token bytes: {}'.format(tb))
    
    output: int = int.from_bytes(tb, "big")
    print('integer output: {}'.format(output))
    
    return output

random_key(10)

random token bytes: b'\x07i\xc2\x9cf\xfeL\xa8\xe2\xce'
integer output: 35007496704409138160334


35007496704409138160334

#### Helpful notes:
The bitwise XOR operator (^) compares two values bit by bit (14 = 00001110, 9 = 00001001) and produces a new bit string (00001110 ^ 00001001 == 00000111) in which each bit is 1 if the compared bits are different and 0 if they are the same.
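
The property that makes XOR useful for encryption is that applying the same key twice gives back the original value. A minimal sketch, where `pad` is an arbitrary fixed value chosen only for illustration (a real key would be random):

```python
a, b = 14, 9
print(format(a, '08b'), format(b, '08b'), format(a ^ b, '08b'))
# 00001110 00001001 00000111

message = int.from_bytes(b'hello', 'big')
pad = 0x5A5A5A5A5A             # illustrative key, NOT random

scrambled = message ^ pad      # encrypt
recovered = scrambled ^ pad    # XOR with the same key undoes the first XOR
print(recovered.to_bytes(5, 'big'))  # b'hello'
```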

In [12]:
def encrypt(original: str) -> Tuple[int, int]:
    # Encodes the original data into a series of bytes
    original_bytes: bytes = original.encode()
        
    # Makes a random key with the same number of bytes as the original encoded string
    dummy: int = random_key(len(original_bytes))
        
    # Stores the int version of the original byte string into key variable
    original_key: int = int.from_bytes(original_bytes, "big")
    
    # XOR operation to scramble them bytes
    encrypted: int = original_key ^ dummy
        
    print('original string: {}'.format(original))
    print('encoded string: {}'.format(original_bytes))
    print('dummy: {}'.format(dummy))
    print('generated key: {}'.format(original_key))
    print('encrypted data: {}'.format(encrypted))
    
    return dummy, encrypted

encrypt('hello')

random token bytes: b'\xc9\xf6=R\xba'
integer output: 867419640506
original string: hello
encoded string: b'hello'
dummy: 867419640506
generated key: 448378203247
encrypted data: 693961309909


(867419640506, 693961309909)

#### Helpful Notes:
There are two arguments passed to int.from_bytes(). The first is which bytestring we should convert to an integer, the second argument is the endianness of those bytes. Endianness refers to the order of bytes (or sometimes bits) within a binary representation of a number. 


It can also be used more generally to refer to the internal ordering of any representation, such as the digits in a numeral system or the sections of a date. In English, numbers are written with their digits in big-endian order. Similarly, programming languages use big-endian digit ordering for numeric literals as well as big-endian language (“left” and “right”) for bit-shift operations, regardless of the endianness of the target architecture. This can lead to confusion when interacting with little-endian numbers.

As long as you pass the same endianness when encoding and decoding, there shouldn't be any problems. However, if you don't control both the encoding and decoding ends of the process, the ordering could potentially cause issues.
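
A two-byte example shows how much the byte order matters, and what happens when the two ends disagree:

```python
data = b'\x01\x00'

big = int.from_bytes(data, 'big')        # most significant byte first
little = int.from_bytes(data, 'little')  # least significant byte first
print(big, little)  # 256 1

# round trips work as long as both directions use the same byte order
assert big.to_bytes(2, 'big') == data
assert little.to_bytes(2, 'little') == data

# decoding with the other byte order silently swaps the bytes
mismatched = big.to_bytes(2, 'little')
print(mismatched)  # b'\x00\x01'
```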

In [13]:
def decrypt(key1: int, key2: int) -> str:
    decrypted: int = key1 ^ key2 
    # divide by 8 (bits back to bytes)
    # plus 7 to ensure we 'round up' to the nearest byte and avoid off-by-one errors
    temp: bytes = decrypted.to_bytes((decrypted.bit_length()+7) // 8, "big")
    return temp.decode()
    

In [15]:
key1, key2 = encrypt("the revolution will not be televised")
decrypted = decrypt(key1, key2)

print('Decrypted data: {}'.format(decrypted))

random token bytes: b'\x90\xcf<:\xd5d\x8c\x08.\xe8\x96%Jv\n\xe9\x81\x8f\x03\xaf\xf4\xfc\x136\xb4\xfa\x13\xe1.hW\\U\x9eD\x05'
integer output: 281316935784030283189264087979729345895439299940815990655419542239625333498667830494213
original string: the revolution will not be televised
encoded string: b'the revolution will not be televised'
dummy: 281316935784030283189264087979729345895439299940815990655419542239625333498667830494213
generated key: 226141798413000384005465770004076852290532832940813656133191763532329450882842565502308
encrypted data: 444198436630724350081609574688314157516555527571606231408245045349099864371328078127457
Decrypted data: the revolution will not be televised


## Section 4: Calculating Pi

#### Helpful notes:
You can calculate Pi with the Leibniz formula, which states that:
##### $$\pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \dots$$



In [3]:
def calculate_pi(n_terms: int) -> float:
    numerator: float = 4.0
    denominator: float = 1.0
    pi: float = 0.0
    operation: float = 1.0
        
    for _ in range(n_terms):
        pi += operation*(numerator/denominator)
        denominator += 2
        operation *= -1.0

    return pi
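
The Leibniz series converges slowly: the error after n terms is on the order of 1/n. The sketch below uses a hypothetical standalone function `leibniz_pi` to print the error against `math.pi` for a few term counts.

```python
import math


def leibniz_pi(n_terms: int) -> float:
    """Sum the first n_terms of 4/1 - 4/3 + 4/5 - ... (hypothetical helper)."""
    total = 0.0
    for k in range(n_terms):
        sign = 1.0 if k % 2 == 0 else -1.0  # terms alternate between + and -
        total += sign * 4.0 / (2 * k + 1)   # denominators are the odd numbers
    return total


for n in (10, 1000, 100000):
    approx = leibniz_pi(n)
    print(n, approx, abs(approx - math.pi))
```

Each factor of 100 in the number of terms buys roughly two more correct decimal places.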

## Section 5: Towers of Hanoi

#### Helpful notes:
The Tower of Hanoi (also called the Tower of Brahma or Lucas' Tower and sometimes pluralized) is a mathematical game or puzzle. It consists of three rods and a number of disks of different sizes, which can slide onto any rod. The puzzle starts with the disks in a neat stack in ascending order of size on one rod, the smallest at the top, thus making a conical shape.

The objective of the puzzle is to move the entire stack to another rod, obeying the following simple rules:

- Only one disk can be moved at a time.
- Each move consists of taking the upper disk from one of the stacks and placing it on top of another stack or on an empty rod.
- No larger disk may be placed on top of a smaller disk.

With 3 disks, the puzzle can be solved in 7 moves. The minimal number of moves required to solve a Tower of Hanoi puzzle is $2^n - 1$, where n is the number of disks.
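
The move count follows from the recursive strategy: moving n disks takes two moves of n - 1 disks plus one move of the largest disk. As a sketch, the hypothetical `count_moves` below solves the puzzle on plain lists (the end of each list is the top of the tower) and counts the moves.

```python
from typing import List


def count_moves(n: int, start: List[int], target: List[int], temp: List[int]) -> int:
    """Move n disks from start to target and return the number of moves made."""
    if n == 0:
        return 0
    moves = count_moves(n - 1, start, temp, target)   # clear the way
    target.append(start.pop())                        # move the largest of the n disks
    moves += 1
    moves += count_moves(n - 1, temp, target, start)  # stack the rest on top
    return moves


for discs in range(1, 6):
    a = list(range(discs, 0, -1))  # largest disk at the bottom, smallest on top
    b: List[int] = []
    c: List[int] = []
    moves = count_moves(discs, a, c, b)
    print(discs, moves)
    assert moves == 2 ** discs - 1
# 1 1
# 2 3
# 3 7
# 4 15
# 5 31
```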

Stacks are a good data structure to use on this problem - they are modeled on the concept of LIFO (last in first out) where the last element added to the stack is the first one to be removed / popped. Similar to the way that spells and instants stack and resolve in MTG, or a stack of plates / trays at a buffet.

In [139]:
from typing import TypeVar, Generic, List
# used to get type hints later on
T = TypeVar('T')

class Stack(Generic[T]):
    def __init__(self) -> None:
        self._container: List[T] = []
    
    def push(self, item: T) -> None:
        self._container.append(item)
        
    def pop(self) -> T:
        return self._container.pop()
    
    def __repr__(self) -> str:
        return repr(self._container)

In [156]:
num_discs: int = 3
tower_a: Stack[int] = Stack()
tower_b: Stack[int] = Stack()
tower_c: Stack[int] = Stack()
    
for i in range(1, num_discs + 1): 
    tower_a.push(i)

print('Tower A: {} \nTower B: {} \nTower C: {}'.format(tower_a, tower_b, tower_c))

Tower A: [1, 2, 3] 
Tower B: [] 
Tower C: []


In [157]:
# Move n-1 disks from start to temp using target
# Move 1 disk from start to target
# Move n-1 disks from temp to target using start

def hanoi(start_pile: Stack[int], target_pile: Stack[int], temp_pile: Stack[int], n: int, sig: str) -> None:
    print('\nN: {} \nFunction: {} '.format(n, sig))
    if n == 1:
        print('From: {} Temp: {} Target: {}'.format(start_pile, temp_pile, target_pile))
        print('Popping and pushing')
        target_pile.push(start_pile.pop())
        print('From: {} Temp: {} Target: {}'.format(start_pile, temp_pile, target_pile))
    else:
        hanoi(start_pile, temp_pile, target_pile, n-1, 'temp <-> target switch')
        hanoi(start_pile, target_pile, temp_pile, 1, 'base-case')
        hanoi(temp_pile, target_pile, start_pile, n-1, 'start <-> temp switch')
    
        
hanoi(tower_a, tower_c, tower_b, num_discs, 'starting')



N: 3 
Function: starting 

N: 2 
Function: temp <-> target switch 

N: 1 
Function: temp <-> target switch 
From: [1, 2, 3] Temp: [] Target: []
Popping and pushing
From: [1, 2] Temp: [] Target: [3]

N: 1 
Function: base-case 
From: [1, 2] Temp: [3] Target: []
Popping and pushing
From: [1] Temp: [3] Target: [2]

N: 1 
Function: start <-> temp switch 
From: [3] Temp: [1] Target: [2]
Popping and pushing
From: [] Temp: [1] Target: [2, 3]

N: 1 
Function: base-case 
From: [1] Temp: [2, 3] Target: []
Popping and pushing
From: [] Temp: [2, 3] Target: [1]

N: 2 
Function: start <-> temp switch 

N: 1 
Function: temp <-> target switch 
From: [2, 3] Temp: [1] Target: []
Popping and pushing
From: [2] Temp: [1] Target: [3]

N: 1 
Function: base-case 
From: [2] Temp: [3] Target: [1]
Popping and pushing
From: [] Temp: [3] Target: [1, 2]

N: 1 
Function: start <-> temp switch 
From: [3] Temp: [] Target: [1, 2]
Popping and pushing
From: [] Temp: [] Target: [1, 2, 3]


## Exercises:
1.  Write another function that solves for element n in a Fibonacci sequence. Then, write unit tests that evaluate its correctness and performance relative to the other previous versions.
2.  Write an ergonomic wrapper around int that can be used generically as a sequence of bits (make it iterable and implement \_\_getitem__()). Reimplement CompressedGene using the wrapper.
3.  Write a solver for the Towers of Hanoi that works for any number of towers.
4.  Use a one-time pad to encrypt and decrypt images.