# Reinforcement

R-5.1 Execute the experiment from Code Fragment 5.1 and compare the results on your system to those we report in Code Fragment 5.2.

In [2]:
import sys

def test_list_memory_usage(n):
    data=[]
    for k in range(n):
        a = len(data)
        b = sys.getsizeof(data)
        print("Length: {0:3d}; Size in bytes: {1:4d}".format(a, b))
        data.append(None)

test_list_memory_usage(26)

# The empty list takes 56 bytes here (smaller than the size reported in the book).
# The first appended element grows the list by 32 bytes (same as the book).
# The 5th element grows it by another 32 bytes. That is room for 4 more references
# at 8 bytes each, which indicates that the system architecture is 64-bit.

Length:   0; Size in bytes:   56
Length:   1; Size in bytes:   88
Length:   2; Size in bytes:   88
Length:   3; Size in bytes:   88
Length:   4; Size in bytes:   88
Length:   5; Size in bytes:  120
Length:   6; Size in bytes:  120
Length:   7; Size in bytes:  120
Length:   8; Size in bytes:  120
Length:   9; Size in bytes:  184
Length:  10; Size in bytes:  184
Length:  11; Size in bytes:  184
Length:  12; Size in bytes:  184
Length:  13; Size in bytes:  184
Length:  14; Size in bytes:  184
Length:  15; Size in bytes:  184
Length:  16; Size in bytes:  184
Length:  17; Size in bytes:  248
Length:  18; Size in bytes:  248
Length:  19; Size in bytes:  248
Length:  20; Size in bytes:  248
Length:  21; Size in bytes:  248
Length:  22; Size in bytes:  248
Length:  23; Size in bytes:  248
Length:  24; Size in bytes:  248
Length:  25; Size in bytes:  312
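
The 8-bytes-per-reference reading of the output above can be cross-checked directly. The following is a minimal sketch, assuming CPython (where `sys.getsizeof` reports the list header plus its array of references); the names `pointer_size_bytes` and `bytes_per_slot` are introduced here for illustration and are not from the book.

```python
import struct
import sys

def pointer_size_bytes():
    """Return the size of a native pointer in bytes (8 on a 64-bit build)."""
    return struct.calcsize("P")

def bytes_per_slot(extra_slots=1000):
    """Estimate how many bytes each list slot adds, using two lists of known length."""
    small = [None] * 10
    large = [None] * (10 + extra_slots)
    return (sys.getsizeof(large) - sys.getsizeof(small)) / extra_slots

print("Pointer size:", pointer_size_bytes(), "bytes")
print("Bytes per list slot:", bytes_per_slot())
```

If both numbers agree, each 32-byte jump in the experiment corresponds to room for 4 additional references.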


R-5.2 In Code Fragment 5.1, we perform an experiment to compare the length of a Python list to its underlying memory usage. Determining the sequence of array sizes requires a manual inspection of the output of that program. Redesign the experiment so that the program outputs only those values of k at which the existing capacity is exhausted. For example, on a system consistent with the results of Code Fragment 5.2, your program should output that the sequence of array capacities are 0, 4, 8, 16, 25, . . . .

In [9]:
import sys

def list_memory_usage(n):
    data=[]
    previous_size = None
    previous_length = None
    for _ in range(n):
        a = len(data)
        b = sys.getsizeof(data)

        if previous_size is None:  # first iteration: remember the size of the empty list
            previous_size = b
    
        if b > previous_size:
            print("Length: {0:3d}; Size in bytes: {1:4d}; Diff: {2:4d}".format(previous_length, previous_size, b-previous_size))
            previous_size = b

        data.append(None)
        previous_length = a

list_memory_usage(1000)

Length:   0; Size in bytes:   56; Diff:   32
Length:   4; Size in bytes:   88; Diff:   32
Length:   8; Size in bytes:  120; Diff:   64
Length:  16; Size in bytes:  184; Diff:   64
Length:  24; Size in bytes:  248; Diff:   64
Length:  32; Size in bytes:  312; Diff:   64
Length:  40; Size in bytes:  376; Diff:   96
Length:  52; Size in bytes:  472; Diff:   96
Length:  64; Size in bytes:  568; Diff:   96
Length:  76; Size in bytes:  664; Diff:  128
Length:  92; Size in bytes:  792; Diff:  128
Length: 108; Size in bytes:  920; Diff:  160
Length: 128; Size in bytes: 1080; Diff:  160
Length: 148; Size in bytes: 1240; Diff:  192
Length: 172; Size in bytes: 1432; Diff:  224
Length: 200; Size in bytes: 1656; Diff:  256
Length: 232; Size in bytes: 1912; Diff:  288
Length: 268; Size in bytes: 2200; Diff:  320
Length: 308; Size in bytes: 2520; Diff:  352
Length: 352; Size in bytes: 2872; Diff:  384
Length: 400; Size in bytes: 3256; Diff:  448
Length: 456; Size in bytes: 3704; Diff:  512
Length: 52
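
A variant that returns the thresholds as a list, rather than printing a table, can make the sequence easier to inspect. This is only a sketch under the assumption that a change in `sys.getsizeof` signals a reallocation (true for CPython); `capacity_thresholds` is a hypothetical helper name.

```python
import sys

def capacity_thresholds(n):
    """Return the lengths at which appending forces the underlying array to grow."""
    data = []
    thresholds = []
    for _ in range(n):
        length = len(data)
        before = sys.getsizeof(data)
        data.append(None)
        if sys.getsizeof(data) > before:   # this append triggered a reallocation
            thresholds.append(length)
    return thresholds

print(capacity_thresholds(100))
```

On a system matching the output above, the list would be expected to start with 0, 4, 8, 16, 24, 32; other Python versions may report a different sequence.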

R-5.3 Modify the experiment from Code Fragment 5.1 in order to demonstrate that Python’s list class occasionally shrinks the size of its underlying array when elements are popped from a list.

In [11]:
import sys

def test_list_shrinking_memory_usage(n):
    data=[]
    for k in range(n):
        a = len(data)
        b = sys.getsizeof(data)
        print("Length: {0:3d}; Size in bytes: {1:4d}".format(a, b))
        data.append(None)
    print("\nTesting shrinking:\n\n")
    for k in range((n//4)*3):
        data.pop()
        a = len(data)
        b = sys.getsizeof(data)
        print("Length: {0:3d}; Size in bytes: {1:4d}".format(a, b))

test_list_shrinking_memory_usage(50)

# Between length 25 and length 49, Python grew the underlying array twice;
# however, it shrank the array only once when going back from 49 to 25.
# Note that while shrinking, the size at length 15 is 216 bytes, compared to 184 bytes at length 15
# and 248 bytes at length 17 while growing, so the new capacity seems to be computed from the current length by some formula.

Length:   0; Size in bytes:   56
Length:   1; Size in bytes:   88
Length:   2; Size in bytes:   88
Length:   3; Size in bytes:   88
Length:   4; Size in bytes:   88
Length:   5; Size in bytes:  120
Length:   6; Size in bytes:  120
Length:   7; Size in bytes:  120
Length:   8; Size in bytes:  120
Length:   9; Size in bytes:  184
Length:  10; Size in bytes:  184
Length:  11; Size in bytes:  184
Length:  12; Size in bytes:  184
Length:  13; Size in bytes:  184
Length:  14; Size in bytes:  184
Length:  15; Size in bytes:  184
Length:  16; Size in bytes:  184
Length:  17; Size in bytes:  248
Length:  18; Size in bytes:  248
Length:  19; Size in bytes:  248
Length:  20; Size in bytes:  248
Length:  21; Size in bytes:  248
Length:  22; Size in bytes:  248
Length:  23; Size in bytes:  248
Length:  24; Size in bytes:  248
Length:  25; Size in bytes:  312
Length:  26; Size in bytes:  312
Length:  27; Size in bytes:  312
Length:  28; Size in bytes:  312
Length:  29; Size in bytes:  312
Length:  3

R-5.4 Our Dynamic Array class, as given in Code Fragment 5.3, does not support use of negative indices with `__getitem__`. Update that method to better match the semantics of a Python list.

In [19]:
import ctypes # provides low-level arrays 

class DynamicArray:

    def __init__(self):
        self._n = 0
        self._capacity = 1
        self._A = self._make_array(self._capacity)

    def __len__(self):
        return self._n
    
    def __getitem__(self, k):
        if k < 0:
            k = self._n + k
    
        if not 0 <= k < self._n:
            raise IndexError("invalid index")
        return self._A[k]

    def append(self, obj):
        if self._n == self._capacity:
            self._resize(2 * self._capacity)
        self._A[self._n] = obj
        self._n += 1

    def _resize(self, c):
        B = self._make_array(c)
        for k in range(self._n):
            B[k] = self._A[k]
        self._A = B
        self._capacity = c

    def _make_array(self, c):
        return (c * ctypes.py_object)()

a = DynamicArray()
for i in range(10):
    a.append(i)
    print(i, end=" ")
print()
print(a[-1])
print(a[-3])
print(a[-len(a)])
print(a[0])
print(a[1])

0 1 2 3 4 5 6 7 8 9 
9
7
0
0
1
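
The index handling in `__getitem__` above can be isolated and compared against a built-in list. This is a small sketch; `normalize_index` is a name chosen here for illustration and is not part of the book's class.

```python
def normalize_index(k, n):
    """Map a possibly negative index k onto 0..n-1, or raise IndexError."""
    if k < 0:
        k += n                      # -1 refers to the last element, -n to the first
    if not 0 <= k < n:
        raise IndexError("invalid index")
    return k

# Compare against the behavior of a built-in list of the same length.
reference = list(range(10))
for k in (-10, -3, -1, 0, 9):
    assert reference[normalize_index(k, len(reference))] == reference[k]

# Indices below -n are rejected, just as a Python list rejects them.
try:
    normalize_index(-11, 10)
except IndexError as err:
    print("IndexError:", err)
```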


R-5.5 Redo the justification of Proposition 5.1 assuming that the cost of growing the array from size k to size 2k is 3k cyber-dollars. How much should each append operation be charged to make the amortization work?

Each append operation should be charged 7 cyber-dollars.

Assume that 1 cyber-dollar pays for the execution of each append operation (excluding growth), and that growing the array from size k to size 2k costs 3k cyber-dollars. The following example shows how many cyber-dollars each append must be charged:

Suppose the list holds 8 elements and the growth from 4 to 8 cells has already been paid for. The next append must double the array, with k=8, which costs 3\*8 = 24 cyber-dollars. That amount has to be saved by the 4 appends that filled the newest cells (the 5th to the 8th element), so each of them must save 24/4 = 6 cyber-dollars on top of the 1 cyber-dollar it spends on the append itself.

So, if we charge 7 cyber-dollars per append operation, in the example where k=8 we have 7\*4 - 1\*4 = 24, where 7\*4 is the total amount of cyber-dollars charged from the 5th to the 8th element and 1\*4 is the amount already spent on appending those elements. This leaves a credit of 24 cyber-dollars to be used when the array needs to grow from k to 2k, that is, from 8 to 16 cells in this example. The same holds for any k: the k/2 most recent appends each save 6 cyber-dollars, for a total of 3k.
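
The charge of 7 can be sanity-checked with a small simulation of the cyber-dollar "bank". This is a sketch under the assumptions stated in the exercise (1 cyber-dollar per append, 3k to grow from k to 2k, initial capacity 1); `lowest_balance` is a hypothetical helper name.

```python
def lowest_balance(n, charge, growth_cost_factor=3):
    """Simulate n appends and return the lowest bank balance ever reached.

    Each append is charged `charge` cyber-dollars and spends 1; growing the
    array from capacity k to 2k spends growth_cost_factor * k.
    """
    capacity, size, balance, lowest = 1, 0, 0, 0
    for _ in range(n):
        balance += charge
        if size == capacity:                        # array is full: pay for the growth
            balance -= growth_cost_factor * capacity
            capacity *= 2
        balance -= 1                                # pay for the append itself
        size += 1
        lowest = min(lowest, balance)
    return lowest

for charge in (6, 7):
    print("Charge {0}: lowest balance over 1000 appends = {1}".format(
        charge, lowest_balance(1000, charge)))
```

A negative lowest balance means the charge is too small to cover every growth; a charge that works never lets the balance drop below zero.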

R-5.6 Our implementation of insert for the DynamicArray class, as given in Code Fragment 5.5, has the following inefficiency. In the case when a resize occurs, the resize operation takes time to copy all the elements from an old array to a new array, and then the subsequent loop in the body of insert shifts many of those elements. Give an improved implementation of the insert method, so that, in the case of a resize, the elements are shifted into their final position during that operation, thereby avoiding the subsequent shifting.

In [36]:
import ctypes

class DynamicArray:

    def __init__(self):
        self._n = 0
        self._capacity = 1
        self._A = self._make_array(self._capacity)

    def __len__(self):
        return self._n
    
    def __getitem__(self, k):
        if k < 0:
            k = self._n + k
    
        if not 0 <= k < self._n:
            raise IndexError("invalid index")
        return self._A[k]

    def append(self, obj):
        if self._n == self._capacity:
            self._resize(2 * self._capacity)
        self._A[self._n] = obj
        self._n += 1

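    def _resize(self, c, k=None):
        """Resize to capacity c; if k is given, leave a gap at index k for an insert."""
        B = self._make_array(c)
        if k is None:
            k = self._n                   # no gap: plain copy (used by append)

        for i in range(k):                # elements before the gap keep their index
            B[i] = self._A[i]

        for j in range(k, self._n):       # elements from k onward move one cell right
            B[j + 1] = self._A[j]

        self._A = B
        self._capacity = c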

    def _make_array(self, c):
        return (c * ctypes.py_object)()
        #return [None] * c  # for debugging
    
    def insert(self, k, value):
        if k < 0:
            k = self._n + 1 + k

        if self._n == self._capacity:
            self._resize(2 * self._capacity, k)
        else:
            for j in range(self._n, k, -1):
                self._A[j] = self._A[j-1]

        self._A[k] = value
        self._n += 1

    def __str__(self):
        return ",".join(str(self._A[x]) for x in range(self._n))


a = DynamicArray()
for i in range(5):  
    a.append(i)
print(a)
a.insert(0,5)
a.insert(3,6)
a.insert(7,7)
a.insert(4,8)
a.insert(-2,9)
print(a, a._capacity)



0,1,2,3,4
5,0,1,6,8,2,3,4,9,7 16


R-5.7 Let A be an array of size n ≥ 2 containing integers from 1 to n − 1, inclusive, with exactly one repeated. Describe a fast algorithm for finding the integer in A that is repeated.

In [10]:
# time: O(n)
# space: O(n)
def find_duplicate(A: list) -> int:
    found = {}
    for el in A:
        if found.get(el, None) is not None:
            return el
        else:
            found[el] = el
    return None

"""
A potential improvement over the algorithm above is on the space side.

Another approach uses the formula for the sum of the integers from 1 to n, which is n * (n + 1) / 2, with n = length - 1 given that
there is exactly one repeated element. Subtracting each element of A from that total leaves a negative value whose absolute
value is the repeated element. However, this algorithm works only if A holds every integer from 1 to length - 1 (in any order)
plus exactly one repeated element.
"""

def find_duplicate2(A: list) -> int:
    length = len(A)
    total = ((length - 1) * length) // 2
    for el in A:
        total -= el
    return abs(total)

A = [x for x in range(1, 15)]
A.insert(4, 7)
print(len(A), A)

print(find_duplicate(A))
print(find_duplicate2(A))

15 [1, 2, 3, 4, 7, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
7
7
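
For the list printed above, the arithmetic behind the sum-based approach is: 1 + 2 + ... + 14 = 14 \* 15 / 2 = 105, while the 15 elements add up to 112, so the repeated value is 112 - 105 = 7. The sketch below cross-checks the sum-based idea against a set-based scan on random inputs; the function names are introduced here for illustration, and the inputs are assumed to satisfy the exercise's precondition.

```python
import random

def find_duplicate_by_sum(A):
    """Return the repeated value, assuming A holds 1..len(A)-1 plus one repeat."""
    n = len(A)
    expected = (n - 1) * n // 2      # 1 + 2 + ... + (n - 1)
    return sum(A) - expected

def find_duplicate_by_set(A):
    """Return the first value seen twice, using O(n) extra space."""
    seen = set()
    for el in A:
        if el in seen:
            return el
        seen.add(el)
    return None

for _ in range(1000):
    n = random.randrange(2, 50)            # length of the array
    repeated = random.randrange(1, n)      # value in 1..n-1 that appears twice
    A = list(range(1, n)) + [repeated]
    random.shuffle(A)
    assert find_duplicate_by_sum(A) == find_duplicate_by_set(A) == repeated
print("both approaches agree")
```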


R-5.8 Experimentally evaluate the efficiency of the pop method of Python’s list class when using varying indices as a parameter, as we did for insert on page 205. Report your results akin to Table 5.5.

In [33]:
from time import time

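def test_pop(n: int, pop_idx):
    """Return the average time per pop, in microseconds, over n // 2 pops."""
    test_set = [x for x in range(n)]
    operations = n // 2  # pop only half the list so that a fixed pop_idx stays valid
    start = time()
    for _ in range(operations):  # avoid iterating over the list while mutating it
        if pop_idx is not None:
            test_set.pop(pop_idx)
        else:
            test_set.pop()
    return ((time() - start) / operations) * 1000000  # microseconds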

print("{:^15}|{:^10}|{:^10}|{:^10}|{:^10}|{:^10}|".format("", 100, 1000, 10000, 100000, 500000))

for k in range(3):
    labels = { 0: "k = 0", 1: "k = n // 2", 2: "k = n"}  # renamed to avoid shadowing the built-in repr
    func = { 0: lambda x: 0, 1: lambda x: x//2, 2: lambda x: None}

    print("{:^15}|".format(labels[k]), end="")

    for step in (100, 1000, 10000, 100000, 500000):
        print("{:^10.3f}|".format(test_pop(step, func[k](step))), end="")
    print()


               |   100    |   1000   |  10000   |  100000  |  500000  |
     k = 0     |  0.110   |  0.173   |  1.146   |  11.521  |  58.041  |
  k = n // 2   |  0.114   |  0.103   |  0.391   |  3.424   |  19.573  |
     k = n     |  0.091   |  0.068   |  0.073   |  0.081   |  0.069   |


R-5.9 Explain the changes that would have to be made to the program of Code Fragment 5.11 so that it could perform the Caesar cipher for messages that are written in an alphabet-based language other than English, such as Greek, Russian, or Hebrew.

In [38]:
"""
The changes that would have to be made are:

First, the size of the alphabet would have to be parameterized, that is, the "26" in the existing code would have to be a value
that is provided at the initialization of the object.

Second, the first letter of the alphabet would also have to be parameterized so that it can be used throughout the code.

This assumes that Python can convert the Unicode characters of the different languages into integer values and vice versa.
Another consideration is that the alphabet's characters would have to be contiguous, so that their translation into integers is
in increasing order, as with the English alphabet.

Another approach is to provide a string or list with every character of a given language, and then use the length of the
string/list and the indices to calculate the encoding and decoding, without relying on the ord/chr functions. This is the solution
implemented below, as a more general approach.
"""

class CaesarCipher:

    def __init__(self, shift, alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
        alphabet_length = len(alphabet)
        self._encoder = dict()
        self._decoder = dict()

        for k, el in enumerate(alphabet):
            self._encoder[el] = alphabet[(k + shift) % alphabet_length]
            self._decoder[el] = alphabet[(k - shift) % alphabet_length]

    def encrypt(self, message):
        return self._transform(message, self._encoder)
    
    def decrypt(self, secret):
        return self._transform(secret, self._decoder)
    
    def _transform(self, original, code):
        msg = list(original)
        for i, el in enumerate(msg):
            if el in code: # O(1) because it is a dictionary
                msg[i] = code[el]
        return ''.join(msg)

print('English Cipher')
cipher = CaesarCipher(3)
message = "THE EAGLE IS IN PLAY; MEET AT JOE S."
coded = cipher.encrypt(message)
print("Secret:", coded)
answer = cipher.decrypt(coded)
print("Message:", answer)

print('\nGreek Cipher')
greek_cipher = CaesarCipher(3, alphabet="ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΛλΜμΝνΞξΟοΠπΡρΣσςΤτΥυΦφΧχΨψΩω")
greek_message = "μηδείς αγεωµετρητος εισιτω µον την στεγην"
greek_coded = greek_cipher.encrypt(greek_message)
print("Secret:", greek_coded)
greek_answer = greek_cipher.decrypt(greek_coded)
print("Message:", greek_answer)

English Cipher
Secret: WKH HDJOH LV LQ SODB; PHHW DW MRH V.
Message: THE EAGLE IS IN PLAY; MEET AT JOE S.

Greek Cipher
Secret: ΞΙΖΗίΥ ΓΕΗΒµΗΦςΙΦΡΥ ΗΛτΛΦΒ µΡΟ ΦΙΟ τΦΗΕΙΟ
Message: μηδείς αγεωµετρητος εισιτω µον την στεγην
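
For comparison, the first approach described in the docstring (parameterizing the first letter and the alphabet size, and relying on `ord`/`chr`) might look like the sketch below. The function `caesar_shift` and its parameters are assumptions made for illustration. It only works when the alphabet occupies one contiguous block of code points: the Russian capitals from U+0410 to U+042F qualify (32 letters, with Ё lying outside the block), whereas the Greek capitals have a gap at U+03A2.

```python
def caesar_shift(text, shift, first="A", size=26):
    """Shift letters within the contiguous block of `size` code points starting at `first`."""
    base = ord(first)
    out = []
    for ch in text:
        offset = ord(ch) - base
        if 0 <= offset < size:                       # character belongs to the alphabet
            out.append(chr(base + (offset + shift) % size))
        else:                                        # leave spaces, punctuation, etc. as-is
            out.append(ch)
    return "".join(out)

# "\u0410" is the Cyrillic capital letter A.
secret = caesar_shift("ПРИВЕТ, МИР", 3, first="\u0410", size=32)
print("Secret:", secret)
print("Message:", caesar_shift(secret, -3, first="\u0410", size=32))
```

The explicit-alphabet design used in the class above avoids the contiguity requirement, at the cost of having to spell out every letter.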


R-5.10 The constructor for the CaesarCipher class in Code Fragment 5.11 can be implemented with a two-line body by building the forward and backward strings using a combination of the join method and an appropriate comprehension syntax. Give such an implementation.

In [4]:
class CaesarCipher:
    def __init__(self, shift):
        self._encoder = ''.join([chr((k + shift) % 26 + ord("A")) for k in range(26)])
        self._decoder = ''.join([chr((k - shift) % 26 + ord("A")) for k in range(26)])
    
    def encrypt(self, message):
        return self._transform(message, self._encoder)
    
    def decrypt(self, secret):
        return self._transform(secret, self._decoder)
    
    def _transform(self, original, code):
        msg = list(original)
        for k in range(len(msg)):
            if msg[k].isupper():
                j = ord(msg[k]) - ord('A')
                msg[k] = code[j]
        return ''.join(msg)


print('English Cipher')
cipher = CaesarCipher(3)
message = "THE EAGLE IS IN PLAY; MEET AT JOE S."
coded = cipher.encrypt(message)
print("Secret:", coded)
answer = cipher.decrypt(coded)
print("Message:", answer)

English Cipher
Secret: WKH HDJOH LV LQ SODB; PHHW DW MRH V.
Message: THE EAGLE IS IN PLAY; MEET AT JOE S.


R-5.11 Use standard control structures to compute the sum of all numbers in an n × n data set, represented as a list of lists.

In [7]:
def matrix_sum(matrix):
    total = 0  # named total to avoid shadowing the built-in sum
    for row in matrix:
        for cell in row:
            total += cell
    return total

m = [[x for x in range(3)] for i in range(3)]
print(matrix_sum(m))


9


R-5.12 Describe how the built-in sum function can be combined with Python’s comprehension syntax to compute the sum of all numbers in an n × n data set, represented as a list of lists.

In [9]:
def matrix_sum(matrix):
    return sum(sum(column) for column in matrix)

m = [[x for x in range(3)] for i in range(3)]
print(matrix_sum(m))

9


# Creativity

C-5.13 In the experiment of Code Fragment 5.1, we begin with an empty list. If data were initially constructed with nonempty length, does this affect the sequence of values at which the underlying array is expanded? Perform your own experiments, and comment on any relationship you see between the initial length and the expansion sequence.

In [10]:
import sys

def creativity_5_13(i, n):
    data = [0] * i if i else []
    for k in range(n):
        a = len(data)
        b = sys.getsizeof(data)
        print("Length: {0:3d}; Size in bytes: {1:4d}".format(a, b))
        data.append(None)

creativity_5_13(None, 20)
print('----')
creativity_5_13(5, 20)
print('----')
creativity_5_13(65, 40)
print('----')
creativity_5_13(125, 40)
print('----')
creativity_5_13(250, 40)
print('----')
creativity_5_13(500, 40)
print('----')
creativity_5_13(1000, 40)

"""
Based on the output of the executions above, we can see a relationship between the initial length of the list
and the number of expansions the underlying array undergoes. The larger the initial
length, the fewer expansions the underlying array experiences for a fixed number of appended elements (40 in this test)
"""



Length:   0; Size in bytes:   56
Length:   1; Size in bytes:   88
Length:   2; Size in bytes:   88
Length:   3; Size in bytes:   88
Length:   4; Size in bytes:   88
Length:   5; Size in bytes:  120
Length:   6; Size in bytes:  120
Length:   7; Size in bytes:  120
Length:   8; Size in bytes:  120
Length:   9; Size in bytes:  184
Length:  10; Size in bytes:  184
Length:  11; Size in bytes:  184
Length:  12; Size in bytes:  184
Length:  13; Size in bytes:  184
Length:  14; Size in bytes:  184
Length:  15; Size in bytes:  184
Length:  16; Size in bytes:  184
Length:  17; Size in bytes:  248
Length:  18; Size in bytes:  248
Length:  19; Size in bytes:  248
----
Length:   5; Size in bytes:   96
Length:   6; Size in bytes:  152
Length:   7; Size in bytes:  152
Length:   8; Size in bytes:  152
Length:   9; Size in bytes:  152
Length:  10; Size in bytes:  152
Length:  11; Size in bytes:  152
Length:  12; Size in bytes:  152
Length:  13; Size in bytes:  216
Length:  14; Size in bytes:  216
Lengt
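
The relationship can also be summarized as a single number per initial length, which avoids reading through the full tables. This is a sketch assuming CPython's `sys.getsizeof` behavior; `count_expansions` is a hypothetical helper name.

```python
import sys

def count_expansions(initial_length, appends):
    """Count how often the underlying array grows during `appends` appends."""
    data = [0] * initial_length
    size = sys.getsizeof(data)
    expansions = 0
    for _ in range(appends):
        data.append(None)
        new_size = sys.getsizeof(data)
        if new_size > size:          # the append forced a larger array
            expansions += 1
            size = new_size
    return expansions

for initial in (0, 5, 65, 125, 250, 500, 1000):
    print("Initial length: {0:4d}; expansions in 40 appends: {1}".format(
        initial, count_expansions(initial, 40)))
```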

C-5.14 The shuffle method, supported by the random module, takes a Python list and rearranges it so that every possible ordering is equally likely. Implement your own version of such a function. You may rely on the randrange(n) function of the random module, which returns a random number between 0 and n − 1 inclusive.

In [31]:
from random import randrange


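def custom_shuffle(l: list) -> None:
    """Shuffle l in place (Fisher-Yates), so that every ordering is equally likely."""
    n = len(l)
    for i in range(n - 1, 0, -1):
        new_i = randrange(i + 1)  # uniform index in 0..i, inclusive
        l[i], l[new_i] = l[new_i], l[i]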

first_int = []
tests = 100000
A = [i for i in range(10)]
for _ in range(tests):
    custom_shuffle(A)
    first_int.append(A[0])

for i in range(10):
    probability = (first_int.count(i)/tests) * 100
    print(
        "Count of first int {0} \t Probability is: {1:.2f}%".format(i, probability)
    )



Count of first int 0 	 Probability is: 9.90%
Count of first int 1 	 Probability is: 10.04%
Count of first int 2 	 Probability is: 10.06%
Count of first int 3 	 Probability is: 10.04%
Count of first int 4 	 Probability is: 9.91%
Count of first int 5 	 Probability is: 9.93%
Count of first int 6 	 Probability is: 10.03%
Count of first int 7 	 Probability is: 9.91%
Count of first int 8 	 Probability is: 10.14%
Count of first int 9 	 Probability is: 10.04%
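
Checking only the first element cannot reveal every kind of bias, so a stricter test counts complete orderings of a small list. The sketch below does this for a Fisher-Yates shuffle, one standard way to obtain a uniform shuffle; the helper names are introduced here for illustration. For context, swapping each position with an index drawn from the whole list produces n^n equally likely swap sequences, and for n = 3 that is 27, which cannot be split evenly over the 3! = 6 orderings.

```python
from collections import Counter
from random import randrange

def fisher_yates_shuffle(items):
    """Shuffle items in place; position i is swapped with a random index in 0..i."""
    for i in range(len(items) - 1, 0, -1):
        j = randrange(i + 1)
        items[i], items[j] = items[j], items[i]

def permutation_counts(shuffle, trials=60000):
    """Count how often each ordering of [0, 1, 2] is produced by `shuffle`."""
    counts = Counter()
    for _ in range(trials):
        items = [0, 1, 2]
        shuffle(items)
        counts[tuple(items)] += 1
    return counts

trials = 60000
counts = permutation_counts(fisher_yates_shuffle, trials)
for perm in sorted(counts):
    print(perm, "{:.2f}%".format(100 * counts[perm] / trials))
```

With a uniform shuffle, each of the 6 orderings should appear in roughly one sixth of the trials.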


C-5.15 Consider an implementation of a dynamic array, but instead of copying the elements into an array of double the size (that is, from N to 2N) when its capacity is reached, we copy the elements into an array with ⌈N/4⌉ additional cells, going from capacity N to capacity N + ⌈N /4⌉. Prove that performing a sequence of n append operations still runs in O(n) time in this case.

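Justification:

Let N be the capacity of the array S at the moment S is full. Growing adds ⌈N/4⌉ cells, so the new capacity is N' = N + ⌈N/4⌉. Since N ≤ 4⌈N/4⌉, we have N' ≤ 5⌈N/4⌉.

As in Proposition 5.1, assume that each append costs 1 cyber-dollar and that growing the array costs 1 cyber-dollar per element copied. The next growth happens when all N' cells are full and costs N' cyber-dollars. Exactly ⌈N/4⌉ appends take place between the two growths (those that fill the new cells), so if each of them saves 5 cyber-dollars, the savings total 5⌈N/4⌉ ≥ N', which is enough to pay for the copy. The very first growth copies a single element, which the savings of the first append cover.

Therefore, we can pay for the execution of n append operations using 6n cyber-dollars: 1 for each append itself plus 5 saved toward the next growth. In other words, the amortized running time of each append operation is O(1); hence, the total running time of n append operations is O(n).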
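
The claim can also be checked empirically by counting the work done for increasing n. This is a sketch under the usual cost model (1 unit per append, 1 unit per element copied during a growth, initial capacity 1); `total_work` is a hypothetical helper name. If the amortized cost per append is O(1), the ratio of total work to n should stay bounded by a constant as n grows.

```python
def total_work(n):
    """Total units of work for n appends when capacity N grows to N + ceil(N/4)."""
    capacity, size, work = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            work += size                       # copy every element into the new array
            capacity += (capacity + 3) // 4    # add ceil(capacity / 4) cells
        work += 1                              # the append itself
        size += 1
    return work

for n in (10, 100, 1000, 10000, 100000):
    work = total_work(n)
    print("n = {0:6d}; total work = {1:7d}; work / n = {2:.3f}".format(n, work, work / n))
```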



## Due to the amount of time needed to do all exercises per chapter, I have made the decision to only do the reinforcement exercises from now on for the remaining chapters.