
#5.3 Dynamic Arrays and Amortization

In [0]:
import sys # provide getsizeof function
data = []
for k in range(20): # Note: must fix choice of n
  a = len(data) # number of elements
  b = sys.getsizeof(data) # actual size in bytes
  print('Length: {0:3d}; Size in bytes: {1:4d}'.format(a,b))
  data.append(None) # increase length by one

Length:   0; Size in bytes:   64
Length:   1; Size in bytes:   96
Length:   2; Size in bytes:   96
Length:   3; Size in bytes:   96
Length:   4; Size in bytes:   96
Length:   5; Size in bytes:  128
Length:   6; Size in bytes:  128
Length:   7; Size in bytes:  128
Length:   8; Size in bytes:  128
Length:   9; Size in bytes:  192
Length:  10; Size in bytes:  192
Length:  11; Size in bytes:  192
Length:  12; Size in bytes:  192
Length:  13; Size in bytes:  192
Length:  14; Size in bytes:  192
Length:  15; Size in bytes:  192
Length:  16; Size in bytes:  192
Length:  17; Size in bytes:  264
Length:  18; Size in bytes:  264
Length:  19; Size in bytes:  264


An empty list instance already requires a certain number of bytes of memory.

Each object in Python maintains some state, for example, a reference to denote the class to which it belongs.
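To see how often the underlying array is actually resized, the experiment above can be extended to record only the lengths at which the reported size changes. This is a minimal sketch: the name `resize_lengths` is introduced here for illustration, and the exact values depend on the Python implementation and version.

```python
import sys

data = []
resize_lengths = []                  # lengths at which the reported size changed (illustrative name)
last_size = sys.getsizeof(data)

for _ in range(1000):
    data.append(None)
    size = sys.getsizeof(data)
    if size != last_size:            # the underlying array was reallocated
        resize_lengths.append(len(data))
        last_size = size

print(len(resize_lengths), 'resizes during 1000 appends')
print('first few resize points:', resize_lengths[:8])
```

The number of resizes should be far smaller than the number of appends, which is what makes the amortized argument in the next sections work.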

##5.3.1 Implementing a Dynamic Array
A dynamic array needs a means to grow the array $A$ that stores the elements of a list:
1. Allocate a new array $B$ with larger capacity
2. Set $B[i]=A[i]$, for $i=0,...,n-1$, where $n$ denotes current number of items
3. Set $A=B$, that is, use $B$ as the array supporting the list
4. Insert the new element in the new array

###Dynamic Array Implementation

In [0]:
import ctypes # provide low-level arrays
 
class DynamicArray:
  '''A dynamic array class akin to a simplified Python list'''
  
  def __init__(self):
    '''create an empty array'''
    self._n = 0 # count actual elements
    self._capacity = 1 # default array capacity
    self._A = self._make_array(self._capacity) # low-level array
    
  def __len__(self):
    '''Return number of elements stored in the array'''
    return self._n
  
  def __getitem__(self, k):
    '''Return element at index k'''
    if not 0 <= k < self._n:
      raise IndexError('invalid index')
    return self._A[k] # retrieve from array
  
  def append(self, obj):
    '''Add object to end of the array'''
    if self._n == self._capacity: # not enough room
      self._resize(2 * self._capacity) # double capacity
    self._A[self._n] = obj
    self._n += 1
    
  def _resize(self, c): # nonpublic utility
    'Resize internal array to capacity c'
    B = self._make_array(c) # new (bigger) array
    for k in range(self._n): # for each existing value
      B[k] = self._A[k]
    self._A = B # use the bigger array
    self._capacity = c
    
  def _make_array(self, c): # nonpublic utility
    '''Return new array with capacity c'''
    return (c * ctypes.py_object)()

Support for creating low-level arrays is provided by the `ctypes` module.
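As a small illustration of what `_make_array` returns, the sketch below builds a low-level array of three object references directly with `ctypes`. The names `LowLevelArray` and `A` are chosen here only for the example.

```python
import ctypes

capacity = 3
LowLevelArray = capacity * ctypes.py_object   # array type with room for 3 object references
A = LowLevelArray()                           # create an instance of that array type

A[0] = 'a'                                    # cells hold references to arbitrary Python objects
A[1] = [1, 2]

print(len(A))                                 # fixed capacity of the low-level array
print(A[0], A[1])
```

Unlike a Python `list`, this array has a fixed capacity, which is why `DynamicArray` must allocate a new one whenever it runs out of room.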

##5.3.2 Amortized Analysis of Dynamic Arrays
Using an algorithmic design pattern called **amortization**, it can be shown that performing a sequence of such append operations on a dynamic array is actually quite efficient.

Let $S$ be a sequence implemented by means of a dynamic array with initial capacity one, using the strategy of doubling the array size when full. The total time to perform a series of $n$ append operations in $S$, starting from $S$ being empty, is $O(n)$.
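The claim can be checked empirically by counting how many element copies the resizes perform. The sketch below is a simulation, not the `DynamicArray` class itself; the helper `count_copies` and the fixed-increment comparison are assumptions added for illustration.

```python
def count_copies(n, grow):
    """Simulate n appends and return the total number of elements copied by resizes.

    grow is a function mapping the current capacity to the new capacity.
    """
    size, capacity, copies = 0, 1, 0
    for _ in range(n):
        if size == capacity:         # array is full, so a resize is needed
            copies += size           # every existing element is copied once
            capacity = grow(capacity)
        size += 1
    return copies

n = 1024
doubling = count_copies(n, lambda c: 2 * c)   # geometric growth
fixed = count_copies(n, lambda c: c + 4)      # arithmetic growth, for contrast

print('doubling:', doubling, 'copies for', n, 'appends')
print('fixed increment of 4:', fixed, 'copies for', n, 'appends')
```

With doubling, the copies sum to $1 + 2 + 4 + \cdots < 2n$, so the total work stays linear. With a fixed increment, the number of copies grows quadratically in $n$.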

##5.3.3 Python's `list` Class
Python's implementation of the `append` method exhibits amortized constant-time behavior.

###Measure the amortized cost of `append` for Python's `list` class

In [0]:
from time import time # import time function from time module
def compute_average(n):
  '''Perform n appends to an empty list and return average time elapsed'''
  data = []
  start = time() # record the start time (in seconds)
  for k in range(n):
    data.append(None)
  end = time() # record the end time
  return (end - start)/n # compute average per operation

In [0]:
n = [10**i for i in range(2,8,1)]

In [0]:
for index in n:
  print('Computational time for n={} is {}'.format(index, compute_average(index)))

Computational time for n=100 is 1.7642974853515625e-07
Computational time for n=1000 is 1.8215179443359375e-07
Computational time for n=10000 is 1.859903335571289e-07
Computational time for n=100000 is 1.8187046051025392e-07
Computational time for n=1000000 is 1.3286042213439941e-07
Computational time for n=10000000 is 1.3371222019195556e-07


You can see that Python's implementation of the `append` method exhibits amortized constant-time behavior.

#5.4 Efficiency of Python's Sequence Types

## 5.4.1 Python's `list` and `tuple` Classes
Tuples are typically more memory efficient than lists because they are immutable; thus, there is no need for an underlying dynamic array with surplus capacity.
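A quick way to observe this is to compare `sys.getsizeof` for a list built by repeated appends with a tuple holding the same elements. This is a sketch only: the variable names are illustrative and the byte counts depend on the Python implementation.

```python
import sys

items = []
for k in range(100):
    items.append(k)          # the list may carry surplus capacity after growing

frozen = tuple(items)        # the tuple stores exactly 100 references

list_bytes = sys.getsizeof(items)
tuple_bytes = sys.getsizeof(frozen)
print('list: ', list_bytes, 'bytes')
print('tuple:', tuple_bytes, 'bytes')
```

Note that `sys.getsizeof` measures only the container itself, not the objects it refers to.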

### Adding Elements to a List
In Section 5.3, the `append` method has been fully explored. In the worst case, it requires $\Omega(n)$ time because the underlying array is resized, but it uses $O(1)$ time in the amortized sense. \\
Another important method for `list` is the `insert` method which inserts a given value into the list at index $0\leq k\leq n$ while shifting all subsequent elements back one slot to make room. This method will be added in the `DynamicArray` class implemented above:

In [0]:
import ctypes # provide low-level arrays
 
class DynamicArray:
  '''A dynamic array class akin to a simplified Python list'''
  
  def __init__(self):
    '''create an empty array'''
    self._n = 0 # count actual elements
    self._capacity = 1 # default array capacity
    self._A = self._make_array(self._capacity) # low-level array
    
  def __len__(self):
    '''Return number of elements stored in the array'''
    return self._n
  
  def __getitem__(self, k):
    '''Return element at index k'''
    if not 0 <= k < self._n:
      raise IndexError('invalid index')
    return self._A[k] # retrieve from array
  
  def append(self, obj):
    '''Add object to end of the array'''
    if self._n == self._capacity: # not enough room
      self._resize(2 * self._capacity) # double capacity
    self._A[self._n] = obj
    self._n += 1
    
  def _resize(self, c): # nonpublic utility
    'Resize internal array to capacity c'
    B = self._make_array(c) # new (bigger) array
    for k in range(self._n): # for each existing value
      B[k] = self._A[k]
    self._A = B # use the bigger array
    self._capacity = c
    
  def _make_array(self, c): # nonpublic utility
    '''Return new array with capacity c'''
    return (c * ctypes.py_object)()
  
  # add the insert method below
  def insert(self, k, value):
    '''insert value at index k, shifting subsequent values rightward'''
    # for simplicity, we assume 0<=k<=n in this version
    if self._n == self._capacity: # not enough room
      self._resize(2 * self._capacity) # so double capacity
    for j in range(self._n, k, -1): # shift rightmost first
      self._A[j] = self._A[j-1]
    self._A[k] = value # store newest element
    self._n += 1

The addition of one element may require a resizing of the dynamic array, which requires $\Omega(n)$ worst-case time but only $O(1)$ amortized time, as per `append`. Shifting the elements to make room for the new item takes an additional $O(n-k+1)$ time.

###Removing Elements from a List
A call to `pop()` removes the last element from a list. 

`pop(k)` removes the element that is at index $k<n$ of a list, shifting all subsequent elements leftward to fill the gap that results from the removal, which requires $O(n-k)$ time.
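Both forms of `pop` can be tried on a built-in `list`. The sample data below is made up for the example.

```python
data = list('abcde')      # ['a', 'b', 'c', 'd', 'e']

last = data.pop()         # removes and returns the last element; nothing is shifted
second = data.pop(1)      # removes index 1; the elements after it shift left by one

print(last, second, data)
```

Popping from the end leaves every other element in place, whereas `pop(1)` here has to move the elements that followed the removed one.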

The `remove` method specifies the value that should be removed (not the index at which it resides). This method is implemented in the `DynamicArray` class as an example:

In [0]:
import ctypes # provide low-level arrays
 
class DynamicArray:
  '''A dynamic array class akin to a simplified Python list'''
  
  def __init__(self):
    '''create an empty array'''
    self._n = 0 # count actual elements
    self._capacity = 1 # default array capacity
    self._A = self._make_array(self._capacity) # low-level array
    
  def __len__(self):
    '''Return number of elements stored in the array'''
    return self._n
  
  def __getitem__(self, k):
    '''Return element at index k'''
    if not 0 <= k < self._n:
      raise IndexError('invalid index')
    return self._A[k] # retrieve from array
  
  def append(self, obj):
    '''Add object to end of the array'''
    if self._n == self._capacity: # not enough room
      self._resize(2 * self._capacity) # double capacity
    self._A[self._n] = obj
    self._n += 1
    
  def _resize(self, c): # nonpublic utility
    'Resize internal array to capacity c'
    B = self._make_array(c) # new (bigger) array
    for k in range(self._n): # for each existing value
      B[k] = self._A[k]
    self._A = B # use the bigger array
    self._capacity = c
    
  def _make_array(self, c): # nonpublic utility
    '''Return new array with capacity c'''
    return (c * ctypes.py_object)()
  
  def insert(self, k, value):
    '''insert value at index k, shifting subsequent values rightward'''
    # for simplicity, we assume 0<=k<=n in this version
    if self._n == self._capacity: # not enough room
      self._resize(2 * self._capacity) # so double capacity
    for j in range(self._n, k, -1): # shift rightmost first
      self._A[j] = self._A[j-1]
    self._A[k] = value # store newest element
    self._n += 1
    
  # add the remove method below
  def remove(self, value):
    '''Remove first occurrence of value (or raise ValueError)'''
    # do not consider shrinking the dynamic array in this version
    for k in range(self._n):
      if self._A[k] == value: # found a match
        for j in range(k, self._n - 1): # shift others to fill gap
          self._A[j] = self._A[j+1]
        self._A[self._n - 1] = None # help garbage collection
        self._n -= 1 # we have one less item
        return # exit immediately
    raise ValueError('value not found') # only reached if no match

##5.4.2 Python's `str` Class

In [0]:
# WARNING: do NOT do this
letters = '' # start with empty string
for c in document:
  if c.isalpha():
    letters += c # concatenate alphabetic character

Although this code accomplishes the goal, it may be terribly inefficient. Because strings are immutable, the command `letters += c` would presumably compute the concatenation `letters + c` as a new string instance and then reassign the identifier `letters` to that result. If the final result has $n$ characters, the series of concatenations would take $O(n^2)$ time. A more standard Python idiom that guarantees linear-time composition of a string is to use a temporary list to store individual pieces, and then to rely on the `join` method of the `str` class to compose the final result:

In [0]:
temp = [] # start with empty list
for c in document:
  if c.isalpha():
    temp.append(c) # append alphabetic character
letters = ''.join(temp) # compose overall result

This approach is guaranteed to run in $O(n)$ time. The series of up to $n$ append calls will require a total of $O(n)$ time, as per the definition of the amortized cost of that operation. The final call to `join` is likewise guaranteed to take time that is linear in the final length of the composed string.

To improve the execution time, *list comprehension* syntax is recommended:

In [0]:
letters = ''.join([c for c in document if c.isalpha()])

Better yet, use a generator comprehension to avoid the temporary list:

In [0]:
letters = ''.join(c for c in document if c.isalpha())
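The snippets in this section assume a string named `document` already exists. The sketch below supplies a made-up sample value for it and checks that all four approaches produce the same result; the variable names ending in `_loop`, `_join`, `_comp`, and `_gen` are introduced only for this comparison.

```python
document = 'Hello, World! 123'     # sample input, assumed for illustration

# Repeated concatenation (potentially quadratic)
letters_loop = ''
for c in document:
    if c.isalpha():
        letters_loop += c

# Temporary list plus join (linear)
temp = []
for c in document:
    if c.isalpha():
        temp.append(c)
letters_join = ''.join(temp)

# List comprehension and generator variants
letters_comp = ''.join([c for c in document if c.isalpha()])
letters_gen = ''.join(c for c in document if c.isalpha())

print(letters_loop, letters_join, letters_comp, letters_gen)
```

The results are identical; the approaches differ only in how much intermediate work and memory they may require.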

#5.5 Array-Based Sequences

##5.5.1 Storing High Scores for a Game
This application is representative of many applications in which a sequence of objects must be stored. \\
`_score` represents the score itself. `_name` represents the name of the person earning this score... etc


####`GameEntry` class

In [0]:
class GameEntry:
  '''Represents one entry of a list of high scores'''
  
  def __init__(self, name, score):
    self._name = name
    self._score = score
    
  def get_name(self):
    return self._name
  
  def get_score(self):
    return self._score
    
  def __str__(self):
    return '({0}, {1})'.format(self._name, self._score) # e.g. '(Bob, 98)'

###A Class for High Scores
To maintain a sequence of high scores, a class `Scoreboard` is developed, which is limited to a certain number of high scores that can be saved; once that limit is reached, a new score only qualifies for the scoreboard if it is strictly higher than the lowest "high score" on the board.

A Python `list` named `_board` is used to manage the `GameEntry` instances that represent the high scores. Initially, all entries are set to `None`. As entries are added, the board will be maintained from highest to lowest score.

####`Scoreboard` Class

In [0]:
class Scoreboard:
  '''Fixed-length sequence of high scores in nonincreasing order'''
  
  def __init__(self, capacity=10):
    '''
    Initialize scoreboard with given maximum capacity
    All entries are initially None
    '''
    self._board = [None] * capacity # reserve space for future scores
    self._n = 0 # number of actual entries
    
  def __getitem__(self, k):
    '''Return entry at index k'''
    return self._board[k]
  
  def __str__(self):
    '''Return string representation of the high score list'''
    return '\n'.join(str(self._board[j]) for j in range(self._n))
  
  def add(self, entry):
    '''Consider adding entry to high scores'''
    score = entry.get_score()
    
    # Does new entry qualify as a high score?
    # yes if board not full or score is higher than last entry
    good = self._n < len(self._board) or score > self._board[-1].get_score()
    
    if good:
      if self._n < len(self._board): # no score drops from list
        self._n += 1 # so overall number increases
        
      # shift lower scores rightward to make room for new entry
      # (this must also run when the board is full, so it sits outside the inner if)
      j = self._n - 1
      while j > 0 and self._board[j-1].get_score() < score:
        self._board[j] = self._board[j-1] # shift entry from j-1 to j
        j -= 1 # and decrement j
      self._board[j] = entry # when done, add new entry
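The insertion logic of `add` can be sketched in isolation, using plain `(name, score)` tuples in place of `GameEntry` objects. The function `add_score` and the sample names and scores are hypothetical, introduced only to show what happens both before and after the board fills up.

```python
def add_score(board, n, entry):
    """Insert entry = (name, score) into board[0:n], kept from highest to lowest score.

    Returns the updated number of entries. Illustrative stand-in for Scoreboard.add.
    """
    score = entry[1]
    if n == len(board) and score <= board[-1][1]:
        return n                          # board is full and the score does not qualify
    if n < len(board):
        n += 1                            # no score drops off, so the count grows
    j = n - 1
    while j > 0 and board[j-1][1] < score:
        board[j] = board[j-1]             # shift the lower score one slot right
        j -= 1
    board[j] = entry
    return n

board = [None] * 3
n = 0
for entry in [('Ann', 50), ('Bob', 90), ('Cy', 70), ('Dee', 80), ('Eve', 10)]:
    n = add_score(board, n, entry)

print(board)
```

Once the board is full, a qualifying score such as 80 pushes the lowest entry off the end, while a score of 10 is ignored.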

##5.5.2 Sorting a Sequence

###Insertion-Sort Algorithm
Start with the first element in the array. One element by itself is already sorted. For the next element, if it is smaller than the first, swap them. Then swap the third element leftward until it is in its proper order with the first two elements. Continue in this manner with each subsequent element until the whole array is sorted.

####Algorithm: InsertionSort(A):
Input: An array `A` of n comparable elements \\
Output: The array `A` with elements rearranged in nondecreasing order \\
`for k from 1 to n-1 do` \\
$\quad$ `Insert A[k] at its proper location within A[0], A[1],...,A[k]`

In [0]:
def insertion_sort(A):
  '''Sort list of comparable elements into nondecreasing order'''
  for k in range(1, len(A)): # from 1 to n-1
    cur = A[k] # current element to be inserted
    j = k # find correct index j for current
    while j>0 and A[j-1] > cur: # element A[j-1] must be after current
      A[j] = A[j-1]
      j -= 1
    A[j] = cur # cur is now in the right place

The nested loops of insertion-sort lead to an $O(n^2)$ running time in the worst
case. The most work is done if the array is initially in reverse order. On the other
hand, if the initial array is nearly sorted or perfectly sorted, insertion-sort runs in
$O(n)$ time because there are few or no iterations of the inner loop.
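The difference between the best and worst cases can be made concrete by counting how many times the inner loop shifts an element. The function `insertion_sort_shifts` is a variant written for this illustration; it sorts in place exactly as above and additionally returns the shift count.

```python
def insertion_sort_shifts(A):
    """Sort A in place into nondecreasing order and return the number of shifts."""
    shifts = 0
    for k in range(1, len(A)):
        cur = A[k]
        j = k
        while j > 0 and A[j-1] > cur:
            A[j] = A[j-1]        # one iteration of the inner loop
            j -= 1
            shifts += 1
        A[j] = cur
    return shifts

n = 10
already_sorted = list(range(n))
reversed_input = list(range(n, 0, -1))

best = insertion_sort_shifts(already_sorted)
worst = insertion_sort_shifts(reversed_input)
print('sorted input:  ', best, 'shifts')
print('reversed input:', worst, 'shifts')
```

For reversed input of length $n$ the count is $n(n-1)/2$, matching the $O(n^2)$ worst case, while sorted input needs no shifts at all.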

##5.5.3 Simple Cryptography

###Converting Between `String` and `Char` Lists
Given that strings are immutable, there is no way to directly edit an instance to encrypt it. A convenient technique for string transformations is to create an equivalent list of characters, edit the list, and then reassemble a (new) string based on the list.

In [0]:
list('bird')

['b', 'i', 'r', 'd']

In [0]:
''.join(list('bird'))

'bird'

###Using Characters as Array Indices
Number letters like array indices: A is 0, B is 1, C is 2, etc...

Thus, the Caesar cipher with a rotation of $r$ can be written as a simple formula: replace each letter $i$ with the letter $(i+r) \bmod 26$, where mod is the **modulo** operator, which returns the remainder after performing an integer division. Decryption for the Caesar cipher is the opposite: replace each letter with the one $r$ places before it, with wraparound. Letter $i$ is replaced by letter $(i-r) \bmod 26$.

Characters are represented in Unicode by integer code points, and the code points for the uppercase letters of the Latin alphabet are consecutive. The function `ord(c)` takes a one-character string as a parameter and returns the integer code point for that character. Conversely, the function `chr(j)` takes an integer and returns its associated one-character string.

To find a replacement for a character in the Caesar cipher, the characters `'A'` to `'Z'` must be mapped to the respective numbers 0 to 25 using the formula: `j = ord(c) - ord('A')`
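A single-character round trip shows the index arithmetic, including the wraparound at the end of the alphabet. The rotation `r = 3` and the letter `'Y'` are arbitrary choices for this example.

```python
r = 3                                         # rotation amount (example value)
c = 'Y'

j = ord(c) - ord('A')                         # map 'A'..'Z' to 0..25
encrypted = chr((j + r) % 26 + ord('A'))      # rotate forward, wrapping past 'Z'

i = ord(encrypted) - ord('A')
decrypted = chr((i - r) % 26 + ord('A'))      # rotate backward; % yields a value in 0..25

print(c, '->', encrypted, '->', decrypted)
```

In Python, the `%` operator returns a nonnegative result for a positive modulus even when the left operand is negative, so `(i - r) % 26` needs no special handling.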

####Caesar Cipher Implementation

In [0]:
class CaesarCipher:
  '''Class for doing encryption and decryption using a Caesar cipher'''
  
  def __init__(self, shift):
    '''Construct Caesar cipher using given integer shift for rotation'''
    encoder = [None] * 26 # temp array for encryption
    decoder = [None] * 26 # temp array for decryption
    for k in range(26):
      encoder[k] = chr((k + shift) % 26 + ord('A'))
      decoder[k] = chr((k - shift) % 26 + ord('A'))
    self._forward = ''.join(encoder) # will store as string
    self._backward = ''.join(decoder) # since fixed
    
  def encrypt(self, message):
    '''Return string representing encrypted message'''
    return self._transform(message, self._forward)
  
  def decrypt(self, secret):
    '''Return decrypted message given encrypted secret'''
    return self._transform(secret, self._backward)
  
  def _transform(self, original, code):
    '''Utility to perform transformation based on given code string'''
    msg = list(original)
    for k in range(len(msg)):
      if msg[k].isupper():
        j = ord(msg[k]) - ord('A') # index from 0 to 25
        msg[k] = code[j] # replace this character
    return ''.join(msg)
  
  
if __name__ == '__main__':
    cipher = CaesarCipher(3)
    message = 'THE EAGLE IS IN PLAY; MEET AT JOE\'S.'
    coded = cipher.encrypt(message)
    print('Secret: ', coded)
    answer = cipher.decrypt(coded)
    print('Message: ', answer)

Secret:  WKH HDJOH LV LQ SODB; PHHW DW MRH'V.
Message:  THE EAGLE IS IN PLAY; MEET AT JOE'S.


#5.6 Multidimensional Data Sets
To properly initialize a two-dimensional list, each cell of the primary list must refer to an independent instance of a secondary list.

In [0]:
data = [ [0] * c for j in range(r)]
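The reason for the comprehension becomes clear when it is compared with the tempting shortcut of multiplying the outer list. In the sketch below, the dimensions and the names `shared` and `independent` are chosen for illustration.

```python
r, c = 2, 3                                   # example dimensions

shared = [[0] * c] * r                        # r references to the SAME inner list
shared[0][0] = 1                              # the change shows up in every row

independent = [[0] * c for _ in range(r)]     # a fresh inner list for each row
independent[0][0] = 1                         # only the first row changes

print(shared)
print(independent)
```

Multiplying the outer list copies references rather than rows, so all rows alias one list; the comprehension evaluates `[0] * c` once per row.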

####Two-Dimensional Arrays and Positional Games - Tic-Tac-Toe


In [0]:
class TicTacToe:
  '''Management of a Tic-Tac-Toe game (does not do strategy)'''
  
  def __init__(self):
    '''Start a new game'''
    self._board = [ [' '] * 3 for j in range(3)]
    self._player = 'X'
    
  def mark(self, i, j):
    '''Put an X or O mark at position (i,j) for next player\'s turn'''
    if not (0 <= i <= 2 and 0 <= j <= 2):
      raise ValueError('Invalid board position')
    if self._board[i][j] != ' ':
      raise ValueError('Board position occupied')
    if self.winner() is not None:
      raise ValueError('Game is already complete')
    
    self._board[i][j] = self._player
    if self._player == 'X':
      self._player = 'O'
    else:
      self._player = 'X'
      
  def _is_win(self, mark):
    '''Check whether the board configuration is a win for the given player'''
    board = self._board # local variable for shorthand
    return (mark == board[0][0] == board[0][1] == board[0][2] or # row 0
            mark == board[1][0] == board[1][1] == board[1][2] or # row 1
            mark == board[2][0] == board[2][1] == board[2][2] or # row 2
            mark == board[0][0] == board[1][0] == board[2][0] or # column 0
            mark == board[0][1] == board[1][1] == board[2][1] or # column 1
            mark == board[0][2] == board[1][2] == board[2][2] or # column 2
            mark == board[0][0] == board[1][1] == board[2][2] or # diagonal
            mark == board[0][2] == board[1][1] == board[2][0]) # rev diag
  
  def winner(self):
    '''Return mark of winning player, or None to indicate a tie'''
    for mark in 'XO':
      if self._is_win(mark):
        return mark
    return None
  
  def __str__(self):
    '''Return string representation of current game board'''
    rows = ['|'.join(self._board[r]) for r in range(3)]
    return '\n-----\n'.join(rows)

In [0]:
game = TicTacToe( )
# X moves: # O moves:
game.mark(1, 1); game.mark(0, 2)
game.mark(2, 2); game.mark(0, 0)
game.mark(0, 1); game.mark(2, 1)
game.mark(1, 2); game.mark(1, 0)
game.mark(2, 0)

In [0]:
print(game)

O|X|O
-----
O|X|X
-----
X|O|X


In [0]:
winner = game.winner()
if winner is None:
  print('Tie')
else:
  print(winner, 'wins')

Tie
