# Data Structures and Algorithms in Python - Ch.5: Array-based Sequences
### AJ Zerouali, 2023/09/01

## 0) Introduction

These notes cover Python's built-in array-based sequences, namely lists, tuples and strings.

**References:**

- Chapter 5 of "Data structures and algorithms in Python", by Goodrich, Tamassia and Goldwasser (primary). 
- Section 12 of "Python for Data Structures, Algorithms, and Interviews!" by Jose Portilla.

We skip Section 5.1 here, as it only gives a general overview of the chapter.

#### To do (23/09/01):


## 1) Low-level arrays

Section 5.2 of the book covers the basics of how arrays are stored in computer memory.

Notes:
- Mostly about referencing.
- Python lists behave like arrays of pointers (references). This means that when we extend a list or slice it, we are only copying memory addresses, not the entries themselves.
- Copying only the references is what Python calls a "shallow copy". Duplicating the referenced objects themselves in memory is a "deep copy" (cf. Python's *copy* module).
- Assigning a value to a given list index is $O(1)$.
- Accessing a fixed index in an array is done in $O(1)$ time.
- Lists, tuples and strings all support $O(1)$ index access. Index assignment applies only to lists, since tuples and strings are immutable.
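
To make the shallow/deep distinction in the notes above concrete, here is a minimal sketch. The variable names (`original`, `shallow`, `deep`) are chosen for illustration only; the slice copies references, while `copy.deepcopy` duplicates the nested objects.

```python
import copy

# A list of lists: the outer list stores references to the inner lists.
original = [[1, 2], [3, 4]]

shallow = original[:]            # slicing copies the references only
deep = copy.deepcopy(original)   # deepcopy duplicates the inner lists too

# The shallow copy is a new outer list, but its entries are the same objects.
print(shallow is original)        # False
print(shallow[0] is original[0])  # True
print(deep[0] is original[0])     # False

# Mutating an inner list is visible through the shallow copy, not the deep one.
original[0].append(99)
print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```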

## 2) Dynamic arrays and amortization

### 2.a - Dynamic arrays

As the name suggests, a dynamic array is one whose size and memory allocation are variable. For practical verifications in Python, we can use the *sys* module and the *getsizeof()* function to get the memory size of a given array.

#### Example 2.1:

The sizes below are in bytes. Note how, as we grow the array, Python grows the allocated memory in chunks. The exact values depend on the Python version and platform.

In [1]:
import sys

In [3]:
array = []
print("Empty array size:")
print(f"sys.getsizeof(array) = {sys.getsizeof(array)}.")

for i in range(17):
    array.append(i)
    print(f"len(array) = {len(array)}; sys.getsizeof(array) = {sys.getsizeof(array)}.")

Empty array size:
sys.getsizeof(array) = 56.
len(array) = 1; sys.getsizeof(array) = 88.
len(array) = 2; sys.getsizeof(array) = 88.
len(array) = 3; sys.getsizeof(array) = 88.
len(array) = 4; sys.getsizeof(array) = 88.
len(array) = 5; sys.getsizeof(array) = 120.
len(array) = 6; sys.getsizeof(array) = 120.
len(array) = 7; sys.getsizeof(array) = 120.
len(array) = 8; sys.getsizeof(array) = 120.
len(array) = 9; sys.getsizeof(array) = 184.
len(array) = 10; sys.getsizeof(array) = 184.
len(array) = 11; sys.getsizeof(array) = 184.
len(array) = 12; sys.getsizeof(array) = 184.
len(array) = 13; sys.getsizeof(array) = 184.
len(array) = 14; sys.getsizeof(array) = 184.
len(array) = 15; sys.getsizeof(array) = 184.
len(array) = 16; sys.getsizeof(array) = 184.
len(array) = 17; sys.getsizeof(array) = 248.


In [4]:
?sys.getsizeof

Docstring:
getsizeof(object [, default]) -> int

Return the size of object in bytes.
Type:      builtin_function_or_method


_______________________________________________________________

The main takeaway here is that Python's list class is a dynamic array: it over-allocates memory in chunks, so most appends do not require a reallocation.
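
As a complement to Example 2.1, the sketch below records only the lengths at which the reported size changes, which makes the chunked growth easier to see. The helper name `resize_points` is hypothetical, and the sizes printed are specific to the CPython version and platform in use, so no particular values should be expected.

```python
import sys

def resize_points(n_appends):
    """Return (length, size_in_bytes) pairs at which getsizeof changes."""
    data = []
    last_size = sys.getsizeof(data)
    points = [(0, last_size)]
    for i in range(n_appends):
        data.append(i)
        size = sys.getsizeof(data)
        if size != last_size:
            # The list was reallocated during this append.
            points.append((len(data), size))
            last_size = size
    return points

for length, size in resize_points(64):
    print(f"len = {length:2d}; sys.getsizeof = {size} bytes")
```

Only a handful of the 64 appends trigger a reallocation; all the others reuse the spare capacity left by the previous one.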

The second example is an implementation of a dynamic array.

#### Example 2.2:

This example follows Code Fragment 5.3 of Goodrich-Tamassia-Goldwasser (see also Portilla's lecture 48). Instead of using a Python list, the underlying storage is a low-level *ctypes* array of *ctypes.py_object* references. The documentation for *ctypes* can be found here:

https://docs.python.org/3/library/ctypes.html#loading-dynamic-link-libraries

(This class seems to )

In [2]:
import ctypes
import sys

In [3]:
class DynamicArray(object):
    '''
        Basic class for a dynamic array.
    '''
    def __init__(self):
        
        self.n_elem = 0
        self.capacity = 1
        self.A = self._make_array(self.capacity)
    
    def __len__(self):
        '''
            Return length of array
        '''
        return self.n_elem
    
    def __getitem__(self, idx):
        '''
            Return content of array at index idx
        '''
        # Valid indices are 0, ..., n_elem - 1; raise (not return) on failure
        if not 0 <= idx < self.n_elem:
            raise IndexError(f"idx = {idx} is out of range")
        
        return self.A[idx]
    
    def _resize(self, new_size):
        '''
            Set new array capacity to new_size
        '''
        B = self._make_array(new_size)
        for i in range(self.n_elem):
            B[i] = self.A[i]
        
        self.A = B
        self.capacity = new_size
        
    
    def _make_array(self, capacity):
        '''
            Instantiate an array of size "capacity"
        '''
        return (capacity*ctypes.py_object)()
    
    def append(self, elem):
        '''
            Append "elem" to array
        '''
        if self.capacity == self.n_elem:
            self._resize(2*self.capacity)
        
        self.A[self.n_elem] = elem
        self.n_elem += 1
            
    
    

In [4]:
arr = DynamicArray()
print(f"len(arr) = {len(arr)}; arr.capacity = {arr.capacity}; sys.getsizeof(arr) = {sys.getsizeof(arr)}.\n")

for i in range(16):
    print(f"i = {i}:")
    print(f"-------")
    arr.append(i)
    print(f"len(arr) = {len(arr)}; arr.capacity = {arr.capacity}; sys.getsizeof(arr) = {sys.getsizeof(arr)}.\n")
    

len(arr) = 0; arr.capacity = 1; sys.getsizeof(arr) = 48.

i = 0:
-------
len(arr) = 1; arr.capacity = 1; sys.getsizeof(arr) = 48.

i = 1:
-------
len(arr) = 2; arr.capacity = 2; sys.getsizeof(arr) = 48.

i = 2:
-------
len(arr) = 3; arr.capacity = 4; sys.getsizeof(arr) = 48.

i = 3:
-------
len(arr) = 4; arr.capacity = 4; sys.getsizeof(arr) = 48.

i = 4:
-------
len(arr) = 5; arr.capacity = 8; sys.getsizeof(arr) = 48.

i = 5:
-------
len(arr) = 6; arr.capacity = 8; sys.getsizeof(arr) = 48.

i = 6:
-------
len(arr) = 7; arr.capacity = 8; sys.getsizeof(arr) = 48.

i = 7:
-------
len(arr) = 8; arr.capacity = 8; sys.getsizeof(arr) = 48.

i = 8:
-------
len(arr) = 9; arr.capacity = 16; sys.getsizeof(arr) = 48.

i = 9:
-------
len(arr) = 10; arr.capacity = 16; sys.getsizeof(arr) = 48.

i = 10:
-------
len(arr) = 11; arr.capacity = 16; sys.getsizeof(arr) = 48.

i = 11:
-------
len(arr) = 12; arr.capacity = 16; sys.getsizeof(arr) = 48.

i = 12:
-------
len(arr) = 13; arr.capacity = 16; sys.get

In [9]:
arr[15]

15
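
Note that `sys.getsizeof(arr)` stays at 48 in the output above: it only accounts for the `DynamicArray` instance itself, not for the buffer that `arr.A` refers to. A minimal sketch of how one might measure the buffer instead is given below, using `ctypes.sizeof` on a standalone array built the same way as in `_make_array` (the free function `make_array` is a stand-in for that method).

```python
import ctypes

def make_array(capacity):
    """Allocate a raw array of `capacity` object references."""
    return (capacity * ctypes.py_object)()

# Size of a single reference, in bytes (platform dependent).
ref_size = ctypes.sizeof(ctypes.py_object)

for capacity in (1, 2, 4, 8, 16):
    buffer = make_array(capacity)
    print(f"capacity = {capacity:2d}; ctypes.sizeof(buffer) = {ctypes.sizeof(buffer)} bytes")
```

The buffer size is the capacity times the size of one reference, so it doubles at every resize even though the size of the wrapper object does not change.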

An important point to note in this example is the array resizing operation, which takes $O(n)$ time in this implementation, since every stored element is copied into the new array. The next subsection introduces some tools to analyse why doubling the capacity whenever the array is full is more efficient than incrementing it by 1 at each append.
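
Before the formal analysis, a rough empirical comparison may help. The sketch below does not allocate anything; it only counts how many element copies the resizes would perform under the two growth strategies, assuming an initial capacity of 1 as in Example 2.2. The function `count_copies` and its `grow` parameter are hypothetical names introduced for this illustration.

```python
def count_copies(n_appends, grow):
    """Count element copies made by resizes over n_appends appends.

    `grow` maps the current capacity to the new one when the array is full.
    """
    capacity, n_elem, copies = 1, 0, 0
    for _ in range(n_appends):
        if n_elem == capacity:
            capacity = grow(capacity)
            copies += n_elem  # a resize copies every stored element
        n_elem += 1
    return copies

for n in (16, 256, 4096):
    doubling = count_copies(n, lambda c: 2 * c)
    by_one = count_copies(n, lambda c: c + 1)
    print(f"n = {n:4d}; copies with doubling = {doubling}; copies with +1 growth = {by_one}")
```

For 16 appends this gives 15 copies with doubling against 120 with unit increments: the first count grows linearly in $n$, the second quadratically.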

### 2.b - Amortized analysis