# Dynamic Arrays: A Simple Data Structure

These notes supplement Section 17.4 of the CLRS book, which covers a lot more than we do here.


A dynamic array is simply an array that can grow in size to accommodate newly added elements.

## Arrays 

An array is a "contiguous chunk" of random access memory in a computer.  
  - Random Access: We can access the individual cells of the array as `a[1]`, `a[2]`, ..., `a[n]`, where  $n$ is the size of the array. 
  - Reading or writing to the memory element at index `j`  takes $O(1)$ time. 
  
Our goal is to maintain an array of $n$ elements and support the following operations:
  - Reading/Writing to a particular index $j$ where $1 \leq j \leq n$.
  - Adding a new element at the end of the array: the size of the array will become $n+1$ as a result.
  
  
## Memory Allocator

The main difficulty in implementing an array data structure lies in how a process obtains memory. Every operating system has a memory management module that allocates memory to running programs. A program can request a "contiguous chunk" of $k$ memory cells using an "allocation" function. This function is set up differently in different programming languages. For instance, in Python, we can allocate an array of size `k`, with every cell initialized to $0$, as follows:

~~~
a = [0]* k
~~~

Note, however, that lists in Python are already "dynamic arrays", implemented in more or less the same manner that we are going to describe here.

https://stackoverflow.com/questions/3917574/how-is-pythons-list-implemented
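
One way to glimpse this behavior, as a rough sketch that assumes the CPython interpreter: `sys.getsizeof` reports the bytes used by a list object, including its pre-allocated slots, so the reported size should jump only at the occasional append that triggers a reallocation. The helper name `capacity_changes` is ours, and the exact lengths at which the jumps occur are an implementation detail that can differ between Python versions.

```python
import sys

def capacity_changes(n):
    """Return the list lengths at which sys.getsizeof reports a new size."""
    lst = []
    last = sys.getsizeof(lst)
    changes = []
    for i in range(n):
        lst.append(i)
        cur = sys.getsizeof(lst)
        if cur != last:
            # The list object changed size: memory was reallocated.
            changes.append(len(lst))
            last = cur
    return changes

changes = capacity_changes(1000)
print(f'{len(changes)} reallocations over 1000 appends, at lengths: {changes}')
```

Most appends leave the reported size unchanged, which is what we expect if spare capacity is allocated ahead of time.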

The curious reader may ask about "deallocation" or "freeing" memory. In some programming environments, such as C/C++, the program must explicitly tell the system that a previously allocated chunk of memory is no longer needed. Python, however, is a _garbage collected_ language: the Python runtime manages memory and decides when a chunk of memory is no longer needed and can be freed. The details of garbage collection are beyond the scope of this course.

In [None]:
# Allocate a new memory of size `size`
def allocateMemory(size): 
    assert size >= 1
    return [0]*size

# Copy the contents of old list into new
def copyInto(old, new):
    assert len(old) <= len(new), 'Not enough space to copy into'
    m = len(old)
    for i in range(m):
        new[i] = old[i]
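
As a small illustration of how these two helpers combine, here is a sketch of a single "grow" step done by hand. The helper definitions are repeated only so that the snippet runs on its own, and the variable names `block` and `bigger` are chosen for this example.

```python
def allocateMemory(size):
    assert size >= 1
    return [0] * size

def copyInto(old, new):
    assert len(old) <= len(new), 'Not enough space to copy into'
    for i in range(len(old)):
        new[i] = old[i]

# A block of 4 cells, all of them in use.
block = allocateMemory(4)
for i in range(4):
    block[i] = 10 * (i + 1)   # block is now [10, 20, 30, 40]

# To make room for a fifth element, allocate a block twice as large
# and copy the old contents over.
bigger = allocateMemory(2 * len(block))
copyInto(block, bigger)
bigger[4] = 50

print(bigger)   # [10, 20, 30, 40, 50, 0, 0, 0]
```

The three trailing cells are allocated but not yet used.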

We will now implement the `DynamicArray` data structure as a Python class. 
It will have three fields:
  - `array`: the block of memory that has been allocated.
  - `allocated_size`: the number of cells in the allocated block.
  - `size`: the number of cells actually in use, that is, the size of the array as seen by its user.
  
Note that `allocated_size` is always at least as large as the actual size. For instance, 
`allocated_size=32` means that `32` cells have been allocated. However, `size=10` means that only 10 out of the 32 cells are used by the array.



In [None]:
class DynamicArray: 
    
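    def __init__(self, initial_size=16, initial_fill=0, debug=False):
        # Doubling an allocated size of 0 would never create any room.
        assert initial_size >= 1, 'initial_size must be at least 1'
        self.allocated_size = initial_size 
        self.size = 0
        self.array = [initial_fill] * initial_size
        self.debug = debug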
    
    # This allows us to directly access d[idx]
    def __getitem__(self, idx):
        assert idx >= 0 and idx < self.size 
        return self.array[idx]
    
    # This allows us to write d[idx] = val 
    def __setitem__(self, idx, val):
        assert idx >= 0 and idx < self.size 
        self.array[idx] = val
    
    def append(self, x):
        # Do we have enough allocated size to just append x to the array?
        if self.size >= self.allocated_size:
            # No, we have run out of pre-allocated memory.
            if self.debug: 
                print(f'Ran out of memory: old allocated size: {self.allocated_size}, new allocated size is {2*self.allocated_size}')
            # Double the size of the allocated memory.
            self.allocated_size = 2 * self.allocated_size
            old_array = self.array
            # Allocate a new block and copy the old contents into it.
            new_array = allocateMemory(self.allocated_size)
            copyInto(old_array, new_array)
            # Make the new block the array's storage.
            self.array = new_array
        # Append the element to the end.
        self.array[self.size] = x
        # Update the size.
        self.size = self.size + 1
    

In [None]:
l = DynamicArray(initial_size=1, initial_fill=0, debug=True)
for j in range(1000):
    l.append(j)
print(f'l[5] = {l[5]}')
l[0] = 30
print(f'l[0] = {l[0]}')


Ran out of memory: old allocated size: 1, new allocated size is 2
Ran out of memory: old allocated size: 2, new allocated size is 4
Ran out of memory: old allocated size: 4, new allocated size is 8
Ran out of memory: old allocated size: 8, new allocated size is 16
Ran out of memory: old allocated size: 16, new allocated size is 32
Ran out of memory: old allocated size: 32, new allocated size is 64
Ran out of memory: old allocated size: 64, new allocated size is 128
Ran out of memory: old allocated size: 128, new allocated size is 256
Ran out of memory: old allocated size: 256, new allocated size is 512
Ran out of memory: old allocated size: 512, new allocated size is 1024
l[5] = 5
l[0] = 30


Suppose we have appended $n$ elements so far to the array. How many times have we had to grow, assuming the initial allocated size is $1$?

- Allocated size grows from 1 to 2
- Allocated size grows from 2 to 4
- $\ldots$
- Allocated size grows from $2^{k}$ to $2^{k+1}$, where $2^{k} < n \leq 2^{k+1}$ (assuming $n \geq 2$).

In other words, we grow $k+1$ times, where $2^{k} < n \leq 2^{k+1}$.
We conclude that $k + 1 = \lceil \log_2(n) \rceil$. For instance, appending $n = 1000$ elements triggers $\lceil \log_2(1000) \rceil = 10$ reallocations, matching the ten messages printed above.

However, when we reallocate from size $m$ to $2m$, we have to copy over $m$ elements of the array. Therefore, the total work spent on reallocation while appending $n$ elements is:

- Allocated size grows from 1 to 2: `1 unit of time`.
- Allocated size grows from 2 to 4: `2 units of time`.
- $\ldots$
- Allocated size grows from $2^{k}$ to $2^{k+1}$: $2^k$ `units of time`.

Total time needed for all the reallocations: $1 + 2 + \cdots + 2^k = 2^{k+1} - 1 \leq 2n - 1$, since $2^k < n$.

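Also, each append requires $1$ unit of time to write the new element and update the size, for a total of $n$ units over $n$ appends.

Thus, adding all of it up: appending $n$ elements from scratch requires at most $(2n - 1) + n < 3n$ units of time, that is, a constant number of units per append on average.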
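
This count can be checked empirically. The following is a minimal sketch rather than part of the implementation above: the hypothetical helper `simulate_appends` tracks only the sizes involved, charging one unit for each element copied during a reallocation and one unit for each append, starting from an allocated size of $1$.

```python
def simulate_appends(n, initial_allocated_size=1):
    """Simulate n appends with doubling.

    Returns (number of reallocations, elements copied, total units of time).
    """
    allocated_size = initial_allocated_size
    size = 0
    grows = 0
    copies = 0
    for _ in range(n):
        if size >= allocated_size:
            copies += size            # every existing element is copied over
            allocated_size *= 2
            grows += 1
        size += 1                     # 1 unit to write the new element
    return grows, copies, copies + n

for n in [1, 2, 3, 1000, 1024, 1025]:
    grows, copies, total = simulate_appends(n)
    assert copies <= 2 * n - 1
    assert total <= 3 * n
    print(f'n = {n}: {grows} reallocations, {copies} copies, {total} units in total')
```

The assertions encode the two bounds derived above: reallocation work of at most $2n - 1$ and total work of at most $3n$.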
