# The Array-Backed List

## Agenda

1. The List **Abstract Data Type** (ADT)
2. A List **Data Structure**
3. NumPy arrays
4. The `ArrayList` data structure
5. Runtime analysis

## 1. The List **Abstract Data Type** (ADT)

An **abstract data type (ADT)** defines a *conceptual model* for how data may be stored and accessed.

A **list ADT** is a data container where:

- values are ordered in a *sequence*
- each value has at most one preceding and one succeeding value
- a given value may appear more than once in a list

## 2. A List **Data Structure**

A **list data structure** is a *concrete implementation* of the list ADT in some programming language. In addition to adhering to the basic premises of the ADT, it will typically support operations that:

- access values in the list by their position (index)
- append and insert new values into the list
- remove values from the list

The implementation of any data structure will generally rely on simpler, constituent data types (e.g., "primitive" types offered by the language), the choice of which may affect the runtime complexities of said operations.
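As a rough illustration, the operations listed above can be written down as an abstract interface. The name `ListInterface` and its method set are hypothetical, chosen only for this sketch; Python's built-in `list` is used as one concrete structure that already supports them.

```python
from abc import ABC, abstractmethod


class ListInterface(ABC):
    """Hypothetical interface naming the operations listed above."""

    @abstractmethod
    def __getitem__(self, idx):
        """Return the value at position idx."""

    @abstractmethod
    def append(self, value):
        """Add value after the current last element."""

    @abstractmethod
    def insert(self, idx, value):
        """Place value at position idx, shifting later values right."""

    @abstractmethod
    def __delitem__(self, idx):
        """Remove the value at position idx, shifting later values left."""


# Python's built-in list supports all of these operations, so it can be
# registered as a "virtual subclass" of the (hypothetical) interface.
ListInterface.register(list)

lst = [10, 20, 20]   # a value may appear more than once
lst.append(30)       # append at the end
lst.insert(0, 5)     # insert at position 0
del lst[1]           # remove the value at position 1
print(lst)
```

The interface says nothing about *how* values are stored; that choice is what distinguishes one list data structure from another.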

## 3. NumPy arrays

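Python's built-in types do not include a fixed-size array that can hold arbitrary objects (the standard library's `array` module stores only basic values such as integers and floats). Instead, we're going to make use of the array implementation provided by the [NumPy scientific computing package](https://numpy.org/doc/stable/user/absolute_beginners.html).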

To create a NumPy array of size N, we can do:

In [None]:
import numpy as np

N = 10
arr = np.empty(N, dtype=object)
arr

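The `dtype=object` specification indicates that we want to use the array to store references to arbitrary Python objects. The `empty` function creates an array of the specified size without filling in any values; for most dtypes the contents are arbitrary, but an array with `dtype=object` starts out with every element set to `None`.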
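A small check of what a freshly created object array contains, and of the fact that its slots can refer to values of different types. The variable name `slots` is used only for this sketch.

```python
import numpy as np

slots = np.empty(4, dtype=object)
print(slots)           # a new object array holds None in every slot

# Each slot is just a reference, so the values need not share a type.
slots[0] = 42
slots[1] = 'forty-two'
slots[2] = 4.2
print(slots)
```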

In [None]:
for i in range(5):
    arr[i] = i
arr

In [None]:
arr[0] = 'hello'
arr[4] = 'world'
arr

In [None]:
len(arr)

In [None]:
arr = np.random.random(N)
arr

Recall that arrays are **fixed-size**, so we cannot append, insert, or delete elements to/from them directly. These operations must be implemented by the data structure we build *on top of* the array.
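The fixed size can be observed directly. In this sketch (variable names are illustrative), writing one position past the end raises an `IndexError`, and `np.append` leaves the original array untouched, returning a newly allocated, larger copy instead.

```python
import numpy as np

fixed = np.empty(3, dtype=object)
for i in range(3):
    fixed[i] = i

# Writing past the end fails; the array does not grow on demand.
try:
    fixed[3] = 3
    grew = True
except IndexError:
    grew = False

# np.append does not resize `fixed`; it builds and returns a new array.
bigger = np.append(fixed, 3)
print(grew, len(fixed), len(bigger))
```

Copying into a new, larger array is the same idea the `ArrayList` class relies on when its backing array fills up.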

## 4. The `ArrayList` data structure

In [None]:
import numpy as np

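class ArrayList:
    def __init__(self):
        # backing array starts with capacity 1; `size` counts the slots in use
        self.data = np.empty(1, dtype=object)
        self.size = 0

    def append(self, value, doubling=True):
        # backing array is full: allocate a larger one and copy everything over
        if self.size == len(self.data):
            if doubling:
                nsize = 2 * len(self.data)   # double the capacity
            else:
                nsize = len(self.data) + 1   # grow by a single slot

            ndata = np.empty(nsize, dtype=object)
            for i in range(len(self.data)):
                ndata[i] = self.data[i]
            self.data = ndata

        self.data[self.size] = value
        self.size += 1

    def printarray(self):
        # prints the whole backing array, including unused slots
        print(self.data)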

In [None]:
l = ArrayList()
l.printarray()

In [None]:
l.append(1,False)
l.printarray()

In [None]:
l.append(2,False)
l.printarray()

In [None]:
l.append(3,False)
l.printarray()

In [None]:
l.append(4,False)
l.printarray()

In [None]:
l.append(5,False)
l.printarray()

In [None]:
l.append(6,False)
l.printarray()

In [None]:
l.append(7,False)
l.printarray()

In [None]:
l.append(8,False)
l.printarray()

In [None]:
l = ArrayList()
l.printarray()

In [None]:
l.append(1,True)
l.printarray()

In [None]:
l.append(2,True)
l.printarray()

In [None]:
l.append(3,True)
l.printarray()

In [None]:
l.append(4,True)
l.printarray()

In [None]:
l.append(5,True)
l.printarray()

In [None]:
l.append(6,True)
l.printarray()

In [None]:
l.append(7,True)
l.printarray()

In [None]:
l.append(8,True)
l.printarray()

In [None]:
l.data
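The same backing-array idea extends to positional insertion and removal. The class below is a minimal sketch, not part of `ArrayList`: the name `SketchList`, its methods, and the restriction to non-negative indexes are all assumptions made for illustration.

```python
import numpy as np


class SketchList:
    """Hypothetical array-backed list with positional insert and delete."""

    def __init__(self):
        self.data = np.empty(1, dtype=object)
        self.size = 0

    def _grow(self):
        # Double the capacity and copy the elements currently in use.
        ndata = np.empty(2 * len(self.data), dtype=object)
        for i in range(self.size):
            ndata[i] = self.data[i]
        self.data = ndata

    def insert(self, idx, value):
        # Assumption: only 0 <= idx <= size is accepted (no negative indexes).
        if not 0 <= idx <= self.size:
            raise IndexError('index out of range')
        if self.size == len(self.data):
            self._grow()
        # Shift elements right, starting from the end, to open a gap at idx.
        for i in range(self.size, idx, -1):
            self.data[i] = self.data[i - 1]
        self.data[idx] = value
        self.size += 1

    def delete(self, idx):
        if not 0 <= idx < self.size:
            raise IndexError('index out of range')
        # Shift elements left to close the gap left by the removed value.
        for i in range(idx, self.size - 1):
            self.data[i] = self.data[i + 1]
        self.data[self.size - 1] = None   # drop the stale reference
        self.size -= 1

    def to_list(self):
        # Return only the slots in use, as a plain Python list.
        return [self.data[i] for i in range(self.size)]


s = SketchList()
for v in ['a', 'b', 'c']:
    s.insert(s.size, v)   # inserting at the end behaves like append
s.insert(0, 'z')          # prepend: every existing element shifts right
s.delete(2)               # remove 'b'; later elements shift left
print(s.to_list())
```

Note how much shifting each call performs: inserting at the end moves nothing, while inserting at position 0 moves every element already in the list.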

In [None]:
# plot average runtime of building a list with n calls to `append`,
# for doubling vs. non-doubling strategies, as a function of n

import timeit
import matplotlib.pyplot as plt

reps = 10
ns = np.linspace(1, 100, 50, dtype=int)
# the list is created inside `stmt` so that every repetition starts from an
# empty list (a list created in `setup` would keep growing across repetitions)
ts1 = [timeit.timeit(stmt=f'lst = ArrayList()\nfor x in range({n}): lst.append({n}, doubling=True)',
                     globals=globals(),
                     number=reps) / reps
       for n in ns]
ts2 = [timeit.timeit(stmt=f'lst = ArrayList()\nfor x in range({n}): lst.append({n}, doubling=False)',
                     globals=globals(),
                     number=reps) / reps
       for n in ns]

plt.plot(ns, ts1, 'ob', label='doubling')
plt.plot(ns, ts2, 'or', label='non-doubling')
plt.xlabel('number of elements appended')
plt.ylabel('average time (s)')
plt.legend()
plt.show()
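Timings are noisy, so it can help to count the work directly. The sketch below mirrors the resizing logic of `ArrayList.append` without storing any data and tallies how many element copies the two strategies perform. The function name `count_copies` is hypothetical.

```python
def count_copies(n, doubling):
    """Count element copies made while appending n values to an empty list.

    Mirrors the resizing logic of ArrayList.append without storing any data.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            capacity = 2 * capacity if doubling else capacity + 1
            copies += size   # every existing element is copied over
        size += 1
    return copies


for n in (8, 64, 512):
    print(n, count_copies(n, doubling=True), count_copies(n, doubling=False))
```

For these values of `n`, the doubling strategy copies fewer than `n` elements in total, while growing by one slot copies `n * (n - 1) / 2` elements (28, 2016, and 130816 respectively).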

## 5. Runtime analysis

- Indexing: $O(?)$

- Search (unsorted): $O(?)$

- Search (sorted): $O(?)$

- Deletion: $O(?)$

- Append: $O(?)$
         
- Prepend: $O(?)$