The Array-Backed List
=====================



## Agenda



1.  The List **Abstract Data Type** (ADT)
2.  A List **Data Structure**
3.  Our List API
4.  Getting started: how to store our data?
5.  Built-in `list` as array
6.  The `ArrayList` data structure



## 1.  The List **Abstract Data Type** (ADT)



An **abstract data type (ADT)** defines a *conceptual model* for how data
may be stored and accessed.

A **list ADT** is a data container where:

-   values are ordered in a *sequence*
-   each value has at most one preceding and one succeeding value
-   a given value may appear more than once in a list
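
As a quick illustration of these three properties, here is a minimal sketch that uses Python's built-in `list` as a stand-in for the ADT (the variable names are ours, chosen only for this example):

```python
# A built-in list standing in for the list ADT.
seq = ['x', 'y', 'x', 'z']

# 1. Values are ordered in a sequence: the same values in a
#    different order form a different list.
same_values_other_order = ['x', 'x', 'y', 'z']
order_matters = seq != same_values_other_order

# 2. Each value has at most one predecessor and one successor.
#    The value at position 1 sits between positions 0 and 2;
#    the first value has no predecessor, the last no successor.
i = 1
predecessor, successor = seq[i - 1], seq[i + 1]

# 3. A given value may appear more than once.
occurrences_of_x = seq.count('x')
```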

Other common ADTs (some of which we'll explore later) include:

-   Stacks
-   Queues
-   Priority Queues
-   Maps
-   Graphs



## 2.  A List **Data Structure**



A **list data structure** is a *concrete implementation* of the list ADT
in some programming language, which, in addition to adhering to the
basic premises of the ADT, will also typically support operations that:

-   access values in the list by their position (index)
-   append and insert new values into the list
-   remove values from the list

The implementation of any data structure will generally rely on simpler,
constituent data types (e.g., "primitive" types offered by the
language), the choice of which may affect the runtime complexities of
said operations.
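
The three kinds of operations listed above can be sketched with Python's built-in `list`, which is one concrete list data structure. This is only an illustration of the operations themselves, not of how we will implement them:

```python
lst = [10, 20, 30]

# Access a value by its position (index).
first = lst[0]

# Append to the end and insert at a given position.
lst.append(40)       # lst is now [10, 20, 30, 40]
lst.insert(1, 15)    # lst is now [10, 15, 20, 30, 40]

# Remove by value and by position.
lst.remove(20)       # lst is now [10, 15, 30, 40]
del lst[0]           # lst is now [15, 30, 40]
```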



## 3.  The List API



The operations we'll be building into our list data structures will be
based on the
[common](https://docs.python.org/3.6/library/stdtypes.html#common-sequence-operations)
and
[mutable](https://docs.python.org/3.6/library/stdtypes.html#mutable-sequence-types)
sequence operations defined by the Python library.
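
To see how Python's syntax maps onto these special methods, here is a small sketch. `Traced` is a hypothetical helper class written only for this demonstration; it delegates to a built-in list and records which special method each expression triggers.

```python
class Traced:
    """Hypothetical wrapper that logs the special methods Python calls."""

    def __init__(self, items):
        self.items = list(items)
        self.calls = []          # names of special methods, in call order

    def __getitem__(self, idx):
        self.calls.append('__getitem__')
        return self.items[idx]

    def __setitem__(self, idx, value):
        self.calls.append('__setitem__')
        self.items[idx] = value

    def __delitem__(self, idx):
        self.calls.append('__delitem__')
        del self.items[idx]

    def __len__(self):
        self.calls.append('__len__')
        return len(self.items)

    def __contains__(self, value):
        self.calls.append('__contains__')
        return value in self.items


t = Traced([1, 2, 3])
x = t[0]          # calls __getitem__
t[1] = 20         # calls __setitem__
del t[2]          # calls __delitem__
n = len(t)        # calls __len__
found = 20 in t   # calls __contains__
```

After running this, `t.calls` lists the five method names in the order the statements appear.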



In [3]:
class List:
    ### subscript-based access ###

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        pass

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        pass

    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        pass

    ### stringification ###

    def __repr__(self):
        """Supports inspection"""
        return '[]'

    def __str__(self):
        """Implements `str(self)`"""
        return '[]'

    ### single-element manipulation ###

    def append(self, value):
        pass

    def insert(self, idx, value):
        pass

    def pop(self, idx=-1):
        pass

    def remove(self, value):
        pass

    ### predicates (T/F queries) ###

    def __eq__(self, other):
        """Implements `self == other`"""
        return True

    def __contains__(self, value):
        """Implements `val in self`"""
        return True

    ### queries ###

    def __len__(self):
        """Implements `len(self)`"""
        return 0  # placeholder until we decide how to store the data

    def min(self):
        pass

    def max(self):
        pass

    def index(self, value, i=0, j=None):
        pass

    def count(self, value):
        pass

    ### bulk operations ###

    def __add__(self, other):
        """Implements `self + other_array_list`"""
        return self

    def clear(self):
        pass

    def copy(self):
        pass

    def extend(self, other):
        pass

    ### iteration ###

    def __iter__(self):
        """Supports iteration (via `iter(self)`)"""
        pass

## 4.  Getting started: how to store our data?



In [4]:
class List:
    def __init__(self):
        pass

    def append(self, value):
        pass

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        pass

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        pass

    def __repr__(self):
        """Supports inspection"""
        pass

## 5.  Built-in `list` as array



To use the built-in list as though it were a primitive array, we will
constrain ourselves to just the following APIs on a given list `lst`:

1.  `lst[i]` for getting and setting values at an *existing, non-negative*
    index `i`
2.  `len(lst)` to obtain the number of slots
3.  `lst.append(None)` to grow the list by *one slot at a time*
4.  `del lst[len(lst)-1]` to delete the last slot in a list
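
To show that these four operations are enough to build richer ones, here is a minimal sketch of insertion at an arbitrary index. `insert_at` is a hypothetical helper name used only for this example; it restricts itself to the operations listed above.

```python
def insert_at(lst, idx, value):
    """Insert `value` at position `idx` using only the constrained API."""
    lst.append(None)                          # grow the list by one slot
    for i in range(len(lst) - 1, idx, -1):    # walk from the last slot down to idx+1
        lst[i] = lst[i - 1]                   # shift each value one slot to the right
    lst[idx] = value                          # drop the new value into the gap


nums = [1, 2, 4]
insert_at(nums, 2, 3)    # nums is now [1, 2, 3, 4]
insert_at(nums, 0, 0)    # nums is now [0, 1, 2, 3, 4]
```

Shifting from the end toward `idx` matters: shifting in the other direction would overwrite values before they are copied.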



## 6.  The `ArrayList` data structure



In [5]:
class MyArrayList:
    def __init__(self):
        self.data = []

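    def append(self, value):
        self.data.append(None)                  # grow by one slot (the only growth we allow)
        self.data[len(self.data) - 1] = value   # store the value in the new last slot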

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int)) # checks if the value is an integer
        return self.data[idx]

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int))
        self.data[idx] = value

    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        assert(isinstance(idx, int))
        for i in range(idx+1, len(self.data)):
            self.data[i-1] = self.data[i]
        del self.data[len(self.data)-1]

    def __len__(self):
        """Implements `len(self)`"""
        return len(self.data)

    def __repr__(self):
        """Supports inspection"""
        return "[" + ",".join([str(x) for x in self.data]) + "]"

In [6]:
x = MyArrayList()
x.append(1)
x.append(2)
x

[1,2]

In [7]:
y = MyArrayList()
y.append(1)
y.append(3)
y.append(4)
y
del y[0]
y
print(y)

[3,4]


Recall the **Array API**:

-   create an array of size `n`
-   access the element at position `i`
-   set the element at position `i`
-   get the length of the array with `len(array)`

In [8]:
class MyArray:

    def __init__(self, n):
        self.data = [None] * n
        self.len = n

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int) and 0 <= idx < self.len)
        return self.data[idx]

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int) and 0 <= idx < self.len)
        self.data[idx] = value

    def __len__(self):
        """Implements `len(self)`"""
        return self.len

    def __repr__(self):
        """Supports inspection"""
        return "[" + ",".join([str(x) for x in self.data]) + "]"

In [9]:
x = MyArray(3)
x[0] = 'a'
x[1] = 'b'
x[2] = 'c'


In [10]:
class MyActualArrayList:

    def __init__(self, n=1):
        self.data = MyArray(max(1, n))  # backing array; its length is the capacity
        self.len = 0                    # number of elements actually stored

    def append(self, value):
        if self.len >= len(self.data):            # backing array is full
            newa = MyArray(len(self.data) * 2)    # allocate double the capacity
            for i in range(0, self.len):          # copy the existing elements: O(n)
                newa[i] = self.data[i]
            self.data = newa
        self.data[self.len] = value
        self.len += 1

        ### Growing by a single slot on every append would copy 1 + 2 + ... + (n-1)
        ### elements over n appends, i.e., O(n^2) in total. Doubling the capacity
        ### whenever we run out of room keeps the total copying cost at O(n).

    def __getitem__(self, idx):
        """Implements `x = self[idx]`"""
        assert(isinstance(idx, int) and 0 <= idx < self.len)
        return self.data[idx]

    def __setitem__(self, idx, value):
        """Implements `self[idx] = x`"""
        assert(isinstance(idx, int) and 0 <= idx < self.len)
        self.data[idx] = value

    def __delitem__(self, idx):
        """Implements `del self[idx]`"""
        assert(isinstance(idx, int) and 0 <= idx < self.len)
        for i in range(idx+1, self.len):   # shift later elements one slot left
            self.data[i-1] = self.data[i]
        self.data[self.len-1] = None       # clear the vacated last slot
        self.len -= 1

    def __len__(self):
        """Implements `len(self)`"""
        return self.len   # number of stored elements, not the capacity

    def __repr__(self):
        """Supports inspection"""
        return "[" + ",".join([str(self.data[i]) for i in range(self.len)]) + "]"

In [14]:
x = MyActualArrayList()
for i in range(1000):
    x.append(1)
len(x), len(x.data)

(1000, 1024)

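Proof by induction that $\sum_{i=0}^{n} 2^i = 2^{n+1} - 1$: prove the base case, then show that if the claim holds for $n$ it also holds for $n+1$.

* base case ($n = 0$): $\sum_{i=0}^{0} 2^i = 2^0 = 1 = 2^1 - 1$
* inductive hypothesis: assume $\sum_{i=0}^{n} 2^i = 2^{n+1} - 1$ for some $n \ge 0$
* inductive step: $\sum_{i=0}^{n+1} 2^i = \left(\sum_{i=0}^{n} 2^i\right) + 2^{n+1} = (2^{n+1} - 1) + 2^{n+1} = 2 \cdot 2^{n+1} - 1 = 2^{n+2} - 1$. QED
* consequence: starting from capacity 1 and doubling when full, the resizes during $n$ appends copy $1 + 2 + 4 + \dots + 2^k$ elements, where $2^k < n$ is the size at the last resize; this sums to $2^{k+1} - 1 < 2n$, so $n$ appends cost $O(n)$ in total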
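
The cost argument can also be checked empirically. The sketch below does not use the classes above; `copies_for_appends` is a hypothetical helper that only simulates the bookkeeping (size, capacity, and number of element copies), assuming an initial capacity of 1 and a resize whenever the array is full.

```python
def copies_for_appends(n, grow):
    """Count element copies caused by n appends under a capacity-growth rule.

    `grow` maps the current capacity to the new capacity on a resize.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:             # array is full: resize first
            capacity = grow(capacity)
            copies += size               # every stored element gets copied over
        size += 1                        # then store the new element
    return copies


doubling = copies_for_appends(1000, lambda c: 2 * c)   # double when full
one_slot = copies_for_appends(1000, lambda c: c + 1)   # grow by one slot when full
```

For 1000 appends, doubling copies 1 + 2 + ... + 512 = 1023 elements (that is, 2^10 - 1), while growing one slot at a time copies 1 + 2 + ... + 999 = 499500 elements.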