# Class 13: Reinforcement of linear data structures

# (Re-)Implementing a list data structure as `PyList`

This material has been adapted from [Lee & Hubbard (2015) Chapter 4](https://learn.colorado.edu/d2l/le/content/190526/viewContent/2885050/View).

In [None]:
[None] * 10

In [13]:
class PyList(object):
    
    def __init__(self,contents=(),size=10):
        self.items = [None] * size
        self.numItems = 0
        self.size = size

        for content in contents:
            self.append(content)
            
    def __getitem__(self,index):
        if index >= 0 and index < self.numItems:
            return self.items[index]
        else:
            raise IndexError("PyList out of range")
            
    def __setitem__(self,index,val):
        if index >= 0 and index < self.numItems:
            self.items[index] = val
            return
        else:
            raise IndexError('PyList assignment index out of range')
            
    def __add__(self,other):
        result = PyList(size=self.numItems+other.numItems)
        
        for i in range(self.numItems):
            result.append(self.items[i])
        
        for i in range(other.numItems):
            result.append(other.items[i])
            
        return result
    
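    def __makeroom(self):
        # Grow the capacity by about 25% (plus one, so a size of 0 still grows)
        newlen = (self.size // 4) + self.size + 1
        newlst = [None] * newlen
        
        # Copy the existing items into the larger list
        for i in range(self.numItems):
            newlst[i] = self.items[i]
            
        self.items = newlst
        self.size = newlen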
        
    def append(self,item):
        if self.numItems == self.size:
            self.__makeroom()
            
        self.items[self.numItems] = item
        self.numItems += 1
    
    def insert(self,i,e):
        if self.numItems == self.size:
            self.__makeroom()
            
        if i < self.numItems:
            for j in range(self.numItems-1,i-1,-1):
                self.items[j+1] = self.items[j]
                
            self.items[i] = e
            self.numItems +=  1
        else:
            self.append(e)
    
    def __delitem__(self,index):
        if index < 0 or index >= self.numItems:
            raise IndexError('PyList deletion index out of range')

        for i in range(index, self.numItems-1):
            self.items[i] = self.items[i+1]
            
        self.numItems -= 1
        
    def __len__(self):
        return self.numItems
    
    def __str__(self):
        s = "["
        
        for i in range(self.numItems):
            s = s + str(self.items[i])
            if i < self.numItems - 1:
                s = s + ", "
                
        s = s + "]"
        return s
    
    # Uncomment this to see something interesting
    #def __repr__(self):
    #    s = "PyList(["
    #    
    #    for i in range(self.numItems):
    #        s = s + repr(self.items[i])
    #        if i < self.numItems - 1:
    #            s = s + ", "
    #            
    #    s = s + "])"
    #    return s

In [14]:
test_pylist = PyList(['a','b','c'])
test_pylist

<__main__.PyList at 0x106b5f4e0>

In [4]:
test_pylist.items

['a', 'b', 'c', None, None, None, None, None, None, None]

In [5]:
test_pylist[0]

'a'

In [6]:
test_pylist[2] = 'two'
test_pylist.items

['a', 'b', 'two', None, None, None, None, None, None, None]

In [7]:
test_pylist.append('delta')
test_pylist.items

['a', 'b', 'two', 'delta', None, None, None, None, None, None]

In [8]:
test_pylist.insert(1,'inserted')
test_pylist.items

['a', 'inserted', 'b', 'two', 'delta', None, None, None, None, None]

In [9]:
del test_pylist[1]
test_pylist.items

['a', 'b', 'two', 'delta', 'delta', None, None, None, None, None]

In [10]:
len(test_pylist)

4

In [11]:
print(test_pylist)

[a, b, two, delta]


In [12]:
test_pylist

PyList(['a', 'b', 'two', 'delta'])
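
The resizing rule used by `__makeroom` can be traced on its own. The sketch below applies the same formula, `(size // 4) + size + 1`, to a starting capacity of 10; the helper name `grow` is made up for this illustration and is not part of the `PyList` class.

```python
def grow(size):
    """Return the next capacity using the same rule as PyList.__makeroom."""
    return (size // 4) + size + 1

# Start from the default capacity and apply the rule five times
capacity = 10
capacities = [capacity]
for _ in range(5):
    capacity = grow(capacity)
    capacities.append(capacity)

print(capacities)  # [10, 13, 17, 22, 28, 36]
```

Each resize adds about a quarter of the current capacity, so the list needs fewer and fewer resizes per appended item as it gets longer.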

# Stacks and queues

A **stack** is a data structure where the last element added is the first element retrieved ("last in, first out").

A **queue** is a data structure where the first element added is the first element retrieved ("first in, first out").

Stacks and queues are appealing since all their mutator methods operate in $O(1)$ time (amortized, in the list-based implementations below). For many kinds of computations, you only need to work through a sequence of inputs one at a time rather than operate on the entire list. In those circumstances, a stack or queue design can make your code more efficient.

We can use lists in Python to implement much of the [functionality of a stack or queue](https://docs.python.org/3.5/tutorial/datastructures.html#using-lists-as-stacks). This material has been adapted from [Lee & Hubbard (2015) Chapter 4](https://learn.colorado.edu/d2l/le/content/190526/viewContent/2885050/View).
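
As a quick illustration of the built-in options, here is a minimal sketch that uses a plain list as a stack and `collections.deque` as a queue. The variable names are only for illustration.

```python
from collections import deque

# A plain list works as a stack: append and pop both act on the end.
stack = []
stack.append('a')
stack.append('b')
stack.append('c')
last_in = stack.pop()        # removes and returns 'c'

# A deque supports fast removal from the left end, which suits a queue.
queue = deque()
queue.append('a')
queue.append('b')
queue.append('c')
first_in = queue.popleft()   # removes and returns 'a'

print(last_in, first_in)     # c a
```

The classes below implement the same behavior by hand so we can see what is happening underneath.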

### Stack

Methods on a stack:

| Operation | Complexity | Usage | Description |
| --- | :---: | --- | --- |
| Stack creation | $O(1)$ | `s = Stack()` | Calls the constructor |
| push | $O(1)$ | `s.push(a)` | Puts the item `a` on top of the stack `s` |
| pop | $O(1)$ | `a = s.pop()` | Returns the top (most recently pushed) item in `s` and removes it |
| top | $O(1)$ | `a = s.top()` | Returns the top item in `s` without popping |
| is_empty | $O(1)$ | `a = s.is_empty()` | Returns `True` if the stack has no pushed items |


In [15]:
class Stack(object):
    def __init__(self):
        self.items = []
        
    def pop(self):
        if self.is_empty():
            raise RuntimeError('Cannot pop an empty stack')
        top_idx = len(self.items) - 1
        item = self.items[top_idx]
        del self.items[top_idx]
        return item
    
    def push(self,item):
        self.items.append(item)
        
    def top(self):
        if self.is_empty():
            raise RuntimeError('Empty stack has no top')
            
        top_idx = len(self.items) - 1
        return self.items[top_idx]
    
    def is_empty(self):
        return len(self.items) == 0
    
    def __repr__(self):
        return 'Stack('+str(self.items)+')'

In [16]:
test_stack = Stack()
test_stack

Stack([])

In [17]:
test_stack.is_empty()

True

In [18]:
test_stack.push('a')
print(test_stack)
test_stack.push('b')
print(test_stack)
test_stack.push('c')
print(test_stack)

Stack(['a'])
Stack(['a', 'b'])
Stack(['a', 'b', 'c'])


In [19]:
test_stack.top()

'c'

In [20]:
test_stack.pop()
print(test_stack)
test_stack.pop()
print(test_stack)
test_stack.pop()
print(test_stack)

Stack(['a', 'b'])
Stack(['a'])
Stack([])


In [21]:
test_stack.pop()

RuntimeError: Cannot pop an empty stack
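
One classic use of "last in, first out" behavior is checking whether brackets are balanced. The sketch below uses a plain list as the stack so it runs on its own; the function name `is_balanced` is hypothetical and not from the textbook.

```python
def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in '([{':
            # Remember each opening bracket until its partner shows up
            stack.append(ch)
        elif ch in pairs:
            # A closing bracket must match the most recent opening bracket
            if not stack or stack.pop() != pairs[ch]:
                return False
    # Anything left on the stack was never closed
    return not stack

print(is_balanced('(a[b]{c})'))  # True
print(is_balanced('(]'))         # False
```

The most recently opened bracket is always the next one that has to close, which is exactly the order a stack hands items back.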

### Queues

Methods on a queue:

| Operation | Complexity | Usage | Description |
| --- | :---: | --- | --- |
| Queue creation | $O(1)$ | `q = Queue()` | Calls the constructor |
| enqueue | $O(1)$ | `q.enqueue(a)` | Puts the item `a` on the queue `q` |
| dequeue | $O(1)$ | `a = q.dequeue()` | Returns first item in `q` and removes it |
| front | $O(1)$ | `a = q.front()` | Returns front item in `q` without dequeueing |
| is_empty | $O(1)$ | `a = q.is_empty()` | Returns `True` if queue has no enqueued items |


In [1]:
class Queue(object):
    
    def __init__(self):
        self.items = []
        self.front_index = 0
        
    def __compress(self):
        new_list = []
        for i in range(self.front_index,len(self.items)):
            new_list.append(self.items[i])
        
        self.items = new_list
        self.front_index = 0
    
    def enqueue(self,item):
        self.items.append(item)

    def dequeue(self):
        if self.is_empty():
            raise RuntimeError('Cannot dequeue an empty queue')
            
        if self.front_index * 2 > len(self.items):
            self.__compress()
            
        item = self.items[self.front_index]
        self.front_index += 1
        return item
        
    def front(self):
        if self.is_empty():
            raise RuntimeError('Cannot access front of empty queue')
        else:
            return self.items[self.front_index]
        
    def is_empty(self):
        return self.front_index == len(self.items)
    
    def __repr__(self):
        return 'Queue('+str(self.items)+')'

In [2]:
test_queue = Queue()
test_queue

Queue([])

In [24]:
test_queue.enqueue('a')
print(test_queue)
test_queue.enqueue('b')
print(test_queue)
test_queue.enqueue('c')
print(test_queue)

Queue(['a'])
Queue(['a', 'b'])
Queue(['a', 'b', 'c'])


In [25]:
test_queue.front()

'a'

In [26]:
test_queue.is_empty()

False

In [27]:
test_queue.dequeue()

'a'

In [28]:
test_queue.dequeue()

'b'

In [29]:
test_queue.dequeue()

'c'

In [30]:
test_queue.dequeue()

RuntimeError: Cannot dequeue an empty queue

# Iterators and generators

Adapted from Python Practice Book's [Chapter 5](http://anandology.com/python-practice-book/iterators.html).

In [31]:
numbers = [1,2,3,4,5]

for num in numbers:
    print(num)

1
2
3
4
5


Oftentimes when we're working with large datasets, we don't need to compute all the values and store them in memory, but just need to compute them one at a time. In other words, we can iterate through a list of numbers without having to "remember" all of them. 

In [39]:
iter_numbers = iter(numbers)
iter_numbers

<list_iterator at 0x106b5f940>

We can make an iterator give us all the values hiding inside it by wrapping it in a list.

In [40]:
list(iter_numbers)

[1, 2, 3, 4, 5]

In [41]:
iter_numbers

<list_iterator at 0x106b5f940>

In [42]:
list(iter_numbers)

[]

Most of the time, though, we use iterators to produce values one at a time. Just like the `Queue` class we implemented above, once an iterator hands out a value, that value is gone and the iterator cannot go back to it.

In [43]:
iter_numbers = iter(numbers)
next(iter_numbers)

1

In [44]:
next(iter_numbers)

2

In [45]:
next(iter_numbers)

3

In [46]:
next(iter_numbers)

4

In [47]:
next(iter_numbers)

5

There are no more values left in the iterator's queue, so another `next` raises a `StopIteration` exception.

In [48]:
next(iter_numbers)

StopIteration: 

Check that `iter_numbers` is well and truly empty.

In [51]:
list(iter_numbers)

[]

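Python 3 has lots of objects that behave much like iterators, producing their values lazily. Note that a `range` is a reusable iterable rather than a one-shot iterator, so it can be converted to a list more than once.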

In [52]:
r = range(1,6)
r

range(1, 6)

In [53]:
list(r)

[1, 2, 3, 4, 5]

In [54]:
list(r)

[1, 2, 3, 4, 5]
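
One way to tell a reusable iterable from a one-shot iterator is to call `iter` on it: an iterator returns itself, while an iterable such as a `range` returns a fresh iterator each time. A minimal sketch:

```python
r = range(1, 6)
it = iter(r)

print(iter(it) is it)    # True: an iterator is its own iterator
print(iter(r) is r)      # False: a range builds a new iterator on request

first_pass = list(it)    # consumes every value in the iterator
second_pass = list(it)   # nothing left the second time
print(first_pass, second_pass)  # [1, 2, 3, 4, 5] []
```

This is why `list(r)` gives the same answer twice, while `list(iter_numbers)` came back empty on the second call.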

In [56]:
from itertools import permutations
perms = permutations('cat',3)
perms

<itertools.permutations at 0x106b5a830>

In [57]:
list(perms)

[('c', 'a', 't'),
 ('c', 't', 'a'),
 ('a', 'c', 't'),
 ('a', 't', 'c'),
 ('t', 'c', 'a'),
 ('t', 'a', 'c')]

A generator function is a way of creating iterators where we use `yield` rather than `return` to create sequences of results.

In [66]:
def yield_range(min_val, max_val):
    i = min_val
    while i < max_val:
        yield i
        i += 1

In [69]:
for i in yield_range(1,6):
    print(i)

1
2
3
4
5


In [67]:
yrange = yield_range(1,6)
yrange

<generator object yield_range at 0x106b82048>

In [60]:
next(yrange)

1

In [61]:
next(yrange)

2

In [62]:
next(yrange)

3

In [63]:
next(yrange)

4

In [64]:
next(yrange)

5

In [65]:
next(yrange)

StopIteration: 
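
Because a generator only computes a value when asked for one, it can describe a sequence with no end. The sketch below assumes a hypothetical generator function `count_up` and uses `itertools.islice` to take just the first few values.

```python
from itertools import islice

def count_up(start):
    """Yield start, start + 1, start + 2, ... without ever stopping."""
    i = start
    while True:
        yield i
        i += 1

# islice stops asking for values after five of them
first_five = list(islice(count_up(1), 5))
print(first_five)  # [1, 2, 3, 4, 5]
```

Calling `list(count_up(1))` directly would never finish, so an unbounded generator should always be paired with something that limits how many values are requested.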

# Arrays in `numpy`

Recall that a list can be used to store just about anything else in Python, including other lists.

In [70]:
list_of_lists = [['a','b'],[1,2],['alpha','beta']]
list_of_lists

[['a', 'b'], [1, 2], ['alpha', 'beta']]

We can access elements within the list of lists by chaining index calls together.

In [71]:
list_of_lists[2]

['alpha', 'beta']

In [72]:
list_of_lists[2][0]

'alpha'

A list of lists is an example of an array. More precisely, a flat list is the simplest kind of array, and nesting lists inside lists adds further dimensions.

In [74]:
num_l_of_l = [[[1,2],
               [3,4]],
              [[5,6],
               [7,8]]]
num_l_of_l

[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

In [75]:
num_l_of_l[1][1][1]

8

For many kinds of statistical metrics, simulations, etc. we represent the data as arrays where the position in the array as well as the value both have meaning. For example, a square array might represent a map of a square area and the values might be temperature recordings from each point in the map. 
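
The temperature map idea can be sketched with a plain list of lists, where the row and column give the location and the value gives the reading. The numbers below are invented purely for illustration.

```python
# Made-up temperature readings on a 3x3 grid (rows run north to south)
temps = [[20, 21, 19],
         [22, 24, 21],
         [18, 19, 20]]

# Position carries meaning: row 1, column 1 is the center of the map
center = temps[1][1]

# Averaging over the whole map takes nested loops with plain lists
n_cells = sum(len(row) for row in temps)
mean_temp = sum(sum(row) for row in temps) / n_cells

print(center)   # 24
print(n_cells)  # 9
```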

Python has an extremely powerful library called [numpy](https://docs.scipy.org/doc/numpy-dev/user/index.html) we've already used to help out in previous class notes and Lab Assignments. Numpy's killer application is handling arrays of numeric data really quickly and elegantly. Covering all of numpy (and its helper library SciPy, for handling scientific computing) would take an entire class, so we'll only get to scratch the surface.

We'll begin by importing numpy, giving it the "np" alias and creating a basic 1-dimensional array.

In [76]:
import numpy as np

a_1d = np.arange(16)
a_1d

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [77]:
type(a_1d)

numpy.ndarray

We can also create numpy arrays from lists.

In [78]:
vanilla_list = [0,1,2,3,4,5]
numpyfied_list = np.array(vanilla_list)
numpyfied_list

array([0, 1, 2, 3, 4, 5])

We can access the elements in this array just as we would in a list.

In [79]:
a_1d[5]

5

The `reshape` method can convert this 1-dimensional array into other shapes.

In [81]:
a_2d = a_1d.reshape(4,4)
a_2d

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [82]:
a_2d[1][1]

5

We can also use a different syntax from the list-like chained `__getitem__` calls: pass comma-separated indices inside a single pair of brackets rather than a succession of bracketed values. Both forms return the same element, and which one reads more clearly depends on the context.

In [83]:
a_2d[1,1]

5
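
The comma-separated form also accepts slices, which makes it easy to pull out whole rows, columns, or blocks. A minimal sketch that rebuilds the same 4x4 array so it runs on its own:

```python
import numpy as np

a_2d = np.arange(16).reshape(4, 4)

# Both indexing styles reach the same element
same = a_2d[1][1] == a_2d[1, 1]

second_row = a_2d[1, :]   # every column of row 1
second_col = a_2d[:, 1]   # every row of column 1
block = a_2d[:2, :2]      # the top-left 2x2 corner

print(second_row)  # [4 5 6 7]
print(second_col)  # [ 1  5  9 13]
```

Selecting a column this way has no direct equivalent with chained brackets, since `a_2d[:][1]` just returns row 1.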

Many of the same slicing accessor approaches still work for numpy arrays.

In [85]:
a_1d

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [84]:
a_1d[1:10:2]

array([1, 3, 5, 7, 9])

We can also mutate numpy arrays.

In [86]:
numpyfied_list

array([0, 1, 2, 3, 4, 5])

In [87]:
numpyfied_list[:3] = -1111
numpyfied_list

array([-1111, -1111, -1111,     3,     4,     5])
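
One detail worth knowing when mutating arrays: a basic slice of a numpy array is a view onto the same data, not a copy, so changes made through the slice show up in the original. The variable names in this sketch are only for illustration.

```python
import numpy as np

original = np.arange(6)

# A slice is a view: it shares memory with the original array
view = original[:3]
view[:] = -1

# An explicit copy is independent of the original
independent = original[3:].copy()
independent[:] = 99

print(original)     # [-1 -1 -1  3  4  5]
print(independent)  # [99 99 99]
```

This differs from plain Python lists, where slicing always produces a new list.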

A numpy `ndarray` also has a number of attributes we can use to inspect its properties.

In [88]:
a_2d.shape

(4, 4)

In [89]:
a_2d.ndim

2

In [90]:
a_2d.size

16

In [91]:
a_2d.dtype

dtype('int64')

We can create more complex, higher-dimensional arrays as well.

In [93]:
a_4d = a_1d.reshape(2,2,2,2)
a_4d

array([[[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]],


       [[[ 8,  9],
         [10, 11]],

        [[12, 13],
         [14, 15]]]])

In [94]:
a_4d.shape

(2, 2, 2, 2)

In [95]:
a_4d.ndim

4

In [96]:
a_4d.size

16

Arithmetic on vanilla Python lists is awkward, but numpy arrays let us do lots of exciting things!

In [97]:
vanilla_list

[0, 1, 2, 3, 4, 5]

In [98]:
vanilla_list + 1

TypeError: can only concatenate list (not "int") to list

In [99]:
list_of_lists

[['a', 'b'], [1, 2], ['alpha', 'beta']]

In [100]:
list_of_lists + 1

TypeError: can only concatenate list (not "int") to list

In [107]:
[i+1 for i in vanilla_list]

[1, 2, 3, 4, 5, 6]

Specifically, numpy can do element-wise operations, which means it applies the operator to every element inside the array.

In [102]:
numpyfied_list = np.arange(1,6)
numpyfied_list

array([1, 2, 3, 4, 5])

In [103]:
numpyfied_list + 1

array([2, 3, 4, 5, 6])

In [104]:
numpyfied_list * 2

array([ 2,  4,  6,  8, 10])

In [105]:
numpyfied_list ** 2

array([ 1,  4,  9, 16, 25])

In [106]:
np.sqrt(numpyfied_list)

array([ 1.        ,  1.41421356,  1.73205081,  2.        ,  2.23606798])

It can also do this for more complex kinds of arrays, like our 4-dimensional array.

In [108]:
a_4d + 1

array([[[[ 1,  2],
         [ 3,  4]],

        [[ 5,  6],
         [ 7,  8]]],


       [[[ 9, 10],
         [11, 12]],

        [[13, 14],
         [15, 16]]]])

In [109]:
a_4d * 2

array([[[[ 0,  2],
         [ 4,  6]],

        [[ 8, 10],
         [12, 14]]],


       [[[16, 18],
         [20, 22]],

        [[24, 26],
         [28, 30]]]])

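The most powerful functionality is that numpy lets you do linear algebra with operations between whole arrays (arrays of different but compatible shapes can also be combined). In the cells below, `+`, `*`, and `/` combine two arrays element by element, while `.dot()` computes their matrix product.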

In [110]:
A = np.array([[1,1],[0,0]])
B = np.array([[1,2],[3,4]])

In [111]:
A + B

array([[2, 3],
       [3, 4]])

In [112]:
A * B

array([[1, 2],
       [0, 0]])

In [113]:
A / B

array([[ 1. ,  0.5],
       [ 0. ,  0. ]])

In [114]:
A.dot(B)

array([[4, 6],
       [0, 0]])
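
To make the difference between the operators concrete, the sketch below repeats `A` and `B`, compares `*` with the matrix product (written with the `@` operator, which gives the same result as `.dot()` for 2-dimensional arrays), and then adds a 1-dimensional array to a 2-dimensional one. The `row` array is an invented example.

```python
import numpy as np

A = np.array([[1, 1], [0, 0]])
B = np.array([[1, 2], [3, 4]])

elementwise = A * B   # multiplies matching positions
matrix = A @ B        # rows of A times columns of B, same as A.dot(B)

# Arrays of different sizes: the row is added to every row of B
row = np.array([10, 20])
shifted = B + row

print(elementwise.tolist())  # [[1, 2], [0, 0]]
print(matrix.tolist())       # [[4, 6], [0, 0]]
print(shifted.tolist())      # [[11, 22], [13, 24]]
```

Stretching the smaller array to fit the larger one is called broadcasting, and it only works when the shapes are compatible.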

Numpy arrays also support the shape manipulations used in linear algebra, such as the transpose.

In [115]:
C = np.arange(12).reshape(3,4)
C

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [116]:
C.T

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])
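
The transpose matters because matrix products need the inner dimensions to agree. As a rough sketch, a 3x4 array cannot be multiplied by itself, but it can be multiplied by its 4x3 transpose:

```python
import numpy as np

C = np.arange(12).reshape(3, 4)

print(C.shape)    # (3, 4)
print(C.T.shape)  # (4, 3)

# (3, 4) times (4, 3) gives a (3, 3) result
product = C @ C.T
print(product.shape)  # (3, 3)

# (3, 4) times (3, 4) is not defined, so numpy raises an error
try:
    C @ C
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```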