BASC0038 Algorithms, Logic and Structure

# Week 6: Lists, stacks and queues

Author: Sam J. Griffiths (sam.griffiths.19@ucl.ac.uk)

---

# Data structures and abstract data types (ADTs)

So far, we have focused on the concept of an algorithm, which could concisely be described as a method to solve a problem. Now, we will move on to consider data structures, that is, how data is actually stored and manipulated. We will begin to appreciate that these two notions are complementary and closely intertwined. For example, heapsort amounts to using a data structure known as a heap; under the hood, a heap in turn relies on what could accurately be described as algorithms to accomplish certain tasks, performed upon data laid out in memory in a certain way.

Terminology in this area often becomes overloaded and contradictory. We will distinguish between *data structures* and *abstract data types* (ADTs), although such definitions are quite mutable. *Data structures* refer to how data is actually laid out and manipulated in memory, whereas *abstract data types* refer to conceptual collections (e.g. lists, stacks and queues) which can often be implemented using a variety of different data structures.

To illustrate this, a heap can be implemented in a number of ways, such as the binary tree we used. A binary tree itself can be implemented in a number of ways, such as the array with parent-child index calculation we used.
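As a minimal sketch of the second point, assuming 0-based array indexing, the parent-child relationships of a binary tree stored in an array reduce to index arithmetic. The helper names `parent`, `left_child` and `right_child` are illustrative only and are not taken from the lectures.

```python
def parent(i):
    """Index of the parent of node i (assumes 0-based indexing and i > 0)."""
    return (i - 1) // 2


def left_child(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right_child(i):
    """Index of the right child of node i."""
    return 2 * i + 2


# A max-heap laid out level by level in an array
heap = [90, 70, 80, 30, 60, 50, 40]

i = 1  # the node holding 70
print(heap[parent(i)], heap[left_child(i)], heap[right_child(i)])  # 90 30 60
```

The same heap ADT could equally sit on top of explicitly linked tree nodes; only the underlying data structure would change.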

# Lists

## The list ADT
Previously, we have used terms like *list* and *array* interchangeably. From here on in, a *list* refers to an abstract data type with certain properties. A list represents a linear collection of elements, with a start and an end, therefore possessing a length.

Some things we should be able to do with a list include inserting and removing elements, as well as searching it to determine if it contains some given element. We can also expect to be able to determine the length of a list.
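For a concrete feel of these operations, here is a short sketch using Python's built-in list, which is one possible implementation of the list ADT among several.

```python
numbers = [5, 4, 2, 7]

numbers.insert(1, 9)           # insert 9 at index 1 -> [5, 9, 4, 2, 7]
numbers.remove(2)              # remove the first occurrence of the value 2
contains_four = 4 in numbers   # search for a value
length = len(numbers)          # determine the length

print(numbers, contains_four, length)  # [5, 9, 4, 7] True 4
```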

There are a number of ways a list could be implemented in real terms. We will consider the two major approaches: *arrays* and *linked lists*.

## Arrays

Consider some type of element we may want to store in a list, such as an integer. Let us assume our architecture provides 32-bit (unsigned) integers; that is, 32 consecutive bits in memory understood to represent a single value ranging from $0$ to $2^{32}-1=4294967295$.
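The arithmetic can be checked directly. Python's own integers are unbounded, so the sketch below only emulates a 32-bit unsigned integer by masking; the names `BITS`, `max_value` and `wrapped` are illustrative.

```python
BITS = 32

count = 2**BITS           # number of distinct values representable
max_value = 2**BITS - 1   # largest representable value (all 32 bits set)
print(count, max_value)   # 4294967296 4294967295

# Emulate 32-bit wrap-around: adding 1 to the maximum gives 0
wrapped = (max_value + 1) & max_value
print(wrapped)            # 0
```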

What if we would like to store a list of, say, five of these integers? The easiest approach would be to store these consecutively in memory. We could possess a pointer to the first element, and look 32 bits ahead in order to obtain the second element, and another 32 bits ahead to obtain the third element etc.

A memory block of contiguous elements is known as an *array*, the most fundamental way of implementing a list. A key benefit of arrays is their efficiency of indexing; that is, obtaining the element at *any* given index can be done in $O(1)$ time. In the example of 32-bit integers, if you have a pointer in terms of bits to the first (`0`th) object, called `head`, a pointer to the `i`th object is simply `head + i*32`.
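The address calculation can be sketched in Python by simulating memory as a block of bytes. This is only an illustration: it works in bytes rather than bits (so the element size is 4 rather than 32), and the names `memory`, `ELEMENT_SIZE` and `element_at` are assumptions made for the example.

```python
import struct

ELEMENT_SIZE = 4  # bytes per element, i.e. 32 bits

# Simulated memory: 8 unrelated bytes, then five contiguous 32-bit integers
values = [10, 20, 30, 40, 50]
memory = bytes(8) + struct.pack("<5I", *values)
head = 8  # byte offset of the 0th element


def element_at(memory, head, i):
    """Read the i-th 32-bit unsigned integer at address head + i * size."""
    address = head + i * ELEMENT_SIZE
    return struct.unpack_from("<I", memory, address)[0]


print(element_at(memory, head, 0), element_at(memory, head, 3))  # 10 40
```

No matter how large `i` is, finding the element takes one multiplication and one addition, which is why indexing is $O(1)$.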

A related benefit is their inherent space-efficiency; that is, all of the elements are stored in one place, as compactly as possible (without entering the topic of compression, of course). The downside is that inserting a new element into a simple array is an $O(n)$ operation, because an array of length `n` must be moved into newly-allocated memory of size `n+1`.

A *dynamic array* will move the array into more memory than is needed when inserting an element. For example, let's say we are inserting an element at the end of the list. A growth factor of 2 means that a full array of length `n` is moved in $O(n)$ time into newly-allocated memory of size `2*n`. Once the new element has been stored, there are `n-1` empty spaces left at the end of the array, such that the next `n-1` end-insertions do not require reallocation and can thus be done in $O(1)$. By continually doubling the capacity, reallocations become exponentially rarer as the array grows: over $n$ end-insertions, the total number of elements moved by all reallocations is less than $2n$, so the average cost per insertion is constant. Therefore, we say the *amortised* time complexity of this operation is $O(1)$.
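The doubling strategy can be sketched as a small class that counts how many elements are moved during reallocations. This is an illustrative toy, not how any particular library implements it; the class name `DynamicArray` and its `copies` counter are assumptions made for the example.

```python
class DynamicArray:
    """Illustrative dynamic array with a growth factor of 2."""

    def __init__(self):
        self._capacity = 1
        self._length = 0
        self._slots = [None] * self._capacity
        self.copies = 0  # elements moved during reallocations

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        if not 0 <= i < self._length:
            raise IndexError("index out of range")
        return self._slots[i]

    def append(self, value):
        if self._length == self._capacity:
            # Full: move everything into a block twice the size, O(n)
            self._capacity *= 2
            new_slots = [None] * self._capacity
            for i in range(self._length):
                new_slots[i] = self._slots[i]
                self.copies += 1
            self._slots = new_slots
        # A spare slot is now guaranteed: storing the value is O(1)
        self._slots[self._length] = value
        self._length += 1


array = DynamicArray()
n = 1000
for i in range(n):
    array.append(i)

# Total copies stay below 2n, so the average per append is a small constant
print(len(array), array.copies, array.copies < 2 * n)
```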

However, inserting an element into any other position than the end is still $O(n)$, because all elements after that position must be shifted in order to make room. Removal from the end is trivially $O(1)$ even for a simple array, but likewise $O(n)$ for any other position for the same reasons as insertion.
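The shifting involved can be sketched as follows, assuming the underlying block already has one spare slot at the end. The function name `insert_at` and its return value (the number of elements shifted) are illustrative choices.

```python
def insert_at(slots, length, index, value):
    """Insert value at index, shifting later elements right by one.

    Assumes slots has at least one spare slot beyond the first `length`
    elements. Returns the number of elements that had to be shifted.
    """
    shifted = 0
    # Work backwards from the end so that no element is overwritten
    for i in range(length, index, -1):
        slots[i] = slots[i - 1]
        shifted += 1
    slots[index] = value
    return shifted


slots = [5, 4, 2, 7, None]   # four elements plus one spare slot
print(insert_at(slots, 4, 1, 9), slots)   # 3 [5, 9, 4, 2, 7]

slots = [5, 4, 2, 7, None]
print(insert_at(slots, 4, 4, 9), slots)   # 0 [5, 4, 2, 7, 9]
```

Inserting at the end shifts nothing, while inserting at the front shifts every element, which is where the $O(n)$ worst case comes from.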

'Lists' in Python, i.e. structures of the form `x=[5,4,2,7]` we've used so far, are implemented as dynamic arrays.
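One way to glimpse this is to watch the memory footprint of a list as it grows. The sketch below assumes the standard CPython interpreter; the exact sizes and growth pattern are implementation details that vary between versions, so no specific numbers are claimed here.

```python
import sys

x = []
sizes = []
for i in range(64):
    x.append(i)
    sizes.append(sys.getsizeof(x))  # size of the list object in bytes

# Count how many appends changed the reported size of the list
resizes = sum(1 for a, b in zip(sizes, sizes[1:]) if a != b)
print(resizes, "size changes across", len(sizes), "appends")
```

If the list were reallocated on every append, the size would change every time; instead, it should change only occasionally, consistent with spare capacity being reserved in advance.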

## Linked lists

An alternative approach to storing elements contiguously in memory is to allow each one to be stored arbitrarily in memory, apart from each other. Each of these 'nodes' then stores not only the value, but a reference to the next node in memory. The final node can have an empty reference, signifying it is currently the end of the list.

Possessing a linked list then simply means possessing the first node, the *head*. A basic implementation in Python might look like:

In [None]:
class SinglyLinkedNode:
  value = None
  next = None


def linked_list_print(head):
  """Print all values in a linked list.

  Args:
    head: First node in linked list.

  """
  node = head
  while node is not None:
    print(node.value, end=" ")
    node = node.next
  print()


node1 = SinglyLinkedNode()
node1.value = 42

node2 = SinglyLinkedNode()
node2.value = 100
node1.next = node2

node3 = SinglyLinkedNode()
node3.value = -24
node2.next = node3

linked_list_print(node1)

42 100 -24 


Firstly, do not be too scared by the `class` syntax! The idea of this course is not software engineering, so we will not be going into something like object-oriented programming, but all you need to know is that a class is effectively just a custom type. Here, a `SinglyLinkedNode` just contains two attributes, as discussed: a `value` and a `next` reference. Both default to `None`.

`node1 = SinglyLinkedNode()` creates a `SinglyLinkedNode` named `node1`. The value is then set to some integer. A second node is created and then added to the list by setting the `next` reference of the first node to it. For the final node, the default `next` of `None` automatically makes it the last element in the list, unless changed.

Secondly, `linked_list_print(head)` is a simple example of how to iterate over the entire linked list; it should look very similar to the list-search pseudocode from the lecture. The usage of the `print` function here separates each value with a space instead of a line break, and then inserts a line break at the end.

The above code can also be improved slightly by providing a *method* to the class, that is, a function which belongs to it. All methods begin with a `self` parameter which refers to the current object itself; you'll notice that this parameter does not appear when actually calling the methods. A method called `__init__` is known as a *constructor* and allows for nicer syntax when creating objects:

In [None]:
class SinglyLinkedNode:
  """Singly-linked list node.

  Attributes:
    value: Value stored in node.
    next: Reference to next node, or None if tail node.

  """

  def __init__(self, value=None, next=None):
    """Construct a SinglyLinkedNode.

    Args:
      value (optional): Value stored in node. Defaults to None.
      next (optional): Reference to next node. Defaults to None.

    """
    self.value = value
    self.next = next

In [None]:
# Same as last example
node1 = SinglyLinkedNode(42)

node2 = SinglyLinkedNode(100)
node1.next = node2

node3 = SinglyLinkedNode(-24)
node2.next = node3

linked_list_print(node1)

42 100 -24 


In [None]:
# Showing off the constructors even better, in reverse
node3 = SinglyLinkedNode(-24)
node2 = SinglyLinkedNode(100, node3)
node1 = SinglyLinkedNode(42, node2)

linked_list_print(node1)

42 100 -24 


Don't worry too much about this object-oriented programming; it is just used here to make the syntax of `value,next` pairs nice and elegant. We could even go so far as to create a helper function to construct a list from an arbitrary number of values:

In [None]:
def make_singly_linked_list(*args):
  """Make a singly-linked list containing the given arguments.

  Returns:
    SinglyLinkedNode: Head of linked list.

  """
  # If no arguments are given, return None
  if len(args) == 0:
    return None

  # If at least one argument is given, make a list
  head = SinglyLinkedNode(args[0])
  node = head
  for arg in args[1:]:
    node.next = SinglyLinkedNode(arg)
    node = node.next

  return head


mylist = make_singly_linked_list(42, 100, -24)
linked_list_print(mylist)

42 100 -24 


## ✍️ Exercise: Linked list operations

Using the `SinglyLinkedNode` class provided above, implement some of the operations on linked lists as given in the lectures.

First, implement `linked_list_search(head, value)`, which searches a linked list for the given `value`.

<h2>👇</h2>

In [None]:
def linked_list_search(head, value):
  """Query if a value is in a linked list.

  Args:
    head: First node in linked list.
    value: Value to query presence of.

  Returns:
    True if value is in list, False otherwise.

  """
  node = head
  while node is not None:
    if node.value == value:
      return True
    else:
      node = node.next

  return False

🟢

In [None]:
# Output should be:
# (True, True, True, False)

mylist = make_singly_linked_list(0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 110, 111)

(linked_list_search(mylist, 36),
 linked_list_search(mylist, 0),
 linked_list_search(mylist, 111),
 linked_list_search(mylist, -2))

(True, True, True, False)

Next, implement both `linked_list_insert_front(head, value)` and `linked_list_insert_back(head, value)` to insert a value at the start and end of a list, respectively.

As we are so far holding a list simply by holding the head node, the functions must return the new head node we need to hold.

<h2>👇</h2>

In [None]:
def linked_list_insert_front(head, value):
  """Insert a value at the front of a linked list.

  Args:
    head: First node in linked list.
    value: Value to insert.

  Returns:
    SinglyLinkedNode: New head of list.

  """
  return SinglyLinkedNode(value, head)


def linked_list_insert_back(head, value):
  """Insert a value at the back of a linked list.

  Args:
    head: First node in linked list.
    value: Value to insert.

  Returns:
    SinglyLinkedNode: Head of list (a new node if the list was empty).

  """
  if head is None:
    return SinglyLinkedNode(value, head)

  node = head
  while node.next is not None:
    node = node.next
  node.next = SinglyLinkedNode(value)

  return head

🟢

In [None]:
# Output should be:
# 0 1 4 9 16 25 36 49 64 81 110 111 
# -999 0 1 4 9 16 25 36 49 64 81 110 111 
# -999 0 1 4 9 16 25 36 49 64 81 110 111 999 

mylist = make_singly_linked_list(0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 110, 111)
linked_list_print(mylist)
mylist = linked_list_insert_front(mylist, -999)
linked_list_print(mylist)
mylist = linked_list_insert_back(mylist, 999)
linked_list_print(mylist)

0 1 4 9 16 25 36 49 64 81 110 111 
-999 0 1 4 9 16 25 36 49 64 81 110 111 
-999 0 1 4 9 16 25 36 49 64 81 110 111 999 


## Linked list class
As mentioned, we are currently holding a list by holding the head node. The code below defines a singly-linked list class with a `head` field, mirroring the *L* and *L.head* notation in the lecture.

In [None]:
class SinglyLinkedList:
  """Singly-linked list of values.

  Attributes:
    head: First node in linked list.

  """

  def __init__(self, *args):
    """Construct a SinglyLinkedList containing the given arguments."""
    self.head = make_singly_linked_list(*args)

  def __str__(self):
    """Provide string representation for print() etc.

    Returns:
      str: String representation.

    """
    result = ""
    node = self.head
    while node is not None:
      result += str(node.value) + " "
      node = node.next
    return result

  def __len__(self):
    """Provide length of list for len() etc.

    Returns:
      int: Length of list.

    """
    length = 0
    node = self.head
    while node is not None:
      length += 1
      node = node.next
    return length

  def __contains__(self, value):
    """Query if a value is in the list for 'in' etc.

    Args:
      value: Value to query presence of.

    Returns:
      bool: True if value is in list, False otherwise.

    """
    return linked_list_search(self.head, value)

  def front(self):
    """Return the value at the front of the list.

    Returns:
      First value in list.

    """
    if self.head is None:
      raise IndexError("front of empty list")

    return self.head.value

  def back(self):
    """Return the value at the back of the list.

    Returns:
      Last value in list.

    """
    if self.head is None:
      raise IndexError("back of empty list")

    # Get last node -- tail reference would be quicker here!
    node = self.head
    while node.next is not None:
      node = node.next

    return node.value

  def insert_front(self, value):
    """Insert a value at the front of the list.

    Args:
      value: Value to insert.

    """
    self.head = linked_list_insert_front(self.head, value)

  def insert_back(self, value):
    """Insert a value at the back of the list.

    Args:
      value: Value to insert.

    """
    self.head = linked_list_insert_back(self.head, value)

  def remove_front(self):
    """Remove the value at the front of the list."""
    if self.head is None:
      raise IndexError("remove_front from empty list")

    self.head = self.head.next

  def remove_back(self):
    """Remove the value at the back of the list."""
    if self.head is None:
      raise IndexError("remove_back from empty list")

    if self.head.next is None:
      self.head = None
    else:
      # Get penultimate node -- doubly-linked would be quicker here!
      penultimate = self.head
      while penultimate.next.next is not None:
        penultimate = penultimate.next
      penultimate.next = None

In [None]:
mylist = SinglyLinkedList(0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 110, 111)
print(mylist)
print(len(mylist))
print((36 in mylist,
       0 in mylist,
       111 in mylist,
       -2 in mylist))

print(mylist.front())
print(mylist.back())

mylist.insert_front(-999)
mylist.insert_back(999)
print(mylist)

mylist.remove_front()
mylist.remove_back()
print(mylist)

0 1 4 9 16 25 36 49 64 81 110 111 
12
(True, True, True, False)
0
111
-999 0 1 4 9 16 25 36 49 64 81 110 111 999 
0 1 4 9 16 25 36 49 64 81 110 111 


Some things to note are:

*   The functions you've implemented above are used to power a lot of this class!
*   The `__str__` function enables string representation for the class, allowing it to be seamlessly `print`ed. Likewise, `__len__` enables use of `len()` on the class, and `__contains__` enables use of the `in` operator to test for membership.
*    The `front` and `back` methods simply return the first and last values in the list. Similarly, `remove_front` and `remove_back` do exactly as they imply. Look at these functions and understand how they work, too. As suggested in the comments, it should be clear how these functions would be improved by holding a `tail` pointer to the back of the list (in addition to `head`) and doubly-linking nodes such that they refer to the previous node as well as the next.
*    If an attempt is made to obtain or remove from the front/back of an empty list, an *exception* (specifically, an IndexError) is *thrown* (or *raised*). You're probably used to seeing exceptions flung at you when trying to solve your exercises; this is an example of how these errors are raised when something bad happens.
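To make the `tail` and doubly-linked suggestions concrete, here is a minimal sketch under stated assumptions: the names `DoublyLinkedNode`, `DoublyLinkedList` and `values` are hypothetical and not part of the course code, and only the back-of-list operations are shown. It also shows how a raised `IndexError` can be caught with `try`/`except`.

```python
class DoublyLinkedNode:
    """Node holding a value plus references to both neighbours."""

    def __init__(self, value, prev=None, next=None):
        self.value = value
        self.prev = prev
        self.next = next


class DoublyLinkedList:
    """Illustrative list holding both a head and a tail reference."""

    def __init__(self):
        self.head = None
        self.tail = None

    def insert_back(self, value):
        """Insert at the back in O(1) using the tail reference."""
        node = DoublyLinkedNode(value, prev=self.tail)
        if self.tail is None:
            self.head = node        # list was empty
        else:
            self.tail.next = node
        self.tail = node

    def remove_back(self):
        """Remove the back value in O(1) using the prev reference."""
        if self.tail is None:
            raise IndexError("remove_back from empty list")
        self.tail = self.tail.prev
        if self.tail is None:
            self.head = None        # list is now empty
        else:
            self.tail.next = None

    def values(self):
        """Collect all values from head to tail into a Python list."""
        result = []
        node = self.head
        while node is not None:
            result.append(node.value)
            node = node.next
        return result


mylist = DoublyLinkedList()
for v in (42, 100, -24):
    mylist.insert_back(v)
mylist.remove_back()
print(mylist.values())  # [42, 100]

# Removing from an empty list raises an exception, which can be caught
try:
    DoublyLinkedList().remove_back()
except IndexError as error:
    print("caught:", error)
```

Neither `insert_back` nor `remove_back` contains a loop here, in contrast to the traversals needed in the singly-linked version.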



# Stacks and queues

We have looked at the general concept of a list ADT and seen an example of how a list can be implemented with a linked-list data structure.

Two other common examples of ADTs are *stacks* and *queues*. These are both representations of a collection of elements, both with specific constraints:

*   Elements can be inserted ('pushed') into a stack. Elements can also be removed one-by-one ('popped') from a stack. Each element popped from the stack is the element most recently pushed onto it, i.e. it is *last-in, first-out* (LIFO). Imagine a stack of plates, where the most recent plate added to the top is the next one to be removed.
*   Elements can be pushed into a queue and popped from a queue. Each element popped from the queue is the element earliest pushed onto it, i.e. it is *first-in, first-out* (FIFO). Imagine a queue of people waiting in line.

A well-known use of a stack is the function call stack, where nested function calls resolve in reverse order. A queue is useful for things such as processing requests and jobs in the order they were received.
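The call-stack behaviour can be observed with a small recursive function that records when each call starts and finishes. The function name `nested` and the `log` list are illustrative only.

```python
def nested(depth, max_depth, log):
    """Record entry and exit of each nested call."""
    log.append(f"enter {depth}")
    if depth < max_depth:
        nested(depth + 1, max_depth, log)
    log.append(f"exit {depth}")


log = []
nested(1, 3, log)
print(log)
# ['enter 1', 'enter 2', 'enter 3', 'exit 3', 'exit 2', 'exit 1']
```

The most recently entered call is always the first to finish, which is exactly the LIFO discipline of a stack.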

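Stacks and queues could be implemented with arrays or linked lists. A linked list can add and remove elements at the start in $O(1)$ and, given a tail reference, insert at the end in $O(1)$ too, whereas an array needs $O(n)$ to add or remove at the start. Linked lists are therefore a natural choice for this purpose, for queues in particular, and are what we will use here.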
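For comparison, Python's standard library already offers containers that can play both roles. The sketch below uses a built-in list as a stack and `collections.deque` as a queue; it is shown only to illustrate LIFO versus FIFO ordering, and is not the implementation asked for in the exercise.

```python
from collections import deque

items = ("first", "second", "third")

# Stack: push and pop at the same end of a built-in list
stack = []
for item in items:
    stack.append(item)                    # push
popped_from_stack = [stack.pop() for _ in items]

# Queue: push at the back, pop from the front of a deque
queue = deque()
for item in items:
    queue.append(item)                    # push
popped_from_queue = [queue.popleft() for _ in items]

print(popped_from_stack)  # ['third', 'second', 'first']
print(popped_from_queue)  # ['first', 'second', 'third']
```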

## ✍️ Exercise: Stack and queue classes

Implement both a `Stack` and `Queue` class using the `SinglyLinkedList` class provided above. Each should simply hold a `SinglyLinkedList` attribute, set to an empty list in `__init__`. A `push` method inserts a value into the stack/queue, a `pop` method returns the next value from the stack/queue (removing it in the process) and an `empty` method returns whether or not the queue/stack is empty. 

<h2>👇</h2>

In [None]:
class Stack:
  """Stack of values on a last-in, first-out (LIFO) basis."""

  def __init__(self):
    """Construct an empty Stack."""
    self._data = SinglyLinkedList()

  def push(self, value):
    """Insert a value onto the top of the stack.

    Args:
      value: Value to push.

    """
    self._data.insert_front(value)

  def pop(self):
    """Return and remove the value on the top of the stack.

    Returns:
      Value on top of stack.

    """
    value = self._data.front()
    self._data.remove_front()
    return value

  def empty(self):
    """Determine if the stack is empty, i.e. has no elements.

    Returns:
      True if stack is empty, False otherwise.

    """
    return len(self._data) == 0


class Queue:
  """Queue of values on a first-in, first-out (FIFO) basis."""

  def __init__(self):
    """Construct an empty Queue."""
    self._data = SinglyLinkedList()

  def push(self, value):
    """Insert a value onto the back of the queue.

    Args:
      value: Value to push.

    """
    self._data.insert_back(value)

  def pop(self):
    """Return and remove the value on the front of the queue.

    Returns:
      Value on front of queue.

    """
    value = self._data.front()
    self._data.remove_front()
    return value

  def empty(self):
    """Determine if the queue is empty, i.e. has no elements.

    Returns:
      True if queue is empty, False otherwise.

    """
    return len(self._data) == 0

🟢

In [None]:
stack = Stack()
stack.push("first")
stack.push("second")
stack.push("third")

while not stack.empty():
  print(stack.pop())

third
second
first


In [None]:
queue = Queue()
queue.push("first")
queue.push("second")
queue.push("third")

while not queue.empty():
  print(queue.pop())

first
second
third
