Throughout the discussion of basic data structures, we have used Python lists to implement the abstract data types presented. Unfortunately, “list” is not the best name for this collection type, as we will soon see (a better name would be “array”).

When discussing the list _abstract data type_, we consider a list to be a collection of items where each item holds a relative position with respect to the others.

The members of a list are commonly referred to as nodes. When each node holds a reference to the next node in the list, we call this a singly linked list. When each node holds a reference to both the next and previous nodes in the list, we call this a doubly linked list.

For simplicity, we will assume that lists cannot contain duplicate items. This is another point of departure from Python’s native list type, which permits duplicates.

In this chapter we will consider both unordered and ordered lists. As we will see, an ordered list is simply a list with additional functionality designed to maintain its constituent nodes in a particular order.

The Unordered List Abstract Data Type
---

The structure of an unordered list, as described above, is a collection
of items where each item holds a relative position with respect to the
others. Some possible unordered list operations are given below.

-   `List()` creates a new list that is empty. It needs no parameters
    and returns an empty list.
-   `add(item)` adds a new item to the list. It needs the item and
    returns nothing. Assume the item is not already in the list.
-   `remove(item)` removes the item from the list. It needs the item and
    modifies the list. Assume the item is present in the list.
-   `search(item)` searches for the item in the list. It needs the item
    and returns a boolean value.
-   `is_empty()` tests to see whether the list is empty. It needs no
    parameters and returns a boolean value.
-   `size()` returns the number of items in the list. It needs no
    parameters and returns an integer.
-   `append(item)` adds a new item to the end of the list making it the
    last item in the collection. It needs the item and returns nothing.
    Assume the item is not already in the list.
-   `index(item)` returns the position of item in the list. It needs the
    item and returns the index. Assume the item is in the list.
-   `insert(pos, item)` adds a new item to the list at position pos. It
    needs the position and the item and returns nothing. Assume the item
    is not already in the list and there are enough existing items to
    have position pos.
-   `pop()` removes and returns the last item in the list. It needs
    nothing and returns an item. Assume the list has at least one item.
-   `pop(pos)` removes and returns the item at position pos. It needs
    the position and returns the item. Assume the list has an item at
    position pos.

The Ordered List
---

Later in this section we will also consider the ordered list, which is an
abstract data type identical to the unordered list described above, but with
the additional property that its items are maintained in a meaningful order.
For instance if we add the numbers 2, 3 and 1 to an ordered list, we would
expect to be able to access them as `1 -> 2 -> 3` or perhaps `3 -> 2 -> 1`
depending on whether that ordered list is designed to maintain an ascending
or descending order.
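
To make the ordering property concrete before any linked structure is built, the sketch below imitates an ascending ordered list with a plain Python list and the standard library's `bisect.insort`. It only illustrates the expected behaviour and is not the linked implementation developed later in this chapter. The name `ordered` is hypothetical.

```python
import bisect

# A plain Python list standing in for an ascending ordered list.
ordered = []
for number in (2, 3, 1):
    # insort places each number at the position that keeps the list sorted.
    bisect.insort(ordered, number)

print(ordered)  # [1, 2, 3]
```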


### Implementing an Unordered List

In order to implement an unordered list, we will construct what is
commonly known as a **linked list**. Recall that we need to be sure that
we can maintain the relative positioning of the items. However, there is
no requirement that we maintain that positioning in contiguous memory.
For example, consider the collection of items shown below. It appears that these values have been
placed randomly.

![Items not constrained in their physical
placement](figures/random-items.png)

If we can maintain some explicit information in each
item, namely the location of the next item, then the relative position of each item
can be expressed by simply following the link from one item to the next:

![Relative positions maintained by explicit
links](figures/explicit-links.png)

It is important to note that the location of the first item of the list
must be explicitly specified. Once we know where the first item is, the
first item can tell us where the second is, and so on. The external
reference is often referred to as the **head** of the list. Similarly,
the last item needs to know that there is no next item.
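
The idea of explicit links can be sketched without any classes at all. In the illustration below, a dictionary named `memory` plays the role of scattered storage: each key is an arbitrary, made-up location, and each entry stores an item together with the location of the next item. The names `memory`, `head`, and `items_in_order` are assumptions introduced for this example only.

```python
# Hypothetical "memory": arbitrary locations mapped to (item, location of next item).
# The locations are deliberately out of order to mimic scattered placement.
memory = {
    40: (54, 12),
    12: (26, 87),
    87: (93, None),  # None marks the last item: there is no next location
}
head = 40  # external reference to the location of the first item


def items_in_order(memory, head):
    """Follow the links starting at head and collect the items in list order."""
    items = []
    location = head
    while location is not None:
        item, next_location = memory[location]
        items.append(item)
        location = next_location
    return items


print(items_in_order(memory, head))  # [54, 26, 93]
```

Even though the entries are stored in no particular order, following the links from `head` recovers the relative positions of the items.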


The `Node` Class
----------------

The basic building block for the linked list implementation is the
**node**. Each node object must hold at least two pieces of information.
First, the node must contain the list item itself. We will call this the
**data field** of the node. In addition, each node must hold a reference
to the next node. Here we provide one simple Python
implementation:

```
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None
```

To construct a node, you need to supply the initial data
value for the node. Evaluating the assignment statement below will yield
a node object containing the value passed:

```
>>> temp = Node(93)
>>> temp.value
93
```

The special Python reference value `None` will play an important role in
the `Node` class and later in the linked list itself. A reference to
`None` will denote the fact that there is no next node. Note in the
constructor that a node is initially created with `next` set to `None`.
Since this is sometimes referred to as “grounding the node,” we will use
the standard ground symbol to denote a reference that is referring to
`None`. It is always a good idea to explicitly assign `None` to your
initial next reference values.

![A typical representation for a node](figures/node.png)
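
As a small illustration of how nodes connect, the sketch below builds two nodes by hand and links the first to the second. The variable names `first` and `second` are chosen for this example only.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None  # a new node starts out grounded


first = Node(54)
second = Node(26)
first.next = second  # link the first node to the second

print(first.next.value)     # 26
print(second.next is None)  # True
```

The second node remains grounded, which is how we will recognise the end of a chain of nodes.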

The `UnorderedList` Class
--------------------------

As we suggested above, the unordered list will be built from a
collection of nodes, each linked to the next by explicit references. As
long as we know where to find the first node (containing the first
item), each item after that can be found by successively following the
next links. With this in mind, the `UnorderedList` class must maintain a
reference to the first node. Below we show the
constructor. Note that each list object will maintain a single reference
to the head of the list.

```
class UnorderedList(object):

    def __init__(self):
        self.head = None
```

Initially when we construct a list, there are no items. The assignment
statement


```
mylist = UnorderedList()
```

creates this linked list representation:

![An empty list](figures/empty-list.png)

As we discussed in the `Node`
class, the special reference `None` will again be used to state that the
head of the list does not refer to anything. Eventually, the example
list given earlier will be represented by this linked list:

![A linked list of integers](figures/linked-list.png)

The head of the list refers to the first node which contains the first item of
the list. In turn, that node holds a reference to the next node (the next
item) and so on. It is important to note that the list class itself does not
contain any node objects. Instead it contains a single reference to only the
first node in the linked structure.


The `is_empty` method, shown below, simply
checks to see if the head of the list is a reference to `None`. The
result of the boolean expression `self.head is None` will only be true if
there are no nodes in the linked list. Since a new list is empty, the
constructor and the check for empty must be consistent with one another.
This shows the advantage of using the reference `None` to denote the
“end” of the linked structure. In Python, any reference can be compared
to `None` with the `is` operator, which is true only when both operands
refer to the same object. We will use this check often in our remaining
methods.

```
def is_empty(self):
    return self.head is None
```

So, how do we get items into our list? We need to implement the `add`
method. However, before we can do that, we need to address the important
question of where in the linked list to place the new item. Since this
list is unordered, the specific location of the new item with respect to
the other items already in the list is not important. The new item can
go anywhere. With that in mind, it makes sense to place the new item in
the easiest location possible.

Recall that the linked list structure provides us with only one entry
point, the head of the list. All of the other nodes can only be reached
by accessing the first node and then following `next` links. This means
that the easiest place to add the new node is right at the head, or
beginning, of the list. In other words, we will make the new item the
first item of the list and the existing items will need to be linked to
this new first item so that they follow.

The linked list shown above was built by
calling the `add` method a number of times.

```
>>> mylist.add(31)
>>> mylist.add(77)
>>> mylist.add(17)
>>> mylist.add(93)
>>> mylist.add(26)
>>> mylist.add(54)
```

Note that since 31 is the first item added to the list, it will
eventually be the last node on the linked list as every other item is
added ahead of it. Also, since 54 is the last item added, it will become
the data value in the first node of the linked list.

The `add` method is shown below. Each item of the list must reside in a node
object. We create a new node within the method and place the item as its
value. Then we complete the process by linking the new node into the existing
structure.

```
def add(self, item):
    temp = Node(item)
    temp.next = self.head
    self.head = temp
```


Linking the new node into the list requires two steps, as illustrated
below. Step 1 (line 3 of the method) changes the `next` reference of the
new node to refer to the old first node of the list. Now that the rest of
the list has been properly attached to the new node, step 2 (line 4)
modifies the head of the list to refer to the new node.

![Adding a new node is a two-step
process](figures/add-to-head.png)

The order of the two steps described above is very important. What
happens if the order of the steps is reversed? If the
modification of the head of the list happens first, the result can be
seen below. Since the head was the only
external reference to the list nodes, all of the original nodes are lost
and can no longer be accessed.

![Result of reversing the order of the two
steps](figures/wrong-order.png)
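
The effect of reversing the two steps can be checked directly. The sketch below places the `add` method next to a deliberately buggy variant, here called `add_wrong_order` (a hypothetical name used only for this demonstration). In Python, the reversed order not only loses the old nodes but also leaves the new node pointing at itself.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None


class UnorderedList(object):
    def __init__(self):
        self.head = None

    def add(self, item):
        temp = Node(item)
        temp.next = self.head  # step 1: attach the existing list
        self.head = temp       # step 2: move the head

    def add_wrong_order(self, item):
        # Hypothetical buggy variant, for illustration only.
        temp = Node(item)
        self.head = temp       # head moved first: the old nodes are now unreachable
        temp.next = self.head  # the new node ends up referring to itself


good = UnorderedList()
good.add(31)
good.add(77)
print(good.head.value, good.head.next.value)  # 77 31

bad = UnorderedList()
bad.add_wrong_order(31)
bad.add_wrong_order(77)
print(bad.head.value)             # 77
print(bad.head.next is bad.head)  # True: the node holding 31 has been lost
```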

The next methods that we will implement, `size`, `search`, and
`remove`, are all based on a technique known as **linked list
traversal**. Traversal refers to the process of systematically visiting
each node. To do this we use an external reference that starts at the
first node in the list. As we visit each node, we move the reference to
the next node by “traversing” the next reference.

To implement the `size` method, we need to traverse the linked list and
keep a count of the number of nodes visited.
Below we provide the Python code for counting the
number of nodes in the list. The external reference is called `current`
and is initialized to the head of the list in line 2. At the start of
the process we have not seen any nodes so the count is set to $$0$$. Lines
4–6 actually implement the traversal. As long as the current reference
has not seen the end of the list (`None`), we move current along to the
next node via the assignment statement in line 6. Every time current moves
to a new node, we add $$1$$ to `count`. Finally, `count` gets returned
after the iteration stops.

```
def size(self):
    current = self.head
    count = 0
    while current is not None:
        count = count + 1
        current = current.next

    return count
```
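
The traversal pattern used by `size` can also be factored into a reusable generator. The sketch below is one possible way to do this. The helper name `iter_values` is an assumption for this example and is not part of the list class developed in this chapter.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None


def iter_values(head):
    """Yield the value stored in each node, starting from head."""
    current = head
    while current is not None:
        yield current.value
        current = current.next


# Build the chain 54 -> 26 -> 93 by hand.
head = Node(54)
head.next = Node(26)
head.next.next = Node(93)

print(list(iter_values(head)))            # [54, 26, 93]
print(sum(1 for _ in iter_values(head)))  # 3, the same count size would return
```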

Searching for a value in a linked list implementation of an unordered
list also uses the traversal technique. As we visit each node in the
linked list we will ask whether the data stored there matches the item
we are looking for. In this case, however, we may not have to traverse
all the way to the end of the list. In fact, if we do get to the end of
the list, that means that the item we are looking for must not be
present. Also, if we do find the item, there is no need to continue.

Here is a possible implementation of `search`:

```
def search(self, item):
    current = self.head

    while current is not None:
        if current.value == item:
            return True
        current = current.next

    return False
```

The `remove` method requires two logical steps. First, we need to
traverse the list looking for the item we want to remove. Once we find
the item (recall that we assume it is present), we must remove it. The
first step is very similar to `search`. Starting with an external
reference set to the head of the list, we traverse the links until we
discover the item we are looking for. Since we assume that item is
present, we know that the iteration will stop before `current` gets to
`None`.

Once we have found the node to be removed, how do we remove it? One
possibility would be to replace the value of the item with some marker
that suggests that the item is no longer present. The problem with this
approach is the number of nodes will no longer match the number of
items. It would be much better to remove the item by removing the entire
node.

In order to remove the node containing the item, we need to modify the
link in the previous node so that it refers to the node that comes after
`current`. Unfortunately, there is no way to go backward in the linked
list. Since `current` refers to the node ahead of the node where we
would like to make the change, it is too late to make the necessary
modification.

The solution to this dilemma is to use two external references as we
traverse down the linked list. `current` will behave just as it did
before, marking the current location of the traverse. The new reference,
which we will call `previous`, will always travel one node behind
`current`. That way, when `current` stops at the node to be removed,
`previous` will be referring to the proper place in the linked list for
the modification.

Here is an implementation of a complete `remove` method:

```
def remove(self, item):
    current = self.head
    previous = None

    while True:
        if current.value == item:
            break
        previous, current = current, current.next

    if previous is None:
        self.head = current.next
    else:
        previous.next = current.next
```

First we assign `current` and `previous` to the head of the list and `None`
respectively. Then, on each iteration of our while loop, we break if `current`
refers to the node we wish to remove; otherwise we update `previous` and
`current` to `current` and `current.next` respectively. The tuple assignment
evaluates its right-hand side before rebinding either name, so both references
advance together. If the update were written as two separate statements, their
order would be crucial: `previous` must first be moved one node ahead to the
location of `current`, and only then can `current` be moved.

Here we illustrate the movement of `previous` and `current` as they progress
down the list looking for the node containing the value 17:

!["previous" and "current" move down the
list](figures/previous-current.png)

Once the searching step of the `remove` has been completed, we need to remove
the node from the linked list. If `previous` is `None`, we know that `current`
is in fact the head of the list, so we remove that node by updating the head
of the list to the subsequent node, thereby losing the reference to the
original head node:

![Removing the first node from the list](figures/remove-head.png)

In all other cases, we know that both `previous` and `current` are nodes in
the list, so we can remove `current` by setting the `next` attribute of
`previous` to the node _after_ current in the list:

![Removing an item from the middle of the
list](figures/remove-from-middle.png)
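
To see the pieces working together, the sketch below assembles the methods developed so far into a single class and exercises both removal cases. The `remove` loop is written in a slightly condensed form, and `to_list` is a hypothetical helper added only so the contents can be printed.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None


class UnorderedList(object):
    def __init__(self):
        self.head = None

    def add(self, item):
        temp = Node(item)
        temp.next = self.head
        self.head = temp

    def size(self):
        current, count = self.head, 0
        while current is not None:
            count += 1
            current = current.next
        return count

    def search(self, item):
        current = self.head
        while current is not None:
            if current.value == item:
                return True
            current = current.next
        return False

    def remove(self, item):
        # Assumes item is present, as stated in the text.
        current, previous = self.head, None
        while current.value != item:
            previous, current = current, current.next
        if previous is None:
            self.head = current.next      # removing the head node
        else:
            previous.next = current.next  # link around the removed node

    def to_list(self):
        # Hypothetical helper for display; not one of the ADT operations.
        values, current = [], self.head
        while current is not None:
            values.append(current.value)
            current = current.next
        return values


mylist = UnorderedList()
for item in (31, 77, 17, 93, 26, 54):
    mylist.add(item)

print(mylist.to_list())  # [54, 26, 93, 17, 77, 31]
mylist.remove(17)        # a node in the middle of the list
mylist.remove(54)        # the head node
print(mylist.to_list())  # [26, 93, 77, 31]
print(mylist.size(), mylist.search(17))  # 4 False
```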

The remaining methods `append`, `insert`, `index`, and `pop` are left as
exercises. Remember that each of these must take into account whether
the change is taking place at the head of the list or someplace else.
Also, `insert`, `index`, and `pop` require that we name the positions of
the list. We will assume that position names are integers starting with 0.

### Implementing an Ordered List

In order to implement the ordered list, we must remember that the relative
positions of the items are based on some underlying characteristic. An
ordered list of the integers 17, 26, 31, 54, 77, and 93 (the same items used
in the unordered example) can be represented by a linked structure as shown
below. Again, the node and link structure is ideal for representing the
relative positioning of the items.

![An ordered linked list](figures/ordered-list.png)

To implement the `OrderedList` class, we will use the same technique as
seen previously with unordered lists. We will subclass `UnorderedList` and
leave the `__init__` method intact as once again, an empty list will be
denoted by a `head` reference to `None`.

```
# from unordered_list import Node, UnorderedList


class OrderedList(UnorderedList):
    pass
```

As we consider the operations for the ordered list, we should note that
the `is_empty` and `size` methods can be implemented the same as with
unordered lists since they deal only with the number of nodes in the
list without regard to the actual item values. Likewise, the `remove`
method will work just fine since we still need to find the item and then
link around the node to remove it. The two remaining methods, `search`
and `add`, will require some modification.

The search of an unordered linked list required that we traverse the
nodes one at a time until we either find the item we are looking for or
run out of nodes (`None`). It turns out that the same approach would
actually work with the ordered list and in fact in the case where we
find the item it is exactly what we need. However, in the case where the
item is not in the list, we can take advantage of the ordering to stop
the search as soon as possible.

For example, the diagram below shows the ordered linked
list as a search is looking for the value 45. As we traverse, starting
at the head of the list, we first compare against 17. Since 17 is not
the item we are looking for, we move to the next node, in this case 26.
Again, this is not what we want, so we move on to 31 and then on to 54.
Now, at this point, something is different. Since 54 is not the item we
are looking for, our former strategy would be to move forward. However,
due to the fact that this is an ordered list, that will not be
necessary. Once the value in the node becomes greater than the item we
are searching for, the search can stop and return `False`. There is no
way the item could exist further out in the linked list.

![Searching an ordered linked
list](figures/ordered-list-search.png)

Below we provide an adaptation of the `search` method from our `UnorderedList`
class to take advantage of this optimization.

```
def search(self, item):
    current = self.head

    while current is not None:
        if current.value == item:
            return True
        if current.value > item:
            return False
        current = current.next

    return False
```
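
To make the benefit of stopping early visible, the sketch below counts how many nodes an ordered search visits. The functions `build_chain` and `search_counting` are hypothetical helpers written for this illustration. They operate on bare nodes rather than on the `OrderedList` class.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None


def build_chain(values):
    """Link the given values, in the order supplied, and return the head node."""
    head = None
    for value in reversed(values):
        node = Node(value)
        node.next = head
        head = node
    return head


def search_counting(head, item):
    """Ordered search that also reports how many nodes were visited."""
    current = head
    visited = 0
    while current is not None:
        visited += 1
        if current.value == item:
            return True, visited
        if current.value > item:
            return False, visited  # passed the spot where item would be
        current = current.next
    return False, visited


head = build_chain([17, 26, 31, 54, 77, 93])
print(search_counting(head, 45))   # (False, 4): stops at 54
print(search_counting(head, 93))   # (True, 6)
print(search_counting(head, 100))  # (False, 6): no early stop is possible
```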


The most significant method modification will take place in `add`.
Recall that for unordered lists, the `add` method could simply place a
new node at the head of the list. It was the easiest point of access.
Unfortunately, this will no longer work with ordered lists. It is now
necessary that we discover the specific place where a new item belongs
in the existing ordered list.

Assume we have the ordered list consisting of 17, 26, 54, 77, and 93 and
we want to add the value 31. The `add` method must decide that the new
item belongs between 26 and 54. Below we show
the setup that we need. As we explained earlier, we need to traverse the
linked list looking for the place where the new node will be added. We
know we have found that place when either we run out of nodes (`current`
becomes `None`) or the value of the current node becomes greater than
the item we wish to add. In our example, seeing the value 54 causes us
to stop.

![Adding an item to an ordered linked
list](figures/ordered-list-insert.png)

As we saw with unordered lists, it is necessary to have an additional
reference, again called `previous`, since `current` will not provide
access to the node that must be modified.

Once we have identified the position at which to add our new node, we
construct it and place it correctly, either as the new head of the list (if
`previous` is `None`) or between `previous` and `current` otherwise.

```
def add(self, item):
    current = self.head
    previous = None

    while current is not None:
        if current.value > item:
            break
        previous, current = current, current.next

    temp = Node(item)
    if previous is None:
        temp.next, self.head = self.head, temp
    else:
        temp.next, previous.next = current, temp
```
We leave the remaining methods as exercises. You should
carefully consider whether the unordered implementations will work given
that the list is now ordered.

Analysis of Linked Lists
------------------------

To analyze the complexity of the linked list operations, we need to
consider whether they require traversal. Consider a linked list that has
*n* nodes. The `is_empty` method is $$O(1)$$ since it requires one step to
check the head reference for `None`. `size`, on the other hand, will
always require $$n$$ steps since there is no way to know how many nodes
are in the linked list without traversing from head to end. Therefore,
`size` is $$O(n)$$. Adding an item to an unordered list will always be
$$O(1)$$ since we simply place the new node at the head of the linked list.
However, `search` and `remove`, as well as `add` for an ordered list,
all require the traversal process. Although on average they may need to
traverse only half of the nodes, these methods are all $$O(n)$$ since in
the worst case each will process every node in the list.
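
These growth rates can be observed by counting node visits instead of measuring time. In the sketch below, `search_steps` and `build` are hypothetical helpers written for this experiment. Because `add` places new items at the head, the first item added ends up at the end of the list and represents the worst case for a search.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None


class UnorderedList(object):
    def __init__(self):
        self.head = None

    def add(self, item):
        temp = Node(item)
        temp.next = self.head
        self.head = temp

    def search_steps(self, item):
        """Return how many nodes are visited while searching for item."""
        current, steps = self.head, 0
        while current is not None:
            steps += 1
            if current.value == item:
                break
            current = current.next
        return steps


def build(n):
    """Create a list holding 0 .. n-1; the value 0 ends up in the last node."""
    lst = UnorderedList()
    for i in range(n):
        lst.add(i)
    return lst


for n in (10, 100, 1000):
    lst = build(n)
    # Best case (item at the head) versus worst case (item in the last node).
    print(n, lst.search_steps(n - 1), lst.search_steps(0))
# 10 1 10
# 100 1 100
# 1000 1 1000
```

The worst-case count grows in step with the number of nodes, which is the behaviour that $$O(n)$$ describes.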

You may also have noticed that the performance of this implementation
differs from the actual performance given earlier for Python lists. This
suggests that linked lists are not the way Python lists are implemented.
The actual implementation of a Python list is based on the notion of an
array. We discuss this in depth later.
