Skip to content

Concepts: improving SLL

Eugene Lazutkin edited this page Jun 10, 2024 · 5 revisions

A classic null-terminated singly linked list looks something like that:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> null
Loading

It takes O(n) time to traverse the list and calculate the length of the list. Saving the list's length in a variable solves this problem, but introduces a new one: a linear time traversal when we remove a section of a list or add it to a list. More on that later.

Operations on the front of the list

An efficient adding of a node to the list is possible at the head only:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  x(("`**x**`"))
  head --> x --> a --> b --> c --> d --> null
Loading

This is O(1) time.

Removing a node from the front is O(1) time too:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  x(("`**x**`"))
  head --> a --> b --> c --> d --> null
  removed --> x
Loading

Operations on the back of the list

Adding to the end of the list will take O(n) time to find the last node and O(1) time to add the new node:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  x(("`**x**`"))
  head --> a --> b --> c --> d --> x --> null
Loading

It can be solved with a single pointer to the last node:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> null
  last --> d
Loading

Now adding a new node to the back will take Q(1) time — the same as to the front:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  x(("`**x**`"))
  head --> a --> b --> c --> d --> x --> null
  last --> x
Loading

Unfortunately it doesn't help to pop a node from the back — we still need to traverse the whole list for that.

Operations on the middle of the list

For example, we have a pointer to a node in the middle of the list:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> null
  p --> b
Loading

We can add a new node in the middle of the list after the pointer p:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  x(("`**x**`"))
  head --> a --> b --> x --> c --> d --> null
  p --> b
Loading

While doing that we can efficiently maintain the last pointer:

  • if p === last, the new node becomes the last
  • otherwise last is unchanged

Or we can remove the next node after the pointer p:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> d --> null
  p --> b
  removed --> c
Loading

While doing that we can efficiently maintain the last pointer:

  • if p.next === last, last becomes p
  • otherwise last is unchanged

Given all that we can manipulate a node if we know its previous node. With that we can remove the node, add a new node before it, and add a new node after it.

One of the problems with this approach is that there is no previous node for the front node. It can be solved with circular lists.

Extracting a range of nodes

This is a generalization of removing a node in the middle of the list. The range is defined by a previous node of the first node of the range (prevFrom) and the last node of the range (to):

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> null
  prevNode --> a
  to --> c
Loading

Extracting them will take O(1) time:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> d --> null
  last --> d
  extractedHead --> b --> c --> null
  extractedTail --> c
Loading

This operation can maintain the last pointer in the constant time.

Circular lists

The last node's next pointer points to the first node:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> a
Loading

With last it looks like this:

flowchart LR
  a((a))
  b((b))
  c((c))
  d((d))
  head --> a --> b --> c --> d --> a
  last --> d
Loading

This way last is the previous node for the front node. We have all possible operations described above for all nodes.

Hosted vs. headless lists

All examples above show headless lists: all nodes are functionally equal, a list is represented as an external object with two pointers (head and last).

Headless lists cannot be empty. Having no nodes should be handled as a special case.

Introducing a head node simplifies things considerably. The head node serves as a host for other nodes.

The empty list:

flowchart LR
  list((list))
  list --> list
Loading

Added a new node:

flowchart LR
  list((list))
  x(("`**x**`"))
  list --> x --> list
Loading

It can be seen that the head node is always the first and the last node in the list. Logically, it serves as the previous node for the first and the next node for the last node. It can be used to keep some bookkeeping information about the list, like the last pointer.

General operations on circular singly linked lists

pop()

Having the previous node allows us to define the necessary operations:

const pop = prev => {
  const node = prev.next,
    next = node.next;

  // exclude the node
  prev.next = next;

  // circle the node
  node.next = node;

  return {extracted: {prevFrom: node, to: node}, rest: next === node ? null : next};
}

This function extracts the node and returns the rest of the list. It takes O(1) time.

extract()

This operation is generalization of pop() above. It takes O(1) time:

const extract = ({prevFrom, to = prevFrom.next}) => {
  const node = prevFrom.next,
    next = to.next;

  // exclude the range
  prevFrom.next = to.next;

  // circle the range
  to.next = node;

  return {extracted: {prevFrom: to, to}, rest: next === node ? null : next};
};

We can see that pop(prev) === extract({prevFrom: prev}).

splice()

export const splice = (target, {prevFrom, to = prevFrom.next}) => {
  // form the combined head
  const next = target.next;
  target.next = prevFrom.next;

  // finish the combined  tail
  prevFrom.next = to.next;
  to.next = next;

  return target;
};

This function removes a range of nodes from its list and adds it after the target node. The range is defined by a previous node of the first node and the last node (inclusive). It takes O(1) time.

Traversal

Headless circular lists:

let node = head;
do {
  // do something with the node
  node = node.next;
} while (node !== head);

Hosted circular lists:

for (let node = list; node !== list; node = node.next) {
  // do something with the node
}

An inclusive range (from fromNode to toNode):

for (let node = fromNode;; node = node.next) {
  // do something with the node
  if (node === toNode) break;
}

Complexity considerations

Note that if our list maintains a number of elements, splice() immediately becomes O(k) because we have to go and count all nodes in the range in order to update the list's count. Obviously, we don't want to do it in the off-chance that the list's size is required.

In fact, we rarely need to know the size, especially when we can trivially check the list for emptiness in O(1) time.

The same goes for another "popular feature" of some list implementations: keeping a reference to the list in the node. While it can be useful, it requires updating the list reference for every node while splicing, which makes it immediately become O(k).