In [14]:
#include <iostream>
#include <iomanip>
#include <exception>
#include <stdexcept>
using namespace std;
std::cout << std::boolalpha; 

# Lists

## Outline

- The `Node` and `List` classes
- Accessors and mutators
- The implementation of various member functions
- Stepping through a linked list
- Defining the copy constructor and assignment operator
- Defining move constructors and move assignment operators
- Discussing efficiencies

## Linked List

- A **linked list** is a data structure consisting of a sequence of nodes where each object is stored in a **node**

- As well as storing data, the node must also contain a **reference/pointer** to the node containing the **next item** of data

![](../img/linked-list.png)

## Linked List (cont.)

- The nodes in a linked list are created dynamically

- Thus, because **new** returns a pointer, the logical manner in which to track a linked list is through a pointer

- A `Node` class must store the **data** and a **reference** to the next node (also a pointer)

![](../img/list-node.png)

In [2]:
class Node {
public:
    Node( int = 0, Node * = nullptr );

    int value() const;
    Node* next() const;

private:
    int node_value;
    Node *next_node;
};

## Constructor

- The constructor assigns the two member variables based on the arguments

In [3]:
Node::Node( int e, Node *n ): node_value( e ), next_node( n ) {
    // empty constructor
}

## Accessors

- The two member functions are accessors which simply return the `node_value` and the `next_node` member variables, respectively
    - Member functions that do not change the object acted upon are variously called *accessors*, *readonly functions*, *inspectors*, and, when it involves simply returning a member variable, *getters*

In [4]:
int Node::value() const {
  return node_value;
}

In [5]:
Node* Node::next() const {
    return next_node;
}

## Accessors (cont.)

- In C++, a member function cannot have the same name as a member variable

|                     | Member Variables | Member Functions |
|--------------------:|:----------------:|:----------------:|
| Vary capitalization | next_node        | Next_node() or NextNode() |
| Prefix with "get"   | next_node        | get_next_node() / getNextNode() |
| Use an underscore   | next_node_       | next_node() |
| Different names     | next_node        | next() |

- Always use the naming convention and coding styles used by your employer - even if you disagree with them
  - Consistency aids in maintenance

## Linked List Class

- Because each node in a linked list refers to the next, the linked list class need only link to the first node in the list

- The linked list class requires only one member variable: a pointer to a node

```cpp
class List {
    public:
        class Node {...};

    private:
        Node *list_head;
    // ...
};
```

## Structure

- To begin, let us look at the internal representation of a linked list

- Suppose we want a linked list to store the values in this order

    42  95  70  81


## Structure (cont.)

- A linked list uses *linked allocation*, and therefore each node may appear anywhere in memory

- Also the memory required for each node equals the memory required by the member variables
    - 4 bytes for the linked list (a pointer)
    - 8 bytes for each node (an int and a pointer)
        - We are assuming a 32-bit machine
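
The byte counts above assume a 32-bit machine; on other platforms the sizes differ and the compiler may add padding. A minimal sketch for inspecting the actual sizes with `sizeof`, where `NodeLayout` and `report_sizes` are hypothetical names and `NodeLayout` simply mirrors the member variables of `Node`:

```cpp
#include <iostream>

// Hypothetical stand-in with the same member variables as Node
struct NodeLayout {
    int node_value;
    NodeLayout *next_node;
};

// Prints the sizes for whatever platform this is compiled on
void report_sizes() {
    std::cout << "int:     " << sizeof( int ) << " bytes\n";
    std::cout << "pointer: " << sizeof( NodeLayout * ) << " bytes\n";
    std::cout << "node:    " << sizeof( NodeLayout ) << " bytes\n";
}
```

On a typical 64-bit platform the pointer is 8 bytes, and the node is usually larger than the sum of its members because of alignment padding.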


## Structure (cont.)

- Such a list could occupy memory as follows:
    - The `next_node` pointers store the addresses of the next node in the list

![](../img/linked-list-example.png)

## Structure (cont.)

- Because the addresses are arbitrary, we can remove that information:

![](../img/linked-list-example2.png)

## Structure (cont.)

- We will clean up the representation as follows:

![](../img/linked-list-example3.png)
	
- We do not specify the addresses because they are arbitrary and:
    - The contents of the circle is the value
    - The `next_node` pointer is represented by an arrow
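
To make the picture concrete, here is a minimal sketch that builds the chain 42, 95, 70, 81 by hand using only the `Node` class from this section. The helpers `build_example`, `values`, and `destroy` are hypothetical names introduced for illustration; they are not part of the `List` class.

```cpp
#include <vector>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
};

// Builds the chain 42 -> 95 -> 70 -> 81, starting from the last node
Node *build_example() {
    Node *n81 = new Node( 81, nullptr );
    Node *n70 = new Node( 70, n81 );
    Node *n95 = new Node( 95, n70 );
    return new Node( 42, n95 );
}

// Follows the arrows (next pointers) and collects the values in order
std::vector<int> values( Node *head ) {
    std::vector<int> result;
    for ( Node *ptr = head; ptr != nullptr; ptr = ptr->next() ) {
        result.push_back( ptr->value() );
    }
    return result;
}

// Frees every node of a hand-built chain
void destroy( Node *head ) {
    while ( head != nullptr ) {
        Node *next = head->next();
        delete head;
        head = next;
    }
}
```

Each `new` may place its node anywhere in memory; only the `next_node` pointers determine the order.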

## Operations

- First, we want to create a linked list

- We also want to be able to manage the stored values in the linked list
    - insert into,
    - access, and
    - erase from

## Operations (cont.)

- We can do them with the following operations:
    - Adding, retrieving, or removing the value at the front of the linked list
```cpp
void push_front( int );
int front() const;
int pop_front();
```
    - We may also want to access the head of the linked list
```cpp
Node *begin() const;
```

- Member functions that may change the object acted upon are variously called *mutators*, *modifiers*, and, when it involves changing a single member variable, *setters*

## Operations (cont.)

- All these operations relate to the first node of the linked list

- We may want to perform operations on an arbitrary node of the linked list, for example:

- Find the number of instances of an integer in the list:
```cpp
int count( int ) const;
```

- Remove all instances of an integer from the list:
```cpp
int erase( int );
```

## Capacity

- Additionally, we may wish to check the state: 
    - Is the linked list empty?
```cpp
bool empty() const;
```
    - How many objects are in the list?
```cpp
int size() const;
```

- The list is empty when the `list_head` pointer is set to `nullptr`

Consider this simple (but **incomplete**) linked list class:

In [6]:
class List {
    public:
        //class Node {...}; // we defined it outside of the List class scope
        List();

        // Accessors
        bool empty() const;
        int size() const;
        int front() const;
        Node *begin() const;
        Node *end() const;
        int count( int ) const;

        // Mutators
        void push_front( int );
        int pop_front();
        int erase( int );

    private:
        Node *list_head;
};

## Constructor

- The constructor initializes the linked list
    - We do not count how many objects are in this list, thus:
        - we must rely on the last pointer in the linked list to point to a special value
        - in C++, that standard value is `nullptr`

- Thus, in the constructor, we assign `list_head` the value `nullptr`

- We will always ensure that when a linked list is empty, the list head is assigned `nullptr`

In [7]:
List::List(): list_head( nullptr ) { } // empty constructor

## Allocation

The constructor is called whenever an object is created, either:

- Statically
    - The following statement defines `ls` to be a linked list and the compiler deals with memory allocation
```cpp
List ls;
```

- Dynamically
	- The following statement requests sufficient memory from the OS to store an instance of the class
```cpp
List *pls = new List();
```

- In both cases, the memory is allocated and then the constructor is called

## Static Allocation

In [8]:
int f() {
    List ls;   // ls is declared as a local variable on the stack

    ls.push_front( 3 );
    cout << ls.front() << endl;

    // The return value is evaluated
    // The compiler then calls the destructor for local variables
    // The memory allocated for 'ls' is deallocated

    return 0;
}

## Dynamic Allocation

In [9]:
List *f( int n ) {
    List *pls = new List();  // pls is allocated memory by the OS

    pls->push_front( n );
    cout << pls->front() << endl;

    // The address of the linked list is the return value
    // After this, the 4 bytes for the pointer 'pls' are deallocated
    // The memory allocated for the linked list is still there

    return pls;
}

## `empty()`

- Starting with the easier member functions:
```cpp
bool List::empty() const {
    if ( list_head == nullptr ) {
        return true;
    } else {
        return false;
    }
}   
```

- Better yet:

In [10]:
bool List::empty() const {
    return ( list_head == nullptr );
}

## `begin()`

- The member function `Node *begin() const` is easy enough to implement:

In [11]:
Node *List::begin() const {
    return list_head;
}

- This will always work: if the list is empty, it will return `nullptr`

## `end()`

- The member function `Node *end() const` returns whatever the last node in the linked list points to

In [12]:
// In this case, nullptr
Node *List::end() const {
    return nullptr;
}

## `front()`

- To get the first value in the linked list, we must access the node to which the `list_head` is pointing

- Because we have a pointer, we must use the `->` operator to call the member function:
```cpp
int List::front() const {
    return begin()->value();
}
```

- The member function `int front() const` requires some additional consideration, however:
    - What if the list is empty?

- If we tried to access a member function of a pointer set to `nullptr`, we would access restricted memory
    - The operating system would terminate the running program
    - Instead, we can use an exception handling mechanism where we throw an exception

In [16]:
int List::front() const {
    if ( empty() ) {
        throw underflow_error("List is empty");
    }
    return begin()->value();
}
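
A caller can recover from the exception instead of letting the program terminate. The following is a minimal sketch under stated assumptions: the `List` shown has only the members this example needs, its destructor and deleted copy operations are assumptions added so the sketch does not leak or double-free, and `describe_front` is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
};

// Minimal list with only the members this example needs
class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;             // copying is out of scope here
    List &operator=( const List & ) = delete;
    ~List() {                                  // assumed cleanup of all nodes
        while ( list_head != nullptr ) {
            Node *ptr = list_head;
            list_head = list_head->next();
            delete ptr;
        }
    }
    bool empty() const { return list_head == nullptr; }
    void push_front( int n ) { list_head = new Node( n, list_head ); }
    int front() const {
        if ( empty() ) {
            throw std::underflow_error( "List is empty" );
        }
        return list_head->value();
    }
private:
    Node *list_head;
};

// Hypothetical helper: reports the front value, or the error message
// carried by the exception when the list is empty
std::string describe_front( const List &ls ) {
    try {
        return "front = " + std::to_string( ls.front() );
    } catch ( const std::underflow_error &e ) {
        return std::string( "error: " ) + e.what();
    }
}
```

For an empty list, `describe_front` returns `"error: List is empty"`; after `push_front( 3 )` it returns `"front = 3"`.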

## Software Engineering Tip

- Why is calling `empty()` better than checking the pointer directly, as in the following?

```cpp
int List::front() const {
    if ( list_head == nullptr ) {
        throw underflow_error("List is empty");
    }

    return list_head->node_value;
}
```

- Two benefits:
    - More readable
    - If the implementation of `empty()` changes, `front()` needs no changes

## Inserting at the Head

- Steps required for inserting a new element at the beginning of the list
    - Allocate a new node
    - Insert new element
    - Have new node point to old head
    - Update head to point to new node
- Corresponding mutator function is `void push_front(int)`

## `push_front`

Let us add a value in front of the list

- If it is empty, we start with:
![](../img/llist01.png)

- and, if we try to add 81, we should end up with:
![](../img/llist02.png)

- To visualize what we must do:
    - We must create a new node which:
        - stores the value 81, and
        - is pointing to `nullptr`
    - We must then assign its address to `list_head`

- We can do this as follows:
```cpp
list_head = new Node( 81, nullptr );
```

- We could also use the default value...

- Suppose, however, we already have a non-empty list
![](../img/llist02.png)

- Adding 70, we want:
![](../img/llist03.png)

- To achieve this
    - We must create a new node which:
        - stores the value 70, and
        - is pointing to the current list head
    - We must then assign its address to `list_head`

- We can do this as follows:
```cpp
list_head = new Node( 70, list_head );
```

In [None]:
void List::push_front( int n ) {
    if ( empty() ) {
        list_head = new Node( n, nullptr );
    } else {
        list_head = new Node( n, begin() );
    }
}

- We could, however, note that when the list is empty, `list_head == nullptr`, thus we could shorten this to:

```cpp
void List::push_front( int n ) {
    list_head = new Node( n, list_head );
}
```

- Are we allowed to do this?

In [None]:
void List::push_front( int n ) {
    list_head = new Node( n, begin() );
}

- **Yes**:  the right-hand side of an assignment is evaluated first
    - The original value of `list_head` is accessed first before the function call is made
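
As a usage sketch, pushing 81, 70, 95, and 42 in that order produces the list 42, 95, 70, 81, because each new node becomes the head. The `List` below is a pared-down version with only the members needed here; the destructor, the deleted copy operations, and the helper `to_vector` are assumptions added for illustration.

```cpp
#include <vector>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
};

class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;             // copying is out of scope here
    List &operator=( const List & ) = delete;
    ~List() {                                  // assumed cleanup of all nodes
        while ( list_head != nullptr ) {
            Node *ptr = list_head;
            list_head = list_head->next();
            delete ptr;
        }
    }
    Node *begin() const { return list_head; }
    Node *end() const { return nullptr; }

    // The old head (possibly nullptr) is read before list_head is overwritten
    void push_front( int n ) { list_head = new Node( n, begin() ); }
private:
    Node *list_head;
};

// Hypothetical helper: copies the stored values, front to back
std::vector<int> to_vector( const List &ls ) {
    std::vector<int> result;
    for ( Node *ptr = ls.begin(); ptr != ls.end(); ptr = ptr->next() ) {
        result.push_back( ptr->value() );
    }
    return result;
}
```

The same one-line `push_front` handles both the empty and the non-empty case, since an empty list simply passes `nullptr` as the next pointer.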

## Question

- Does this work?

```cpp
void List::push_front( int n ) {
    Node new_node( n, begin() );
    list_head = &new_node;
}
```

- Why or why not?  What happens to `new_node`?

- How does this differ from
```cpp
void List::push_front( int n ) {
    Node *new_node = new Node( n, begin() );
    list_head = new_node;
}
```

## Removing at the Head

- Erasing the element from the front of the list requires:

    1. Update head to point to next node in the list
    2. Free memory of the former first node

## `pop_front`

- Erasing from the front of a linked list is even easier:
    - We assign the next pointer of the first node to the list head

- Graphically, given:
![](../img/llist04.png)

- we want
![](../img/llist05.png)

- Easy enough:
```cpp
int List::pop_front() {
    int e = front();
    list_head = begin()->next();
    return e;
}
```

- Unfortunately, we have some problems:
    - The list may be empty
    - We still have the memory allocated for the node containing 70

- Does this work?	

```cpp
int List::pop_front() {
    if ( empty() ) {
        throw underflow_error("List is empty");
    }

    int e = front();
    delete begin();               /// ????
    list_head = begin()->next();  /// ????
    return e;
}
```

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

<div><span style="color:red"> int e = front(); </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist06.png"/>

delete begin();

list_head = begin()->next();

return e;
</div>
}

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

int e = front();    
    
<div><span style="color:red"> delete begin(); </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist07.png"/>

list_head = begin()->next();

return e;
</div>
}

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

int e = front();    

delete begin();
    
list_head = begin()->next();
    
<div><span style="color:red"> return e; </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist08.png"/>

</div>
}

## Problem

- The problem is that we are accessing a node which we have just deleted, which is undefined behavior

- Unfortunately, this will appear to work almost all of the time:
    - The running program (process) may still own the memory, and its contents may not yet have been overwritten
- Once in a while it will fail ...
    - ... and it will be almost impossible to debug
        
![](https://imgs.xkcd.com/comics/forgetting.png)

## Solution

- The correct implementation assigns a temporary pointer to point to the node being deleted:

In [21]:
int List::pop_front() {
    if ( empty() ) {
        throw underflow_error("List is empty");
    }

    int e = front();
    Node *ptr = list_head;
    list_head = list_head->next();
    delete ptr;
    return e;
}
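
A short usage sketch of the corrected `pop_front`: values come back in the reverse of the order in which they were pushed, and popping from an empty list throws. The `List` here contains only the members needed for the example; the destructor, the deleted copy operations, and the helper `drain` are assumptions for illustration.

```cpp
#include <stdexcept>
#include <vector>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
};

class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;             // copying is out of scope here
    List &operator=( const List & ) = delete;
    ~List() { while ( !empty() ) { pop_front(); } }   // assumed cleanup

    bool empty() const { return list_head == nullptr; }
    void push_front( int n ) { list_head = new Node( n, list_head ); }

    int pop_front() {
        if ( empty() ) {
            throw std::underflow_error( "List is empty" );
        }
        int e = list_head->value();
        Node *ptr = list_head;           // remember the node to delete
        list_head = list_head->next();   // advance the head first
        delete ptr;                      // only then free the old node
        return e;
    }
private:
    Node *list_head;
};

// Hypothetical helper: pops every value and returns them in pop order
std::vector<int> drain( List &ls ) {
    std::vector<int> popped;
    while ( !ls.empty() ) {
        popped.push_back( ls.pop_front() );
    }
    return popped;
}
```

After pushing 1, 2, 3, `drain` returns 3, 2, 1 and leaves the list empty.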

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}


<div><span style="color:red"> int e = front(); </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist09.png"/>

Node *ptr = begin();   
    
list_head = begin()->next();
    
delete ptr;

return e;
    
</div>
}

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

int e = front();
    
<div><span style="color:red"> Node *ptr = begin(); </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist10.png"/>
    
list_head = begin()->next();
    
delete ptr;

return e;
    
</div>
}

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

int e = front();
    
Node *ptr = begin();
    
<div><span style="color:red"> list_head = begin()->next(); </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist11.png"/>    
    
delete ptr;

return e;
    
</div>
}

int List::pop_front() {
<div style="margin-left: 1em">
    
if ( empty() ) {
     throw underflow_error("List is empty");
}

int e = front();
    
Node *ptr = begin();
    
list_head = begin()->next();
    
<div><span style="color:red"> delete ptr; </span><img style="vertical-align: center;  max-width: auto;  display: inline-block;" src="../img/llist12.png"/>        

return e;
    
</div>
}

## Stepping through a Linked List

- The next step is to look at member functions which potentially require us to step through the entire list:

```cpp
int size() const;
int count( int ) const;
int erase( int );
```

- The first counts all the nodes, the second counts the number of instances of an integer, and the last removes the nodes containing that integer

- The process of stepping through a linked list can be thought of as being analogous to a for-loop:
    - We initialize a temporary pointer with the list head
    - We continue iterating until the pointer equals `nullptr`
    - With each step, we set the pointer to point to the next object

Thus, we have:

```cpp
for ( Node *ptr = begin(); ptr != end(); ptr = ptr->next() ) {
   // do something
   // use ptr->fn() to call member functions
   // use ptr->var to assign/access member variables
}
```  

## Initialization

- With the initialization and first iteration of the loop, we have:

![](../img/llist13.png)

- `ptr != nullptr` and thus we evaluate the body of the loop and then set ptr to the next pointer of the node it is pointing to

## Stepping

- `ptr != nullptr` and thus we evaluate the loop and increment the pointer

![](../img/llist14.png)

- In the loop, we can access the value being pointed to by using `ptr->value()`

## Stepping

- `ptr != nullptr` and thus we evaluate the loop and increment the pointer

![](../img/llist15.png)

- Also, in the loop, we can access the next node in the list by using `ptr->next()`

## Stepping

- `ptr != nullptr` and thus we evaluate the loop and increment the pointer

![](../img/llist16.png)

- This last increment causes `ptr == nullptr`

## Reached the End

- Here, we check and find `ptr != nullptr` is false, and thus we exit the loop

![](../img/llist17.png)

- Because the variable `ptr` was declared inside the loop, we can no longer access it

## `count`

- To implement `int count(int) const`, we simply check if the argument matches the value with each step
    - Each time we find a match, we increment the count
    - When the loop is finished, we return the count
    - The `size` function is a simplification of `count`

    

In [23]:
int List::count( int n ) const {
    int node_count = 0;

    for ( Node *ptr = begin(); ptr != end(); ptr = ptr->next() ) {
        if ( ptr->value() == n ) {
            ++node_count;
        }
    }

    return node_count;
}
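
The remark that `size` is a simplification of `count` can be sketched as follows: the loop is identical, but every node is counted rather than only the matching ones. This is one possible implementation under the declarations used in this section, not necessarily the intended one; the destructor and deleted copy operations are assumptions added to keep the sketch self-contained.

```cpp
class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
};

class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;             // copying is out of scope here
    List &operator=( const List & ) = delete;
    ~List() {                                  // assumed cleanup of all nodes
        while ( list_head != nullptr ) {
            Node *ptr = list_head;
            list_head = list_head->next();
            delete ptr;
        }
    }
    Node *begin() const { return list_head; }
    Node *end() const { return nullptr; }
    void push_front( int n ) { list_head = new Node( n, list_head ); }
    int size() const;
    int count( int ) const;
private:
    Node *list_head;
};

// Same traversal as count, without the comparison
int List::size() const {
    int node_count = 0;
    for ( Node *ptr = begin(); ptr != end(); ptr = ptr->next() ) {
        ++node_count;
    }
    return node_count;
}

int List::count( int n ) const {
    int node_count = 0;
    for ( Node *ptr = begin(); ptr != end(); ptr = ptr->next() ) {
        if ( ptr->value() == n ) {
            ++node_count;
        }
    }
    return node_count;
}
```

Both functions visit every node once, so each runs in time proportional to the length of the list.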

## `erase`

- To remove an arbitrary value, i.e., to implement `int erase( int )`, we must update the previous node
	
- For example, given

![](../img/linked-list-example3.png)

- if we delete **70**, we want to end up with

![](../img/linked-list-example4.png)

## Software Engineering Tip

- The `erase` function must modify the member variables of the node prior to the node being removed

- Thus, it must have access to the member variable `next_node`

- We could supply the member function
```cpp
void set_next( Node * );
```
However, this would be public, so any code, not just `List`, could change a node's next pointer

- Possible solutions:
    - Friends
    - Nested classes

## Friends

- In C++, you could explicitly break encapsulation by declaring the class `List` to be a **friend** of the class `Node`:

```cpp
class A {
    private:
        int class_size;
    // ... declaration ...
    friend class B;
};
```

- Now, any member function of class `B` has access to all private member variables of class `A`

- For example, if `List` were declared a **friend** of `Node`, `List::erase` could modify nodes directly:

```cpp
int List::erase( int n ) {
    int node_count = 0;
    // ... 
    for ( Node *ptr = begin(); ptr != end(); ptr = ptr->next() ) {
        // ...
        if ( some condition ) {
            ptr->next_node = ptr->next()->next(); // access private `next_node` of the Node class 
            // ...
            ++node_count;
        }
    }
    return node_count;
}
```
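
The outline above leaves several steps elided. One way to fill them in is sketched below, assuming `List` is a friend of `Node`: matching nodes at the head are handled first, then the loop always inspects the node *after* `ptr`, so that `ptr` is the previous node whose `next_node` must be updated. This is an illustrative completion rather than the definitive implementation; the destructor, the deleted copy operations, and the helper `to_vector` are assumptions.

```cpp
#include <vector>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) {}
    int value() const { return node_value; }
    Node *next() const { return next_node; }
private:
    int node_value;
    Node *next_node;
    friend class List;   // lets List modify next_node directly
};

class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;             // copying is out of scope here
    List &operator=( const List & ) = delete;
    ~List() {                                  // assumed cleanup of all nodes
        while ( list_head != nullptr ) {
            Node *ptr = list_head;
            list_head = list_head->next_node;
            delete ptr;
        }
    }
    Node *begin() const { return list_head; }
    void push_front( int n ) { list_head = new Node( n, list_head ); }
    int erase( int );
private:
    Node *list_head;
};

int List::erase( int n ) {
    int node_count = 0;

    // Matching nodes at the front: list_head itself must be updated
    while ( list_head != nullptr && list_head->node_value == n ) {
        Node *doomed = list_head;
        list_head = list_head->next_node;
        delete doomed;
        ++node_count;
    }

    // ptr points to a node that stays; examine the node after it
    for ( Node *ptr = list_head; ptr != nullptr && ptr->next_node != nullptr; ) {
        if ( ptr->next_node->node_value == n ) {
            Node *doomed = ptr->next_node;
            ptr->next_node = doomed->next_node;   // bypass the node
            delete doomed;                        // then free it
            ++node_count;
        } else {
            ptr = ptr->next_node;                 // advance only if nothing was removed
        }
    }
    return node_count;
}

// Hypothetical helper: copies the stored values, front to back
std::vector<int> to_vector( const List &ls ) {
    std::vector<int> result;
    for ( Node *ptr = ls.begin(); ptr != nullptr; ptr = ptr->next() ) {
        result.push_back( ptr->value() );
    }
    return result;
}
```

Note that `ptr` does not advance after a removal, so runs of consecutive matching nodes are all erased.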

## Nested Classes

- In C++, you can nest one class inside another, which is what we do: 

```cpp
class Outer {
    private:
        class Nested {
            private:
                int node_value;
            public:
                int get() const;
                void set( int );
        };
 
        Nested stored;
 
    public:
        int get() const;
        void set( int );
};
```

The function definitions are as one would expect:

```cpp
int Outer::Nested::get() const {
    return node_value;
}
 
void Outer::Nested::set( int n ) {
    node_value = n;
}
```

```cpp
int Outer::get() const {
    return stored.get();
}
 
void Outer::set( int n ) {
    // Not allowed, as node_value is private  
    // stored.node_value = n; 
    stored.set( n );
}
```

## Destructor

- We dynamically allocated memory each time we added a new **int** into this list

- Suppose we delete a list before we remove everything from it
    - This would leave the memory allocated with no reference to it
    
![](../img/linked-list-delete.png)

- The destructor has to delete any memory which had been allocated but has not yet been deallocated

- This is straightforward enough:

```cpp
while ( !empty() ) {
    pop_front();
}
```

- Is this efficient?
    - It runs in $O(n)$ time, where $n$ is the number of objects in the linked list
    - Given that `delete` is far slower than most other instructions (it calls into the memory allocator, which may in turn call the OS), the extra overhead of the loop and the function calls is negligible
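
Putting the loop into a destructor could look like the following sketch. The static counter `Node::live` and the helper `nodes_left_after_scope` are hypothetical instrumentation added only to check that every node is freed; the copy operations are deleted here because a default copy would share nodes between two lists and the destructor would then free them twice.

```cpp
#include <stdexcept>

class Node {
public:
    Node( int e = 0, Node *n = nullptr ): node_value( e ), next_node( n ) { ++live; }
    ~Node() { --live; }
    int value() const { return node_value; }
    Node *next() const { return next_node; }
    static int live;   // hypothetical: number of nodes currently allocated
private:
    int node_value;
    Node *next_node;
};

int Node::live = 0;

class List {
public:
    List(): list_head( nullptr ) {}
    List( const List & ) = delete;
    List &operator=( const List & ) = delete;

    // Frees every node still in the list
    ~List() {
        while ( !empty() ) {
            pop_front();
        }
    }

    bool empty() const { return list_head == nullptr; }
    void push_front( int n ) { list_head = new Node( n, list_head ); }

    int pop_front() {
        if ( empty() ) {
            throw std::underflow_error( "List is empty" );
        }
        int e = list_head->value();
        Node *ptr = list_head;
        list_head = list_head->next();
        delete ptr;
        return e;
    }
private:
    Node *list_head;
};

// Creates a list in an inner scope and reports how many nodes survive it
int nodes_left_after_scope() {
    {
        List ls;
        ls.push_front( 81 );
        ls.push_front( 70 );
        ls.push_front( 95 );
    }   // ls goes out of scope here, so ~List() runs
    return Node::live;
}
```

With the destructor in place, `nodes_left_after_scope()` returns 0; without it, the three nodes would remain allocated with nothing referring to them.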