In [9]:
# Linked lists
class Link:
    empty = ()
    
    def __init__(self, first, rest = empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest
        
    def __repr__(self):
        if self.rest:
            rest_str = ', ' + repr(self.rest)
        else:
            rest_str = ''
        return 'Link({0}{1})'.format(self.first, rest_str)

# Sets as Ordered Linked Lists

In this section, we'll consider representing sets as ordered linked lists.

## Sets as Ordered Sequences

#### Proposal 2
A set is represented by a linked list with unique elements that is **ordered from least to greatest**.

The main idea here is not ordering for its own sake. The goal is to improve the order of growth of various set operations, making sets more efficient while still maintaining their general properties.

|Parts of the program that... | Assume that sets are...| Using...|
| ---- | --- | --- |
| Use sets to contain values | Unordered collections | `empty`, `contains`, `adjoin`, `intersect`, `union` |
| Implement set operations | Ordered linked lists | `first`, `rest`, `<`, `>`, `==` |

Note that between the first row and the second row of the table above, there's an abstraction barrier that separates:
1. Parts of the program that work with sets, and
2. Parts that implement their representation

Different parts of a program may make different assumptions about the same data.

## Searching an Ordered List

The advantage of ordering the elements in a list is that we know something about the rest of the list just by looking at its first element. 

Let's say we have the following set,

In [None]:
>>> s = Link(1, Link(3, Link(5)))

Here we represent it as a linked list,

<img src = 'represent.jpg' width = 800/>

### Contains

Now if we want to do the `contains` operation,

In [None]:
>>> contains(s, 1)
True

This is easy: we just look at the `first` element, which is `1`. However,

In [None]:
>>> contains(s, 2)

The call above follows a different process, but fortunately it doesn't need to look through the entire list.

1. Python looks at the first element, `1`. It's less than `2`, so Python moves on to the next element.
2. The next element is `3`, which is greater than `2`.
    * Because the list is ordered, every later element is also greater than `2`, so `2` cannot be in the rest of the list either.
    
Thus, Python returns `False`.

From this example, we can see that `contains` on a sorted representation of a set can be faster because it can tell early that an element is not in the list. If the set were unordered, Python would have to go through the entire list before concluding that an element is missing.

The order of growth of this `contains` is still $\Theta(n)$ in the average case, assuming `v` is either not in the list or at an arbitrary location.

However, there is a practical efficiency gain. If we keep searching for random values `v`, on average Python only needs to look through about half of the list. This constant factor doesn't show up in $\Theta$ notation, but it can matter in practice.
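
To make the "about half of the list" claim concrete, here is a minimal sketch that counts how many nodes a search inspects. The helpers `build_ordered`, `steps_ordered`, and `steps_unordered` are hypothetical names introduced only for this illustration, and the experiment assumes a set of 100 odd numbers searched for even values that are never present.

```python
class Link:
    """Minimal linked list, mirroring the Link class used in this section."""
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest


def build_ordered(values):
    """Hypothetical helper: build a Link from a list already sorted ascending."""
    result = Link.empty
    for value in reversed(values):
        result = Link(value, result)
    return result


def steps_ordered(s, v):
    """Count nodes inspected when the search may stop at the first element >= v."""
    count = 0
    while s is not Link.empty:
        count += 1
        if s.first >= v:
            break
        s = s.rest
    return count


def steps_unordered(s, v):
    """Count nodes inspected when the search can only stop on an exact match."""
    count = 0
    while s is not Link.empty:
        count += 1
        if s.first == v:
            break
        s = s.rest
    return count


s = build_ordered(list(range(1, 200, 2)))  # 1, 3, ..., 199 (100 elements)
missing = range(0, 200, 2)                 # 0, 2, ..., 198 (none are in s)

ordered_avg = sum(steps_ordered(s, v) for v in missing) / len(missing)
unordered_avg = sum(steps_unordered(s, v) for v in missing) / len(missing)

print(ordered_avg)    # 50.5
print(unordered_avg)  # 100.0
```

Both searches are still linear in the size of the set, but under these assumptions the ordered search inspects roughly half as many nodes for a missing value.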

### Adjoin

In [None]:
>>> t = adjoin(s, 2)

This creates a new set that contains every element of `s` together with `2`, and assigns it to `t`.

1. Python compares `2` with the first element, `1`. Since `2` is greater, Python moves on to the next element.
2. Python looks at `3` and sees that `2` is less than `3`, so Python inserts `2` before `3`.

<img src = 'insert.jpg' width = 900/>

3. Python now completes the new set by adding back, in front of `2`, the elements that were less than `2` (here, just `1`).

<img src = 'complete.jpg' width = 900/>

The order of growth of `adjoin` is also $\Theta(n)$, since most of the work is finding where to insert the new element.

Let's implement both of these operations!

## Demo - Sets as Sorted Sequences

This time we're going to implement sets as sorted sequences.

In [6]:
# The definition of empty doesn't change
def empty(s):
    return s is Link.empty

This time, `contains` becomes slightly faster since we can return `False` immediately if `s.first` is greater than `v`.

In [7]:
def contains(s, v):
    if empty(s) or s.first > v:
        return False
    elif s.first == v:
        return True
    else:
        return contains(s.rest, v)
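
The recursive `contains` above makes one call per element it visits, so on a very long list it could exceed Python's recursion limit (typically 1000 frames by default in CPython). As a hedged alternative, here is a sketch of a loop-based version with the same early exit. The name `contains_iterative` is hypothetical and not part of the original demo.

```python
class Link:
    """Minimal linked list, mirroring the Link class used in this section."""
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest


def contains_iterative(s, v):
    """Loop-based search on an ordered list (hypothetical variant of contains)."""
    # Skip every element smaller than v; stop at the first element >= v.
    while s is not Link.empty and s.first < v:
        s = s.rest
    # v is present only if we stopped exactly on it.
    return s is not Link.empty and s.first == v


# Build 1, 3, ..., 9999 with a loop so that construction is not recursive either.
big = Link.empty
for value in range(9999, 0, -2):
    big = Link(value, big)

print(contains_iterative(big, 9999))   # True
print(contains_iterative(big, 4))      # False
print(contains_iterative(big, 10000))  # False
```

The logic is the same as the recursive definition: walk forward while the elements are too small, then check whether the search stopped on `v`.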

And we'll define `adjoin` as well.

In [8]:
def adjoin(s, v):
    # if s is empty or if v is less than the very first element
    if empty(s) or v < s.first:
        # construct a linked list with v as the first value and s as the rest of the list
        return Link(v, s)
    # if v is already in the set, do nothing
    elif v == s.first:
        return s
    # if v is supposed to be inserted somewhere within the linked list
    else:
        return Link(s.first, adjoin(s.rest, v))
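
To see what the last branch builds, here is a small self-contained sketch. It repeats the definitions from this section so it runs on its own, and then checks that the new set reuses the unchanged tail of the original list rather than copying it. The identity check with `is` is an addition for illustration only.

```python
class Link:
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest

    def __repr__(self):
        if self.rest:
            rest_str = ', ' + repr(self.rest)
        else:
            rest_str = ''
        return 'Link({0}{1})'.format(self.first, rest_str)


def empty(s):
    return s is Link.empty


def adjoin(s, v):
    if empty(s) or v < s.first:
        return Link(v, s)
    elif v == s.first:
        return s
    else:
        return Link(s.first, adjoin(s.rest, v))


s = Link(1, Link(3, Link(5)))
t = adjoin(s, 2)

print(t)  # Link(1, Link(2, Link(3, Link(5))))
print(s)  # Link(1, Link(3, Link(5)))

# New links are created only for the elements in front of the insertion point;
# everything from 3 onward is the very same object in both lists.
print(t.rest.rest is s.rest)  # True
```

Only the links before the insertion point are rebuilt, which is why `s` itself is left untouched.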
    

Let's test these functions! Below we have the `Link` definition.

In [9]:
# Linked lists
class Link:
    empty = ()
    
    def __init__(self, first, rest = empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest
        
    def __repr__(self):
        if self.rest:
            rest_str = ', ' + repr(self.rest)
        else:
            rest_str = ''
        return 'Link({0}{1})'.format(self.first, rest_str)

In [10]:
s = Link(1, Link(3, Link(5)))

In [11]:
contains(s, 1)

True

In [12]:
contains(s, 2)

False

In [13]:
contains(s, 6)

False

In [14]:
adjoin(s, 3)

Link(1, Link(3, Link(5)))

In [15]:
adjoin(s, 0)

Link(0, Link(1, Link(3, Link(5))))

Note that `adjoin` doesn't change the original set `s`.

In [16]:
adjoin(s, 2)

Link(1, Link(2, Link(3, Link(5))))
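
The table earlier also lists `intersect` as a set operation, but this demo does not implement it. As a hedged sketch of how the ordering could help there too, here is one possible version that walks both lists in step. It is an assumption about how such a function might look, not a definition taken from this section.

```python
class Link:
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest

    def __repr__(self):
        if self.rest:
            rest_str = ', ' + repr(self.rest)
        else:
            rest_str = ''
        return 'Link({0}{1})'.format(self.first, rest_str)


def empty(s):
    return s is Link.empty


def intersect(set1, set2):
    """Sketch: elements common to two ordered sets (assumed implementation)."""
    if empty(set1) or empty(set2):
        return Link.empty
    e1, e2 = set1.first, set2.first
    if e1 == e2:
        # Shared element: keep it and advance in both lists.
        return Link(e1, intersect(set1.rest, set2.rest))
    elif e1 < e2:
        # e1 is smaller than everything left in set2, so it cannot be shared.
        return intersect(set1.rest, set2)
    else:
        # Symmetric case: e2 cannot appear in set1.
        return intersect(set1, set2.rest)


s = Link(1, Link(3, Link(5)))
u = Link(2, Link(3, Link(5, Link(7))))

print(intersect(s, u))  # Link(3, Link(5))
```

Each step discards at least one element from one of the two lists, so under these assumptions the number of steps is at most the sum of the two set sizes.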