## ADTs

Data structures, when viewed in terms of their storage (is it an array, a linked list, or a dynamic array?), are concrete implementations. But when viewed in terms of what they can do, that is, the operations they support, they can be treated as Abstract Data Types (ADTs).

A stack can be thought of from the perspective of its operations: push, pop, and peek. The complexity of these operations depends upon the concrete underlying representation: a Python list, maybe, or our linked list class. Pushing at the front of a linked list is O(1), and so are popping and peeking. With a Python list, working at the front is O(n), so in Python it is better to use the back of the list, where pushing is amortized O(1) and popping and peeking are O(1).

Notice that the operations supported on the stack are a subset of, or even a facade over, the operations on the list or linked list (it's possible we might need to put multiple manipulations together). In this sense, the stack is better developed as a new class, with any implementation specifics of the underlying data structures hidden away.
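
As a minimal sketch of this idea, here is a stack written as a thin facade over a Python list. The class name `Stack` and the attribute `_items` are illustrative choices, not something defined earlier in these notes; the top of the stack is kept at the back of the list so that every operation stays cheap.

```python
class Stack:
    """A stack ADT whose representation (a Python list) stays hidden."""

    def __init__(self):
        self._items = []          # top of the stack is the END of the list

    def push(self, item):
        self._items.append(item)  # amortized O(1) at the back

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()  # O(1) at the back

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)


s = Stack()
for x in (1, 2, 3):
    s.push(x)
print(s.pop(), s.peek(), len(s))  # 3 2 2
```

A client only ever sees `push`, `pop`, `peek`, and `len`, so swapping the list for a linked list later would not change any calling code.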

If you don't do this, you are **evil**. You have **exposed** your **Representation**. Don't do it. Some tricks to achieve this are to make explicit copies of containers or iterators on the way in or out, and to only expose methods that are facades over the methods of the underlying storage. Additionally, use single underscores or double underscores for internal stuff...Python will even mangle names starting with two underscores for you (the client can still get at the methods, but then does so at their own peril).
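
The tricks above might look like the following sketch. `Roster` is a hypothetical class invented for illustration; it copies the container on the way in and on the way out, and uses a double-underscore attribute so that Python mangles the name.

```python
class Roster:
    """Hypothetical container that guards its representation."""

    def __init__(self, names):
        self.__names = list(names)   # copy on the way in

    def names(self):
        return list(self.__names)    # copy on the way out

    def __len__(self):
        return len(self.__names)


source = ["ada", "grace"]
r = Roster(source)
source.append("linus")        # mutating the caller's list does not affect r
r.names().append("guido")     # nor does mutating the returned copy
print(len(r))                 # 2
print(r._Roster__names)       # the mangled name is still reachable, at the client's peril
```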

### Specifying an ADT 

When you specify an ADT, you should **never, never** have anything about the representation leaking into your specification.

A **specification** is a **collection of procedural abstractions**. It is not a collection of procedures.

These abstractions provide a
- set of values, creatable/destroyable, manipulatable, and observable
- only by an API

How do we specify an ADT? We can do it in the informal way we have done so far: the concept of a sequence, or something that's iterable. But this could be thought of as too abstract a definition. It may help to be told that a stack is a sequence, but that does not help us define a stack ADT formally. There is nothing to stop us doing this informally though, and communicating it via documentation. Indeed this is what we mostly do in documenting APIs for clients.

But we could also (mis?)use inheritance for this by creating a base class and having implementations inherit from this. But these **interfaces** are important enough that languages provide support to make them explicit and verifiable. Java for example has *interfaces*, Smalltalk and Objective-C have protocols. In C++ the STL provides data-type agnostic stacks, vectors, etc as well with well defined interfaces and implementations under the hood.

Python provides us this whole gamut of possibilities as well, from the dynamic, just-documented protocols characteristic of duck typing, to Abstract Base Classes (ABCs) which make the interface explicit, and verifiable to boot.


How does this work?

- Firstly note that even without ABCs, a class has an interface defined by its publicly accessible attributes (methods or data). This also includes dunder methods; while clients rarely call these directly, they control the external behavior under functions like `len` and `iter`.

- Secondly we can use ABCs to define such protocols, and then mix multiple such protocols into a class to give the class its aggregated behavior. We shall see this use of mixins soon. Thus, for example, a class can be a sequence as well as an iterable.
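
As a small preview of the mixin idea, the sketch below subclasses `collections.abc.Sequence`. The class `Countdown` is a made-up example; it supplies only `__len__` and `__getitem__`, and the ABC mixes in `__iter__`, `__contains__`, `index`, and more on top of those two.

```python
from collections import abc


class Countdown(abc.Sequence):
    """Hypothetical sequence n-1, ..., 1, 0 defined only by __getitem__ and __len__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __getitem__(self, i):
        if not 0 <= i < self._n:
            raise IndexError(i)
        return self._n - 1 - i


c = Countdown(3)
print(list(c))          # [2, 1, 0]  (__iter__ comes from the Sequence mixin)
print(2 in c)           # True       (so does __contains__)
print(c.index(0))       # 2          (and index)
print(isinstance(c, abc.Iterable), isinstance(c, abc.Sized))  # True True
```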

### ABCs

How do these work? A simple example suffices:

In [325]:
class Answer:
    def __len__(self): 
        return 42
from collections import abc
isinstance(Answer(), abc.Sized), issubclass(Answer, abc.Sized)

(True, True)

NO EXPLICIT INHERITANCE HERE!

From https://hg.python.org/cpython/file/3.5/Lib/_collections_abc.py#l300  we can see:

```python
class Sized(metaclass=ABCMeta):

    __slots__ = ()

    @abstractmethod
    def __len__(self):
        return 0

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Sized:
            if any("__len__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented
```

The `isinstance` and `issubclass` checks are dispatched dynamically to `Sized.__subclasshook__`, which looks for a `__len__` in the class or any of its ancestors.

We'll take up ABCs in more detail next time (how they work), but before you rush out to use them, notice they involve `isinstance` checks. As Fluent Python puts it:

>However, even with ABCs, you should beware that excessive use of isinstance checks may be a code smell—a symptom of bad OO design. It’s usually not OK to have a chain of if/elif/elif with isinstance checks performing different actions depending on the type of an object: you should be using polymorphism for that—i.e., designing your classes so that the interpreter dispatches calls to the proper methods

>On the other hand, it’s usually OK to perform an isinstance check against an ABC if you must enforce an API contract: “Dude, you have to implement this if you want to call me,” as technical reviewer Lennart Regebro put it. That’s particularly useful in systems that have a plug-in architecture. Outside of frameworks, duck typing is often simpler and more flexible than type checks.

>ABCs are meant to encapsulate very general concepts, abstractions, introduced by a framework—things like “a sequence” and “an exact number.” [Readers] most likely don’t need to write any new ABCs, just use existing ones correctly, to get 99.9% of the benefits without serious risk of misdesign.
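
To make the "enforce an API contract" case concrete, here is a rough sketch. The function `describe` is hypothetical and stands in for a framework entry point; it performs a single `isinstance` check against an existing ABC rather than a chain of type tests.

```python
from collections import abc


def describe(container):
    """Hypothetical plug-in entry point that requires a sized argument."""
    if not isinstance(container, abc.Sized):
        raise TypeError("describe() needs an object that implements __len__")
    return "{} with {} item(s)".format(type(container).__name__, len(container))


print(describe([1, 2, 3]))     # list with 3 item(s)
print(describe({"a": 1}))      # dict with 1 item(s)
try:
    describe(42)
except TypeError as err:
    print(err)                 # describe() needs an object that implements __len__
```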

### The `SimpleSet` ADT

Here we are going to define a very simple ADT...a set with some simple operations. Thus we will use the ABC mechanism as a way of **documenting** our specification and getting verification for free.

In [328]:
import abc
class SimpleSetInterface(abc.ABC):
    
    @abc.abstractmethod
    def __len__(self):
        "A SimpleSet has a length"
        
    @abc.abstractmethod
    def __iter__(self):
        "iteration. order is not guaranteed"
    
    @abc.abstractmethod
    def __contains__(self, item)->bool:
        "A test for whether item is in set"
        
    @abc.abstractmethod
    def add(self, item)->None:
        "add item to set"
        
    @abc.abstractmethod
    def rem(self, item)->None:
        "delete item from set"
        
    @abc.abstractmethod
    def union(self, other:"SimpleSetInterface")->"SimpleSetInterface":
        "union with another set"
        
    @abc.abstractmethod
    def intersection(self, other:"SimpleSetInterface")->"SimpleSetInterface":
        "intersection with another set"

Notice that we cannot create a SimpleSetInterface explicitly.

In [329]:
a = SimpleSetInterface()

TypeError: Can't instantiate abstract class SimpleSetInterface with abstract methods __contains__, __iter__, __len__, add, intersection, rem, union
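
The same refusal applies to a subclass that inherits from an ABC but leaves some abstract method unimplemented. The sketch below uses a smaller, made-up interface (`Shape`) so that it is self-contained; the behavior would be the same for a class inheriting from `SimpleSetInterface`.

```python
import abc


class Shape(abc.ABC):
    """Hypothetical interface with two abstract methods."""

    @abc.abstractmethod
    def area(self):
        "area of the shape"

    @abc.abstractmethod
    def perimeter(self):
        "perimeter of the shape"


class HalfDone(Shape):          # forgets to implement perimeter
    def area(self):
        return 1.0


class Square(Shape):            # implements everything
    def __init__(self, side):
        self._side = side

    def area(self):
        return self._side ** 2

    def perimeter(self):
        return 4 * self._side


try:
    HalfDone()
except TypeError as err:
    print("refused:", err)

print(Square(2).area(), Square(2).perimeter())   # 4 8
```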

### Implementation

The implementation of an ADT is provided by a class for us. To do this we need to

- first choose a representation, the `rep`
- implement the procedure abstractions of the abstract class in terms of this `rep`

The representation ought to make the most frequently used procedures fast. But we might not know which these are at the outset. The abstraction then allows us to change representations later.

#### Set representation with list

Let's first implement a set simply as a list. We do so below, requiring some gymnastics to make sure that there are no duplicates when we use the "implemented abstract operations".

In [352]:
import reprlib
class SimpleSet1:
    """
    >>> A=SimpleSet1([1,2,3,1])
    >>> B=SimpleSet1([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    """
    def __init__(self, container=[]):
        if container:
            self._storage = list(container)
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        counter = 0
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
                counter += 1
        return counter
    
    def __iter__(self):
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
                yield ele
                
    def add(self, item):
        self._storage.append(item)
        
    def rem(self, item): #this is wrong
        index = self._storage.index(item)
        del self._storage[index]
        
    def union(self, other): #bust the representation here.
        return SimpleSet1(self._storage + other._storage)
    
    def intersection(self, other): #here too. ok but document
        intlist = filter(lambda x : x in other._storage, self._storage)
        return SimpleSet1(intlist)
    
    def __repr__(self):
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
        return reprlib.repr(slist).replace('[','{').replace(']','}')
    

In [371]:
SimpleSetInterface.register(SimpleSet1)

__main__.SimpleSet1

In [372]:
C=SimpleSet1([1,2,3,1])

In [374]:
isinstance(C, SimpleSetInterface), issubclass(SimpleSet1, SimpleSetInterface)

(True, True)

In [354]:
C #this is NOT part of the set interface

{1, 2, 3}

In [355]:
from doctest import run_docstring_examples as dtest
dtest(SimpleSet1, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet1([1,2,3,1])
Expecting nothing
ok
Trying:
    B=SimpleSet1([2,3,4,4,5])
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [1, 2, 3]
ok
Trying:
    sorted(list(A.union(B)))
Expecting:
    [1, 2, 3, 4, 5]
ok
Trying:
    sorted(list(A.intersection(B)))
Expecting:
    [2, 3]
ok
Trying:
    A.rem(1)
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [2, 3]
**********************************************************************
File "__main__", line ?, in NoName
Failed example:
    sorted(list(A))
Expected:
    [2, 3]
Got:
    [1, 2, 3]


The tests tell us there is something wrong in our implementation. Sure enough, deleting by `index` on a Python list removes only the first match, so the duplicate `1` survives. Let's fix it.

In [368]:
class SimpleSet1:
    """
    A simple set implementation that has some basic functionality.
    Implements SimpleSetInterface.
    
    AbsFun: the list [a,b,...,z] represents the
    smallest set containing all the elements a,b,...,z.
    The list may contain duplicates.
    [] represents the empty set.
    
    >>> A=SimpleSet1([1,2,3,1])
    >>> B=SimpleSet1([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    """
    def __init__(self, container=[]):
        if container:
            self._storage = list(container)
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        counter = 0
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
                counter += 1
        return counter
    
    def __iter__(self):
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
                yield ele
                
    def add(self, item):
        self._storage.append(item)
        
    def rem(self, item):
        indices_to_delete=[]
        for i, v in enumerate(self._storage):
            if v==item:
                indices_to_delete.append(i)
        for i in sorted(indices_to_delete, reverse=True):
            del self._storage[i]
        
    def union(self, other): #bust the representation here.
        return SimpleSet1(self._storage + other._storage)
    
    def intersection(self, other): #here too. ok but document
        intlist = filter(lambda x : x in other._storage, self._storage)
        return SimpleSet1(intlist)
    
    def __repr__(self):
        slist=[]
        for ele in self._storage:
            if ele not in slist:
                slist.append(ele)
        return reprlib.repr(slist).replace('[','{').replace(']','}')
    

OK! Now we test again...

In [369]:
from doctest import run_docstring_examples as dtest
dtest(SimpleSet1, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet1([1,2,3,1])
Expecting nothing
ok
Trying:
    B=SimpleSet1([2,3,4,4,5])
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [1, 2, 3]
ok
Trying:
    sorted(list(A.union(B)))
Expecting:
    [1, 2, 3, 4, 5]
ok
Trying:
    sorted(list(A.intersection(B)))
Expecting:
    [2, 3]
ok
Trying:
    A.rem(1)
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [2, 3]
ok


We passed the tests. Yay!

Notice that we added something strange to the documentation. It reads like this:

```
AbsFun: the list [a,b,...,z] represents the
    smallest set containing all the elements a,b,...,z.
    The list may contain duplicates.
    [] represents the empty set.
```

### The Abstract Function

The **Abstract Function** helps in telling us the meaning of our representation. It maps the concrete representation (here a list) to the abstract value (a set). It helps us, the implementors, reason from the client perspective.

What is the client perspective? The client should NOT be able to distinguish implementations based on their functional behavior. Here we have a list with repeated values giving us a set with unique ones. The client should not know this. But the implementer here knows that there is a loss of information in going from the list to the set...this loss of information is described by the **Abstract Function**.

![](http://www.cs.cornell.edu/courses/cs3110/2011sp/lectures/lec08-absfun-repinv/images/abst-fcn2.gif)

(diagram from cornell cs 3110)

Note that several lists may map to the same set, i.e. this function is many-to-one. Additionally some values in the domain may not map to any in the range (not true here, we'll see an example soon).
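
One way to make the many-to-one nature tangible is to write the abstraction function down as code. The helper `abs_fun` below is a hypothetical name, and mapping to a `frozenset` assumes the elements are hashable; it is a sketch for reasoning, not part of the set implementation.

```python
def abs_fun(storage):
    """Hypothetical abstraction function: list representation -> abstract set."""
    return frozenset(storage)


# Several concrete lists map to the same abstract value (many-to-one).
print(abs_fun([1, 2, 3, 1]) == abs_fun([3, 2, 1]) == abs_fun([2, 2, 1, 3, 3]))  # True
print(abs_fun([]) == frozenset())                                              # True
```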

### Refactoring our Implementation

Something about our implementation does not sit well. It seems unnecessarily loosey-goosey, and brittle...witness the mistake we met. There does not seem to be any way except for the *Abstraction Function* to formally reason about what the lists have. Indeed, perhaps the only way we might have been able to catch the deletion bug formally would have been to impose a post-condition on the deletion: that ALL values in the list implementation corresponding to the asked-for deletion were removed.
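
Such a post-condition could be sketched as a plain `assert` at the end of the deletion. The two free functions below (`remove_all` and `remove_first`) are hypothetical stand-ins for the fixed and the buggy `rem`, operating directly on a list so the example is self-contained.

```python
def remove_all(storage, item):
    """Remove every occurrence of item from the list storage, in place."""
    storage[:] = [v for v in storage if v != item]
    # post-condition: no occurrence of item survives
    assert item not in storage, "post-condition violated: item still present"


def remove_first(storage, item):
    """The buggy variant: only the first occurrence goes away."""
    del storage[storage.index(item)]
    assert item not in storage, "post-condition violated: item still present"


data = [1, 2, 3, 1]
remove_all(data, 1)
print(data)                     # [2, 3]

try:
    remove_first([1, 2, 3, 1], 1)
except AssertionError as err:
    print(err)                  # post-condition violated: item still present
```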

Now that we have our tests we can confidently refactor our implementation to one in which we have no duplicates in the list. Notice our Abstract function has changed somewhat, as it does not need to go through the contortions to represent the fact that we might have duplicates.

In [407]:
class SimpleSet2:
    """
    AbsFun: the list [a,b,...,z] represents the
    set  a,b,...,z.
    [] represents the empty set.
    
    Examples:
    
    >>> A=SimpleSet2([1,2,3,1])
    >>> B=SimpleSet2([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    >>> C=SimpleSet2()
    >>> C
    {}
    >>> sorted(list(C.union(A)))
    [2, 3]
    >>> sorted(list(C.intersection(A)))
    []
    """
    def __init__(self, container=[]):
        if container:
            self._storage=[]
            for ele in container:
                self.add(ele)
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        return len(self._storage)
    
    def __iter__(self):
        for ele in self._storage:
            yield ele
            
    def add(self, item):#this one is wrong
        self._storage.append(item)
        
    def rem(self, item): #this is now right
        index = self._storage.index(item)
        del self._storage[index]
        
    def union(self, other):
        return SimpleSet2(self._storage + other._storage)
    
    def intersection(self, other):
        intlist = list(filter(lambda x : x in other._storage, self._storage))
        return SimpleSet2(intlist)
    
    def __repr__(self):
        return reprlib.repr(self._storage).replace('[','{').replace(']','}')
    

In [408]:
dtest(SimpleSet2, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet2([1,2,3,1])
Expecting nothing
ok
Trying:
    B=SimpleSet2([2,3,4,4,5])
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [1, 2, 3]
**********************************************************************
File "__main__", line ?, in NoName
Failed example:
    sorted(list(A))
Expected:
    [1, 2, 3]
Got:
    [1, 1, 2, 3]
Trying:
    sorted(list(A.union(B)))
Expecting:
    [1, 2, 3, 4, 5]
**********************************************************************
File "__main__", line ?, in NoName
Failed example:
    sorted(list(A.union(B)))
Expected:
    [1, 2, 3, 4, 5]
Got:
    [1, 1, 2, 2, 3, 3, 4, 4, 5]
Trying:
    sorted(list(A.intersection(B)))
Expecting:
    [2, 3]
ok
Trying:
    A.rem(1)
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [2, 3]
**********************************************************************
File "__main__", line ?, in NoName
Failed example:
    sorted(list(A))
Expected:
    [2, 3]
Got:
    

Our construction and union tests failed (intersection happened to pass). The problem is clearly in `add`: it violates our idea of the representation, namely that the list must have unique values.

Indeed, notice our implementation of `__len__`: there is no uniqueness check any more. How do we know that we don't need one? Since the code does not say "no duplicates" anywhere, an implementer needs to go digging to figure this out, and won't be able to reason locally about whether `__len__` is implemented correctly or not.

Thus the constraint on our representation (implementation) needs to be clearly communicated, and further used for testing! Such a constraint is called a Representation Invariant.

### Representation Invariant (RI)

The representation Invariant tells us what MUST NOT CHANGE across multiple methods in the concrete implementations. The fact that the list has no duplicates must be respected by all concrete operations. In other words it captures whatever we must do and maintain on the underlying data structure to keep our external interface correct.

The abstraction function tells us the loss of information we pass on to our users. Its domain consists of all possible lists. Remember we said that some lists might not map, via the AbsFun, to interface values? Which ones won't? The representation invariant tells us. In other words, the RI tells us which concrete data are valid representations of abstract values.

The nature of the RI can be captured now in this diagram:

![](http://www.cs.cornell.edu/courses/cs3110/2011sp/lectures/lec08-absfun-repinv/images/ri-af.png)

(diagram from cornell cs 3110)

OK, so let's add that in to our documentation. And what we will do is to define a function `repOK` whose job it is to make sure all our operations obey this representation invariant.

In [409]:
def repOK(inlist):
    testlist=[]
    for item in inlist:
        if item not in testlist:
            testlist.append(item)
    assert len(testlist)==len(inlist), "there are duplicates {} {}".format(len(testlist), len(inlist))
    return inlist

In [420]:
class SimpleSet2:
    """
    AbsFun: the list [a,b,...,z] represents the
    set  a,b,...,z.
    [] represents the empty set.
    
    RepInv: the list contains no duplicates.
    
    Examples:
    
    >>> A=SimpleSet2([1,2,3,1])
    >>> B=SimpleSet2([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    >>> C=SimpleSet2()
    >>> C
    {}
    >>> sorted(list(C.union(A)))
    [2, 3]
    >>> sorted(list(C.intersection(A)))
    []
    """
    def __init__(self, container=[]):
        if container:
            self._storage=[]
            for ele in container:
                self.add(ele)
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        return len(self._storage)
    
    def __iter__(self):
        for ele in self._storage:
            yield ele
            
    def add(self, item):#this one is wrong
        self._storage.append(item)
        repOK(self._storage)
        
    def rem(self, item): #this is now right
        index = self._storage.index(item)
        del repOK(self._storage)[index]
        repOK(self._storage)
        
    def union(self, other):
        s = SimpleSet2(repOK(self._storage) + repOK(other._storage))
        repOK(s._storage)
        return s
    
    def intersection(self, other):
        intlist = list(filter(lambda x : x in other._storage, repOK(self._storage)))
        s = SimpleSet2(intlist)
        repOK(s._storage)
        return s
    
    def __repr__(self):
        return reprlib.repr(self._storage).replace('[','{').replace(']','}')
    
    


In [421]:
dtest(SimpleSet2, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet2([1,2,3,1])
Expecting nothing
**********************************************************************
File "__main__", line ?, in NoName
Failed example:
    A=SimpleSet2([1,2,3,1])
Exception raised:
    Traceback (most recent call last):
      File "//anaconda/envs/py35/lib/python3.5/doctest.py", line 1320, in __run
        compileflags, 1), test.globs)
      File "<doctest NoName[0]>", line 1, in <module>
        A=SimpleSet2([1,2,3,1])
      File "<ipython-input-420-431e5b225ea0>", line 34, in __init__
        self.add(ele)
      File "<ipython-input-420-431e5b225ea0>", line 53, in add
        repOK(self._storage)
      File "<ipython-input-409-18c2b03fb366>", line 6, in repOK
        assert len(testlist)==len(inlist), "there are duplicates {} {}".format(len(testlist), len(inlist))
    AssertionError: there are duplicates 3 4
Trying:
    B=SimpleSet2([2,3,4,4,5])
Expecting nothing
*********************************************************

Aha, by testing the repinv we fail immediately. Let's fix this:

In [424]:
class SimpleSet2:
    """
    AbsFun: the list [a,b,...,z] represents the
    set  a,b,...,z.
    [] represents the empty set.
    
    RepInv: the list contains no duplicates.
    
    Examples:
    
    >>> A=SimpleSet2([1,2,3,1])
    >>> B=SimpleSet2([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    >>> C=SimpleSet2()
    >>> C
    {}
    >>> sorted(list(C.union(A)))
    [2, 3]
    >>> sorted(list(C.intersection(A)))
    []
    """
    def __init__(self, container=[]):
        if container:
            self._storage=[]
            for ele in container:
                self.add(ele)
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        return len(self._storage)
    
    def __iter__(self):
        for ele in self._storage:
            yield ele
            
    def add(self, item):
        if item not in repOK(self._storage):
            self._storage.append(item)
        repOK(self._storage)
        
    def rem(self, item): #this is now right
        index = self._storage.index(item)
        del repOK(self._storage)[index]
        repOK(self._storage)
        
    def union(self, other):
        s = SimpleSet2(repOK(self._storage) + repOK(other._storage))
        repOK(s._storage)
        return s
    
    def intersection(self, other):
        intlist = list(filter(lambda x : x in other._storage, repOK(self._storage)))
        s = SimpleSet2(intlist)
        repOK(s._storage)
        return s
    
    def __repr__(self):
        return reprlib.repr(self._storage).replace('[','{').replace(']','}')
    
    


In [425]:
dtest(SimpleSet2, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet2([1,2,3,1])
Expecting nothing
ok
Trying:
    B=SimpleSet2([2,3,4,4,5])
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [1, 2, 3]
ok
Trying:
    sorted(list(A.union(B)))
Expecting:
    [1, 2, 3, 4, 5]
ok
Trying:
    sorted(list(A.intersection(B)))
Expecting:
    [2, 3]
ok
Trying:
    A.rem(1)
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [2, 3]
ok
Trying:
    C=SimpleSet2()
Expecting nothing
ok
Trying:
    C
Expecting:
    {}
ok
Trying:
    sorted(list(C.union(A)))
Expecting:
    [2, 3]
ok
Trying:
    sorted(list(C.intersection(A)))
Expecting:
    []
ok


Notice that having a pass through `repOK` in conjunction with testing saves the day. Clearly all these `repOK` calls in production code may slow things down too much. Python usually runs in debug mode, but turning on optimization (`-O`) will make the `assert` a no-op. The uniqueness computation inside `repOK` still runs, though, and that still costs.

Thus whether to keep the repinv checks in or not is a decision you must make. It might be worth at least keeping them in comments. (Notice also that I did not test non-destructive methods; to be complete you might want to `repOK` them as well.)
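
One possible middle ground, sketched here under assumptions, is to attach the check with a decorator that does nothing at all when Python runs with `-O`. The names `no_duplicates`, `checks_rep`, and `TinySet` are invented for this illustration and are not part of the implementation above.

```python
import functools


def no_duplicates(storage):
    """Rep invariant for the list-backed set: every element appears once."""
    seen = []
    for item in storage:
        if item in seen:
            return False
        seen.append(item)
    return True


def checks_rep(method):
    """Check the invariant before and after a method, but only in debug mode."""
    if not __debug__:            # under `python -O` return the method untouched
        return method

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        assert no_duplicates(self._storage), "RepInv broken on entry"
        result = method(self, *args, **kwargs)
        assert no_duplicates(self._storage), "RepInv broken on exit"
        return result
    return wrapper


class TinySet:
    def __init__(self):
        self._storage = []

    @checks_rep
    def add(self, item):
        if item not in self._storage:
            self._storage.append(item)

    @checks_rep
    def bad_add(self, item):     # deliberately violates the invariant
        self._storage.append(item)


t = TinySet()
t.add(1); t.add(1)
print(t._storage)                # [1]
try:
    t.bad_add(1)
except AssertionError as err:
    print(err)                   # RepInv broken on exit
```

Because the `__debug__` test happens once, at decoration time, the optimized run pays neither for the assertion nor for the uniqueness scan.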

Notice also that it's hard to set any representation invariant on our initial write of the code, and that a refactoring with a clear representation invariant gets us quite far. The general process of refactoring involves making your code DRY, and more generally refactoring larger functions into smaller, testable ones.

In [426]:
class SimpleSet2:
    """
    AbsFun: the list [a,b,...,z] represents the
    set  a,b,...,z.
    [] represents the empty set.
    
    RepInv: the list contains no duplicates.
    
    Examples:
    
    >>> A=SimpleSet2([1,2,3,1])
    >>> B=SimpleSet2([2,3,4,4,5])
    >>> sorted(list(A))
    [1, 2, 3]
    >>> sorted(list(A.union(B)))
    [1, 2, 3, 4, 5]
    >>> sorted(list(A.intersection(B)))
    [2, 3]
    >>> A.rem(1)
    >>> sorted(list(A))
    [2, 3]
    >>> C=SimpleSet2()
    >>> C
    {}
    >>> sorted(list(C.union(A)))
    [2, 3]
    >>> sorted(list(C.intersection(A)))
    []
    """
    def __init__(self, container=[]):
        if container:
            self._storage=[]
            for ele in container:
                self.add(ele)#makes sure repinv is respected
        else:
            self._storage = []
        
    def __contains__(self, item):
        if item in self._storage:
            return True
        else:
            return False
        
    def __len__(self):
        return len(self._storage)
    
    def __iter__(self):
        for ele in self._storage:
            yield ele
            
    def add(self, item):
        #repOK(self._storage)
        if item not in self._storage:
            self._storage.append(item)
        #repOK(self._storage)
        
    def rem(self, item): #this is now right
        #repOK(self._storage)
        self._storage.remove(item)
        #repOK(self._storage)
        
    def union(self, other):
        #repOK(self._storage)
        #repOK(other._storage)
        s = SimpleSet2(self._storage + other._storage)
        #repOK(s._storage)
        return s
    
    def intersection(self, other):
        #repOK(self._storage)
        intlist = list(filter(lambda x : x in other._storage, self._storage))
        s = SimpleSet2(intlist)
        #repOK(s._storage)
        return s
    
    def __repr__(self):
        return reprlib.repr(self._storage).replace('[','{').replace(']','}')
    
    


In [427]:
dtest(SimpleSet2, globals(), verbose=True)

Finding tests in NoName
Trying:
    A=SimpleSet2([1,2,3,1])
Expecting nothing
ok
Trying:
    B=SimpleSet2([2,3,4,4,5])
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [1, 2, 3]
ok
Trying:
    sorted(list(A.union(B)))
Expecting:
    [1, 2, 3, 4, 5]
ok
Trying:
    sorted(list(A.intersection(B)))
Expecting:
    [2, 3]
ok
Trying:
    A.rem(1)
Expecting nothing
ok
Trying:
    sorted(list(A))
Expecting:
    [2, 3]
ok
Trying:
    C=SimpleSet2()
Expecting nothing
ok
Trying:
    C
Expecting:
    {}
ok
Trying:
    sorted(list(C.union(A)))
Expecting:
    [2, 3]
ok
Trying:
    sorted(list(C.intersection(A)))
Expecting:
    []
ok


## Modularity

The idea behind refactoring is to make sure we have no repeated code, everything is readable and testable, and most importantly, you have made it easier for future you. The usual direction this modularity goes in is to have many small classes and functions with loose coupling between them.

Here are the pros and cons to this:

- small classes/modules mean interfaces with only a few abstract procedures.
- this means simple specs for interfaces
- this also means invariants are local. What is this? For the many functions part of modularity we can judge and test what a function does independent of the other functions. Now, with AbsFun and RepInv, we can do the same for an ADT.
- Notice that this makes writing pre- and post-conditions harder. If you remember our binary search implementation, we could have modularized it further, but communicating the pre- and post-conditions would have got harder with greater modularity. Still, remember how complex our binary-search spec was?
- the correctness is now easier to reason about and test on a per-function and per-method basis
- but we are less performant because we have many additional function calls. Also, since everything is not in one place, it's harder to play optimization tricks


**You must exercise your own judgement** as to where you want to make this tradeoff between loose coupling/modularity and tight coupling/performance. As scientists, we are often exposed to the latter: a performant array for example precludes all sorts of nice streaming algorithms, duplicates memory, etc and makes things more monolithic. But where it is an advantage to ease of programming, it behooves us to choose narrow and nice interfaces. We'll see an example of this next time.


## Back to ABCs

Above we "registered" SimpleSet1 with the SimpleSetInterface ABC, and found that even without ANY explicit inheritance, SimpleSet1 was found to be a subclass of SimpleSetInterface. This is very useful in Python since Python supports multiple inheritance, and we can thus "mix in" different protocols into what we do.

On such registration, Python will just believe us without checking that we implement all the abstract methods. If we don't, we will get runtime exceptions.
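
A small sketch of what "just believe us" means in practice. The names `Sizable` and `Liar` are hypothetical: the class is registered with the ABC, passes the `isinstance` check, and only fails when the missing method is actually needed.

```python
import abc


class Sizable(abc.ABC):
    """Hypothetical interface: anything with a length."""

    @abc.abstractmethod
    def __len__(self):
        "number of items"


class Liar:
    """Registered below, but does not implement __len__."""


Sizable.register(Liar)
obj = Liar()                      # instantiation is not blocked either
print(isinstance(obj, Sizable))   # True: registration is taken on trust
try:
    len(obj)
except TypeError as err:
    print("runtime failure:", err)
```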

Do you remember the `Answer` class from above which was a subclass of `Sized`? That wasn't even registered!

This is because `abc.Sized` implements a class method `__subclasshook__`, which had this implementation...

```python
    @classmethod
    def __subclasshook__(cls, C):
        if cls is Sized:
            if any("__len__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented
```

This says: when you call `issubclass` on the class in question, C, check whether C or any class in its method resolution order (`C.__mro__`) defines `__len__`; if one does, C is a subclass. This is precisely what happens here.
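
The same hook can be written for an ABC of our own. The sketch below mirrors the `Sized` implementation for a made-up ABC called `Addable`, which treats any class defining an `add` method as a subclass; the names are illustrative only.

```python
import abc


class Addable(abc.ABC):
    """Hypothetical ABC: any class that defines an `add` method qualifies."""

    @abc.abstractmethod
    def add(self, item):
        "add item"

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Addable:
            if any("add" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented


class Bag:
    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)


print(issubclass(Bag, Addable), isinstance(Bag(), Addable))  # True True
print(issubclass(set, Addable))    # True: the built-in set has add
print(issubclass(list, Addable))   # False: list has append, not add
```

Note that the hook only looks at method names; it says nothing about whether `add` behaves the way the interface intends.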

Thus, in fact, the ad-hoc protocols we have been talking about in Python actually do have a formal counterpart, even though no inheritance is actually going on. Should you then rush to write your own ABCs?

Unless you are a framework creator, don't. Indeed even SimpleSetInterface did not have to be an ABC; it could have been simply documentation. But the ideas of interface and implementation separation are super important, so whatever method you want to use to document your interface is better than none, even if it is declaring an ABC.