## Sequences in Python

We talked about "listiness" earlier. Let's expand on it.

Python takes the idea of **protocols** very seriously. 

We have seen that a protocol, the idea that objects of different but related types carry out the same behavior, is key to making software general. For example, all shapes will have an `area` method, all animals will have a `call` method, and so on.
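
As a reminder of what that looks like, here is a minimal sketch. The `Circle` and `Square` classes and the `total_area` helper are hypothetical names made up for this illustration; the only thing that matters is that both shapes provide an `area` method.

```python
import math


class Circle:
    """Hypothetical shape, used only for illustration."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    """Another hypothetical shape with the same `area` method."""

    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def total_area(shapes):
    # Works for any objects that provide an `area()` method.
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2), Square(3)]))  # prints 13
```

`total_area` never checks what kind of shape it was given; it only relies on the shared method.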

But one of the key aspects of Python that makes it a very natural language to program in is that this idea of polymorphism, this notion of protocols, pervades almost every built-in class in the language. These classes may not be related (as different shapes are, and different animals are), but they still have the same methods, and thus can be used in the same ways.

### Sequences

To see this, let us look at several things in Python that behave like sequences but are otherwise different kinds of containers. All of these objects are said to follow the **sequence protocol**, which simply says that (a) you should be able to loop over them, and (b) they should have a `len`gth.

In [2]:
lst = ['hi', 7, 'c', 2.2]
mystring = 'Hi !'
nums = [1, 4, 7, 9, 12]

`nums` is a list of numbers. You can iterate over it in a list comprehension. `lst` is a list too, but of different kinds of things.

In [3]:
evens = [e for e in nums if e%2 == 0]

In [4]:
for element in lst:
    print(element, type(element))

hi <class 'str'>
7 <class 'int'>
c <class 'str'>
2.2 <class 'float'>


But look here, the same loop can be used to iterate over strings.

In [5]:
for character in mystring:
    print(character)

H
i
 
!


Each of these has a length:

In [6]:
len(lst)

4

In [7]:
len(evens)

2

In [8]:
len(mystring)

4

And you can index into them:

In [9]:
lst[0], mystring[1], nums[2]

('hi', 'i', 7)

What happens if we look at a dictionary?

In [10]:
d = dict(
    name = 'Alice',
    age  = 18,
    gender = 'F'
)
print("age", d['age'])
for item in d:
    print(item)

age 18
name
age
gender


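We get the keys, in insertion order. (Since Python 3.7, dictionaries are guaranteed to preserve insertion order. In older versions the order was not guaranteed, and you needed an `OrderedDict` for that.)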

In [11]:
len(d)

3
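
Looping over a dictionary gives only its keys. As a small sketch of how to get at the rest, the built-in `values()` and `items()` methods can be looped over in the same way. The variable names `keys`, `values`, and `pairs` are just for this illustration, and the output comments assume Python 3.7 or later, where insertion order is preserved.

```python
d = dict(name='Alice', age=18, gender='F')

# Looping over the dictionary itself yields its keys.
keys = [key for key in d]

# `values()` and `items()` give the values and the (key, value) pairs.
values = list(d.values())
pairs = list(d.items())

print(keys)    # ['name', 'age', 'gender']
print(values)  # ['Alice', 18, 'F']
print(pairs)   # [('name', 'Alice'), ('age', 18), ('gender', 'F')]
```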

What about tuples?

In [12]:
tup = (1, 2, 3)

In [13]:
for i in tup:
    print(i)

1
2
3


In [14]:
len(tup)

3

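Lists, tuples, dictionaries, and strings all follow the *sequence protocol* as described above: you can loop over them and take their `len`. (Strictly speaking, a dictionary is a *mapping*: it is indexed by key rather than by position.) What if you wanted to write your own class that supports the protocol?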
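
Before writing one, it may help to see what the shared behavior buys us. The sketch below uses a hypothetical helper, `describe`, that relies only on `len` and looping, and so accepts any of the containers we have looked at so far.

```python
def describe(container):
    """Return the length and the items of any container that supports
    `len()` and iteration (hypothetical helper, for illustration)."""
    return len(container), [item for item in container]


print(describe(['hi', 7, 'c', 2.2]))  # (4, ['hi', 7, 'c', 2.2])
print(describe('Hi !'))               # (4, ['H', 'i', ' ', '!'])
print(describe((1, 2, 3)))            # (3, [1, 2, 3])
print(describe({'name': 'Alice'}))    # (1, ['name'])
```

One function, four different built-in types, and no type checks anywhere.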

### The sequence protocol, formally: Dunder methods

Any class that wants to be a sequence must support having a length, being iterated over, and being indexed. It turns out that in Python, this is ensured by defining 2 special methods within the class. The first is `__len__`, which calculates the length of an instance. When you call `len` on an instance of this class, Python will *dispatch* to this **dunder** (or double-underscore) method defined by the class. The second is `__getitem__(self, i)`, which tells you what's in the sequence at index `i`. With this in place, Python can iterate over the *sequence* by asking for index 0, 1, 2, and so on, until `__getitem__` raises an `IndexError` at the end. Here is such an implementation:

In [15]:
class Vector:
    
    def __init__(self, lst):
        self.storage = lst
        
    def __len__(self):
        return len(self.storage)
    
    def __getitem__(self, i):
        return self.storage[i]



This is not a particularly interesting implementation: it simply *delegates* the length calculation and indexing to the underlying list stored in the `self.storage` instance variable.

In [16]:
v = Vector([3, 4, 5, 6])

In [17]:
v

<__main__.Vector at 0x112966710>

In [18]:
v[1]

4

In [19]:
for i in v:
    print(i)

3
4
5
6
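
The loop works even though `Vector` defines no iteration method of its own. When a class has `__getitem__` but no `__iter__`, Python falls back to asking for index 0, 1, 2, and so on until an `IndexError` is raised. The sketch below spells that out by hand; `manual_iterate` is a hypothetical helper written for this illustration, not something Python itself exposes.

```python
class Vector:

    def __init__(self, lst):
        self.storage = lst

    def __len__(self):
        return len(self.storage)

    def __getitem__(self, i):
        return self.storage[i]


def manual_iterate(seq):
    """Roughly what `for x in seq` does when `seq` has `__getitem__`
    but no `__iter__` (hypothetical helper, for illustration only)."""
    items = []
    i = 0
    while True:
        try:
            items.append(seq[i])  # dispatches to seq.__getitem__(i)
        except IndexError:        # raised by the underlying list at the end
            break
        i += 1
    return items


v = Vector([3, 4, 5, 6])
print(manual_iterate(v))  # [3, 4, 5, 6]
print(list(v))            # [3, 4, 5, 6]
```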


Simple as it is, this class shows how easy it is to behave like a sequence. This type of protocol/polymorphism is called *Duck Typing*, a term attributed to Alex Martelli, which roughly means: if it quacks like a duck (sequence), it is a duck (sequence).
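
To make the duck typing concrete, here is a small sketch. The function `first_and_last` is a hypothetical helper that uses nothing but `len` and integer indexing, so it cannot tell our `Vector` apart from the built-in sequences.

```python
class Vector:

    def __init__(self, lst):
        self.storage = lst

    def __len__(self):
        return len(self.storage)

    def __getitem__(self, i):
        return self.storage[i]


def first_and_last(seq):
    """Return the first and last items of anything that supports
    `len()` and integer indexing (hypothetical helper)."""
    if len(seq) == 0:
        raise ValueError("empty sequence")
    return seq[0], seq[len(seq) - 1]


print(first_and_last([1, 4, 7, 9, 12]))       # (1, 12)
print(first_and_last('Hi !'))                 # ('H', '!')
print(first_and_last((1, 2, 3)))              # (1, 3)
print(first_and_last(Vector([3, 4, 5, 6])))   # (3, 6)
```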

## Printing nicer

In [20]:
v

<__main__.Vector at 0x112966710>

This is not a particularly useful printout. Let's fix this:

In [22]:
class Vector:
    
    def __init__(self, lst):
        self.storage = lst.copy()  # copy, so later changes to lst don't affect the vector
        
    def __len__(self):
        return len(self.storage)
    
    def __getitem__(self, i):
        return self.storage[i]
    
    def __repr__(self):
        return f"Vector({self.storage})"

v = Vector([3, 4, 5, 6])

In [23]:
v

Vector([3, 4, 5, 6])

Much better! Implementing `__repr__` lets us print our object in a human-readable way. We have used a format string, or f-string, to do this. It's a weird kind of string, with an "f" in front of it. Python uses this "one letter in front of the string" syntax for alternative kinds of strings, such as f-strings (start with f), byte strings (start with b), etc.

In an f-string you can interpolate Python expressions by enclosing them in braces. Here we put in the full contents of `self.storage`. That is terrible if the list is really long, but we'll let it be for now.
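
If the long-list case does become a problem, one possible fix (not the one used in this notebook) is the standard library's `reprlib.repr`, which abbreviates long containers. The sketch below assumes we are happy with `reprlib`'s default cut-off for lists.

```python
import reprlib


class Vector:

    def __init__(self, lst):
        self.storage = lst.copy()

    def __len__(self):
        return len(self.storage)

    def __getitem__(self, i):
        return self.storage[i]

    def __repr__(self):
        # reprlib.repr abbreviates long containers with '...'
        return f"Vector({reprlib.repr(self.storage)})"


print(repr(Vector([3, 4, 5, 6])))      # Vector([3, 4, 5, 6])
print(repr(Vector(list(range(100)))))  # Vector([0, 1, 2, 3, 4, 5, ...])
```

Short vectors print exactly as before; only long ones are cut off.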