# 07 - In-Place Concatenation and Repetition

#### In-Place Concatenation

We saw that using concatenation ended up creating a new sequence object:

In [36]:
l1 = [1, 2, 3, 4]
l2 = [5, 6]
print(f'{id(l1)=} (before concatenation)', l1)
print(f'{id(l2)=}', l2)

l1  = l1 + l2 
print(f'{id(l1)=} (after concatenation)', l1)

id(l1)=140117538850368 (before concatenation) [1, 2, 3, 4]
id(l2)=140117624613760 [5, 6]
id(l1)=140117538861760 (after concatenation) [1, 2, 3, 4, 5, 6]


But watch what happens when we use the in-place concatenation operator `+=`:

In [37]:
l1 = [1, 2, 3, 4]
l2 = [5, 6]
print(f'{id(l1)=} (before concatenation)', l1)
print(f'{id(l2)=}', l2)

l1 += l2 
print(f'{id(l1)=} (after concatenation)', l1)

id(l1)=140117538807616 (before concatenation) [1, 2, 3, 4]
id(l2)=140117539448192 [5, 6]
id(l1)=140117538807616 (after concatenation) [1, 2, 3, 4, 5, 6]


**When using `+=` on a mutable object (that supports this operator), the object is mutated.**

If we had used a tuple instead of a list above, Python would have had no option but to create a new sequence object to store the concatenated data, because tuples are immutable.
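To make the tuple case concrete, here is a minimal sketch (the names `t` and `alias` are illustrative, not from the lecture). Keeping a second reference to the original tuple shows that `+=` rebinds the name to a new object rather than mutating the existing one.

```python
t = (1, 2, 3, 4)
alias = t              # second reference to the same tuple object

t += (5, 6)            # tuples have no in-place add, so this is t = t + (5, 6)

print(t)               # (1, 2, 3, 4, 5, 6)
print(alias)           # (1, 2, 3, 4) - the original tuple is unchanged
print(t is alias)      # False - t now refers to a new object
```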

Another point to note is that normally we cannot concatenate a list and a tuple.

In [45]:
l1 = [1, 2, 3, 4]
t1 = (5, 6)

l1 = l1 + t1
print(l1)

TypeError: can only concatenate list (not "tuple") to list

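But we CAN do so **if we use the in-place concatenation operator**, because `+=` on a list accepts any iterable (it behaves like `extend`). This only works when the mutable list is on the left-hand side: a tuple is immutable and has no in-place concatenation, so `t1 += l1` falls back to `t1 = t1 + l1`, which raises the same kind of `TypeError`.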

In [49]:
l1 = [1, 2, 3, 4]
t1 = (5, 6)

l1 += t1
print(l1)

[1, 2, 3, 4, 5, 6]
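A minimal sketch of both directions, assuming the behaviour of the built-in `list` and `tuple` types (the variable names are illustrative). The list accepts any iterable on the right of `+=`, while the tuple version fails and leaves the tuple untouched.

```python
nums = [1, 2]
nums += (3, 4)         # behaves like nums.extend((3, 4))
nums += 'ab'           # any iterable works, including a string
print(nums)            # [1, 2, 3, 4, 'a', 'b']

pair = (1, 2)
try:
    pair += [3, 4]     # falls back to pair = pair + [3, 4]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

print(pair)            # (1, 2) - unchanged
```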


#### In-Place Repetition

**The same is true for the `*=` operator.**

In [41]:
l1 = [1, 2, 3, 4]
l2 = [5, 6]
print(f'{id(l1)=} (before repetition)', l1)
print(f'{id(l2)=}', l2)

l1 *= 2
print(f'{id(l1)=} (after repetition)', l1)

id(l1)=140117539326464 (before repetition) [1, 2, 3, 4]
id(l2)=140117539440000 [5, 6]
id(l1)=140117539326464 (after repetition) [1, 2, 3, 4, 1, 2, 3, 4]


So the key takeaway from this section is that `a = a + b` is NOT equivalent to `a += b` if `a` is a mutable object; in the former, a new object is created and `a` points to it, and in the latter, `a` is modified in place.
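The difference matters most when another name refers to the same object. The following sketch (with illustrative names `a`, `b` and `alias`) shows that only the in-place form is visible through the second reference.

```python
a = [1, 2]
b = [3]
alias = a              # second name for the same list

a = a + b              # builds a new list and rebinds a
print(alias)           # [1, 2] - the original list was not touched
print(a is alias)      # False

a = [1, 2]
alias = a
a += b                 # mutates the existing list in place
print(alias)           # [1, 2, 3] - the change is visible through alias
print(a is alias)      # True
```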

# 08 - Assignments in Mutable Sequences

#### Replacing

Up until now we've used slicing to read elements from a sequence. But, as you know, we can also replace the elements in a slice using assignment. The replacement elements can come from any iterable, e.g. a tuple, a set, etc.

For regular slices (step size = 1, i.e. non-extended), the slice and iterable **need *not* be the same length**.

This operation **performs a mutation**.

In [52]:
l = [1, 2, 3, 4, 5]
l[1:2] = (10, 20, 30)
l

[1, 10, 20, 30, 3, 4, 5]

For extended slices, the slice and the iterable we are assigning on the RHS must have the **same length**:

In [54]:
l = [1, 2, 3, 4, 5]

In [57]:
l[::2]

[1, 3, 5]

In [58]:
l[::2] = ['a', 'b']

ValueError: attempt to assign sequence of size 2 to extended slice of size 3

In the last example, we are telling Python to replace the elements 1, 3 and 5. But we are only assigning two elements on the RHS. How will Python know where those two elements should go? Perhaps the 'a' should replace 1 and 'b' replace 3 and leave 5 as it is? That would be confusing so Python requires the same length on each side.

In [56]:
l = [1, 2, 3, 4, 5]
l[::2] = ['a', 'b', 'c']
l

['a', 2, 'b', 4, 'c']

#### Deleting

Deleting is a special case of replacement: we just replace the slice with an **empty** iterable. This only works for standard slicing, not extended slicing, because extended slice assignment requires the iterable to have the same length as the slice. We can't assign, say, 3 empty iterables to remove 3 non-contiguous (i.e. non-sequential) elements.

In [60]:
l = [1, 2, 3, 4, 5]
l[1:3]

[2, 3]

In [61]:
l[1:3] = []
l

[1, 4, 5]

The empty iterable we've assigned on the RHS is `[]` but we could've equivalently done `''` or `()` or `{}` if we wanted to.

In [69]:
l2 = [1, 2, 3, 4, 5]
print(l2[::2])

l2[::2] = []

[1, 3, 5]


ValueError: attempt to assign sequence of size 0 to extended slice of size 3
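As a side note that goes slightly beyond slice assignment: the `del` statement on a list does accept extended slices, so non-contiguous elements can still be removed that way. A short sketch with illustrative variable names:

```python
l2 = [1, 2, 3, 4, 5]
del l2[::2]            # removes the elements at indices 0, 2 and 4
print(l2)              # [2, 4]

l3 = [1, 2, 3, 4, 5]
del l3[1:3]            # same effect as l3[1:3] = []
print(l3)              # [1, 4, 5]
```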

#### Insertion

The trick for inserting elements is that the slice must be empty and the RHS must contain an iterable. If the slice is not empty, then it will replace those elements as opposed to insert next to it, so it will no longer be considered insertion.

Here's how we insert a string iterable `abc` at index 1.

In [70]:
l = [1, 2, 3, 4, 5]
print(l[1:1])
l[1:1] = 'abc'
l

[]


[1, 'a', 'b', 'c', 2, 3, 4, 5]

Again, this won't work for extended slicing because insertion requires a single empty slice at a given location which is replaced by our iterable, whereas extended slicing will always give us a non-empty slice (unless we write something like `l[3:3:1]` but that's just a technicality - it's identical to `l[3:3]`).

So a key takeaway for this section is that we can use regular slicing for replacing, deleting and inserting, and generally, the LHS and RHS **need not be the same length**. 

If we want to perform an operation on non-contiguous elements, then we'll have to use extended slicing to select those elements, but now the LHS and RHS **need to be the same length**. Since inserting requires an empty slice on the LHS while deleting requires an empty iterable on the RHS, these two operations are impossible using this approach.

# 09 - Custom Sequences - Part 2a

We have seen before how we could define our own custom sequence type by implementing the `__len__` and `__getitem__` methods.

Here we are going to look at how to implement:
* concatenation (`+`) : same as `.__add__()`
* in-place concatenation (`+=`) : same as `.__iadd__()`
* repetition (`seq * n`) : same as `.__mul__()`
* repetition reversed (`n * seq`) : same as `.__rmul__()`; Python tries this with the operands swapped when the left operand's `__mul__` returns `NotImplemented`
* in-place repetition (`*=`) : same as `.__imul__()`
* index assignment (`seq[i]=val`) : same as `.__setitem__()`
* slice assignment (`seq[i:j]=iter` and `seq[i:j:k]=iter`) : same as `.__setitem__()`
* deletion (`del seq[i]`) : same as `.__delitem__()`
* `<value> in <sequence>` : same as `.__contains__()`
* append, extend, pop : These are *not* special methods. If we want them, we just implement them as regular methods.

Implementing these methods lets us **overload** the corresponding operators for instances of our class.

#### The `+` and `+=` Operators

First we look at how we can overload the `+` and `+=` operators in a custom class in general. Then we'll look at how to use this in the context of sequences.

We use the special methods `__add__` and `__iadd__`.

Just to see how those methods get called, we're actually going to implement them to just print out that they were called. As you can see, we can implement them however we want!

In [7]:
class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __add__(self, other):
        print(f'You called + on {self} and {other}')
        return 'Hello from __add__'
        
    def __iadd__(self, other):
        print(f'You called += on {self} and {other}')
        return 'Hello from __iadd__'

In [8]:
c1 = MyClass('instance 1')
c2 = MyClass('instance 2')

In [9]:
c3 = c1 + c2

You called + on MyClass(name=instance 1) and MyClass(name=instance 2)


Let's try the in-place addition operator. We may expect mutation to occur and therefore, the id should stay the same.

In [10]:
print(id(c1))
c1 += c2
print(id(c1))
print(c1)

140528194484784
You called += on MyClass(name=instance 1) and MyClass(name=instance 2)
140528049348144
Hello from __iadd__


But it doesn't! That's because `c1` is now the string `"Hello from __iadd__"` which is a different object. So the special method `__iadd__` doesn't inherently impose this behaviour - we don't have to make it do that - but it's what everyone expects it to do so we should. The only behaviour that it imposes is allowing for `+=` functionality.

So, `__add__` expects us to take two objects (typically of the same type) and return a new object of that type. 

`__iadd__` expects us to take two objects and typically return the first object, mutated. So we're going to apply a change to the first object as opposed to creating a new instance altogether.

How do we do these two things in code?

In [11]:
class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __add__(self, other):
        return MyClass(self.name + ' ' + other.name)
        
    def __iadd__(self, other):
        self.name += ' ' + other.name   # Remember, this is NOT inplace concatenation / mutation 
                                        # because 'name' is a string which is immutable so Python is forced to create a new object.
        return self
        

In [12]:
c1 = MyClass('Eric')
c2 = MyClass('Idle')

In [13]:
c3 = c1 + c2

In [14]:
c3

MyClass(name=Eric Idle)

These two methods are not restrictive (which is not necessarily an issue). All we require of the 2nd operand is that it has a `name` attribute whose value can be concatenated with a string.

Also, to emphasise once more, `__iadd__` now ensures that the ID remains the same. This is because we return `self` (the original object) as opposed to a new `MyClass(self.name + ' ' + other.name)`.
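The identity claim can be checked with a second reference. In this sketch, `Name` is a hypothetical stand-in for `MyClass`, trimmed to the two methods of interest.

```python
class Name:
    """Hypothetical stand-in for MyClass with only __add__ and __iadd__."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        return Name(self.name + ' ' + other.name)   # new object

    def __iadd__(self, other):
        self.name += ' ' + other.name               # rebinds the attribute only
        return self                                 # the same object is returned

c1 = Name('Eric')
c2 = Name('Idle')
alias = c1

c3 = c1 + c2
print(c3 is c1)        # False - + produced a new instance

c1 += c2
print(c1 is alias)     # True - += kept the same instance
print(c1.name)         # Eric Idle
```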

#### The `*` and `*=` Operators

Just as easily we can overload the `*` and `*=` operators too, using the `__mul__` and `__imul__` methods.

In [14]:
class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __add__(self, other):
        return MyClass(self.name + ' ' + other.name)
        
    def __iadd__(self, other):
        self.name += ' ' + other.name
        return self
    
    def __mul__(self, n):
        return MyClass(self.name * n)
        
    def __imul__(self, n):
        self.name *= n
        return self

In [15]:
c1 = MyClass('Eric')

In [16]:
c1 * 3

MyClass(name=EricEricEric)

In [17]:
print(id(c1), c1)
c1 *= 3
print(id(c1), c1)

140528194484304 MyClass(name=Eric)
140528194484304 MyClass(name=EricEricEric)


What about multiplying an integer by the sequence?

In [25]:
c1 = MyClass('Monty')
2 * c1

TypeError: unsupported operand type(s) for *: 'int' and 'MyClass'

To handle this we need to implement the `__rmul__` method:

In [18]:
class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __mul__(self, n):
        return MyClass(self.name * n)
    
    def __rmul__(self, n):
        return self.__mul__(n)

In [19]:
c1 = MyClass('Monty')

In [20]:
2 * c1

MyClass(name=MontyMonty)

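Python first tries `(2).__mul__(c1)`. Since `int` does not know how to multiply by a `MyClass` instance, that call returns `NotImplemented`, so Python then tries the reflected method on the right operand: `c1.__rmul__(2)`. Inside `__rmul__`, `self` is `c1` and `n` is `2`, so `return self.__mul__(n)` becomes `c1.__mul__(2)`, which works.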
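This dispatch can be observed directly. In the sketch below, `Repeatable` is a hypothetical class; returning `NotImplemented` for operands it cannot handle is the conventional way to let Python try the other operand or raise `TypeError` itself.

```python
class Repeatable:
    def __init__(self, name):
        self.name = name

    def __mul__(self, n):
        if not isinstance(n, int):
            return NotImplemented      # signal that this operand is unsupported
        return Repeatable(self.name * n)

    def __rmul__(self, n):
        # reached for n * self once int.__mul__ has returned NotImplemented
        return self.__mul__(n)

r = Repeatable('Monty')

print((2).__mul__(r) is NotImplemented)   # True - int does not know Repeatable
print((2 * r).name)                       # MontyMonty
print((r * 2).name)                       # MontyMonty

try:
    r * 'x'
except TypeError:
    print('TypeError raised for an unsupported operand')
```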

#### Implementing the `in` operator

For this example, we'll want `in` to test if something is contained in the name string of our class:

In [21]:
class MyClass:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return f'MyClass(name={self.name})'
    
    def __contains__(self, value):
        return value in self.name

In [22]:
c1 = MyClass('MontyPython')

In [23]:
'ty' in c1

True

# 10 - Custom Sequences - Part 2b_c

For this example we'll re-use the Polygon class from a previous lecture on extending sequences.

We are going to consider a polygon as nothing more than a collection of points (and we'll stick to a 2-dimensional space).

So, we'll need a `Point` class, but we're going to use our own custom class instead of just using a named tuple.

We do this because we want to enforce a rule that our Point co-ordinates will be real numbers. We would not be able to use a named tuple to do that and we could end up with points whose `x` and `y` coordinates could be of any type.

In [2]:
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p1 = Point(10, 5)
p2 = Point('abc', [1,2,3])          # We want to be able to prevent passing in the wrong type in our custom sequence

x, y = p1                           # But, we would like this functionality of namedtuples in our custom sequence
print(x)
print(y)

10
5


First we'll need to see how we can test if a type is a numeric real type.

We can do this by using the numbers module.

In [5]:
import numbers

This module contains certain base types for numbers that we can use, such as Number, Real, Complex, etc.

In [6]:
print(isinstance(10, numbers.Number))
print(isinstance(1+1j, numbers.Number))

True
True


We will want our points to be real numbers only, so we can do it this way:

In [7]:
isinstance(1+1j, numbers.Real)

False
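It can help to see which built-in and standard-library types count as `numbers.Real`. The sample values below are illustrative. Note that `bool` passes because it is a subclass of `int`, while `Decimal` does not, since the standard library registers it only as a `numbers.Number`.

```python
import numbers
from decimal import Decimal
from fractions import Fraction

samples = [10, 3.14, Fraction(1, 3), True, Decimal('1.5'), 1 + 1j, '10']

for value in samples:
    print(repr(value), isinstance(value, numbers.Real))
```

With these samples, only the first four values are reported as real numbers.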

So now let's write our Point class. We want it to have these properties:

  1. The `x` and `y` coordinates should be real numbers only
  2. Point instances should be a sequence type so that we can unpack it as needed in the same way we were able to unpack the values of a named tuple.

In [8]:
class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')
            
    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'
    
    def __len__(self):
        return 2
    
    def __getitem__(self, s):
        return self._pt[s]

In `__getitem__(self, s)`, recall that `s` is the index or slice we provide in square brackets to the sequence: `sequence[s]`. You might think we need to check whether `s` is an integer index or a slice, but we don't. Why?

`self._pt` is a tuple. Tuples know how to index and slice/extended slice perfectly well. **We are *delegating* this request to the tuple**.

Let's use our point class and make sure it works as intended:

In [9]:
p = Point(1, 2)

In [10]:
p

Point(x=1, y=2)

In [11]:
len(p)

2

In [12]:
p[0], p[1]

(1, 2)

The indexing above works because `Point` implements `__getitem__`. The same method also makes unpacking work: when a class has no `__iter__`, Python iterates it by calling `__getitem__` with 0, 1, 2, ... until an `IndexError` is raised. So the unpacking below retrieves the same values, except that we assign them to variables.

In [13]:
x, y = p

But why would we want to do this? This will allow us to take one point and unpack it into a new point. When we create a polygon class (which takes a series of xy points), we can just feed in our `Point` object (or we can feed in a regular 2-element list or tuple or any 2-element sequence for that matter).

For example:

In [16]:
p2 = Point(*p1)
p2

Point(x=10, y=5)
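To see how unpacking uses `__getitem__`, here is a minimal sketch with a hypothetical `Pair` class that records every index it is asked for. Unpacking into two names requests indices 0 and 1, then one more to confirm there is nothing left.

```python
class Pair:
    def __init__(self, a, b):
        self._items = (a, b)
        self.calls = []                 # record of the indices requested

    def __getitem__(self, s):
        self.calls.append(s)
        return self._items[s]           # the tuple raises IndexError at index 2

pair = Pair(10, 5)
x, y = pair
print(x, y)            # 10 5
print(pair.calls)      # [0, 1, 2]
```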

Now, we can start creating our Polygon class, which will essentially be a mutable sequence of points making up the vertices of the polygon.

In [17]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]  # if pt is already a Point object, we don't have to worry about dealing with it separately 
                                                    # since we implemented unpacking functionality
        else:
            self._pts = []
            
    def __repr__(self):
        return f'Polygon({self._pts})'

Let's try it and see if everything is as we expect:

In [18]:
p = Polygon((0,0), [1,1])

In [19]:
p

Polygon([Point(x=0, y=0), Point(x=1, y=1)])

Now, to see if the `__repr__` is accurate, we should copy and paste it to see if it creates the expected object.

In [20]:
Polygon([Point(x=0, y=0), Point(x=1, y=1)])

TypeError: Point co-ordinates must be real numbers.

Our representation contains those square brackets which technically should not be there as the Polygon class `__init__` assumes multiple comma-separated arguments, not a single iterable.

So we should fix that by converting each Point to its string representation, collecting those strings in a list, and then joining the list with `, ` so that we get comma-separated arguments.

In [21]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'

In [22]:
p = Polygon((0,0), [1,1])
p

Polygon(Point(x=0, y=0), Point(x=1, y=1))

In [25]:
p2 = Polygon(Point(x=0, y=0), Point(x=1, y=1)) # Works, no errors

So now we can start making our Polygon into a sequence type, by implementing methods such as `__len__` and `__getitem__`.

Then we'll support concatenation with `__add__`. (We don't need to support repetition - it doesn't make sense with polygons.)

In [27]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'
    
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]         # No need to worry whether s is an integer or a slice because _pts is a list 
                                    # and lists are sequences and sequences support indexing and slicing. We've delegated again.
    
    def __add__(self, other):
        if isinstance(other, Polygon):
            new_pts = self._pts + other._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')

In [28]:
p1 = Polygon((0,0), (1,1))
p2 = Polygon((2,2), (3,3))

new_polygon = p1 + p2
new_polygon

Polygon(Point(x=0, y=0), Point(x=1, y=1), Point(x=2, y=2), Point(x=3, y=3))

Now, let's add in-place concatenation. Remember, we need to mutate ourselves (`self`) by adding the points from the other polygon and then **return ourselves (`self`)**.

In [29]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'
    
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]
    
    def __add__(self, other):
        if isinstance(other, Polygon):
            new_pts = self._pts + other._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')
            
    def __iadd__(self, pt):
        if isinstance(pt, Polygon):
            self._pts = self._pts + pt._pts   # THIS IS NOT IN-PLACE CONCATENATION; self._pts on the LHS will be a new object, but that's okay!
            return self
        else:
            raise TypeError('can only concatenate with another Polygon')

You'll notice that in `__iadd__` we have: `self._pts = self._pts + pt._pts` instead of `self._pts += pt._pts`. That's because we don't care whether `self._pts` gets a new memory address or not - we only want to ensure that the memory address of the object `self` stays the same.

In [32]:
p1 = Polygon((0,0), (1,1))
p2 = Polygon((2,2), (3,3))

print(f'{id(p1)=}')
p1 += p2                # Equivalent to p1 = p1.__iadd__(p2)
print(f'{id(p1)=}')

id(p1)=139642594553872
id(p1)=139642594553872


But what we can't do yet is concatenate with an iterable of things that look like points, such as a list of 2-element lists or tuples.

In [75]:
p1 = Polygon((0,0), (1,1))
p1 += [(2,2), (3,3)]

To fix this, we need to rewrite our `__iadd__` so that it no longer raises a `TypeError` when the operand is not a `Polygon`. If the operand is a Polygon, we deal with it as we have above. 

But, if it's an iterable, we'll need to take each point in the iterable and unpack its 2 elements into a `Point` object. If those elements are not real numbers, our `Point` class should catch it.

For example:

`[(3,3), (4,4)]` --> `[Point(3,3), Point(4,4)]`

In [41]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'
    
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]
    
    def __add__(self, pt):
        if isinstance(pt, Polygon):
            new_pts = self._pts + pt._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')
            
    def __iadd__(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
        return self

In [42]:
p1 = Polygon((0,0), (1,1))
p1 += [(2,2), (3,3), Point(5,5), {3,4}]

Now, let's implement `append`, `insert` and `extend`. 

Remember, these are not special methods. But, everyone expects them to behave in a particular way so we should replicate that. 

For example, the method `extend` generally does a mutation and **has no return**. 

In fact, **`extend` is identical to `__iadd__` except that it has no return, so we can refactor our code to reflect this.** We do this by making `__iadd__` call our `extend` method and then return `self`.

In [44]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'
    
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]
    
    def __add__(self, pt):
        if isinstance(pt, Polygon):
            new_pts = self._pts + pt._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')

    def append(self, pt):
        self._pts.append(Point(*pt))
        
    def extend(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
    
    def __iadd__(self, pts):
        self.extend(pts)
        return self
    
    def insert(self, i, pt):
        self._pts.insert(i, Point(*pt))

In [50]:
p1 = Polygon((0,0), Point(1,1))
p2 = Polygon([2, 2], [3, 3])

print(f"p1 original. Result: {p1}")

p1.append((4, 4))
print(f"p1 appended. Result: {p1}")

p1.insert(1, Point(-1, -1))
print(f"p1 inserted. Result: {p1}")

p3 = Polygon((6,6), Point(20,20))
p1.extend(p3)
print(f"p1 extended. Result: {p1}")

p1 original. Result: Polygon(Point(x=0, y=0), Point(x=1, y=1))
p1 appended. Result: Polygon(Point(x=0, y=0), Point(x=1, y=1), Point(x=4, y=4))
p1 inserted. Result: Polygon(Point(x=0, y=0), Point(x=-1, y=-1), Point(x=1, y=1), Point(x=4, y=4))
p1 extended. Result: Polygon(Point(x=0, y=0), Point(x=-1, y=-1), Point(x=1, y=1), Point(x=4, y=4), Point(x=6, y=6), Point(x=20, y=20))


**Part 2c:`__setitem__` Method**

The two things we want to do are:

- provide an index and a new point, and replace our Polygon's element at that index with the new point.
- same as above but with a slice.

For example:

`p1[3] = Point(100, 100)`
`p1[0:2] = [Point(-1,-1), Point(-2,-2)]`

This is going to be easier than we may think because we can use **delegation** to delegate our slice operation to the list type because lists support setting items.

We'll start off from where we left off above and add in our `__setitem__` method. Before we write it in our class above, I'll write one possible approach in isolation below, and then we'll see what issues may arise.

In [None]:
def __setitem__(self, s, value):
    
    if isinstance(s, int):
        self._pts[s] = Point(*value)
    else:
        self._pts[s] = [Point(*pt) for pt in value]           

Consider 

`p[0] = [Point(10, 10), Point(20, 20)]`

This won't work. Why not?

The LHS is fine. It's the RHS that's causing the issue. RHS = `[Point(10, 10), Point(20, 20)]`, which is bound to `value`.

Since `s` is an `int`, we go through the first branch, which evaluates to: `self._pts[0] = Point(Point(10, 10), Point(20, 20))`

In other words, we are trying to make a Point out of Points instead of real numbers.

**This throws an error as it should but the error is found in Point which we created way back: 'Point co-ordinates must be real numbers.'**

-----------
```
class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')
            
    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'
    
    def __len__(self):
        return 2
    
    def __getitem__(self, s):
        return self._pt[s]
```
-------------
The error is technically correct - Point(10, 10) isn't a real number after all - but we'd want to be more specific.

We'd find a similar sort of somewhat meaningless/unintuitive error if we tried 

`p[0:2] = Point(20, 20)`

`> TypeError: type object argument after * must be an iterable, not int`

Again, the fact that an error is thrown is good but we just want a clear error.

**How are we going to fix this?**

We need to understand these two rules:

- If `s` is an integer, `value` must be a single `Point`.
- If `s` is a slice, `value` must be a sequence/iterable of Points

So what's our method?

- We'll try to take the RHS e.g `[Point(10, 10), Point(20, 20)]` and **`try`** to make a list of points. If we succeed, we know we have a list of points.
- Otherwise, we probably have a single point when we want a list of points. So, we have a TypeError that we want to catch.
- But we *may not* have a single point, so let's **`try`** to make a `Point` from the RHS. If we succeed, we have a Point.
- Otherwise, we have something useless like a complex value or string. For this, we need to **`raise`** our own error.

- If no exceptions were raised throughout this process, **then we have a valid RHS**.
- Only now can we enforce the two rules above, which can be done with a single `x or y` condition instead of two `if` statements. 

Here's our isolated method solution:

In [51]:
def __setitem__(self, s, value):
    # we first should see if we have a single Point
    # or an iterable of Points in value
    try:
        rhs = [Point(*pt) for pt in value]
        is_single = False
    except TypeError:
        # not a valid iterable of Points
        # maybe a single Point?
        try:
            rhs = Point(*value)
            is_single = True
        except TypeError:
            # still no go
            raise TypeError('Invalid Point or iterable of Points')

    # reached here, so rhs is either an iterable of Points, or a Point
    # we want to make sure we are assigning to a slice only if we have an iterable of points,
    # or, assigning to an index if we have a single Point only
 
    if (isinstance(s, int) and is_single) or isinstance(s, slice) and not is_single: 
        self._pts[s] = rhs
    else:
        raise TypeError('Incompatible index/slice assignment')

Let's add `del` and `pop` functionality to our class quickly using **delegation** as we've done previously before testing out our exception handling.

Here's our complete class:

In [57]:
class Polygon:
    def __init__(self, *pts):
        if pts:
            self._pts = [Point(*pt) for pt in pts]
        else:
            self._pts = []
            
    def __repr__(self):
        pts_str = ', '.join([str(pt) for pt in self._pts])
        return f'Polygon({pts_str})'
    
    def __len__(self):
        return len(self._pts)
    
    def __getitem__(self, s):
        return self._pts[s]
    
    def __setitem__(self, s, value):
        # we first should see if we have a single Point
        # or an iterable of Points in value
        try:
            rhs = [Point(*pt) for pt in value]
            is_single = False
        except TypeError:
            # not a valid iterable of Points
            # maybe a single Point?
            try:
                rhs = Point(*value)
                is_single = True
            except TypeError:
                # still no go
                raise TypeError('Invalid Point or iterable of Points')
        
        # reached here, so rhs is either an iterable of Points, or a Point
        # we want to make sure we are assigning to a slice only if we 
        # have an iterable of points, and assigning to an index if we 
        # have a single Point only
        if (isinstance(s, int) and is_single) \
            or isinstance(s, slice) and not is_single:
            self._pts[s] = rhs
        else:
            raise TypeError('Incompatible index/slice assignment')
                
    def __add__(self, pt):
        if isinstance(pt, Polygon):
            new_pts = self._pts + pt._pts
            return Polygon(*new_pts)
        else:
            raise TypeError('can only concatenate with another Polygon')

    def append(self, pt):
        self._pts.append(Point(*pt))
        
    def extend(self, pts):
        if isinstance(pts, Polygon):
            self._pts = self._pts + pts._pts
        else:
            # assume we are being passed an iterable containing Points
            # or something compatible with Points
            points = [Point(*pt) for pt in pts]
            self._pts = self._pts + points
    
    def __iadd__(self, pts):
        self.extend(pts)
        return self
    
    def insert(self, i, pt):
        self._pts.insert(i, Point(*pt))
        
    def __delitem__(self, s):
        del self._pts[s]
        
    def pop(self, i):
        return self._pts.pop(i)

So now let's see if we get better error messages:

In [58]:
p1 = Polygon((0,0), (1,1), (2,2))

In [59]:
p1[0:2] = (10,10)

TypeError: Incompatible index/slice assignment

In [60]:
p1[0] = [(0,0), (1,1)]

TypeError: Incompatible index/slice assignment

In [61]:
p = Polygon(*zip(range(6), range(6)))
p.pop(1)

Point(x=1, y=1)
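The tests above only exercise the failure paths. The sketch below checks the valid assignments as well. To stay self-contained it repeats `Point` and a trimmed copy of `Polygon` (only `__init__`, `__repr__`, `__setitem__` and `__delitem__`), so treat it as an illustration of the logic above rather than a replacement for the full class.

```python
import numbers

class Point:
    def __init__(self, x, y):
        if isinstance(x, numbers.Real) and isinstance(y, numbers.Real):
            self._pt = (x, y)
        else:
            raise TypeError('Point co-ordinates must be real numbers.')

    def __repr__(self):
        return f'Point(x={self._pt[0]}, y={self._pt[1]})'

    def __getitem__(self, s):
        return self._pt[s]

class Polygon:
    # trimmed copy of the class above: only what this demo needs
    def __init__(self, *pts):
        self._pts = [Point(*pt) for pt in pts]

    def __repr__(self):
        return f"Polygon({', '.join(str(pt) for pt in self._pts)})"

    def __setitem__(self, s, value):
        try:
            rhs = [Point(*pt) for pt in value]
            is_single = False
        except TypeError:
            try:
                rhs = Point(*value)
                is_single = True
            except TypeError:
                raise TypeError('Invalid Point or iterable of Points')
        if (isinstance(s, int) and is_single) or (isinstance(s, slice) and not is_single):
            self._pts[s] = rhs
        else:
            raise TypeError('Incompatible index/slice assignment')

    def __delitem__(self, s):
        del self._pts[s]

p = Polygon((0, 0), (1, 1), (2, 2))

p[0] = (10, 10)                    # index + single point
p[1:2] = [(5, 5), (6, 6)]          # regular slice + iterable of points
print(p)

del p[-1]                          # delegated to the underlying list
print(p)

try:
    p[0] = 'ab'                    # neither a point nor an iterable of points
except TypeError as exc:
    print(exc)                     # Invalid Point or iterable of Points
```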

# 11 - Sorting Sequences

#### Lecture

Just like with the concatenation and in-place concatenation we saw previously, we have two different ways of sorting a mutable sequence:

* returning a new sorted sequence
* in-place sorting (mutating sequence) - obviously this works for mutable sequence types only!


For any iterable, the built-in `sorted` function will return a **list** containing the sorted elements of the iterable.

So a few things here: 
* any iterable can be sorted (as long as it is finite)
* the elements must be pair-wise comparable (possibly indirectly via a sort key)
* the returned result is always a list
* the original iterable is not mutated

In addition:
* optionally specify a `key` - a function that extracts a comparison key for each element. If no key is specified, Python uses the natural ordering of the elements (defined by comparison methods such as `__lt__`), so sorting fails if the elements do not support comparison!
* optionally specify the `reverse` argument, which returns the elements sorted in reverse order

Of course, numbers have a natural sort order and strings do too (alphabetical, but don't forget that a distinction is made if the letter is capitalised). But what if we have several strings and we want to sort them by their last character? What would the **sort key** be?

```
item:    'hello'    'python'    'parrot'    'bird'
key:       'o'        'n'         't'         'd'
```

And we know that since these sort keys are strings, they have a natural sort order.

So what would the key function look like?

`key = lambda s: s[-1]`
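
As a quick sanity check, here is a minimal sketch that applies this key function to the four words above (the list name `words` is just an illustrative choice):

```python
# Sort by the last character of each string.
words = ['hello', 'python', 'parrot', 'bird']

by_last_char = sorted(words, key=lambda s: s[-1])
print(by_last_char)  # ['bird', 'python', 'hello', 'parrot']
```

The keys `'d'`, `'n'`, `'o'`, `'t'` are compared using the natural ordering of strings, and the original list is left untouched.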

What do we mean by the natural sort order?

The natural sort order is how we'd generally and intuitively sort the items. If we do not provide the `key` keyword argument, we can think of the default key as being the elements themselves. In other words,

`sorted(iterable) <-> sorted(iterable, key=lambda x: x)`

For things like dictionaries, this works slightly differently. Remember what happens when we iterate a dictionary?

In [8]:
d = {3: 100, 2: 200, 1: 10}
for item in d:
    print(item)

3
2
1


We actually are iterating the keys.

Same thing happens with sorting - we'll end up just sorting the keys:

In [9]:
d = {3: 100, 2: 200, 1: 10}

sorted(d)

[1, 2, 3]

But what if we wanted to sort the dictionary keys based on the values instead?

This is where the `key` argument of `sorted` will come in handy.

We are going to specify to the `sorted` function that it should use the value of each item to use as a sort key:

In [11]:
d = {'a': 100, 'b': 50, 'c': 10}

sorted(d, key=lambda k: d[k])

['c', 'b', 'a']

Note: **The `sorted` function makes a copy of the iterable and returns the sorted elements in a `list` always**.
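
A small sketch of these points, using made-up data: `sorted` accepts any finite iterable, always gives back a `list`, and leaves the original untouched. The last call shows one possible way (an illustration, not the only option) to get key/value pairs ordered by value.

```python
t = (3, 1, 2)
s = 'bca'
d = {'a': 100, 'b': 50, 'c': 10}

sorted_t = sorted(t)   # a list, even though t is a tuple
sorted_s = sorted(s)   # a list of single characters
pairs_by_value = sorted(d.items(), key=lambda kv: kv[1])

print(sorted_t, t)     # [1, 2, 3] (3, 1, 2) - the tuple is unchanged
print(sorted_s)        # ['a', 'b', 'c']
print(pairs_by_value)  # [('c', 10), ('b', 50), ('a', 100)]
```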

#### Stable Sorting

Suppose we sort the words `this`, `late` and `bird` by length. They all have four characters - so how does Python determine which one should come first? Randomly? No!

The sort algorithm that Python uses, called *Timsort* (named after Python core developer Tim Peters - yes, the same Tim Peters who wrote the Zen of Python!!), is what is called a **stable** sort algorithm.

This means that items with equal sort keys maintain their relative position.

In [None]:
t = 'aaaa', 'bbbb', 'cccc', 'dddd', 'eeee'

In [None]:
sorted(t, key = lambda s: len(s))

['aaaa', 'bbbb', 'cccc', 'dddd', 'eeee']

Now let's change our tuple a bit:

In [None]:
t = 'bbbb', 'cccc', 'aaaa', 'eeee', 'dddd'

In [None]:
sorted(t, key = lambda s: len(s))

['bbbb', 'cccc', 'aaaa', 'eeee', 'dddd']

As you can see, when the sort keys are equal (they are all equal to 4), the original ordering of the iterable is preserved. `bbbb` came before `cccc` in our tuple so `bbbb` will come before `cccc` after sorting, even though they have the same sort key.
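
Stability is also what makes multi-pass sorting work. In the sketch below (the `people` data is invented for illustration), we sort by the secondary key (name) first and then by the primary key (age); items that tie on age keep the name order from the first pass.

```python
people = [('Eric', 30), ('John', 25), ('Alice', 30), ('Bob', 25)]

# Pass 1: secondary key (name)
by_name = sorted(people, key=lambda p: p[0])

# Pass 2: primary key (age); ties keep the order from pass 1
by_age_then_name = sorted(by_name, key=lambda p: p[1])

print(by_age_then_name)
# [('Bob', 25), ('John', 25), ('Alice', 30), ('Eric', 30)]
```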

#### Reversed Sort

We also have the `reverse` keyword-only argument that we can use - it sorts the iterable in descending order of the sort key. The sort is still stable, so items with equal keys keep their original relative order:

In [1]:
t = 'this', 'bird', 'is', 'a', 'late', 'parrot'

In [40]:
sorted(t, key=lambda s: len(s), reverse=True)

['parrot', 'this', 'bird', 'late', 'is', 'a']
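
One subtlety worth checking: `reverse=True` is not quite the same as sorting and then reversing the result, because the sort stays stable in both directions. A minimal comparison, reusing the tuple above (the variable names are illustrative):

```python
t = 'this', 'bird', 'is', 'a', 'late', 'parrot'

with_reverse_flag = sorted(t, key=len, reverse=True)
sorted_then_reversed = list(reversed(sorted(t, key=len)))

# Equal-length words keep their original order with reverse=True...
print(with_reverse_flag)     # ['parrot', 'this', 'bird', 'late', 'is', 'a']
# ...but end up flipped if we reverse after sorting.
print(sorted_then_reversed)  # ['parrot', 'late', 'bird', 'this', 'is', 'a']
```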

#### In-Place Sort

If the iterable is mutable, in-place sorting is possible. The `list` class has a `.sort()` instance method that does in-place sorting. **It does not return the sorted list - it returns `None`**.

This method is slightly more efficient than the `sorted()` function because it doesn't have to make a copy of the iterable before sorting.

In [4]:
l = ['this', 'bird', 'is', 'a', 'late', 'parrot']
result = l.sort(key = lambda x: len(x))
result
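
The cell above displays nothing because `result` is `None`. A short sketch that makes this visible and confirms that the very same list object was mutated (`list_id` is just a helper name used here):

```python
l = ['this', 'bird', 'is', 'a', 'late', 'parrot']
list_id = id(l)

result = l.sort(key=len)

print(result)            # None
print(l)                 # ['a', 'is', 'this', 'bird', 'late', 'parrot']
print(id(l) == list_id)  # True - same object, sorted in place
```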

#### Natural Ordering for Custom Classes

I just want to quickly show you that in order to have a "natural ordering" for our custom classes, we just need to implement the `<` or `>` operators. (I discuss these operators in Part 1 of this course)

In fact, we can modify our class slightly so we can see that `sorted` is calling our `__lt__` method repeatedly to perform the sort:

In [8]:
class MyClass:
    def __init__(self, name, val):
        self.name = name
        self.val = val
        
    def __repr__(self):
        return f'MyClass({self.name}, {self.val})'
    
    def __lt__(self, other):
        print(f'called {self.name} < {other.name}')
        return self.val < other.val

In [9]:
c1 = MyClass('c1', 20)
c2 = MyClass('c2', 10)
c3 = MyClass('c3', 20)
c4 = MyClass('c4', 10)

In [10]:
sorted([c1, c2, c3, c4])

called c2 < c1
called c3 < c2
called c3 < c1
called c4 < c1
called c4 < c2


[MyClass(c2, 10), MyClass(c4, 10), MyClass(c1, 20), MyClass(c3, 20)]

Now we can sort those objects, without specifying a key, since that class has a natural ordering (`<` in this case). Moreover, notice that the sort is stable.

But we can still sort by using a key. For example:

In [11]:
l = [c2, c4, c1, c3]
sorted(l, key=lambda c: c.name)

[MyClass(c1, 20), MyClass(c2, 10), MyClass(c3, 20), MyClass(c4, 10)]

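If you wanted to support all the other ordering operators, you can use the `@total_ordering` decorator (`from functools import total_ordering`). It fills in the missing comparisons from the one ordering method you define (such as `__lt__`); the class should also define `__eq__`.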
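
Here is a minimal sketch of that decorator in use. The class `Item` is a hypothetical example, not the `MyClass` from above; it defines only `__eq__` and `__lt__` and lets `total_ordering` derive the rest.

```python
from functools import total_ordering


@total_ordering
class Item:
    """Hypothetical example class, ordered by its `val` attribute."""

    def __init__(self, val):
        self.val = val

    def __repr__(self):
        return f'Item({self.val})'

    def __eq__(self, other):
        if not isinstance(other, Item):
            return NotImplemented
        return self.val == other.val

    def __lt__(self, other):
        if not isinstance(other, Item):
            return NotImplemented
        return self.val < other.val


a, b = Item(1), Item(2)
print(a < b, a <= b, b > a, b >= a)  # True True True True
print(sorted([b, a]))                # [Item(1), Item(2)]
```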

# 12 - List Comprehensions

#### Lecture

See Part 1 for introduction on comprehensions if needed.

**Internal Mechanics of List Comprehensions**

Comprehensions have their own **local scope** - just like a function.

Functions can be nested inside other functions, creating an inner and an outer scope. If the inner function references a variable from the outer function's scope, that variable becomes a free variable and the inner function is known as a **closure** - so closures have free variables.

We need to recognize that a list comprehension is essentially a temporary function that Python creates and executes, and whose return value is the resulting list.

Let's break this down with an example:

`sq = [item**2 for item in range(10)]`

- The entire RHS *is* the list comprehension. 
- There are two stages: compilation and execution. 
- During **compilation**, Python creates a temporary function, `item` being a variable in the local scope, that'll be used to evaluate the comprehension. 
- Something like:

```
def temp():
    new_list = []
    for item in range(10):
        new_list.append(item**2)
       
    return new_list
```

- When the original line is **executed**, Python executes `temp()`.
- Then, it stores the returned object (`new_list`) at some memory address and points `sq` on the LHS to it.
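
To make the mental model concrete, the sketch below writes that temporary function out by hand and checks that it produces the same list as the comprehension. The name `temp` is only a stand-in; the function Python builds internally is not accessible under that name.

```python
def temp():
    # Hand-written stand-in for the function Python builds internally
    new_list = []
    for item in range(10):
        new_list.append(item**2)
    return new_list


sq_from_function = temp()
sq_from_comprehension = [item**2 for item in range(10)]

print(sq_from_function == sq_from_comprehension)  # True
```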

#### Comprehension Scopes

As mentioned `item` in the list comprehension is a local symbol of the temporary function. 

But the comprehension has access to **global** variables just like how normal functions will access the global scope if it can't find a variable in the local scope.

But what about **nonlocal** variables? 

These are variables that are "neither local nor global": they live in an enclosing function's scope. In Python, `nonlocal` is also a keyword that declares that a name is not local, but instead refers to the variable in the nearest enclosing scope that is not global.

Consider the function:

In [13]:
def my_func(num):
    sq = [item**2 for item in range(num)]

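Remember to think of the RHS as a function. It references `num`, which is a variable in the outer scope. We therefore have a comprehension function nested inside a regular function. Since the comprehension function is referencing `num`, a nonlocal symbol, it becomes a free variable - so we have a closure. (One refinement: the iterable of the leftmost `for` clause, here `range(num)`, is actually evaluated in the enclosing scope and passed to the comprehension as an argument. The closure behaviour applies to outer variables used in the element expression, in conditions, or in later `for` clauses.)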

**The RHS *is* a closure - a function nested in another function that has access to one or more free variables in an outer scope because it references those nonlocal symbols.**
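
A minimal sketch of a comprehension that reads a variable from its enclosing function (the function `scale` and its parameters are hypothetical names chosen for illustration). Here `factor` appears in the element expression, so it is looked up in the enclosing function's scope while the comprehension runs.

```python
def scale(values, factor):
    # `v` is local to the comprehension;
    # `factor` belongs to the enclosing function `scale`.
    return [v * factor for v in values]


scaled = scale([1, 2, 3], 10)
print(scaled)  # [10, 20, 30]
```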

#### Things to Watch Out For

**Example 1**

To drive the point home about comprehension scopes, consider the following. We have a regular `for` loop containing the symbol `number`:

In [13]:
l=[]

for number in range(5):
    l.append(number**2)
    
print(number)
print('number' in globals())

4
True


So, a regular `for` loop creates its loop variable in the scope the loop runs in (here, the global scope), and the variable survives after the loop ends. But what if we create this loop in a comprehension? (First, delete `number` from `globals()`.)

In [14]:
if 'number' in globals():
    del number

l = [number**2 for number in range(5)]

print('number' in globals())

False


It's not in the `globals()` scope because it was created within the scope of the temporary function that Python created for the comprehension.
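
The same isolation means a comprehension cannot accidentally overwrite an existing variable with the same name, whereas a regular loop can. A short sketch, with the value `100` chosen arbitrarily:

```python
number = 100

squares = [number**2 for number in range(5)]

print(squares)  # [0, 1, 4, 9, 16]
print(number)   # 100 - the outer `number` was not touched

for number in range(5):
    pass

print(number)   # 4 - a regular loop rebinds the outer `number`
```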

**But be careful!**

In `[number**2 for number in range(5)]`, the `number` in `for number` is the **declaration** of the variable within the function scope. But, the `number` in `number**2` is a **reference**.

First, note that, although the first mention of `number` comes first in the comprehension, it comes at the very end of the temporary function that Python creates - that's why it's considered a reference and NOT a declaration by Python.

Secondly, from what we know about functions in Python, if a variable is referenced within a function, Python first looks for it in the local scope of the function. If it is not found there, Python looks outward: in any enclosing function scopes, then in the global scope (and finally in the built-ins).

So, take a look at the example below:

In [15]:
number = 100

[number * i for i in range(5)] # number is in the first part of the statement, so it's a reference NOT a DECLARATION
                               # i is AFTER the first part of the statement, it's a declaration.

[0, 100, 200, 300, 400]

- When Python created the temporary function and began with `for i in range(5):`, it declared `i` as a local variable. 
- Then, it got round to `number * i` but it couldn't find any declaration of `number` within its local (comprehension) scope, so it looked in globals and found `number = 100`.

**Example 2**

Now let's look at an example we've seen before when we studied closures.

Suppose we want to generate a list of functions that will calculate powers of their argument, i.e. we want to define a bunch of functions.

One approach is to define those functions one by one:

In [45]:
fn_0 = lambda x: x**0
fn_1 = lambda x: x**1
fn_2 = lambda x: x**2
fn_3 = lambda x: x**3
# etc

But this would be very tedious if we had to do it more than just a few times.

Instead, why don't we create those functions as lambdas and put them into a list, where the index in the list corresponds to the power we are looking for?

Something like this if we were doing it manually:

In [46]:
funcs = [lambda x: x**0, lambda x: x**1, lambda x: x**2, lambda x: x**3]

Now we can call these functions this way:

In [47]:
print(funcs[0](10))
print(funcs[1](10))
print(funcs[2](10))
print(funcs[3](10))

1
10
100
1000


Now all we need to do is to create these functions using a loop - the traditional way first:

First let's make sure `i` is not in our global symbol table:

In [17]:
if 'i' in globals():
    del i

In [18]:
funcs = []
for i in range(6):
    funcs.append(lambda x: x**i)

And let's use them as before:

In [20]:
print(funcs[0](10))
print(funcs[1](10))

funcs

100000
100000


[<function __main__.<lambda>(x)>,
 <function __main__.<lambda>(x)>,
 <function __main__.<lambda>(x)>,
 <function __main__.<lambda>(x)>,
 <function __main__.<lambda>(x)>,
 <function __main__.<lambda>(x)>]

What happened?? It looks like every function is actually calculating `10**5`. To break it down:

- `lambda x: x**i` is a function which references `i`, a variable that is not local to the lambda but exists in an outer scope. So it behaves like the closures we studied: `i` is looked up when the lambda is **called**, not when it is created.
- Inside `funcs`, each of those lambdas has an `x**i`, but `i` refers to the global `i`.
- So, whichever lambda we fetch, when it computes `x**i` it looks up the global `i` at that moment.
- Since the `for` loop has terminated, `i` ended as `5`. So every one of those lambdas uses whatever the current value of `i` is.
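
One way to convince ourselves that the lambdas look up `i` at call time is to rebind `i` after the loop and call them again. This is only a demonstration sketch of the problem, not a fix:

```python
funcs = []
for i in range(6):
    funcs.append(lambda x: x**i)

print(funcs[0](10))  # 100000, because i is 5 once the loop has finished

i = 2                # rebind the very variable the lambdas look up
print(funcs[0](10))  # 100
print(funcs[3](10))  # 100 - every lambda sees the same i
```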

Take a look at the equivalent breakdown:

In [None]:
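i = 0
def fn_0(x):
    return x ** i   # i is looked up when fn_0 is called

i = 1
def fn_1(x):
    return x ** i

i = 2
def fn_2(x):
    return x ** i

# all three functions now compute x ** 2, the current value of i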

What's the solution? Recall that if a function is defined with a default parameter value, e.g. `def log(current_dt=datetime.now()):`, that default is evaluated once, when the function is **defined** (i.e. when the `def` statement runs), not each time the function is **called**. So, if our `log` function is called on different days without a `current_dt` argument, the same default value will be used every time.
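
A minimal sketch of that behaviour. The `msg` parameter and the short `sleep` are additions for illustration only; the point is that both calls report the same timestamp.

```python
import time
from datetime import datetime


def log(msg, current_dt=datetime.now()):
    # The default was computed once, when the `def` statement ran.
    return current_dt, msg


first_dt, _ = log('first call')
time.sleep(0.01)
second_dt, _ = log('second call')

print(first_dt == second_dt)  # True - the same stored default both times
```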

We can use this to our advantage in this scenario. Take a look:

In [2]:
funcs = [lambda x, p=i: x**p for i in range(4)]

print(funcs[0](10))
print(funcs[1](10))
print(funcs[2](10))
print(funcs[3](10))

1
10
100
1000


It works because default values are evaluated when each lambda is created. When the first lambda is created, `i` is `0`, so `p=0` becomes the default value for that particular lambda; the next lambda gets `p=1`, and so on. If no `p` argument is passed when the lambda is called, Python uses that stored default. It might be useful to think of a value for `p` being hardcoded for each value of `i` in the loop.
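
An alternative that avoids the extra parameter is a small factory function. This is a different technique from the default-argument trick above, and `make_power` is a hypothetical helper name. Each call to the factory creates a fresh local `p`, so every returned lambda closes over its own value.

```python
def make_power(p):
    # `p` is local to this call, so each returned function
    # closes over its own, separate `p`.
    return lambda x: x**p


funcs = [make_power(i) for i in range(4)]
powers = [f(10) for f in funcs]
print(powers)  # [1, 10, 100, 1000]

# Caveat of the default-argument trick: the default can be overridden.
funcs_default = [lambda x, p=i: x**p for i in range(4)]
print(funcs_default[0](10))     # 1
print(funcs_default[0](10, 3))  # 1000 - the second argument replaces p
```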

**Compilation**

Let's show that Python is indeed creating a function by compiling a comprehension, and then disassembling the compiled code to see what's happened:

In [11]:
import dis

In [12]:
compiled_code = compile('[i**2 for i in (1, 2, 3)]', 
                        filename='', mode='eval')

In [13]:
dis.dis(compiled_code)

  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x000001F77210ED20, file "", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_CONST               5 ((1, 2, 3))
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE


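As you can see, at offset 4 Python created a function (`MAKE_FUNCTION`), then called it (`CALL_FUNCTION`), and finally returned the result (`RETURN_VALUE`). Note that this output comes from an older Python 3 release; from Python 3.12 onwards comprehensions are inlined, so the disassembly looks different, although the loop variable is still kept out of the enclosing scope.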

**Nested Comprehensions**

Comprehensions can be nested within each other. And since they're functions, a nested comprehension can access variables of the outer comprehension, which are nonlocal to the inner comprehension.

If this is the case, the inner/nested comprehension becomes a closure because it's a nested 'function' accessing a free variable.

In the example below, `i` is a nonlocal variable to the nested comprehension because `i` is declared in the outer comprehension/function. So, the nested comprehension is a closure.

In [14]:
[ [i * j for j in range(5)] for i in range(5)]

[[0, 0, 0, 0, 0],
 [0, 1, 2, 3, 4],
 [0, 2, 4, 6, 8],
 [0, 3, 6, 9, 12],
 [0, 4, 8, 12, 16]]
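
Written out with regular loops, the nested comprehension above corresponds roughly to the sketch below (`table` and `row` are illustrative names). The inner loop builds one row per value of `i`, which is why the result is a list of lists.

```python
table = []
for i in range(5):
    row = []
    for j in range(5):
        row.append(i * j)  # inner comprehension body; uses the outer i
    table.append(row)

same = table == [[i * j for j in range(5)] for i in range(5)]
print(same)  # True
```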

**Nested Loops in Comprehensions**

These are NOT nested comprehensions. Let's take a look at the regular way and compare it to having nested loops in a comprehension.

In [18]:
l = []

for i in range(2):
    for j in range(2):    
        for k in range(2):    
            l.append((i, j, k))
            
print(l)

[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]


In [20]:
l = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)]

print(l)

[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]


We can have `if` clauses too, but be careful - the order of the `for` and `if` clauses matters. Make sure each `if` condition only references variables introduced by a `for` clause that appears before it. (The leading expression, `(i, j)` below, does not count: it comes first in the comprehension, but `i` and `j` are not declared there.)

```
l = [(i, j) for i in range(2) for j in range(2) if i==j] # CORRECT
l = [(i, j) for i in range(2) if i==j for j in range(2)] # WRONG, j not declared until 'for j...' but referenced earlier in 'i==j'
```
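
A small sketch that runs both orderings. The second one fails as soon as `i==j` is evaluated, because `j` has not been bound yet; the exact exception class may differ between Python versions, so we catch `NameError`, which also covers its subclass `UnboundLocalError`.

```python
ok = [(i, j) for i in range(2) for j in range(2) if i == j]
print(ok)  # [(0, 0), (1, 1)]

error_name = None
try:
    bad = [(i, j) for i in range(2) if i == j for j in range(2)]
except NameError as ex:
    # UnboundLocalError is a subclass of NameError, so it is caught too
    error_name = type(ex).__name__

print(error_name)
```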

Here's an example with two conditions:

In [23]:
[(i, j)
 for i in range(1,6) if i%2==0 
 for j in range(1,6) if j%3==0]

[(2, 3), (4, 3)]

But note, we could put both `if` clauses at the end because both `i` and `j` are defined by then:

In [24]:
[(i, j)
 for i in range(1,6) 
 for j in range(1,6)
 if i%2==0
 if j%3==0]

[(2, 3), (4, 3)]

Here's another example. Note that there's a significant difference between enclosing and not enclosing the first expression in `[ ]`.

If we want a flat output of `['ax', 'ay', 'az', 'bx', 'by', 'bz', 'cx', 'cy', 'cz']`, then we need a **single** comprehension containing nested loops, rather than a nested comprehension. In that case the two loop variables are both local to the same comprehension, because they are both within the same `[ ]`.

In [12]:
l1 = ['a', 'b', 'c']
l2 = ['x', 'y', 'z']

nested_comp = [[s1 + s2 for s1 in l1] for s2 in l2]
nested_loop = [s1 + s2 for s1 in l1 for s2 in l2]

print(nested_comp)
print(nested_loop)

[['ax', 'bx', 'cx'], ['ay', 'by', 'cy'], ['az', 'bz', 'cz']]
['ax', 'ay', 'az', 'bx', 'by', 'bz', 'cx', 'cy', 'cz']


The easy way to interpret the nested loop expression is to recognise that the loops are in the same order as the traditional way, i.e.,

```
result = []
for s1 in l1:
    for s2 in l2:
        result.append(s1 + s2)
```