**Polymorphism** is the ability to define a generic type of behaviour that may behave differently when applied to different types.

Python is very polymorphic:
- It makes use of [duck typing](https://realpython.com/duck-typing-python/#:~:text=Python%20makes%20extensive%20use%20of,a%20core%20concept%20in%20Python.)
- Operators +, -, * and / are polymorphic because they work with different types (ints, floats, decimals, lists, tuples, custom objects etc).
- When certain dunder methods are implemented in a class to satisfy a protocol, the instances of that class can make use of that protocol's behaviour. For example, implementing `__enter__` and `__exit__` makes instances work as context managers with the `with` keyword. `__iter__` and `__next__` can be implemented for iterable/iterator functionality with the `for` keyword. `__getitem__`, `__setitem__` and `__delitem__` give sequence functionality with indexing and slicing (`a[i:j]`), etc.
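
As a minimal sketch of the protocol idea, the hypothetical `Timer` class below (not part of the original notes) implements only `__enter__` and `__exit__`, which is enough for its instances to work with `with`:

```python
from time import perf_counter


class Timer:
    """Hypothetical example class: times the body of a `with` block."""

    def __enter__(self):
        self.start = perf_counter()
        return self  # this is the object bound by `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = perf_counter() - self.start
        return False  # do not suppress exceptions raised in the block


with Timer() as timer:
    total = sum(range(1000))

print(total, timer.elapsed >= 0)
```

Nothing about `Timer` declares that it is a context manager; Python only checks that the two dunder methods exist, which is duck typing at work.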

We'll now look at some dunder methods.

# 01 - `__str__` and `__repr__` Methods

Both are used for creating a string representation of an object.

What's the difference?
- `__repr__` is typically used by developers. We often try to make the representation a string that could be used to recreate the object, i.e. it looks the way you'd instantiate the object.
- `__str__` is used by `str()` and `print()` functions as well as various formatting functions e.g. f-strings. It's typically used for display purposes to the end user, logging, etc.

If `__str__` is not implemented, Python will look for `__repr__` instead.

If neither is implemented, Python will look up the inheritance tree to eventually reach `object` which has a defined `__repr__`.
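
The fallback chain can be checked with a small sketch. The class names `OnlyRepr` and `Neither` are made up for illustration:

```python
class OnlyRepr:
    """Defines __repr__ but not __str__."""

    def __repr__(self):
        return 'OnlyRepr()'


class Neither:
    """Defines neither method, so object's defaults are used."""


only_repr = OnlyRepr()
neither = Neither()

print(str(only_repr))                 # str() falls back to __repr__
print(str(neither) == repr(neither))  # both come from object.__repr__
print(repr(neither))                  # something like <...Neither object at 0x...>
```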

Here's an example:

In [2]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        print('__repr__ called')
        return f"Person(name='{self.name}', age={self.age})"
    
    def __str__(self):
        print('__str__ called')
        return self.name

In [3]:
p = Person('Python', 30)

In Jupyter's console, the return of an object will favour the `__repr__` which makes sense as this is what developers will want to see.

In [4]:
p

__repr__ called


Person(name='Python', age=30)

But `print` gives the expected `__str__` output.

In [6]:
print(p)

__str__ called
Python


# 02 - Arithmetic Operators

We have `__add__`, `__sub__`, `__mul__`, `__truediv__` (/), `__floordiv__` (//), `__mod__` (%), `__pow__` and `__matmul__` (@).

Regarding `__matmul__`, Python doesn't actually implement this for any built-in type, but it was added for better **numpy** support of matrix multiplication.

If we want to indicate that one of these operations is not supported for a given operand, the method should `return NotImplemented` (as opposed to raising an exception). This gives Python the chance to try other options before it raises a `TypeError` itself.
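
A minimal sketch of this behaviour, using a hypothetical `Celsius` class that only knows how to add other `Celsius` instances:

```python
class Celsius:
    """Hypothetical example class supporting Celsius + Celsius only."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __add__(self, other):
        if isinstance(other, Celsius):
            return Celsius(self.degrees + other.degrees)
        return NotImplemented  # signal "unsupported" instead of raising


combined = Celsius(20) + Celsius(5)
print(combined.degrees)

error_message = None
try:
    Celsius(20) + 'warm'
except TypeError as ex:
    # Python raises TypeError once every option has returned NotImplemented
    error_message = str(ex)

print(error_message)
```

Note that `NotImplemented` is a special singleton value that is returned, whereas `NotImplementedError` is an exception class that is raised. They are not interchangeable.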

Below is an example class that implements adding, subtracting, scalar and dot products of vectors, along with some additional dunder methods such as `__iadd__`, `__neg__` and `__abs__`. 

After that, we have the descriptions of the different methods implemented as well as some code executions to demonstrate it working:

#### Vector Class Example

In [23]:
from numbers import Real
from math import sqrt

class Vector:
    def __init__(self, *components):
        # validate number of components is at least one, and all of them are real numbers
        if len(components) < 1:
            raise ValueError('Cannot create an empty Vector.')
        for component in components:
            if not isinstance(component, Real):
                raise ValueError(f'Vector components must all be real numbers - {component} is invalid.')
        
        # use immutable storage for vector
        self._components = tuple(components)
        
    def __len__(self):
        return len(self._components)
        
    @property
    def components(self):
        return self._components
    
    def __repr__(self):
        # works - but unwieldy for high dimension vectors
        return f'Vector{self._components}'
    
    def validate_type_and_dimension(self, v):
        return isinstance(v, Vector) and len(v) == len(self)
            
    def __add__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x + y for x, y in zip(self.components, other.components))
        return Vector(*components)
            
    def __sub__(self, other):
        if not self.validate_type_and_dimension(other):
            return NotImplemented
        components = (x - y for x, y in zip(self.components, other.components))
        return Vector(*components)
    
    def __mul__(self, other):
        print('__mul__ called...')
        if isinstance(other, Real):
            components = (other * x for x in self.components)
            return Vector(*components)
        if self.validate_type_and_dimension(other):
            # dot product
            components = (x * y for x, y in zip(self.components, other.components))
            return sum(components)
        return NotImplemented
    
    def __rmul__(self, other):
        print('__rmul__ called...')
        # for us, multiplication is commutative, so we can leverage our existing __mul__ method
        return self * other
    
    def __iadd__(self, other):
        print('__radd__ called...')
        if self.validate_type_and_dimension(other):
            components = (x + y for x, y in zip(self.components, other.components))
            self._components = tuple(components)  # mutating our Vector object
            return self # don't forget to return the result of the operation!
        return NotImplemented
        
    def __neg__(self):
        print('__neg__ called...')
        components = (-x for x in self.components)
        return Vector(*components)
    
    def __abs__(self):
        print('__abs__ called...')
        return sqrt(sum(x ** 2 for x in self.components))

#### Reflected Operators

**If the left operand's method (e.g. `__add__`, `__mul__`) returns `NotImplemented` AND the operands are not of the same type**, Python will swap the operands and try the right operand's reflected dunder method (e.g. `__radd__`, `__rmul__`). 

If the operands are of the same type then we never have to worry about the reflected dunder method as it will never be called. If that type has implemented `__mul__`, then order is unimportant as both objects being of that type will have access to `__mul__` so the method can be called. If `__mul__` is not defined, then neither order will work. 

All of the arithmetic dunders listed above have a reflected counterpart, including `__matmul__` (`__rmatmul__`).

Typically the reflected operator will delegate back to the original. For example, `__rmul__` is often implemented by delegating back to `__mul__`, either by calling `__mul__` explicitly or just using the `*` operator. 
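
As a small illustration of a reflected operator delegating to the original, here is a sketch with a hypothetical `Money` class. Supporting plain numbers in `__add__` is an assumption made so that the built-in `sum()`, which starts from `0`, works:

```python
class Money:
    """Hypothetical example class."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.amount + other.amount)
        if isinstance(other, (int, float)):
            return Money(self.amount + other)
        return NotImplemented

    def __radd__(self, other):
        # addition is commutative here, so delegate back to __add__
        return self + other

    def __repr__(self):
        return f'Money({self.amount})'


reflected = 5 + Money(10)          # int.__add__ returns NotImplemented, so Money.__radd__ runs
total = sum([Money(1), Money(2)])  # sum() begins with 0 + Money(1)

print(reflected, total)
```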

**Addition/Subtraction**

In [24]:
v1 = Vector(1, 2)
v2 = Vector(3, 4)

In [25]:
v1 + v2

Vector(4, 6)

In [26]:
v1 - v2

Vector(-2, -2)

**Multiplication and Reflected Multiplication / Scalar Product**

In [27]:
v1 = Vector(1, 2)

In [28]:
v1 * 10

__mul__ called...


Vector(10, 20)

In [29]:
10 * v1

__rmul__ called...
__mul__ called...


Vector(10, 20)

This only works because `int.__mul__` does not know how to handle a `Vector`, so it returns `NotImplemented`, and the two operands are of different types. Python then tries `Vector.__rmul__`. Also note how `__rmul__` has been implemented with `return self * other` which leverages `__mul__`.

**Dot Product**

In [30]:
v1 = Vector(2, 3)
v2 = Vector(4, 5)

In [31]:
v1 * v2

__mul__ called...


23

#### In-Place Operators

Examples of these are `__iadd__` (+=), `__isub__` (-=), `__imul__` (\*=), `__itruediv__` (/=), `__ifloordiv__` (//=), `__imod__` (%=) and `__ipow__` (**=).

Since these are in-place, typically in-place operators will try to **mutate** the object on the left of the expression, but this is **not** guaranteed. Tuples are a good example:

In [32]:
t = (1, 2)
print(id(t))
t += (3, 4)
print(id(t))

2383005389568
2383005565392


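Therefore, an in-place dunder method is not forced to return the same object (same ID) it was called on. In fact, tuples don't implement `__iadd__` at all, so Python falls back to `__add__`, which creates a new tuple and rebinds the name `t` to it.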
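
For contrast, a mutable type such as `list` does implement `__iadd__` and mutates in place. A quick sketch:

```python
numbers = [1, 2]
alias = numbers            # second reference to the same list
id_before = id(numbers)

numbers += [3, 4]          # list.__iadd__ extends the existing list

same_object = id(numbers) == id_before
print(numbers, same_object)
print(alias)               # the alias sees the change because the object was mutated

# tuple has no __iadd__, which is why += has to build a new tuple
print(hasattr(list, '__iadd__'), hasattr(tuple, '__iadd__'))
```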

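Here's our implementation in action. Note that the `print` call inside our `__iadd__` is mislabelled: the output below says `__radd__ called...`, but it is `__iadd__` that runs: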

In [33]:
v1 = Vector(1, 2)
v2 = Vector(10, 20)

print(id(v1))

v1 += v2

print(id(v1), v1)

2382985773552
__radd__ called...
2382985773552 Vector(11, 22)


#### Unary Operators

Examples of these are `__neg__` (-a), `__pos__` (+a), `__abs__` (`abs(a)`).

First, negation:

In [34]:
v1 = Vector(1, 2)
-v1

__neg__ called...


Vector(-1, -2)

So we can use it in arithmetic operations such as:

In [35]:
v2 = Vector(10, 10)

v2 + -v1

__neg__ called...


Vector(9, 8)

Now, absolute values:

In [36]:
v1 = Vector(1, 1)

In [37]:
abs(v1)

__abs__ called...


1.4142135623730951

# 03 - Rich Comparisons

Rich comparisons (<, <=, >, >=, ==, !=) are quite straightforward. 

If `__eq__` is not implemented, `==` falls back to the default inherited from `object`, which compares by identity (`is`).

Furthermore, if one comparison does not exist (the object returns `NotImplemented`), Python will try to reverse the operands and the operator (and unlike the arithmetic operators, both operands can be of the same type).

So if `a` has not implemented `__lt__`, then `a < b` will attempt `b > a` : `b.__gt__(a)`. 

The same applies for `<=`, which is reflected to `>=`.

For `==`, we get `!=` for free, as Python will just do `not(a == b)`.
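
These rules can be seen in a short sketch. The `Version` class is hypothetical and deliberately defines only `__eq__` and `__lt__`:

```python
class Version:
    """Hypothetical example class with only __eq__ and __lt__ defined."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if isinstance(other, Version):
            return self.number == other.number
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, Version):
            return self.number < other.number
        return NotImplemented


v1, v2 = Version(1), Version(2)

less_than = v1 < v2      # calls Version.__lt__ directly
greater_than = v2 > v1   # no __gt__ defined: reflected to v1.__lt__(v2)
not_equal = v1 != v2     # no __ne__ defined: derived from __eq__

print(less_than, greater_than, not_equal)

# <= is not derived automatically, so this raises TypeError
try:
    v1 <= v2
    le_supported = True
except TypeError:
    le_supported = False

print(le_supported)
```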

**Total Ordering**

In many cases, we can use the `@total_ordering` decorator from `functools` to derive almost all rich comparisons from just two base ones: `__eq__` and one other, e.g. `__lt__` or `__le__`.

For example, if `==` and `<` are defined, then:
- `a <= b` is `a == b or a < b`
- `a > b` is `b < a`
- `a >= b` is `a == b or b < a`
- `a != b` is `not(a == b)`

On the other hand if we define `==` and `<=`, then:
- `a < b` is `a <= b and not(a == b)`
- `a >= b` is `b <= a`
- `a > b` is `b <= a and not(b == a)`
- `a != b` is `not(a == b)`

In [11]:
from functools import total_ordering

@total_ordering
class Number:
    def __init__(self, x):
        self.x = x
        
    def __eq__(self, other):
        print('__eq__ called...')
        if isinstance(other, Number):
            return self.x == other.x
        return NotImplemented
    
    def __lt__(self, other):
        print('__lt__ called...')
        if isinstance(other, Number):
            return self.x < other.x
        return NotImplemented

In [12]:
a = Number(1)
b = Number(2)
c = Number(1)

In [13]:
a < b

__lt__ called...


True

In [14]:
a <= b

__lt__ called...


True

You'll notice that `__eq__` was not called. That's because `a < b` was `True`, so short-circuit evaluation of `a < b or a == b` skipped the equality check. In this next example though, you'll see both methods are called:

In [15]:
a <= c

__lt__ called...
__eq__ called...


True

# 04 - Hashing and Equality

Since this has been covered in more detail elsewhere, I will just jot down some reminders and useful tips:

Recall that for an object to be usable in a hash-based collection (key in a dictionary, element of a set, etc.) it must be **hashable**.

Therefore we should implement `__hash__` and `__eq__`. 

If `__eq__` is implemented, `__hash__` is implicitly set to `None` unless `__hash__` is implemented. Check out [Part 3: Section 3 - Subsection 05 Custom Classes and Hashing](https://github.com/nasiqziyan/python-deepdive-summaries/blob/main/Part%203/Summaries/Section%203%20Summary.ipynb) for more info. 

Always remember the rule for hash functions: **If two objects compare equal (==), they must hash equal.**

If `__eq__` has been overridden but `__hash__` has not, two objects may compare equal yet could not possibly hash equal, because the default `__hash__` is based on the object's memory ID, which is unique for each object. This is why Python sets `__hash__` to `None` in that case.

All objects inherit from `object`. Since this class has implemented default behaviours for `__hash__`, `__eq__`, etc., all objects will use this unless overridden.

Generally speaking, if we want to hash an instance based off one of its properties e.g. `self.name`, then we should make that property immutable. We do this by implementing the getter `@property` but not the setter.
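
Putting these reminders together, here is a minimal sketch. The `Member` and `EqOnly` classes are hypothetical names used only for illustration:

```python
class Member:
    """Hashable on a read-only name."""

    def __init__(self, name):
        self._name = name

    @property
    def name(self):  # getter only, so the hashed value cannot be reassigned
        return self._name

    def __eq__(self, other):
        if isinstance(other, Member):
            return self.name == other.name
        return NotImplemented

    def __hash__(self):
        return hash(self.name)


class EqOnly:
    """Overrides __eq__ without __hash__, so instances are unhashable."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


members = {Member('Eric'), Member('Eric'), Member('John')}
print(len(members))      # equal members hash equal, so duplicates collapse

print(EqOnly.__hash__)   # implicitly set to None

try:
    hash(EqOnly())
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False

print(eq_only_hashable)
```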

# 05 - Booleans

As we know, all objects in Python have an associated boolean value. This can be overridden with `__bool__`. 

If `__bool__` is not defined, 
- Python will look for `__len__`. If that returns **0**, the object is falsy; otherwise, it is truthy.
- if neither `__bool__` nor `__len__` is present, the object is always truthy.

Therefore, for custom collections, we only need to implement `__len__` and not `__bool__` to stay consistent with other collection types.

So by default, all objects are truthy.
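
A quick sketch of the lookup order, using three hypothetical classes:

```python
class Playlist:
    """Only __len__ is defined, so truthiness follows the length."""

    def __init__(self, *songs):
        self.songs = list(songs)

    def __len__(self):
        return len(self.songs)


class AlwaysFalse:
    """__bool__ takes priority over anything else."""

    def __bool__(self):
        return False


class Plain:
    """Neither __bool__ nor __len__: always truthy."""


results = (
    bool(Playlist()),
    bool(Playlist('song a')),
    bool(AlwaysFalse()),
    bool(Plain()),
)
print(results)
```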

# 06 - Callables

- Any object can be made to emulate a callable by implementing a `__call__` method. Many tools in the standard library actually implement this. For example, `partial` from `functools` is a class that behaves like a function via `__call__`.

- We can find out if any object is a callable with `callable()`, e.g. `callable(print) -> True`.
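
Both points can be sketched briefly. The `Multiplier` class is a hypothetical example, while `partial` and `callable` are used exactly as the standard library provides them:

```python
from functools import partial


class Multiplier:
    """Hypothetical example: instances behave like functions."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return self.factor * value


double = Multiplier(2)
print(double(21))                       # calls Multiplier.__call__

parse_binary = partial(int, base=2)     # a partial object, not a function
print(parse_binary('101'))
print(isinstance(parse_binary, partial))

print(callable(double), callable(parse_binary), callable(42))
```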

#### Example 1 - Cache with a cache-miss counter

I want to implement a dictionary to act as a cache, but I also want to keep track of the cache misses so I can later evaluate if my caching strategy is effective or not.

The `defaultdict` class can be useful as a cache; if a key is not found in our `defaultdict`, Python will call the `defaultdict`'s callable (its default factory), which we can set up to increment a counter. 

**Simple Approach**

In [3]:
miss_counter = 0

In [4]:
def default_value():
    global miss_counter
    miss_counter += 1
    return 'N/A'

And now we can use it this way:

In [6]:
from collections import defaultdict

d = defaultdict(default_value)

In [7]:
d['a'] = 1
d['a']
d['b']
d['c']

'N/A'

In [8]:
miss_counter

2

This works, but is not very good - the `default_value` function **relies** on us having a global `miss_counter` variable - if we don't have it our function won't work. Additionally we cannot use it to keep track of different cache instances since they would all use the same instance of `miss_counter`.

**Better Approach**

We can instead use `__call__` to make the instances of a class callable:

In [21]:
class DefaultValue:
    def __init__(self, default_value):
        self.default_value = default_value
        self.counter = 0
        
    def __call__(self):
        self.counter += 1
        return self.default_value

In [22]:
def_1 = DefaultValue(None)

cache_1 = defaultdict(def_1)

cache_1['a'] = 1
cache_1['a']
cache_1['b']
cache_1['c']

In [23]:
def_1.counter

2

In [24]:
def_2 = DefaultValue(0)

cache_2 = defaultdict(def_2)

cache_2['a'] = 1
cache_2['a']
cache_2['b']
cache_2['c']

0

In [25]:
def_2.counter

2

#### Example 2 - Profiler Decorator

Let's say that we want to write a profiler decorator to decorate functions in the module scope only (decorating methods will be looked at in Section 8 - Descriptors). 

The profiler will return the number of times the function was called (counter), the total elapsed time over all function calls and the average time (total elapsed / counter).

Let's say we want these values accessible from the function. We can do this by adding attributes to our function (remember everything is an object and objects have attributes).

Here's how we might do it:

##### Closure Approach

In [54]:
from time import perf_counter, sleep
from functools import wraps
import random

def profiler(fn):
    _counter = 0
    _time_elapsed = 0
    _avg_time = 0

    def inner(*args, **kwargs):
        nonlocal _counter
        nonlocal _time_elapsed
        nonlocal _avg_time
        _counter += 1
        start = perf_counter()
        result = fn(*args, **kwargs)  # forward the caller's arguments
        end = perf_counter()
        _time_elapsed += (end - start)
        _avg_time = _time_elapsed / _counter
        return result

    inner.counter = _counter
    inner.time_elapsed = _time_elapsed
    inner.avg_time = _avg_time

    return inner

@profiler
def func_1():
    sleep(random.random())

In [55]:
func_1(), func_1()

(None, None)

In [56]:
func_1.counter

0

Why didn't this work?

When `func_1` was decorated (i.e. when `profiler(func_1)` ran), `_counter = 0` executed and then `inner.counter = _counter` executed, but nothing inside `inner()` executed. 

So `inner.counter` was set to point to 0.

After `inner` is called, `_counter` is rebound to a new integer, but `inner.counter` is still pointing to the old object (0). How do we fix this?

We simply want `inner.counter` to read from the same **cell** (see Part 1, Section 7, Subsection 3) that `inner` and `profiler` use for `_counter`. 

We therefore need to make `inner.counter` a **callable** which returns the free variable `_counter`:

In [57]:
from time import perf_counter, sleep
from functools import wraps
import random

def profiler(fn):
    _counter = 0
    _time_elapsed = 0
    _avg_time = 0

    def inner(*args, **kwargs):
        nonlocal _counter
        nonlocal _time_elapsed
        nonlocal _avg_time
        _counter += 1
        start = perf_counter()
        result = fn(*args, **kwargs)  # forward the caller's arguments
        end = perf_counter()
        _time_elapsed += (end - start)
        _avg_time = _time_elapsed / _counter
        return result

    def counter():
        return _counter

    def time_elapsed():
        return _time_elapsed

    def avg_time():
        return _avg_time

    inner.counter = counter
    inner.time_elapsed = time_elapsed
    inner.avg_time = avg_time

    return inner

@profiler
def func_1():
    sleep(random.random())

In [58]:
func_1(), func_1()

(None, None)

In [59]:
func_1.counter()

2

In [61]:
func_1.avg_time()

0.09571774999858462

##### Class Approach

The last approach was a little convoluted...

Here's how we would make a decorator using a class. Note that we will calculate avg_time more lazily. We could've done this above by modifying the `avg_time` closure to compute `_time_elapsed / _counter`, but I wanted the pattern to be clear.

In [63]:
class Profiler:
    def __init__(self, fn):
        self.counter = 0
        self.total_elapsed = 0
        self.fn = fn
        
    def __call__(self, *args, **kwargs):
        self.counter += 1
        start = perf_counter()
        result = self.fn(*args, **kwargs)
        end = perf_counter()
        self.total_elapsed += (end - start)
        return result
        
    @property
    def avg_time(self):
        return self.total_elapsed / self.counter

In [64]:
@Profiler
def func_1():
    sleep(random.random())

At this point, `func_1` is an **instance** of `Profiler`. These instances are **callable** due to `__call__`:

In [65]:
func_1(), func_1()

(None, None)

In [66]:
func_1.avg_time

0.6357646000014938

# 07 - The `__del__` Method

The `__del__` method is a **class finaliser**. 

The **garbage collector (GC)** destroys objects that are no longer referenced anywhere.

`__del__` is the method that is called right before the object is destroyed by the GC, so it is the GC that determines when `__del__` is called, not us.

This is the case *even when we use the `del` keyword*. The `del` keyword only removes the current reference to the object; if other references remain, `__del__` won't be called. A subtle source of extra references is exception handling: if an exception is raised while our object is a local variable, the exception object carries a **stack trace** (`.__traceback__`) whose frame still references our object (`ex.__traceback__.tb_frame.f_locals` is a dictionary in which the variable name is a key and the object itself is the value). If we delete our own reference but keep the exception around, the object **does not** get destroyed.

If we want to clean up resources deterministically, we should not rely on the `__del__` method, as the moment it is called is not under our control. Instead, we should use **context managers**.
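
As a minimal sketch of the context manager alternative, the hypothetical `Resource` class below releases itself in `__exit__`, which runs as soon as the `with` block ends, even if the block raises:

```python
class Resource:
    """Hypothetical example: a resource that must be released promptly."""

    def __init__(self, name):
        self.name = name
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.is_open = False  # deterministic clean-up
        return False          # let any exception propagate


with Resource('db') as resource:
    open_inside_block = resource.is_open

print(open_inside_block, resource.is_open)

# clean-up still happens when the block raises
failing = Resource('file')
try:
    with failing:
        raise ValueError('something went wrong')
except ValueError:
    pass

print(failing.is_open)
```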

As this method is not used very often, I won't expand upon it in any more depth.

# 08 - The `__format__` Method

The `__format__` method is yet another representation function alongside `__str__` and `__repr__`. It is invoked by the built-in `format()` function:
```python
format(value, format_spec)
```
We can use it for floats, datetimes, etc.

In [67]:
format(1/3, '.2f')

'0.33'

In [72]:
from datetime import datetime

now = datetime.now()
format(now, '%a %Y-%m-%d  %I:%M %p')

'Thu 2024-07-11  04:34 PM'

We can override the format specification through `__format__`, but in general we don't want to do so, as it's fairly complex, and it's often easier to delegate to an object which has already implemented formatting, e.g. converting a number with `float()` gives us an object that formats itself according to its own specification.

In [70]:
float(0.500000)

0.5
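
To make the delegation idea concrete for numbers, here is a minimal sketch with a hypothetical `Temperature` class whose `__format__` hands the format specification straight to `float`:

```python
class Temperature:
    """Hypothetical example class."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, format_spec):
        # delegate the number formatting to float's own format specification
        number = format(float(self.celsius), format_spec)
        return f'{number} C'


temperature = Temperature(21.456)

one_decimal = format(temperature, '.1f')
two_decimals = f'{temperature:.2f}'   # f-strings call __format__ as well

print(one_decimal)
print(two_decimals)
```

The class never parses the specification itself, so anything `float` understands (precision, width, alignment) works without extra code.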

Below, we will delegate to `date`'s formatting specification (the same `strftime`-style codes used by `datetime`). This is used when a `date` or `datetime` object is passed as a value to the `format()` function:

In [73]:
class Person:
    def __init__(self, name, dob):
        self.name = name
        self.dob = dob
        
    def __repr__(self):
        print('__repr__ called...')
        return f'Person(name={self.name}, dob={self.dob.isoformat()})'
    
    def __str__(self):
        print('__str__ called...')
        return f'Person({self.name})'
    
    def __format__(self, date_format_spec):
        print(f'__format__ called with {repr(date_format_spec)}...')
        dob = format(self.dob, date_format_spec)
        return f'Person(name={self.name}, dob={dob})'

So now we have:

In [74]:
from datetime import date

p = Person('Alex', date(1900, 10, 20))

In [75]:
str(p)

__str__ called...


'Person(Alex)'

In [76]:
repr(p)

__repr__ called...


'Person(name=Alex, dob=1900-10-20)'

In [77]:
format(p, '%B %d, %Y')

__format__ called with '%B %d, %Y'...


'Person(name=Alex, dob=October 20, 1900)'