## Special Methods for Sequences

### Vector: A User-Defined Sequence Type

* Goal: Create a class to represent a multidimensional `Vector`.
* `Vector` will behave like a standard Python immutable flat sequence:
  * Basic sequence protocol: `__len__` and `__getitem__`
  * Safe representation of instances with many items
  * Proper slicing support, producing new vector instances
  * Aggregate hashing, taking into account every contained element value
  * Custom formatting language extension
  * Dynamic attribute access with `__getattr__`
* Strategy: Composition instead of inheritance ---> store components in an array of floats

### Vector Take #1: Vector2d Compatible

* Constructor: takes the data as a single iterable argument, as the built-in sequence types do, e.g.:
```
Vector([1, 2, 3])
```
  * `Vector` could have subclassed `Vector2d`, but the incompatible constructors make subclassing inadvisable.

* Use the `reprlib` module to produce a limited-length representation.
  * Because of its role in debugging, calling `repr()` on an object should never raise an exception.
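
As a minimal sketch of what `reprlib.repr` does (the 100-item array is only an illustrative input, not part of the original notes), compare the full and the abbreviated representations:

```python
import reprlib
from array import array

long_array = array('d', range(100))

full = repr(long_array)            # lists all 100 items
short = reprlib.repr(long_array)   # keeps the first few items, then '...'

print(len(full), len(short))
print(short)
```

The abbreviated string still starts with `array('d', [`, which is why `__repr__` in the class below slices from the `[` up to the final `)`.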

In [4]:
from array import array
import reprlib
import math
import operator


class Vector1:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)  # "protected" attribute: array holding the Vector components

    def __iter__(self):
        return iter(self._components)  # iterator over self._components

    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return f'Vector({components})'

    def __str__(self):
        return str(tuple(self))

    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(self._components))

    def __eq__(self, other):
        return tuple(self) == tuple(other)

    def __abs__(self):
        return math.hypot(*self)  # since Python 3.8, math.hypot accepts N-dimensional points

    def __bool__(self):
        return bool(abs(self))

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(memv)  # pass the memoryview directly to the constructor, without unpacking it with *
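
The byte-level round trip behind `__bytes__` and `frombytes` can be sketched with a plain `array`, without the class. The variable names here are illustrative assumptions; the steps mirror the two methods above:

```python
from array import array

original = array('d', [3.0, 4.0, 5.0])

# Same layout as __bytes__: one byte for the typecode, then the raw items.
octets = bytes([ord(original.typecode)]) + bytes(original)

# Same steps as frombytes: read the typecode, then reinterpret the rest.
typecode = chr(octets[0])
memv = memoryview(octets[1:]).cast(typecode)
restored = array(typecode, memv)   # the memoryview is iterated item by item

print(len(octets), restored)
```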

### Protocols and Duck Typing

* Don't need to inherit from any special class to create a fully functional sequence type in Python.
* Just need to implement the methods that fulfill the sequence *protocol*.
* A protocol is an informal interface defined only in documentation and not in code.
* Sequence protocol in Python: just implement the `__len__` and `__getitem__` methods.
* All that matters is that the object provides the necessary methods ----> This became known as *duck typing*.
* Because protocols are informal and unenforced, you can often get away with implementing just part of a protocol if you know the specific context where the class will be used, e.g. to support iteration only `__getitem__` is needed; there is no need for `__len__`.
* Two kinds of protocols: *static protocol* and *dynamic protocol*.
* One key difference between them is that static protocol implementations must provide all methods defined in the protocol class (more about protocols in Chapter 13).
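
A small sketch of a partial protocol implementation: the hypothetical `Countdown` class below (not from the original notes) defines only `__getitem__`, yet Python can still iterate over it by calling `__getitem__` with 0, 1, 2, ... until `IndexError` is raised.

```python
class Countdown:
    """Hypothetical example: implements only part of the sequence protocol."""

    def __getitem__(self, index):
        if not 0 <= index < 3:
            raise IndexError(index)   # signals the end of the iteration
        return 3 - index


items = list(Countdown())        # iteration falls back to __getitem__
found = 2 in Countdown()         # the `in` operator also falls back to iteration
print(items, found)
```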

### Vector Take #2: A sliceable Sequence

* Need `__len__` and `__getitem__` in `Vector` to support sequence protocol.
* A slice of a `Vector` should be a `Vector` instance ----> Need to analyze the arguments that `__getitem__` receives.

In [5]:
# example 12-4
class MySeq:
    def __getitem__(self, index):
        return index

s = MySeq()
print(s[1])
print(s[1:4])
print(s[1:4:2])
print(s[1:4:2, 9])  # presence of commas inside [] means __getitem__ receives a tuple
print(s[1:4:2, 7:9])

1
slice(1, 4, None)
slice(1, 4, 2)
(slice(1, 4, 2), 9)
(slice(1, 4, 2), slice(7, 9, None))


In [6]:
dir(slice)  # note the start, stop, step attributes and the indices method

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'indices',
 'start',
 'step',
 'stop']

In [7]:
help(slice.indices)

Help on method_descriptor:

indices(...)
    S.indices(len) -> (start, stop, stride)
    
    Assuming a sequence of length len, calculate the start and stop
    indices, and the stride length of the extended slice described by
    S. Out of bounds indices are clipped in a manner consistent with the
    handling of normal slices.



* The `indices` method gracefully handles missing or negative indices and slices that are longer than the original sequence.
* It produces a normalized tuple of start, stop, and stride integers tailored to a sequence of the given length (start and stop are nonnegative whenever the stride is positive).
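
One way to check the normalization described above is to feed the triple to `range` and compare with ordinary slicing. This is a sketch with an arbitrary five-letter string as the sequence:

```python
letters = 'ABCDE'
results = []

for s in (slice(None, 10, 2), slice(-3, None), slice(None, None, -1)):
    start, stop, step = s.indices(len(letters))
    # Rebuild the slice by hand from the normalized triple.
    rebuilt = ''.join(letters[i] for i in range(start, stop, step))
    results.append(rebuilt)
    print(s, '->', (start, stop, step), rebuilt)
```

Note that with a negative stride the normalized stop can be `-1`, which is what lets `range` reach index 0.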

In [8]:
print(slice(None, 10, 2).indices(5))    # The length of the sequence is 5.
print(slice(-3, None, None).indices(5))

(0, 5, 2)
(2, 5, 1)


* `Vector` won't need the `slice.indices()` method, because a slice argument is passed on to the underlying `_components` array, which handles it.
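
A quick check, assuming `Vector` simply hands the slice object to its `array`, that `array` slicing already clips out-of-range bounds on its own:

```python
from array import array

components = array('d', range(5))

tail = components[3:100]    # stop far beyond the end is clipped
last_two = components[-2:]  # negative start is resolved
empty = components[10:]     # start beyond the end gives an empty array

print(tail, last_two, empty)
```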

In [9]:
class Vector2(Vector1):
    def __len__(self):
        return len(self._components)

    def __getitem__(self, key):
        if isinstance(key, slice):
            cls = type(self)
            return cls(self._components[key])  # invoke the class to build another Vector instance from a slice of the _components array
        index = operator.index(key)
        return self._components[index]  # return specific item from _components

* `operator.index()` ----> calls the `__index__` special method.
* `operator.index()` and `__index__` were defined in PEP 357 to allow any of the numerous integer types in NumPy to be used as indexes and slice arguments.
* Key difference between `int()` and `operator.index()` is that the latter is intended for this specific purpose, e.g. `int(3.14)` returns `3` while `operator.index(3.14)` raises `TypeError`.
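
The difference can be sketched as follows. `Position` is a hypothetical integer-like class introduced only for this illustration:

```python
import operator


class Position:
    """Hypothetical integer-like type that opts in to being used as an index."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


print(operator.index(Position(2)))   # uses __index__
print('abcde'[Position(2)])          # built-in sequences accept it as an index
print(int(3.14))                     # int() silently truncates

index_error = None
try:
    operator.index(3.14)             # a float has no __index__
except TypeError as exc:
    index_error = exc
print(type(index_error).__name__)
```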


In [10]:
# example 12-7
v = Vector2(range(7))
v[-1]

6.0

In [11]:
v[1:4]

Vector([1.0, 2.0, 3.0])

In [12]:
v[-1:]  # slice of len = 1 creates a Vector

Vector([6.0])

In [13]:
v[1,2]

TypeError: 'tuple' object cannot be interpreted as an integer

### Vector Take #3: Dynamic Attribute Access

* Want to access the first few components with shortcut letters like `x`, `y`, `z` instead of `v[0]`, `v[1]`, `v[2]`.
* Could provide read-only access to `x` and `y` using `@property`, but that requires one property per letter.
* The `__getattr__` special method is better.
* `__getattr__` is invoked by the interpreter when attribute lookup fails.
* Given the expression `obj.x`, Python checks whether the `obj` instance has an attribute named `x`.
  * If not, the search goes to the class (`obj.__class__`) and then up the inheritance graph.
  * If still not found, the `__getattr__` method defined in the class of `obj` is called with `self` and the attribute name as a string.

In [14]:
class vector3(Vector2):
    __match_args__ = ('x', 'y', 'z', 't')  # allow positional pattern matching on dynamic attributes supported by __getattr__

    def __getattr__(self, name):
        cls = type(self)  # get the class, to read __match_args__ from it
        try:
            pos = cls.__match_args__.index(name)  # get position of name in __match_args__
        except ValueError:  # if name not found
            pos = -1
        if 0 <= pos < len(self._components):  # the letter maps to an existing component
            return self._components[pos]
        msg = f'{cls.__name__!r} object has no attribute {name!r}'  # same wording as the standard AttributeError message
        raise AttributeError(msg)

v3 = vector3(range(5))
print(v3.x)
v3.x = 10  # Assigning a new value to x should raise an exception, but here it silently succeeds
print(v3.x)
v3  #  vector components did not change

0.0
10


Vector([0.0, 1.0, 2.0, 3.0, 4.0])

In [15]:
print(v3.__dict__)

{'_components': array('d', [0.0, 1.0, 2.0, 3.0, 4.0]), 'x': 10}


* After assigning a value to attribute `x`, the `v3` object has an `x` instance attribute, so `__getattr__` is no longer called to retrieve `v3.x`.
* The interpreter just returns the value 10 that is bound to `v3.x`.
* `__getattr__` pays no attention to instance attributes other than `self._components`, from where it retrieves the values of the virtual attributes listed in `__match_args__`.
* Need to customize the logic for setting attributes, by implementing `__setattr__`, in order to avoid this inconsistency.
* When implementing `__getattr__`, you very often need to code `__setattr__` as well to avoid inconsistent behavior in objects.
* To allow changing components, could implement:
    * `__setitem__` to enable `v[0] = 1.1`
    * `__setattr__` to make `v.x = 1.1` work.
* `Vector` is immutable, so there is no need to allow changing components.

In [None]:
class Vector3(Vector2):
    __match_args__ = ('x', 'y', 'z', 't')

    def __getattr__(self, name):
        cls = type(self)
        try:
            pos = cls.__match_args__.index(name)
        except ValueError:
            pos = -1
        if 0 <= pos < len(self._components):
            return self._components[pos]
        msg = f'{cls.__name__!r} object has no attribute {name!r}'
        raise AttributeError(msg)

    def __setattr__(self, name, value):
        cls = type(self)
        if len(name) == 1:
            if name in cls.__match_args__:
                error = 'readonly attribute {attr_name!r}'
            elif name.islower():
                error = "can't set attributes 'a' to 'z' in {cls_name!r}"
            else:
                error = ''
            if error:
                msg = error.format(cls_name=cls.__name__, attr_name=name)
                raise AttributeError(msg)
        super().__setattr__(name, value)  # call __setattr__ on superclass for standard behavior
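
The effect of pairing `__getattr__` with a guarding `__setattr__` can be sketched with a reduced, hypothetical stand-in class (`Guarded`, with a simplified error message; it is not the `Vector3` class itself):

```python
class Guarded:
    """Hypothetical stand-in: 'x' and 'y' are read-only virtual attributes."""

    __match_args__ = ('x', 'y')

    def __init__(self, x, y):
        self._components = [x, y]   # multi-letter name, so the guard lets it through

    def __getattr__(self, name):
        cls = type(self)
        try:
            pos = cls.__match_args__.index(name)
        except ValueError:
            pos = -1
        if 0 <= pos < len(self._components):
            return self._components[pos]
        raise AttributeError(f'{cls.__name__!r} object has no attribute {name!r}')

    def __setattr__(self, name, value):
        if len(name) == 1 and name.islower():
            raise AttributeError(f"can't set single-letter attribute {name!r}")
        super().__setattr__(name, value)   # standard behavior for other names


g = Guarded(3.0, 4.0)
blocked = None
try:
    g.x = 10                  # rejected by __setattr__
except AttributeError as exc:
    blocked = str(exc)
g.label = 'ok'                # longer names are still allowed
print(blocked, g.x, g.label)
```

Because the assignment is rejected, no `x` entry appears in the instance `__dict__`, and `g.x` keeps coming from `_components`.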

* Can use `__slots__` at class level to prevent setting new instance attributes.
* Using `__slots__` just to prevent instance attribute creation is not recommended.
* `__slots__` should be used only to save memory.
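
For reference, a minimal sketch of the side effect mentioned above. `Pixel` is a hypothetical class: with `__slots__` declared, instances have no `__dict__`, so assigning an undeclared attribute fails.

```python
class Pixel:
    """Hypothetical class using __slots__."""

    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Pixel(1, 2)
p.x = 10                      # declared slot: assignment works

slot_error = None
try:
    p.color = 'red'           # undeclared name: no __dict__ to store it in
except AttributeError as exc:
    slot_error = str(exc)
print(p.x, slot_error)
```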

### Vector Take #4: Hashing and a Faster ==

* Implementing `__hash__` together with `__eq__` will make `Vector` instances hashable.
* Building a tuple to compute the hash, as in `Vector2d`, may be too costly because `Vector` instances may be dealing with thousands of components.
* Apply the `^` (xor) operator to the hashes of every component in succession ----> `functools.reduce` will be used.
* Key idea of `reduce()` is to reduce a series of values to a single value:
  * First argument to `reduce()` is a two-argument function.
  * Second argument is an iterable.
  * Third argument is the optional `initializer`: it is the value returned if the iterable is empty, and it is used as the first argument in the reducing loop, so it should be the identity value of the operation (e.g. `0` for `+` and `^`).
  * When using `reduce`, it's good practice to provide `initializer`, because without it an empty iterable raises `TypeError`.
  * When calling `reduce(fn, list)`, `fn` will be applied to the first pair of elements of the iterable ----> `fn(list[0], list[1])` ----> producing first result `r1`.
  * Then `fn` is applied to `r1` and next element ----> `fn(r1, list[2])` ----> producing second result `r2`.
  * `fn(r2, list[3])` ----> producing `r3`
  * until last element when a single result `rN` is returned.
* For `^` operator can use `operator.xor`.
* The `operator` module provides the functionality of all Python infix operators in function form ----> less need for `lambda`.
* Can use a *map-reduce* computation ----> The mapping step produces one hash for each component, and the reduce step aggregates all hashes with the xor operator.
* In Python 3, `map` is lazy; it returns an iterator that yields the results on demand, thus saving memory.
* The `__eq__` method inherited from `Vector1` is inefficient for `Vector` instances that may have thousands of components. It builds two tuples, copying the entire contents of the operands, just to use the `__eq__` of tuple. It also considers `Vector([1, 2])` equal to `(1, 2)`, which may be a problem.
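
The `reduce` behavior and the map-reduce hash described in this list can be sketched as follows. The `components` list is an illustrative stand-in for a vector's items:

```python
import functools
import operator

# 0 ^ 1 ^ 2 ^ 3 ^ 4 ^ 5, applied pairwise from left to right
xor_of_range = functools.reduce(operator.xor, range(6))

# With an initializer, an empty iterable simply returns the initializer.
empty_with_init = functools.reduce(operator.xor, [], 0)

# Without an initializer, an empty iterable raises TypeError.
empty_error = None
try:
    functools.reduce(operator.xor, [])
except TypeError as exc:
    empty_error = exc

# Map-reduce: map each component to its hash, then reduce with xor.
components = [3.0, 4.0, 5.0]
hashes = map(hash, components)            # lazy: nothing is computed yet
aggregate = functools.reduce(operator.xor, hashes, 0)

print(xor_of_range, empty_with_init, aggregate)
```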

In [None]:
import functools

class Vector4(Vector3):
    def __eq__(self, other):
        return (len(self) == len(other) and  # check len first, because zip stops producing values without warning as soon as one of the inputs is exhausted
                all(a == b for a, b in zip(self, other)))  # zip lazily yields tuples made from the items of each iterable

    def __hash__(self):
        hashes = (hash(x) for x in self)
        return functools.reduce(operator.xor, hashes, 0)

* The `all` function used in `__eq__` performs in one line the same aggregate computation as a `for` loop. If all comparisons between corresponding components in the operands are `True`, the result is `True`. As soon as one comparison is `False`, `all` returns `False`.
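
A short sketch of why the length check matters, using two plain lists of different lengths as stand-ins for vectors:

```python
a = [1.0, 2.0]
b = [1.0, 2.0, 3.0]

pairs = list(zip(a, b))                      # zip stops at the shorter input
naive = all(x == y for x, y in zip(a, b))    # compares only the first two items
checked = len(a) == len(b) and naive         # the length check catches the mismatch

print(pairs, naive, checked)
```

Since Python 3.10, `zip` also accepts `strict=True`, which raises `ValueError` on a length mismatch instead of stopping silently.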

### Vector Take #5: Formatting

* `Vector` will use spherical coordinates (also called "hyperspherical" coordinates, because they work in any number of dimensions).
* Change the custom format suffix from `'p'` (polar, used by `Vector2d`) to `'h'`.
* The `'h'` code produces a display like `<r, a1, a2, a3>`, where `r` is the magnitude and the remaining numbers are the angular components `a1`, `a2`, `a3`.
* The `angle(n)` method computes one of the angular coordinates.
* `angles()` returns an iterable of all angular coordinates.
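
As a worked sketch of the `'h'` format, the standalone functions below apply the same formulas as the `angle` and `angles` methods to a plain list. The function names and the sample vectors are assumptions made for this illustration:

```python
import itertools
import math


def angle(components, n):
    """Angular coordinate n (1-based) of a list of Cartesian components."""
    r = math.hypot(*components[n:])
    a = math.atan2(r, components[n - 1])
    if n == len(components) - 1 and components[-1] < 0:
        return math.pi * 2 - a     # last angle covers the full circle
    return a


def angles(components):
    return [angle(components, n) for n in range(1, len(components))]


v = [1.0, 1.0]
# Magnitude first, then the angular coordinates, as with the 'h' suffix.
coords = itertools.chain([math.hypot(*v)], angles(v))
formatted = '<{}>'.format(', '.join(format(c, '.3f') for c in coords))
print(formatted)

downward = angle([0.0, -1.0], 1)   # pointing along -y: three quarters of a turn
print(downward)
```

For `[1.0, 1.0]` the magnitude is the square root of 2 and the single angle is π/4, which matches the familiar polar coordinates of that point.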

In [16]:
import functools
import itertools

class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

    def __iter__(self):
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return f'Vector({components})'

    def __str__(self):
        return str(tuple(self))

    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(self._components))

    def __eq__(self, other):
        return (len(self) == len(other) and
                all(a == b for a, b in zip(self, other)))

    def __hash__(self):
        hashes = (hash(x) for x in self)
        return functools.reduce(operator.xor, hashes, 0)

    def __abs__(self):
        return math.hypot(*self)

    def __bool__(self):
        return bool(abs(self))

    def __len__(self):
        return len(self._components)

    def __getitem__(self, key):
        if isinstance(key, slice):
            cls = type(self)
            return cls(self._components[key])
        index = operator.index(key)
        return self._components[index]

    __match_args__ = ('x', 'y', 'z', 't')

    def __getattr__(self, name):
        cls = type(self)
        try:
            pos = cls.__match_args__.index(name)
        except ValueError:
            pos = -1
        if 0 <= pos < len(self._components):
            return self._components[pos]
        msg = f'{cls.__name__!r} object has no attribute {name!r}'
        raise AttributeError(msg)

    def angle(self, n):
        r = math.hypot(*self[n:])
        a = math.atan2(r, self[n-1])
        if (n == len(self) - 1) and (self[-1] < 0):
            return math.pi * 2 - a
        else:
            return a

    def angles(self):
        return (self.angle(n) for n in range(1, len(self)))

    def __format__(self, fmt_spec=''):
        if fmt_spec.endswith('h'):  # hyperspherical coordinates
            fmt_spec = fmt_spec[:-1]
            coords = itertools.chain([abs(self)],
                                     self.angles())  # itertools.chain iterates seamlessly over the magnitude, then the angular coordinates
            outer_fmt = '<{}>'
        else:
            coords = self
            outer_fmt = '({})'
        components = (format(c, fmt_spec) for c in coords)
        return outer_fmt.format(', '.join(components))

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])
        memv = memoryview(octets[1:]).cast(typecode)
        return cls(memv)

In [18]:
v1 = Vector(range(3))
print(v1)

(0.0, 1.0, 2.0)


### Lecturers



*   Mohammad Mehdi Heidari: [Linkedin](https://www.linkedin.com/in/mohammad-mehdi-heidari-70b76611a)
*   Saeed Hemmati : [Linkedin](https://www.linkedin.com/in/saeed-hemati/)

Presentation Date : 12-08-2023

### Reviewers



1.   Mahya Asgarian, Review Date : 12-08-2023, [Linkedin](https://www.linkedin.com/in/mahya-asgarian-9a7b13249/)
