# Classes & Instances

#### Definition

In contrast to all the built-in data types introduced so far, *classes* are **user-defined data types** that allow the programmer to create new and **abstract** ways to look at data **and its associated behavior** and to manage the **state encapsulated** in **concrete** *instances* of these custom data types.

Classes enable the **[object-oriented programming](https://en.wikipedia.org/wiki/Object-oriented_programming)** (OOP) paradigm where **large programs** are broken down into many **small components** and **code** is **re-used**. This way, a program that is too big for a programmer to fully understand as a whole becomes maintainable via its individual pieces, which are easier to comprehend.

#### Approximate Simplified Definition

Classes are user-defined types that enable a programming style where different components of a computer program are modeled after real world objects. In this sense, classes are abstract blueprints for concrete instances in the real world.

While this latter definition is often true, advanced programmers highlight primarily the **managing state** and **code maintainability and re-use** aspects when it comes to object-orientation.

Oftentimes, we see the terminology "classes & objects" instead of "classes & instances" used by programmers on forums like [Stack Overflow](https://stackoverflow.com) but also in books on Python. In this notebook, we will be more precise as both classes and instances are actually objects (i.e., they have an identity or memory address, a type that specifies the behavior, and a value).

## Example: Vectors and Matrices

Neither core Python nor the Standard Library offers an implementation of common [linear algebra](https://en.wikipedia.org/wiki/Linear_algebra) functionalities. While the popular third-party library [numpy](http://www.numpy.org/) is the de-facto standard and is recommended for real-life production scenarios, this notebook showcases how one could use Python's object-oriented language features to implement common matrix and vector operations.

Naively, we could model a vector $\vec{y} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ using a tuple or a list depending on whether we wish the vector to be a mutable object or not.

In [1]:
y = (1, 2, 3)

In [2]:
y

(1, 2, 3)

We can extend this approach and model a matrix $\bf{X} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$ as either a tuple of tuples or a list of lists or a mixture thereof.

Observe that we need to decide whether to use a list of rows or a list of columns (the same holds for tuples). We go with the former, which is a popular convention.

In [3]:
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [4]:
X

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

While this way of representing vectors and matrices in memory keeps things simple, Python's object-oriented features allow us to model their inherent semantics and eventually to create a new **[domain-specific language](https://en.wikipedia.org/wiki/Domain-specific_language)** (DSL) that simplifies working with vectors and matrices.

For example, semantically we should be able to multiply a matrix with a vector. However, Python does not know how to multiply a list of lists with a tuple.

In [5]:
X * y

TypeError: can't multiply sequence by non-int of type 'tuple'
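Without custom data types, the multiplication has to be spelled out in a plain function that knows about the nested list layout. The following is only a minimal sketch; the helper name `matrix_vector_product` is hypothetical and not part of this notebook's later code.

```python
def matrix_vector_product(matrix, vector):
    """Multiply a list of rows with a sequence of numbers (hypothetical helper)."""
    # Each entry of the result is the dot product of one row with the vector.
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = (1, 2, 3)

result = matrix_vector_product(X, y)
print(result)  # [14, 32, 50]
```

Such a function works, but the caller must remember which helper belongs to which kind of data, which is what the classes introduced next are meant to avoid.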

In the next section, we introduce two user-defined data types, `Vector` and `Matrix`, and refine them iteratively throughout this notebook. The tuple and list of lists approaches remain at the heart of the new data types as they hold the encapsulated state (i.e., the entries of the vectors and matrices), while more and more behavior is built up around them.

## Class Definition

The `class` statement creates a new variable that points to a so-called **class object** in memory.

Following the **header** line, the indented **body** syntactically consists of function definitions (i.e., the `def` statements) and variable assignments (the `dummy_class_variable`). Any code put here is executed just as it would be without the `class` statement, but the class now acts as a **namespace**, i.e., we can access the resulting names with the dot operator `.` on the class object. These names are also called **class attributes**.
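That the class body is ordinary code executed once, with the resulting names ending up as class attributes, can be illustrated with a small sketch. The class `Config` and its attributes are hypothetical and serve only this illustration.

```python
class Config:
    """Hypothetical toy class; its body is ordinary code run at definition time."""

    base = 10
    doubled = base * 2  # a plain assignment that can use names defined before

    def describe(self):
        return "base={}, doubled={}".format(self.base, self.doubled)


# The names from the body are reachable via the dot operator on the class.
print(Config.base, Config.doubled)  # 10 20

# The function defined in the body is a class attribute as well.
print(callable(Config.describe))    # True
```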

Within this context, a function is referred to as a **method** that will be **bound** to future **instance objects** created from the class object. The binding process means that Python implicitly inserts a reference to a specific instance object as the first argument to any method invocation that we have to capture in the `def` statement. By convention, we name this first parameter of every method `self` as it references the concrete instance object on which the method is invoked. Then, within the scope of this method we can set and access attributes via the dot operator `.` on `self`.

Python allows us to define so-called **magic methods** (named in the "dunder" style) on a class that make the new data type fit into Python's language features seamlessly. While some important magic methods are discussed in detail in this notebook, see the [Language Reference](https://docs.python.org/3/reference/datamodel.html) for an exhaustive list of all such methods and the arguments they expect. Magic methods not defined on a class are implicitly replaced by their default implementation as mentioned in the Reference.

The example below shows three magic methods. `__init__()` is the one responsible for **initializing** a new instance object, which usually means setting **instance attributes**. The example expects a new vector to be created from some iterable object (e.g., a list or tuple) passed in as the `data` argument and then stores its elements in a list `_entries` on the new instance referenced as `self` after converting them to floats. The other two magic methods `__repr__()` and `__str__()` are discussed further below.

A best practice is to **separate** the **usage** of a new data type **from its implementation**. By convention, attributes (both on classes and instances) that should not be accessed from anywhere outside of methods have a name starting with an underscore "\_" (many other programming languages use the keywords *private* or *protected* instead of this convention). In contrast, attributes not starting with an underscore "\_" can safely be accessed from anywhere in a program. In the example, the instance attribute `self._entries` is such an implementation detail: we could have decided to store a vector's entries in a tuple instead of a list, which should not affect how we use a vector instance later. This concept is also known as **information hiding** in the software engineering literature.

Lastly, as indicated by [PEP 257](https://www.python.org/dev/peps/pep-0257/) and also section 3.8.4 of the [Google Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md), we should use docstrings to document relevant parts of the new data type. With respect to naming, classes are named according to the CamelCase convention while instances are treated like normal variables and named in snake\_case. Note that after this first example of `Vector` and `Matrix` classes we do not always repeat all the magic methods such as `__repr__()` or `__str__()` and leave out the docstrings for brevity.

In [6]:
class Vector:
    """A standard one-dimensional vector from linear algebra.

    Note that all the entries are converted to floats.
    """

    dummy_class_variable = "I am a vector"

    def __init__(self, data):
        """Initiate a new vector.

        Args:
            data (iterable): The vector's entries.
                Must have at least one element.
        Raises:
            ValueError: If the provided data do not have enough entries.
        """
        self._entries = list(float(x) for x in data)
        if len(self._entries) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __repr__(self):
        return "Vector((" + ", ".join("{:.3f}".format(x) for x in self._entries) + "))"

    def __str__(self):
        return "Vector({:.1f}, ..., {:.1f})[{:d}]".format(
                   self._entries[0], self._entries[-1], len(self._entries))

`Vector` is a full-fledged object on its own. Its type is `type`, which indicates that it represents a data type (built-in classes such as `int` have the same type), and evaluating it shows its fully qualified name.

In [7]:
id(Vector)

35520280

In [8]:
type(Vector)

type

In [9]:
Vector

__main__.Vector

The docstrings are transformed into nice help texts.

In [10]:
Vector?

We can use the built-in function [vars()](https://docs.python.org/3/library/functions.html#vars) as an alternative to [dir()](https://docs.python.org/3/library/functions.html#dir) to obtain a brief summary of the attributes on `Vector`.

In [11]:
vars(Vector)

mappingproxy({'__module__': '__main__',
              '__doc__': 'A standard one-dimensional vector from linear algebra.\n\n    Note that all the entries are converted to floats.\n    ',
              'dummy_class_variable': 'I am a vector',
              '__init__': <function __main__.Vector.__init__(self, data)>,
              '__repr__': <function __main__.Vector.__repr__(self)>,
              '__str__': <function __main__.Vector.__str__(self)>,
              '__dict__': <attribute '__dict__' of 'Vector' objects>,
              '__weakref__': <attribute '__weakref__' of 'Vector' objects>})

The dot operator `.` can be used to access class attributes and also the methods which are ordinary function objects.

In [12]:
Vector.dummy_class_variable

'I am a vector'

In [13]:
Vector.__init__

<function __main__.Vector.__init__(self, data)>

Every class object also comes with a special hidden `__name__` attribute that tells us the name of the class. This is usually the same as the variable `Vector` which points at the class object. At first, this seems kind of redundant but we will see a usage for it at the end of this notebook when we learn about inheritance.

In [14]:
Vector.__name__

'Vector'

To **create an instance** object of type `Vector`, we **call a class** object just like a function with the `(...)` operator. This is forwarded to the `__init__()` magic method behind the scenes.

In [15]:
v = Vector([1, 2, 3])

Note that although the `__init__()` method above defines two parameters, we have to call it with exactly one argument (that must be an iterable of at least one element) as Python implicitly inserts the newly created instance object (i.e., `v`) as the first argument.

In [16]:
Vector()

TypeError: __init__() missing 1 required positional argument: 'data'

In [17]:
Vector(1, 2, 3)

TypeError: __init__() takes 2 positional arguments but 4 were given

In [18]:
Vector(())

ValueError: A vector must have at least one entry!
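The implicit insertion of the instance as the first argument, described above, can be made visible by looking a method up on the class and passing the instance explicitly. The following is a minimal sketch with a hypothetical `Greeter` class that is not part of this notebook's linear algebra example.

```python
class Greeter:
    """Hypothetical toy class, used only to illustrate method binding."""

    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return "{}, {}!".format(greeting, self.name)


g = Greeter("Ada")

# Calling the method on the instance: Python passes `g` as `self` implicitly.
implicit = g.greet("Hello")

# Looking the function up on the class: we pass the instance ourselves.
explicit = Greeter.greet(g, "Hello")

print(implicit)              # Hello, Ada!
print(implicit == explicit)  # True
```

Both calls run the same function; the first form is simply the idiomatic way to write the second.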

`v` is a full-fledged object as well. In general, as the overarching storyline of all notebooks so far indicates, everything in Python is an object.

In [19]:
id(v)

139932827657104

In [20]:
type(v)

__main__.Vector

Every instance object comes with a special instance attribute `__class__` that also references the corresponding class object.

In [21]:
v.__class__

__main__.Vector

When we evaluate `v` in the command line, we actually see the return value of the `__repr__()` magic method. How it should be used is specified in the [Language Reference](https://docs.python.org/3/reference/datamodel.html#object.__repr__). In brief, `__repr__()` should return a string that, when entered in the Python command line, creates a new instance object with the same state as the original one we evaluated. In other words, it should return a string representation of the object optimized for direct consumption by a machine. This is often useful in debugging or logging scenarios in large applications.

In [22]:
v

Vector((1.000, 2.000, 3.000))

To make this magic more explicit, we could alternatively call the [repr()](https://docs.python.org/3/library/functions.html#repr) built-in function with `v` as the argument (note that the quotes in the previous cell are removed by Jupyter).

In [23]:
repr(v)

'Vector((1.000, 2.000, 3.000))'

On the contrary, the `__str__()` magic method should return a human-readable representation of the object according to the [Language Reference](https://docs.python.org/3/reference/datamodel.html#object.__str__). We can use the built-ins [str()](https://docs.python.org/3/library/functions.html#func-str) or [print()](https://docs.python.org/3/library/functions.html#print) to retrieve this string explicitly. As implemented above, this representation only shows a vector's first and last entries followed by the vector's length in brackets.

In [24]:
str(v)

'Vector(1.0, ..., 3.0)[3]'

Note that the [print()](https://docs.python.org/3/library/functions.html#print) function by default does not show the enclosing quotes.

In [25]:
print(v)

Vector(1.0, ..., 3.0)[3]
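The same division of labor between `repr()` and `str()` can be observed on types from the Standard Library. The sketch below uses `datetime.date` purely as an illustration; it is not used elsewhere in this notebook.

```python
import datetime

d = datetime.date(2020, 1, 31)

machine_readable = repr(d)  # 'datetime.date(2020, 1, 31)'
human_readable = str(d)     # '2020-01-31'

# Evaluating the repr string recreates an object with the same state ...
clone = eval(machine_readable)

# ... that compares equal to the original but is a distinct object.
print(clone == d, clone is d)  # True False
```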


Below is a first implementation of the `Matrix` class that stores the data internally as a list of lists.

In [26]:
class Matrix:
    """A standard m-by-n-dimensional matrix from linear algebra.

    Note that all the entries are converted to floats.
    """

    def __init__(self, data):
        """Initiate a new matrix.

        Args:
            data (iterable of iterables): The matrix's entries.
                Must be provided with rows first, then columns.
                The number of column entries must be consistent per row
                    while the first row sets the correct number.
                Must have at least one element in total.
        Raises:
            TypeError: If the number of columns is inconsistent across the rows.
            ValueError: If the provided data do not have enough entries.
        """
        self._entries = list(list(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  len(self._entries[0]):
                raise TypeError("Each row must have the same number of entries!")
        if len(self._entries) == 0:
            raise ValueError("A matrix must have at least one entry!")

    def __repr__(self):
        return ("Matrix((" + ", ".join("(" + ", ".join("{:.3f}".format(col) for col in row)
                                       + ",)" for row in self._entries) + "))")

    def __str__(self):
        return "Matrix(({:.1f}, ...), ..., (..., {:.1f}))[{:d}x{:d}]".format(
                   self._entries[0][0], self._entries[-1][-1],
                   len(self._entries), len(self._entries[0]))

In [27]:
id(Matrix)

35736040

In [28]:
type(Matrix)

type

In [29]:
Matrix

__main__.Matrix

Let's create a new `Matrix` instance from a list of tuples. The string representations work as above.

In [30]:
m = Matrix([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

In [31]:
id(m)

139932827481760

In [32]:
type(m)

__main__.Matrix

In [33]:
m

Matrix(((1.000, 2.000, 3.000,), (4.000, 5.000, 6.000,), (7.000, 8.000, 9.000,)))

In [34]:
print(m)

Matrix((1.0, ...), ..., (..., 9.0))[3x3]


Wrong usage of `Matrix` results in the documented exceptions.

In [35]:
Matrix()

TypeError: __init__() missing 1 required positional argument: 'data'

In [36]:
Matrix(())

ValueError: A matrix must have at least one entry!

In [37]:
Matrix([(1, 2, 3), (4, 5)])

TypeError: Each row must have the same number of entries!

## Computed Properties

After its creation, an instance may exhibit certain properties that depend on the concrete data it represents but are not captured explicitly by the defined class or instance attributes. For example, a `Matrix` instance can be thought of as having `n_rows` and `n_cols` properties that result from the individual entries passed to it upon creation and are integers representing its dimensions $m$ and $n$.

Python provides a [property()](https://docs.python.org/3/library/functions.html#property) built-in that can be used together with a special `@` syntax (covered in a later notebook on so-called decorators) to create **derived** attributes that are computed from an object's current state as shown below. Note that we already use the new properties in the `__init__()` method.

In [38]:
class Matrix:

    def __init__(self, data):
        self._entries = list(list(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  self.n_cols:
                raise TypeError("Each row must have the same number of entries!")
        if self.n_rows == 0:
            raise ValueError("A matrix must have at least one entry!")

    def __repr__(self):
        return ("Matrix((" + ", ".join("(" + ", ".join("{:.3f}".format(col) for col in row)
                                       + ",)" for row in self._entries) + "))")

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

In [39]:
m = Matrix([(1, 2, 3), (4, 5, 6)])

`m` is a $2$ by $3$ matrix.

In [40]:
m.n_rows, m.n_cols

(2, 3)

In their basic form, properties are read-only attributes. This makes sense for `Matrix` instances as we cannot just redefine how many rows and columns there are while keeping the entries unchanged.

In [41]:
m.n_rows = 3

AttributeError: can't set attribute

A later notebook on so-called descriptors will extend the nature of properties into managed attributes.
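To illustrate that a property is recomputed from the object's current state on every access, consider the following sketch. The `Rectangle` class is a hypothetical toy example and not part of the `Matrix` implementation.

```python
class Rectangle:
    """Hypothetical toy class illustrating a derived, read-only attribute."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed from the current state each time it is accessed.
        return self.width * self.height


r = Rectangle(2, 3)
area_before = r.area  # 6

r.width = 10          # changing the state ...
area_after = r.area   # ... changes the derived value to 30

try:
    r.area = 99       # no setter is defined, so the assignment fails
except AttributeError:
    read_only = True
else:
    read_only = False

print(area_before, area_after, read_only)  # 6 30 True
```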

## Instance Methods & Class Methods

In addition to the magic methods so far, we can define normal **instance methods** and **class methods** (the latter using the `@` syntax from above with the [classmethod()](https://docs.python.org/3/library/functions.html#classmethod) built-in).

Instance methods depend on the state (i.e., entries) of a concrete instance object. We can see this as instance methods usually need to access some attributes on the `self` variable in order to perform their work. If a method does not need access to `self`, it might conceptually not be an instance method to start with. An example of an instance method is `transpose()` below that switches the rows and columns of a matrix and returns a new `Matrix` instance based off that.

Class methods on the contrary do not depend on the state of a concrete instance but on the class object instead. Python implicitly inserts a reference to the latter whenever such a method is invoked. By convention, we name this parameter `cls`. Class methods are commonly used to provide an alternative way to create new instance objects. In the example below, `from_columns()` expects an iterable of columns (instead of rows) and creates a new `Matrix` instance.

There is also a [staticmethod()](https://docs.python.org/3/library/functions.html#staticmethod) built-in to define methods that are independent from both the class and instance objects but nevertheless are related semantically to the class somehow.

In [42]:
class Matrix:

    def __init__(self, data):
        self._entries = list(list(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  self.n_cols:
                raise TypeError("Each row must have the same number of entries!")
        if self.n_rows == 0:
            raise ValueError("A matrix must have at least one entry!")

    def __repr__(self):
        return ("Matrix((" + ", ".join("(" + ", ".join("{:.3f}".format(col) for col in row)
                                       + ",)" for row in self._entries) + "))")

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

    def transpose(self):
        # Row `col` of the transpose is column `col` of the original matrix.
        return self.__class__(((self._entries[row][col] for row in range(self.n_rows))
                               for col in range(self.n_cols)))

    @classmethod
    def from_columns(cls, data):
        return cls(data).transpose()

In [43]:
m = Matrix([(1, 2, 3), (4, 5, 6)])

In [44]:
m

Matrix(((1.000, 2.000, 3.000,), (4.000, 5.000, 6.000,)))

The `transpose()` method returns a new instance object where rows and columns are switched. The method can be chained, and calling it twice undoes the switching.

In [45]:
m.transpose()

Matrix(((1.000, 4.000,), (2.000, 5.000,), (3.000, 6.000,)))

In [46]:
n = m.transpose().transpose()

In [47]:
n

Matrix(((1.000, 2.000, 3.000,), (4.000, 5.000, 6.000,)))

However, the instance `n` is a different object than the original instance `m`.

In [48]:
m is n

False

Surprisingly, the equality operator `==` returns `False` although `m` and `n` are matrices with the exact same entries. The reason is that, by default, `==` on instances of user-defined classes compares their identity. We will correct this in the section on operator overloading below.

In [49]:
m == n

False

We can use the alternative constructor `from_columns()` to create an equivalent matrix from a list of columns instead of rows.

In [50]:
m = Matrix.from_columns([(1, 4), (2, 5), (3, 6)])

In [51]:
m

Matrix(((1.000, 2.000, 3.000,), (4.000, 5.000, 6.000,)))
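The `staticmethod()` built-in mentioned above is not used in the `Matrix` class. As a sketch of how the three kinds of methods differ, consider the hypothetical `Temperature` class below; its name and methods are assumptions made only for this illustration.

```python
class Temperature:
    """Hypothetical toy class contrasting the three kinds of methods."""

    def __init__(self, celsius):
        self.celsius = float(celsius)

    def to_fahrenheit(self):
        # Instance method: depends on the state of `self`.
        return self.celsius * 9 / 5 + 32

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Class method: an alternative constructor that receives the class.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_valid(celsius):
        # Static method: needs neither `self` nor `cls`.
        return celsius >= -273.15


t = Temperature.from_fahrenheit(212)

print(t.celsius)                   # 100.0
print(t.to_fahrenheit())           # 212.0
print(Temperature.is_valid(-300))  # False
```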

## Sequence Emulation

A *sequence* type $s$ is any type that allows indexing for an integer $i$ where $0 \le i \lt$ [len($s$)](https://docs.python.org/3/library/functions.html#len).

In order to make `Vector` and `Matrix` instances behave like sequences that can be indexed and iterated over, we need to define the `__len__()` and `__getitem__()` magic methods that are part of the [Language Reference](https://docs.python.org/3/reference/datamodel.html#emulating-container-types)'s documentation on container types. While the former calculates the number of elements in a container type, the latter implements the indexing operator `[...]`.

Note that we use the [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) built-in function to validate the type of the `index` parameter and that more background on this function is given in the section on inheritance further below. Also, we implicitly use the `__len__()` method inside the `__init__()` method already with `len(self)`.

In [52]:
class Vector:

    def __init__(self, data):
        self._entries = list(float(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

Now we can obtain the number of elements with the [len()](https://docs.python.org/3/library/functions.html#len) built-in function, index into `Vector` objects, and also iterate over them with a `for` statement.

In [53]:
v = Vector([1, 2, 3, 4])

In [54]:
len(v)

4

In [55]:
v[0]

1.0

In [56]:
for entry in v:
    print(entry, end="  ")

1.0  2.0  3.0  4.0  

Note that indexing so far is a read-only operation.

In [57]:
v[0] = 99

TypeError: 'Vector' object does not support item assignment
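The `for` loop above works even though `Vector` defines no dedicated iteration method: when a class only provides `__getitem__()`, Python falls back to requesting the indices 0, 1, 2, and so on until an `IndexError` is raised. The sketch below makes this visible with a hypothetical `Squares` class that records every index it is asked for.

```python
class Squares:
    """Hypothetical toy sequence holding the first `n` square numbers."""

    def __init__(self, n):
        self._n = n
        self.requested = []  # records the indices Python asks for

    def __len__(self):
        return self._n

    def __getitem__(self, index):
        self.requested.append(index)
        if not 0 <= index < self._n:
            raise IndexError("index out of range")
        return index ** 2


s = Squares(3)
values = [x for x in s]

print(values)       # [0, 1, 4]
print(s.requested)  # [0, 1, 2, 3]
```

The last requested index, 3, triggers the `IndexError` that ends the iteration. In `Vector`, the embedded list raises this exception on our behalf.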

Observe that as a matrix is two-dimensional, we need to choose if we first flatten the rows or columns. We choose the approach where we first iterate over the first row, then the second, and so on, which is called a **[row major approach](https://en.wikipedia.org/wiki/Row-_and_column-major_order)** in the software engineering literature.

In addition to indexing by integer, we also implement indexing by tuples of integers for the `Matrix` class where the first item is an integer representing the row and the second representing the column. Deciding what to do inside a function or method dependent on the type of an argument is known as **type dispatching**.

Also, note how the `__len__()` method depends on the properties `n_rows` and `n_cols`. In general, it is often the case that the (magic) methods are interwoven.

In [58]:
class Matrix:

    def __init__(self, data):
        self._entries = list(list(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  self.n_cols:
                raise TypeError("Each row must have the same number of entries!")
        # Check `n_rows` first: `len(self)` would raise an IndexError for empty data.
        if self.n_rows == 0 or len(self) == 0:
            raise ValueError("A matrix must have at least one entry!")

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

    def __len__(self):
        return self.n_rows * self.n_cols

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:
                index += len(self)
            if not (0 <= index < len(self)):
                raise IndexError("integer index out of range!")
            row, col = divmod(index, self.n_cols)
            return self._entries[row][col]
        elif (isinstance(index, tuple) and len(index) == 2
            and isinstance(index[0], int) and isinstance(index[1], int)
        ):
            return self._entries[index[0]][index[1]]
        raise TypeError("index must be either an integer or a tuple of two integers!")

In [59]:
m = Matrix([(1, 1), (4, 4)])

In [60]:
len(m)

4

In [61]:
m[0]

1.0

In [62]:
m[-1]

4.0

In [63]:
m[1, 1]

4.0

In [64]:
for entry in m:
    print(entry, end="  ")

1.0  1.0  4.0  4.0  
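The translation of a flat integer index into a row and a column in `__getitem__()` relies on the built-in `divmod()`. The following sketch shows the resulting row-major order for an assumed 2 by 3 matrix; the variable names are chosen only for this illustration.

```python
n_rows, n_cols = 2, 3

# Map every flat index to its (row, column) pair in row-major order.
pairs = [divmod(index, n_cols) for index in range(n_rows * n_cols)]
print(pairs)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]

# A negative flat index is first shifted by the total number of entries.
index = -1
index += n_rows * n_cols
last = divmod(index, n_cols)
print(last)  # (1, 2)
```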

## Mutability vs. Immutability

In the above implementation we store the entries of a `Vector` or `Matrix` instance in either a list or a list of lists in the instance attribute `_entries`, which is by the convention of the leading underscore an implementation detail and should not be directly accessed via the dot operator `.` from the outside of the instance (i.e., anything but `self._entries` is not allowed).

So long as we adhere to this convention, `Vector` and `Matrix` instances are immutable. If we do not want to rely on the users of our classes to just adhere to our "soft" rules, we could use tuples or tuples of tuples instead and thereby enforce the immutable nature.

In line with that, we implemented the `transpose()` method such that it creates and returns a new instance. We could have chosen a different path and made the method change the `self._entries` attribute in place behind the scenes. Such decisions are better made consciously when designing a new data type. The main trade-off in the context of data science applications is that immutable data types are typically easier to reason about when it comes to issues such as code correctness, whereas mutable data types are more memory efficient and make algorithms faster as fewer copying operations take place in memory. However, this trade-off only becomes critical when we deal with large amounts of data.
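The difference between the "soft" convention and enforced immutability comes down to the container holding the entries. As a minimal sketch with plain built-in types (the variable names are made up for this illustration):

```python
entries_as_list = [1.0, 2.0, 3.0]
entries_as_tuple = tuple(entries_as_list)

# A list can be changed in place by anyone who holds a reference to it.
entries_as_list[0] = 99.0

# A tuple refuses the same assignment.
try:
    entries_as_tuple[0] = 99.0
except TypeError as error:
    message = str(error)

print(entries_as_list)   # [99.0, 2.0, 3.0]
print(entries_as_tuple)  # (1.0, 2.0, 3.0)
print(message)           # 'tuple' object does not support item assignment
```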

#### Customizing Indexing & Slicing

Analogous to the `__getitem__()` magic method above, there are also the `__setitem__()` and `__delitem__()` magic methods that either assign to or delete (with the `del` statement) an index of a container type (see the full documentation in the [Language Reference](https://docs.python.org/3/reference/datamodel.html#emulating-container-types)). Whereas deleting an individual entry in a `Vector` or `Matrix` instance does not make sense semantically, we could allow changing an entry like in the example below (only shown for the `Vector` case for brevity). While the example only shows the indexing case, everything described in this sub-section can also be generalized to the slicing case.

Consequently, we can design user-defined data types to be explicitly mutable as well.

In the example, `__setitem__()` just "forwards" the assignment to the embedded list. This is a design principle known as [delegation](https://en.wikipedia.org/wiki/Delegation_%28object-oriented_programming%29) in the software engineering literature.

In [65]:
class Vector:

    def __init__(self, data):
        self._entries = list(float(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

    def __setitem__(self, index, value):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        self._entries[index] = float(value)  # keep all entries as floats

    def __repr__(self):
        return "Vector((" + ", ".join("{:.3f}".format(x) for x in self) + "))"

In [66]:
v = Vector([99, 2, 3, 4])

In [67]:
v

Vector((99.000, 2.000, 3.000, 4.000))

`v` can now be changed in place.

In [68]:
v[0] = 1

In [69]:
v

Vector((1.000, 2.000, 3.000, 4.000))

#### Customizing Attribute Access

Analogous to the previous sub-section, Python also allows us to customize the way getting, setting, and deleting an instance attribute via the dot operator `.` works. This is achieved with the `__getattr__()`, `__setattr__()`, `__delattr__()`, and `__getattribute__()` magic methods as described in the [Language Reference](https://docs.python.org/3/reference/datamodel.html#customizing-attribute-access) but not covered here due to their similarity with the content in the previous sub-section. [getattr()](https://docs.python.org/3/library/functions.html#getattr), [setattr()](https://docs.python.org/3/library/functions.html#setattr), [hasattr()](https://docs.python.org/3/library/functions.html#hasattr), and [delattr()](https://docs.python.org/3/library/functions.html#delattr) are the corresponding built-in functions that deal with the aforementioned magic methods on instances, similar to how the indexing operator `[...]` does in the previous sub-section.
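As a brief sketch of the built-in functions only (the magic methods themselves are left to the Language Reference), the following uses a hypothetical, empty `Namespace` class.

```python
class Namespace:
    """Hypothetical empty class used only to hold attributes."""


ns = Namespace()

setattr(ns, "answer", 42)                # same as: ns.answer = 42
value = getattr(ns, "answer")            # same as: ns.answer
missing = getattr(ns, "question", None)  # default instead of an AttributeError

print(value, missing)                    # 42 None
print(hasattr(ns, "question"))           # False

delattr(ns, "answer")                    # same as: del ns.answer
print(hasattr(ns, "answer"))             # False
```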

## Polymorphic Functions

A function is considered **polymorphic** if it can work with several data types. The main advantage is re-use of the function's code.

We know such functions already: The [sum()](https://docs.python.org/3/library/functions.html#sum) built-in is a trivial example that works on all kinds of iterables. As we implemented the `Vector` and `Matrix` classes to be iterable, we can use them here too.

In [70]:
sum((1, 2, 3, 4))

10

In [71]:
sum([1, 2, 3, 4])

10

In [72]:
sum({1, 2, 3, 4})

10

In [73]:
sum({1: 996, 2: 997, 3: 998, 4: 999})

10

In [74]:
sum(v)

10.0

In [75]:
sum(m)

10.0

A polymorphic function with a semantic meaning in the context of linear algebra would be one that calculates the [Euclidean norm](https://en.wikipedia.org/wiki/Norm_%28mathematics%29#Euclidean_norm) for vectors, which is basically a generalization of the popular [Pythagorean theorem](https://en.wikipedia.org/wiki/Pythagorean_theorem). Extending the same computation to a matrix results in the even more general [Frobenius norm](https://en.wikipedia.org/wiki/Matrix_norm#Frobenius_norm):

$$\lVert \bf{X} \rVert_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n x_{ij}^2 }$$

The `norm` function below can handle both a `Vector` and a `Matrix` object and is therefore polymorphic.

In [76]:
from math import sqrt

def norm(vector_or_matrix):
    """Calculate the Frobenius or Euclidean norm of a matrix or vector."""
    return sqrt(sum(x ** 2 for x in vector_or_matrix))

In [77]:
norm(v)

5.477225575051661

In [78]:
norm(m)

5.830951894845301

An important criterion for whether different classes are compatible with each other, in the sense that they can be used by the same polymorphic function, is that they implement the same **interface**. Whereas many other programming languages formalize this [concept](https://en.wikipedia.org/wiki/Protocol_%28object-oriented_programming%29), in core Python the term refers to the rather loose idea that different classes define the same (public) attributes and methods and implement the various protocols behind the magic methods in the same way. However, the [abc](https://docs.python.org/3/library/abc.html) module in the Standard Library adds the possibility to force different classes to implement a pre-defined interface.
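How loose this notion of an interface is can be seen from the fact that `norm()` only requires its argument to be iterable and to yield numbers. The sketch below repeats the function from above so that it runs on its own and passes in objects that are neither `Vector` nor `Matrix` instances.

```python
from math import sqrt


def norm(vector_or_matrix):
    """Same computation as above: square root of the sum of squared entries."""
    return sqrt(sum(x ** 2 for x in vector_or_matrix))


# Neither argument is a `Vector` or `Matrix`; both merely support iteration.
from_list = norm([3, 4])
from_generator = norm(x for x in (6.0, 8.0))

print(from_list)       # 5.0
print(from_generator)  # 10.0
```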

## Representations of Data

> "If you change the way you look at things, the things you look at change."
> -- philosopher and personal coach [Dr. Wayne Dyer](https://en.wikipedia.org/wiki/Wayne_Dyer)

Sometimes it is helpful to view a vector as a matrix with either one row or one column. Conversely, such a matrix can always be interpreted as a vector again. Changing the representation of data can be viewed as changing an object's type in Python. To be precise, the `as_matrix()` and `as_vector()` methods below create new `Matrix` or `Vector` instances out of existing `Vector` or `Matrix` instances, respectively.

Note that from now on we use tuples and tuples of tuples to store the entries in the `self._entries` instance attribute to make our instances fully immutable.

In [79]:
class Vector:

    def __init__(self, data):
        self._entries = tuple(float(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

    def __repr__(self):
        return "Vector((" + ", ".join("{:.3f}".format(x) for x in self) + "))"

    def as_matrix(self, column=True):
        if column:
            return Matrix([x] for x in self)
        return Matrix([(x for x in self)])

In [80]:
class Matrix:

    def __init__(self, data):
        self._entries = tuple(tuple(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  self.n_cols:
                raise TypeError("Each row must have the same number of entries!")
        if len(self) == 0:
            raise ValueError("A matrix must have at least one entry!")

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

    def __len__(self):
        return self.n_rows * self.n_cols

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:
                index += len(self)
            if not (0 <= index < len(self)):
                raise IndexError("integer index out of range!")
            row, col = divmod(index, self.n_cols)
            return self._entries[row][col]
        elif (isinstance(index, tuple) and len(index) == 2
            and isinstance(index[0], int) and isinstance(index[1], int)
        ):
            return self._entries[index[0]][index[1]]
        raise TypeError("index must be either an integer or a tuple of two integers!")

    def __repr__(self):
        return ("Matrix((" + ", ".join("(" + ", ".join("{:.3f}".format(col) for col in row)
                                       + ",)" for row in self._entries) + "))")

    def as_vector(self):
        if not (self.n_rows == 1 or self.n_cols == 1):
            raise TypeError("One dimension (m or n) must be 1 so that the "
                            "matrix can be converted into a vector!")
        return Vector(x for x in self)

In [81]:
v = Vector([1, 2, 3])

In [82]:
v

Vector((1.000, 2.000, 3.000))

Let's interpret `v` as a column vector and create a matrix of dimension $3$ by $1$.

In [83]:
m = v.as_matrix()

In [84]:
m

Matrix(((1.000,), (2.000,), (3.000,)))

In [85]:
m.n_rows, m.n_cols

(3, 1)

By chaining `as_matrix()` and `as_vector()` we get a new `Vector` instance back that is equivalent to the given `v`.

In [86]:
v.as_matrix().as_vector()

Vector((1.000, 2.000, 3.000))

In the same way, we can also interpret `v` as a row vector and create a $1$ by $3$ matrix.

In [87]:
m = v.as_matrix(column=False)

In [88]:
m

Matrix(((1.000, 2.000, 3.000,)))

In [89]:
m.n_rows, m.n_cols

(1, 3)

In [90]:
v.as_matrix(column=False).as_vector()

Vector((1.000, 2.000, 3.000))

Interpreting a matrix as a vector only works if one of the two dimensions $m$ or $n$ is $1$.

In [91]:
m = Matrix([(1, 2), (3, 4)])

In [92]:
m.as_vector()

TypeError: One dimension (m or n) must be 1 so that the matrix can be converted into a vector!

## Operator Overloading

Using magic methods such as `__add__()`, `__sub__()`, `__mul__()`, or others, user-defined data types can emulate [numeric types](https://docs.python.org/3/reference/datamodel.html#emulating-numeric-types) in that instance objects can be added to or subtracted from one another or be multiplied together. We will use this to implement all the arithmetic rules from linear algebra.

The OOP concept behind this is **operator overloading** as first mentioned in the context of string concatenation. This is not limited to arithmetic operators as we will see in the sub-section on relational operators at the end of this section.

The expression `1 + 2.0` is translated by Python into an implicit method invocation of the form `(1).__add__(2.0)`. The parentheses are only needed because `1.` on its own would be read as a `float` literal. This is why all the magic methods behind binary operators take two arguments `self` and, by convention, `other`. To allow a binary operator to work with objects of different data types, Python expects the `1` object here to return `NotImplemented` (not to raise a `NotImplementedError`) if it does not know how to deal with the `2.0` object as an argument and then proceeds by invoking the *reverse* special method `(2.0).__radd__(1)`. With this protocol, one can create new data types that know how to execute arithmetic operators with old data types without having to change the old types.
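The protocol can be traced step by step with a toy example. The `Meters` class below is hypothetical and exists only for this illustration.

```python
class Meters:
    """Hypothetical toy type used only to trace the operator protocol."""

    def __init__(self, value):
        self.value = float(value)

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented  # signal that Python should try the other operand

    def __radd__(self, other):
        # Reached for `other + self` after `other.__add__(self)` gave up.
        return self + other


print((1).__add__(2.0))          # int does not know how to add a float
print((2.0).__radd__(1))         # float handles the reverse call
print((1).__add__(Meters(2)))    # int does not know Meters either
print((1 + Meters(2)).value)     # the operator falls back to Meters.__radd__()

try:
    Meters(1) + "two"            # neither operand can handle the other one
except TypeError as error:
    print(error)
```

Only when both the forward and the reverse method give up does Python raise the `TypeError`.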

Note that to implement [scalar multiplication](https://en.wikipedia.org/wiki/Scalar_multiplication) for our `Vector` and `Matrix` classes we need to be able to verify that the scalar is some sort of numeric value. We have seen `int` and `float` so far but Python knows a lot more numeric types. The Standard Library provides a [Number](https://docs.python.org/3/library/numbers.html#numbers.Number) class in the [numbers](https://docs.python.org/3/library/numbers.html#numbers) module that we can use for this validation.
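To get a feeling for which objects pass such a validation, the sketch below checks a handful of sample values against `Number`. The values are arbitrary examples.

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Number

candidates = (1, 2.5, 3 + 4j, True, Fraction(1, 3), Decimal("0.1"), "1", None)

for candidate in candidates:
    print(repr(candidate), isinstance(candidate, Number))
```

Note that `Number` is deliberately broad: it also accepts `bool` and `complex` objects, so a stricter check may be appropriate if only real-valued scalars make sense.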

In [93]:
from numbers import Number

In [94]:
class Vector:

    def __init__(self, data):
        self._entries = tuple(float(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

    def __repr__(self):
        return "Vector((" + ", ".join("{:.3f}".format(x) for x in self) + "))"

    def __add__(self, other):
        if isinstance(other, self.__class__):
            if len(self) != len(other):
                raise ValueError("The vectors need to be of the same length!")
            return self.__class__(x + y for (x, y) in zip(self, other))
        return NotImplemented

    def __radd__(self, other):
        return self + other

    def __sub__(self, other):
        if isinstance(other, self.__class__):
            if len(self) != len(other):
                raise ValueError("The vectors need to be of the same length!")
            return self.__class__(x - y for (x, y) in zip(self, other))
        return NotImplemented

    def __rsub__(self, other):
        return self - other

    def __mul__(self, other):
        if isinstance(other, Number):
            return self.__class__(x * other for x in self)
        return NotImplemented

    def __rmul__(self, other):
        return self * other

    def __truediv__(self, other):
        if isinstance(other, Number):
            return self.__class__(x / other for x in self)
        return NotImplemented

    def as_matrix(self, column=True):
        if column:
            return Matrix([x] for x in self)
        return Matrix([(x for x in self)])

In [95]:
v = Vector([1, 2, 3])

`__mul__()` and `__rmul__()` implement scalar multiplication. Vectors, however, cannot be multiplied with each other mathematically. In that case the magic method returns `NotImplemented`, which eventually leads Python to raise a `TypeError`.

In [96]:
2 * v

Vector((2.000, 4.000, 6.000))

In [97]:
v * 3

Vector((3.000, 6.000, 9.000))

In [98]:
v * v

TypeError: unsupported operand type(s) for *: 'Vector' and 'Vector'

`__truediv__()` implements the division operator `/` while `__floordiv__()` would implement the floor division operator `//`.

In [99]:
v / 3

Vector((0.333, 0.667, 1.000))

`__add__()`, `__radd__()`, `__sub__()`, and `__rsub__()` implement vector addition and subtraction according to standard linear algebra rules (i.e., both vectors must have the same length).

In [100]:
v + v

Vector((2.000, 4.000, 6.000))

In [101]:
v - v

Vector((0.000, 0.000, 0.000))

In [102]:
w = Vector([8, 9])

In [103]:
v + w

ValueError: The vectors need to be of the same length!

For `Matrix` objects the implementation is a bit more involved as we need to distinguish between matrix-matrix, matrix-vector, vector-matrix, and scalar multiplication and check for correct dimensions. To review the underlying rules, check this [article](https://en.wikipedia.org/wiki/Matrix_multiplication) or watch the video next.

In [104]:
from IPython.display import YouTubeVideo
YouTubeVideo("OMA2Mwo0aZg", width="60%")

Basically, a matrix-matrix multiplication of two matrices $\bf{A}$ and $\bf{B}$ with dimensions $m$ by $n$ and $n$ by $p$ can be described as follows: $\bf{C} = \bf{A} * \bf{B}$ , where $c_{ij} = \sum_{k=1}^{n} a_{ik} * b_{kj}$. The $c_{ij}$, $a_{ij}$, and $b_{ij}$ refer to the entries in row $i$ and column $j$ in the respective matrices. Note that it does make a difference if we multiply $\bf{A}$ with $\bf{B}$ from the right or left. For more background on how to implement this summation as an algorithm, check this [article](https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm).

When multiplying a matrix with a vector, we follow the common convention that a vector on the left is interpreted as a row vector and a vector on the right as a column vector. The vector's length has to match the matrix's corresponding dimension.
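Before wrapping the summation into a class, it may help to see it on plain nested lists. The helper `multiply()` below is a hypothetical stand-alone sketch, not the implementation used in the `Matrix` class.

```python
def multiply(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    m, n, p = len(a), len(b), len(b[0])
    if len(a[0]) != n:
        raise ValueError("The matrices need to have compatible dimensions!")
    # c[i][j] is the sum over k of a[i][k] * b[k][j]
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


a = [[1, 2, 3], [4, 5, 6]]                     # 2 by 3
b = [[7, 8, 9], [10, 11, 12], [13, 14, 15]]    # 3 by 3
c = multiply(a, b)                             # 2 by 3
print(c)

# A vector on the right acts as a column, one on the left as a row.
column = [[1], [2], [3]]                       # 3 by 1
row = [[1, 2, 3]]                              # 1 by 3
print(multiply(a, column))                     # 2 by 1
print(multiply(row, b))                        # 1 by 3
```

Swapping the operands to `multiply(b, a)` fails the dimension check, which illustrates that the order of the factors matters.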

In [105]:
class Matrix:

    def __init__(self, data):
        self._entries = tuple(tuple(float(col) for col in row) for row in data)
        for row in self._entries[1:]:
            if len(row) !=  self.n_cols:
                raise TypeError("Each row must have the same number of entries!")
        if len(self) == 0:
            raise ValueError("A matrix must have at least one entry!")

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

    def __len__(self):
        return self.n_rows * self.n_cols

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:
                index += len(self)
            if not (0 <= index < len(self)):
                raise IndexError("integer index out of range!")
            row, col = divmod(index, self.n_cols)
            return self._entries[row][col]
        elif (isinstance(index, tuple) and len(index) == 2
            and isinstance(index[0], int) and isinstance(index[1], int)
        ):
            return self._entries[index[0]][index[1]]
        raise TypeError("index must be either an integer or a tuple of two integers!")

    def __repr__(self):
        return ("Matrix((" + ", ".join("(" + ", ".join("{:.3f}".format(col) for col in row)
                                       + ",)" for row in self._entries) + "))")

    def __add__(self, other):
        if isinstance(other, self.__class__):
            if (self.n_rows != other.n_rows) or (self.n_cols != other.n_cols):
                raise ValueError("The matrices need to be of the same dimension!")
            return self.__class__((s_col + o_col for (s_col, o_col) in zip(s_row, o_row))
                                  for (s_row, o_row) in zip(self._entries, other._entries))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, Vector):  # needed to break an infinite recursion
            raise TypeError("Vectors and matrices cannot be added!")
        return self + other

    def __sub__(self, other):
        if isinstance(other, self.__class__):
            if (self.n_rows != other.n_rows) or (self.n_cols != other.n_cols):
                raise ValueError("The matrices need to be of the same dimension!")
            return self.__class__((s_col - o_col for (s_col, o_col) in zip(s_row, o_row))
                                  for (s_row, o_row) in zip(self._entries, other._entries))
        return NotImplemented

    def __rsub__(self, other):
        if isinstance(other, Vector):  # needed to break an infinite recursion
            raise TypeError("Vectors and matrices cannot be subtracted!")
        return self - other

    def _matrix_multiply(self, other):
        if self.n_cols != other.n_rows:
            raise ValueError("The matrices need to have compatible dimensions!")
        return Matrix((sum(((self[i,k] * other[k,j]) for k in range(self.n_cols)))
                       for j in range(other.n_cols)) for i in range(self.n_rows))

    def __mul__(self, other):
        if isinstance(other, Number):
            return self.__class__((x * other for x in row) for row in self._entries)
        elif isinstance(other, Vector):
            return self._matrix_multiply(other.as_matrix()).as_vector()
        elif isinstance(other, self.__class__):
            return self._matrix_multiply(other)
        return NotImplemented

    def __rmul__(self, other):
        if isinstance(other, Number):
            return self * other
        elif isinstance(other, Vector):
            return other.as_matrix(column=False)._matrix_multiply(self).as_vector()
        return NotImplemented

    def __truediv__(self, other):
        if isinstance(other, Number):
            return self.__class__((x / other for x in row) for row in self._entries)
        return NotImplemented

    def as_vector(self):
        if not (self.n_rows == 1 or self.n_cols == 1):
            raise TypeError("One dimension (m or n) must be 1 so that the "
                            "matrix can be converted into a vector!")
        return Vector(x for x in self)

In [106]:
m = Matrix([(1, 2, 3), (4, 5, 6)])
n = Matrix([(7, 8, 9), (10, 11, 12), (13, 14, 15)])

Scalar multiplication, addition, and subtraction work as before.

In [107]:
10 * m

Matrix(((10.000, 20.000, 30.000,), (40.000, 50.000, 60.000,)))

In [108]:
(2 * m + m * 3) / 5

Matrix(((1.000, 2.000, 3.000,), (4.000, 5.000, 6.000,)))

In [109]:
m - m

Matrix(((0.000, 0.000, 0.000,), (0.000, 0.000, 0.000,)))

Matrix-matrix multiplication works if the dimensions are compatible.

In [110]:
m * n

Matrix(((66.000, 72.000, 78.000,), (156.000, 171.000, 186.000,)))

In [111]:
n * m

ValueError: The matrices need to have compatible dimensions!

The same holds for matrix-vector and vector-matrix multiplication. These operations always return `Vector` instances in line with standard linear algebra.

In [112]:
m * v

Vector((14.000, 32.000))

In [113]:
v * n

Vector((66.000, 72.000, 78.000))

In [114]:
v * m

ValueError: The matrices need to have compatible dimensions!

#### Relational Operators

As we have seen above, two different `Vector` instances with the exact same entries do not compare equal. The reason is that for user-defined types Python by default only assumes two instances to be equal if they are actually the same object in memory. This is, of course, semantically wrong for vectors and matrices.

In [115]:
v = Vector((1, 2, 3))
w = Vector((1, 2, 3))

In [116]:
v == w

False

In [117]:
v == v

True

As described in the [Python Reference](https://docs.python.org/3/reference/datamodel.html#object.__eq__), we can implement the `__eq__()` magic method to control how the equality comparison is carried out. For brevity, we do this only for the `Vector` class below. The code for the `Matrix` class would be analogous. Note that the `__eq__()` method exits early as soon as a pair of entries does not match.

In [118]:
class Vector:

    def __init__(self, data):
        self._entries = tuple(float(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

    def __repr__(self):
        return "Vector((" + ", ".join("{:.3f}".format(x) for x in self) + "))"

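    def __eq__(self, other):
        if isinstance(other, self.__class__):
            # zip() stops at the shorter vector, so compare the lengths first.
            if len(self) != len(other):
                return False
            for x, y in zip(self, other):
                if x != y:
                    return False
            return True
        return NotImplemented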

In [119]:
v = Vector((1, 2, 3))
w = Vector((1, 2, 3))

In [120]:
v == w

True
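One side effect is worth knowing: a class that defines `__eq__()` without also defining `__hash__()` gets its `__hash__` attribute set to `None`, so its instances can no longer be used as dictionary keys or set members. The sketch below uses a hypothetical `Pair` class, not the `Vector` class above, to show this together with one possible remedy for immutable types.

```python
class Pair:
    """Hypothetical immutable pair of floats."""

    def __init__(self, a, b):
        self._entries = (float(a), float(b))

    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self._entries == other._entries
        return NotImplemented


class HashablePair(Pair):
    """Restores hashing in a way that is consistent with equality."""

    def __hash__(self):
        return hash(self._entries)


print(Pair(1, 2) == Pair(1, 2))
print(Pair.__hash__ is None)

try:
    {Pair(1, 2)}
except TypeError as error:
    print(error)

# Equal instances hash alike and collapse into one set member.
print(len({HashablePair(1, 2), HashablePair(1, 2)}))
```

Equal objects must have equal hash values, which is why the hash is derived from the same entries that `__eq__()` compares.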

## The Whole Picture

Below are the two final implementations of the `Vector` and `Matrix` classes that integrate all of the implementations introduced above. The code is cleaned up and documented, as it always should be. Furthermore, both classes are adapted to make it easier to **sub-class** them as we will see in detail in the next section.

What that means is that the `storage` and `typing` **class attributes** are added referencing the [tuple()](https://docs.python.org/3/library/functions.html#func-tuple) and [float()](https://docs.python.org/3/library/functions.html#float) built-ins. These class attributes are then used in the `__init__()` magic methods. Surprisingly, they can be used just like instance attributes on the `self` object directly. The reason for that is how **attribute lookup** works in the context of classes and instances. If an attribute is not defined on an instance object, Python first checks if it can fall back to the same name on the corresponding class object before it raises an `AttributeError`. If we want to refer to a class attribute explicitly, we could replace the `self.storage` and `self.typing` with `self.__class__.storage` and `self.__class__.typing` respectively.

In addition to the new class attributes, we replace the text strings "Vector" and "Matrix" in the `__repr__()` and `__str__()` magic methods with `self.__class__.__name__`, which Python evaluates to the name of the respective class as a text string. This way, we only use the names of the classes in the `class` statement header line, which avoids a bit of redundancy.
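The fallback from an instance to its class can be observed in isolation. The `Container` class below is a hypothetical stand-in that mimics only the `typing` attribute and the `__repr__()` method of our classes.

```python
class Container:
    """Hypothetical class used only to illustrate attribute lookup."""

    typing = float  # class attribute, shared by all instances

    def __init__(self, data):
        # `typing` is not found on the instance, so the class is consulted.
        self.entries = tuple(self.typing(x) for x in data)

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.entries)


c = Container([1, 2])
print(c)

# The instance's own namespace does not contain `typing`, the class's does.
on_instance = "typing" in vars(c)
on_class = "typing" in vars(Container)
print(on_instance, on_class)

# Assigning via the instance creates a new instance attribute that
# shadows the class attribute without changing it.
c.typing = int
print(c.typing is int, Container.typing is float)
```

This also shows why `self.__class__.typing` is the safer spelling whenever the class-level value is meant explicitly.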

In [121]:
class Vector:
    """A standard one-dimensional vector from linear algebra.

    The class is designed for sub-classing in such a way that
    the user can adapt the typing class attribute to change,
    for example, how the entries are stored (e.g., as integers).

    Attributes:
        storage (callable): Must return an iterable that is used
            to store the entries of the vector. Defaults to tuple.
        typing (callable): Type conversion applied to all vector
            entries upon creation. Defaults to float.
    """

    storage = tuple
    typing = float

    def __init__(self, data):
        """Initiate a new vector.

        Args:
            data (iterable): The vector's entries in the correct order.
                Must have at least one element.
        Raises:
            ValueError: If the provided data do not have enough entries.
        """
        self._entries = self.storage(self.typing(x) for x in data)
        if len(self) == 0:
            raise ValueError("A vector must have at least one entry!")

    def __repr__(self):
        args = ", ".join("{:.3f}".format(x) for x in self)
        return "{}(({}))".format(self.__class__.__name__, args)

    def __str__(self):
        entries = "({:.1f}, ..., {:.1f})".format(self[0], self[-1])
        length = "[{:d}]".format(len(self))
        return self.__class__.__name__ + entries + length

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        return self._entries[index]

    def __add__(self, other):
        if isinstance(other, self.__class__):
            if len(self) != len(other):
                raise ValueError("The vectors need to be of the same length!")
            return self.__class__(x + y for (x, y) in zip(self, other))
        return NotImplemented

    def __radd__(self, other):
        return self + other

    def __sub__(self, other):
        if isinstance(other, self.__class__):
            if len(self) != len(other):
                raise ValueError("The vectors need to be of the same length!")
            return self.__class__(x - y for (x, y) in zip(self, other))
        return NotImplemented

    def __rsub__(self, other):
        return self - other

    def __mul__(self, other):
        if isinstance(other, Number):
            return self.__class__(x * other for x in self)
        return NotImplemented

    def __rmul__(self, other):
        return self * other

    def __truediv__(self, other):
        if isinstance(other, Number):
            return self.__class__(x / other for x in self)
        return NotImplemented

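    def __eq__(self, other):
        if isinstance(other, self.__class__):
            # zip() stops at the shorter vector, so compare the lengths first.
            if len(self) != len(other):
                return False
            for x, y in zip(self, other):
                if x != y:
                    return False
            return True
        return NotImplemented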

    def as_matrix(self, column=True):
        """Convert the vector into a matrix.

        Args:
            column (bool): If the vector should be interpreted
                as a column vector or not. Defaults to True.
        Returns:
            Matrix
        """
        if column:
            return Matrix([x] for x in self)
        return Matrix([(x for x in self)])

In [122]:
class Matrix:
    """A standard m-by-n-dimensional matrix from linear algebra.

    The class is designed for sub-classing in such a way that
    the user can adapt the typing class attribute to change,
    for example, how the entries are stored (e.g., as integers).

    Attributes:
        storage (callable): Must return an iterable that is used
            to store the entries of the matrix. Defaults to tuple.
        typing (callable): Type conversion applied to all matrix
            entries upon creation. Defaults to float.
    """

    storage = tuple
    typing = float

    def __init__(self, data):
        """Initiate a new matrix.

        Args:
            data (iterable of iterables): The matrix's entries in row-major order.
                The number of column entries must be consistent per row
                    while the first row sets the correct number.
                Must have at least one element in total.
        Raises:
            TypeError: If the number of columns is inconsistent across the rows.
            ValueError: If the provided data do not have enough entries.
        """
        self._entries = self.storage(self.storage(self.typing(col) for col in row) for row in data)
        # Check for emptiness first: n_cols reads the first row, which does not exist for empty data.
        if self.n_rows == 0 or self.n_cols == 0:
            raise ValueError("A matrix must have at least one entry!")
        for row in self._entries[1:]:
            if len(row) != self.n_cols:
                raise TypeError("Each row must have the same number of entries!")

    @classmethod
    def from_columns(cls, data):
        """Initiate a new matrix from its columns.

        Args:
            data (iterable of iterables): The matrix's entries in column-major order.
                The number of row entries must be consistent per column
                while the first column sets the correct number.
                Must have at least one element in total.
        Raises:
            TypeError: If the number of rows is inconsistent across the columns.
            ValueError: If the provided data do not have enough entries.
        """
        # The columns are first read in as rows and then transposed.
        return cls(data).transpose()

    def __repr__(self):
        args = (", ".join("(" + ", ".join("{:.3f}".format(col) for col in row) + ",)"
                          for row in self._entries))
        return "{}(({}))".format(self.__class__.__name__, args)

    def __str__(self):
        entries = "(({:.1f}, ...), ..., (..., {:.1f}))".format(self[0], self[-1])
        dims = "[{:d}x{:d}]".format(self.n_rows, self.n_cols)
        return self.__class__.__name__ + entries + dims

    @property
    def n_rows(self):
        return len(self._entries)

    @property
    def n_cols(self):
        return len(self._entries[0])

    def __len__(self):
        return self.n_rows * self.n_cols

    def __getitem__(self, index):
        if isinstance(index, int):
            if index < 0:
                index += len(self)
            if not (0 <= index < len(self)):
                raise IndexError("integer index out of range!")
            row, col = divmod(index, self.n_cols)
            return self._entries[row][col]
        elif (isinstance(index, tuple) and len(index) == 2
            and isinstance(index[0], int) and isinstance(index[1], int)
        ):
            return self._entries[index[0]][index[1]]
        raise TypeError("index must be either an integer or a tuple of two integers!")


    def __add__(self, other):
        if isinstance(other, self.__class__):
            if (self.n_rows != other.n_rows) or (self.n_cols != other.n_cols):
                raise ValueError("The matrices need to be of the same dimension!")
            return self.__class__((s_col + o_col for (s_col, o_col) in zip(s_row, o_row))
                                  for (s_row, o_row) in zip(self._entries, other._entries))
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, Vector):  # needed to break an infinite recursion
            raise TypeError("Vectors and matrices cannot be added!")
        return self + other

    def __sub__(self, other):
        if isinstance(other, self.__class__):
            if (self.n_rows != other.n_rows) or (self.n_cols != other.n_cols):
                raise ValueError("The matrices need to be of the same dimension!")
            return self.__class__((s_col - o_col for (s_col, o_col) in zip(s_row, o_row))
                                  for (s_row, o_row) in zip(self._entries, other._entries))
        return NotImplemented

    def __rsub__(self, other):
        if isinstance(other, Vector):  # needed to break an infinite recursion
            raise TypeError("Vectors and matrices cannot be subtracted!")
        return self - other

    def _matrix_multiply(self, other):
        if self.n_cols != other.n_rows:
            raise ValueError("The matrices need to have compatible dimensions!")
        return self.__class__((sum(((self[i,k] * other[k,j]) for k in range(self.n_cols)))
                               for j in range(other.n_cols)) for i in range(self.n_rows))

    def __mul__(self, other):
        if isinstance(other, Number):
            return self.__class__((x * other for x in row) for row in self._entries)
        elif isinstance(other, Vector):
            return self._matrix_multiply(other.as_matrix()).as_vector()
        elif isinstance(other, self.__class__):
            return self._matrix_multiply(other)
        return NotImplemented

    def __rmul__(self, other):
        if isinstance(other, Number):
            return self * other
        elif isinstance(other, Vector):
            return other.as_matrix(column=False)._matrix_multiply(self).as_vector()
        return NotImplemented

    def __truediv__(self, other):
        if isinstance(other, Number):
            return self.__class__((x / other for x in row) for row in self._entries)
        return NotImplemented

    def __floordiv__(self, other):
        if isinstance(other, Number):
            return self.__class__((x // other for x in row) for row in self._entries)
        return NotImplemented

    def as_vector(self):
        """Convert the matrix into a vector.

        Returns:
            Vector
        Raises:
            TypeError: If not one of the two dimensions is 1.
        """
        if not (self.n_rows == 1 or self.n_cols == 1):
            raise TypeError("One dimension (m or n) must be 1 so that the "
                            "matrix can be converted into a vector!")
        return Vector(x for x in self)

    def transpose(self):
        """Transpose the rows and columns of the matrix.

        Returns:
            Matrix
        """
        return self.__class__(((self._entries[row][col] for row in range(self.n_rows))
                               for col in range(self.n_cols)))

Let's do some math with bigger matrices and vectors.

In [123]:
import random

In [124]:
random.seed(42)

We populate the matrix with random numbers in the range between $0$ and $1000$.

In [125]:
m = Matrix((1000 * random.random() for _ in range(50)) for _ in range(100))

We quickly lose track of all the numbers in the matrix, which is why we implemented the `__str__()` method as a summary representation. Observe that the name "Matrix" is printed although we do not explicitly use it in `__repr__()` and `__str__()`.

In [126]:
m

Matrix(((639.427, 25.011, 275.029, 223.211, 736.471, 676.699, 892.180, 86.939, 421.922, 29.797, 218.638, 505.355, 26.536, 198.838, 649.884, 544.941, 220.441, 589.266, 809.430, 6.499, 805.819, 698.139, 340.251, 155.479, 957.213, 336.595, 92.746, 96.716, 847.494, 603.726, 807.128, 729.732, 536.228, 973.116, 378.534, 552.041, 829.405, 618.520, 861.707, 577.352, 704.572, 45.824, 227.898, 289.388, 79.792, 232.791, 101.001, 277.974, 635.684, 364.832,), (370.181, 209.507, 266.978, 936.655, 648.035, 609.131, 171.139, 729.127, 163.402, 379.455, 989.523, 640.000, 556.950, 684.614, 842.852, 776.000, 229.048, 32.100, 315.453, 267.741, 210.983, 942.910, 876.368, 314.678, 655.439, 395.632, 914.548, 458.852, 264.880, 246.628, 561.368, 262.742, 584.586, 897.823, 399.401, 219.321, 997.538, 509.526, 90.909, 47.116, 109.649, 627.446, 792.079, 422.160, 63.528, 381.619, 996.121, 529.114, 971.078, 860.780,), (11.481, 720.722, 681.710, 536.970, 266.825, 640.962, 111.552, 434.765, 453.724, 953.816, 875.853, 2

In [127]:
print(m)

Matrix((639.4, ...), ..., (..., 353.9))[100x50]
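The division of labor between the two text representations can be sketched with a small example. The `Temperature` class is hypothetical and serves only to make the difference visible.

```python
class Temperature:
    """Hypothetical class with both text representations."""

    def __init__(self, degrees):
        self.degrees = float(degrees)

    def __repr__(self):
        # Detailed representation, shown when a cell ends with the object.
        return "{}({:.3f})".format(self.__class__.__name__, self.degrees)

    def __str__(self):
        # Short summary, used by print() and str().
        return "{:.1f} degrees".format(self.degrees)


t = Temperature(21.456)
print(repr(t))
print(str(t))
print([t])  # containers use __repr__() for their elements
```

If a class defines only `__repr__()`, then `print()` and `str()` fall back to it.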


In [128]:
v = Vector(1000 * random.random() for _ in range(50))

In [129]:
v

Vector((129.713, 562.634, 519.706, 631.858, 492.504, 179.907, 609.406, 708.587, 979.258, 1.581, 23.987, 625.461, 117.926, 848.070, 799.564, 998.987, 414.041, 333.792, 560.416, 637.504, 11.297, 201.187, 281.627, 790.196, 307.773, 506.690, 323.924, 6.131, 685.836, 341.362, 724.397, 615.993, 29.117, 175.629, 330.515, 337.937, 672.473, 916.163, 797.254, 645.652, 481.496, 627.200, 892.058, 536.968, 335.110, 783.989, 413.953, 742.585, 835.106, 299.344))

In [130]:
print(v)

Vector(129.7, ..., 299.3)[50]


The arithmetic works as before.

In [131]:
w = m * v

In [132]:
print(w)

Vector(11378937.3, ..., 13593029.3)[100]


We can multiply `m` with its transpose or the other way round.

In [133]:
n = m * m.transpose()

In [134]:
print(n)

Matrix((14370711.3, ...), ..., (..., 16545418.2))[100x100]


In [135]:
o = m.transpose() * m

In [136]:
print(o)

Matrix((32618511.5, ...), ..., (..., 32339164.8))[50x50]


## Inheritance

The last important OOP building block is the ability to create new **child** classes that **inherit** their attributes and methods from **parent** classes. This idea is also often called **sub-classing**. The main advantage here is re-use of code. A second benefit is the ability to get a clearer picture of how different parts of an application relate to each other from a generalization vs. specialization point of view.

For example, let us create new child classes `IntegerVector` and `IntegerMatrix` that can only store integers. Note that in CPython this does not by itself reduce memory usage, as an individual `int` object is not smaller than a `float` object. The syntax for this is easy: All we have to do is to reference the parent class in parentheses in the `class` statement's header line. Then, we adjust the `typing` class attributes that control how the new classes convert their entries. To also make the string representations a bit nicer, we replace the `{:.1f}` and `{:.3f}` with `{:d}` in the `__repr__()` and `__str__()` methods so that all the entries are shown as pure integers (i.e., without the ".000"s). Generally, if we define an attribute or method on a child class that is also defined on its parent class, it is overridden by the child. Lastly, we need to also override the `as_matrix()` and `as_vector()` methods so that they return instances of the new child classes as well.

In [137]:
class IntegerVector(Vector):
    """A standard one-dimensional vector from linear algebra.

    Instances of this class store all entries as integers.
    """

    typing = int

    def __repr__(self):
        args = ", ".join("{:d}".format(x) for x in self)
        return "{}(({}))".format(self.__class__.__name__, args)

    def __str__(self):
        entries = "({:d}, ..., {:d})".format(self[0], self[-1])
        length = "[{:d}]".format(len(self))
        return self.__class__.__name__ + entries + length

    def as_matrix(self, column=True):
        """Convert the vector into a matrix.

        Args:
            column (bool): If the vector should be interpreted
                as a column vector or not. Defaults to True.
        Returns:
            IntegerMatrix
        """
        if column:
            return IntegerMatrix([x] for x in self)
        return IntegerMatrix([(x for x in self)])

In [138]:
class IntegerMatrix(Matrix):
    """A standard two-dimensional matrix from linear algebra.

    Instances of this class store all entries as integers.
    """

    typing = int

    def __repr__(self):
        args = (", ".join("(" + ", ".join("{:d}".format(col) for col in row) + ",)"
                          for row in self._entries))
        return "{}(({}))".format(self.__class__.__name__, args)

    def __str__(self):
        entries = "(({:d}, ...), ..., (..., {:d}))".format(self[0], self[-1])
        dims = "[{:d}x{:d}]".format(self.n_rows, self.n_cols)
        return self.__class__.__name__ + entries + dims

    def as_vector(self):
        """Convert the matrix into a vector.

        Returns:
            IntegerVector
        Raises:
            TypeError: If neither of the two dimensions is 1.
        """
        if not (self.n_rows == 1 or self.n_cols == 1):
            raise TypeError("One dimension (m or n) must be 1 so that the "
                            "matrix can be converted into a vector!")
        return IntegerVector(x for x in self)

Here we also see the generalizing effect of `self.__class__.__name__` in the parent classes' `__repr__()` and `__str__()` methods: the text representations automatically show the name of the child class.
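
The lookup behind `self.__class__.__name__` can be seen in isolation with a minimal sketch. The classes `Shape` and `Circle` are hypothetical and exist only for this illustration; they are not part of the notebook's linear algebra code.

```python
class Shape:
    """Hypothetical parent class, used only for this illustration."""

    def __repr__(self):
        # The name is looked up on the instance's actual class at runtime,
        # not on the class in which the method was written.
        return "{}()".format(self.__class__.__name__)


class Circle(Shape):
    """Hypothetical child class that inherits __repr__() unchanged."""


print(repr(Shape()))   # Shape()
print(repr(Circle()))  # Circle()
```

Had `Shape.__repr__()` hard-coded the string `"Shape()"` instead, every child class would have to override the method just to correct the name.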

In [139]:
m = IntegerMatrix((1000 * random.random() for _ in range(5)) for _ in range(10))

In [140]:
m

IntegerMatrix(((724, 315, 535, 208, 685,), (799, 888, 353, 969, 280,), (31, 983, 626, 842, 570,), (389, 595, 864, 751, 705,), (314, 31, 413, 497, 237,), (451, 950, 216, 302, 112,), (784, 777, 913, 443, 24,), (482, 500, 190, 923, 732,), (523, 775, 143, 871, 822,), (938, 676, 132, 724, 459,)))

In [141]:
print(m)

IntegerMatrix((724, ...), ..., (..., 459))[10x5]


In [142]:
v = IntegerVector(1000 * random.random() for _ in range(5))

In [143]:
v

IntegerVector((713, 522, 4, 932, 687))

The arithmetic operations work as before.

In [144]:
m * v

IntegerVector((1347233, 2130103, 1714067, 1775670, 867739, 1176735, 1397602, 1968546, 2154507, 2012295))

If we are ever in doubt whether a class is a sub-class or child of another class, we can check with the [issubclass()](https://docs.python.org/3/library/functions.html#issubclass) built-in.

In [145]:
issubclass(IntegerVector, Vector)

True

In [146]:
issubclass(IntegerMatrix, Vector)

False

Also, the [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) built-in is aware of the inheritance when validating an instance's type whereas the [type()](https://docs.python.org/3/library/functions.html#type) built-in is not.

In [147]:
isinstance(v, IntegerVector)

True

In [148]:
isinstance(v, Vector)

True

In [149]:
type(v) is IntegerVector

True

In [150]:
type(v) is Vector

False
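
The same distinction shows up with core Python's own types. As a small sketch that does not depend on the classes above: `bool` is a sub-class of `int`, so `isinstance()` accepts a boolean as an integer while an identity check on `type()` does not. The variable name `flag` is chosen only for this example.

```python
flag = True

# bool is a sub-class of int in core Python.
print(issubclass(bool, int))   # True

# isinstance() follows the inheritance chain ...
print(isinstance(flag, bool))  # True
print(isinstance(flag, int))   # True

# ... whereas an identity check on type() only matches the exact class.
print(type(flag) is bool)      # True
print(type(flag) is int)       # False

# The chain of classes that is searched is stored in the __mro__ attribute.
print(bool.__mro__ == (bool, int, object))  # True
```

This is why `isinstance()` is usually preferred for input validation: code written against a parent class keeps working when it is handed an instance of a child class.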

Another sub-classing example is to adapt the `storage` class attribute and create a mutable version of a vector. Here, we also need to add a new method `__setitem__()` that our final version of `Vector` does not define.

In [151]:
class MutableVector(Vector):
    """A standard one-dimensional vector from linear algebra.

    Instances of this class are mutable objects via the indexing operator.
    """

    storage = list

    def __setitem__(self, index, value):
        if not isinstance(index, int):
            raise TypeError("index must be an integer!")
        self._entries[index] = value

In [152]:
v = MutableVector(1000 * random.random() for _ in range(5))

In [153]:
v

MutableVector((594.288, 87.076, 466.876, 46.140, 520.578))

In [154]:
v[0] = -1

In [155]:
v

MutableVector((-1.000, 87.076, 466.876, 46.140, 520.578))
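
To show why changing a single class attribute is enough, here is a stripped-down sketch of the pattern. It assumes that the parent's `__init__()` builds its internal container via `self.storage(...)`; the classes `Bag` and `MutableBag` are hypothetical stand-ins and not the notebook's actual `Vector` implementation.

```python
class Bag:
    """Hypothetical immutable container (not the notebook's Vector)."""

    storage = tuple  # class attribute that decides the internal container

    def __init__(self, data):
        # The attribute is looked up via the instance, so a child class
        # can swap in another container type without touching __init__().
        self._entries = self.storage(data)

    def __getitem__(self, index):
        return self._entries[index]


class MutableBag(Bag):
    """Hypothetical child class that stores its entries in a list."""

    storage = list

    def __setitem__(self, index, value):
        self._entries[index] = value


fixed = Bag([1, 2, 3])
flexible = MutableBag([1, 2, 3])

flexible[0] = -1
print(flexible[0])  # -1

try:
    fixed[0] = -1
except TypeError as error:
    # Bag defines no __setitem__(), so item assignment is rejected.
    print(error)
```

Both pieces are needed here: `storage = list` makes the internal container mutable, and `__setitem__()` exposes that mutability through the indexing operator.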

## Example revisited: Comparison with numpy

We started out by realizing that Python (incl. the Standard Library) provides us with no good data type to model a vector $\vec{v}$ or a matrix $\bf{X}$. Then, we built up two custom data types `Vector` and `Matrix` that basically wrap a simple tuple (for $\vec{v}$) and a list of lists (for $\bf{X}$) so that we can interact with these values in a "natural" and Pythonic way. By doing this, we extended Python with our own little "dialect" or DSL.

If we feel like sharing our linear algebra toolbox with the world, we could easily do so on either [GitHub](https://github.com) or [PyPI](https://pypi.org). However, for the domain of linear algebra this would be rather pointless as there is already a widely adopted library with [numpy](http://www.numpy.org/) that not only has a lot more features than ours but also is implemented in C, which makes it a lot faster with big data.

Let's take a quick look at numpy and compare it with our DSL using the example from the top.

In [156]:
y = (1, 2, 3)
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [157]:
X * y

TypeError: can't multiply sequence by non-int of type 'tuple'

The creation of vectors and matrices is similar to our DSL. However, numpy uses the more general concept of an **n-dimensional array** where a vector is only a special case of a matrix and a matrix is yet another special case of an even higher dimensional structure.

In [158]:
import numpy as np

In [159]:
y_arr = np.array(y)
X_arr = np.array(X)

In [160]:
y_vec = IntegerVector(y)
X_mat = IntegerMatrix(X)

The string representations are very similar. numpy scores a bonus point by making the matrix a bit clearer.

In [161]:
y_arr

array([1, 2, 3])

In [162]:
y_vec

IntegerVector((1, 2, 3))

In [163]:
X_arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [164]:
X_mat

IntegerMatrix(((1, 2, 3,), (4, 5, 6,), (7, 8, 9,)))

numpy arrays come with a `shape` instance attribute that returns a tuple with the dimensions, even for "vectors".

In [165]:
y_arr.shape

(3,)

In [166]:
X_arr.shape

(3, 3)

In [167]:
X_mat.n_rows, X_mat.n_cols

(3, 3)

For arrays, the [len()](https://docs.python.org/3/library/functions.html#len) built-in function does not return the number of entries but the number of rows instead. This is equivalent to the first item in the `shape` tuple. Our `Matrix` class, in contrast, reports the total number of entries.

In [168]:
len(y_arr)

3

In [169]:
len(y_vec)

3

In [170]:
len(X_arr)

3

In [171]:
len(X_mat)

9
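
The difference is easier to see with a non-square array. The following sketch uses a small example array `grid` that is not part of the notebook; the `size` attribute of numpy arrays returns the total number of entries, which corresponds to what `len()` reports for our `Matrix` class.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])  # 2 rows, 3 columns

print(len(grid))      # 2, the number of rows
print(grid.shape[0])  # 2, the same value taken from the shape tuple
print(grid.size)      # 6, the total number of entries
```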

The `transpose()` method also exists for arrays.

In [172]:
X_arr.transpose()

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In [173]:
X_mat.transpose()

IntegerMatrix(((1, 4, 7,), (2, 5, 8,), (3, 6, 9,)))

To perform matrix-matrix, matrix-vector, or vector-matrix multiplication, we need to use the `dot()` method and pass the right operand in as an argument. If we use the `*` operator on arrays, an entry-wise "multiplication" is performed instead.

In [174]:
X_arr.dot(y_arr)

array([14, 32, 50])

In [175]:
X_arr * y_arr

array([[ 1,  4,  9],
       [ 4, 10, 18],
       [ 7, 16, 27]])

In [176]:
X_mat * y_vec

IntegerVector((14, 32, 50))
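
As a side note, numpy arrays also support the `@` operator as an infix alternative to `dot()`. The sketch below re-creates the example data under the new names `X_demo` and `y_demo` so that it runs on its own, and it spells out what the `*` operator does: the vector is re-used for every row of the matrix, which numpy calls broadcasting.

```python
import numpy as np

X_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y_demo = np.array([1, 2, 3])

# The @ operator performs the same matrix-vector product as dot().
print(X_demo @ y_demo)     # [14 32 50]
print(X_demo.dot(y_demo))  # [14 32 50]

# The * operator multiplies entry by entry; y_demo is "broadcast",
# i.e., re-used for every row of X_demo.
print(X_demo * y_demo)
```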

Scalar multiplication, however, works as expected.

In [177]:
10 * y_arr

array([10, 20, 30])

In [178]:
10 * y_vec

IntegerVector((10, 20, 30))

Because we implemented our classes to support the sequence protocol, numpy's one-dimensional arrays are actually able to work with them. Note that the `*` operator is applied on a per-entry basis.

In [179]:
y_arr + y_vec

array([2, 4, 6])

In [180]:
y_arr * y_vec

array([1, 4, 9])

In [181]:
X_arr + X_mat

ValueError: operands could not be broadcast together with shapes (3,3) (9,) 
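
The shape `(9,)` in the error message suggests that numpy sees our matrix as a flat sequence of nine entries, which matches what `len()` reported for `X_mat` above. One possible workaround is to give the flat data its two-dimensional shape back before adding. The sketch below uses a plain list, `flat_entries`, as a stand-in for the entries of a matrix and assumes they come in row-by-row order.

```python
import numpy as np

X_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Stand-in for the nine entries of a 3x3 matrix (assumed row by row).
flat_entries = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Restore the two-dimensional shape so that both operands line up.
reshaped = np.array(flat_entries).reshape(X_demo.shape)

print(X_demo + reshaped)
```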

We conclude that it is rather easy to extend Python in a way that makes the resulting application code read like core Python again. As there are many well-established third-party packages out there, it is unlikely that we have to implement a fundamental library ourselves. Rather, we are more likely to use, for example, numpy arrays in the same way as we used tuples and lists of lists here to build up a higher level of abstraction.