# Generalizing to higher dimensions

## Intro

Linear algebra generalizes all the concepts we've handled thus far in 2D, 3D and 4D to any number of dimensions.

In this chapter, we will also create *programmatic generalizations* and define *vector spaces*. Vector spaces are collections of objects that we can treat like vectors. These can be arrows in the plane, tuples of numbers, or less obvious objects such as images (which we can combine to create new images).

The key operations in a vector space are vector addition and scalar multiplication, because together they let you form linear combinations.
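As a quick illustration, here is a minimal sketch of a linear combination built from nothing but those two operations. It uses plain tuples as coordinate vectors, and the helper names `add`, `scale` and `linear_combination` are illustrative assumptions rather than part of any library.

```python
def add(v, w):
    """Add two coordinate vectors of the same length, coordinate by coordinate."""
    return tuple(a + b for a, b in zip(v, w))

def scale(scalar, v):
    """Multiply every coordinate of v by the scalar."""
    return tuple(scalar * a for a in v)

def linear_combination(scalars, vectors):
    """Scale each vector by its scalar and add up the results."""
    result = scale(scalars[0], vectors[0])
    for s, v in zip(scalars[1:], vectors[1:]):
        result = add(result, scale(s, v))
    return result

# 3 * (1, 0) + 4 * (0, 1)
print(linear_combination([3, 4], [(1, 0), (0, 1)]))  # (3, 4)
```

Nothing in `linear_combination` depends on the number of coordinates, which is the kind of generality this chapter is after.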

## Generalizing the definition of a vector

Python supports object-oriented programming (OOP), and we will leverage it to create a *parent class* from which 2D, 3D and higher-dimensional vectors can inherit.

![OOP with vectors](../images/vectors-oop.png)

### Creating a class for vectors in the 2D plane

In our previous examples, our 2D and 3D vectors have been defined as *coordinate vectors*. That is, they were represented by tuples of numbers that were their coordinates.

Now we will use a class instead of a tuple, to have more control over the definitions and operations.

In [10]:
class Vec2():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, v2):
        return Vec2(self.x + v2.x, self.y + v2.y)

    def scale(self, scalar):
        return Vec2(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

# Constructing Vectors
v = Vec2(1.6, 3.8)
print('v=({}, {})'.format(v.x, v.y))

# Adding vectors
v = Vec2(3, 4)
w = v.add(Vec2(-2, 6))
print('w=({}, {})'.format(w.x, w.y))

# Scaling vectors
u = Vec2(1, 1).scale(50)
print('({}, {}).scale({}) = ({}, {})'.format(1, 1, 50, u.x, u.y))

# equality
print(Vec2(3, 4) == Vec2(3, 4))


v=(1.6, 3.8)
w=(1, 10)
(1, 1).scale(50) = (50, 50)
True


Note that the approach is slightly different now. You call a constructor to initialize a vector, and the vector operations are methods of the class rather than standalone functions.

Note also that the `Vec2.add(...)` method returns a new vector.
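A minimal sketch of what "returns a new vector" means in practice: the operands are left untouched and the result is a separate object. The class below is a trimmed copy of `Vec2` that keeps only the constructor and `add`.

```python
class Vec2():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, v2):
        # Build and return a brand new instance; self and v2 are not modified
        return Vec2(self.x + v2.x, self.y + v2.y)

v = Vec2(3, 4)
w = v.add(Vec2(-2, 6))

print((v.x, v.y))  # (3, 4): v keeps its original coordinates
print((w.x, w.y))  # (1, 10)
print(w is v)      # False: w is a separate object
```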

### Improving the `Vec2` class

In the same way we defined the `__eq__(...)` method, we can define additional special methods that provide some syntactic sugar and improve the development experience for the users of our library.

In [None]:
class Vec2():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, v2):
        return Vec2(self.x + v2.x, self.y + v2.y)

    def scale(self, scalar):
        return Vec2(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __add__(self, v2):
        return self.add(v2)
    
    def __mul__(self, scalar):
        return self.scale(scalar)
    
    def __rmul__(self, scalar):
        return self.scale(scalar)

Finally, we can also define the `__repr__(...)` method that will be called when we need a string representation of the vector.

In [None]:
class Vec2():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, v2):
        return Vec2(self.x + v2.x, self.y + v2.y)

    def scale(self, scalar):
        return Vec2(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __add__(self, v2):
        return self.add(v2)
    
    def __mul__(self, scalar):
        return self.scale(scalar)
    
    def __rmul__(self, scalar):
        return self.scale(scalar)
    
    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

Now we can do amazing things such as:

In [1]:
class Vec2():
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, v2):
        return Vec2(self.x + v2.x, self.y + v2.y)

    def scale(self, scalar):
        return Vec2(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __add__(self, v2):
        return self.add(v2)
    
    def __mul__(self, scalar):
        return self.scale(scalar)
    
    def __rmul__(self, scalar):
        return self.scale(scalar)
    
    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

print(3.0 * Vec2(1, 0) + 4.0 * Vec2(0, 1))        

Vec2(3.0, 4.0)
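To see why both `__mul__` and `__rmul__` are defined, consider the following sketch. `OnlyMul` is a hypothetical class that overloads `*` only for the case where the vector is the left operand. When the scalar comes first, Python asks the `float` to do the multiplication, the `float` declines, and with no `__rmul__` to fall back on the expression raises a `TypeError`.

```python
class OnlyMul():
    """Hypothetical class that overloads * only when the vector is on the left."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __mul__(self, scalar):
        return OnlyMul(self.x * scalar, self.y * scalar)

v = OnlyMul(1, 0) * 3.0      # works: calls OnlyMul.__mul__
print((v.x, v.y))            # (3.0, 0.0)

try:
    3.0 * OnlyMul(1, 0)      # float does not know how to multiply by OnlyMul
except TypeError as error:
    print('TypeError:', error)
```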


### Repeating the process with 3D vectors

The same approach can be taken for 3D vectors:
+ Define the class
+ Add methods for addition and scalar multiplication
+ Add an equality method
+ Enable operator overloading for `+` and `*`

In [3]:
class Vec3():
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    
    def add(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def scale(self, scalar):
        return Vec3(scalar * self.x, scalar * self.y, scalar * self.z)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y and self.z == other.z

    def __add__(self, v2):
        return self.add(v2)

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __repr__(self):
        return 'Vec3({}, {}, {})'.format(self.x, self.y, self.z)

print(2.0 * (Vec3(1, 0, 0) + Vec3(0, 1, 0)))        

Vec3(2.0, 2.0, 0.0)


Having written the classes for vectors in 2D and 3D opens the door for generalization. There are multiple ways of generalizing the approach, but we will focus on how we use the vectors, and not on how they work. This approach will let us define functions that will work for any number of dimensions: it will let us separate the *what* from the *how*.
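As a small sketch of that separation, the hypothetical helper `midpoint` below only states *what* it needs (`+` and `*`), while each class decides *how* those operations work. The two classes are trimmed versions of `Vec2` and `Vec3` that keep just the methods the example uses.

```python
class Vec2():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

class Vec3():
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __add__(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __rmul__(self, scalar):
        return Vec3(scalar * self.x, scalar * self.y, scalar * self.z)

    def __repr__(self):
        return 'Vec3({}, {}, {})'.format(self.x, self.y, self.z)

def midpoint(v, w):
    """Relies only on + and *, so it never needs to know the dimension."""
    return 0.5 * (v + w)

print(midpoint(Vec2(0, 0), Vec2(2, 4)))        # Vec2(1.0, 2.0)
print(midpoint(Vec3(0, 0, 0), Vec3(2, 4, 6)))  # Vec3(1.0, 2.0, 3.0)
```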

### Building a vector base class

The basic operations we have performed with our `Vec2` and `Vec3` classes have been:
+ constructing new instances
+ adding vectors to vectors
+ multiplying vectors by scalars
+ testing equality
+ representing the vector as a string

Of these operations, only vector addition and scalar multiplication are distinctively vector operations: the rest are just Python-related methods.

Let's define our base class for vectors according to this fact.


In [None]:
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):

    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __add__(self, v2):
        return self.add(v2)

The `abc` module contains helper classes, functions, and method decorators that help define abstract base classes, that is, classes that are not intended to be instantiated, but to be used as templates for the classes that inherit from them.

As a result, with our `Vector` class definition we are giving the template for the classes that will inherit from it, forcing them to have methods for vector addition and scalar multiplication.

The `@abstractmethod` decorator is used to tag a method as abstract, that is, not implemented in the base class but in the concrete subclasses.

By contrast, the methods that implement the operator overloading for `*` and `+` can be fully specified in our base class.
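The following sketch shows the practical effect of the abstract methods: neither the base class nor a subclass that leaves one of them unimplemented can be instantiated. `Incomplete` is a hypothetical subclass written only for this demonstration, and the exact wording of the error message depends on the Python version.

```python
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):
    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

class Incomplete(Vector):
    """Hypothetical subclass that forgets to implement add()."""
    def scale(self, scalar):
        return self

for cls in (Vector, Incomplete):
    try:
        cls()
    except TypeError as error:
        # Both classes still have at least one abstract method
        print('{}: {}'.format(cls.__name__, error))
```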

We can now define `Vec2` as a child class of `Vector`:

In [None]:
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):

    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __add__(self, v2):
        return self.add(v2)

class Vec2(Vector):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def scale(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)        

This approach has saved us from repeating ourselves: The methods that were identical between `Vec2` and `Vec3` are now part of the base class, while the remaining methods which were different for `Vec2` and `Vec3` must be implemented by the concrete class.
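For completeness, here is a sketch of how `Vec3` could be ported to the same base class. It mirrors the `Vec2` definition and assumes the `Vector` class exactly as defined in the previous cell.

```python
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):
    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __add__(self, v2):
        return self.add(v2)

class Vec3(Vector):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    # Only the dimension-specific methods live in the concrete class
    def add(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def scale(self, scalar):
        return Vec3(scalar * self.x, scalar * self.y, scalar * self.z)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y and self.z == other.z

    def __repr__(self):
        return 'Vec3({}, {}, {})'.format(self.x, self.y, self.z)

# The + and * operators are inherited from Vector
print(2.0 * (Vec3(1, 0, 0) + Vec3(0, 1, 0)))  # Vec3(2.0, 2.0, 0.0)
```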

This approach also lets us *enrich* the base class as we discover more operations that we want to provide to the child classes that inherit from `Vector`, for example, `subtract(...)`:

In [4]:
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):

    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __add__(self, v2):
        return self.add(v2)
    
    def subtract(self, other):
        return self.add(-1 * other)

    def __sub__(self, other):
        return self.subtract(other)

class Vec2(Vector):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def scale(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)  

print(Vec2(1, 3) - Vec2(5, 1))        

Vec2(-4, 2)


Note how we have *enriched* the child class `Vec2` with a `subtract(...)` method without changing the `Vec2` source code at all.
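The same trick works for other operators. The sketch below adds negation and division by a scalar to the base class, both expressed in terms of `scale(...)`. This is an assumption about how such helpers could be written, not the actual source of any module used in this chapter.

```python
from abc import ABCMeta, abstractmethod

class Vector(metaclass=ABCMeta):
    @abstractmethod
    def scale(self, scalar):
        pass

    @abstractmethod
    def add(self, other):
        pass

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __add__(self, v2):
        return self.add(v2)

    def __neg__(self):
        # -v is shorthand for (-1) * v
        return self.scale(-1)

    def __truediv__(self, scalar):
        # v / s is shorthand for (1 / s) * v
        return self.scale(1.0 / scalar)

class Vec2(Vector):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def add(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def scale(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

print(-Vec2(1, 3))     # Vec2(-1, -3)
print(Vec2(6, 4) / 2)  # Vec2(3.0, 2.0)
```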

### Defining vector spaces

In this section, we will switch from Python code to mathematical language to see how we can define a similar generalization of the vector concept.

In math, a vector is defined by what it does, rather than by what it is. Let's start with an incomplete definition of a vector:

> | DEFINITION |
| :--------- |
| A vector is an object equipped with a *suitable* way to add it to other vectors and multiply it by scalars. |

What is missing from the previous definition is an explicit definition of *suitability*. It can be given as a set of rules that any *object* must fulfill to qualify as a vector. The first two rules concern vector addition:

1. Adding vectors in any order shouldn't matter: $ v + w = w + v $
2. Adding vectors in any grouping shouldn't matter: $ u + (v + w) = (u + v) + w $

Note that, for example, strings with concatenation do not fulfill the first rule and therefore do not qualify as vectors, as `"hot" + "dog" != "dog" + "hot"`.
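A quick check in Python makes the point concrete. Concatenation passes the grouping rule but fails the ordering rule, and failing a single rule is enough to be disqualified.

```python
# Grouping does not matter for concatenation...
associates = ("hot" + "dog") + "s" == "hot" + ("dog" + "s")
print(associates)  # True

# ...but order does, so strings are not vectors under concatenation
commutes = "hot" + "dog" == "dog" + "hot"
print(commutes)    # False
```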

Additional rules must be given for well-behaved scalar multiplications:

1. Multiplying vectors by several scalars should be the same as multiplying by all the scalars at once. That is, if $ a $ and $ b $ are scalars and $ v $ is a vector, then $ a (b v) = (a b) v $.
2. Multiplying a vector by 1 should leave the vector unchanged: $ 1 v = v $.
3. Addition of scalars should be compatible with scalar multiplication: $ a v + b v = (a + b) v$.
4. Addition of vectors should be compatible with scalar multiplication: $ a (v + w) = av + aw $. 

The takeaway from these rules is that we have a way to check whether an object can be considered a vector. They also lead us to define the concept of a *vector space* as a collection of compatible vectors:

> | DEFINITION |
| :--------- |
| A vector space is a collection of objects called vectors, equipped with suitable vector addition and scalar multiplication operations, such that every linear combination of vectors in the collection produces a vector that is also in the collection. |

A collection like `[Vec2(1, 0), Vec2(5, -3), Vec2(1.1, 0.8)]` is a group of vectors, but it is not a vector space: the linear combination `1 * Vec2(1, 0) + 1 * Vec2(5, -3) = Vec2(6, -3)` is not a vector of the collection.
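The check can be sketched in code. The class below is a trimmed `Vec2` with just enough methods for the example; the `in` operator relies on `__eq__` to look for the result in the list.

```python
class Vec2():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

collection = [Vec2(1, 0), Vec2(5, -3), Vec2(1.1, 0.8)]
combination = 1 * Vec2(1, 0) + 1 * Vec2(5, -3)

print(combination)                # Vec2(6, -3)
print(combination in collection)  # False: the collection is not closed
```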

As you can imagine, most vector spaces are infinite sets: they must contain every linear combination of their elements, and there are infinitely many scalars to choose from. The collection of all possible 2D vectors is one such infinite set.

There are two implications of the fact that vector spaces need to contain all their scalar multiples.

First, $ 0 \cdot v = \vec{0} $. That is, no matter what vector you pick, multiplying it by the scalar 0 yields the *zero vector*, which must therefore belong to the vector space. Adding the *zero vector* to any vector leaves that vector unchanged: $ \vec{0} + v = v $.

Second, every vector $ v $ has an opposite vector $ -1 \cdot v $, written as $ -v $, because $ v + (-v) = (1 + (-1)) v = 0 v = \vec{0} $. That is, for every vector there is another vector in the vector space that *cancels it out* by addition.
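Both implications can be checked on a concrete example. The sketch below uses a trimmed `Vec2` with integer coordinates, so the comparisons are exact.

```python
class Vec2():
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec2(self.x + other.x, self.y + other.y)

    def __rmul__(self, scalar):
        return Vec2(scalar * self.x, scalar * self.y)

    def __eq__(self, other):
        return self.x == other.x and self.y == other.y

    def __repr__(self):
        return 'Vec2({}, {})'.format(self.x, self.y)

v = Vec2(3, 4)
zero = 0 * v        # scaling by 0 gives the zero vector
opposite = -1 * v   # scaling by -1 gives the opposite vector

print(zero)                  # Vec2(0, 0)
print(zero + v == v)         # True
print(opposite)              # Vec2(-3, -4)
print(v + opposite == zero)  # True
```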

### Unit testing vector space classes

In math, suitability is guaranteed by *writing a proof*. In code, we rely on writing unit tests.

For instance, we can check rule #4 for well-behaved scalar multiplication by doing:


In [1]:
from vector2d import Vec2

s = -3
u, v = Vec2(42, -10), Vec2(1.5, 8)

# should be true
s * (u + v) == s * v + s * u

True

Obviously, the test is quite weak, but we're off to a good start.

We can improve it a little bit by selecting random numbers and using Python's assert:

In [2]:
from vector2d import Vec2
from random import uniform

def random_scalar():
    return uniform(-10, 10)

def random_vec2():
    return Vec2(random_scalar(), random_scalar())

a = random_scalar()
u, v = random_vec2(), random_vec2()

# this will most likely fail: floating-point rounding makes both sides differ by a tiny amount
assert a * (u + v) == a * v + a * u    

AssertionError: 

The previous test can be fixed by using Python's `math.isclose(...)`. Also, we can make the test more robust by testing the rule a hundred times rather than just once:

In [5]:
from vector2d import Vec2
from random import uniform
from math import isclose

def random_scalar():
    return uniform(-10, 10)

def random_vec2():
    return Vec2(random_scalar(), random_scalar())

def approx_equal_vec2(v, w):
    return isclose(v.x, w.x) and isclose(v.y, w.y)

for _ in range(0, 100):
    a = random_scalar()
    u, v = random_vec2(), random_vec2()

    assert approx_equal_vec2(a * (u + v),  a * v + a * u)
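The root cause of the earlier failure is floating-point rounding, which the following minimal sketch isolates. It also shows one caveat worth keeping in mind: with its default arguments `isclose` uses a relative tolerance only, so comparisons against exactly zero need an explicit `abs_tol`. The tolerance value used here is an arbitrary choice for illustration.

```python
from math import isclose

total = 0.1 + 0.2
print(total == 0.3)         # False: the sum is off by a rounding error
print(isclose(total, 0.3))  # True: equal within the default relative tolerance

# Relative tolerance is of no help when one side is exactly zero
tiny = 1e-12
print(isclose(tiny, 0.0))                # False
print(isclose(tiny, 0.0, abs_tol=1e-9))  # True
```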

Now we can test the six vector space properties defined above. The two rules for vector addition:

1. $ v + w = w + v $
2. $ u + (v + w) = (u + v) + w $

And the four rules for scalar multiplication:

1. $ a (b v) = (a b) v $
2. $ 1 v = v $
3. $ a v + b v = (a + b) v $
4. $ a (v + w) = av + aw $

In [6]:
from vector2d import Vec2
from random import uniform
from math import isclose

def random_scalar():
    return uniform(-10, 10)

def random_vec2():
    return Vec2(random_scalar(), random_scalar())

def approx_equal_vec2(v, w):
    return isclose(v.x, w.x) and isclose(v.y, w.y)

def test(eq, a, b, u, v, w):
    assert eq(u + v, v + u)
    assert eq(u + (v + w), (u + v) + w)
    assert eq(a * (b * v), (a * b) * v)
    assert eq(1 * v, v)
    assert eq((a + b) * v, a * v + b * v)
    assert eq(a * v + a * w, a * (v + w))

for _ in range(0, 100):
    a, b = random_scalar(), random_scalar()
    u, v, w = random_vec2(), random_vec2(), random_vec2()
    test(approx_equal_vec2, a, b, u, v, w)

print('Test completed :)')

Test completed :)


Note that this setup isn't completely generic, as we had to write special functions to generate random `Vec2` instances and to compare them. The important part is that the test function itself and the expressions are generic, so the amount of work needed when we need to validate for other vector spaces will be small.

## Exploring different vector spaces

With the concept of *vector space* already introduced, we will explore some examples by taking a look at a new kind of object and implement it as a class that inherits from `Vector`.

By definition, the new class will support *addition*, *scalar multiplication* and the other well-known vector operations.

### Enumerating all coordinate vector spaces

We have implemented `Vec2`, `Vec3` and `Vec6`. But what about `Vec1`? Let's create that class and explore its properties:

In [7]:
from vector import Vector

class Vec1(Vector):
    def __init__(self, x):
        self.x = x

    @classmethod
    def zero(cls):
        return Vec1(0)

    def add(self, other):
        return Vec1(self.x + other.x)

    def scale(self, scalar):
        return Vec1(scalar * self.x)

    def __eq__(self, other):
        if not self.__class__ in other.__class__.mro():
            return False
        else:
            return self.x == other.x

    def __repr__(self):
        return 'Vec1({})'.format(self.x)  


print(Vec1(2) + Vec1(3))
print(3 * Vec1(1))
print(Vec1(6) / 2)

Vec1(5)
Vec1(3)
Vec1(3.0)


The interesting takeaway is that `Vec1` defines a vector space in its own right, with vector addition being the regular addition of real numbers and scalar multiplication being the regular multiplication.

The set of all real numbers (including integers, fractions, and irrational numbers) is denoted as $ \mathbb{R} $.

> $ \mathbb{R} $ is a vector space where the scalars and the vectors are the same kind of objects.

Coordinate vector spaces are denoted $ \mathbb{R}^n $ where $ n $ is the dimension or number of coordinates, so that $ \mathbb{R}^2 $ is the 2D plane and $ \mathbb{R}^3 $ is the 3D space.

As long as real numbers are used as scalars, any finite-dimensional vector space is some $ \mathbb{R}^n $ in disguise.

The vector space $ \mathbb{R}^0 $ is the set of vectors with zero coordinates, which can also be implemented in Python:

In [9]:
from vector import Vector

class Vec0(Vector):
    def __init__(self):
        pass

    @classmethod
    def zero(cls):
        return Vec0()

    def add(self, other):
        return Vec0()

    def scale(self, scalar):
        return Vec0()

    def __eq__(self, other):
        return self.__class__ == other.__class__

    def __repr__(self):
        return 'Vec0()'


print(Vec0() + Vec0())
print(3 * Vec0())
print(Vec0() / 2)

Vec0()
Vec0()
Vec0()
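Rather than writing one class per dimension, the same pattern can be sketched once for any number of coordinates. `VecN` is a hypothetical class introduced only for this illustration; it stores its coordinates in a tuple and, to stay self-contained, does not inherit from `Vector`.

```python
class VecN():
    """Hypothetical coordinate vector with any number of coordinates."""
    def __init__(self, *coordinates):
        self.coordinates = tuple(coordinates)

    def add(self, other):
        if len(self.coordinates) != len(other.coordinates):
            raise ValueError('vectors must have the same number of coordinates')
        return VecN(*(a + b for a, b in zip(self.coordinates, other.coordinates)))

    def scale(self, scalar):
        return VecN(*(scalar * a for a in self.coordinates))

    def __add__(self, other):
        return self.add(other)

    def __mul__(self, scalar):
        return self.scale(scalar)

    def __rmul__(self, scalar):
        return self.scale(scalar)

    def __eq__(self, other):
        return isinstance(other, VecN) and self.coordinates == other.coordinates

    def __repr__(self):
        return 'VecN({})'.format(', '.join(str(c) for c in self.coordinates))

print(VecN() + VecN())                                  # VecN()
print(3 * VecN(1))                                      # VecN(3)
print(VecN(1, 2, 3, 4, 5, 6) + VecN(6, 5, 4, 3, 2, 1))  # VecN(7, 7, 7, 7, 7, 7)
```

Adding vectors with different numbers of coordinates raises a `ValueError`, since they belong to different vector spaces.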


### Identifying vector spaces in the wild

Let's consider the following dataset that consists of information about used Toyota Priuses.

```json
[
  ["Post status","Year","Model","Miles","Price","Source","Time Posted","Location","Title & Link"],
  ["For Sale","2005","Prius","114000","3500","craigslist","11/30 - 06:44","San Francisco, CA","toyota prius "],
  ["For Sale","2015","Prius","n/a","12500","craigslist","11/30 - 07:00","Seattle, WA","Toyota prius 2015 "],
```

The first element of the array contains the column names of the dataset, while the remaining elements constitute the data.

We can easily define a `CarForSale` class that would model the information in the dataset:



In [None]:
class CarForSale():
    def __init__(self, model_year, mileage, price, posted_datetime, model, source, location, description):
        self.model_year = model_year
        self.mileage = mileage
        self.price = price
        self.posted_datetime = posted_datetime
        self.model = model
        self.source = source
        self.location = location
        self.description = description

Let's think about whether it would be interesting to treat `CarForSale` objects as vectors:
> We could average them as a linear combination to see what typical Prius for sale looks like

Obviously, the numeric fields `model_year`, `mileage` and `price` can be added like the coordinates of a regular vector, but the string properties can't. Thus, when we do arithmetic with these objects, we will set the string properties to `'(virtual)'` to remind us of this fact.

Also, we can't add datetimes, but we can add timespans. For example, we could use the day we retrieved the data as a reference point, and add the time spans since the cars were posted for sale.
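The following sketch shows the difference using the standard `datetime` module. The reference point and the first posting time match the values used in this chapter, while the second posting time is made up for the example.

```python
from datetime import datetime, timedelta

reference = datetime(2018, 11, 30, 12)
posted_1 = datetime(2018, 11, 30, 6, 44)
posted_2 = datetime(2018, 11, 29, 12)   # made-up posting time

try:
    posted_1 + posted_2                 # adding two datetimes is not defined
except TypeError as error:
    print('TypeError:', error)

# Subtracting two datetimes gives a timedelta, and timedeltas can be
# added together and multiplied by scalars
age_1 = reference - posted_1
age_2 = reference - posted_2
print(age_1 + age_2)                # 1 day, 5:16:00
print(0.5 * age_1)                  # 2:38:00

# Converting the combined time span back into a datetime
print(reference - (age_1 + age_2))  # 2018-11-29 06:44:00
```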

Let's do those modifications in the class:

In [None]:
from datetime import datetime
from vector import Vector

class CarForSale(Vector):
    reference_date = datetime(2018, 11, 30, 12) # 30-Nov-2018-12:00:00

    def __init__(self, model_year, mileage, price, posted_datetime, model='(virtual)', source='(virtual)', location='(virtual)', description='(virtual)'):
        self.model_year = model_year
        self.mileage = mileage
        self.price = price
        self.posted_datetime = posted_datetime
        self.model = model
        self.source = source
        self.location = location
        self.description = description

    def add(self, other):
        def add_dates(d1, d2):
            age1 = CarForSale.reference_date - d1
            age2 = CarForSale.reference_date - d2
            sum_age = age1 + age2
            return CarForSale.reference_date - sum_age

        return CarForSale(
            self.model_year + other.model_year,
            self.mileage + other.mileage,
            self.price + other.price,
            add_dates(self.posted_datetime, other.posted_datetime)
        )

    def scale(self, scalar):
        def scale_date(d):
            age = CarForSale.reference_date - d
            return CarForSale.reference_date - (scalar * age)

        return CarForSale(
            scalar * self.model_year,
            scalar * self.mileage,
            scalar * self.price,
            scale_date(self.posted_datetime)
        )

    @classmethod
    def zero(cls):
        return CarForSale(0, 0, 0, CarForSale.reference_date)

Let's give it a run and see how the vector space operations work with this class. In order to do that, we need to load some data:

In [15]:
from car_for_sale import CarForSale

from json import loads, dumps
from pathlib import Path
from datetime import datetime

contents = Path('cargraph.json').read_text()
cg_objects = loads(contents)
cleaned = []

def parse_date(s):
    input_format='%m/%d - %H:%M'
    dt = datetime.strptime(s, input_format).replace(year=2018)
    return dt

for car in cg_objects[1:]:
    try:
        row = CarForSale(int(car[1]), float(car[3]), float(car[4]), parse_date(car[6]), car[2], car[5], car[7], car[8])
        cleaned.append(row)
    except Exception:
        # skip rows with missing or malformed fields (e.g. a mileage of "n/a")
        pass

cars = cleaned


print(cars[0].__dict__)

{'model_year': 2005, 'mileage': 114000.0, 'price': 3500.0, 'posted_datetime': datetime.datetime(2018, 11, 30, 6, 44), 'model': 'Prius', 'source': 'craigslist', 'location': 'San Francisco, CA', 'description': 'toyota prius '}
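With the vector operations in place, the averaging idea mentioned earlier becomes a linear combination: add all the cars and scale the sum by `1 / n`. The sketch below is self-contained, so it uses `NumericCar`, a hypothetical trimmed-down stand-in for `CarForSale` that keeps only the numeric fields and the posting date. The second listing's mileage is made up for the example, and `average` is an illustrative helper.

```python
from datetime import datetime

REFERENCE_DATE = datetime(2018, 11, 30, 12)

class NumericCar():
    """Hypothetical stand-in for CarForSale: numeric fields and date only."""
    def __init__(self, model_year, mileage, price, posted_datetime):
        self.model_year = model_year
        self.mileage = mileage
        self.price = price
        self.posted_datetime = posted_datetime

    def add(self, other):
        # Dates are added through their ages relative to the reference date
        age_sum = ((REFERENCE_DATE - self.posted_datetime)
                   + (REFERENCE_DATE - other.posted_datetime))
        return NumericCar(self.model_year + other.model_year,
                          self.mileage + other.mileage,
                          self.price + other.price,
                          REFERENCE_DATE - age_sum)

    def scale(self, scalar):
        age = REFERENCE_DATE - self.posted_datetime
        return NumericCar(scalar * self.model_year,
                          scalar * self.mileage,
                          scalar * self.price,
                          REFERENCE_DATE - scalar * age)

    @classmethod
    def zero(cls):
        return cls(0, 0, 0, REFERENCE_DATE)

def average(cars):
    """Sum all the cars starting from the zero vector, then scale by 1/n."""
    total = NumericCar.zero()
    for car in cars:
        total = total.add(car)
    return total.scale(1.0 / len(cars))

sample = [
    NumericCar(2005, 114000.0, 3500.0, datetime(2018, 11, 30, 6, 44)),
    NumericCar(2015, 50000.0, 12500.0, datetime(2018, 11, 30, 7, 0)),  # made-up mileage
]

typical = average(sample)
print(typical.model_year, typical.mileage, typical.price)  # 2010.0 82000.0 8000.0
print(typical.posted_datetime)                             # 2018-11-30 06:52:00
```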
