# Coding the Matrix

https://codingthematrix.com/



## What we will cover today

- 

## Vectors are functions

### Examples:

- Consider a 4-vector, e.g. [3.14159, 2.718281828, −1.0, 2.0]. The set of 4-vectors over ${\rm I\!R}$ is written ${\rm I\!R}^4$.  
This vector can be thought of as the function $0 \mapsto 3.14159,   1 \mapsto 2.718281828,   2 \mapsto -1.0,   3 \mapsto 2.0$  
  
- $GF(2)^5$ is the set of 5-element bit sequences [0,0,0,0,0], [0,0,0,0,1], ...  
  
- Let WORDS = set of all English words  
In information retrieval, a document is represented ("bag of words" model) by a function $f: \textit{WORDS} \longrightarrow {\rm I\!R}$ specifying, for each word, how many times it appears in the document.  
We refer to such a function as a WORDS-vector over ${\rm I\!R}$.  



## Vectors are functions


### Definition:
For a field ${\rm I\!F}$ and a set $\textit D$, a $\textit {D-vector}$ over ${\rm I\!F}$ is a function from $\textit D$ to ${\rm I\!F}$. The set of such functions is written as ${\rm I\!F}^D$


## Representation using Python dictionaries

- ```{0:3.14159, 1:2.718281828, 2:-1.0, 3:2.0}```

- WORDS-vector over ${\rm I\!R}$  
For any single document, most words do not appear in it. They should be mapped to zero.  
Our convention for representing vectors by dictionaries: a key-value pair may be omitted when its value is zero.

**Example** “The rain in Spain falls mainly on the plain” would be represented by the dictionary
```
{'on': 1, 'Spain': 1, 'in': 1, 'plain': 1, 'the': 2, 'mainly': 1, 'rain': 1, 'falls': 1}
```
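
As a rough sketch of how such a dictionary could be built, the snippet below counts words with `collections.Counter`. The helper name `bag_of_words` is hypothetical, and lowercasing every word is an assumption made here so that "The" and "the" are counted together; as a side effect the key becomes `'spain'` rather than `'Spain'`.

```python
from collections import Counter

def bag_of_words(text):
    """Return a sparse WORDS-vector: word -> number of occurrences.

    Words that do not occur are simply absent (implicitly zero).
    """
    # Assumption: words are separated by whitespace and compared case-insensitively.
    return dict(Counter(text.lower().split()))

doc = "The rain in Spain falls mainly on the plain"
vec = bag_of_words(doc)
print(vec["the"])          # 2
print(vec.get("snow", 0))  # 0: an omitted key stands for a zero entry
```

Looking up a missing word with `.get(word, 0)` is what makes the omitted-zero convention usable in practice.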

## Sparsity

A vector most of whose values are zero is called a sparse vector.  
If no more than k of the entries are nonzero, we say the vector is k-sparse.  


A k-sparse vector can be represented using space proportional to k.  
**Example:** when we represent a corpus of documents by WORDS-vectors, the storage required is proportional to the total number of words in all documents.  


Most signals acquired via physical sensors (images, sound, ...) are not exactly sparse.
Lossy compression, however, makes them sparse while preserving perceptual similarity.
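
The idea can be illustrated with a small sketch. The helper names `is_k_sparse` and `keep_k_largest` are hypothetical, and dropping all but the k largest-magnitude entries is only a toy stand-in for lossy compression; practical schemes typically first change to a representation in which the signal is nearly sparse.

```python
def is_k_sparse(vec, k):
    """True if at most k entries of the dictionary-represented vector are nonzero."""
    return sum(1 for value in vec.values() if value != 0) <= k

def keep_k_largest(vec, k):
    """Return a k-sparse approximation keeping the k largest-magnitude entries."""
    largest = sorted(vec, key=lambda d: abs(vec[d]), reverse=True)[:k]
    return {d: vec[d] for d in largest if vec[d] != 0}

# A made-up "signal": two large entries and three tiny ones.
signal = {0: 0.02, 1: 5.0, 2: -0.01, 3: -3.5, 4: 0.03}
approx = keep_k_largest(signal, 2)
print(approx)                  # {1: 5.0, 3: -3.5}
print(is_k_sparse(approx, 2))  # True
```

The approximation needs storage for only two key-value pairs, in line with the claim that a k-sparse vector takes space proportional to k.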

## What can we represent with a vector

- Document (for information retrieval)  
- Binary string (for cryptography / information theory)  
- Collection of attributes
    * Senate voting record
    * Demographic record of a consumer
    * Characteristics of cancer cells  
- State of a system
    * Population distribution in the world
    * Number of copies in a computer network
    * State of a pseudorandom generator
- Probability distribution e.g. ```{1:1/6, 2:1/6, 3:1/6, 4:1/6, 5:1/6, 6:1/6}```

## Vector addition: Translation and vector addition

### Definition of vector addition:

$[u_1, u_2, \dots, u_n]\ +\ [v_1, v_2, \dots, v_n]\ =\ [u_1+v_1, u_2+v_2, \dots, u_n+v_n]$

**Question:** Suppose we represent n-vectors by n-element lists. Write a procedure ```addn(v,w)``` to compute the sum of two vectors so represented. 

In [1]:

def addn(v, w): return [v[i]+w[i] for i in range(len(v))] 


## Vectors

- Zero Vector 
- Associativity of vector addition
- Commutativity of vector addition
- Scalar multiplication
- Distributive Law
    * Convex combination
    * Affine combination


## Playing with GF(2)

Galois Field 2 has two elements: 0 and 1  


Addition is like exclusive-or (xor):  


| + |  0  |  1  |
|---|-----|-----|
|**0**|  0  |  1  |
|**1**|  1  |  0  |


Multiplication is like ordinary multiplication:

| x |  0  |  1  |
|---|-----|-----|
|**0**|  0  |  0  |
|**1**|  0  |  1  |
    


Usual algebraic laws still hold, e.g. multiplication distributes over addition  
$a \cdot (b\ +\ c)\ =\ a \cdot b\ +\ a \cdot c$
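
Because GF(2) has only two elements, the tables and the distributive law can be checked exhaustively. The sketch below models the field with Python integers 0 and 1; the function names `gf2_add` and `gf2_mul` are hypothetical.

```python
from itertools import product

def gf2_add(a, b):
    """Addition in GF(2): exclusive-or of the two bits."""
    return a ^ b

def gf2_mul(a, b):
    """Multiplication in GF(2): ordinary multiplication (bitwise and)."""
    return a & b

# Exhaustively check a*(b + c) == a*b + a*c over all eight bit triples.
distributes = all(
    gf2_mul(a, gf2_add(b, c)) == gf2_add(gf2_mul(a, b), gf2_mul(a, c))
    for a, b, c in product((0, 1), repeat=3)
)
print(distributes)    # True
print(gf2_add(1, 1))  # 0

# Adding two vectors over GF(2) entry by entry.
u = [1, 0, 1, 1, 0]
v = [1, 1, 0, 1, 0]
print([gf2_add(x, y) for x, y in zip(u, v)])  # [0, 1, 1, 0, 0]
```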


## Draw: Lights Out

## Dictionary based representation of vectors

Python ```class Vec``` with two fields (instance variables):
- ```f```, the function, represented by a Python dictionary, and
- ```D```, the domain of the function, represented by a Python set.

In [2]:
class Vec:
    def __init__(self, labels, function):
        self.D = labels
        self.f = function

In [3]:
v = Vec({'A', 'B', 'C'}, {'A':1})

for d in v.D:
    if d in v.f:
        print(v.f[d])

**Quiz**: Write a procedure zero_vec(D) with the following spec:  
- *input*: a set D
- *output*: an instance of ```Vec``` representing a $\textit {D-vector}$ all of whose entries have value zero 



In [4]:
def zero_vec(D): 

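
One possible answer, offered as a sketch rather than the official solution: under the sparse convention an empty dictionary already represents the all-zero vector. `Vec` is repeated so the block runs on its own.

```python
class Vec:
    def __init__(self, labels, function):
        self.D = labels
        self.f = function

def zero_vec(D):
    # No stored entries: every element of D is implicitly mapped to zero.
    return Vec(D, {})

z = zero_vec({'A', 'B', 'C'})
print(z.f)  # {}
```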

### Setter and getter

In [None]:
def setitem(v, d, val): v.f[d] = val

**Quiz**: Write a procedure ```getitem(v,d)``` with the following spec:
- *input*: an instance ```v``` of ```Vec```, and an element ```d``` of the set ```v.D```
- *output*: the value of entry ```d``` of ```v```

Use the sparse-representation convention: an entry missing from ```v.f``` has value zero.

```
>>> getitem(v, 'A')
1
```

In [5]:
def getitem(v,d): 

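
A possible answer, again only a sketch: a key missing from the dictionary is read as zero. `Vec` is repeated so the block runs on its own.

```python
class Vec:
    def __init__(self, labels, function):
        self.D = labels
        self.f = function

def getitem(v, d):
    # Missing keys stand for zero entries under the sparse convention.
    return v.f[d] if d in v.f else 0

v = Vec({'A', 'B', 'C'}, {'A': 1})
print(getitem(v, 'A'))  # 1
print(getitem(v, 'B'))  # 0
```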

### Scalar-vector multiplication

**Quiz** Write a procedure ```scalar_mul(v, alpha)``` with the following spec:
- *input*: an instance of ```Vec``` and a scalar ```alpha```
- *output*: a new instance of ```Vec``` that represents the scalar-vector product ```alpha``` times ```v```

Hints:
- sparse output 
- can use ```getitem(v, d)```
- do not modify the vector that is passed in
- new instance should point to the same set ```D``` as the old instance

```
>>> scalar_mul(v, 2)
<__main__.Vec object at 0x10058cd10>

>>> scalar_mul(v,2).f
{'A': 2.0, 'C':0, 'B': 4.0}
```

In [None]:
def scalar_mul(v, alpha):
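
One way to satisfy the hints, as a sketch: scale only the stored entries, build a fresh dictionary, and reuse the same domain set. The sample vector with entries `'A': 1.0` and `'B': 2.0` is an assumption inferred from the output shown above. Unlike that output, this version does not store an explicit zero for `'C'`; under the sparse convention both dictionaries represent the same vector.

```python
class Vec:
    def __init__(self, labels, function):
        self.D = labels
        self.f = function

def scalar_mul(v, alpha):
    # Only stored entries are scaled, so the result stays sparse.
    # A new dictionary is built (v is not modified); the domain set is shared.
    return Vec(v.D, {d: alpha * value for d, value in v.f.items()})

v = Vec({'A', 'B', 'C'}, {'A': 1.0, 'B': 2.0})  # assumed sample vector
w = scalar_mul(v, 2)
print(w.f)         # {'A': 2.0, 'B': 4.0}
print(w.D is v.D)  # True
print(v.f)         # {'A': 1.0, 'B': 2.0}
```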

### Addition

**Quiz** Write a procedure ```add(u, v)``` with the following spec:
- *input*: instances ```u``` and ```v``` of ```Vec```
- *output*: an instance of ```Vec``` that is the vector sum of ```u``` and ```v```

e.g.
```
>>> u = Vec(v.D, {'A':5., 'C':10.})
>>> add(u, v)
<__main__.Vec object at 0x10058cd10>
>>> add(u, v).f
{'A': 6.0, 'C': 10.0, 'B': 2.0}
```



In [None]:
def add(u, v):
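
A sketch of one possible answer, assuming both vectors share the same domain. Only labels stored in at least one of the two dictionaries are visited, which keeps the result sparse. `Vec` and `getitem` are repeated so the block runs on its own, and the sample `v` with entries `'A': 1.0` and `'B': 2.0` is an assumption inferred from the output shown above.

```python
class Vec:
    def __init__(self, labels, function):
        self.D = labels
        self.f = function

def getitem(v, d):
    return v.f[d] if d in v.f else 0

def add(u, v):
    # Assumes u.D == v.D; labels stored in neither dictionary stay implicit zeros.
    stored = u.f.keys() | v.f.keys()
    return Vec(u.D, {d: getitem(u, d) + getitem(v, d) for d in stored})

v = Vec({'A', 'B', 'C'}, {'A': 1.0, 'B': 2.0})  # assumed sample vector
u = Vec(v.D, {'A': 5., 'C': 10.})
s = add(u, v)
print(sorted(s.f.items()))  # [('A', 6.0), ('B', 2.0), ('C', 10.0)]
```

The items are sorted before printing because the iteration order of a set of strings is not fixed between runs.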

## Dot-product

For two $\textit {D-vectors}$ $\mathbf{u}$ and $\mathbf{v}$, the *dot-product* is the sum of the products of corresponding entries:


\begin{equation*}
\mathbf{u \cdot v} = \sum_{k \in D} \mathbf{u}[k] \mathbf{v}[k]
\end{equation*}

For example, for traditional vectors $\mathbf{u} = [u_1, \dots, u_n]$ and $\mathbf{v} = [v_1, \dots, v_n]$,
\begin{equation*}
\mathbf{u \cdot v} = u_1v_1 + u_2v_2 + \dots + u_nv_n
\end{equation*}
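
For dictionary-represented vectors the sum can be sketched as follows. The function name `dot` and the sample vectors are illustrative assumptions; keys missing from a dictionary are treated as zero, so only labels stored in both dictionaries need to be visited.

```python
def dot(u, v):
    """Dot-product of two dictionary-represented D-vectors (missing keys are zero)."""
    # Only labels stored in both dictionaries can contribute a nonzero product.
    return sum(u[k] * v[k] for k in u.keys() & v.keys())

u = {'A': 1, 'B': 2, 'C': 3}
v = {'A': 4, 'C': -1}
print(dot(u, v))  # 1*4 + 3*(-1) = 1
```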