# Definitions of some basic terms

## Scalar
* It's a number of some kind (real, natural, etc.)
* Usually named with a lower-case italic letter and defined by specifying its kind, for example: $s \in I\!R$

## Vector
* It's an ordered array of scalars
* If we consider a vector as a point in space, each element is the coordinate on its corresponding axis
* Optionally named with a vector symbol (lower-case bold or with an arrow): $\mathbf{v}$ or $\vec{v}$
* Unless otherwise specified, vectors have dimension $n \times 1$ (meaning they are column vectors)
* Examples:
    * $\vec{v} \in I\!R^n$ 
    * $\vec{v}= \begin{pmatrix} v_{1} \\ v_{2} \\ \vdots \\ v_{n} \end{pmatrix} $
* Vectors have:
 * Direction
 * Magnitude (noted as $\| \mathbf{v} \|$)

### p-norm or $L^p$-norm

Given a vector $\vec{x} = (x_1 \cdots x_n)$, the p-norm $\| \vec{x} \|_p$, with $p \in I\!R, p \ge 1$, is defined as:

* $\| \vec{x} \|_p = \sqrt[p]{|x_1|^p + \cdots + |x_n|^p}$
* The distance between two points is the norm of their difference:
 * With $p=1$ it's called the "Manhattan distance" (a.k.a. rectilinear or taxicab distance)
 * With $p=2$ it's called the Euclidean distance (we use this one to calculate the length or magnitude of a vector)

For Python's Numpy:
* Euclidean norm:
 * numpy.linalg.norm(myVector), or numpy.linalg.norm(v1 - v2) for the distance between two points
* p-norm for vectors: numpy.linalg.norm(myVector, ord=p)
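
As a minimal sketch of the definition above, the p-norm can be computed by hand and compared against `numpy.linalg.norm`. The helper `p_norm` and the sample values are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def p_norm(x, p):
    """p-norm of a 1-D array, straight from the definition (assumes p >= 1)."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

v = np.array([3.0, -4.0])

manhattan = p_norm(v, 1)  # |3| + |-4|
euclidean = p_norm(v, 2)  # sqrt(3^2 + (-4)^2)

print(f'1-norm: {manhattan} (numpy: {np.linalg.norm(v, ord=1)})')
print(f'2-norm: {euclidean} (numpy: {np.linalg.norm(v, ord=2)})')

# Distance between two points = norm of their difference
p1 = np.array([1.0, 2.0])
p2 = np.array([4.0, 6.0])
dist_euclidean = np.linalg.norm(p1 - p2)
dist_manhattan = np.linalg.norm(p1 - p2, ord=1)
print(f'Euclidean distance: {dist_euclidean}, Manhattan distance: {dist_manhattan}')
```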

## Matrices
* Two dimensional arrays of scalars
* Examples:
    * $A \in I\!R^{m \times n}$
    * $A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots  & \vdots  & \ddots & \vdots  \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n}  \end{pmatrix}$
* Two matrices can be added or subtracted if they have the same dimensions

For Python's Numpy:
* myNdArray = numpy.array(myMatrix)

### Transpose

For Python's Numpy:
* myNdArray.T

In [23]:
import numpy as np

A=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(f'A=\n{A}\n\n'
      f'A\'=\n{A.T}')

A=
[[1 2 3]
 [4 5 6]
 [7 8 9]]

A'=
[[1 4 7]
 [2 5 8]
 [3 6 9]]


### Matrix element-wise multiplication (Hadamard product)

$C = A \circ B$

* $A=\begin{pmatrix}a_{11} & a_{12} \\ a_{21} & a_{22}\end{pmatrix}, B=\begin{pmatrix}b_{11} & b_{12} \\ b_{21} & b_{22}\end{pmatrix}$

* $C=\begin{pmatrix}a_{11}*b_{11} & a_{12}*b_{12} \\ a_{21}*b_{21} & a_{22}*b_{22}\end{pmatrix}$
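
A small NumPy sketch of the element-wise product, using made-up 2x2 matrices. For arrays, `*` and `numpy.multiply` both multiply element by element.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = A * B                  # element-wise (Hadamard) product
C_alt = np.multiply(A, B)  # same operation, as a function call

print(f'A o B=\n{C}')
```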


### Matrix multiplication (dot product)
$C=A \cdot B$
 * Two matrices (A and B) can be multiplied if the number of columns of the first equals the number of rows of the second
 * If $A \in I\!R^{m \times n}$, $B \in I\!R^{n \times p}$ then $C \in I\!R^{m \times p}$  
 * $C = A \cdot B : c_{i,j} = \sum_{k=1}^{n} a_{i,k}b_{k,j} $ 
   * $c_{i,j}$ contains the sum of the element-wise products of the $i$-th row of A and the $j$-th column of B 

Examples:
* $A=\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix} , X=\begin{pmatrix} x_1 \\ x_2 \\ 1 \end{pmatrix}, AX=(a_1x_1+a_2x_2+a_3)$

In [24]:
import numpy as np

A=np.array([1, 2, 3])

"""transpose documentation:
    For a 1-D array this has no effect, as a transposed vector is simply the same vector.
    To convert a 1-D array into a 2D column vector, an additional dimension must be added.
    np.atleast_2d(a).T achieves this, as does a[:, np.newaxis]."""
x1,x2=4,5
B=np.array([x1, x2, 1])
B = np.atleast_2d(B).T

print(f'A={A}\nB={B}')
print(f'A*B={A.dot(B)}')
print(f' =[{A[0]}*{B[0,0]}+{A[1]}*{B[1,0]}+{A[2]}*{B[2,0]}]')

A=[1 2 3]
B=[[4]
 [5]
 [1]]
A*B=[17]
 =[1*4+2*5+3*1]


* $A=\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix},\ B=\begin{pmatrix} b_{11} & b_{12} & b_{13}\\ b_{21} & b_{22} & b_{23}\end{pmatrix}$

* $C=A \cdot B = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23}\\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23}\end{pmatrix}$ 

In [3]:
import numpy as np

A=np.array([[1, 2], [3, 4]])
B=np.array([[5, 6, 7], [8, 9, 10]])
print(f'A=\n{A}')
print(f'B=\n{B}')
print(f'AB=\n{A.dot(B)}')


A=
[[1 2]
 [3 4]]
B=
[[ 5  6  7]
 [ 8  9 10]]
AB=
[[21 24 27]
 [47 54 61]]


#### Matrix multiplication properties:
* Not commutative, in general: $A \cdot B \neq B \cdot A$
* Associative: $A \cdot (B \cdot C) = (A \cdot B) \cdot C$
* Left distributive: $A \cdot (B+C) = A \cdot B + A \cdot C$ 
* Right distributive: $(A+B) \cdot C = A \cdot C + B \cdot C$

For Python's Numpy:
* element-wise product: myNdArray1 * myNdArray2
* dot product: myNdArray1.dot(myNdArray2)
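
The multiplication properties listed above can be checked numerically. This is only a spot check on three arbitrary integer matrices (an assumption for illustration), not a proof.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 3]])

AB = A.dot(B)
BA = B.dot(A)

# Integer matrices, so exact comparison is safe here
not_commutative = not np.array_equal(AB, BA)
associative = np.array_equal(A.dot(B.dot(C)), A.dot(B).dot(C))
left_distributive = np.array_equal(A.dot(B + C), A.dot(B) + A.dot(C))
right_distributive = np.array_equal((A + B).dot(C), A.dot(C) + B.dot(C))

print(f'AB=\n{AB}\nBA=\n{BA}')
print(f'AB != BA: {not_commutative}')
print(f'A(BC) == (AB)C: {associative}')
print(f'A(B+C) == AB+AC: {left_distributive}')
print(f'(A+B)C == AC+BC: {right_distributive}')
```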

### Dot multiplication of two vectors

* $\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|\cos(\theta)$, where $\theta$ is the angle between $\vec{a}$ and $\vec{b}$
* We can simplify to $\vec{a} \cdot \vec{b} = \|\vec{b}\|\ a_b$, where $a_b$ is the scalar projection of $\vec{a}$ onto $\vec{b}$
 * $\cos(\theta)$ is equal to the signed length of the projection of $\vec{a}$ onto $\vec{b}$ (the scalar projection) divided by $\|\vec{a}\|$
   * The scalar projection of $\vec{a}$ onto $\vec{b}$ is
     * $a_b = \|\vec{a}\|\cos(\theta)$
   * The vector projection is $proj_{\vec{b}}\,\vec{a} = a_b\ \hat{b}$, where $\hat{b}$ is the unit vector in the direction of $\vec{b}$

We can see that the dot product of two vectors, in a way, measures "how much they go in the same direction", so to speak.

Properties:
* Two non-zero vectors are orthogonal if and only if $\vec{a} \cdot \vec{b} = 0$
* $\vec{a} \cdot \vec{a} = \|\vec{a}\|^2$
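
A short sketch, with arbitrarily chosen vectors, of how the angle, the scalar projection and the vector projection follow from the dot product. It also checks that orthogonal vectors give zero and that a vector dotted with itself gives its squared norm.

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([2.0, 0.0])

dot_ab = a.dot(b)
cos_theta = dot_ab / (np.linalg.norm(a) * np.linalg.norm(b))
scalar_proj = np.linalg.norm(a) * cos_theta  # a_b = |a| cos(theta)
b_hat = b / np.linalg.norm(b)                # unit vector along b
vector_proj = scalar_proj * b_hat            # projection of a onto b

print(f'a.b={dot_ab}, cos(theta)={cos_theta}')
print(f'scalar projection={scalar_proj}, vector projection={vector_proj}')

# Orthogonal vectors have a zero dot product
c = np.array([0.0, 5.0])
print(f'c.b={c.dot(b)}')

# A vector dotted with itself gives its squared norm
print(f'a.a={a.dot(a)}, |a|^2={np.linalg.norm(a) ** 2}')
```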

### Some alternative ways to understand what the dot product of a matrix and a vector does...

* A transformation:
 * Given $A \in I\!R^{m \times n}, \vec{x} \in I\!R^{n \times 1}$
 * The product $A \cdot \vec{x}$ lives in an "output space" ($I\!R^{m \times 1}$): $A$ "translates" or "transforms" $\vec{x}$ from $I\!R^{n \times 1}$ into it.
* Linear or weighted combination of column-vectors:
 * Having $A=\begin{pmatrix} \vec{v_1} & \cdots & \vec{v_n}\end{pmatrix},\ \vec{x}=\begin{pmatrix} x_1 \\ \vdots \\ x_n\end{pmatrix}$
 * Then $A \cdot \vec{x} = x_1\vec{v_1} + \cdots + x_n\vec{v_n}$
* Dot product of row-vectors
 * Having $A=\begin{pmatrix} \vec{v_1}^T \\ \vdots \\ \vec{v_m}^T\end{pmatrix},\ \vec{x}=\begin{pmatrix} x_1 \\ \vdots \\ x_n\end{pmatrix}$
 * Then $A \cdot \vec{x} =\begin{pmatrix} \vec{v_1}^T \cdot \vec{x}\\ \vdots \\ \vec{v_m}^T\cdot \vec{x}\end{pmatrix}$
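
The column and row views above can be compared with the direct product. The matrix and vector below are arbitrary examples, and 1-D arrays are used for brevity instead of explicit column vectors.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # A is 2x3, so it maps vectors of size 3 to size 2
x = np.array([1, 0, -1])

direct = A.dot(x)

# View 1: weighted combination of the columns of A
column_view = sum(x[i] * A[:, i] for i in range(A.shape[1]))

# View 2: dot product of each row of A with x
row_view = np.array([row.dot(x) for row in A])

print(f'A.dot(x)={direct}')
print(f'column combination={column_view}')
print(f'row dot products={row_view}')
```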


### Identity matrix and inverse

The identity matrix $I_n$ is an $n \times n$ matrix that contains 1s on the main diagonal (upper-left to lower-right) and zeros
everywhere else. E.g.:

$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

Identity matrix properties (it acts like the number one in the multiplication of real numbers):
$A \cdot I = I \cdot A = A$

The inverse matrix $A^{-1}$ of a square matrix $A$ satisfies $A\cdot A^{-1} = A^{-1} \cdot A = I$.
Not all matrices have an inverse (for those cases some math libraries include functions to obtain pseudo-inverse matrices).

A singular matrix is a square matrix that does not have an inverse.

For Python's Numpy:
* numpy.identity(n)
* numpy.linalg.inv(myMatrix)
* numpy.linalg.pinv(myMatrix)


Examples:

In [8]:
import numpy as np

A = np.array([[1,2],[3,4]])
invA = np.linalg.inv(A)
pinvA = np.linalg.pinv(A) #Moore-Penrose pseudo-inverse

np.set_printoptions(precision=20, suppress=False)
print(f'A=\n{A}\n')
print(f'invA=\n{invA}\n')
print(f'pinvA=\n{pinvA}\n')
print(f'A * invA=\n{A.dot(invA)}\n')
print(f'A * pinvA=\n{A.dot(pinvA)}\n')

A=
[[1 2]
 [3 4]]

invA=
[[-1.9999999999999996  0.9999999999999998]
 [ 1.4999999999999998 -0.4999999999999999]]

pinvA=
[[-2.0000000000000018  1.0000000000000007]
 [ 1.5000000000000018 -0.5000000000000007]]

A * invA=
[[1.000000000000000e+00 0.000000000000000e+00]
 [8.881784197001252e-16 9.999999999999996e-01]]

A * pinvA=
[[ 1.0000000000000018e+00 -6.6613381477509392e-16]
 [ 1.7763568394002505e-15  9.9999999999999911e-01]]



### Eigenvectors and eigenvalues 

Eigenvectors are also called characteristic vectors.

Given:
* $A \cdot X = \lambda X$, with $A \in C^{n \times n}$, $X \in C^n$ and $X$ a non-zero vector
* we call X an eigenvector of A and the corresponding $\lambda$ an eigenvalue.
* $A$ only scales X by $\lambda$: X stays on the same line (its direction is kept, or reversed when $\lambda$ is negative)
* Equivalently, $A \cdot X - \lambda X = 0$, $(A - \lambda I)\cdot X = 0$ and $(\lambda I - A)\cdot X=0$, with $X\neq0$

The idea is that we are able to identify which vectors don't change direction.
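
As a rough numerical check of the equations above, the sketch below takes a small symmetric matrix (chosen arbitrarily) and verifies that $(A - \lambda I)\cdot X$ is approximately zero for each eigenpair. Since a non-zero $X$ solves that system, $A - \lambda I$ must be singular, so its determinant is also approximately zero.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
I = np.identity(2)

e_values, e_vectors = np.linalg.eig(A)

residuals = []
determinants = []
for lam, x in zip(e_values, e_vectors.T):   # eigenvectors are the columns of e_vectors
    residuals.append((A - lam * I).dot(x))          # should be ~[0, 0]
    determinants.append(np.linalg.det(A - lam * I)) # should be ~0
    print(f'eigenvalue={lam}, (A - lambda*I).x={residuals[-1]}, det={determinants[-1]}')
```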

#### In the context of graph theory
With $A$ being the adjacency matrix (can be weighted) of graph $G$:
* $A$'s greatest eigenvalue and its corresponding eigenvector are used to study node centrality
* Eigenvector centrality (here $c$) of a node gives us an indication of its relative influence or relevance in the graph
 * If a node is pointed to by many nodes with a high centrality score, said node will have a high score
 * $c(x_i) = \frac{1}{\lambda} \sum_{x_j \in N(x_i)} c(x_j) = \frac{1}{\lambda} \sum_{x_j \in G} [connected(x_i, x_j)] c(x_j)$
 (NOTE! "[" and "]" here are Iverson brackets: 1 if the condition is true and 0 otherwise).
* This idea is used by Google's PageRank algorithm
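
A minimal sketch of the centrality formula on a made-up undirected, unweighted graph (node 0 linked to nodes 1, 2 and 3, plus a link between 1 and 2). It uses `numpy.linalg.eigh`, which assumes a symmetric matrix, and takes absolute values because the sign of an eigenvector is arbitrary.

```python
import numpy as np

# Hypothetical adjacency matrix: A[i, j] = 1 if nodes i and j are connected
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)

e_values, e_vectors = np.linalg.eigh(A)  # eigenvalues come back in ascending order
lam = e_values[-1]                       # greatest eigenvalue
c = np.abs(e_vectors[:, -1])             # centrality scores (sign normalised)

# Apply the formula node by node: c(x_i) = (1/lambda) * sum_j A[i, j] * c(x_j)
n = len(c)
c_from_formula = np.array([sum(A[i, j] * c[j] for j in range(n)) / lam
                           for i in range(n)])

print(f'lambda={lam}')
print(f'centrality={c}')
print(f'formula   ={c_from_formula}')
print(f'most central node={np.argmax(c)}')
```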

#### Eigenvectors are also used with the covariance matrix:
* Covariance measures how much two random variables vary together
* Covariance matrix $C \in I\!R^{d \times d}, C_{i,j}=\sigma(x_i, x_j)$
 * d is the number of dimensions (variables, features, etc)
 * $x_i, x_j$ are random variables
 * $\sigma(x_i, x_j)= \sigma(x_j, x_i)$
* In this context "The eigenvectors are unit vectors representing the direction of the largest variance of the data,
while the eigenvalues represent the magnitude of this variance in the corresponding directions." Source: https://datascienceplus.com/understanding-the-covariance-matrix/
 * This is used in Principal Component Analysis (PCA)
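
The sketch below illustrates the idea on a tiny invented data set in which the second variable is roughly twice the first. It assumes `numpy.cov` with its default layout (one variable per row) and uses `numpy.linalg.eigh`, which is suited to symmetric matrices such as a covariance matrix.

```python
import numpy as np

# Hypothetical data: 2 variables (rows), 5 observations (columns), with y roughly 2x
data = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                 [2.1, 3.9, 6.2, 7.8, 10.1]])

C = np.cov(data)                         # 2x2 covariance matrix
e_values, e_vectors = np.linalg.eigh(C)  # eigenvalues in ascending order

principal_direction = e_vectors[:, -1]   # direction of largest variance
explained = e_values[-1] / e_values.sum()

print(f'C=\n{C}')
print(f'principal direction={principal_direction}')
print(f'fraction of variance along it={explained}')
```

Because y grows about twice as fast as x, the principal direction has a larger y component than x component.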

For Python's Numpy:
* w,v = numpy.linalg.eig(myMatrix) -> returns eigenvalues & normalized eigenvectors
 * column v[:,i] is the eigenvector corresponding to the eigenvalue w[i]
 * TIP: if we need eigenvectors as rows use v.T instead

Example 1:

In [10]:
import numpy as np

A = np.array([[1,2],[3,4]])
e_values, e_vectors = np.linalg.eig(A)
e_vectors_aux = e_vectors.T

for l, v in zip(e_values, e_vectors_aux):
    print(f'eigenvalue={l},\neigenvector={v}')
    print(f'=> A * eigenVector={A.dot(v.T)}') #works even without ".T" but beware of dimensions!!!
    print(f'=> eigenvalue * eigenvector={l*v}\n')

print(f'Calculating all at once.')
print(f'A * eigenVectorMatrix=\n{A.dot(e_vectors)}')

eigenvalue=-0.3722813232690143,
eigenvector=[-0.82456484  0.56576746]
=> A * eigenVector=[ 0.30697009 -0.21062466]
=> eigenvalue * eigenvector=[ 0.30697009 -0.21062466]

eigenvalue=5.372281323269014,
eigenvector=[-0.41597356 -0.90937671]
=> A * eigenVector=[-2.23472698 -4.88542751]
=> eigenvalue * eigenvector=[-2.23472698 -4.88542751]

Calculating all at once.
A * eigenVectorMatrix=
[[ 0.30697009 -2.23472698]
 [-0.21062466 -4.88542751]]


Example 2, a simple tree graph. Check that the root node has the highest eigenvector centrality

In [9]:
import numpy as np

#Graph connectivity matrix
A = np.array([[0,.3,.7],[.3, 0, 0], [.7, 0, 0]])
e_values, e_vectors = np.linalg.eig(A)
e_vectors_aux = e_vectors.T

for l, v in zip(e_values, e_vectors_aux):
    if l>0:
        print(f'eigenvalue={l},\neigenvector={v}')


eigenvalue=0.7615773105863907,
eigenvector=[0.70710678 0.27854301 0.64993368]


## Putting it all together

### Matrix and equation systems

Let's assume we have the following system of equations:
* $3x_1 + 2x_2 = 13$
* $6x_1 - 3x_2 = 6$

We can rewrite them as:

$ A = \begin{pmatrix} 3 & 2 \\ 6 & -3 \end{pmatrix},\ \vec{x}= \begin{pmatrix}x_1 \\ x_2\end{pmatrix},\ \vec{b}=\begin{pmatrix}13 \\ 6\end{pmatrix}$

$A \cdot \vec{x} = \vec{b}$

Let's solve using what we've learned so far:

* $A^{-1} \cdot A \cdot \vec{x} =  A^{-1} \cdot \vec{b}$
* $I \cdot \vec{x} =  A^{-1} \cdot \vec{b}$
* $\vec{x} =  A^{-1} \cdot \vec{b}$

Once we've calculated $A^{-1}$ we can even use it to solve with different values of $\vec{b}$.



#### Example 1 with Python's Numpy

In [13]:
import numpy as np

A = np.array([[3,2],[6,-3]])
b1 = np.array([[13],[6]])
b2 = np.array([[21],[3]])

invA = np.linalg.inv(A)

print(f'Solution for b1, x=\n{invA.dot(b1)}')
print(f'Check A.dot(x)=\n{A.dot(invA.dot(b1))}\n')
print(f'Solution for b2, x=\n{invA.dot(b2)}')
print(f'Check A.dot(x)=\n{A.dot(invA.dot(b2))}\n')

A = np.array([[3,0],[6,0]]) #What happens here?
invA = np.linalg.inv(A) #Will throw "LinAlgError: Singular matrix"
#invA = np.linalg.pinv(A) #Will give us a WRONG answer
print(f'Solution for b1, x={invA.dot(b1)}')
print(f'Check our result A.dot(x)={A.dot(invA.dot(b1))}\n')

Solution for b1, x=
[[2.42857143]
 [2.85714286]]
Check A.dot(x)=
[[13.]
 [ 6.]]

Solution for b2, x=
[[3.28571429]
 [5.57142857]]
Check A.dot(x)=
[[21.]
 [ 3.]]



LinAlgError: Singular matrix
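
As a complementary sketch (an alternative technique, not the method used in the cell above), `numpy.linalg.solve` solves $A \cdot \vec{x} = \vec{b}$ without explicitly forming the inverse. For the singular matrix it raises the same error, and `numpy.linalg.lstsq` only returns a least-squares approximation, which is why the pseudo-inverse "solution" does not reproduce $\vec{b}$.

```python
import numpy as np

A = np.array([[3.0, 2.0], [6.0, -3.0]])
b1 = np.array([13.0, 6.0])

x = np.linalg.solve(A, b1)  # solves A.x = b1 directly
print(f'solve: x={x}, check A.dot(x)={A.dot(x)}')

# Singular case: the second column is all zeros, so the rank is 1 instead of 2
S = np.array([[3.0, 0.0], [6.0, 0.0]])
rank = np.linalg.matrix_rank(S)
try:
    np.linalg.solve(S, b1)
    solve_failed = False
except np.linalg.LinAlgError:
    solve_failed = True
print(f'rank(S)={rank}, solve raised LinAlgError: {solve_failed}')

# Least-squares approximation: minimises |S.x - b1| but is not an exact solution
x_ls, _, _, _ = np.linalg.lstsq(S, b1, rcond=None)
print(f'lstsq: x={x_ls}, S.dot(x)={S.dot(x_ls)} (b1={b1})')
```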

### Solve for the coefficients of a quadratic function

$ax^2+bx+c=y$
* we need to know three points (one for each unknown)
* let's say we know $(x_1, y_1),\ (x_2, y_2),\ (x_3, y_3)$

We can write:

$\begin{pmatrix}x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1\end{pmatrix} \cdot \begin{pmatrix} a \\ b \\ c\end{pmatrix} =  \begin{pmatrix} y_1 \\ y_2 \\ y_3\end{pmatrix}$

It's the same as before $A \cdot \vec{x} = \vec{b}$

Again $\vec{x} =  A^{-1} \cdot \vec{b}$


#### Example 2 with Python's Numpy

In [14]:
import numpy as np

A = np.array([[1,1,1],[4,2,1], [9,3,1]])
b = np.array([[4],[5],[6]])
invA = np.linalg.inv(A)
x = invA.dot(b)
print(f'Solution invA.dot(b) = {x}\n')
print(f'Equation:\n{x[0][0]} * x^2 + {x[1][0]} * x + {x[2][0]} = y\n')

print(f'Check our result A.dot(x)=\n{A.dot(x)}')

Solution invA.dot(b) = [[-4.4408921e-16]
 [ 1.0000000e+00]
 [ 3.0000000e+00]]

Equation:
-4.440892098500626e-16 * x^2 + 1.0 * x + 3.0000000000000018 = y

Check our result A.dot(x)=
[[4.]
 [5.]
 [6.]]
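
A possible variation on the example above: `numpy.vander` can build the matrix of powers from the x values, and `numpy.linalg.solve` can replace the explicit inverse. The three points are invented so that they lie on $y = x^2 + 2x + 3$.

```python
import numpy as np

# Hypothetical points lying on y = x^2 + 2x + 3
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([6.0, 11.0, 18.0])

A = np.vander(xs, 3)              # columns: x^2, x, 1
a, b, c = np.linalg.solve(A, ys)  # coefficients of a*x^2 + b*x + c

print(f'A=\n{A}')
print(f'Equation: {a} * x^2 + {b} * x + {c} = y')
```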


## Tensors
* multidimensional arrays (matrices are 2D tensors)
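
In NumPy terms, and as a loose illustration only, the number of dimensions of an array is available as `ndim` and its size along each dimension as `shape`:

```python
import numpy as np

scalar = np.array(5.0)                # 0 dimensions
vector = np.array([1, 2, 3])          # 1 dimension
matrix = np.array([[1, 2], [3, 4]])   # 2 dimensions
tensor3 = np.zeros((2, 3, 4))         # 3 dimensions

for name, t in [('scalar', scalar), ('vector', vector),
                ('matrix', matrix), ('tensor3', tensor3)]:
    print(f'{name}: ndim={t.ndim}, shape={t.shape}')
```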