<a href="https://colab.research.google.com/github/milica-golocorbin/math_for_dl/blob/main/Linear%20Algebra.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import numpy as np
import torch
import tensorflow as tf

# Data Structures for Linear Algebra

## Tensors - Fundamental data structure

Tensors are arrays of numbers. In machine learning, a tensor generalizes vectors and matrices to any number of dimensions.

- zero-dimensional tensor: SCALAR
- one-dimensional tensor: VECTOR
- two-dimensional tensor: MATRIX
- higher-dimensional tensor (*n* dimensions): *n*-TENSOR
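
The four cases in the list can be checked with NumPy's `ndim` and `shape` attributes. This is a small sketch; the variable names are made up for the illustration and are not used elsewhere in the notebook.

```python
import numpy as np

# Illustrative names only (not used elsewhere in this notebook).
scalar = np.array(25)                    # zero dimensions
vector = np.array([25, 2, 5])            # one dimension
matrix = np.array([[25, 2], [5, 26]])    # two dimensions
tensor3 = np.zeros((2, 3, 4))            # three dimensions

arrays = (scalar, vector, matrix, tensor3)
dims = [a.ndim for a in arrays]      # number of dimensions of each array
shapes = [a.shape for a in arrays]   # length along each dimension
print(dims)
print(shapes)
```

A scalar has an empty shape, which is why the PyTorch and TensorFlow scalars report `torch.Size([])` and `TensorShape([])`.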

In [2]:
# SCALAR TENSORS
# single number, no dimensions

# base python
num = 25
type(num)

# pytorch
num_pt = torch.tensor(25)
num_pt

# tensorflow
num_tf = tf.Variable(25)
num_tf

<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=25>

In [3]:
# shape - pytorch (an empty shape means zero dimensions)
num_pt.shape

torch.Size([])

In [4]:
# shape - tensorflow (an empty shape means zero dimensions)
num_tf.shape

TensorShape([])

In [5]:
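# adding a PyTorch scalar to a TensorFlow variable (mixing frameworks;
# keeping both operands in one framework is the safer habit)
result = num_pt + num_tf
# result
result.numpy()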

50

In [6]:
# VECTOR TENSORS
# one dimensional array of numbers(scalars)

# numpy
y = np.array([25, 2, 5])
y

# pytorch
y_pt = torch.tensor([25, 2, 5])
y_pt

# tensorflow
y_tf = tf.Variable([25, 2, 5])
y_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([25,  2,  5], dtype=int32)>

In [7]:
y.shape

(3,)

In [8]:
y[0]

25

## Tensor Transposition

Transposition turns rows into columns and vice versa: the element at row *i*, column *j* of $X^T$ is the element at row *j*, column *i* of $X$.

**$(X^T)_{i,j} = X_{j,i}$**
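
The index rule can be verified directly. The sketch below builds a small matrix (the values are just an example) and checks every index pair of its transpose.

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])   # shape (3, 2)
X_t = X.T                                   # shape (2, 3)

# Check that X_t[i, j] equals X[j, i] for every valid index pair.
rule_holds = all(
    X_t[i, j] == X[j, i]
    for i in range(X_t.shape[0])
    for j in range(X_t.shape[1])
)
print(X_t.shape, rule_holds)
```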

In [9]:
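# vector transposition
# row vector (1,3) -> column vector (3,1)

# y is 1-D with shape (3,), so .T leaves it unchanged
y_t = y.T
y_t

# the extra pair of brackets makes z 2-D with shape (1,3), so .T gives (3,1)
z = np.array([[25, 2, 3]])
z_t = z.T
z_t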

array([[25],
       [ 2],
       [ 3]])

In [10]:
z_t.shape

(3, 1)
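
An existing 1-D array can also be turned into a row or column vector with `reshape`, without retyping its values. This is a sketch of one possible approach, not the only one.

```python
import numpy as np

y = np.array([25, 2, 5])    # 1-D array, shape (3,)
row = y.reshape(1, -1)      # row vector, shape (1, 3)
col = y.reshape(-1, 1)      # column vector, shape (3, 1)

# Transposing the 1-D array changes nothing; transposing the row gives the column.
print(y.T.shape, row.shape, col.shape)
print(np.array_equal(row.T, col))
```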

In [11]:
# MATRIX TENSORS
# two dimensional array of numbers(scalars)

# numpy
M = np.array([[25, 2], [5, 26], [3, 7]])
M

# pytorch
M_pt = torch.tensor([[25, 2], [5, 26], [3, 7]])
M_pt

# tensorflow
M_tf = tf.Variable([[25, 2], [5, 26], [3, 7]])
M_tf

<tf.Variable 'Variable:0' shape=(3, 2) dtype=int32, numpy=
array([[25,  2],
       [ 5, 26],
       [ 3,  7]], dtype=int32)>

In [12]:
M.shape
# (row, column)

(3, 2)

In [13]:
M.size

6

In [14]:
M[:,0]
# first column

array([25,  5,  3])

In [15]:
M[1,:]
# second row

array([ 5, 26])

In [16]:
tf.rank(M_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=2>

In [17]:
# matrix transposition

M.T

array([[25,  5,  3],
       [ 2, 26,  7]])

In [18]:
# matrix transposition pytorch

M_pt.T

tensor([[25,  5,  3],
        [ 2, 26,  7]])

In [19]:
# matrix transposition tensorflow

tf.transpose(M_tf)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[25,  5,  3],
       [ 2, 26,  7]], dtype=int32)>

In [20]:
y

array([25,  2,  5])

In [21]:
# arithmetic operations
#vectors
y * 2

array([50,  4, 10])

In [22]:
y + 2

array([27,  4,  7])

In [23]:
M

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [24]:
# arithmetic operations
#matrix
M * 2

array([[50,  4],
       [10, 52],
       [ 6, 14]])

In [25]:
M + 2

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [26]:
# with pytorch

M_pt * 2

tensor([[50,  4],
        [10, 52],
        [ 6, 14]])

In [27]:
torch.add(M_pt, 2)

tensor([[27,  4],
        [ 7, 28],
        [ 5,  9]])

In [28]:
torch.mul(M_pt, 2)

tensor([[50,  4],
        [10, 52],
        [ 6, 14]])

In [29]:
# with tensorflow

M_tf * 2

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[50,  4],
       [10, 52],
       [ 6, 14]], dtype=int32)>

In [30]:
tf.add(M_tf, 2)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[27,  4],
       [ 7, 28],
       [ 5,  9]], dtype=int32)>

In [31]:
tf.multiply(M_tf, 2)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[50,  4],
       [10, 52],
       [ 6, 14]], dtype=int32)>

## HADAMARD PRODUCT

If two tensors have the same shape, arithmetic operators are applied element by element. Element-wise multiplication is called the Hadamard product:

$A \odot X$
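
To make "element by element" concrete, the sketch below writes the product out with explicit loops and compares it with NumPy's `*` operator. The name `hadamard_loop` is introduced only for this illustration.

```python
import numpy as np

A = np.array([[27, 4], [7, 28], [5, 9]])
X = np.array([[25, 2], [5, 26], [3, 7]])

# Multiply matching entries one pair at a time.
hadamard_loop = np.zeros_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        hadamard_loop[i, j] = A[i, j] * X[i, j]

hadamard = A * X   # NumPy's * operator is element-wise
print(hadamard)
print(np.array_equal(hadamard, hadamard_loop))
```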

In [32]:
A = M + 2

A

array([[27,  4],
       [ 7, 28],
       [ 5,  9]])

In [33]:
M

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [34]:
A + M

array([[52,  6],
       [12, 54],
       [ 8, 16]])

In [35]:
# HADAMARD PRODUCT
A * M

array([[675,   8],
       [ 35, 728],
       [ 15,  63]])

In [36]:
# The * operator is also element-wise in PyTorch and TensorFlow.

## REDUCTION

Reduction collapses a tensor to a single value, here by summing all of its elements.

- For a vector **x** of length *n*, we calculate:

  $\sum^n_{i=1} x_i$

  $x_i$ is the $i^{th}$ element of **x**, with *i* running from the first element (i=1) to the $n^{th}$ one. Sigma ($\sum$) means sum.

- For a matrix **X** with *m* rows and *n* columns, we calculate:

   $\sum^m_{i=1} \sum^n_{j=1} X_{i,j}$

In other words, **sum all elements** and reduce the tensor to its total.
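
The double sum for a matrix maps onto two nested loops, one over rows and one over columns. A minimal sketch, using the same values as `M` in this notebook:

```python
import numpy as np

X = np.array([[25, 2], [5, 26], [3, 7]])
m, n = X.shape   # m rows, n columns

# Outer loop is the sum over i, inner loop is the sum over j.
total = 0
for i in range(m):
    for j in range(n):
        total += X[i, j]

print(total, X.sum())
```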

In [37]:
y

array([25,  2,  5])

In [38]:
y.sum()

32

In [39]:
M

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [40]:
M.sum()

68

In [41]:
torch.sum(y_pt)

tensor(32)

In [42]:
torch.sum(M_pt)

tensor(68)

In [43]:
tf.reduce_sum(y_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=32>

In [44]:
tf.reduce_sum(M_tf)

<tf.Tensor: shape=(), dtype=int32, numpy=68>

In [45]:
# sum along a single axis

M.sum(axis=0)
# axis=0 sums down the rows, giving one total per column

array([33, 35])

In [46]:
# sum along a single axis

M.sum(axis=1)
# axis=1 sums across the columns, giving one total per row

array([27, 31, 10])

In [47]:
torch.sum(M_pt, 0)

tensor([33, 35])

In [48]:
tf.reduce_sum(M_tf, 1)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([27, 31, 10], dtype=int32)>

## DOT PRODUCT

It is a combination of the HADAMARD PRODUCT and REDUCTION.

It is computed between two vectors.

If two vectors (x, y) have the same length, we can calculate the dot product between them. Common notations are:

- $x \cdot y$
- $x^T y$
- $\langle x, y \rangle$

We multiply the vectors element-wise and then sum the products, which reduces them to a single scalar value.

$x \cdot y = \sum^n_{i=1} x_iy_i$

We multiply the first element of x by the first element of y, do the same for the second pair, and continue up to the $n^{th}$ element of x and the $n^{th}$ element of y. Once we have all of the products, we sum them up ($\sum$).

THE DOT PRODUCT IS PERFORMED AT EVERY ARTIFICIAL NEURON IN A DEEP NEURAL NETWORK.
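
The two steps can be written out separately and compared with `np.dot`. The second half sketches the neuron remark: a neuron's weighted sum is a dot product of weights and inputs plus a bias. The weights `w` and `bias` below are made-up values for illustration only.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([25, 2, 5])

elementwise = x * y               # Hadamard product: [0, 2, 10]
dot_manual = elementwise.sum()    # reduction to a scalar
dot_builtin = np.dot(x, y)
print(dot_manual, dot_builtin)

# Hypothetical neuron: weights and bias are invented for this example.
w = np.array([0.5, -1.0, 2.0])
bias = 0.1
z = np.dot(w, x) + bias           # weighted sum of the inputs plus the bias
print(z)
```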

In [49]:
y

array([25,  2,  5])

In [50]:
x = np.array([0,1,2])
x

array([0, 1, 2])

In [51]:
# manual dot product
res = 25 * 0 + 2 * 1 + 5 * 2
res

12

In [52]:
# dot product numpy
np.dot(x,y)

12

In [53]:
y_pt

tensor([25,  2,  5])

In [54]:
x_pt = torch.tensor([0,1,2])
x_pt

tensor([0, 1, 2])

In [55]:
# dot product pytorch

res_pt = torch.dot(y_pt, x_pt)
res_pt

tensor(12)

In [56]:
y_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([25,  2,  5], dtype=int32)>

In [57]:
x_tf = tf.Variable([0,1,2])
x_tf

<tf.Variable 'Variable:0' shape=(3,) dtype=int32, numpy=array([0, 1, 2], dtype=int32)>

In [58]:
# dot product tensorflow (there is no tf.dot: multiply element-wise, then reduce;
# tf.tensordot(x_tf, y_tf, axes=1) is an alternative)
res_tf = tf.reduce_sum(tf.multiply(x_tf, y_tf))
res_tf

<tf.Tensor: shape=(), dtype=int32, numpy=12>

# Matrix Multiplication

The second matrix must have as many rows as the first matrix has columns.

The resulting matrix has as many rows as the first matrix and as many columns as the second matrix.

Each entry of the result is computed by element-wise multiplication followed by summation: entry (i, j) is the dot product of row i of the first matrix with column j of the second.

It's like the dot product, but with matrices.
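
The shape rule and the row-by-column recipe can be sketched with explicit loops and compared with NumPy's `@` operator. The names `A`, `B` and `C` here are local to this example.

```python
import numpy as np

A = np.array([[1, 2, -1], [3, 4, -2], [5, 6, -1]])   # 3 rows, 3 columns
B = np.array([[25, 2], [5, 26], [3, 7]])             # 3 rows, 2 columns

# Columns of the first matrix must match rows of the second.
assert A.shape[1] == B.shape[0]

# The result has the rows of A and the columns of B.
C = np.zeros((A.shape[0], B.shape[1]), dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        # entry (i, j): dot product of row i of A with column j of B
        C[i, j] = np.dot(A[i, :], B[:, j])

print(C)
print(np.array_equal(C, A @ B))
```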

## Matrix with Vector

In [59]:
M

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [68]:
b = np.array([1,2])
b

array([1, 2])

In [69]:
np.dot(M,b)

array([29, 57, 17])
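
Each entry of a matrix-vector product is the dot product of one row of the matrix with the vector. A short sketch of that view, with `by_rows` as a name introduced just for this example:

```python
import numpy as np

M = np.array([[25, 2], [5, 26], [3, 7]])
b = np.array([1, 2])

# One dot product per row of M.
by_rows = np.array([np.dot(row, b) for row in M])
print(by_rows)
print(np.array_equal(by_rows, M @ b))   # @ is NumPy's matrix-multiplication operator
```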

In [70]:
M_pt

tensor([[25,  2],
        [ 5, 26],
        [ 3,  7]])

In [71]:
b_pt = torch.tensor([1,2])
b_pt

tensor([1, 2])

In [72]:
torch.matmul(M_pt, b_pt)

tensor([29, 57, 17])

In [73]:
M_tf

<tf.Variable 'Variable:0' shape=(3, 2) dtype=int32, numpy=
array([[25,  2],
       [ 5, 26],
       [ 3,  7]], dtype=int32)>

In [74]:
b_tf = tf.Variable([1,2])
b_tf

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 2], dtype=int32)>

In [75]:
tf.linalg.matvec(M_tf, b_tf)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([29, 57, 17], dtype=int32)>

## Matrix with Matrix

In [78]:
B = np.array([[1,2,-1],[3,4,-2],[5,6,-1]])
B

array([[ 1,  2, -1],
       [ 3,  4, -2],
       [ 5,  6, -1]])

In [76]:
M

array([[25,  2],
       [ 5, 26],
       [ 3,  7]])

In [80]:
np.dot(B,M)

array([[ 32,  47],
       [ 89,  96],
       [152, 159]])

In [82]:
B_pt = torch.tensor([[1,2,-1],[3,4,-2],[5,6,-1]])
B_pt

tensor([[ 1,  2, -1],
        [ 3,  4, -2],
        [ 5,  6, -1]])

In [83]:
torch.matmul(B_pt, M_pt)

tensor([[ 32,  47],
        [ 89,  96],
        [152, 159]])

In [84]:
B_tf = tf.Variable([[1,2,-1],[3,4,-2],[5,6,-1]])
B_tf

<tf.Variable 'Variable:0' shape=(3, 3) dtype=int32, numpy=
array([[ 1,  2, -1],
       [ 3,  4, -2],
       [ 5,  6, -1]], dtype=int32)>

In [85]:
tf.matmul(B_tf, M_tf)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 32,  47],
       [ 89,  96],
       [152, 159]], dtype=int32)>
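
The order of the operands matters because of the shape rule. In the sketch below, `B @ M` works, while `M @ B` raises an error because `M` has 2 columns and `B` has 3 rows. The flag `order_swappable` is a name made up for this example.

```python
import numpy as np

B = np.array([[1, 2, -1], [3, 4, -2], [5, 6, -1]])   # shape (3, 3)
M = np.array([[25, 2], [5, 26], [3, 7]])             # shape (3, 2)

print((B @ M).shape)   # (3, 3) times (3, 2) is allowed

try:
    M @ B              # (3, 2) times (3, 3): inner dimensions differ
    order_swappable = True
except ValueError as err:
    order_swappable = False
    print("M @ B failed:", err)

# Transposing M first makes the inner dimensions match again.
print((M.T @ B).shape)
```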