<a href="https://colab.research.google.com/github/thihanaung-thnn/mathematics/blob/main/linear_algebra_basic_operations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

These are notes from **Basics of Linear Algebra for Machine Learning** by **Jason Brownlee**. If you want to build intuition for linear algebra, try the [3Blue1Brown - Essence of Linear Algebra series](https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab). For me, it is the best linear algebra teaching series. 

In [None]:
import numpy as np

## **Introduction to NumPy**


### **Creating a NumPy array**

In [None]:
# create array 
l = [1.0, 2.0, 3.0]
a = np.array(l)
print(a)
print(a.shape)
print(a.dtype)

[1. 2. 3.]
(3,)
float64


In [None]:
# create an empty (uninitialized) array; the values are whatever was already in memory
np.empty([3,3])

array([[4.64104333e-310, 3.60739284e-313, 1.38338381e-322],
       [4.64104333e-310, 0.00000000e+000, 0.00000000e+000],
       [4.94065646e-323, 0.00000000e+000, 0.00000000e+000]])

In [None]:
# create an array of zeros
np.zeros([3,4])

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [None]:
# create an array of ones
a = np.ones([3,3])
print(a)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [None]:
# combining arrays
# vertical stack
a1 = np.array([1,2,3])
a2 = np.array([4,5,6])
a3 = np.vstack((a1,a2))
print(a3)
print(a3.shape)

[[1 2 3]
 [4 5 6]]
(2, 3)


In [None]:
# horizontal stack
a3 = np.hstack((a1,a2))
print(a3)
print(a3.shape)

[1 2 3 4 5 6]
(6,)


### **Indexing, slicing, and reshaping NumPy arrays**

In [None]:
# one_dimensional array
data1 = [11,22,33,44,55]
data1 = np.array(data1)
print(data1) 
print(f"First number from array : {data1[0]}")
print(f"Last number from array  : {data1[-1]}")

[11 22 33 44 55]
First number from array : 11
Last number from array  : 55


In [None]:
# two_dimensional array
data2 = np.array([[11,22],[33,44],[55,66]])
print(f"data : \n{data2}\n")
print(f"data[0,]  : {data2[0,]}")
print(f"data[:,0] : {data2[:,0]}")
print(f"data[1,1] : {data2[1,1]}")

data : 
[[11 22]
 [33 44]
 [55 66]]

data[0,]  : [11 22]
data[:,0] : [11 33 55]
data[1,1] : 44


In [None]:
# slicing one dimension
# data[from:to]
data1

array([11, 22, 33, 44, 55])

In [None]:
print(data1[:])

[11 22 33 44 55]


In [None]:
print(data1[0:3])

[11 22 33]


In [None]:
# slicing two dimensions
data = np.array([[11,22,33],[44,55,66],[77,88,99]])
print(data)

[[11 22 33]
 [44 55 66]
 [77 88 99]]


In [None]:
# column slice
X, y = data[:,:-1], data[:,-1]
print(X)
print('='*14)
print(y)

[[11 22]
 [44 55]
 [77 88]]
==============
[33 66 99]


In [None]:
# row slice 
split = 2
train, test = data[:split, :], data[split:, :]
print(train)
print('='*14)
print(test)

[[11 22 33]
 [44 55 66]]
==============
[[77 88 99]]


In [None]:
# array shapes 
data = np.array([11,22,33,44,55])
print(data.shape)

(5,)


In [None]:
data = np.array([[11,22],[33,44],[55,66]])
print(data)
print('='*14)
print(f"Rows    : {data.shape[0]}")
print(f"Columns : {data.shape[1]}")

[[11 22]
 [33 44]
 [55 66]]
==============
Rows    : 3
Columns : 2


In [None]:
# reshape 1D to 2D 
data = np.array([11,22,33,44,55])
print(f"before reshape : {data.shape}")

data = data.reshape(data.shape[0],1)
print(f"after reshape  : {data.shape}")

before reshape : (5,)
after reshape  : (5, 1)


In [None]:
# reshape 2D to 3D
data = np.array([[11,22],[33,44],[55,66]])
print(f"before reshape : {data.shape}")

data = data.reshape(data.shape[0], data.shape[1], 1)
print(f"after reshape  : {data.shape}")

before reshape : (3, 2)
after reshape  : (3, 2, 1)


### **Array Broadcasting**

In [None]:
# scalar and one_dimensional array
# [a1,a2,a3] + [b1,b2,b3] = [a1+b1,a2+b2,a3+b3]
# [a1,a2,a3] + b = [a1+b,a2+b,a3+b]  => broadcasting
a = np.array([1,2,3])
b = 2 
print(a + b)

[3 4 5]


In [None]:
# scalar and two dimensional array
A = np.array([[1,2,3],[4,5,6]])
print(A + b)

[[3 4 5]
 [6 7 8]]


In [None]:
# 1D vs 2D
c = np.array([2, 3, 4])
print(f"{A} + {c} \n============\n {A+c}")

[[1 2 3]
 [4 5 6]] + [2 3 4] 
============
 [[ 3  5  7]
 [ 6  8 10]]


In [None]:
# broadcasting error
d = np.array([1,2])
print(A)
print(d)
try:
    A + d
except ValueError as err:
    # report the actual NumPy error message instead of a hard-coded string
    print(f"ValueError: {err}")

[[1 2 3]
 [4 5 6]]
[1 2]
ValueError: operands could not be broadcast together with shapes (2,3) (2,)
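
One way to make the failing addition above work, assuming the intent is to add one value of `d` to each row of `A`, is to reshape `d` into a column. This is only a sketch; the name `d_col` is introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
d = np.array([1, 2])                   # shape (2,), does not match the last axis of A

# Reshape d into a column so its shape (2, 1) lines up with the two rows of A.
d_col = d.reshape(-1, 1)
result = A + d_col

print(d_col.shape)   # (2, 1)
print(result)
```

Broadcasting compares shapes from the last axis backwards, so (2, 3) and (2, 1) are compatible while (2, 3) and (2,) are not.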



## **Vectors and Matrices**




### **Vector Arithmetic**

In [None]:
# defining vector 
v = np.array([1,2,3])
print(v)

[1 2 3]


$\begin{pmatrix}
a_1 \\
a_2 \\
a_3
\end{pmatrix} +
\begin{pmatrix}
b_1 \\
b_2 \\
b_3
\end{pmatrix} = 
\begin{pmatrix}
a_1 + b_1 \\
a_2 + b_2 \\
a_3 + b_3
\end{pmatrix}$

- Subtraction, multiplication, and division work the same way, element by element.

**Mathematically, vectors are usually written as columns, even though NumPy prints 1D arrays horizontally. Keep the shapes in mind in the operations that follow.**
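
To make the row/column distinction concrete, here is a small sketch of how a 1D array can be turned into an explicit column or row. The names `col` and `row` are illustrative only.

```python
import numpy as np

v = np.array([1, 2, 3])      # 1D array, shape (3,)
col = v.reshape(-1, 1)       # explicit column vector, shape (3, 1)
row = v.reshape(1, -1)       # explicit row vector, shape (1, 3)

print(v.T.shape)             # (3,)  transposing a 1D array changes nothing
print(col.shape, row.shape)  # (3, 1) (1, 3)

inner = row @ col            # (1, 3) @ (3, 1) -> shape (1, 1)
outer = col @ row            # (3, 1) @ (1, 3) -> shape (3, 3)
print(inner.shape, outer.shape)
```

With 1D arrays NumPy does not distinguish rows from columns, which is why the shape only matters once an array is reshaped to 2D.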

In [None]:
# vector addition 
a = np.array([1,2,3])
b = np.array([2,3,4])
print(a)
print(b)
print('-------')
print(a+b)

[1 2 3]
[2 3 4]
-------
[3 5 7]


In [None]:
# vector subtraction 
print('\n',a,'\n',b,'\n----------')
print(a-b)


 [1 2 3] 
 [2 3 4] 
----------
[-1 -1 -1]


In [None]:
# vector multiplication 
# c = (a1*b1, a2*b2, a3*b3)
print('\n',a,'\n',b,'\n----------')
print(a*b)


 [1 2 3] 
 [2 3 4] 
----------
[ 2  6 12]


In [None]:
# vector division 
# c = (a1/b1, a2/b2, a3/b3)
print('\n',a,'\n',b,'\n----------')
print(a/b)


 [1 2 3] 
 [2 3 4] 
----------
[0.5        0.66666667 0.75      ]


$\begin{pmatrix}
a_1 \\
a_2 \\
a_3
\end{pmatrix} .
\begin{pmatrix}
b_1 \\
b_2 \\
b_3
\end{pmatrix} 
= 
(a_1*b_1 + a_2*b_2 + a_3*b_3)$

In [None]:
# vector dot product
# c = (a1*b1 + a2*b2 + a3*b3)
print('\n',a,'\n',b,'\n----------')
print(a.dot(b))


 [1 2 3] 
 [2 3 4] 
----------
20


In [None]:
# vector scalar multiplication 
# c = (a1*s, a2*s, a3*s)
print(a)
print(3)
print('--------')
print(a*3)

[1 2 3]
3
--------
[3 6 9]


---

### **Vector Norms**

- A norm measures the length (magnitude or size) of a vector. 
- The length of a vector is a non-negative number that describes the extent of the vector in space.

#### **Vector $L^1$ Norm** ($L^1(v) = ||v||_1$)

> $ ||v||_1 = |a_1| + |a_2| + |a_3|$

> $ ||A||_1 = \max_{1\leq j \leq n}\left(\sum_{i=1}^n |a_{ij}|\right)$


The L1 norm is often used as a regularization method when fitting ML algorithms, i.e. a way to keep the coefficients of the model small and, in turn, the model less complex.

In [None]:
a = np.arange(9) - 4
b = a.reshape((3,3))
print(a)
print(b)

print(f"L1 of a : {np.linalg.norm(a, 1)}")
print(f"L1 of b : {np.linalg.norm(b, 1)}")

[-4 -3 -2 -1  0  1  2  3  4]
[[-4 -3 -2]
 [-1  0  1]
 [ 2  3  4]]
L1 of a : 20.0
L1 of b : 7.0


For the 2D array `b`, `np.linalg.norm(b, 1)` computes the matrix 1-norm rather than a vector norm: the maximum absolute column sum.   
In this example the absolute column sums are 7, 6, and 7, so the maximum value, 7, is returned. 
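
The column-sum calculation can be checked by hand. This is a minimal sketch; `col_sums` and `manual_l1` are names introduced here.

```python
import numpy as np

b = (np.arange(9) - 4).reshape((3, 3))

# Sum the absolute values down each column, then take the largest sum.
col_sums = np.abs(b).sum(axis=0)
manual_l1 = col_sums.max()

print(col_sums)                 # [7 6 7]
print(manual_l1)                # 7
print(np.linalg.norm(b, 1))     # 7.0
```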

#### **Vector $L^2$ or Euclidean Norm** ($L^2(v) = ||v||_2$)


> $ ||v||_2 = \sqrt{(a_1^2 + a_2^2 + a_3^2)}$

> $||A||_E = \sqrt{\sum_{i=1}^n \sum_{j=1}^n (a_{ij})^2} $

---
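With no `ord` argument, `np.linalg.norm` returns the L2 norm for a vector and the Frobenius norm for a matrix. Both are the square root of the sum of the squared entries, so `a` and `b` give the same value.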


In [None]:
print(a)
print(b)

print(f"L2 of a : {np.linalg.norm(a)}")
print(f"L2 of b : {np.linalg.norm(b)}")

[-4 -3 -2 -1  0  1  2  3  4]
[[-4 -3 -2]
 [-1  0  1]
 [ 2  3  4]]
L2 of a : 7.745966692414834
L2 of b : 7.745966692414834


#### **Vector Max Norm** ($L^\infty(v) = ||v||_\infty$)

> $||v||_\infty = \max(|a_1|, |a_2|, |a_3|)$

> $||A||_\infty = \max_{1\leq i \leq n}\left(\sum_{j=1}^n |a_{ij}|\right) $

---

In [None]:
print(a)
print(b)

print(f"Max norm of a : {np.linalg.norm(a, np.inf)}")
print(f"Max norm of b : {np.linalg.norm(b, np.inf)}")

[-4 -3 -2 -1  0  1  2  3  4]
[[-4 -3 -2]
 [-1  0  1]
 [ 2  3  4]]
Max norm of a : 4.0
Max norm of b : 9.0


For a matrix, the infinity norm is the maximum absolute row sum. Here the absolute row sums are 9, 2, and 9, so 9 is returned. 

> The L1 norm is the sum of the absolute values of the vector.  

> The L2 norm is the square root of the sum of the squared vector values. 

> The max norm is the largest absolute value in the vector. 
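
As a cross-check, the three vector norms can be computed directly from their definitions and compared with `np.linalg.norm`. This is a sketch using the same vector as above; the names `l1`, `l2`, and `linf` are introduced here.

```python
import numpy as np

a = np.arange(9) - 4                 # [-4 -3 -2 -1  0  1  2  3  4]

l1 = np.abs(a).sum()                 # sum of absolute values
l2 = np.sqrt((a ** 2).sum())         # square root of the sum of squares
linf = np.abs(a).max()               # largest absolute value

print(l1, np.linalg.norm(a, 1))
print(l2, np.linalg.norm(a))
print(linf, np.linalg.norm(a, np.inf))
```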

### **Matrix Arithmetic**

**Matrix** - two-dimensional array of scalars with one or more columns and one or more rows.  

In [None]:
# create matrix 
A = np.array([[1,2,3],[4,5,6]])
print(A)

[[1 2 3]
 [4 5 6]]


$
\begin{pmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2} \\
a_{3,1} & a_{3,2}
\end{pmatrix} +
\begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
b_{3,1} & b_{3,2}
\end{pmatrix} =
\begin{pmatrix}
a_{1,1}+b_{1,1} & a_{1,2}+b_{1,2} \\
a_{2,1}+b_{2,1} & a_{2,2}+b_{2,2} \\
a_{3,1}+b_{3,1} & a_{3,2}+b_{3,2}
\end{pmatrix}
$

In [None]:
A = np.array(([1,2,3],[4,5,6]))
B = np.array(([1,2,3],[4,5,6]))
print(A)
print(B)
print(A+B)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 2  4  6]
 [ 8 10 12]]


In [None]:
# subtraction
C = np.array([[3,2,1],[3,2,1]])
print(A,'\n')
print(C, '\n')
print(A-C)

[[1 2 3]
 [4 5 6]] 

[[3 2 1]
 [3 2 1]] 

[[-2  0  2]
 [ 1  3  5]]


**Matrix Multiplication (Hadamard Product)**  
Two matrices of the same size can be multiplied element by element; this is called element-wise matrix multiplication or the Hadamard product.  
**It is not the operation usually meant by "matrix multiplication".** 

In [None]:
print(A, '\n')
print(C, '\n')
print(A*C)

[[1 2 3]
 [4 5 6]] 

[[3 2 1]
 [3 2 1]] 

[[ 3  4  3]
 [12 10  6]]


In [None]:
# Matrix division 
print(A, '\n')
print(C, '\n')
print(A/C)

[[1 2 3]
 [4 5 6]] 

[[3 2 1]
 [3 2 1]] 

[[0.33333333 1.         3.        ]
 [1.33333333 2.5        6.        ]]


**Matrix-Matrix Multiplication**, also called the matrix dot product.

**Rule for matrix multiplication**  
- the number of columns (n) in the first matrix (A) must equal the number of rows (n) in the second matrix (B). 

> **C(m,k) = A(m,n).B(n,k)**

$
\begin{pmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2} \\
a_{3,1} & a_{3,2}
\end{pmatrix} .
\begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix} =
\begin{pmatrix}
a_{1,1}*b_{1,1} + a_{1,2}*b_{2,1} & a_{1,1}*b_{1,2} + a_{1,2}*b_{2,2} \\
a_{2,1}*b_{1,1} + a_{2,2}*b_{2,1} & a_{2,1}*b_{1,2} + a_{2,2}*b_{2,2}  \\
a_{3,1}*b_{1,1} + a_{3,2}*b_{2,1} & a_{3,1}*b_{1,2} + a_{3,2}*b_{2,2}
\end{pmatrix}
$


In [None]:
# matrix dot product
A = np.array([[1,2],[3,4],[5,6]])
B = np.array([[1,2],[3,4]])
print(A, '\n')
print(B, '\n')
print(A.dot(B))

[[1 2]
 [3 4]
 [5 6]] 

[[1 2]
 [3 4]] 

[[ 7 10]
 [15 22]
 [23 34]]


In [None]:
A @ B

array([[ 7, 10],
       [15, 22],
       [23, 34]])
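
The shape rule C(m,k) = A(m,n).B(n,k) can be illustrated with arrays of ones. This is a sketch with arbitrary sizes; the flag `mismatch_raised` is introduced here only to record what happened.

```python
import numpy as np

A = np.ones((3, 2))   # m = 3, n = 2
B = np.ones((2, 4))   # n = 2, k = 4

C = A @ B
print(C.shape)        # (3, 4)

# Reversing the order breaks the rule: B has 4 columns but A has 3 rows.
try:
    B @ A
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Shapes do not line up:", err)
```

The example also shows that matrix multiplication is not commutative: `A @ B` is defined here while `B @ A` is not.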

**Matrix-Vector Multiplication**

c = A.v or Av


$
\begin{pmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2} \\
a_{3,1} & a_{3,2} 
\end{pmatrix} .
\begin{pmatrix}
v_1 \\ v_2
\end{pmatrix} = 
\begin{pmatrix}
a_{1,1}*v_1 + a_{1,2}*v_2 \\
a_{2,1}*v_1 + a_{2,2}*v_2 \\
a_{3,1}*v_1 + a_{3,2}*v_2 
\end{pmatrix}
$

In [None]:
B = np.array([0.5,0.5])
print(A, '\n')
print(B, '\n')
print(A.dot(B))

[[1 2]
 [3 4]
 [5 6]] 

[0.5 0.5] 

[1.5 3.5 5.5]


In [None]:
# matrix-scalar multiplication
b = 0.5
print(A, '\n')
print(b, '\n')
print(A*b)

[[1 2]
 [3 4]
 [5 6]] 

0.5 

[[0.5 1. ]
 [1.5 2. ]
 [2.5 3. ]]


### **Types of Matrices**

1. **Square matrix**
> a matrix whose number of rows (n) equals its number of columns (m)   
$n = m$  

2. **Symmetric matrix**
> a square matrix whose upper-right triangle mirrors its lower-left triangle across the main diagonal.   
$M = M^T$

3. **Triangular matrix**
> a square matrix whose non-zero values all lie in the upper-right triangle (upper triangular) or the lower-left triangle (lower triangular), with the remaining elements equal to zero.  


In [None]:
M = np.array([[1,2,3],[1,2,3],[1,2,3]])
print(f"Square matrix: \n {M} \n") 

lower = np.tril(M)
print(f"Lower Triangular matrix: \n {lower} \n")

upper = np.triu(M)
print(f"Upper Triangular matrix: \n {upper} ")

Square matrix: 
 [[1 2 3]
 [1 2 3]
 [1 2 3]] 

Lower Triangular matrix: 
 [[1 0 0]
 [1 2 0]
 [1 2 3]] 

Upper Triangular matrix: 
 [[1 2 3]
 [0 2 3]
 [0 0 3]] 
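
The symmetric condition $M = M^T$ from item 2 can be tested directly. This is a sketch; the matrix `S` is an example chosen here, not taken from the book.

```python
import numpy as np

S = np.array([[1, 2, 3],
              [2, 4, 5],
              [3, 5, 6]])
M = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])

is_sym_S = np.array_equal(S, S.T)   # S equals its transpose
is_sym_M = np.array_equal(M, M.T)   # M does not

# Adding a square matrix to its own transpose always gives a symmetric matrix.
sym_from_M = M + M.T

print(is_sym_S, is_sym_M)
print(sym_from_M)
```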


4. **Diagonal matrix**
> a matrix in which all values outside the main diagonal are zero, where the main diagonal runs from the top left of the matrix to the bottom right.  

$D = \begin{pmatrix}
1&0&0 \\ 0&2&0 \\ 0&0&3
\end{pmatrix}$

$d = \begin{pmatrix}d_{1,1}\\d_{2,2}\\d_{3,3}\end{pmatrix} = \begin{pmatrix}1\\2\\3\end{pmatrix}$  

In [None]:
# np.diag()
d = np.diag(M)
D = np.diag(d)

print(f"from square matrix : \n{M}")
print(f"extract diagonal vector : {d}")
print(f"create diagonal matrix from vector : \n{D}")

from square matrix : 
[[1 2 3]
 [1 2 3]
 [1 2 3]]
extract diagonal vector : [1 2 3]
create diagonal matrix from vector : 
[[1 0 0]
 [0 2 0]
 [0 0 3]]


5. **Identity Matrix**

> a square matrix that does not change a vector when multiplied with it; it has ones on the main diagonal and zeros elsewhere

$I = 
\begin{pmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{pmatrix}$


In [None]:
# np.identity()
print(np.identity(3))

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
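
The defining property of the identity matrix can be verified with a quick sketch. The vector `v` and matrix `A3` are arbitrary examples introduced here.

```python
import numpy as np

I = np.identity(3)
v = np.array([2.0, -1.0, 5.0])
A3 = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0],
               [7.0, 8.0, 9.0]])

Iv = I @ v       # multiplying a vector by I returns the same vector
IA = I @ A3      # the same holds for a matrix, from either side
AI = A3 @ I

print(Iv)
print(np.array_equal(IA, A3), np.array_equal(AI, A3))
```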


6. **Orthogonal matrix**  
> a square matrix whose columns and rows are orthonormal vectors, i.e. mutually perpendicular and each with a length or magnitude of 1   

 > Two vectors are orthogonal when their dot product equals zero; they are orthonormal when, in addition, the length of each vector is 1  

$ \upsilon \cdot \omega = 0 $ 

$\upsilon^T \omega = 0$  

$ Q^T \cdot Q = Q \cdot Q^T = I $  

$ Q^T = Q^{-1} $ 

e.g. 
$\begin{pmatrix}
1&0 \\ 0&-1 \end{pmatrix}$ 

In [None]:
# orthogonal matrix 
Q = np.array([[1,0],[0,-1]])
print(f"eg - \nOrthogonal matrix : \n {Q}")
print(f"Inverse of matrix : \n {np.linalg.inv(Q)}")
print(f"Transpose matrix : \n {Q.T}")
print(f"Dot product of inverse and transpose matrices:\n{np.linalg.inv(Q) @ Q.T}")

eg - 
Orthogonal matrix : 
 [[ 1  0]
 [ 0 -1]]
Inverse of matrix : 
 [[ 1.  0.]
 [-0. -1.]]
Transpose matrix : 
 [[ 1  0]
 [ 0 -1]]
Dot product of inverse and transpose matrices:
[[1. 0.]
 [0. 1.]]
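
The conditions $Q^T Q = Q Q^T = I$ and $Q^T = Q^{-1}$ can also be checked on a less trivial example. The sketch below assumes a 2D rotation matrix, which is a standard orthogonal matrix but is not part of the original notes; `np.allclose` is used because of floating-point rounding.

```python
import numpy as np

theta = np.pi / 4   # rotation by 45 degrees (illustrative choice)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

is_orthogonal = (np.allclose(Q.T @ Q, np.eye(2))
                 and np.allclose(Q @ Q.T, np.eye(2)))
inverse_is_transpose = np.allclose(np.linalg.inv(Q), Q.T)
column_lengths = np.linalg.norm(Q, axis=0)   # length of each column

print(is_orthogonal, inverse_is_transpose)
print(column_lengths)
```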


### **Matrix Operations** 
- Transpose
- Inverse 
- Trace 
- Determinant
- Rank

**Transpose** - the matrix obtained by swapping the rows and columns of the original matrix.  

$ \text{Transpose of }
\begin{pmatrix}
1&2 \\ 3&4 \\ 5&6
\end{pmatrix} =
\begin{pmatrix}
1&3&5 \\ 2&4&6
\end{pmatrix}
$

In [None]:
# matrix.T
A = np.array([[1,2],[3,4],[5,6]])
print(f"Matrix              : \n{A}")
print(f"Transpose of matrix : \n{A.T}")

Matrix              : 
[[1 2]
 [3 4]
 [5 6]]
Transpose of matrix : 
[[1 3 5]
 [2 4 6]]


**Inverse** - matrix inversion finds another matrix that, when multiplied with the original matrix, gives the identity matrix. Only square matrices with a non-zero determinant have an inverse.  

> $A \cdot A^{-1} = A^{-1} \cdot A = I$  

In [None]:
# np.linalg.inv()
A = np.array([[1,2],[3,4]])
print(f"Matrix A \n {A}")
print(f"Inverse of A \n {np.linalg.inv(A)}")
print(f"Product of A and its inverse \n {A @ np.linalg.inv(A)}")

Matrix A 
 [[1 2]
 [3 4]]
Inverse of A 
 [[-2.   1. ]
 [ 1.5 -0.5]]
Product of A and its inverse 
 [[1.0000000e+00 0.0000000e+00]
 [8.8817842e-16 1.0000000e+00]]


**Trace** - the trace of a square matrix is the sum of the values on its main diagonal (top-left to bottom-right).  

> $ tr(A) = a_{1,1} + a_{2,2} + a_{3,3} $

In [None]:
# np.trace()
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(f"Matrix A \n {A}")
print(f"Trace of matrix A = {np.trace(A)}")

Matrix A 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Trace of matrix A = 15


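**Determinant** - the determinant of a square matrix is a scalar that describes how the matrix scales volume when it is applied as a linear transformation. A determinant of zero means the matrix is singular (not invertible).  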

In [None]:
# np.linalg.det()
print(f"Matrix A \n {A}")
print(f"Determinant of A = {np.linalg.det(A)}")

Matrix A 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Determinant of A = 0.0
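
For a 2 x 2 matrix the determinant can be worked out by hand as $a_{1,1} a_{2,2} - a_{1,2} a_{2,1}$. The following is a minimal sketch comparing that formula with `np.linalg.det`; the matrices `A2` and `S2` are examples chosen here.

```python
import numpy as np

A2 = np.array([[1, 2],
               [3, 4]])
manual_det = A2[0, 0] * A2[1, 1] - A2[0, 1] * A2[1, 0]
numpy_det = np.linalg.det(A2)       # floating-point result, close to -2

# The second row is twice the first, so the rows are linearly dependent
# and the determinant is exactly zero.
S2 = np.array([[1, 2],
               [2, 4]])
singular_det = S2[0, 0] * S2[1, 1] - S2[0, 1] * S2[1, 0]

print(manual_det, numpy_det)
print(singular_det)
```

Because `np.linalg.det` works in floating point, comparing against an expected value is safer with `np.isclose` than with `==`.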


**Rank** - the rank of a matrix is the number of linearly independent rows or columns, i.e. the number of dimensions spanned by all of the vectors within the matrix; NumPy estimates it numerically. E.g. rank 0 = all vectors span a point, rank 1 = all vectors span a line, rank 2 = all vectors span a 2D plane, etc. 

In [None]:
# np.linalg.matrix_rank()
v1 = np.array([1,2,3])
v2 = np.array([0,0,0,0,1])

print(f"Vector v1 = {v1}")
print(f"Rank of v1 = {np.linalg.matrix_rank(v1)}")
print(f"Vector v2 = {v2}")
print(f"Rank of v2 = {np.linalg.matrix_rank(v2)}")

B = np.array([[1,2],[1,2]])
print(f"Matrix A \n {A}")
print(f"Rank of matrix A = {np.linalg.matrix_rank(A)}")
print(f"Matrix B \n {B}")
print(f"Rank of matrix B = {np.linalg.matrix_rank(B)}")


Vector v1 = [1 2 3]
Rank of v1 = 1
Vector v2 = [0 0 0 0 1]
Rank of v2 = 1
Matrix A 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Rank of matrix A = 2
Matrix B 
 [[1 2]
 [1 2]]
Rank of matrix B = 1


### **Sparse Matrices**

> matrices that contain mostly zero values;
matrices with mostly non-zero values are called dense  

$sparsity = \frac{\text{count of zero elements}}{\text{total elements}}$  

**Sparse matrices in Machine Learning**  
eg - 
**Data**  
- Whether or not a user has watched a movie in a movie catalog
- Count of number of listens of a song in a song catalog  
 
**Data Preparation**
- One hot encoding - used to represent categorical data as sparse binary vectors  
- Count encoding - used to represent the frequency of words in a vocabulary for a document  
- TF-IDF encoding - used to represent normalized word frequency scores in a vocabulary 

**Area of Study**
- NLP (natural language processing)
- Recommender systems
- Computer vision  


**Sparse matrices in Python**

In [None]:
from scipy.sparse import csr_matrix 
A = np.array([
[1,0,0,1,0,0],
[0,0,2,0,0,1],
[0,0,0,0,2,0]
])
S = csr_matrix(A)
print(f"Dense array A \n {A}\n")
print(f"A stored in CSR sparse format \n {S}\n")
print(f"CSR matrix converted back to a dense matrix \n {S.todense()}")

Dense array A 
 [[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 0 2 0]]

A stored in CSR sparse format 
   (0, 0)	1
  (0, 3)	1
  (1, 2)	2
  (1, 5)	1
  (2, 4)	2

CSR matrix converted back to a dense matrix 
 [[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 0 2 0]]
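
To show what the CSR (compressed sparse row) layout stores, here is a hand-written sketch that builds the three CSR arrays from a dense array using plain Python loops. The helper `dense_to_csr` is hypothetical and written only for illustration; it is not how SciPy implements `csr_matrix`.

```python
import numpy as np

def dense_to_csr(dense):
    """Return (data, indices, indptr) for a 2D array, CSR style."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(int(value))   # the non-zero value itself
                indices.append(col)       # the column it sits in
        indptr.append(len(data))          # where the next row starts in data
    return data, indices, indptr

A = np.array([
    [1, 0, 0, 1, 0, 0],
    [0, 0, 2, 0, 0, 1],
    [0, 0, 0, 0, 2, 0],
])
data, indices, indptr = dense_to_csr(A)

print(data)      # [1, 1, 2, 1, 2]
print(indices)   # [0, 3, 2, 5, 4]
print(indptr)    # [0, 2, 4, 5]
```

Row `i` owns the entries `data[indptr[i]:indptr[i+1]]`, so only the five non-zero values and their positions are stored instead of all eighteen elements.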


**Calculate sparsity**  
> sparsity  = 1.0 - count_nonzero(A) / A.size

In [None]:
sparsity = 1.0 - np.count_nonzero(A) / A.size
print(f"Sparse matrix A \n {A}")
print(f"Sparsity of A = {sparsity}")

Sparse matrix A 
 [[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 0 2 0]]
Sparsity of A = 0.7222222222222222


## **Tensors and Tensor Arithmetic**



**Tensors** are a type of data structure used in linear algebra; as with vectors and matrices, arithmetic operations can be performed on them. 

A **tensor** is a generalization of vectors and matrices and is most easily understood as a multidimensional array: an array of numbers arranged on a regular grid with a variable number of axes.  

A vector is a one-dimensional (first-order) tensor and a matrix is a two-dimensional (second-order) tensor. 

e.g. a 3 x 3 x 3 three-dimensional tensor T with elements 
$t_{i,j,k}$

$T = 
\begin{pmatrix}
t_{1,1,1} & t_{1,2,1} & t_{1,3,1} \\
t_{2,1,1} & t_{2,2,1} & t_{2,3,1} \\
t_{3,1,1} & t_{3,2,1} & t_{3,3,1}
\end{pmatrix},
\begin{pmatrix}
t_{1,1,2} & t_{1,2,2} & t_{1,3,2} \\
t_{2,1,2} & t_{2,2,2} & t_{2,3,2} \\
t_{3,1,2} & t_{3,2,2} & t_{3,3,2}
\end{pmatrix},
\begin{pmatrix}
t_{1,1,3} & t_{1,2,3} & t_{1,3,3} \\
t_{2,1,3} & t_{2,2,3} & t_{2,3,3} \\
t_{3,1,3} & t_{3,2,3} & t_{3,3,3}
\end{pmatrix}
$ 

### **Tensors in Python**

In [None]:
# create tensor 
T = np.array([
              [[1,2,3],[4,5,6],[7,8,9]],
              [[11,12,13],[14,15,16],[17,18,19]],
              [[21,22,23],[24,25,26],[27,28,29]]
])
print(f"Tensor T : \n{T}")
print(f"Shape of T = {T.shape}")

Tensor T : 
[[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[11 12 13]
  [14 15 16]
  [17 18 19]]

 [[21 22 23]
  [24 25 26]
  [27 28 29]]]
Shape of T = (3, 3, 3)


### **Tensor Arithmetic**
Element-wise tensor arithmetic works the same way as for vectors and matrices. 

In [None]:
# tensor addition
print(f"Tensor T: \n{T}\n")
print(f"Tensor T + T : \n {T+T}")

Tensor T: 
[[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[11 12 13]
  [14 15 16]
  [17 18 19]]

 [[21 22 23]
  [24 25 26]
  [27 28 29]]]

Tensor T + T : 
 [[[ 2  4  6]
  [ 8 10 12]
  [14 16 18]]

 [[22 24 26]
  [28 30 32]
  [34 36 38]]

 [[42 44 46]
  [48 50 52]
  [54 56 58]]]


In [None]:
# tensor subtraction 
print(f"Tensor T - T : \n {T-T}")

Tensor T - T : 
 [[[0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]]]


**Tensor Hadamard Product**  

$C = A \circ B$

In [None]:
# tensor Hadamard product 
print(f"Tensor Hadamard Product of T*T :\n {T*T}")

Tensor Hadamard Product of T*T :
 [[[  1   4   9]
  [ 16  25  36]
  [ 49  64  81]]

 [[121 144 169]
  [196 225 256]
  [289 324 361]]

 [[441 484 529]
  [576 625 676]
  [729 784 841]]]


In [None]:
# tensor division 
print(f"Tensor division T/T : \n {T/T}")

Tensor division T/T : 
 [[[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]]


**Tensor Product**  
Tensor product operator is often denoted as 

$C = A \otimes B$

$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \otimes
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} =
\begin{pmatrix}
a_1*\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \\
a_2*\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
\end{pmatrix}$

$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \otimes
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} =
\begin{pmatrix}
a_1*b_1 & a_1*b_2 \\
a_2*b_1 & a_2*b_2
\end{pmatrix}$ 

$\begin{pmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2}
\end{pmatrix} \otimes
\begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix} = 
\begin{pmatrix}
a_{1,1}*\begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix} 
& 
a_{1,2}* \begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix}\\
a_{2,1}* \begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix}
& 
a_{2,2}*\begin{pmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2}
\end{pmatrix}
\end{pmatrix}$

In [None]:
# np.tensordot order 1 tensors (vectors)
a = np.array([1,2])
b = np.array([3,4])
C = np.tensordot(a,b,axes=0) 
print(f"Tensor product of \n{a} and {b} is \n{C}")
print(f"dimension of a, b = {a.shape}, {b.shape}")
print(f"dimension of Tensor = {C.shape}")

Tensor product of 
[1 2] and [3 4] is 
[[3 4]
 [6 8]]
dimension of a, b = (2,), (2,)
dimension of Tensor = (2, 2)


In [None]:
m1 = np.array([[2,2],[2,2]])
m2 = np.array([[1,2],[3,4]])
T = np.tensordot(m1, m2, axes=0)
print(f"Tensor product of \n{m1} and \n{m2} \n{T}")
print(f"dimension of m1, m2 = {m1.shape}, {m2.shape}")
print(f"dimension of Tensor = {T.shape}")

Tensor product of 
[[2 2]
 [2 2]] and 
[[1 2]
 [3 4]] 
[[[[2 4]
   [6 8]]

  [[2 4]
   [6 8]]]


 [[[2 4]
   [6 8]]

  [[2 4]
   [6 8]]]]
dimension of m1, m2 = (2, 2), (2, 2)
dimension of Tensor = (2, 2, 2, 2)
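
The results above can be related to two other NumPy functions. This is a sketch rather than anything from the book: for vectors, `np.tensordot(..., axes=0)` matches `np.outer`, and for matrices the block layout drawn in the formula corresponds to `np.kron`, which holds the same numbers as the (2, 2, 2, 2) tensor arranged as a 4 x 4 matrix.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4])
outer = np.outer(a, b)
same_as_tensordot = np.array_equal(outer, np.tensordot(a, b, axes=0))

m1 = np.array([[2, 2], [2, 2]])
m2 = np.array([[1, 2], [3, 4]])
T4 = np.tensordot(m1, m2, axes=0)          # shape (2, 2, 2, 2)
K = np.kron(m1, m2)                        # shape (4, 4), block layout

# Reordering the axes of T4 and flattening reproduces the Kronecker product.
K_from_T4 = T4.transpose(0, 2, 1, 3).reshape(4, 4)

print(same_as_tensordot)
print(K)
print(np.array_equal(K, K_from_T4))
```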


There are other types of tensor multiplications such as tensor dot product and tensor contraction.
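
As a brief sketch of contraction, the `axes` argument of `np.tensordot` controls how many axes are summed over. With `axes=1` on two matrices the result matches ordinary matrix multiplication, and with `axes=2` both axes are summed, giving a single number. The variable names are introduced here for illustration.

```python
import numpy as np

m1 = np.array([[2, 2], [2, 2]])
m2 = np.array([[1, 2], [3, 4]])

contracted = np.tensordot(m1, m2, axes=1)   # sum over one shared axis
full = np.tensordot(m1, m2, axes=2)         # sum over both axes

print(contracted)                                # same as m1 @ m2
print(np.array_equal(contracted, m1 @ m2))
print(full)                                      # same as (m1 * m2).sum()
```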