# Ch. 2 Linear Algebra

## 2.1 Scalars, Vectors, Matrices and Tensors  
>* **Scalar**: Single number  
>* **Vector**: 1-D array of numbers
>* **Matrix**: 2-D array of numbers  
>* **Tensor**: Array of numbers on a regular grid with a variable number of axes
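
The four definitions above differ only in the number of axes. A minimal sketch, assuming PyTorch's `ndim` attribute as the axis count; the names `s`, `v`, `M`, and `T` are illustrative and not part of the notebook:

```python
import torch

s = torch.tensor(3.5)               # scalar: 0 axes
v = torch.tensor([1.0, 2.0, 3.0])   # vector: 1 axis
M = torch.zeros(2, 3)               # matrix: 2 axes
T = torch.zeros(2, 3, 4)            # tensor with 3 axes

# Print the number of axes and the shape of each object.
for name, t in [("scalar", s), ("vector", v), ("matrix", M), ("tensor", T)]:
    print(name, t.ndim, tuple(t.shape))
```

In PyTorch all four are instances of `torch.Tensor`; only the number of axes changes.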

### Transpose
> ${\displaystyle \left( A ^{\operatorname {T}}\right)_{ij}=\left(A \right)_{ji}.}$

In [2]:
import torch
A = torch.randn(2, 3)
A

tensor([[ 1.3554, -0.4660, -0.4209],
        [-0.9665, -0.0524, -0.0628]])

In [3]:
A.shape

torch.Size([2, 3])

In [4]:
torch.transpose(A, 0, 1)

tensor([[ 1.3554, -0.9665],
        [-0.4660, -0.0524],
        [-0.4209, -0.0628]])

In [5]:
torch.transpose(A, 0, 1).shape

torch.Size([3, 2])

## 2.2 Multiplying Matrices and Vectors
> ${\displaystyle C = A B}.$  
> ${\displaystyle C_{i,j} = \sum _{k}A_{i,k}B_{k,j}.}$  
>* Not to be confused with the element-wise product, also called the Hadamard product (${\displaystyle A \odot B}$). 
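
To make the index formula concrete, here is a sketch that spells out the sum over $k$ with plain Python lists and contrasts it with the element-wise product. The helper names `matmul` and `hadamard` and the matrices `P` and `Q` are hypothetical, chosen only for illustration:

```python
def matmul(P, Q):
    """Matrix product of nested lists: C[i][j] = sum_k P[i][k] * Q[k][j]."""
    m, n, p = len(P), len(Q), len(Q[0])
    # The number of columns of P must equal the number of rows of Q.
    assert all(len(row) == n for row in P), "inner dimensions must match"
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


def hadamard(P, Q):
    """Element-wise product: C[i][j] = P[i][j] * Q[i][j] (same shapes)."""
    return [[a * b for a, b in zip(row_p, row_q)] for row_p, row_q in zip(P, Q)]


P = [[1, 2], [3, 4]]
Q = [[5, 6], [7, 8]]

print(matmul(P, Q))    # [[19, 22], [43, 50]]
print(hadamard(P, Q))  # [[5, 12], [21, 32]]
```

With PyTorch tensors, the same two operations are written `P @ Q` and `P * Q`.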

In [6]:
B = torch.randn(3, 2)
B

tensor([[-0.2425, -1.0879],
        [-0.3214, -0.2495],
        [ 1.4926, -0.5809]])

In [7]:
B.shape

torch.Size([3, 2])

In [8]:
A @ B

tensor([[-0.8072, -1.1137],
        [ 0.1575,  1.1010]])

In [9]:
(A @ B).shape

torch.Size([2, 2])

@ **is** distributive  
> ${\displaystyle A \left( B + C \right) = AB + AC}.$  

In [10]:
C = torch.randn_like(B)
C

tensor([[-1.8356, -0.8395],
        [ 0.6133,  2.1329],
        [-0.6385, -0.0362]])

In [11]:
C.shape

torch.Size([3, 2])

In [12]:
torch.all(torch.isclose(A @ (B + C), A @ B + A @ C))

tensor(True)

@ **is** associative  
> ${\displaystyle A \left( BC \right) = \left( AB \right) C}.$  

In [13]:
D = torch.randn(2, 6)
D

tensor([[-1.0897, -0.1621,  0.3102,  0.0769,  0.2311,  0.5780],
        [-1.1056, -0.0759,  0.5800,  0.5874,  1.8774,  0.6731]])

In [14]:
D.shape

torch.Size([2, 6])

In [15]:
torch.all(torch.isclose(A @ (B @ D), (A @ B) @ D))

tensor(True)

@ is **not** commutative in general  
> ${\displaystyle AB \neq BA \: \text{in general}.}$ 

In [16]:
(A @ B).shape

torch.Size([2, 2])

In [17]:
(B @ A).shape

torch.Size([3, 3])
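
The shape mismatch above is not the only obstacle: two square matrices of the same size generally do not commute either. A small sketch with illustrative matrices `P` and `Q` (not from the notebook), plus the dot product of two vectors, which does commute:

```python
import torch

P = torch.tensor([[1., 2.],
                  [3., 4.]])
Q = torch.tensor([[0., 1.],
                  [1., 0.]])   # permutation matrix

PQ = P @ Q   # swaps the columns of P -> [[2, 1], [4, 3]]
QP = Q @ P   # swaps the rows of P    -> [[3, 4], [1, 2]]
print(torch.equal(PQ, QP))  # False

# The dot product of two vectors is commutative.
u = torch.tensor([1., 2., 3.])
w = torch.tensor([4., 5., 6.])
print(torch.dot(u, w) == torch.dot(w, u))  # tensor(True)
```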

Transpose of matrix product  
> ${\displaystyle \left( AB \right) ^{\operatorname {T}} = B ^{\operatorname {T}} A ^{\operatorname {T}}}.$

In [18]:
torch.all(torch.isclose((A @ B).transpose(0, 1), B.transpose(0, 1) @ A.transpose(0, 1)))

tensor(True)

System of linear equations  
> ${\displaystyle A x = b}$  
> ${\displaystyle \\ \text {where} \: A \in \mathbb {R} ^ {m \times n}, \: b \in \mathbb {R} ^ {m}, \:  \text{and} \: x \in \mathbb{R} ^ {n}.}$

In [19]:
m, n = 4, 5
A = torch.randn(m, n)
x = torch.randn(n)
b = torch.randn(m)

In [20]:
(A @ x).shape

torch.Size([4])

In [21]:
b.shape

torch.Size([4])

## 2.3 Identity and Inverse Matrices  
>* **Identity Matrix**: a matrix that does not change any vector when we multiply that vector by that matrix.  
${\displaystyle I_{n} \in \mathbb {R} ^{n \times n}}$, and  
${\displaystyle \forall x \in \mathbb {R} ^{n}, \: I_{n}x = x.}$  
${\displaystyle \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}}$  
>* The **matrix inverse** of **A** is denoted as ${\displaystyle A ^{-1}}$, and is defined as the matrix such that  
${\displaystyle A ^{-1} A = I_{n}.}$  
>* The system of linear equations above can now be solved using the following steps:  
${\displaystyle A x = b}$  
${\displaystyle A ^{-1} A x = A ^{-1} b}$  
${\displaystyle I_{n} x = A^{-1} b.}$
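>* ${\displaystyle A^{-1}}$ does not exist for every matrix, and it is primarily useful as a theoretical tool: on a digital computer the inverse can be represented only with limited precision, so methods that work directly with ${\displaystyle b}$ usually give more accurate estimates of ${\displaystyle x}$.  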

In [22]:
I = torch.eye(3)
I

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
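
A quick check of the defining property $I_{n} x = x$, using an illustrative vector `v` that is not part of the notebook:

```python
import torch

v = torch.tensor([2.0, -1.0, 0.5])
I3 = torch.eye(3)

# Multiplying by the identity matrix leaves the vector unchanged.
print(I3 @ v)
print(torch.equal(I3 @ v, v))  # True
```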

In [25]:
A = torch.randn(3, 3)
A

tensor([[ 1.3471, -0.6786, -0.6896],
        [ 0.2669,  0.8253, -1.1286],
        [ 0.0098,  1.3549, -0.6677]])

In [26]:
A_inv = torch.linalg.inv(A)
A_inv

tensor([[ 1.0184, -1.4445,  1.3899],
        [ 0.1739, -0.9293,  1.3912],
        [ 0.3680, -1.9072,  1.3460]])

In [30]:
A @ A_inv

tensor([[ 1.0000e+00,  1.7881e-07, -1.1921e-07],
        [-4.4703e-08,  1.0000e+00, -1.1921e-07],
        [ 0.0000e+00,  0.0000e+00,  1.0000e+00]])

In [34]:
torch.abs(A @ A_inv - I)

tensor([[5.9605e-08, 1.7881e-07, 1.1921e-07],
        [4.4703e-08, 1.1921e-07, 1.1921e-07],
        [0.0000e+00, 0.0000e+00, 0.0000e+00]])
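
The solution steps listed in this section, $x = A^{-1} b$, can be sketched on a small system. The matrix `M` and right-hand side `rhs` are illustrative values, not from the notebook. As a point of comparison, the sketch also calls `torch.linalg.solve`, which solves the system without forming the inverse explicitly; using it here is an assumption about what one might do in practice, not something the notebook itself does:

```python
import torch

# Illustrative system: 2*x0 + x1 = 3 and x0 + 3*x1 = 5.
M = torch.tensor([[2., 1.],
                  [1., 3.]])
rhs = torch.tensor([3., 5.])

# Route 1: follow the derivation, x = M^{-1} rhs.
x_inv = torch.linalg.inv(M) @ rhs

# Route 2: solve M x = rhs directly, without an explicit inverse.
x_solve = torch.linalg.solve(M, rhs)

print(x_inv)
print(x_solve)
print(torch.allclose(M @ x_solve, rhs))  # residual check
```

Both routes should agree up to floating-point error; the exact solution of this system is $x = (0.8, 1.4)$.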

## 2.4 Linear Dependence and Span  
>* For ${\displaystyle A ^{-1}}$ to exist, ${\displaystyle A x = b}$ must have exactly one solution for every value of  ${\displaystyle b}$.  
>* The **span** of a set of vectors is the set of all points obtainable by linear combination of the original vectors.  The span of the columns of a matrix is known as the column space, or range.  
>* **Linear independence** is the condition that no vector in a set is a linear combination of the other vectors.  Conversely, a set is **linearly dependent** when some vector is a linear combination of the others; such a vector is redundant and adds no points to the span of the set.
>* If a matrix ${\displaystyle A}$ (1) is square and (2) has linearly independent columns, then ${\displaystyle A x = b}$ has exactly one solution for each value of ${\displaystyle b}$.  Such a matrix has an inverse.
>* A square matrix with linearly *dependent* columns is called **singular**.  
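
The points above can be checked numerically. The sketch below first shows that a matrix-vector product is a linear combination of the matrix's columns, so the reachable outputs are exactly the column space. It then uses `torch.linalg.matrix_rank` to tell a singular matrix from an invertible one. Using the rank as the test is a choice made for this illustration, and the matrices `S` and `R` and the vector `coeffs` are hypothetical:

```python
import torch

S = torch.tensor([[1., 2.],
                  [2., 4.]])   # second column = 2 * first column (dependent)
R = torch.tensor([[1., 2.],
                  [3., 4.]])   # columns are linearly independent

coeffs = torch.tensor([3., -1.])

# S @ coeffs is the linear combination of S's columns weighted by coeffs.
combo = coeffs[0] * S[:, 0] + coeffs[1] * S[:, 1]
print(torch.allclose(S @ coeffs, combo))  # True

# Rank = number of linearly independent columns.
rank_S = int(torch.linalg.matrix_rank(S))
rank_R = int(torch.linalg.matrix_rank(R))
print(rank_S, rank_R)  # 1 2
```

A square $n \times n$ matrix is singular when its rank is below $n$, as with `S` here.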

## 2.5 Norms   