## Deep Learning Notations
by [Elvis Saravia](http://elvissaravia.com/)

---
**Aim:** This notebook collects notations widely used in deep learning papers and online educational materials. I follow the notations used in the Deep Learning book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. I also provide sample code using PyTorch to show the kinds of data structures and concepts these notations may represent.

**Uses:** You can use the notations in this notebook as a cheatsheet when writing research papers, presentations, and blog posts. It's also a good resource for reviewing important mathematical notations used widely in deep learning research and related fields. I provide example code in PyTorch, but as an exercise you can try writing similar code using NumPy or TensorFlow. (The code shouldn't be too different.)

**Requirements:** [PyTorch](http://pytorch.org/)

---

In [2]:
import torch

## Numbers and Arrays

### A scalar

$a$ - a scalar (integer or real)   
Latex: `$a$`

In [13]:
a = 2
print(a)

2


### A vector

$\boldsymbol a$ - a vector  
Latex: `$\boldsymbol a$`

In [12]:
### 1D vector (column vector)
a = torch.Tensor([1,2]) 
print(a)


 1
 2
[torch.FloatTensor of size 2]



In [11]:
### Row vector (stored as a 2-D tensor of size 1x2)
a = torch.Tensor([[1,2]])
print(a)


 1  2
[torch.FloatTensor of size 1x2]



### A matrix

$\boldsymbol A$ - a matrix  
Latex: `$\boldsymbol A$`

In [10]:
A = torch.Tensor([[1,2,4],[4,5,6]])
print(A)


 1  2  4
 4  5  6
[torch.FloatTensor of size 2x3]



### A tensor

$\mathsf A$ - a tensor  
Latex: `$\mathsf A$`

In [15]:
A = torch.Tensor([[[1., 2.], [3., 4.]],
                  [[5., 6.], [7., 8.]]])
print(A)


(0 ,.,.) = 
  1  2
  3  4

(1 ,.,.) = 
  5  6
  7  8
[torch.FloatTensor of size 2x2x2]



### Identity matrix

$\boldsymbol I_n$ - identity matrix with $n$ rows and $n$ columns  
Latex: `$\boldsymbol I_n$`

In [18]:
I = torch.eye(4)
print(I)


 1  0  0  0
 0  1  0  0
 0  0  1  0
 0  0  0  1
[torch.FloatTensor of size 4x4]



### Standard Basis Vector

$\boldsymbol e^{(i)}$ - standard basis vector $[0,...,0,1,0,...,0]$ with a 1 at position $i$  
Latex: `$\boldsymbol e^{(i)}$`

In [21]:
i = 5 # index
e = torch.zeros(9)
e[i]=1
print(e)


 0
 0
 0
 0
 0
 1
 0
 0
 0
[torch.FloatTensor of size 9]



### Diagonal Matrix

$\text{diag}(\boldsymbol a)$ - A square, diagonal matrix with diagonal entries given by $\boldsymbol a$  
Latex: `$\text{diag}(\boldsymbol a)$`

In [23]:
torch.diag(torch.randn(4))


-0.8443  0.0000  0.0000  0.0000
 0.0000  0.8987  0.0000  0.0000
 0.0000  0.0000 -0.7122  0.0000
 0.0000  0.0000  0.0000 -0.5813
[torch.FloatTensor of size 4x4]

### Random Variables

$\rm a$ - a scalar random variable  
Latex: `$\rm a$`

$\bf a$ - a vector-valued random variable  
Latex: `$\bf a$`

$\rm {a_i}$ - element $i$ of the random vector $\bf a$  
Latex: `$\rm {a_i}$`

$\bf A$ - a matrix-valued random variable  
Latex: `$\bf A$`
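
A random variable has no single value, so the closest code analogue is a sample drawn from its distribution. The sketch below assumes standard normal distributions purely for illustration (the notation itself does not imply any particular distribution), and the names `a_scalar`, `a_vector`, and `A_matrix` are made up for this example.

```python
import torch

torch.manual_seed(0)  # fix the seed so repeated runs draw the same samples

# One draw of a scalar random variable (assumed standard normal)
a_scalar = torch.randn(1).item()

# One draw of a vector-valued random variable with 3 components
a_vector = torch.randn(3)

# Element i of the random vector (0-based index in code)
i = 1
a_i = a_vector[i].item()

# One draw of a matrix-valued random variable of size 2x3
A_matrix = torch.randn(2, 3)

print(a_scalar, a_vector, a_i, A_matrix)
```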

---

## Sets and Graphs

### A set

$\mathbb{A}$ - a set  
Latex: `$\mathbb{A}$`

$\mathbb{R}$ - the set of real numbers  
Latex: `$\mathbb{R}$`

$\{ 0,1\}$ - the set containing $0$ and $1$  
Latex: `$\{ 0,1\}$`

$\{ 0,1,...,n\}$ - the set of all integers between $0$ and $n$  
Latex: `$\{ 0,1,...,n\}$`

$\left[ a, b\right]$ - the real interval including $a$ and $b$  
Latex:  `$\left[ a, b\right]$`

$(a,b ]$ - the real interval excluding $a$ but including $b$  
Latex: `$(a,b ]$`

$\mathbb{A} \backslash \mathbb{B}$ - set subtraction, i.e., the set containing the elements of $\mathbb{A}$ that are not in $\mathbb{B}$  
Latex: `$\mathbb{A} \backslash \mathbb{B}$`
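
The finite sets above map directly onto Python's built-in `set` type, while real intervals are infinite and are easier to express as membership tests. The following is a minimal sketch in plain Python; the helper names `in_closed` and `in_half_open` are hypothetical and exist only for this example.

```python
# Built-in Python sets mirror the finite-set notations above
binary = {0, 1}                    # the set containing 0 and 1
n = 4
integers = set(range(n + 1))       # the set {0, 1, ..., n}

# Set subtraction A \ B: elements of A that are not in B
difference = integers - binary

# Real intervals are infinite sets, so they are written as membership tests
def in_closed(x, a, b):
    """True if x lies in [a, b] (both endpoints included)."""
    return a <= x <= b

def in_half_open(x, a, b):
    """True if x lies in (a, b] (a excluded, b included)."""
    return a < x <= b

print(difference)
print(in_closed(0.0, 0.0, 1.0), in_half_open(0.0, 0.0, 1.0))
```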

$\mathcal{G}$ - a graph  
Latex: `$\mathcal{G}$`

$Pa_{\mathcal{G}}(\rm x_{i})$ - the parents of $\rm x_{i}$ in $\mathcal{G}$  
Latex: `$Pa_{\mathcal{G}}(\rm x_{i})$`
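
One simple way to picture the parents of a node is a directed graph stored as a dictionary from each node to its children. The graph below and the `parents` helper are hypothetical and chosen only to illustrate the notation.

```python
# Hypothetical directed graph: each key maps to the list of its children
graph = {
    "x1": ["x2", "x3"],
    "x2": ["x3"],
    "x3": [],
}

def parents(graph, node):
    """Return the nodes that have a directed edge into `node`."""
    return sorted(p for p, children in graph.items() if node in children)

print(parents(graph, "x3"))
```

Here `x3` has two parents (`x1` and `x2`), while `x1` has none.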

---

## Indexing

$a_i$ - element $i$ of vector $\boldsymbol a$ (indexing starts at 0)   
Latex: `$a_i$`

In [28]:
i = 1
a = torch.Tensor([1,2,3,4,5])
print(a[i])

2.0


$a_{-i}$ - all elements of vector $\boldsymbol a$ except for element $i$  
Latex: `$a_{-i}$`

In [35]:
i = 2 # element 3
[b for k, b in enumerate(a) if k != i] # drop by position, so duplicate values are kept

[1.0, 2.0, 4.0, 5.0]

$A_{ij}$ - element $i,j$ of a matrix $\boldsymbol A$  
Latex: `$A_{ij}$`

In [37]:
A = torch.randn((4,4))
i, j = 2,2
print(A, A[i][j])


-1.2969  1.3356 -0.8195 -1.5841
 0.0877  0.5207 -1.0496  1.7544
 0.4161  0.6629 -0.5315  0.5225
 0.4244 -0.4364 -1.0512  0.6456
[torch.FloatTensor of size 4x4]
 -0.5315296053886414


$\boldsymbol A_{i,:}$ - row $i$ of matrix $\boldsymbol A$  
Latex: `$\boldsymbol A_{i,:}$`

In [41]:
i = 2 # i.e., row 3
A[i,:]


 0.4161
 0.6629
-0.5315
 0.5225
[torch.FloatTensor of size 4]

$\boldsymbol A_{:,i}$ - column $i$ of matrix $\boldsymbol A$  
Latex: `$\boldsymbol A_{:,i}$`

In [40]:
i = 2 # i.e., column 3
A[:,i]


-0.8195
-1.0496
-0.5315
-1.0512
[torch.FloatTensor of size 4]

$\mathsf A_{i,j,k}$ - element $(i,j,k)$ of a 3-D tensor $\mathsf A$  
Latex: `$\mathsf A_{i,j,k}$`

In [46]:
i, j , k = 1,1,2 
A = torch.randn((2,2,3))
print(A)
print(A[i, j ,k])


(0 ,.,.) = 
  0.1947 -1.8669 -1.1642
  0.2311  0.8365  0.2833

(1 ,.,.) = 
 -0.0558  0.4570  1.2349
 -1.7568 -1.2977  1.1586
[torch.FloatTensor of size 2x2x3]

1.1586178541183472


$\mathsf A_{:,:,i}$ - 2-D slice of a 3-D tensor  
Latex: `$\mathsf A_{:,:,i}$`

In [47]:
i = 1 # the slice printed below is taken at index 1 of the last dimension
A[:,:,i]


-1.8669  0.8365
 0.4570 -1.2977
[torch.FloatTensor of size 2x2]

## Linear Algebra Operations

$\boldsymbol A^T$ - transpose of matrix $\boldsymbol A$  
Latex: `$\boldsymbol A^T$`

In [64]:
A = torch.randn((3,2))
print(A)
print(A.t())


 1.2385 -1.0991
 1.7441 -1.3487
-1.1186  0.3756
[torch.FloatTensor of size 3x2]


 1.2385  1.7441 -1.1186
-1.0991 -1.3487  0.3756
[torch.FloatTensor of size 2x3]



$\boldsymbol A^+$ - the Moore-Penrose pseudoinverse of matrix $\boldsymbol A$  
Latex: `$\boldsymbol A^+$`
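
The pseudoinverse can be sketched in PyTorch as follows. This assumes a recent PyTorch release that provides `torch.linalg.pinv` (older releases expose `torch.pinverse` instead). The matrix values are arbitrary and chosen only for illustration.

```python
import torch

# A non-square matrix has no ordinary inverse, but it does have a pseudoinverse
A = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])

A_pinv = torch.linalg.pinv(A)  # size 2x3

# One defining property of the Moore-Penrose pseudoinverse: A A^+ A = A
reconstructed = A @ A_pinv @ A

print(A_pinv.shape)
print(torch.allclose(reconstructed, A, atol=1e-4))
```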

$\boldsymbol A^{-1}$ - the inverse matrix of the square matrix $\boldsymbol A$  
Latex: `$\boldsymbol A^{-1}$`

In [63]:
A = torch.randn((2,2))
print(A)
print(torch.inverse(A))


-1.0794  0.4738
 1.5094  1.2649
[torch.FloatTensor of size 2x2]


-0.6080  0.2277
 0.7255  0.5188
[torch.FloatTensor of size 2x2]



$\boldsymbol A \odot \boldsymbol B$ - element-wise (Hadamard) product of $\boldsymbol A$ and $\boldsymbol B$  
Latex: `$\boldsymbol A \odot \boldsymbol B$`

In [67]:
A = torch.randn((2,2))
B = torch.randn((2,2))
print(A.mul(B))


-0.1367 -1.6345
 1.1502  1.4149
[torch.FloatTensor of size 2x2]



$\text{det}(\boldsymbol A)$ - determinant of $\boldsymbol A$  
Latex: `$\text{det}(\boldsymbol A)$ `
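
A small example of the determinant, using `torch.det` on a fixed 2x2 matrix so the result can be checked by hand. The matrix values are arbitrary and chosen only for illustration.

```python
import torch

A = torch.tensor([[1., 2.],
                  [3., 4.]])

# For a 2x2 matrix, det(A) = A[0,0]*A[1,1] - A[0,1]*A[1,0] = 1*4 - 2*3 = -2
det_A = torch.det(A)
print(det_A)

# A nonzero determinant means A is invertible, and det(A^{-1}) = 1 / det(A)
det_inv = torch.det(torch.inverse(A))
print(det_inv)
```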

---

## References
- [Latex commands for MathJax](http://www.onemathematicalcat.org/MathJaxDocumentation/TeXSyntax.htm)
- [Deep Learning Book](http://www.deeplearningbook.org/)
- [Deep Learning Book (Official Notations)](https://github.com/goodfeli/dlbook_notation)