<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#-Array-operations-in-PyTorch" data-toc-modified-id="-Array-operations-in-PyTorch-1"><span class="toc-item-num">1&nbsp;&nbsp;</span> Array operations in PyTorch</a></span><ul class="toc-item"><li><span><a href="#Multiplication" data-toc-modified-id="Multiplication-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Multiplication</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.1.2"><span class="toc-item-num">1.1.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.1.3"><span class="toc-item-num">1.1.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Transpose" data-toc-modified-id="Transpose-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Transpose</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.2.1"><span class="toc-item-num">1.2.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.2.2"><span class="toc-item-num">1.2.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.2.3"><span class="toc-item-num">1.2.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Gradient" data-toc-modified-id="Gradient-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Gradient</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.3.2"><span class="toc-item-num">1.3.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." 
data-toc-modified-id="Example-3.-1.3.3"><span class="toc-item-num">1.3.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Eigenvalues,-eigenvectors" data-toc-modified-id="Eigenvalues,-eigenvectors-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Eigenvalues, eigenvectors</a></span><ul class="toc-item"><li><span><a href="#Example-1." data-toc-modified-id="Example-1.-1.4.1"><span class="toc-item-num">1.4.1&nbsp;&nbsp;</span>Example 1.</a></span></li><li><span><a href="#Example-2." data-toc-modified-id="Example-2.-1.4.2"><span class="toc-item-num">1.4.2&nbsp;&nbsp;</span>Example 2.</a></span></li><li><span><a href="#Example-3." data-toc-modified-id="Example-3.-1.4.3"><span class="toc-item-num">1.4.3&nbsp;&nbsp;</span>Example 3.</a></span></li></ul></li><li><span><a href="#Least-square-norm" data-toc-modified-id="Least-square-norm-1.5"><span class="toc-item-num">1.5&nbsp;&nbsp;</span>Least square norm</a></span></li></ul></li></ul></div>

<h1> Array operations in PyTorch</h1>
<p class="author">(Szabó Sándor, 27. May 2020)</p>

<p class="abstract">We will consider some common operations for matrices:</p>
<p>
<ul class="square">
    <li>multiplication</li>
    <li>transpose</li>
    <li>gradient</li>
    <li>eigenvalues, eigenvectors</li>
    <li>least square norm</li>
</ul>
</p> 

In [1]:
# Import torch and other required modules
import torch

<h2>Multiplication</h2>

<div class="background">
<p class="normal">
    Since matrix multiplication is <span style="font-style: italic;">not</span> commutative, the order of the factors matters.
</p>
</div>

<h3>Example 1.</h3>

In [2]:
# Example 1
A = torch.randn(2, 3)
B = torch.randn(3, 4)
torch.mm(A, B)

tensor([[ 0.9140, -0.5622, -1.5553,  2.3507],
        [-1.3446,  0.9200,  1.5791, -1.9014]])

<h3>Example 2.</h3>

<div class="background">
<p class="normal">However, if you swap the order, the inner dimensions no longer match ($3\times 4$ times $2\times 3$) and you obtain an error:</p>
</div>

In [3]:
# Example 2 - breaking
torch.mm(B, A)

RuntimeError: size mismatch, m1: [3 x 4], m2: [2 x 3] at C:\Users\builder\AppData\Local\Temp\pip-req-build-9msmi1s9\aten\src\TH/generic/THTensorMath.cpp:197

<div class="background remarks">
    <p class="normal">
    <span class="code">torch.mm</span> does not broadcast. 
    For broadcasting matrix products, see <span class="code">torch.matmul()</span>.
    </p>
</div>
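
<div class="background">
<p class="normal">The difference between the two functions can be sketched as follows. The batch size of 5 and the tensor names are chosen only for illustration.</p>
</div>

```python
import torch

# A batch of five 2x3 matrices and a single 3x4 matrix (illustrative shapes)
batch = torch.randn(5, 2, 3)
B = torch.randn(3, 4)

# torch.matmul broadcasts B across the leading batch dimension
out = torch.matmul(batch, B)
print(out.shape)

# Each slice agrees with the plain 2-D product computed by torch.mm
same = all(
    torch.allclose(out[i], torch.mm(batch[i], B), atol=1e-5)
    for i in range(batch.shape[0])
)
print(same)

# torch.mm accepts only 2-D tensors, so the batched call fails
try:
    torch.mm(batch, B)
    mm_failed = False
except RuntimeError:
    mm_failed = True
print(mm_failed)
```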

<h3>Example 3.</h3>

<div class="background">
<p class="normal">The <span class="code">@</span> operator is shorthand for <span class="code">torch.matmul</span>, so the same product can be written as</p>
</div>

In [4]:
# Example 3
A @ B

tensor([[ 0.9140, -0.5622, -1.5553,  2.3507],
        [-1.3446,  0.9200,  1.5791, -1.9014]])

<h2>Transpose</h2>

<div class="background">
<p class="normal">Transposing a matrix turns its rows into columns and its columns into rows.</p>
</div>

<h3>Example 1.</h3>

In [5]:
# Example 1 
torch.t(A)

tensor([[ 2.5627, -1.9438],
        [ 0.9010,  0.0195],
        [-0.6105,  1.0287]])

<h3>Example 2.</h3>

<div class="background">
<p class="normal">Again, you should be careful, because in general $(AB)^T\neq A^T B^T$. Indeed, the difference is nonzero:</p>
</div>

In [6]:
# Example 2
C = torch.randn(3, 3)
D = torch.randn(3, 3)
torch.t(C @ D) - (torch.t(C) @ torch.t(D))

tensor([[-1.8713, -5.5473, -0.1341],
        [ 2.8685,  0.4661, -2.0915],
        [ 1.6289,  2.5733,  1.4052]])

<div class="background">
<p class="normal">The correct identity is $(AB)^T=B^T A^T$.</p>
</div>
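
<div class="background">
<p class="normal">A short index computation shows why the order reverses (a sketch, assuming $A$ is $m\times n$ and $B$ is $n\times p$, so that $AB$ is defined):</p>
$$
\left((AB)^T\right)_{ij} = (AB)_{ji} = \sum_{k=1}^{n} A_{jk} B_{ki} = \sum_{k=1}^{n} (B^T)_{ik} (A^T)_{kj} = \left(B^T A^T\right)_{ij}.
$$
</div>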

<h3>Example 3.</h3>

In [7]:
# Example 3
torch.t(C @ D) - (torch.t(D) @ torch.t(C))

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

<h2>Gradient</h2>

<div class="background">
    <p class="normal">In machine learning we often need to minimize various 
        error functions, which requires their derivatives.</p>
    <p class="normal">
    The derivative of a multivariable vector-valued function is a matrix, the so-called 
    <a href="https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant">
    Jacobian matrix</a>; for a scalar-valued function it reduces to a single row, the gradient.
    </p>
</div>

<div class="background">
<p class="normal">To have PyTorch compute derivatives with respect to a tensor, create 
    the tensor with <span class="code">requires_grad=True</span>.</p>
<p class="normal">In Example 1 we use the Frobenius norm, which coincides with the 
    Euclidean vector norm when applied to a vector.</p>
</div>

<h3>Example 1.</h3>

In [8]:
# Example 1
# Create tensors.
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3, 1, requires_grad=True)
b = torch.randn(3, 1, requires_grad=True)
y = torch.norm(A @ x - b, p='fro')
print("A: \n", A)
print("x: \n", x)
print("b: \n", b)
print("y: \n", y)

A: 
 tensor([[ 1.8279,  3.2379,  0.1823],
        [-0.6493, -0.2778, -0.6017],
        [ 0.1859, -0.2227, -0.6932]], requires_grad=True)
x: 
 tensor([[ 1.1165],
        [-0.4929],
        [-0.2975]], requires_grad=True)
b: 
 tensor([[-0.1906],
        [ 0.9892],
        [-0.7800]], requires_grad=True)
y: 
 tensor(1.9980, grad_fn=<NormBackward0>)


<div class="background">
    <p class="normal">Now we calculate the derivatives $\dfrac{\partial y}{\partial A}$, 
    $\dfrac{\partial y}{\partial x}$, $\dfrac{\partial y}{\partial b}$.</p>
    <p class="normal">To compute the derivatives, we call the <span class="code">.backward 
        </span> method on our result $y$. </p>
</div>

In [9]:
# Compute derivatives
y.backward()

# Display gradients
print('∂y/∂A: \n', A.grad)
print('∂y/∂x: \n', x.grad)
print('∂y/∂b: \n', b.grad)

∂y/∂A: 
 tensor([[ 0.3249, -0.1434, -0.0865],
        [-0.7814,  0.3449,  0.2082],
        [ 0.7284, -0.3216, -0.1941]])
∂y/∂x: 
 tensor([[1.1075],
        [0.9912],
        [0.0218]])
∂y/∂b: 
 tensor([[-0.2910],
        [ 0.6998],
        [-0.6524]])
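
<div class="background">
    <p class="normal">The gradients above can be cross-checked against closed-form expressions. The following is a minimal sketch, assuming the residual $r = Ax-b$ is nonzero and writing $u = r/\|r\|$, so that $\dfrac{\partial y}{\partial A}=u\,x^T$, $\dfrac{\partial y}{\partial x}=A^T u$ and $\dfrac{\partial y}{\partial b}=-u$. The fixed seed and the names <span class="code">grad_A</span>, <span class="code">grad_x</span>, <span class="code">grad_b</span> are introduced here only for illustration.</p>
</div>

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3, 1, requires_grad=True)
b = torch.randn(3, 1, requires_grad=True)

# Euclidean (Frobenius) norm of the residual, differentiated by autograd
y = torch.norm(A @ x - b, p='fro')
y.backward()

# Closed-form gradients, computed without autograd
with torch.no_grad():
    r = A @ x - b
    u = r / r.norm()      # unit residual
    grad_A = u @ x.t()    # dy/dA = u x^T
    grad_x = A.t() @ u    # dy/dx = A^T u
    grad_b = -u           # dy/db = -u

print(torch.allclose(A.grad, grad_A, atol=1e-5))
print(torch.allclose(x.grad, grad_x, atol=1e-5))
print(torch.allclose(b.grad, grad_b, atol=1e-5))
```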


<h3>Example 2.</h3>

<div class="background">
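    <p class="normal">Instead of the Frobenius norm we can choose any $p$-norm with $1\leq p<\infty$.
    In the next example we choose $p=1$.</p>
    <p class="normal">Note that <span class="code">.backward</span> adds to the values already stored 
    in <span class="code">.grad</span>. The gradients from Example 1 are not reset here, so the tensors 
    printed below are the sums $\dfrac{\partial y}{\partial \cdot}+\dfrac{\partial w}{\partial \cdot}$. 
    For instance, $\dfrac{\partial w}{\partial b}$ alone is $-\operatorname{sign}(Ax-b)=(-1,\,1,\,-1)^T$.</p>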
</div>

In [10]:
# Example 2 
w = torch.norm(A @ x - b, p=1)
              
# Compute derivatives
w.backward()

# Display gradients
print('∂w/∂A: \n', A.grad)
print('∂w/∂x: \n', x.grad)
print('∂w/∂b: \n', b.grad)

∂w/∂A: 
 tensor([[ 1.4414, -0.6363, -0.3840],
        [-1.8979,  0.8379,  0.5056],
        [ 1.8450, -0.8145, -0.4915]])
∂w/∂x: 
 tensor([[3.7706],
        [4.2842],
        [0.1126]])
∂w/∂b: 
 tensor([[-1.2910],
        [ 1.6998],
        [-1.6524]])
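
<div class="background">
    <p class="normal">Because <span class="code">.backward</span> accumulates into <span class="code">.grad</span>, the gradients of a second function are obtained on their own only after the stored values are cleared. The following is a minimal sketch of one way to do this, using a fixed seed and fresh tensors rather than the ones above. The name <span class="code">expected</span> is introduced only for illustration.</p>
</div>

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3, 1, requires_grad=True)
b = torch.randn(3, 1, requires_grad=True)

# First function: Euclidean norm of the residual
y = torch.norm(A @ x - b, p=2)
y.backward()

# Clear the stored gradients before differentiating a second function
for t in (A, x, b):
    t.grad = None

# Second function: 1-norm of the residual
w = torch.norm(A @ x - b, p=1)
w.backward()

# After the reset, dw/db is -sign(Ax - b), with entries +1 or -1
with torch.no_grad():
    expected = -torch.sign(A @ x - b)
print(torch.allclose(b.grad, expected))

# Without a reset, a further backward pass is added on top
w2 = torch.norm(A @ x - b, p=1)
w2.backward()
print(torch.allclose(b.grad, 2 * expected))
```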


<h3>Example 3.</h3>

<div class="background">
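    <p class="normal">The infinity norm $p=\infty$ is requested with a floating-point infinity such as 
    <span class="code">float('inf')</span>. The bare name <span class="code">inf</span> is not defined 
    unless it is imported first, so the cell below fails with a <span class="code">NameError</span>.</p>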
</div>

In [11]:
# Example 3 - breaking 
z = torch.norm(A @ x - b, p=inf)
              
# Compute derivatives
z.backward()

# Display gradients
print('∂z/∂A: \n', A.grad)
print('∂z/∂x: \n', x.grad)
print('∂z/∂b: \n', b.grad)

NameError: name 'inf' is not defined
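
<div class="background">
    <p class="normal">A working variant can be sketched as follows, assuming <span class="code">math.inf</span> is used for the infinity value and a fixed seed replaces the tensors above. The infinity norm is the largest absolute entry of the residual, so (when that entry is unique) only one component of the residual receives a nonzero gradient.</p>
</div>

```python
import math
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3, 1, requires_grad=True)
b = torch.randn(3, 1, requires_grad=True)

r = A @ x - b
z = torch.norm(r, p=math.inf)  # infinity norm: largest absolute entry of r
z.backward()

# The norm equals max |r_i|
with torch.no_grad():
    print(torch.isclose(z, r.abs().max()))

# Only the entry of b matching the largest residual has a nonzero gradient
print('∂z/∂b: \n', b.grad)
```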

<h2>Eigenvalues, eigenvectors</h2>

<div class="background">
    <p class="normal">When working with linear operators (matrices), we often need to 
    determine their eigenvalues and eigenvectors. 
    To do this, we use <span class="code">torch.eig</span>.</p>
</div>

<h3>Example 1.</h3>

In [15]:
# Example 1 
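# Eigenvectors are returned as the columns of eigvectors
(eigvalues, eigvectors) = torch.eig(A, eigenvectors=True)
for i in range(3):
    print('eigenvalue: ', eigvalues[i])
    print('eigenvector column: ', eigvectors[:, i])
    print('\n')

eigenvalue:  tensor([0.8710, 1.0656], grad_fn=<SelectBackward>)
eigenvector column:  tensor([ 0.9046, -0.2718,  0.0798], grad_fn=<SelectBackward>)


eigenvalue:  tensor([ 0.8710, -1.0656], grad_fn=<SelectBackward>)
eigenvector column:  tensor([ 0.0000,  0.3032, -0.0975], grad_fn=<SelectBackward>)


eigenvalue:  tensor([-0.8852,  0.0000], grad_fn=<SelectBackward>)
eigenvector column:  tensor([-0.4537,  0.3336,  0.8264], grad_fn=<SelectBackward>)




<div class="remarks">
    <p class="normal">Since an eigenvalue can be complex, each eigenvalue is returned as a pair: the first element is 
    the real part and the second element is the imaginary part. The eigenvectors are the columns of the second 
    result. For a complex-conjugate pair of eigenvalues, such as the first two above, the two corresponding 
    columns hold the real and the imaginary part of the eigenvector, that is, the eigenvectors are 
    column 0 $\pm\, i\,\cdot$ column 1.
    </p>
</div>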
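
<div class="background">
    <p class="normal">In recent PyTorch releases <span class="code">torch.eig</span> is no longer available and <span class="code">torch.linalg.eig</span> takes its place. It returns complex tensors directly, with the eigenvectors again stored as columns. The following is a minimal sketch, assuming such a release is installed. The rotation matrix <span class="code">M</span> is chosen here only because its eigenvalues $\pm i$ are known in advance.</p>
</div>

```python
import torch

# 90-degree rotation matrix, whose eigenvalues are +i and -i
M = torch.tensor([[0., -1.],
                  [1.,  0.]])

# Assumption: a PyTorch release that provides torch.linalg.eig
eigvalues, eigvectors = torch.linalg.eig(M)  # both results are complex tensors

max_residual = 0.0
for i in range(M.shape[0]):
    lam = eigvalues[i]
    v = eigvectors[:, i]                       # i-th eigenvector is the i-th column
    residual = M.to(v.dtype) @ v - lam * v     # vanishes when M v = lambda v
    max_residual = max(max_residual, residual.abs().max().item())
    print('eigenvalue: ', lam)
    print('eigenvector: ', v)

print(max_residual)
```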

<h3>Example 2.</h3>

<div class="background">
    <p class="normal">
        Here is an example where the matrix has a single eigenvalue, $\lambda=2$, of 
        algebraic multiplicity three. The columns returned for it are three different 
        vectors, but only two of them are linearly independent.
    </p>
</div>

In [16]:
# Example 2 
C = torch.tensor([[0., -1., 0], [4., 4., 0], [2., 1., 2.]])

# Eigenvectors are returned as the columns of eigvectors
(eigvalues, eigvectors) = torch.eig(C, eigenvectors=True)
for i in range(3):
    print('eigenvalue: ', eigvalues[i])
    print('eigenvector: ', eigvectors[:, i])
    print('\n')

eigenvalue:  tensor([2., 0.])
eigenvector:  tensor([0.0000, 0.0000, 1.0000])


eigenvalue:  tensor([2.0000, 0.0000])
eigenvector:  tensor([-0.4472,  0.8944,  0.0000])


eigenvalue:  tensor([2.0000, 0.0000])
eigenvector:  tensor([ 0.4082, -0.8165, -0.4082])
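
<div class="background">
    <p class="normal">How many linearly independent eigenvectors belong to the eigenvalue $2$ can be checked from the rank of $C-2I$. The following is a minimal sketch, assuming a PyTorch release that provides <span class="code">torch.linalg.svdvals</span>. The threshold <span class="code">1e-6</span> and the hand-picked vectors <span class="code">v1</span>, <span class="code">v2</span> are illustrative choices.</p>
</div>

```python
import torch

C = torch.tensor([[0., -1., 0.], [4., 4., 0.], [2., 1., 2.]])
shifted = C - 2.0 * torch.eye(3)  # C - 2I

# Numerical rank: number of singular values above a small threshold
singular_values = torch.linalg.svdvals(shifted)
rank = int((singular_values > 1e-6).sum())

# The eigenspace dimension is n - rank(C - 2I)
geometric_multiplicity = C.shape[0] - rank
print(rank, geometric_multiplicity)

# Two independent eigenvectors for the eigenvalue 2
v1 = torch.tensor([0., 0., 1.])
v2 = torch.tensor([1., -2., 0.])
print(shifted @ v1, shifted @ v2)
```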




<h3>Example 3.</h3>

<div class="background">
    <p class="normal">Eigenvalues are defined only for square matrices, so <span class="code">torch.eig</span> rejects a non-square input.</p>
</div>

In [18]:
# Example 3 
D = torch.tensor([[0., -1., 0], [4., 4., 0]])

(eigvalues, eigvectors) = torch.eig(D, eigenvectors=True)

RuntimeError: invalid argument 1: A should be square at C:\Users\builder\AppData\Local\Temp\pip-req-build-9msmi1s9\aten\src\TH/generic/THTensorLapack.cpp:195

<h2>Least square norm</h2>