<center>
    <tr>
    <td><img src="images/Quansight_Logo_Lockup_1.png" width="25%"></img></td>
    </tr>
</center>

---

# PyTorch Tensor Basics

---
## Why PyTorch Tensors?


- [PyTorch](https://pytorch.org): Python-based scientific computing package offering:
  - Tensor support (replacement for NumPy)
  - Fast & flexible platform for deep learning research
+ [What is PyTorch?](https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html) at [`pytorch.org`](https://pytorch.org) provides a quick tour of principal topics (e.g., tensor indexing, arithmetic operations, elementwise functions, linear algebra, etc.)
+ Ostensibly similar functionality/API to NumPy; why bother?

1. **GPU computation**:
  + GPUs (graphics processing units) widely available but challenging to program
  + PyTorch eases GPU programming burden with Python interface
2. **Automatic differentiation**: 
  + PyTorch includes [`autograd`](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html) for *backpropagation*
  + Management of gradients (and associated memory) simplified with PyTorch

---

##  Quick Tour of Torch tensors

+ Principal PyTorch data structure: *tensor*
+ Similar to NumPy `ndarray` with:

    + support for GPU computing
    + support for automatic differentiation (see `autograd`)

In [None]:
import torch

t = torch.tensor([[1,2,3],[4,5,6]])
print(t)

In [None]:
t.t()

In [None]:
print(t.permute(-1,0)) # More general "transpose" in higher dimensions

In [None]:
print('shape:',t.shape)
print('size:',t.size())
print('dim:',t.dim())
print('type:',t.type())
print('num elements:', torch.numel(t))
print('device (cpu or gpu):', t.device)

+ As with NumPy, variety of PyTorch data types for arrays:

|  NumPy dtype | PyTorch dtype | Alternative | Tensor class |
|:-:|:-:|:-:|:-:|
| `np.int16`  |`torch.int16`  |`torch.short` |`ShortTensor` |
| `np.int32`  |`torch.int32`  |`torch.int`   |`IntTensor`   |
| `np.int64`  |`torch.int64`  |`torch.long`  |`LongTensor`  |
| `np.float16`|`torch.float16`|`torch.half`  |`HalfTensor`  |
| `np.float32`|`torch.float32`|`torch.float` |`FloatTensor` |
| `np.float64`|`torch.float64`|`torch.double`|`DoubleTensor`|
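
A minimal sketch of how the dtypes in the table can be requested and converted. The variable names (`t16`, `t64`, `arr32`) are illustrative only and not part of the original notebook.

```python
import numpy as np
import torch

# Request a dtype at construction time
t16 = torch.tensor([1, 2, 3], dtype=torch.int16)

# Cast an existing tensor; `to` returns a new tensor and leaves t16 unchanged
t64 = t16.to(torch.float64)
print(t16.dtype, t64.dtype)

# The "Alternative" names in the table are aliases for the same dtype
print(torch.short == torch.int16)

# The NumPy dtype carries over when converting an array
arr32 = np.arange(3, dtype=np.float32)
print(torch.from_numpy(arr32).dtype)
```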

#### Changing tensor views

+ Similar to reshaping: `view` returns tensor with new shape sharing same underlying data (no copy)

In [None]:
t = torch.tensor([[1,2,3],[4,5,6]])

In [None]:
print(t)
print('View example:\n', t.view(1,-1))
print('View example:\n', t.view(-1,1))
print('View example:\n', t.view(3,2))
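
Since a view shares storage with the tensor it came from, writing through the view also changes the original. A short sketch, using the illustrative names `base` and `v` (not from the notebook):

```python
import torch

base = torch.tensor([[1, 2, 3], [4, 5, 6]])
v = base.view(3, 2)      # same data, different shape

v[0, 0] = 100            # write through the view
print(base)              # the original tensor sees the change

# Both tensors start at the same memory address
print(base.data_ptr() == v.data_ptr())
```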

#### Slicing

Second row (index 1; indexing starts at 0)

In [None]:
print('Matlab or numpy style slicing:\n',t[1,:])

Second column

In [None]:
print('Matlab or numpy style slicing:\n',t[:,1])

Lower right most element

In [None]:
print('Matlab or numpy style slicing:\n',t[-1,-1])

Lower right most 1 x 1 submatrix

In [None]:
print('Matlab or numpy style slicing:\n',t[-1:,-1:])

Lower right most 2 x 2 submatrix

In [None]:
print('Matlab or numpy style slicing:\n',t[-2:,-2:])

#### Torch tensors and Numpy arrays

+ Construction of PyTorch tensors from NumPy arrays

In [None]:
import numpy as np
a = np.random.randn(2, 4)
t = torch.from_numpy(a)

+ Back to numpy

In [None]:
b = t.numpy()

In [None]:
print('numpy:\n', a)
print('torch:\n', t)
print(type(a))
print(type(t))
print(type(b))
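
One detail worth checking when moving between the two libraries: `torch.from_numpy` shares memory with the source array, whereas `torch.tensor` copies it. A small sketch with illustrative names (`arr`, `shared`, `copied`):

```python
import numpy as np
import torch

arr = np.zeros(3)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # independent copy

arr[0] = 1.0                     # modify the NumPy array in place
print(shared)                    # reflects the change
print(copied)                    # unaffected
```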

#### Creating PyTorch tensors

+ Many PyTorch functions/methods have names similar to NumPy counterparts (e.g., `zeros`, `ones`, `rand`, `randn`, etc.)

In [None]:
print('Zero tensor:\n', torch.zeros(2,3,4))

In [None]:
print('Ones tensor:\n', torch.ones(2,3,4))

In [None]:
print('Random - Uniform, between 0 and 1:\n', torch.rand(2,3,4))
print('Random - Normal, mean 0 and standard deviation 1 :\n', torch.randn(2,3,4))

#### Tensor concatenation

In [None]:
t1 = torch.tensor([[1,2,3],[4,5,7]])
t2 = torch.tensor([[8,9,10],[11,12,13]])

In [None]:
print('t1:\n', t1)
print('t2:\n', t2)

Concatenating two tensors along 0 (first, rows in this case) dimension

In [None]:
print(torch.cat((t1,t2),0))

Concatenating two tensors along 1 (second, columns in this case) dimension

In [None]:
print(torch.cat((t1,t2),1))

In [None]:
t = torch.tensor([[1,2,3],[4,5,6],[7,8,9]])

Computing cumulative sum

In [None]:
print(t)
print('Sum along columns:\n', t.cumsum(-1))
print('Sum along rows:\n', t.cumsum(-2))

#### "Unsqueezing" tensors (i.e., adding dimensions)

<center>
    <tr>
    <td><img src="images/tensor-concatination.png" width="50%"></img></td>
    </tr>
</center>

+ First, NumPy way:

In [None]:
x = np.random.rand(3,4)
print('Before', x.shape)
x = x.reshape(3,1,4)
#x = x[:,np.newaxis,:]   # Equivalent to reshaping
#x = x[:,None,:]         # Equivalent to reshaping
print('After', x.shape)

+ Next, PyTorch way:

In [None]:
t = torch.rand(3,4)
print('Before:', t.shape)
t1 = t.unsqueeze(1) # Inserts singleton dimension in position 1
print('After:', t1.shape)

+ Let's consider another $3\times4$ matrix (e.g., grayscale image or frame)

In [None]:
t2 = torch.rand(3,4)
print(t2.shape)

+ Say we want to combine `t` and `t2` as successive *frames* in first dimension

In [None]:
print(t, t.size())
t.unsqueeze_(0) # inplace unsqueeze, we just added a dimension
print(t, t.size())

In [None]:
print(t2, t2.size())
t2.unsqueeze_(0)
print(t2, t2.size())

In [None]:
t_and_t2 = torch.cat((t,t2),0) # concatenates in first dimension
print('First dimension iterates over frames:\n', t_and_t2.shape)
print(t_and_t2)
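
The unsqueeze-then-concatenate pattern above can also be written with `torch.stack`, which inserts the new dimension itself. A sketch under the assumption that all frames have the same shape; `f1`, `f2`, `stacked` and `manual` are illustrative names:

```python
import torch

f1 = torch.rand(3, 4)
f2 = torch.rand(3, 4)

# stack adds a new leading dimension and joins along it
stacked = torch.stack((f1, f2), 0)

# Equivalent to unsqueezing each frame, then concatenating
manual = torch.cat((f1.unsqueeze(0), f2.unsqueeze(0)), 0)

print(stacked.shape)
print(torch.equal(stacked, manual))
```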

#### Other operations

+ Testing for equality
+ "Vectorized" functions

In [None]:
t1 = torch.tensor([[1,2,3],[4,5,6],[7,8,9]])
t2 = torch.tensor([[1,20,3],[40,5,6],[7,8,9]])

+ Elementwise equality test (inadvisable with floats)

In [None]:
print(torch.eq(t1,t2))
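
For floating-point tensors, a tolerance-based comparison is usually the safer choice. The sketch below uses the classic `0.1 + 0.2` case in double precision; `lhs` and `rhs` are illustrative names.

```python
import torch

lhs = torch.tensor([0.1], dtype=torch.float64) + torch.tensor([0.2], dtype=torch.float64)
rhs = torch.tensor([0.3], dtype=torch.float64)

print(torch.eq(lhs, rhs))        # exact comparison fails due to rounding
print(torch.isclose(lhs, rhs))   # elementwise comparison within a tolerance
print(torch.allclose(lhs, rhs))  # single True/False for the whole tensor
```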

In [None]:
t = torch.rand(2,3)
print('t:\n', t)

Log

In [None]:
print('log t:\n', torch.log(t))

Negative

In [None]:
print('neg t:\n', torch.neg(t))

Power

In [None]:
print('power t:\n', torch.pow(t, 2))

Reciprocal

In [None]:
print('reciprocal t:\n', torch.reciprocal(t))

Round

In [None]:
print('round t:\n', torch.round(t))

Sigmoid

In [None]:
print('sigmoid t:\n', torch.sigmoid(t))

Sign

In [None]:
print('sign t:\n', torch.sign(t))

sqrt

In [None]:
print('sqrt t:\n', torch.sqrt(t))

argmax, along 0-th dimension (that moves along the rows)

In [None]:
print('argmax t:\n', torch.argmax(t, 0))

mean, along 1-th dimension (that moves along the columns)

In [None]:
print('mean t:\n', torch.mean(t, 1))

#### Vector & Matrix products

In [None]:
t1 = torch.tensor([0,1,0])
t2 = torch.tensor([1,0,0])
print(t1.cross(t2, dim=0))  # Explicit dim; omitting it is deprecated in recent PyTorch releases

In [None]:
t1 = torch.randn(4,3)
t2 = torch.randn(4,3)

+ Row-wise vector cross product 

In [None]:
t1_cross_t2 = t1.cross(t2, dim=1)  # Cross product of matching rows (each row has 3 components)
print(t1_cross_t2)

+ Confirm respective cross products are orthogonal to corresponding rows of `t1` and `t2`

In [None]:
for i in range(t1.size(0)):
    # Both dot products should be zero up to floating-point rounding
    print('Row %d' % i, t1[i,:].dot(t1_cross_t2[i,:]), t2[i,:].dot(t1_cross_t2[i,:]))

+ Matrix multiplication through `torch.Tensor.mm` method

In [None]:
m1 = torch.randn(4,3)
m2 = torch.randn(3, 2)

In [None]:
print('m1:\n', m1)
print('m2:\n', m2)
# Matrix multiplication
print('Matrix multiplication:\n', m1.mm(m2))

In [None]:
m1 = torch.tensor([[1,2],[3,4]], dtype=torch.float32)
m2 = torch.tensor([[2,4],[-1,6]], dtype=torch.float32)
print('m1:\n', m1)
print('m2:\n', m2)
# Element-wise multiplication ("Hadamard product")
print('Element-wise multiplication:\n', m1.mul(m2))
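
The same two products are also available as operators: `@` for matrix multiplication and `*` for element-wise multiplication. A brief check using the same matrices as above:

```python
import torch

m1 = torch.tensor([[1., 2.], [3., 4.]])
m2 = torch.tensor([[2., 4.], [-1., 6.]])

# Operator forms agree with the method forms
print(torch.equal(m1 @ m2, m1.mm(m2)))
print(torch.equal(m1 * m2, m1.mul(m2)))
```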

---

## CUDA GPU Support

+ Checking if CUDA GPU is available

In [None]:
result = torch.cuda.is_available()
print('CUDA available (T/F):', result)

+  How many CUDA devices are available?

In [None]:
result = torch.cuda.device_count()
print('Number of CUDA devices available:', result)

+  PyTorch tensors have `device` attribute; can be set according to availability of GPU hardware

In [None]:
a = torch.ones(10)
print(f'a.device: {a.device}')
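
A common way to set the device according to availability is to pick it once and pass it along, so the same code runs with or without a GPU. This is a sketch of that pattern; `device`, `dev_a` and `dev_b` are illustrative names.

```python
import torch

# Pick the GPU when present, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

dev_a = torch.ones(10, device=device)   # create directly on the chosen device
dev_b = torch.zeros(10).to(device)      # or move an existing tensor

# Operands must live on the same device
print((dev_a + dev_b).device)
```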

#### Timing tensor computations

In [None]:
import time

In [None]:
a = torch.ones((300000))
print(f'a.device: {a.device}')
start_time = time.time()
print(a.sum().item())
end_time = time.time()
print('It took {} seconds'.format(end_time - start_time))
print('a is sitting on', a.device)

In [None]:
is_cuda = torch.cuda.is_available()
if not is_cuda:
    print('Nothing to do here')
else:
    a_ = a.cuda()
    print(f'a_.device: {a_.device}')
    start_time = time.time()
    print(a_.sum().item())  # .item() waits for the GPU result, matching the CPU cell
    end_time = time.time()
    print('It took {} seconds'.format(end_time - start_time))
    print('a_ is sitting on', a_.device)
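
CUDA operations are launched asynchronously, so wall-clock timings can be misleading unless the GPU is synchronized before reading the clock. The helper below is a hypothetical sketch (`timed_sum` is not part of PyTorch or the notebook) that works on either device:

```python
import time
import torch

def timed_sum(t):
    """Hypothetical helper: return (sum, elapsed seconds) for tensor t."""
    if t.is_cuda:
        torch.cuda.synchronize()   # finish pending GPU work before starting the clock
    start = time.perf_counter()
    total = t.sum()
    if t.is_cuda:
        torch.cuda.synchronize()   # wait for the sum to complete
    elapsed = time.perf_counter() - start
    return total.item(), elapsed

value, seconds = timed_sum(torch.ones(300000))
print(value, seconds)
```

A single measurement of a small tensor is noisy, and the first GPU call typically includes one-time start-up cost, so repeated runs give a fairer comparison.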

#### Working with Images

+ Load an image using PIL library

In [None]:
from PIL import Image

In [None]:
filename = 'images/3063.jpg'
image = Image.open(filename)
print(image) # Loaded into suitable Image object with metadata

+ Converting PIL image to numpy

In [None]:
import numpy as np
image_np = np.array(image, dtype='float32')/255.
print(image_np.shape)

+ Converting NumPy array to PyTorch tensor

In [None]:
image_tensor = torch.tensor(image_np)
print(image_tensor.shape)

+ Displaying image using Matplotlib

In [None]:
import matplotlib.pyplot as plt
plt.figure(figsize=(15,5))
plt.subplot(131)
plt.title('PIL image')
plt.imshow(image)
plt.subplot(132)
plt.title('Numpy array')
plt.imshow(image_np)
plt.subplot(133)
plt.title('Torch tensor')
plt.imshow(image_tensor);

+ Computing mean and variance of red, green and blue channels
  + First, convert `(h x w x 3)` image to `(3 x h x w)`

In [None]:
print('Shape of image_tensor', image_tensor.shape)
x = image_tensor.transpose(0,2).transpose(1,2)
print('Shape of x', x.shape)

In [None]:
npixels = x.size(1) * x.size(2)
print(f'Number of pixels = {npixels}')

In [None]:
sums = torch.sum(x, dim=(1,2))
print('Sum of pixel intensities', sums)
print(f'Shape of sum {sums.shape}')

In [None]:
means = sums.view(3,-1) / npixels
print('Mean of pixel intensities', means)
print(f'Shape of means {means.shape}')

In [None]:
image_tensor.mean(dim=(0,1)) # Same computation using mean method
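
The per-channel variance can be obtained the same way. The sketch below uses a random tensor as a stand-in for the loaded image (an assumption, so that it runs without the image file); `img`, `chw`, `ch_means` and `ch_vars` are illustrative names.

```python
import torch

torch.manual_seed(0)
img = torch.rand(5, 7, 3)       # stand-in for an (h x w x 3) image
chw = img.permute(2, 0, 1)      # reorder to (3 x h x w)

# Mean over the two spatial dimensions, one value per channel
ch_means = chw.mean(dim=(1, 2))

# Population variance: mean squared deviation from the channel mean
ch_vars = ((chw - ch_means[:, None, None]) ** 2).mean(dim=(1, 2))

print(ch_means, ch_vars)
```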

In [None]:
x_centered = (x.view(3,-1) - means).view(x.shape)
print(x_centered.shape)

In [None]:
y = x_centered.transpose(0,2).transpose(0,1)
print(y.shape)

In [None]:
plt.imshow(y)
print(torch.min(y))
print(torch.max(y))

---

## Backpropagation with `autograd`

+ [`autograd`](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html) module supports *automatic differentiation*
+ Remember, gradients needed to train neural network parameters (weights & biases) with (stochastic) gradient descent
+ In PyTorch, automatic differentiation uses tensor `requires_grad` attribute
  + `torch.Tensor.requires_grad` can be set `True` on construction (default `False`)
  + Alternatively, method `torch.Tensor.requires_grad_( ... )` modifies tensor flag in-place (argument defaults to `True`).

+ Once tensors have `requires_grad=True`, operations on them are recorded and additional space allocated for intermediate computations
+ Calling `torch.Tensor.backward()` computes all gradients recursively by backpropagation
+ Computed gradients accessible using `torch.Tensor.grad` attribute (populated by default only for leaf tensors such as inputs and parameters)

#### Backpropagation example

+ Consider simple polynomial function applied to scalar value $x$:

$\begin{aligned} &\mathrm{Function:} & f(x) &= 3x^4 -2x^3 + 4x^2 - x + 5 \\
&\mathrm{Derivative:} & f'(x) &= 12x^3 -6 x^2 + 8x -1\end{aligned}$

1. Construct tensor `x` setting attribute `requires_grad=True` using constructor

In [None]:
x = torch.tensor(2.0, requires_grad=True)

2. Map polynomial function $f$ onto tensor `x` ; bind result to `y`
  + Verify explicitly $f(x)=51$ when $x=2$:
 $$f(2)=3(2)^4 - 2(2)^3 + 4(2)^2 -(2) +5 = 48-16+16-2+5 = 51$$

In [None]:
y = 3*x**4 - 2*x**3 + 4*x**2 - x + 5  # Write out computation of y explicitly.

print(f'Value of y: {y}') # Notice y has a new attribute: grad_fn
print(y.grad_fn)

+ Object `y` has associated gradient function accessible as `y.grad_fn`
+ When `y` computed/stored, various algebraic operations are applied to tensor `x`
+ Derivatives/gradients of those operations known, so `autograd` package computes those derivatives
  + (that's what `AddBackward0` object is)
+ Invoking `y.backward()` computes value of *gradient* of `y` with respect to `x` evaluated at `x==2`:

$$f'(2) = 12(2^3) - 6(2^2) + 8(2) - 1 = 96-24+16-1 = 87. $$

+ Notice computed gradient value stored in `x.grad` attribute of original tensor `x`

In [None]:
y.backward() # Compute derivatives and propagate values back through tensors on which y depends

print(x.grad)  # Expect the value 87 as a singleton tensor

+ Notice invoking `y.backward()` a second time raises exception
  + Intermediate arrays required for backpropagation released/deallocated by default

In [None]:
y.backward() # Raises RuntimeError: intermediate values were freed by the first backward call
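
If a second backward pass is actually needed, passing `retain_graph=True` to the first call keeps the intermediate values. Note that gradients accumulate in `.grad` across calls. A sketch with the same polynomial, using the illustrative names `x0`, `y0`, `g1` and `g2`:

```python
import torch

x0 = torch.tensor(2.0, requires_grad=True)
y0 = 3*x0**4 - 2*x0**3 + 4*x0**2 - x0 + 5

y0.backward(retain_graph=True)   # keep the graph for another pass
g1 = x0.grad.item()
print(g1)                        # f'(2) = 87

y0.backward()                    # second pass now succeeds
g2 = x0.grad.item()
print(g2)                        # gradients accumulate: 87 + 87 = 174

x0.grad.zero_()                  # reset before any further backward calls
```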

#### Another backpropagation example

+ Use $z = \cos(u)$ with $u=x^2$ at $x=\sqrt{\frac{\pi}{3}}$
+ Expect $z=\frac{1}{2}$ when $x=\sqrt{\frac{\pi}{3}}$

In [None]:
x = torch.tensor([np.sqrt(np.pi/3)], requires_grad=True)
u = x ** 2
z = torch.cos(u)
print(f'x: {x}\nu: {u}\nz: {z}')

+ Expect 
  $$\frac{dz}{dx} = \frac{dz}{du} \frac{du}{dx} = (-\sin u) (2 x) = -\sqrt{\pi}$$
  when $x=\sqrt{\frac{\pi}{3}}$

In [None]:
# Now apply backward for backpropagation of derivative values
z.backward()

In [None]:
print(f'x.grad:\t\t\t\t\t\t{x.grad}')
x, u = x.item(), u.item() # extract scalar values
print(f'Computed derivative using analytic formula:\t{-np.sin(u)*2*x}')

+ Notice tensors `x`, `u`, & `z` all *singleton* tensors (scalars)
+ Method `item` extracts scalar entry from singleton tensor
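
By default only leaf tensors keep their gradient after `backward()`. To inspect the intermediate factor $\frac{dz}{du}$ of the chain rule as well, `retain_grad()` can be called on the intermediate tensor before the backward pass. A sketch of the same computation with illustrative names `x1`, `u1` and `z1`:

```python
import math
import torch

x1 = torch.tensor(math.sqrt(math.pi / 3), requires_grad=True)
u1 = x1 ** 2
u1.retain_grad()        # ask autograd to keep the gradient of this non-leaf tensor
z1 = torch.cos(u1)

z1.backward()

print(u1.grad)          # dz/du = -sin(u)
print(x1.grad)          # dz/dx = (-sin u)(2x)
```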

---
## Summary

- [PyTorch](https://pytorch.org): Python-based scientific computing package offering:
  - Tensor support (replacement for NumPy)
  - Fast & flexible platform for deep learning research
+ Supports **GPU computation**
+ Supports **Automatic differentiation**
---
