# PyTorch

You can open this notebook either within a supported container or in Google Colaboratory [here](https://colab.research.google.com/github/slaclab/slacml-school/blob/master/IntroNN/Pytorch-00-Tensors.ipynb).

[PyTorch](https://pytorch.org/) is one of the modern, open-source deep learning libraries out there and the one we will use in this workshop. Other popular libraries include [Tensorflow](https://www.tensorflow.org/), [Keras](https://keras.io), [MXNet](https://mxnet.apache.org), and [Spark ML](https://spark.apache.org/mllib/).

All of those libraries work in a very similar way. If you are new, any of PyTorch/Keras/Tensorflow would probably work well, with lots of guidance, examples, and discussion forums online! Common things you have to learn include:

1. Array data types (_tensor_)
2. Data loading tools (streamline prepping data into appropriate types from input files)
3. Implementation of an ML model

In this notebook, we cover the basics of the first item.

<a href="datatype"></a>
## 1. Tensor data types in PyTorch
In `pytorch`, we use a `torch.Tensor` object to represent a data array. It is a lot like a `numpy` array but not quite the same. `torch` provides APIs to easily convert data between a `numpy` array and a `torch.Tensor`. Let's play a little bit.

In [1]:
from __future__ import print_function
import numpy as np
import torch
SEED=123
np.random.seed(SEED)    # Setting the seed for reproducibility
torch.manual_seed(SEED) # This is how you do for torch!

<torch._C.Generator at 0x7f35c785fa50>

... yep, that's how we set the PyTorch random number seed!
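To see what the seed buys us, here is a minimal sketch (the variable names `first` and `second` are just for illustration): resetting the seed to the same value before each draw should reproduce the same random numbers.

```python
import torch

# Draw three random numbers after seeding ...
torch.manual_seed(123)
first = torch.randn(3)

# ... then reset the seed and draw again
torch.manual_seed(123)
second = torch.randn(3)

# Both draws come from the same generator state, so they match exactly
same = torch.equal(first, second)
print('identical draws:', same)
```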

### Creating a torch.Tensor

PyTorch provides constructors similar to those in numpy (and named the same way where possible, so users do not have to look up function names). Here are some examples.

In [2]:
# Tensor of 0s = numpy.zeros
t=torch.zeros(2,3)
print('torch.zeros:\n',t)

# Tensor of 1s = numpy.ones
t=torch.ones(2,3)
print('\ntorch.ones:\n',t)

# Tensor of sequential integers = numpy.arange
t=torch.arange(0,6,1).reshape(2,3).float()
print('\ntorch.arange:\n',t)

# Normal distribution centered at 0.0 with sigma=1.0 = numpy.random.randn
t=torch.randn(2,3)
print('\ntorch.randn:\n',t)

torch.zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

torch.ones:
 tensor([[1., 1., 1.],
        [1., 1., 1.]])

torch.arange:
 tensor([[0., 1., 2.],
        [3., 4., 5.]])

torch.randn:
 tensor([[-0.1115,  0.1204, -0.3696],
        [-0.2404, -1.1969,  0.2093]])


... or you can create one from a simple list, a tuple, or a numpy array.

In [3]:
# Create numpy array
data_np = np.zeros([10,10],dtype=np.float32)
# Fill something
np.fill_diagonal(data_np,1.)
print('Numpy data\n',data_np)

# Create torch.Tensor
data_torch = torch.Tensor(data_np)
print('\ntorch.Tensor data\n',data_torch)

# One can also make one from a list
data_list = [1,2,3]
data_list_torch = torch.Tensor(data_list)
print('\nPython list :',data_list)
print('torch.Tensor:',data_list_torch)

Numpy data
 [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]

torch.Tensor data
 tensor([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])

Python list : [1, 2, 3]
torch.Tensor: tensor([1., 2., 3.])
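Note that the integer list above came out as floats (`1., 2., 3.`). As a small sketch of why, assuming the default dtype has not been changed: the `torch.Tensor(...)` constructor always produces the default floating-point type, while the `torch.tensor(...)` factory function infers the dtype from the data.

```python
import torch

ints = [1, 2, 3]

# Constructor: always the default float type (float32 unless changed)
as_float = torch.Tensor(ints)

# Factory function: dtype inferred from the data (Python ints -> int64)
as_int = torch.tensor(ints)

# Factory function with an explicit dtype
as_double = torch.tensor(ints, dtype=torch.float64)

print(as_float.dtype, as_int.dtype, as_double.dtype)
```

Which one to use is a matter of intent: if the integer type matters (e.g. for labels or indices), `torch.tensor` is likely the safer choice.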


Converting back from a `torch.Tensor` to a numpy array is just as easy:

In [4]:
# Bringing back into numpy array
data_np = data_torch.numpy()
print('\nNumpy data (converted back from torch.Tensor)\n',data_np)


Numpy data (converted back from torch.Tensor)
 [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]
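One detail worth knowing about these conversions: for a CPU tensor, `.numpy()` and `torch.from_numpy` do not copy the data, so the array and the tensor share the same memory. The sketch below (variable names are illustrative) shows the effect, and how `.clone()` gives an independent copy.

```python
import numpy as np
import torch

t = torch.ones(3)
shared = t.numpy()           # shares memory with t
copied = t.clone().numpy()   # independent copy

t[0] = 7.0                   # in-place change on the tensor
print(shared)                # the shared array sees the change
print(copied)                # the copy does not

# The same holds in the other direction
arr = np.zeros(3, dtype=np.float32)
from_arr = torch.from_numpy(arr)
arr[1] = 5.0                 # in-place change on the array
print(from_arr)
```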


Ordinary array operations also exist, just like in `numpy`.

In [5]:
# mean & std
print('mean',data_torch.mean(),'std',data_torch.std(),'sum',data_torch.sum())

mean tensor(0.1000) std tensor(0.3015) sum tensor(10.)


We see that the return values of those functions (`mean`, `std`, `sum`) are tensor objects. If you would like a single scalar value, you can call the `item` function.

In [6]:
# mean & std
print('mean',data_torch.mean().item(),'std',data_torch.std().item(),'sum',data_torch.sum().item())

mean 0.10000000149011612 std 0.30151134729385376 sum 10.0
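Two things may be useful here, sketched below with a fresh 10x10 identity matrix (same values as `data_torch`). First, the reductions accept a `dim` argument to reduce along one axis only. Second, the `std` value above (0.3015) is not what `numpy` returns by default (0.3): `torch.Tensor.std` divides by N-1 by default, while `numpy` divides by N unless `ddof=1` is given.

```python
import numpy as np
import torch

eye = torch.eye(10)

# Reduce along one dimension instead of over the whole tensor
col_mean = eye.mean(dim=0)   # one mean per column, shape [10]
row_sum = eye.sum(dim=1)     # one sum per row, shape [10]
print(col_mean.shape, row_sum.shape)

# Default std conventions differ between torch and numpy
torch_std = eye.std().item()               # divides by N-1
numpy_std = float(eye.numpy().std())       # divides by N
numpy_std_ddof1 = float(eye.numpy().std(ddof=1))  # divides by N-1
print(torch_std, numpy_std, numpy_std_ddof1)
```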


### Tensor addition and multiplication
Common operations include element-wise multiplication, matrix multiplication, and reshaping. Read the [documentation](https://pytorch.org/docs/stable/tensors.html) to find the right function for what you want to do!

In [7]:
# Two matrices 
data_a = np.zeros([3,3],dtype=np.float32)
data_b = np.zeros([3,3],dtype=np.float32)
np.fill_diagonal(data_a,1.)
data_b[0,:]=1.
# print them
print('Two numpy matrices')
print(data_a)
print(data_b,'\n')

# Make torch.Tensor
torch_a = torch.Tensor(data_a)
torch_b = torch.Tensor(data_b)

print('torch.Tensor element-wise multiplication:')
print(torch_a*torch_b)

print('\ntorch.Tensor matrix multiplication:')
print(torch_a.matmul(torch_b))

print('\ntorch.Tensor matrix subtraction:')
print(torch_a-torch_b)

print('\nadding a scalar 1:')
print(torch_a+1)

Two numpy matrices
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1. 1. 1.]
 [0. 0. 0.]
 [0. 0. 0.]] 

torch.Tensor element-wise multiplication:
tensor([[1., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

torch.Tensor matrix multiplication:
tensor([[1., 1., 1.],
        [0., 0., 0.],
        [0., 0., 0.]])

torch.Tensor matrix subtraction:
tensor([[ 0., -1., -1.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]])

adding a scalar 1:
tensor([[2., 1., 1.],
        [1., 2., 1.],
        [1., 1., 2.]])
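As a short follow-up sketch (rebuilding the same two matrices directly in torch): the `@` operator is a shorthand for `matmul`, and adding a 1D tensor to a matrix broadcasts it across rows, in the same way the scalar was broadcast above.

```python
import torch

a = torch.eye(3)
b = torch.zeros(3, 3)
b[0, :] = 1.

# Three equivalent ways to write a matrix product
prod_method = a.matmul(b)
prod_function = torch.matmul(a, b)
prod_operator = a @ b

# Broadcasting: the 1D tensor is added to every row of a
row = torch.tensor([1., 2., 3.])
broadcast = a + row
print(broadcast)
```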


### Reshaping

You can access the tensor shape via the `.shape` attribute, just like in numpy.

In [8]:
print('torch_a shape:',torch_a.shape)
print('The 0th dimension size:',torch_a.shape[0])

torch_a shape: torch.Size([3, 3])
The 0th dimension size: 3


Similarly, there is a `reshape` function:

In [9]:
torch_a.reshape(1,9).shape

torch.Size([1, 9])

... and you can also use -1 for one dimension, whose size is then inferred from the others, in the same way as in numpy

In [10]:
torch_a.reshape(-1,3).shape

torch.Size([3, 3])
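The example above does not change the shape, so here is a slightly more telling sketch using a 12-element tensor (an illustrative choice): the dimension given as -1 is computed from the total number of elements.

```python
import torch

t = torch.arange(12)

a = t.reshape(3, -1)      # -1 is inferred as 4
b = t.reshape(2, -1, 3)   # -1 is inferred as 2
c = a.reshape(-1)         # flatten back to 1D

print(a.shape, b.shape, c.shape)
```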

### Indexing (Slicing)

We can use the same indexing tricks we tried with a numpy array

In [11]:
torch_a[0,:]

tensor([1., 0., 0.])

or generate a boolean mask

In [12]:
mask = torch_a == 0.
mask

tensor([[False,  True,  True],
        [ True, False,  True],
        [ True,  True, False]])

... and slice with it, either by boolean indexing as below or, equivalently, with the `masked_select` function

In [13]:
torch_a[mask]  # same result as torch_a.masked_select(mask)

tensor([0., 0., 0., 0., 0., 0.])
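A small sketch to make the equivalence concrete (it rebuilds the identity matrix so it runs on its own): boolean indexing and `masked_select` return the same 1D tensor, `~` inverts the mask, and a mask can also be used to assign values in place.

```python
import torch

a = torch.eye(3)
mask = a == 0.

by_index = a[mask]                    # boolean indexing
by_function = a.masked_select(mask)   # same elements, same order

diagonal = a[~mask]                   # inverted mask picks the non-zero entries

# Masks also work for assignment; clone first to keep a untouched
b = a.clone()
b[mask] = -1.
print(b)
```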

### GPU acceleration
Putting a `torch.Tensor` on the GPU is as easy as calling its `.cuda()` function (and if you want to bring it back to the CPU, call `.cpu()` on the CUDA tensor). Let's do a simple speed comparison.

Create two arrays with an identical data type, shape, and values.

In [14]:
# Create 1000x1000 matrix
data_np=np.zeros([1000,1000],dtype=np.float32)
data_cpu = torch.Tensor(data_np).cpu()
data_gpu = torch.Tensor(data_np).cuda()

Time the element-wise fifth power of the matrix (followed by a mean) on the CPU

In [15]:
%%timeit
mean = (data_cpu ** 5).mean().item()

634 µs ± 16.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


... and next on GPU

In [16]:
%%timeit
mean = (data_gpu ** 5).mean().item()

57.6 µs ± 601 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


... which is more than 10 times faster than its CPU counterpart :)
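A note on timing: CUDA operations are launched asynchronously, so a timer can stop before the GPU has finished. The cells above are safe because `.item()` has to wait for the result, but when timing an operation without it, one option is to call `torch.cuda.synchronize()` explicitly. The helper `time_op` below is a hypothetical sketch, not part of PyTorch, and it falls back to the CPU when no GPU is available.

```python
import time
import torch

def time_op(fn, device, repeats=10):
    """Average wall-clock seconds per call of fn (hypothetical helper)."""
    if device.type == 'cuda':
        torch.cuda.synchronize()   # finish any pending GPU work first
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if device.type == 'cuda':
        torch.cuda.synchronize()   # wait until the GPU is actually done
    return (time.perf_counter() - start) / repeats

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
data = torch.zeros(1000, 1000, device=device)

elapsed = time_op(lambda: data ** 5, device)
print('seconds per call on', device.type, ':', elapsed)
```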

But there's a catch you should be aware of! Preparing data on the GPU takes time because the data first has to be sent from the CPU to the GPU. Let's compare the time it takes to create a tensor on the CPU vs. the GPU.

In [17]:
%%timeit
data_np=np.zeros([1000,1000],dtype=np.float32)
data_cpu = torch.Tensor(data_np).cpu()

84.8 µs ± 5.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [18]:
%%timeit
data_np=np.zeros([1000,1000],dtype=np.float32)
data_gpu = torch.Tensor(data_np).cuda()

1.03 ms ± 92.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


As you can see, it takes about 12 times longer to create this particular data tensor on our GPU. This speed depends on many factors, including your hardware configuration (e.g. CPU-GPU communication via PCIe or NVLink). It makes sense to move a computation to the GPU only when it takes longer than this data transfer time.
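In practice, a common pattern is to pick the device once and write the rest of the code against it, so the same notebook runs with or without a GPU. The sketch below assumes nothing beyond `torch` itself. It also shows that a tensor can be created directly on the target device with the `device=` argument, which avoids building it on the CPU and copying it over.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Option 1: create on the CPU, then move (involves a transfer on GPU machines)
moved = torch.zeros(1000, 1000).to(device)

# Option 2: create directly on the target device
direct = torch.zeros(1000, 1000, device=device)

# Computation happens on whichever device holds the tensor
result = (direct ** 5).mean().item()
print(device.type, result)
```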