PyTorch
====


**PyTorch** is an open-source machine learning framework with two main features:
  *  GPU-accelerated tensor computation, which makes it usable as a replacement for NumPy,
  *  deep neural networks built on a tape-based autograd system.


In [60]:
# import the library
import torch
import numpy as np


Tensors
-----


`torch.Tensor` is the central class of the package. Tensors are similar to NumPy's `np.ndarray`s.

###  1. Create tensors


In [21]:
# construct a 5x3 matrix without initializing it (values are whatever was in memory)
x = torch.empty(5, 3)
print(x)


tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])


In [22]:
# construct a randomly initialized 5x3 matrix
x = torch.rand(5, 3)  # uniform
print(x)

x = torch.randn(5, 3)  # normal ~ N(0, 1)
print(x)

tensor([[0.8653, 0.6897, 0.0229],
        [0.9979, 0.6567, 0.9525],
        [0.3046, 0.9542, 0.3341],
        [0.0769, 0.9163, 0.7875],
        [0.1992, 0.2836, 0.3276]])
tensor([[-1.4974,  2.1544,  0.0557],
        [-1.1442,  0.3541,  0.2077],
        [-2.0445,  0.2598,  0.6199],
        [ 1.0577, -1.0154, -0.5955],
        [-0.1789, -2.0080, -0.7005]])


In [23]:
# construct a matrix filled with zeros, specify dtype as `long`
x = torch.zeros(5, 3, dtype=torch.long)
print(x)


tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


In [24]:
# construct a tensor from data
x = torch.tensor([5.5, 3])  # from list
print(x)


tensor([5.5000, 3.0000])


In [25]:
# create tensors based on existing tensors, reusing their parameters
x = torch.ones(8, 4, dtype=torch.long)
print(x)
print('-----')

# new_* methods reuse the source tensor's dtype and device, unless overridden
print(x.new_ones(5, 3))
print(x)  # not modified - new_* methods return a new tensor
print('-----')

# the *_like functions (zeros_like, ones_like, empty_like, full_like, rand_like,
# randint_like, randn_like) create a tensor of the same size, but with different values
print(torch.randn_like(x, dtype=torch.double))
print(x)  # not modified


tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])
-----
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]])
tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])
-----
tensor([[-0.2257,  0.6381,  0.3933, -1.8913],
        [-0.7063,  0.8128,  0.5137, -0.6481],
        [ 0.9507,  0.6035,  0.1347,  0.5544],
        [ 2.3073, -0.7443,  0.1624,  0.2621],
        [ 0.6736,  0.2323, -0.1784,  2.7506],
        [ 0.0745,  0.4225,  1.3295, -0.2610],
        [ 0.0169, -1.0924,  0.9874,  1.5124],
        [ 1.2773, -0.0356, -0.7142, -1.0971]], dtype=torch.float64)
tensor([[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]])

###  2. Tensor attributes and operations


**Tensor types**

The available data types are listed in the [torch.Tensor](https://pytorch.org/docs/stable/tensors.html#torch-tensor) documentation.
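
As a minimal sketch of how data types behave in practice (assuming PyTorch's default dtype has not been changed with `torch.set_default_dtype`), the following shows dtype inference, explicit conversion, and type promotion. The variable names are illustrative only.

```python
import torch

# dtype is inferred from the data unless given explicitly
ints = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64 (alias torch.long)
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> torch.float32 by default
print(ints.dtype, floats.dtype)

# convert with .to(dtype); shorthand methods such as .float() and .long() also exist
as_double = ints.to(torch.float64)
print(as_double.dtype)

# mixing an integer tensor with a floating-point tensor promotes the result to floating point
mixed = ints + floats
print(mixed.dtype)
```

Conversion returns a new tensor; `ints` itself keeps its original dtype.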


In [26]:
# indexing and slicing
x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(x)
print('-----')

print(x[0, 3])
print(x[1][2])
print('-----')

print(x[:, :1])
print(x[::2])
print(x[::2][0])
print(x[::2][0][1]) # multiple indexing is applied to the returned tensor


tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
-----
tensor(4)
tensor(7)
-----
tensor([[1],
        [5],
        [9]])
tensor([[ 1,  2,  3,  4],
        [ 9, 10, 11, 12]])
tensor([1, 2, 3, 4])
tensor(2)


In [27]:
# get value from a single-value tensor
x = torch.tensor([[[1]]], dtype=torch.uint8)
print(x)
print(x.item())
print('-----')

x = torch.tensor(2.5, dtype=torch.float64)
print(x)
print(x.item())


tensor([[[1]]], dtype=torch.uint8)
1
-----
tensor(2.5000, dtype=torch.float64)
2.5


In [50]:
# get tensor size
x = torch.rand(5, 3)
print(x.size())  # torch.Size is a subclass of tuple
size_tuple = x.size()


torch.Size([5, 3])
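
Because `torch.Size` behaves like a tuple, it can be unpacked and indexed. The sketch below collects a few related ways of querying a tensor's shape; the names `rows` and `cols` are illustrative.

```python
import torch

x = torch.rand(5, 3)

rows, cols = x.size()       # torch.Size unpacks like a tuple
print(rows, cols)
print(x.shape)              # `shape` is an attribute equivalent to size()
print(x.size(0))            # size along a single dimension
print(x.dim(), x.numel())   # number of dimensions, total number of elements
print(isinstance(x.size(), tuple))
```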


In [29]:
# multiple syntaxes for adding
x = torch.ones(2, 3, dtype=torch.long)
y = torch.randint(high=3, size=(2, 3))
print(x + y)  # use `+` operator

print(torch.add(x, y))  # use torch function

result = torch.empty(2, 3)
torch.add(x, y, out=result)  # write the sum into `result`; the call also returns it
print(result)

y.add_(x)  # in-place addition
print(y)

# any method whose name ends with `_` modifies the tensor in-place
# e.g. `copy_`, `t_`


tensor([[2, 2, 2],
        [2, 2, 3]])
tensor([[2, 2, 2],
        [2, 2, 3]])
tensor([[2., 2., 2.],
        [2., 2., 3.]])
tensor([[2, 2, 2],
        [2, 2, 3]])


In [30]:
# reshaping tensors with `view` (the number of elements must stay the same)
x = torch.rand(4, 4)
print(x.view(16))
print(x.view(1, 8, 2))
print('-----')

reshaped = x.view(2, -1, 4)  # `-1` indicates size inferred from other dimensions
print(reshaped.size())
print(reshaped)


tensor([0.3385, 0.5113, 0.0868, 0.7628, 0.7793, 0.4275, 0.6207, 0.4226, 0.1773,
        0.1693, 0.7534, 0.4059, 0.2917, 0.7941, 0.6732, 0.9274])
tensor([[[0.3385, 0.5113],
         [0.0868, 0.7628],
         [0.7793, 0.4275],
         [0.6207, 0.4226],
         [0.1773, 0.1693],
         [0.7534, 0.4059],
         [0.2917, 0.7941],
         [0.6732, 0.9274]]])
-----
torch.Size([2, 2, 4])
tensor([[[0.3385, 0.5113, 0.0868, 0.7628],
         [0.7793, 0.4275, 0.6207, 0.4226]],

        [[0.1773, 0.1693, 0.7534, 0.4059],
         [0.2917, 0.7941, 0.6732, 0.9274]]])
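
One detail worth illustrating is that `view` does not copy: the result shares storage with the original tensor, which also means it only works when the memory layout allows it. A small sketch, using a deterministic tensor instead of random values so the effect is easy to follow:

```python
import torch

x = torch.arange(6)        # tensor([0, 1, 2, 3, 4, 5])
v = x.view(2, 3)           # same storage, different shape

v[0, 0] = 100              # writing through the view changes x as well
print(x[0].item())

t = v.t()                  # transpose: shape (3, 2), no longer contiguous in memory
print(t.is_contiguous())

# t.view(6) would raise a RuntimeError here, because the layout is not compatible;
# reshape falls back to copying the data when a view is not possible
flat = t.reshape(6)
print(flat)
```

When in doubt, `reshape` is the more forgiving choice, while `view` makes the no-copy requirement explicit.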


All tensor operations are described in the [docs](https://pytorch.org/docs/torch).


###  3. PyTorch & NumPy


When you create a NumPy array from a torch tensor, the two share the same memory (provided the tensor is on the CPU), so a change to one is visible in the other.


In [31]:
a = torch.ones(5)
print(a)

b = a.numpy() #  convert tensor to `np.ndarray`
print(b)

a.add_(1)
print(a)
print(b)


tensor([1., 1., 1., 1., 1.])
[1. 1. 1. 1. 1.]
tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


The same happens when converting a NumPy array to a torch tensor with `torch.from_numpy`.


In [32]:
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)


[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)


The exception is `torch.tensor`, which always copies the data.


In [33]:
a = np.ones(5)
b = torch.tensor(a)
np.add(a, 1, out=a)
print(a)
print(b)


[2. 2. 2. 2. 2.]
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
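
The sharing behaviour can also be checked directly rather than inferred from printed values. The sketch below uses `np.shares_memory` for that, and shows `clone()` as one way to get an independent NumPy array from a tensor; the variable names are illustrative.

```python
import numpy as np
import torch

a = np.ones(5)
shared = torch.from_numpy(a)   # wraps the same buffer as `a`
copied = torch.tensor(a)       # allocates new memory

print(np.shares_memory(a, shared.numpy()))
print(np.shares_memory(a, copied.numpy()))

# going the other way: clone the tensor first to get an independent array
t = torch.ones(5)
independent = t.clone().numpy()
t.add_(1)
print(independent)             # unaffected by the in-place addition
```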


###  4. CUDA


Tensors can be moved between devices with the `.to` method. This part works only on a machine with a CUDA-capable GPU.


In [35]:
if torch.cuda.is_available():
    gpu = torch.device("cuda")
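    x = torch.ones(5, 5, device=gpu)  # created directly on the GPU
    y = torch.randint(3, (5, 5))      # created on the CPU
    x = x.to(gpu)                     # no-op: x is already on the GPU
    y = y.to(gpu)                     # copies y to the GPU
    result = x + y
    print(result)
    print(result.to("cpu", dtype=torch.double))  # `.to` can change device and dtype at once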
else:
    print('No CUDA device available.')


tensor([[1., 1., 1., 1., 2.],
        [2., 1., 3., 1., 1.],
        [3., 1., 2., 1., 2.],
        [2., 2., 3., 2., 1.],
        [3., 2., 2., 2., 3.]], device='cuda:0')
tensor([[1., 1., 1., 1., 2.],
        [2., 1., 3., 1., 1.],
        [3., 1., 2., 1., 2.],
        [2., 2., 3., 2., 1.],
        [3., 2., 2., 2., 3.]], dtype=torch.float64)
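
A common way to write code that runs with or without a GPU is to pick the device once and pass it around. This is a sketch of that pattern, not the only way to do it; the `device` variable name is an assumption for illustration.

```python
import torch

# fall back to the CPU when no CUDA device is present
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(5, 5, device=device)
y = torch.randint(3, (5, 5), device=device)
result = x + y
print(result.device)

# NumPy conversion requires a CPU tensor; .cpu() is a no-op if it already is one
as_numpy = result.cpu().numpy()
print(as_numpy.shape)
```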


Exercises
-----


1. Create two tensors of shape $\left(27, 19, 31\right)$ and $\left(31, 111\right)$. Use any of the random tensor creation methods. Make sure their dtype is floating-point.


In [45]:
tensorX = torch.randn(27, 19, 31)  # randn returns torch.float32 by default
tensorY = torch.randn(31, 111)
print(tensorX)
print(tensorY)
print(tensorX.dtype)  # check the dtype, not just the Python type
print(tensorY.dtype)

tensor([[[-0.5211,  0.5016,  1.4059,  ..., -1.4688, -0.4714,  0.6462],
         [ 0.5891,  0.3432, -1.2955,  ...,  0.8807, -0.5762,  0.3791],
         [-0.0474, -0.6959, -0.6475,  ..., -1.2809, -0.9280,  1.2150],
         ...,
         [ 0.9075,  0.2371,  0.5077,  ...,  1.4299,  0.8652, -1.2710],
         [-0.4003,  0.5827, -2.0458,  ...,  0.1652, -1.0804, -1.9235],
         [ 0.2413,  0.1743, -0.4491,  ...,  1.0869,  1.8087, -0.8467]],

        [[ 1.6383, -1.2787, -0.8466,  ..., -1.2047, -1.8936, -1.0016],
         [-0.9249,  0.1175,  0.6370,  ...,  1.6973, -1.5488, -1.1826],
         [ 1.0062, -0.3697,  0.1979,  ...,  1.6525,  0.7016,  0.4198],
         ...,
         [-0.5146, -2.1110,  0.6454,  ...,  1.1740, -0.0562,  0.3085],
         [ 0.2872,  2.0993, -0.2583,  ..., -0.6642, -0.8256, -0.6790],
         [ 0.4240,  1.1966,  0.0477,  ...,  0.4980,  1.0071, -0.6121]],

        [[ 0.4824, -0.5257, -0.8523,  ...,  0.7688,  0.2490, -0.0434],
         [-0.7563, -0.9057, -1.0934,  ...,  0

2. Perform matrix multiplication of the tensors (`@`, `torch.matmul` or `Tensor.matmul`). What is the size of the new tensor?

In [52]:
tensorMultResult = torch.matmul(tensorX, tensorY)
print(tensorMultResult)
print(type(tensorMultResult))
print(tensorX.size())
print(tensorY.size())
print(tensorMultResult.size())

tensor([[[  7.8226,  -7.0971,   2.1471,  ...,   5.0785,  -2.2803,  -6.8552],
         [ -1.9034,  -4.6123,  -0.2734,  ...,  -0.6800,   2.2557,  -5.3794],
         [  1.2714,   0.9417,   7.3842,  ...,  11.3143,  -6.6351,  -7.9711],
         ...,
         [ -5.7391,  -2.3119,   5.3632,  ...,  -0.0421,  -3.4115,   5.2718],
         [ -5.3429,   8.0538,   5.0371,  ...,   5.2902,  -6.9418,   3.1659],
         [  2.6246,   6.6340,  -1.7441,  ...,   1.6550,   1.5632,   8.4305]],

        [[ -1.1502,  -3.4884,  -6.8190,  ..., -12.2715,   6.5180,   1.2694],
         [ -0.2346,  -3.1180,  -0.9065,  ...,  -8.3755,   0.6934,  11.9936],
         [ -2.5514,   8.3006,   0.6703,  ...,   4.0431,  -8.7031,  -1.0655],
         ...,
         [ -0.1263,   2.3281,  -1.0237,  ...,   1.0072,  -0.2368,   7.7174],
         [ -8.4567,  -7.0858,   5.4032,  ...,   4.5992,  -0.7764,   5.8018],
         [ -7.4101,   2.3124,   5.4602,  ...,  -0.6294,   4.1564,   4.6614]],

        [[ -2.9305,  -9.5212,   5.5438,  ...
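
To see how the result size comes about, here is a sketch with smaller, hypothetical shapes: a 3-D tensor is treated as a batch of matrices, and the 2-D tensor is broadcast across that batch.

```python
import torch

a = torch.randn(4, 2, 3)   # a batch of four 2x3 matrices
b = torch.randn(3, 5)      # a single 3x5 matrix

c = a @ b                  # b is broadcast across the batch dimension
print(c.size())            # batch dimension kept, inner dimension (3) contracted

# each slice of the result is an ordinary matrix product
print(torch.allclose(c[0], a[0] @ b, atol=1e-6))
```

The same rule applied to shapes $\left(27, 19, 31\right)$ and $\left(31, 111\right)$ gives $\left(27, 19, 111\right)$.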

3. Sum across the last dimension (pass the optional argument `dim=-1` to `torch.sum`). What is the size of the new tensor?

In [54]:
sumResult = torch.sum(tensorMultResult, dim=-1)
print(sumResult)
print(sumResult.size())

tensor([[  -8.3852,  162.5751,   30.5216,   -8.1050,   20.6285,  -25.2803,
           14.8773,  -65.8042,   10.3653,   47.5125,   37.4197,   51.6901,
           15.7422,   18.7500,  -18.5757,  -35.0929,  -25.0597,  -11.4783,
           33.2534],
        [  74.5325,   38.0117,   35.9144, -114.7429,  -25.7250,   25.5807,
          -23.3000,  -15.1230,   72.7348,   16.7536,   38.6069,   51.8358,
          -30.7232,   72.4842,  -17.1810,  -73.2527,   40.9812,  -42.5609,
           25.5210],
        [  42.4340,   51.6075,    3.3328,    2.5444,   -8.1346,   -5.9341,
           37.8788,   -7.0323,   38.6901,   71.4333,  -41.2967,   21.0989,
           32.5095,   13.9905,  -14.0897,  -69.3006,    4.7837,    8.1869,
           12.7165],
        [ -22.7002,    9.8645,  -94.2144,    1.8394,  -79.0693,  -48.5473,
           21.5035,  -52.3408,   -2.3429,   63.5343,  -73.0852,  -62.6064,
           59.8210,   41.5489,    3.3236,   63.1898,  -12.9795,   30.7664,
           56.4094],
        [   1.83

4. Use `torch.mean` to calculate the average across the first dimension of the `summed` tensor.

In [59]:
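# NOTE: without `dim`, torch.mean averages over all elements and returns a scalar;
# pass dim=0 to average across the first dimension only
meanResultOfSum = torch.mean(sumResult)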
print(meanResultOfSum)
print(type(meanResultOfSum))

tensor(1.2504)
<class 'torch.Tensor'>
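
The difference between averaging everything and averaging along one dimension is easiest to see on a small tensor. A sketch with hand-picked values (the tensor `t` is illustrative, not the exercise data):

```python
import torch

t = torch.tensor([[1.0, 2.0, 3.0],
                  [5.0, 6.0, 7.0]])   # shape (2, 3)

print(torch.mean(t))                  # no dim: one scalar over all six elements
print(torch.mean(t, dim=0))           # reduce the first dimension -> shape (3,)
print(torch.mean(t, dim=0, keepdim=True).size())  # keep the reduced dim with size 1
```

Applied to a tensor of shape $\left(27, 19\right)$, `dim=0` would therefore produce a result of shape $\left(19\right)$.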
