# Fall 2020: DS-GA 1011 NLP with Representation Learning
## Lab 1: 04-Sep-2020, Friday
## [PyTorch](https://pytorch.org)

Deep learning is built on computations over tensors, which
generalize matrices to arrays that can be indexed along more than 2
dimensions.

PyTorch is a Python-based scientific computing package (a tensor library) that can run computations on GPUs and is designed for flexibility and speed.

In [1]:
# Before importing, install the package in your virtual environment
# using the command from the PyTorch website for your OS and package manager.

# !conda install pytorch torchvision -c pytorch
import torch

In [2]:
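# Fix the random seed for reproducibility.
# The snippet below covers every source of randomness; for now only the torch seed is needed,
# so the lines for libraries we have not imported yet (and for CUDA) stay commented out.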
seed = 1011
# random.seed(seed)
# np.random.seed(seed)
torch.manual_seed(seed)
# if torch.cuda.is_available(): # We will see how this works in the last section
#     torch.cuda.manual_seed(seed)
#     torch.cuda.manual_seed_all(seed)
#     torch.backends.cudnn.benchmark = False
#     torch.backends.cudnn.deterministic = True

<torch._C.Generator at 0x7f7a76970ab0>
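To see what fixing the seed buys you, here is a minimal sketch (the variable names `first` and `second` are illustrative, not part of the lab): re-seeding the generator with the same value should replay the same random numbers on the CPU.

```python
import torch

# Seed, draw, then re-seed with the same value and draw again.
torch.manual_seed(1011)
first = torch.randn(3)

torch.manual_seed(1011)
second = torch.randn(3)

# Both draws come from the same generator state, so they match exactly.
same_after_reseed = torch.equal(first, second)
print(same_after_reseed)
```

Without the second call to `torch.manual_seed`, the two draws would almost surely differ, because the generator state advances with every random call.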

---
### Creating Tensors

Tensors can be created from Python lists with the torch.tensor()
function.

In [3]:
# torch.tensor(data) creates a torch.Tensor object with the given data.
V_data = [1., 2., 3.]
V = torch.tensor(V_data)
print(V)

# Create a matrix (2D tensor) of size 1x3.
M_data = [[1., 2., 3.]]
M = torch.tensor(M_data)
print(M)

# Create a 3D tensor of size 2x2x2.
T_data = [[[1., 2.], [3., 4.]],
          [[5., 6.], [7., 8.]]]
T = torch.tensor(T_data)
print(T)

tensor([1., 2., 3.])
tensor([[1., 2., 3.]])
tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]]])


In [4]:
print(V.shape)
print(M.shape)
print(T.shape)

torch.Size([3])
torch.Size([1, 3])
torch.Size([2, 2, 2])


Indexing a tensor returns another tensor. Use **.item()** to get a Python number from a tensor that holds a single element.

In [5]:
type(V)

torch.Tensor

In [6]:
V.dtype

torch.float32

In [7]:
print(T[0, 1, 1])
print(T[0, 1, 1].item())

tensor(4.)
4.0
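As a small sketch of where `.item()` applies, the example below rebuilds the same 2x2x2 tensor and shows that `.item()` works on a single element, while `.tolist()` is the usual choice for tensors with several elements. The exception type raised for multi-element tensors has varied across PyTorch versions, so both candidates are caught here as an assumption.

```python
import torch

T = torch.tensor([[[1., 2.], [3., 4.]],
                  [[5., 6.], [7., 8.]]])

scalar = T[0, 1, 1].item()   # single element -> Python float
row = T[0, 1].tolist()       # several elements -> Python list

try:
    T[0, 1].item()           # two elements: .item() is not defined here
    item_failed = False
except (ValueError, RuntimeError):
    item_failed = True

print(scalar, row, item_failed)
```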


You can create a tensor of a given shape, filled with random values drawn
from a standard normal distribution, with **torch.randn()**




In [8]:
x = torch.randn((3, 4, 5))
print(x)

tensor([[[ 1.2052,  0.1094, -0.8832,  0.3788, -0.7768],
         [-0.6804,  1.2327,  1.0014, -1.5491,  0.1858],
         [ 1.4104,  1.7762,  0.6130, -0.0582,  1.0088],
         [-0.4520,  0.3206, -0.2076,  1.2377, -1.1629]],

        [[-1.0091,  0.0051, -0.2265,  0.4963, -0.0476],
         [ 1.5747, -0.1061, -0.5696,  0.1116,  0.7162],
         [-0.6617,  1.7941,  0.7287, -0.8194, -0.9019],
         [ 1.1155,  0.9972, -0.9204, -0.2496, -0.3251]],

        [[ 2.1975,  2.9558,  1.6646, -0.5872,  0.5809],
         [-0.5396, -1.2176, -0.3379, -0.4550, -0.5796],
         [-0.9330,  0.1720,  1.0668, -0.5753,  0.2944],
         [ 0.2626,  0.5561, -0.0462,  0.6692,  2.3364]]])
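For comparison, a brief sketch of a few related factory functions that take a shape in the same way. The choice of `torch.rand` and `torch.zeros` here is illustrative and not part of the original lab.

```python
import torch

torch.manual_seed(0)

normal = torch.randn(3, 4, 5)    # samples from a standard normal distribution
uniform = torch.rand(3, 4, 5)    # samples from a uniform distribution on [0, 1)
zeros = torch.zeros(3, 4, 5)     # all zeros

print(normal.shape, uniform.shape, zeros.shape)

# Every uniform sample should fall inside [0, 1).
in_unit_interval = bool(((uniform >= 0) & (uniform < 1)).all())
print(in_unit_interval)
```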


---
### Operations with Tensors

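As in NumPy, PyTorch tensors support broadcasting: when two tensors have different but compatible shapes, dimensions of size 1 (or missing leading dimensions) are expanded so the operation can be applied elementwise.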



In [9]:
x = torch.tensor([1., 2., 3.])
y = torch.tensor([4.])
z = x * y
print(z)

tensor([ 4.,  8., 12.])


In [10]:
x = torch.tensor([3., 5.])
y = torch.tensor([[1., 2.],[3., 4.]]) 
z = x * y
print(z)

tensor([[ 3., 10.],
        [ 9., 20.]])


In [11]:
import numpy as np
x = torch.tensor([[0.],[1.],[2.],[3.]])
y = torch.tensor(np.ones(5))

print(x)
print(y)
print(x + y)

tensor([[0.],
        [1.],
        [2.],
        [3.]])
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
tensor([[1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2.],
        [3., 3., 3., 3., 3.],
        [4., 4., 4., 4., 4.]], dtype=torch.float64)
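The `dtype=torch.float64` in the output above comes from NumPy, whose arrays default to 64-bit floats. The sketch below (with illustrative names `y64` and `y32`) shows the resulting shape and dtype, and what happens when shapes cannot be broadcast.

```python
import numpy as np
import torch

x = torch.tensor([[0.], [1.], [2.], [3.]])  # shape (4, 1), float32
y64 = torch.tensor(np.ones(5))               # shape (5,), float64 inherited from NumPy
y32 = torch.ones(5)                          # shape (5,), float32

# (4, 1) and (5,) broadcast to (4, 5); mixing float32 and float64 promotes to float64.
out64 = x + y64
out32 = x + y32
print(out64.shape, out64.dtype)
print(out32.shape, out32.dtype)

# Shapes (4, 3) and (5,) do not line up: the trailing dimensions are 3 and 5.
try:
    torch.ones(4, 3) + torch.ones(5)
    broadcast_failed = False
except RuntimeError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```

If float32 results are wanted, one option is to build the tensor with `torch.ones(5)` or convert with `.float()` before the operation.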


**torch.cat()**

In [12]:
# By default, dim=0: tensors are concatenated along the first axis (rows are stacked)
x_1 = torch.randn(2, 5)
y_1 = torch.randn(3, 5)
z_1 = torch.cat([x_1, y_1])
print(x_1)
print(y_1)
print(z_1)

print(z_1.shape)

tensor([[-0.8491, -0.8160,  0.3474,  1.4565,  0.7865],
        [ 1.4771,  0.0311,  1.1400, -0.3954,  0.6512]])
tensor([[-1.6450e-01,  7.3995e-01, -1.0988e+00, -2.9694e-02, -1.7262e+00],
        [ 1.9209e+00,  9.5517e-02,  6.6707e-01, -1.1831e+00, -2.1504e+00],
        [-1.8599e-01,  3.0147e-01, -1.4919e+00, -1.3604e-03,  1.2583e+00]])
tensor([[-8.4905e-01, -8.1601e-01,  3.4742e-01,  1.4565e+00,  7.8647e-01],
        [ 1.4771e+00,  3.1138e-02,  1.1400e+00, -3.9542e-01,  6.5121e-01],
        [-1.6450e-01,  7.3995e-01, -1.0988e+00, -2.9694e-02, -1.7262e+00],
        [ 1.9209e+00,  9.5517e-02,  6.6707e-01, -1.1831e+00, -2.1504e+00],
        [-1.8599e-01,  3.0147e-01, -1.4919e+00, -1.3604e-03,  1.2583e+00]])
torch.Size([5, 5])


In [13]:
# Concatenate column-wise:
x_2 = torch.randn(2, 3)
y_2 = torch.randn(2, 5)

# dim specifies the axis to concatenate along
z_2 = torch.cat([x_2, y_2], dim=1)

print(x_2)
print(y_2)
print(z_2)

print(z_2.shape)

# If the tensors' shapes are not compatible, you will get an error. Uncomment to see it.
# torch.cat([x_1, x_2])

tensor([[ 0.6190, -0.0619, -0.1383],
        [ 1.2936, -0.9997, -1.5579]])
tensor([[-0.1909,  1.2352, -1.4324,  2.0024, -0.9337],
        [-1.9675,  0.2076, -0.7082,  0.0527, -1.8856]])
tensor([[ 0.6190, -0.0619, -0.1383, -0.1909,  1.2352, -1.4324,  2.0024, -0.9337],
        [ 1.2936, -0.9997, -1.5579, -1.9675,  0.2076, -0.7082,  0.0527, -1.8856]])
torch.Size([2, 8])
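A minimal sketch of the compatibility rule for `torch.cat`: all dimensions except the one being concatenated must match. The tensors below mirror the shapes of `x_1` and `x_2` from the cells above.

```python
import torch

x_1 = torch.randn(2, 5)
x_2 = torch.randn(2, 3)

# Along dim=0 the column counts must match (5 vs 3), so this fails.
try:
    torch.cat([x_1, x_2])
    cat_failed = False
except RuntimeError as err:
    cat_failed = True
    print("cat failed:", err)

# Along dim=1 only the row counts must match (2 vs 2), so this works.
z = torch.cat([x_1, x_2], dim=1)
print(z.shape)
```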


---
### Reshaping Tensors

Many neural network components expect their inputs to have
a certain shape. Often you will need to reshape before passing your data
to the component.




**.view()**

In [14]:
x = torch.randn(2, 3, 4)
x
#print(x.shape)

tensor([[[-2.2171,  2.2268, -0.0554,  0.8204],
         [ 2.2787,  0.5273,  0.4185, -0.1374],
         [ 0.4606, -1.0243, -1.4957, -1.9919]],

        [[-0.6965,  0.6811,  1.5641,  0.5920],
         [ 0.0257,  0.5130,  0.0565,  0.1281],
         [-0.3367, -0.4313,  1.5731, -0.1342]]])

In [15]:
# Reshape to 2 rows, 12 columns
print(x.view(2, 12))
print(x.view(2, 12).shape)

tensor([[-2.2171,  2.2268, -0.0554,  0.8204,  2.2787,  0.5273,  0.4185, -0.1374,
          0.4606, -1.0243, -1.4957, -1.9919],
        [-0.6965,  0.6811,  1.5641,  0.5920,  0.0257,  0.5130,  0.0565,  0.1281,
         -0.3367, -0.4313,  1.5731, -0.1342]])
torch.Size([2, 12])


In [16]:
# If one of the dimensions is -1, its size is inferred from the remaining dimensions
print(x.view(3, -1))
print(x.view(3, -1).shape)

tensor([[-2.2171,  2.2268, -0.0554,  0.8204,  2.2787,  0.5273,  0.4185, -0.1374],
        [ 0.4606, -1.0243, -1.4957, -1.9919, -0.6965,  0.6811,  1.5641,  0.5920],
        [ 0.0257,  0.5130,  0.0565,  0.1281, -0.3367, -0.4313,  1.5731, -0.1342]])
torch.Size([3, 8])
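Two properties of `.view()` are worth keeping in mind, sketched below with a small illustrative tensor: a view shares memory with the original tensor, and it requires a compatible memory layout. For a non-contiguous tensor such as a transpose, `.reshape()` is a common alternative that copies when it has to.

```python
import torch

x = torch.arange(6.).view(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# A view shares storage: writing through the view changes x as well.
v = x.view(3, 2)
v[0, 0] = 100.
shares_memory = x[0, 0].item() == 100.0

# A transpose is non-contiguous, so flattening it with .view() fails.
xt = x.t()
try:
    xt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# .reshape() handles this case by copying the data.
flat = xt.reshape(6)
print(shares_memory, view_failed, flat)
```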


**.squeeze()** and **.unsqueeze()**

In [17]:
a = torch.randn(24)
print(a.shape)

torch.Size([24])


In [18]:
a

tensor([ 0.0352, -1.0040,  0.1654,  0.6201, -0.5623,  1.3023, -0.8539,  0.0973,
         0.9731, -0.6629, -0.9247, -0.6265, -0.4206, -0.2975, -0.0177, -1.1858,
         0.6329, -1.2217,  0.5775, -1.3303, -0.1493, -1.0621, -0.4970, -1.2579])

In [19]:
# .unsqueeze() inserts a new dimension of size 1 at the specified position
b = a.unsqueeze(dim=0).unsqueeze(dim=0)
print(b)
print(b.shape)

tensor([[[ 0.0352, -1.0040,  0.1654,  0.6201, -0.5623,  1.3023, -0.8539,
           0.0973,  0.9731, -0.6629, -0.9247, -0.6265, -0.4206, -0.2975,
          -0.0177, -1.1858,  0.6329, -1.2217,  0.5775, -1.3303, -0.1493,
          -1.0621, -0.4970, -1.2579]]])
torch.Size([1, 1, 24])


In [20]:
# .squeeze() removes all dimensions of size 1 from the tensor
c = b.squeeze()
print(c)
print(c.shape)

tensor([ 0.0352, -1.0040,  0.1654,  0.6201, -0.5623,  1.3023, -0.8539,  0.0973,
         0.9731, -0.6629, -0.9247, -0.6265, -0.4206, -0.2975, -0.0177, -1.1858,
         0.6329, -1.2217,  0.5775, -1.3303, -0.1493, -1.0621, -0.4970, -1.2579])
torch.Size([24])
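Both methods also accept a specific dimension. The sketch below uses a zero-filled tensor with the same shape as `b` to show how the `dim` argument changes the result.

```python
import torch

b = torch.zeros(1, 1, 24)

only_first = b.squeeze(0)             # remove just the first size-1 dimension
untouched = b.squeeze(2)              # dim 2 has size 24, so nothing is removed
column = b.squeeze().unsqueeze(1)     # shape (24,) -> (24, 1)
last = b.squeeze().unsqueeze(-1)      # negative dims count from the end

print(only_first.shape, untouched.shape, column.shape, last.shape)
```

Passing `dim` to `.squeeze()` is often safer than squeezing everything, for example when a batch dimension happens to have size 1 and should be kept.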


---
### Using GPU(s) and CUDA
- A **GPU** (Graphics Processing Unit) provides the parallel computational power needed to run machine learning or deep learning algorithms on large datasets in reasonable time. GPUs were originally designed to rapidly manipulate and alter memory, especially for image processing.
- [**CUDA** (Compute Unified Device Architecture)](https://developer.nvidia.com/cuda-zone) is a parallel computing platform and application programming interface (API) model created by Nvidia for general-purpose computing on GPUs (GPGPU).

In [21]:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [22]:
a = torch.tensor([1., 2.], device=device)
b = torch.tensor([3., 4.]).to(device)
# c = a.cuda()
# a = c.cpu()

In [23]:
c = a + b
print(c)

tensor([4., 6.], device='cuda:0')


In [24]:
print(c.device)

cuda:0


In [25]:
d = torch.tensor([4., 5.])
print(d.device)

cpu


In [26]:
# When c is on the GPU and d is on the CPU, adding them raises a RuntimeError
c + d

RuntimeError: ignored
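One way to avoid this error, sketched below, is to move both operands to the same device before combining them. The tensors reuse the values of `c` and `d` from the cells above; the code falls back to the CPU when no GPU is available.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

c = torch.tensor([4., 6.], device=device)
d = torch.tensor([4., 5.])            # created on the CPU by default

# Move d to wherever c lives, then add.
result = c + d.to(c.device)
print(result)

# Bring the result back to the CPU before converting to Python objects.
print(result.cpu().tolist())
```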

---
## References
PyTorch Tutorial: Robert Guthrie