## Natural Language Processing with PyTorch

### Chapter 1 - PyTorch Basics

In [1]:
# import libraries
import torch
import numpy as np

In [4]:
# verify the GPU
def verify_gpu():
    print(f"cuda.is_available: {torch.cuda.is_available()}")
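    # device_count is a function and must be called; formatting the bare
    # attribute prints a functools wrapper object instead of the GPU count
    print(f"cuda.device_count: {torch.cuda.device_count()}")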
    print(f"cuda.current_device: {torch.cuda.current_device()}")
    print(f"cuda.device: {torch.cuda.device(0)}")
    print(f"cuda.get_device_name: {torch.cuda.get_device_name()}")

In [5]:
verify_gpu()

cuda.is_available: True
cuda.device_count: <functools._lru_cache_wrapper object at 0x00000123CE3131C0>
cuda.current_device: 0
cuda.device: <torch.cuda.device object at 0x00000123D790F250>
cuda.get_device_name: NVIDIA GeForce RTX 3090
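
The `cuda.device_count` entry in the output above is a function object rather than a number, which is what happens when `torch.cuda.device_count` is formatted without being called. A minimal sketch of a more defensive check follows; the helper name `gpu_summary` is hypothetical (not part of the original notebook), and it skips the device-specific queries on a CPU-only machine:

```python
import torch

def gpu_summary():
    """Collect basic CUDA facts in a dict; safe to call without a GPU."""
    info = {
        "available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),  # note the call parentheses
    }
    if info["available"]:
        # these queries are only attempted when a CUDA device is present
        idx = torch.cuda.current_device()
        info["current_device"] = idx
        info["name"] = torch.cuda.get_device_name(idx)
    return info

summary = gpu_summary()
print(summary)
```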


### Example 1-3 - Creating a tensor in PyTorch with torch.Tensor

In [36]:
def describe(x):
    """Print the type, shape, and values of tensor x."""
    print(f"Type: {x.type()}")
    print(f"Shape/size: {x.shape}")
    print(f"values: \n{x}")

In [7]:
describe(torch.Tensor(2, 3))

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[8.3960e-07, 8.3654e+20, 4.4153e-05],
        [8.4641e-07, 8.4708e-07, 5.2648e+22]])
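
`torch.Tensor(2, 3)` allocates memory without initializing it, which is why the values above look arbitrary. As a small sketch, `torch.empty` states that intent explicitly, while `torch.zeros` is the safer choice when defined values are needed (the variable names here are only illustrative, and the dtype assumes the default of float32):

```python
import torch

# torch.empty allocates a 2x3 tensor without initializing its contents,
# so the values are arbitrary and should be overwritten before being read
uninit = torch.empty(2, 3)
print(uninit.shape, uninit.dtype)

# torch.zeros gives the same shape and dtype with well-defined contents
zeros = torch.zeros(2, 3)
print(zeros)
```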



### Example 1-4 - Creating a randomly initialized tensor

In [8]:
# uniform distribution between [0, 1)
describe(torch.rand(2, 3))
# standard normal distribution
describe(torch.randn(2, 3))


Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0.8267, 0.0974, 0.1520],
        [0.1496, 0.9291, 0.0998]])

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[-2.2190,  0.7079, -0.1769],
        [ 0.4223,  0.6083, -0.3414]])
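
The random values above change on every run. A minimal sketch of making them reproducible, assuming a fixed seed of 0 chosen only for illustration, uses `torch.manual_seed` before each draw:

```python
import torch

# seeding the generator makes the following draws repeatable
torch.manual_seed(0)
first = torch.rand(2, 3)

# resetting the same seed reproduces exactly the same values
torch.manual_seed(0)
second = torch.rand(2, 3)

print(torch.equal(first, second))
```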



### Example 1-5 - Creating a filled tensor

Any PyTorch method whose name ends with an underscore (_), such as `fill_`, is an in-place operation: it modifies the tensor it is called on instead of returning a new one.

In [9]:
describe(torch.zeros(2, 3))
x = torch.ones(2, 3)
describe(x)

# fill x in place
x.fill_(5)
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0., 0., 0.],
        [0., 0., 0.]])

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1., 1., 1.],
        [1., 1., 1.]])

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[5., 5., 5.],
        [5., 5., 5.]])
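
To see the difference between the in-place and out-of-place variants, the sketch below compares `add` with `add_` and uses `data_ptr()` to check whether the underlying memory is reused (the variable names are illustrative):

```python
import torch

x = torch.ones(2, 3)
ptr_before = x.data_ptr()  # address of the first element of x

y = x.add(1)   # out-of-place: returns a new tensor, x is unchanged
x.add_(1)      # in-place: x itself is modified

same_memory = x.data_ptr() == ptr_before
print(same_memory)          # x still lives in the same memory
print(torch.equal(x, y))    # both now hold the same values
```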



### Example 1-6 - Creating and initializing a tensor from lists

In [10]:
x = torch.Tensor([[1, 2, 3],
                  [4, 5, 6]])

describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1., 2., 3.],
        [4., 5., 6.]])



### Example 1-7 - Creating and initializing a tensor from NumPy

Note that when I convert a NumPy array to a torch tensor, the tensor type is DoubleTensor rather than FloatTensor.
`torch.from_numpy` keeps the array's dtype, and the NumPy random matrix is float64.

In [11]:
npy = np.random.rand(2, 3)
describe(torch.from_numpy(npy))

Type: torch.DoubleTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0.2695, 0.6521, 0.9914],
        [0.9120, 0.0102, 0.0956]], dtype=torch.float64)
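
Two details are worth checking here: `torch.from_numpy` shares memory with the source array, and the result can be cast to float32 when a FloatTensor is wanted. A short sketch, with illustrative variable names:

```python
import numpy as np
import torch

npy = np.random.rand(2, 3)          # float64 by default
shared = torch.from_numpy(npy)      # no copy: the tensor views the array's memory

npy[0, 0] = -1.0                    # a change to the array...
print(shared[0, 0].item())          # ...is visible through the tensor

as_float = shared.float()           # casting to float32 makes a separate copy
print(shared.dtype, as_float.dtype)
```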



### Example 1-8 - Tensor properties

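A tensor's type can be set in two ways: by calling the constructor of a specific tensor type (or a cast method such as `.long()` or `.float()`), or by passing `dtype` to `torch.tensor`.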

In [12]:
# a float tensor is a 32-bit floating point
x = torch.FloatTensor([[1, 2, 3],
                       [4, 5, 6]])
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1., 2., 3.],
        [4., 5., 6.]])



In [13]:
# a long tensor is a 64-bit integer (signed)
x = x.long()
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1, 2, 3],
        [4, 5, 6]])



In [14]:
x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]], dtype=torch.int64)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1, 2, 3],
        [4, 5, 6]])



In [15]:
x = x.float()
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[1., 2., 3.],
        [4., 5., 6.]])
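
The casts above use whole numbers, so nothing is lost. As a sketch of what happens with fractional values, converting a float tensor to an integer type truncates toward zero; `.to(dtype)` is shown as a general alternative to the named cast methods:

```python
import torch

x = torch.tensor([[1.7, -2.3, 3.9]])

as_long = x.long()                # fractional parts are dropped, not rounded
as_double = x.to(torch.float64)   # .to() accepts any target dtype

print(as_long)
print(as_double.dtype)
```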



### Example 1-9 - Tensor operations: addition

In [16]:
x = torch.randn(2, 3)
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[-0.2216, -0.1924,  1.6756],
        [ 1.8278,  0.1557, -0.6030]])



In [17]:
describe(torch.add(x, x))

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[-0.4432, -0.3848,  3.3513],
        [ 3.6556,  0.3115, -1.2061]])



In [19]:
describe(x + x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: tensor([[-0.4432, -0.3848,  3.3513],
        [ 3.6556,  0.3115, -1.2061]])



### Example 1-10 - Dimension-based tensor operations

In [20]:
x = torch.arange(6)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([6])
values: tensor([0, 1, 2, 3, 4, 5])



**I just noticed that, by default, torch creates tensors on the CPU. To create a tensor on the GPU, I can specify the device as shown below.**

https://pytorch.org/tutorials/beginner/introyt/tensors_deeper_tutorial.html#:~:text=By%20default%2C%20new%20tensors%20are,cuda.

In [21]:
x = torch.arange(6, device='cuda')
describe(x)

Type: torch.cuda.LongTensor
Shape/size: torch.Size([6])
values: tensor([0, 1, 2, 3, 4, 5], device='cuda:0')



### Reshaping a tensor with the torch.Tensor.view() method

From the docs:

- Returns a new tensor with the same data as the self tensor but of a different shape. The returned tensor shares the same data and must have the same number of elements, but may have a different size.

https://pytorch.org/docs/stable/generated/torch.Tensor.view.html

In [22]:
x = x.view(2, 3)
describe(x)

Type: torch.cuda.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0, 1, 2],
        [3, 4, 5]], device='cuda:0')
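
Because a view shares data with the original tensor, writing through the view changes the original. The sketch below also shows that `view` fails on a non-contiguous tensor (here a transpose), where `reshape` falls back to copying. It runs on the CPU, and the variable names are illustrative:

```python
import torch

x = torch.arange(6)
v = x.view(2, 3)
v[0, 0] = 100            # writes through to x, since both share the same data
print(x)

t = v.t()                # the transpose is a non-contiguous view
try:
    t.view(6)            # view cannot flatten this layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = t.reshape(6)      # reshape copies when a view is not possible
print(view_failed, flat)
```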



In [24]:
# dim=0 sums over the rows, giving one total per column
describe(torch.sum(x, dim=0))

Type: torch.cuda.LongTensor
Shape/size: torch.Size([3])
values: tensor([3, 5, 7], device='cuda:0')



In [25]:
# dim=1 sums over the columns, giving one total per row
describe(torch.sum(x, dim=1))

Type: torch.cuda.LongTensor
Shape/size: torch.Size([2])
values: tensor([ 3, 12], device='cuda:0')



In [26]:
# transposing
describe(torch.transpose(x, 0, 1))

Type: torch.cuda.LongTensor
Shape/size: torch.Size([3, 2])
values: tensor([[0, 3],
        [1, 4],
        [2, 5]], device='cuda:0')
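
The `dim` argument names the dimension that is collapsed. A small CPU sketch of the same reductions, adding `keepdim=True` to retain the reduced dimension with size 1:

```python
import torch

x = torch.arange(6).view(2, 3)       # [[0, 1, 2], [3, 4, 5]]

col_totals = torch.sum(x, dim=0)     # collapses the 2 rows, leaving shape [3]
row_totals = torch.sum(x, dim=1)     # collapses the 3 columns, leaving shape [2]

# keepdim=True keeps the collapsed dimension, which helps with broadcasting
row_totals_2d = torch.sum(x, dim=1, keepdim=True)

print(col_totals, row_totals, row_totals_2d.shape)
```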



### Example 1-11 - Slicing and indexing a tensor

In [27]:
x = torch.arange(6).view(2, 3)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0, 1, 2],
        [3, 4, 5]])



In [28]:
describe(x[:1, :2])

Type: torch.LongTensor
Shape/size: torch.Size([1, 2])
values: tensor([[0, 1]])



In [29]:
describe(x[0, 1])

Type: torch.LongTensor
Shape/size: torch.Size([])
values: 1



### Example 1-12. Complex indexing: noncontiguous indexing of a tensor

In [31]:
indices = torch.LongTensor([0, 2])
describe(torch.index_select(x, dim=1, index=indices))

Type: torch.LongTensor
Shape/size: torch.Size([2, 2])
values: tensor([[0, 2],
        [3, 5]])



In [32]:
indices = torch.LongTensor([0, 0])
describe(torch.index_select(x, dim=0, index=indices))

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0, 1, 2],
        [0, 1, 2]])



In [33]:
row_indices = torch.arange(2).long()
col_indices = torch.LongTensor([0, 1])
describe(x[row_indices, col_indices])

Type: torch.LongTensor
Shape/size: torch.Size([2])
values: tensor([0, 4])
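
As a cross-check on the three cells above, the sketch below shows that `index_select` along a dimension matches bracket indexing with an index tensor, and that a pair of index tensors selects element-wise (row, column) pairs:

```python
import torch

x = torch.arange(6).view(2, 3)            # [[0, 1, 2], [3, 4, 5]]
cols = torch.tensor([0, 2])

selected = torch.index_select(x, dim=1, index=cols)
bracketed = x[:, cols]                    # same result with indexing syntax
print(torch.equal(selected, bracketed))

rows = torch.tensor([0, 1])
diag_cols = torch.tensor([0, 1])
pairs = x[rows, diag_cols]                # picks x[0, 0] and x[1, 1]
print(pairs)

scalar = x[0, 1].item()                   # .item() turns a 0-dim tensor into a Python number
print(scalar)
```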



### Example 1-13. Concatenating tensors

In [34]:
x = torch.arange(6).view(2, 3)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
values: tensor([[0, 1, 2],
        [3, 4, 5]])



In [35]:
describe(torch.cat([x, x], dim=0))

Type: torch.LongTensor
Shape/size: torch.Size([4, 3])
values: tensor([[0, 1, 2],
        [3, 4, 5],
        [0, 1, 2],
        [3, 4, 5]])



In [37]:
describe(torch.cat([x, x], dim=1))

Type: torch.LongTensor
Shape/size: torch.Size([2, 6])
values: 
tensor([[0, 1, 2, 0, 1, 2],
        [3, 4, 5, 3, 4, 5]])


In [38]:
describe(torch.stack([x, x]))

Type: torch.LongTensor
Shape/size: torch.Size([2, 2, 3])
values: 
tensor([[[0, 1, 2],
         [3, 4, 5]],

        [[0, 1, 2],
         [3, 4, 5]]])
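
`torch.cat` joins tensors along an existing dimension, while `torch.stack` creates a new one. One way to see the relationship, sketched below, is that stacking equals concatenating after each input gains a leading dimension with `unsqueeze`:

```python
import torch

x = torch.arange(6).view(2, 3)

stacked = torch.stack([x, x])                                  # new leading dimension
via_cat = torch.cat([x.unsqueeze(0), x.unsqueeze(0)], dim=0)   # same result with cat
print(torch.equal(stacked, via_cat))

# the dim argument of stack chooses where the new dimension is inserted
print(torch.stack([x, x], dim=1).shape)
print(torch.stack([x, x], dim=2).shape)
```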


### Example 1-14. Linear Algebra on tensors: multiplication

In [43]:
x1 = torch.arange(6, dtype=torch.float).view(2, 3)
describe(x1)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
values: 
tensor([[0., 1., 2.],
        [3., 4., 5.]])


In [44]:
x2 = torch.ones(3, 2)
x2[:, 1] += 1
describe(x2)

Type: torch.FloatTensor
Shape/size: torch.Size([3, 2])
values: 
tensor([[1., 2.],
        [1., 2.],
        [1., 2.]])


In [45]:
describe(torch.mm(x1, x2))

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
values: 
tensor([[ 3.,  6.],
        [12., 24.]])
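
The result can be checked by hand: the first row of `x1` is [0, 1, 2], so its product with the first column of `x2` is 0 + 1 + 2 = 3 and with the second column is 0 + 2 + 4 = 6. The sketch below repeats the multiplication with the `@` operator, which gives the same result as `torch.mm` for 2-D tensors:

```python
import torch

x1 = torch.arange(6, dtype=torch.float).view(2, 3)   # shape [2, 3]
x2 = torch.ones(3, 2)
x2[:, 1] += 1                                        # columns of ones and twos

product = torch.mm(x1, x2)     # inner dimensions (3 and 3) must match
with_operator = x1 @ x2        # operator form of matrix multiplication

print(product)
print(torch.equal(product, with_operator))
```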


## Tensors and Computational Graphs

### Example 1-15. Creating tensors for gradient bookkeeping

In [46]:
x = torch.ones(2, 2, requires_grad=True)
describe(x)
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
values: 
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
True


In [47]:
y = (x + 2) * (x + 5) + 3
describe(y)
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
values: 
tensor([[21., 21.],
        [21., 21.]], grad_fn=<AddBackward0>)
True


In [48]:
z = y.mean()
describe(z)
z.backward()
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([])
values: 
21.0
False
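
After `z.backward()`, `x.grad` holds the derivative of `z` with respect to each element of `x`. Since y = (x + 2)(x + 5) + 3 = x² + 7x + 13 and z is the mean over 4 elements, the derivative is (2x + 7) / 4, which is 2.25 at x = 1. A short sketch that checks this against autograd:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x + 2) * (x + 5) + 3
z = y.mean()
z.backward()                      # populates x.grad with dz/dx

# analytic gradient: d/dx of mean(x^2 + 7x + 13) over 4 elements
expected = (2 * x.detach() + 7) / 4

print(x.grad)
print(torch.allclose(x.grad, expected))
```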


## CUDA Tensors

### Example 1-16. Creating CUDA tensors

In [49]:
print(torch.cuda.is_available())

True


In [50]:
# preferred method: device-agnostic tensor instantiation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [51]:
x = torch.rand(3, 3).to(device)
describe(x)

Type: torch.cuda.FloatTensor
Shape/size: torch.Size([3, 3])
values: 
tensor([[0.2698, 0.1131, 0.9295],
        [0.9810, 0.5941, 0.3047],
        [0.9233, 0.8231, 0.4034]], device='cuda:0')
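
As an alternative to creating the tensor on the CPU and then moving it with `.to(device)`, the factory functions accept a `device` argument and allocate on the target device directly. A minimal sketch that also runs on a machine without a GPU:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# allocate directly on the chosen device instead of copying from the CPU
x = torch.rand(3, 3, device=device)

print(x.device)
```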


### Example 1-17. Mixing CUDA tensors with CPU-bound tensors

In [52]:
y = torch.rand(3, 3)
x + y

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

In [53]:
# move both tensors to the same device (here the CPU) before adding them
cpu_device = torch.device("cpu")
y = y.to(cpu_device)
x = x.to(cpu_device)
x + y

tensor([[1.1779, 0.4946, 1.4890],
        [1.1832, 1.2387, 0.9590],
        [1.2896, 1.7957, 1.0951]])
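
Moving everything to the CPU resolves the error but gives up the GPU. A sketch of the other direction, moving the CPU tensor to wherever `x` already lives, so the addition happens on the GPU when one is available:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3, device=device)
y = torch.rand(3, 3)              # created on the CPU by default

y = y.to(x.device)                # a no-op if both are already on the same device
total = x + y

print(total.device)
```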