<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Tensors" data-toc-modified-id="Tensors-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Tensors</a></span><ul class="toc-item"><li><span><a href="#Tensor:-Shape" data-toc-modified-id="Tensor:-Shape-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Tensor: Shape</a></span></li><li><span><a href="#Tensor:-Reshape" data-toc-modified-id="Tensor:-Reshape-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Tensor: Reshape</a></span></li></ul></li><li><span><a href="#Loading-Data,-Devices,-and-CUDA" data-toc-modified-id="Loading-Data,-Devices,-and-CUDA-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Loading Data, Devices, and CUDA</a></span><ul class="toc-item"><li><span><a href="#Numpy-to-PyTorch-tensor" data-toc-modified-id="Numpy-to-PyTorch-tensor-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Numpy to PyTorch tensor</a></span></li><li><span><a href="#PyTorch-Tensor-to-Numpy" data-toc-modified-id="PyTorch-Tensor-to-Numpy-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>PyTorch Tensor to Numpy</a></span></li><li><span><a href="#GPU-Tensors" data-toc-modified-id="GPU-Tensors-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>GPU Tensors</a></span></li></ul></li><li><span><a href="#Creating-Parameters" data-toc-modified-id="Creating-Parameters-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Creating Parameters</a></span><ul class="toc-item"><li><span><a href="#First-Attempt" data-toc-modified-id="First-Attempt-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>First Attempt</a></span></li><li><span><a href="#Second-Attempt" data-toc-modified-id="Second-Attempt-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Second Attempt</a></span></li><li><span><a href="#Third-Attempt" data-toc-modified-id="Third-Attempt-3.3"><span class="toc-item-num">3.3&nbsp;&nbsp;</span>Third Attempt</a></span></li><li><span><a href="#Fourth-Attempt" data-toc-modified-id="Fourth-Attempt-3.4"><span 
class="toc-item-num">3.4&nbsp;&nbsp;</span>Fourth Attempt</a></span></li></ul></li><li><span><a href="#Autograd" data-toc-modified-id="Autograd-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Autograd</a></span></li></ul></div>

In [1]:
import os
import numpy as np
import pandas as pd

import torch

`Like NumPy, PyTorch provides functions such as ones, zeros, and randn for creating scalars, vectors, and matrices`
`Every value created this way is a tensor object`

## Tensors

In [2]:
import torch

# creating a scalar, and three tensors (vector, matrix, tensor)
scalar = torch.tensor(3.14159)
vector = torch.tensor([1, 2, 3])
matrix = torch.ones((2, 3), dtype=torch.float)
tensor = torch.randn((2, 3, 4), dtype=torch.float)

print(scalar)
print(vector)
print(matrix)
print(tensor)

tensor(3.1416)
tensor([1, 2, 3])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[[-0.1906, -1.1328,  0.4272, -0.7133],
         [ 1.4926,  2.4325, -0.0694, -0.6471],
         [ 0.4810,  0.6902, -1.3658,  0.8439]],

        [[-0.1628,  1.9884, -0.4888,  0.1109],
         [-0.2573, -0.1808,  0.0487,  0.8678],
         [ 1.0138,  0.6575, -0.2455, -2.1898]]])


`We can get the dimensions of a tensor using either the size() method or the shape attribute`

### Tensor: Shape

In [3]:
print(tensor.shape, tensor.size())

torch.Size([2, 3, 4]) torch.Size([2, 3, 4])


`Every tensor has a shape, but a scalar's shape is empty because it is dimensionless (it has zero dimensions)`

In [4]:
scalar = torch.tensor(3.14159)
print(scalar.shape, scalar.size())

torch.Size([]) torch.Size([])
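
As a small supplementary sketch (not part of the original notebook), the number of dimensions can also be read with dim(), and a zero-dimensional tensor can be turned into a plain Python number with item(). The variable names here are only illustrative.

```python
import torch

scalar = torch.tensor(3.14159)
vector = torch.tensor([1, 2, 3])

# dim() returns the length of the shape: 0 for a scalar, 1 for a vector
print(scalar.dim(), vector.dim())

# item() extracts the single value of a one-element tensor as a Python number
value = scalar.item()
print(type(value))
```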


### Tensor: Reshape

`We can reshape a tensor using either the view() method or the reshape() method`
- `The view() method returns a tensor with the desired shape that shares the underlying data with the original tensor; it does not create a new, independent tensor`
- `The reshape() method may or may not create a copy: it returns a view when the memory layout allows it and a copy otherwise. Because view() always shares data, it is preferred when sharing is intended`
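
The difference between the two methods shows up with non-contiguous tensors. The following is a minimal sketch, assuming a transposed tensor as the non-contiguous case; the variable names are illustrative and data_ptr() is used only to check whether two tensors start at the same memory address.

```python
import torch

base = torch.arange(6, dtype=torch.float).reshape(2, 3)

# On a contiguous tensor, reshape() can return a view that shares memory
flat = base.reshape(6)
flat_shares_memory = flat.data_ptr() == base.data_ptr()

# Transposing produces a non-contiguous tensor
transposed = base.t()

# reshape() falls back to copying the data in this case
copied = transposed.reshape(6)
copy_shares_memory = copied.data_ptr() == base.data_ptr()

# view() refuses to copy and raises an error instead
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

print(flat_shares_memory, copy_shares_memory, view_failed)
```

With view(), an unintended copy cannot happen silently: the call either shares memory or fails.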

In [7]:
matrix = torch.ones((2, 3), dtype=torch.float)

# Reshaping the tensor to shape of (1, 6)
reshape_matrix_1 = matrix.view(1, 6)

print(matrix, reshape_matrix_1)

tensor([[1., 1., 1.],
        [1., 1., 1.]]) tensor([[1., 1., 1., 1., 1., 1.]])


In [8]:
# If we change one entry of reshape_matrix_1, then that change will get reflected
# in matrix
reshape_matrix_1[0, 1] = 4
print(matrix, reshape_matrix_1)

tensor([[1., 4., 1.],
        [1., 1., 1.]]) tensor([[1., 4., 1., 1., 1., 1.]])


`To create a new, independent tensor with a different shape, we can use the new_tensor() or clone() methods`

In [9]:
# Creating new and independent matrix using new_tensor() method
new_matrix = matrix.new_tensor(matrix.view(1, 6))

# Changing an element of the new matrix
new_matrix[0, 1] = 3

print(new_matrix)
print(matrix)

tensor([[1., 3., 1., 1., 1., 1.]])
tensor([[1., 4., 1.],
        [1., 1., 1.]])


  new_matrix = matrix.new_tensor(matrix.view(1, 6))


`Above we can see that PyTorch raises a warning when new_tensor() is used this way, so the clone() method is preferred over new_tensor()`

In [10]:
matrix = torch.ones((2, 3), dtype=torch.float)

# Creating duplicate and independent matrix using clone and view method
another_matrix = matrix.view(1, 6).clone().detach()

# Changing one of the values of another_matrix
another_matrix[0, 1] = 9

print(another_matrix)
print(matrix)

tensor([[1., 9., 1., 1., 1., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])


- `We can observe that the change in another_matrix is not reflected in matrix`
- `The detach() method returns a tensor that is detached from the computation graph`
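
To see what detach() adds on top of clone(), here is a minimal sketch assuming a tensor that tracks gradients; the names weights, cloned, and detached are illustrative only.

```python
import torch

weights = torch.ones(3, requires_grad=True)

# clone() copies the data but stays connected to the computation graph
cloned = weights.clone()

# clone().detach() copies the data and cuts the link to the graph
detached = weights.clone().detach()

print(cloned.requires_grad, cloned.grad_fn is not None)
print(detached.requires_grad, detached.grad_fn is None)

# Modifying the detached copy leaves the original untouched
detached[0] = 5.0
print(weights)
```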

## Loading Data, Devices, and CUDA

- `The as_tensor() method converts a NumPy array to a PyTorch tensor`
- `as_tensor() preserves the dtype of the array`

### Numpy to PyTorch tensor

In [15]:
import numpy as np

x_train = np.random.rand(2, 3)

x_train_tensor = torch.as_tensor(x_train)
print(x_train.dtype, x_train_tensor.dtype)

float64 torch.float64


`To cast the tensor down from float64 to float32, we can call the tensor's float() method`

In [16]:
float_tensor = x_train_tensor.float()
print(x_train_tensor.dtype, float_tensor.dtype)

torch.float64 torch.float32


`Important: Both as_tensor() and from_numpy() return a tensor that shares the underlying data with the original NumPy array (for as_tensor(), as long as no dtype or device conversion is requested), similar to what happened when we used the view() method. If we modify the original NumPy array, the corresponding PyTorch tensor is modified too, and vice versa`

In [19]:
np_array = np.ones(shape=(2, 3))

pt_tensor = torch.as_tensor(np_array)

# changing the value of pt_tensor
pt_tensor[1, 2] = 4
print(np_array)
print(pt_tensor)

[[1. 1. 1.]
 [1. 1. 4.]]
tensor([[1., 1., 1.],
        [1., 1., 4.]], dtype=torch.float64)


`Difference between as_tensor() and torch.tensor(): torch.tensor() always makes a copy of the data instead of sharing the underlying data with the NumPy array`
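
The contrast can be checked directly. This is a minimal sketch with illustrative variable names; it also shows that as_tensor() has to copy once a dtype conversion is requested.

```python
import numpy as np
import torch

source = np.ones((2, 3))

shared = torch.as_tensor(source)                           # shares memory with source
copied = torch.tensor(source)                              # always an independent copy
converted = torch.as_tensor(source, dtype=torch.float32)   # dtype change forces a copy

# Modify the NumPy array after the tensors were created
source[0, 0] = 7.0

print(shared[0, 0].item())     # follows the change in source
print(copied[0, 0].item())     # keeps the original value
print(converted[0, 0].item())  # keeps the original value
```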

### PyTorch Tensor to Numpy

`We can convert a PyTorch tensor back to a NumPy array using the numpy() method`

In [21]:
array = np.random.rand(2, 3)

pt_tensor = torch.as_tensor(array)

print(pt_tensor.numpy(), type(pt_tensor.numpy()))

[[0.11799011 0.28302073 0.63657481]
 [0.92466622 0.24974654 0.53397068]] <class 'numpy.ndarray'>


### GPU Tensors

`Check if CUDA is available`

In [22]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)

cpu


`Checking the number of CUDA devices and their names`

In [24]:
n_cudas = torch.cuda.device_count()
for i in range(n_cudas):
    print(torch.cuda.get_device_name(i))

`Turning tensor into GPU tensor`

In [25]:
# converting the training numpy array to a tensor first and then to a GPU tensor
gpu_tensor = torch.as_tensor(x_train).to(device)
print(gpu_tensor[0])

tensor([0.6479, 0.1499, 0.5774], dtype=torch.float64)


`Converting GPU tensor to Numpy`

In [30]:
# PyTorch tensor
pt_tensor = torch.ones(size=(2, 3)).to(device)

numpy_array = pt_tensor.cpu().numpy()
print(numpy_array)

[[1. 1. 1.]
 [1. 1. 1.]]


`Unfortunately, NumPy cannot handle GPU tensors: calling numpy() on one raises an error. You need to make them CPU tensors first using the cpu() method. To avoid this error, call cpu() before numpy() even when running on the CPU, so the same code works on both devices.`
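
One way to make this habit reusable is a small helper. The function to_numpy() below is hypothetical (it is not part of PyTorch or the original notebook). It also calls detach(), because numpy() raises an error on tensors that require gradients.

```python
import numpy as np
import torch


def to_numpy(tensor):
    """Hypothetical helper: convert any tensor to a NumPy array.

    detach() drops gradient tracking, cpu() moves the data to host memory
    (a no-op for CPU tensors), and numpy() performs the conversion.
    """
    return tensor.detach().cpu().numpy()


device = "cuda" if torch.cuda.is_available() else "cpu"

# A tensor that requires gradients and may live on the GPU
param = torch.ones(2, 3, requires_grad=True, device=device)

array = to_numpy(param)
print(type(array), array.shape)
```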

## Creating Parameters

`Parameters can be created in several ways; the four attempts below build up to the recommended one`

### First Attempt

- `Creating a tensor with the argument requires_grad=True`

In [39]:
torch.manual_seed(42)

b = torch.randn(1, requires_grad=True, dtype=torch.float)
w = torch.randn(1, requires_grad=True, dtype=torch.float)
print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


`By default, this creates the parameters on the CPU only; they will not be placed on the GPU`

### Second Attempt

`Creating parameters with requires_grad=True and sending them to the GPU using the .to(device) method`

In [37]:
device = "cuda" if torch.cuda.is_available() else "cpu"

In [38]:
torch.manual_seed(42)
b = torch.randn(1, requires_grad=True, dtype=torch.float).to(device)
w = torch.randn(1, requires_grad=True, dtype=torch.float).to(device)
print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


`In case of GPU, the outcome will be:`
- `tensor([0.3367], device="cuda:0", grad_fn=<CopyBackward>)`
- `tensor([0.1288], device="cuda:0", grad_fn=<CopyBackward>)`

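`With this approach, the tensors we end up with on the GPU are no longer the original trainable parameters: .to(device) returns a new tensor produced by a copy operation (note the grad_fn), so gradients are not accumulated on it`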
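
The effect can be reproduced without a GPU. Calling .to("cpu") on a CPU tensor simply returns the same tensor, so this sketch uses a dtype conversion as a stand-in for the device transfer; that substitution is an assumption made purely for illustration, and the names leaf and moved are hypothetical.

```python
import torch

# A leaf tensor created directly by the user
leaf = torch.randn(1, requires_grad=True, dtype=torch.float)

# Stand-in for .to("cuda"): any .to() call that actually converts the tensor
# returns a new tensor that is the result of an operation on the leaf
moved = leaf.to(torch.float64)

print(leaf.is_leaf, moved.is_leaf)
print(moved.grad_fn is not None)

# After backward(), the gradient is stored on the original leaf tensor only
loss = (moved * 2).sum()
loss.backward()
print(leaf.grad)
```

An optimizer given the converted tensor would therefore not find a gradient on it.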

### Third Attempt

`In this approach, we send our tensors to the device first and then use the requires_grad_() method to set their requires_grad attribute to True`

**In PyTorch, every method whose name ends with an underscore ( _ ), like the requires_grad_() method above, makes changes in-place, meaning it modifies the underlying tensor**

In [41]:
device = "cuda" if torch.cuda.is_available() else "cpu"

torch.manual_seed(42)
b = torch.randn(1, dtype=torch.float).to(device)
w = torch.randn(1, dtype=torch.float).to(device)

b.requires_grad_()
w.requires_grad_()

print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


`In case of GPU, the output will be:`
- `tensor([0.3367], device="cuda:0", requires_grad=True)`
- `tensor([0.1288], device="cuda:0", requires_grad=True)`

`This approach works fine, but there is a better way that removes the need to call requires_grad_() on each trainable parameter: assigning the tensors to a device at creation time`
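
As an aside on the trailing-underscore convention used by requires_grad_() in the third attempt, the same pattern applies to other tensor methods. A minimal sketch comparing add() with add_(), using illustrative variable names:

```python
import torch

values = torch.ones(3)

# add() returns a new tensor and leaves the original unchanged
result = values.add(1)
print(values, result)

# add_() modifies the tensor in place and returns that same tensor
returned = values.add_(1)
print(values, returned is values)
```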

### Fourth Attempt

In [43]:
device = "cuda" if torch.cuda.is_available() else "cpu"

torch.manual_seed(42)
b = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)
w = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)

print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


`In case of GPU, the output will be:`
- `tensor([0.1940], device="cuda:0", requires_grad=True)`
- `tensor([0.1391], device="cuda:0", requires_grad=True)`

`Although the seed passed to torch.manual_seed() is the same, the tensor values differ on the GPU because its random number generator produces a different sequence than the CPU's`
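
On a single device, reseeding does reproduce the same values. The following is a minimal sketch on the CPU, with illustrative variable names:

```python
import torch

# Seeding and drawing twice with the same seed gives identical tensors
torch.manual_seed(42)
first = torch.randn(2)

torch.manual_seed(42)
second = torch.randn(2)

same_after_reseed = torch.equal(first, second)
print(same_after_reseed)

# Without reseeding, the generator simply continues its sequence
third = torch.randn(2)
print(first, third)
```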