# Introduction to PyTorch

PyTorch is a Python-based scientific computing package that serves two broad purposes:
- A replacement for NumPy that can use the power of GPUs and other accelerators.
- An automatic differentiation library that is useful for implementing neural networks.

## Installing and importing PyTorch

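Use **`!pip install torch`** to install PyTorch from a notebook cell. The leading `!` tells the notebook to run the command in the system shell, so omit it when typing the command in a terminal.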

In [2]:
# Install PyTorch
!pip install torch



In [2]:
# Import PyTorch and NumPy
import torch
import numpy as np

## Basics of PyTorch

### Tensors
A tensor is a specialized data structure that is very similar to an array or a matrix. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters. Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators.

In [3]:
# Example of a tensor
data = [[1, 2], [3, 4]]

x_data = torch.tensor(data)
print(x_data)

tensor([[1, 2],
        [3, 4]])


In [4]:
# Creating a tensor from a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

print(x_np)

tensor([[1, 2],
        [3, 4]], dtype=torch.int32)


In [5]:
# A tensor created with a *_like function retains the properties (shape, datatype) of its argument tensor, unless explicitly overridden.
x_ones = torch.ones_like(x_data) # creates a tensor of 1s with the shape and datatype of x_data
print(x_ones)

x_rand = torch.rand_like(x_data, dtype = torch.float) # overrides the datatype of x_data
print(x_rand)

tensor([[1, 1],
        [1, 1]])
tensor([[0.6033, 0.1902],
        [0.4158, 0.3065]])


In [9]:
# shape is a tuple of tensor dimensions (here 2 rows and 3 columns).
shape = (2, 3)

# Generates a random tensor
print('Random Tensor:\n', torch.rand(shape), end = '\n')

# Generates a 1s tensor
print('Ones tensor:\n', torch.ones(shape), end = '\n')

# Generates a 0s tensor
print('Zeros tensor:\n', torch.zeros(shape), end = '\n')

Random Tensor:
 tensor([[0.3945, 0.9494, 0.7761],
        [0.1430, 0.5545, 0.4171]])
Ones tensor:
 tensor([[1., 1., 1.],
        [1., 1., 1.]])
Zeros tensor:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])


#### Attributes of a Tensor

Tensor attributes describe their shape, datatype, and the device on which they are stored.

In [14]:
tensor = torch.rand(3,4)
print(tensor)

# Prints the shape of the tensor
print('Shape of tensor:', tensor.shape)

# Prints the datatype of the tensor
print('Datatype of tensor:', tensor.dtype)

# Prints the device on which the tensor is stored (CPU or GPU)
print('Device tensor is stored on:', tensor.device)

tensor([[0.0989, 0.3990, 0.0591, 0.9137],
        [0.2104, 0.9316, 0.5641, 0.0670],
        [0.9706, 0.0558, 0.1472, 0.9215]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


#### Operations on a tensor

By default, tensors are created on the CPU. To use a GPU, we need to move tensors to it explicitly with the `.to` method, after checking that a GPU is available.

In [15]:
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
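
A common way to write the same check once and reuse it is to store the chosen device in a variable. The sketch below is only an illustration; the names `device`, `cpu_tensor`, `moved` and `direct` are made up for this example.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the CPU, then move it to the chosen device.
# .to returns the moved tensor; cpu_tensor itself stays on the CPU.
cpu_tensor = torch.rand(3, 4)
moved = cpu_tensor.to(device)

print("Original device:", cpu_tensor.device)
print("Moved device:", moved.device)

# Tensors can also be created directly on the target device.
direct = torch.ones(2, 2, device=device)
print("Direct device:", direct.device)
```

On a machine without a GPU all three tensors report `cpu`, so the code runs unchanged in both settings.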

##### Indexing and Slicing

In [18]:
# Generate a random tensor
tensor = torch.rand(3,4)
print(tensor)

# Indexing and Slicing
# Print the first row
print('First row:', tensor[0])

# Print the first column
print('First column:', tensor[:, 0])

# Print the last column
print('Last column:', tensor[..., -1])

# Replace the values
tensor[:, 1] = 0

print('Changed Tensor:\n', tensor)

tensor([[8.0727e-02, 4.0366e-01, 1.1625e-01, 6.8079e-01],
        [5.7560e-01, 7.3348e-01, 2.0385e-05, 1.2446e-01],
        [9.7190e-01, 4.3251e-01, 6.0153e-01, 9.1177e-01]])
First row: tensor([0.0807, 0.4037, 0.1163, 0.6808])
First column: tensor([0.0807, 0.5756, 0.9719])
Last column: tensor([0.6808, 0.1245, 0.9118])
Changed Tensor:
 tensor([[8.0727e-02, 0.0000e+00, 1.1625e-01, 6.8079e-01],
        [5.7560e-01, 0.0000e+00, 2.0385e-05, 1.2446e-01],
        [9.7190e-01, 0.0000e+00, 6.0153e-01, 9.1177e-01]])
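
Assigning through an index changes the original tensor because basic indexing and slicing return views that share memory with it. The following sketch, with illustrative variable names, shows the effect and how `clone()` can be used when an independent copy is wanted.

```python
import torch

t = torch.zeros(3, 4)

# Basic indexing returns a view that shares memory with t.
first_row = t[0]
first_row[0] = 7.0
print(t[0, 0])  # the write through the view is visible in t

# clone() makes an independent copy, so edits to it leave t untouched.
row_copy = t[1].clone()
row_copy[0] = 9.0
print(t[1, 0])
```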


##### Joining Tensors

In [19]:
# Use the torch.cat function to concatenate tensors along an existing dimension
t1 = torch.cat([tensor, tensor, tensor], dim = 1)
print(t1)

tensor([[8.0727e-02, 0.0000e+00, 1.1625e-01, 6.8079e-01, 8.0727e-02, 0.0000e+00,
         1.1625e-01, 6.8079e-01, 8.0727e-02, 0.0000e+00, 1.1625e-01, 6.8079e-01],
        [5.7560e-01, 0.0000e+00, 2.0385e-05, 1.2446e-01, 5.7560e-01, 0.0000e+00,
         2.0385e-05, 1.2446e-01, 5.7560e-01, 0.0000e+00, 2.0385e-05, 1.2446e-01],
        [9.7190e-01, 0.0000e+00, 6.0153e-01, 9.1177e-01, 9.7190e-01, 0.0000e+00,
         6.0153e-01, 9.1177e-01, 9.7190e-01, 0.0000e+00, 6.0153e-01, 9.1177e-01]])
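
The `dim` argument decides which dimension grows. As a small sketch (the tensors `a` and `b` are made up for illustration), the example below compares concatenation along rows and columns with `torch.stack`, a related function that joins tensors along a new dimension instead.

```python
import torch

a = torch.ones(3, 4)
b = torch.zeros(3, 4)

# cat joins along an existing dimension, so that dimension grows.
rows = torch.cat([a, b], dim=0)
cols = torch.cat([a, b], dim=1)

# stack inserts a new dimension and needs inputs of identical shape.
stacked = torch.stack([a, b], dim=0)

print(rows.shape)     # torch.Size([6, 4])
print(cols.shape)     # torch.Size([3, 8])
print(stacked.shape)  # torch.Size([2, 3, 4])
```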


##### Arithmetic Operations

In [23]:
# tensor.T returns the transpose of a tensor
# Multiplication between the tensor and its transpose
result_1 = tensor @ tensor.T
print(result_1)

# Use the .matmul method for the same multiplication
result_2 = tensor.matmul(tensor.T)
print(result_2)

# Use torch.matmul and write the result into a preallocated tensor via out
result_3 = torch.rand_like(result_1)
torch.matmul(tensor, tensor.T, out = result_3)
print(result_3)

tensor([[0.4835, 0.1312, 0.7691],
        [0.1312, 0.3468, 0.6729],
        [0.7691, 0.6729, 2.1377]])
tensor([[0.4835, 0.1312, 0.7691],
        [0.1312, 0.3468, 0.6729],
        [0.7691, 0.6729, 2.1377]])
tensor([[0.4835, 0.1312, 0.7691],
        [0.1312, 0.3468, 0.6729],
        [0.7691, 0.6729, 2.1377]])
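
All three calls above compute the matrix product. For contrast, the sketch below uses a small hand-written matrix `m` (an illustrative value, not one from this notebook) to show how the matrix product differs from the element-wise product computed by `*` or `torch.mul`.

```python
import torch

m = torch.tensor([[1., 2.], [3., 4.]])

# Matrix product: rows of m combined with columns of m.T.
matrix_product = m @ m.T

# Element-wise product: each entry multiplied by the entry in the same position.
elementwise = m * m
same_elementwise = torch.mul(m, m)

print(matrix_product)  # tensor([[ 5., 11.], [11., 25.]])
print(elementwise)     # tensor([[ 1.,  4.], [ 9., 16.]])
```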


##### Single element tensors

In [24]:
# Get the sum of the tensor
agg = tensor.sum()
print(agg)

# Convert it to a python numerical value using item()
agg_item = agg.item()

# Check the value
print(agg_item, type(agg_item))

tensor(4.0630)
4.063040256500244 <class 'float'>
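
`item()` works only on tensors that hold exactly one element. As a brief sketch with made-up values, the example below shows that a reduction returns a zero-dimensional tensor, and that `tolist()` can be used instead when the tensor has more than one element.

```python
import torch

values = torch.tensor([[1.5, 2.5], [3.0, 4.0]])

# Reductions such as sum() return zero-dimensional tensors.
total = values.sum()
print(total.dim())   # 0
print(total.item())  # 11.0

# tolist() converts a tensor of any shape into nested Python lists.
as_list = values.tolist()
print(as_list)       # [[1.5, 2.5], [3.0, 4.0]]
```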


##### In-place operations

Operations that store the result in the operand itself are called in-place operations. They are denoted by a **_** suffix. For example, **`x.copy_(y)`** and **`x.t_()`** both change x.

In [25]:
print(tensor)

# Add 5 to the matrix in-place
tensor.add_(5)

# Print the updated tensor
print(tensor)

tensor([[8.0727e-02, 0.0000e+00, 1.1625e-01, 6.8079e-01],
        [5.7560e-01, 0.0000e+00, 2.0385e-05, 1.2446e-01],
        [9.7190e-01, 0.0000e+00, 6.0153e-01, 9.1177e-01]]) 

tensor([[5.0807, 5.0000, 5.1163, 5.6808],
        [5.5756, 5.0000, 5.0000, 5.1245],
        [5.9719, 5.0000, 5.6015, 5.9118]])
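
To make the difference from the ordinary version concrete, the sketch below applies `add` and `add_` to the same small tensor. The values and variable names are chosen only for illustration.

```python
import torch

x = torch.tensor([1., 2., 3.])

# Out-of-place: add returns a new tensor and leaves x unchanged.
y = x.add(5)
print(x)  # tensor([1., 2., 3.])
print(y)  # tensor([6., 7., 8.])

# In-place: add_ overwrites the data stored in x.
z = x.add_(10)
print(x)  # tensor([11., 12., 13.])

# The returned tensor uses the same memory as x.
print(z.data_ptr() == x.data_ptr())  # True
```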


#### Relation with NumPy

In [28]:
# Print the tensor
print(tensor)

# Convert to numpy
n = tensor.numpy()

# Numpy array
print(n) 

tensor([[5.0807, 5.0000, 5.1163, 5.6808],
        [5.5756, 5.0000, 5.0000, 5.1245],
        [5.9719, 5.0000, 5.6015, 5.9118]])
[[5.080727  5.        5.1162515 5.680785 ]
 [5.5755997 5.        5.0000205 5.124462 ]
 [5.971895  5.        5.601529  5.91177  ]]


In [29]:
# Convert the NumPy array to a tensor; the two share the same memory
t = torch.from_numpy(n)

# Add 1 to the NumPy array in place; the change is reflected in the tensor too
np.add(n, 1, out = n)

print(t)
print(n)

tensor([[6.0807, 6.0000, 6.1163, 6.6808],
        [6.5756, 6.0000, 6.0000, 6.1245],
        [6.9719, 6.0000, 6.6015, 6.9118]])
[[6.080727  6.        6.1162515 6.680785 ]
 [6.5755997 6.        6.0000205 6.124462 ]
 [6.971895  6.        6.601529  6.91177  ]]
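
The tensor follows the array because `torch.from_numpy` shares memory with it rather than copying. The sketch below, using a small illustrative array `arr`, contrasts this with `torch.tensor`, which copies the data and is therefore unaffected by later changes to the array.

```python
import numpy as np
import torch

arr = np.ones(3, dtype=np.float32)

shared = torch.from_numpy(arr)  # shares memory with arr
copied = torch.tensor(arr)      # copies the data

# Modify the array in place.
np.add(arr, 1, out=arr)

print(shared)  # tensor([2., 2., 2.])
print(copied)  # tensor([1., 1., 1.])
```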


### Gradients

The gradient of a function is the collection of its partial derivatives with respect to each of its inputs. PyTorch can compute these derivatives automatically for every tensor that was created with `requires_grad=True`.

In [30]:
# Create tensors; only w and b are created with requires_grad=True
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

print(x, w, b)

tensor(3.) tensor(4., requires_grad=True) tensor(5., requires_grad=True)


In [31]:
# Arithmetic operations
y = w * x + b
print(y)

tensor(17., grad_fn=<AddBackward0>)


To compute the derivatives of y with respect to the tensors that require gradients, we call the **`.backward()`** method on our result y.

In [32]:
# Compute derivatives
y.backward()

# Display gradients
print('dy/dx:', x.grad)
print('dy/dw:', w.grad)
print('dy/db:', b.grad)

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)


Since y = w * x + b, **dy/dw** equals x, i.e. 3, and **dy/db** equals 1. `x.grad` is **None** because x was created without **requires_grad=True**, so no gradient is tracked for it.
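
As a quick check of this reasoning, the sketch below repeats the computation with `requires_grad=True` set on x as well. Differentiating y = w * x + b by hand gives dy/dx = w, dy/dw = x and dy/db = 1, so with the same values as above the expected gradients are 4, 3 and 1.

```python
import torch

# Same values as above, but x now tracks gradients too.
x = torch.tensor(3., requires_grad=True)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

y = w * x + b
y.backward()

print('dy/dx:', x.grad)  # tensor(4.), the value of w
print('dy/dw:', w.grad)  # tensor(3.), the value of x
print('dy/db:', b.grad)  # tensor(1.)
```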