# PyTorch Basics
This notebook summarizes PyTorch's "60 Minute Blitz" tutorial, which covers the basics of the PyTorch library.

## Tensors
Tensors are similar to NumPy arrays, with extra capabilities. They are specialized arrays that perform the math needed for neural networks very quickly and can run on a CPU, a GPU, or even specialized TPUs (hardware designed from the ground up for neural network training). Tensors and NumPy arrays are so alike that PyTorch provides a bridge for converting between the two, and a CPU tensor and its NumPy counterpart can share the same underlying memory.
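
The memory sharing mentioned above can be seen in a minimal sketch like the one below. The variable names (`array`, `shared`, `copied`, `back`) are made up for illustration. It assumes CPU tensors, since the NumPy bridge does not apply to tensors stored on a GPU.

```python
import numpy as np
import torch

# from_numpy wraps the existing NumPy buffer instead of copying it
array = np.ones(3)
shared = torch.from_numpy(array)

# torch.tensor copies the data, so the result is independent of the array
copied = torch.tensor(array)

# An in-place change to the array is visible through the shared tensor only
array += 1
print(shared)
print(copied)

# The bridge also works in the other direction (CPU tensors only)
back = shared.numpy()
print(np.shares_memory(back, array))
```

Whether to share or copy is a trade-off: sharing avoids duplicating large arrays, while copying protects the tensor from later changes to the array.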

In [2]:
# Import stuff
import torch
import numpy as np

### Tensor Initialization
Tensors can be initialized in many different ways, which gives flexibility depending on where the data in an ML pipeline comes from.

In [3]:
# From slower native lists
data = [[1,2], [3,4]]
data_tensor = torch.tensor(data)

# From NumPy arrays
data_numpy = np.array(data)
data_tensor_2 = torch.from_numpy(data_numpy)

# Similar to numpy they have a bunch of random or fixed value initializations
shape = (2, 3,)
# Init random values
torch.rand(shape)
# Init ones
torch.ones(shape)
# Init zeros
torch.zeros(shape)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
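
One more initialization route, sketched here as a small addition, is to build a tensor from an existing one. The `*_like` functions reuse the shape and data type of their argument unless told otherwise. The names `base`, `like_ones`, and `like_rand` are illustrative only.

```python
import torch

base = torch.tensor([[1, 2], [3, 4]])

# Same shape and dtype (int64) as base, filled with ones
like_ones = torch.ones_like(base)

# Same shape as base, but the dtype is overridden because
# random values in [0, 1) need a floating point type
like_rand = torch.rand_like(base, dtype=torch.float)

print(like_ones)
print(like_rand.dtype)
```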

### Tensor Attributes
Each tensor carries a few attributes that describe it.

In [5]:
test = torch.ones(shape)

print(test.shape) # dimensions of the tensor
print(test.device) # device the tensor is stored on (CPU unless specified otherwise)
print(test.dtype) # data type of the tensor

torch.Size([2, 3])
cpu
torch.float32


### Tensor Operations
Most tensor operations work essentially the same way as their NumPy counterparts. One tensor-specific exception, shown below, is moving a tensor to another device.

In [7]:
# Move the tensor to the GPU for faster calculation (only if one is available)
if torch.cuda.is_available():
    test_gpu = test.to('cuda')
    print(test_gpu.device)

cuda:0
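
The NumPy-like operations are not shown in the cell above, so here is a short sketch of a few common ones. It runs on the CPU, and the variable names are illustrative.

```python
import torch

t = torch.ones(2, 3)

# NumPy-style indexing and slicing: zero out the second column
t[:, 1] = 0

# Element-wise product (equivalent to torch.mul(t, t))
elementwise = t * t

# Matrix product of the (2, 3) tensor with its (3, 2) transpose
matrix = t @ t.T

# Join tensors along an existing dimension
joined = torch.cat([t, t], dim=1)

# Methods ending in an underscore modify the tensor in place
t.add_(5)

print(matrix)
print(joined.shape)
print(t)
```

In-place methods save memory, but they overwrite values that AutoGrad may need later, so they are best used sparingly on tensors that take part in gradient computation.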


## AutoGrad

AutoGrad is one of the critical components that allow PyTorch to train neural networks. Training a neural network consists of two key steps.

*Forward Propagation:* The model uses the input data and its current parameters to make a guess at the end objective (a number for regression, a label for classification, and so on).

*Backward Propagation:* A loss function tells us how bad the model's guesses were. Backward propagation computes the derivatives (the gradient) of that loss with respect to the parameters. Each parameter is then updated by a small step, scaled by the learning rate, in the direction opposite to the gradient. Because the gradient points in the direction of steepest ascent, stepping against it lowers the loss, hence the name gradient descent.
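
Written as a formula, one update step looks like the following. The symbols are assumptions introduced here for illustration: $\theta$ for the parameters, $\eta$ for the learning rate, and $L$ for the loss.

$$\theta \leftarrow \theta - \eta \, \nabla_\theta L(\theta)$$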

AutoGrad computes these derivatives automatically, which is what makes backward propagation, and therefore a network's ability to "learn", possible.
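
The two steps can be sketched on a toy problem. This is a minimal illustration, not the tutorial's own example. It fits a single weight `w` so that `w * x` matches `y`, and the data, learning rate, and step count are arbitrary choices.

```python
import torch

# Toy data generated from y = 2x, so the ideal weight is 2.0
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([2.0, 4.0, 6.0])

# requires_grad=True asks autograd to track operations on w
w = torch.tensor(0.0, requires_grad=True)
learning_rate = 0.1

for step in range(20):
    # Forward propagation: make a guess and measure how bad it is
    prediction = w * x
    loss = ((prediction - y) ** 2).mean()

    # Backward propagation: autograd stores d(loss)/dw in w.grad
    loss.backward()

    # Gradient descent update, performed outside of autograd tracking
    with torch.no_grad():
        w -= learning_rate * w.grad

    # Gradients accumulate by default, so reset them before the next step
    w.grad.zero_()

print(w.item())
```

The printed weight should end up close to 2.0. In practice the manual update and reset are usually delegated to an optimizer from `torch.optim`.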