### Learning PyTorch
### PyTorch, led by Facebook AI Research (FAIR), is a Python deep learning library that grew out of Torch, a Lua-based framework. It can use CUDA GPUs directly for accelerated processing and has a clean code flow.


### The main modules of PyTorch are shown below; their names are largely self-explanatory.

<img src = "images/pytorch_modules.png">

<br><br>
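- torch.nn = building blocks for neural networks (layers, containers, loss functions). torch.nn.Module is the base class for all neural network modules.
    - torch.nn.functional = stateless function versions of those building blocks (activations, losses, convolutions)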
<br><br>
- torch.Tensor = multi-dimensional matrix containing elements of a single data type.
<br><br>
- torch.autograd = provides classes and functions implementing automatic differentiation. A recorder notes which operations were performed during the forward pass, then replays them backward to compute the gradients. Because the operations are recorded while the forward pass runs, no separate step is needed to work out how to differentiate the parameters, which saves time in every epoch when training neural networks.
<br><br>
- torch.utils.data = data-loading utilities. At its heart is the torch.utils.data.DataLoader class, which represents a Python iterable over a dataset.
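
As a rough illustration of how the modules listed above fit together, here is a minimal sketch. The tensor values, the layer sizes, and the variable names are arbitrary choices made for this example, not something prescribed by PyTorch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# torch.autograd: record w ** 2 on the way forward, then replay it backward
w = torch.tensor(3.0, requires_grad=True)
out = w ** 2
out.backward()
print(w.grad)  # tensor(6.) because d(w^2)/dw = 2w

# torch.Tensor: a tiny made-up dataset of four samples with two features each
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y = torch.tensor([[1.0], [2.0], [3.0], [4.0]])

# torch.nn: a layer object that owns its parameters (weight and bias)
layer = nn.Linear(in_features=2, out_features=1)

# torch.utils.data: iterate over the dataset in batches of two samples
loader = DataLoader(TensorDataset(x, y), batch_size=2)

for xb, yb in loader:
    # torch.nn.functional: stateless functions such as activations and losses
    pred = F.relu(layer(xb))
    loss = F.mse_loss(pred, yb)

    # Clear old gradients, then let autograd fill in the new ones
    layer.zero_grad()
    loss.backward()
    print(xb.shape, layer.weight.grad.shape)
```

No optimizer step is taken here, so nothing is actually trained; the loop only shows where each module enters the picture.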

#### Make sure you have a virtual env set up with Anaconda, and that pytorch, cudatoolkit, and jupyter are installed with conda.

### Verify the install


In [40]:
# Should import without errors
import torch

In [41]:
print(torch.__version__)

1.7.1


In [42]:
# Verify that a CUDA-capable NVIDIA GPU is visible to PyTorch
torch.cuda.is_available()

True

In [43]:
# Should match the version of cudatoolkit installed
torch.version.cuda

'10.2'

### Sidenote: Why GPUs
For tasks such as mathematical computation, a solution can often be reached through parallel computing, because even complex equations can be broken down into smaller computations that are independent of one another. GPUs are far better suited to parallel computation than CPUs because of their massive core counts.

CUDA is NVIDIA's parallel computing platform and API. Its toolkit has to be installed (for example, the cudatoolkit conda package) to enable this feature on NVIDIA GPUs.

PyTorch is versatile in that it can selectively hold mathematical objects such as tensors on different devices, such as the GPU or CPU.
#### Why not use a GPU for everything?
Moving a simple computation to the GPU can make code slower, because a modern CPU may finish the work in the time it takes just to hand the task and its data over to the GPU. Remember that GPUs only shine at mathematically dense tasks that can be done in parallel.
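
One common way to act on this is to pick the device once and move only the heavy tensors to it. The sketch below assumes nothing about your hardware: it falls back to the CPU when no CUDA GPU is available. The names `small` and `big` and the 1024 x 1024 size are illustrative choices.

```python
import torch

# Use the first GPU only when one is actually available; otherwise stay on the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# A tiny tensor: cheap enough that leaving it on the CPU is reasonable
small = torch.tensor([1.0, 2.0, 3.0])

# A larger tensor: a dense matrix product is the kind of work a GPU is good at
big = torch.rand(1024, 1024)
big = big.to(device)   # returns a tensor on `device` (unchanged if already there)
product = big @ big    # runs on whichever device holds `big`

print(small.device)    # cpu
print(product.device)  # cuda:0 on a GPU machine, cpu otherwise
```

Tensors that take part in the same operation must live on the same device, so in practice the model and its inputs are usually moved together.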

In [44]:
# Simply creating a PyTorch tensor object
t = torch.Tensor([1,2,3])
t

tensor([1., 2., 3.])

In [48]:
# Creating a device object that refers to the first GPU
device = torch.device('cuda:0')
device

device(type='cuda', index=0)

In [49]:
# Moving tensor object to GPU
t = t.cuda()
t

RuntimeError: CUDA error: unspecified launch failure

In [47]:
print(t.dtype)
print(t.device)
print(t.layout)

torch.float32
cpu
torch.strided


### Tensors
A tensor, in short, is a general term for an array of numbers that has a rank, axes, and a shape. Think of a matrix, but with any number of dimensions.

Rank is the number of dimensions present within a tensor, that is, how many indices you need to access an element. For example, the tensor 't' above has a rank of 1 because each element needs only one index for access, like t[1]. An axis is one specific dimension of a tensor; its length sets how many index values exist along that dimension, and each additional axis adds another dimension in which elements can exist.

The shape of a tensor gives the length of each axis, so it encodes all relevant info about rank, axes, and valid indices. Tensors are constantly reshaped and transformed. Tensors can represent the same underlying data but have different shapes. Think about it: if 'a' were a list of toothpick lengths (or something weird like that) in the code below, changing its shape wouldn't change the lengths of the toothpicks.
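
Rank, shape, and the idea that reshaping leaves the data alone can be checked directly. This is a small sketch; the variable names `lengths` and `flat` are made up for the illustration.

```python
import torch

# Six toothpick lengths arranged as 2 rows by 3 columns
lengths = torch.tensor([[4.0, 5.0, 6.0],
                        [7.0, 8.0, 9.0]])

print(lengths.shape)       # torch.Size([2, 3])
print(len(lengths.shape))  # 2: the rank, i.e. two indices reach one element
print(lengths.dim())       # 2: same number, asked for directly
print(lengths[1][2])       # tensor(9.)

# Reshape to a single axis of six elements
flat = lengths.reshape(6)
print(flat.shape)          # torch.Size([6])

# Same number of elements and same values; only the arrangement changed
print(lengths.numel(), flat.numel())  # 6 6
print(flat)                # tensor([4., 5., 6., 7., 8., 9.])
```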


Shapes can have more than 2 or 3 dimensions. For example, an image typically has three dimensions: color channels (RGB), height, and width, written [C,H,W]. In deep learning, more dimensions can be added, such as a batch axis (the number of images in a subgroup). A batch of images then has 4 dimensions and looks like: [B,C,H,W]. As a rank 4 tensor, we can navigate to a specific pixel in a specific color channel of a specific image.
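
To make the [B,C,H,W] layout concrete, here is a sketch with a hypothetical batch of 8 RGB images of 32 x 32 pixels. The sizes and the all-zero pixel values are placeholders chosen for the example.

```python
import torch

# Hypothetical batch laid out as [B, C, H, W]
batch = torch.zeros(8, 3, 32, 32)
print(batch.dim())    # 4

# Each extra index removes one axis from the front of the shape
image = batch[0]              # first image
print(image.shape)    # torch.Size([3, 32, 32])

channel = batch[0, 0]         # first color channel of the first image
print(channel.shape)  # torch.Size([32, 32])

pixel = batch[0, 0, 10, 20]   # row 10, column 20 of that channel
print(pixel.shape)    # torch.Size([])
print(pixel.item())   # 0.0
```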

Features can be extracted from these values by transforming them with specific operations. More on this later.


In [25]:
a = [
    [4,5,6],
    [7,8,9]
]

In [26]:
a[1][2]

9

In [27]:
type(a)

list

In [32]:
b = torch.Tensor(a)
b

tensor([[4., 5., 6.],
        [7., 8., 9.]])

In [29]:
b.shape

torch.Size([2, 3])

In [30]:
b.reshape(1,6)

tensor([[4., 5., 6., 7., 8., 9.]])

In [31]:
b.reshape(1,6).shape

torch.Size([1, 6])

In [52]:
import numpy as np

In [56]:
data = np.array([1,2,3])
type(data)

numpy.ndarray

#### Different ways to create a tensor and exploring their data types

In [86]:
# Notice this is a different way to construct a tensor than the next three: the PyTorch class constructor - unchanged by numpy array manipulation

t1 = torch.Tensor(data)
print(t1.dtype)

torch.float32


In [87]:
#factory function (accepts parameter inputs and outputs tensor objects) - unchanged by numpy array manipulation

t2 = torch.tensor(data)
print(t2.dtype)

torch.int64


In [88]:
#also a factory function --> changing the numpy array (variable 'data') will change these values

t3 = torch.as_tensor(data)
print(t3.dtype)

torch.int64


In [74]:
#also a factory function --> changing the numpy array (variable 'data') will change these values

t4 = torch.from_numpy(data)
print(t4.dtype)

torch.int64


### Recap
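torch.Tensor(data) & torch.tensor(data) = **data is copied**, but the resulting data types differ: torch.Tensor() always uses the default tensor type (float32 unless changed), while torch.tensor() infers the dtype from the input
<br>
torch.as_tensor(data) & torch.from_numpy(data) = **data is shared** with the numpy array (saves memory space), and the input's data type is kept
<br><br>
torch.tensor() = best for everyday use
<br>
torch.as_tensor() = best for performance tuning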
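
The copy-versus-share behavior can be observed by changing the numpy array after the tensors are built. In this sketch the dtype of `data` is set to int64 explicitly (an assumption added here so the result does not depend on the platform's default integer type).

```python
import numpy as np
import torch

# dtype is pinned so the example behaves the same on every platform
data = np.array([1, 2, 3], dtype=np.int64)

t1 = torch.Tensor(data)      # class constructor: copies, converts to float32
t2 = torch.tensor(data)      # factory function: copies, keeps int64
t3 = torch.as_tensor(data)   # factory function: shares memory with `data`
t4 = torch.from_numpy(data)  # factory function: shares memory with `data`

# Change the numpy array after the tensors were created
data[0] = 0

print(t1)  # tensor([1., 2., 3.])  copy, unaffected
print(t2)  # tensor([1, 2, 3])     copy, unaffected
print(t3)  # tensor([0, 2, 3])     shared, sees the change
print(t4)  # tensor([0, 2, 3])     shared, sees the change
```

Sharing avoids a copy, but it also means a later edit to the array silently changes the tensor, which is why the copying torch.tensor() is the safer default.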

In [90]:
# The default dtype, used by torch.Tensor() and for Python floats; otherwise the dtype is inferred from the incoming data
print(torch.get_default_dtype()) 

torch.float32


In [91]:
a = torch.tensor(np.array([1,2,3]))
print(a)
a.dtype

tensor([1, 2, 3])


torch.int64

In [92]:
a = torch.tensor(np.array([1.,2.,3.]))
print(a)
a.dtype

tensor([1., 2., 3.], dtype=torch.float64)


torch.float64
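
When the inferred dtype is not the one you want, it can be set at creation time or converted afterwards. A short sketch follows; the variable names are illustrative, and the float32 result for Python floats assumes the default dtype has not been changed.

```python
import numpy as np
import torch

# Override inference: integer input, but ask for float32 explicitly
as_float = torch.tensor(np.array([1, 2, 3]), dtype=torch.float32)
print(as_float.dtype)   # torch.float32

# Python floats pick up the default dtype rather than float64
from_list = torch.tensor([1.0, 2.0, 3.0])
print(from_list.dtype == torch.get_default_dtype())  # True

# numpy floats are float64, and that is what torch.tensor() infers
doubles = torch.tensor(np.array([1.0, 2.0, 3.0]))
print(doubles.dtype)    # torch.float64

# Convert an existing tensor to another dtype
converted = doubles.to(torch.float32)
print(converted.dtype)  # torch.float32
```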

#### Creating multi-dimensional tensors

In [61]:
torch.eye(2)

tensor([[1., 0.],
        [0., 1.]])

In [62]:
torch.zeros(2,2)

tensor([[0., 0.],
        [0., 0.]])

In [63]:
torch.ones(2,2)

tensor([[1., 1.],
        [1., 1.]])

In [66]:
# Uniform random values in [0, 1), printed to four decimal places
torch.rand(2,2)

tensor([[0.2592, 0.8560],
        [0.4719, 0.9232]])

## Tensor operation types:

    1. Reshaping operations
    2. Element-wise operations
    3. Reduction operations
    4. Access operations
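
As a quick preview, one example of each operation type is sketched below. The tensor `t` and the particular operations picked are arbitrary representatives, not a complete list.

```python
import torch

t = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# 1. Reshaping: same six elements, new arrangement
reshaped = t.reshape(3, 2)
print(reshaped.shape)  # torch.Size([3, 2])

# 2. Element-wise: applied to each element independently, shape is kept
doubled = t * 2
print(doubled)         # tensor([[ 2.,  4.,  6.], [ 8., 10., 12.]])

# 3. Reduction: fewer elements come out than went in
total = t.sum()
col_sums = t.sum(dim=0)
print(total)           # tensor(21.)
print(col_sums)        # tensor([5., 7., 9.])

# 4. Access: pull out a single element, then convert it to a Python number
element = t[1, 2]
print(element.item())  # 6.0
```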
